* perf test failures in linux-next on s390
@ 2023-06-13 12:54 Thomas Richter
  2023-06-13 14:32 ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Richter @ 2023-06-13 12:54 UTC (permalink / raw)
  To: linux-perf-use., Arnaldo Carvalho de Melo, Ian Rogers

Hi all,

I have run the perf test suite on the current 6.4rc6 kernel and see just one error:
# ./perf test 2>&1 | fgrep FAILED
fgrep: warning: fgrep is obsolescent; using grep -F
42.3: BPF prologue generation : FAILED!
#

However, when I download the linux-next tree and build kernel and perf
tool with the same kernel config file, I get a bunch of failing test cases,
many with perf tool dumping core:

# perf test 2>&1 | fgrep FAILED
fgrep: warning: fgrep is obsolescent; using grep -F
6.1: Test event parsing : FAILED!
10.3: Parsing of PMU event table metrics : FAILED!
10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
17: Setup struct perf_event_attr : FAILED!
24: Number of exit events of a simple workload : FAILED! core-dump
28: Use a dummy software event to keep tracking : FAILED!
35: Track with sched_switch : FAILED!
42.3: BPF prologue generation : FAILED!
66: Parse and process metrics : FAILED!
68: Event expansion for cgroups : FAILED!
69.2: Perf time to TSC : FAILED! core-dump
74: build id cache operations : FAILED! core-dump
81: kernel lock contention analysis test : FAILED!
86: Zstd perf.data compression/decompression : FAILED! core-dump
87: perf record tests : FAILED! core-dump
94: perf all metricgroups test : FAILED!
95: perf all metrics test : FAILED!
106: Test java symbol : FAILED! core-dump
#

I am afraid this will show up pretty soon in the linux tree.
I am going to look into each failure in the next few days.

What I already found out is that many test cases now fail due to the
event/PMU rework; here is one example:

# perf test -Fvvvv 95
95: perf all metrics test
--- start ---
Testing cpi
....
Metric 'transaction' not printed in:
Error:
The TX_NC_TABORT event is not supported.
---- end ----
perf all metrics test: FAILED!
# ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT
-r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT
#

As can be seen, the event is definitely there and supported.
This same test case succeeds in the linux tree!

Hopefully I can sort out some of the failures before this code shows up
in the linux tree.

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: perf test failures in linux-next on s390 2023-06-13 12:54 perf test failures in linux-next on s390 Thomas Richter @ 2023-06-13 14:32 ` Ian Rogers 2023-06-14 8:31 ` Thomas Richter 0 siblings, 1 reply; 15+ messages in thread From: Ian Rogers @ 2023-06-13 14:32 UTC (permalink / raw) To: Thomas Richter; +Cc: linux-perf-use., Arnaldo Carvalho de Melo On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > > Hi all, > > I have run the perf test suite on the current 6.4rc6 kernel and see just one error: > # ./perf test 2>&1 | fgrep FAILED > fgrep: warning: fgrep is obsolescent; using grep -F > 42.3: BPF prologue generation : FAILED! > # > > However when I download the linux-next tree and build kernel and perf > tool with the same kernel config file, I get a bunch of failing test cases, > many with perf tool dumping core: > > # perf test 2>&1 | fgrep FAILED > fgrep: warning: fgrep is obsolescent; using grep -F > 6.1: Test event parsing : FAILED! > 10.3: Parsing of PMU event table metrics : FAILED! > 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED! > 17: Setup struct perf_event_attr : FAILED! > 24: Number of exit events of a simple workload : FAILED! core-dump > 28: Use a dummy software event to keep tracking : FAILED! > 35: Track with sched_switch : FAILED! > 42.3: BPF prologue generation : FAILED! > 66: Parse and process metrics : FAILED! > 68: Event expansion for cgroups : FAILED! > 69.2: Perf time to TSC : FAILED! core-dump > 74: build id cache operations : FAILED! core-dump > 81: kernel lock contention analysis test : FAILED! > 86: Zstd perf.data compression/decompression : FAILED! core-dump > 87: perf record tests : FAILED! core-dump > 94: perf all metricgroups test : FAILED! > 95: perf all metrics test : FAILED! > 106: Test java symbol : FAILED! core-dump > # > > I am afraid this will show up pretty soon in the linux tree. > I am going to look into each failure in the next few days. 
> > What I already found out is that many test cases now fail due to the > event/PMU rework, here is one example: > > # perf test -Fvvvv 95 > 95: perf all metrics test > --- start --- > Testing cpi > .... > Metric 'transaction' not printed in: > Error: > The TX_NC_TABORT event is not supported. > ---- end ---- > perf all metrics test: FAILED! > # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT > -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT > # > > As can be seen, the event is definitely there and supported. > This same test case succeeds in the linux tree! > > Hopefully I can sort out some of the failures before this code show up > in the linux tree. Thanks Thomas, to be clear this is what is in perf-tools-next/linux-next and not 6.4? Rather than try to do more complicated cases like the metrics tests, it makes sense to dig into why event parsing is failing. Test 6 first of all, could you give output? Thanks, Ian > -- > Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany > -- > Vorsitzender des Aufsichtsrats: Gregor Pillen > Geschäftsführung: David Faller > Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: perf test failures in linux-next on s390 2023-06-13 14:32 ` Ian Rogers @ 2023-06-14 8:31 ` Thomas Richter 2023-06-14 14:57 ` Ian Rogers 0 siblings, 1 reply; 15+ messages in thread From: Thomas Richter @ 2023-06-14 8:31 UTC (permalink / raw) To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar On 6/13/23 16:32, Ian Rogers wrote: > On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote: >> >> Hi all, >> >> I have run the perf test suite on the current 6.4rc6 kernel and see just one error: >> # ./perf test 2>&1 | fgrep FAILED >> fgrep: warning: fgrep is obsolescent; using grep -F >> 42.3: BPF prologue generation : FAILED! >> # >> >> However when I download the linux-next tree and build kernel and perf >> tool with the same kernel config file, I get a bunch of failing test cases, >> many with perf tool dumping core: >> >> # perf test 2>&1 | fgrep FAILED >> fgrep: warning: fgrep is obsolescent; using grep -F >> 6.1: Test event parsing : FAILED! >> 10.3: Parsing of PMU event table metrics : FAILED! >> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED! >> 17: Setup struct perf_event_attr : FAILED! >> 24: Number of exit events of a simple workload : FAILED! core-dump >> 28: Use a dummy software event to keep tracking : FAILED! >> 35: Track with sched_switch : FAILED! >> 42.3: BPF prologue generation : FAILED! >> 66: Parse and process metrics : FAILED! >> 68: Event expansion for cgroups : FAILED! >> 69.2: Perf time to TSC : FAILED! core-dump >> 74: build id cache operations : FAILED! core-dump >> 81: kernel lock contention analysis test : FAILED! >> 86: Zstd perf.data compression/decompression : FAILED! core-dump >> 87: perf record tests : FAILED! core-dump >> 94: perf all metricgroups test : FAILED! >> 95: perf all metrics test : FAILED! >> 106: Test java symbol : FAILED! core-dump >> # >> >> I am afraid this will show up pretty soon in the linux tree. 
>> I am going to look into each failure in the next few days. >> >> What I already found out is that many test cases now fail due to the >> event/PMU rework, here is one example: >> >> # perf test -Fvvvv 95 >> 95: perf all metrics test >> --- start --- >> Testing cpi >> .... >> Metric 'transaction' not printed in: >> Error: >> The TX_NC_TABORT event is not supported. >> ---- end ---- >> perf all metrics test: FAILED! >> # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT >> -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT >> # >> >> As can be seen, the event is definitely there and supported. >> This same test case succeeds in the linux tree! >> >> Hopefully I can sort out some of the failures before this code show up >> in the linux tree. > > Thanks Thomas, to be clear this is what is in > perf-tools-next/linux-next and not 6.4? Ian, thanks for your help. Correct, I am talking about the linux-next repo. The linux repo is fine. > > Rather than try to do more complicated cases like the metrics tests, > it makes sense to dig into why event parsing is failing. Test 6 first > of all, could you give output? > > Thanks, > Ian > We discussed some aspects of this about two weeks ago, but last week I was on vacation and now I resumed my work on linux-next. We run the linux-next perf test suite every night and I am concerned and would like to get this sorted out before it hits Linux 6.5. Here is the output on my linux-next tree built yesterday: # uname -a Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \ SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux # ./perf test -F 6 6: Parse event definition strings : 6.1: Test event parsing :Segmentation fault (core dumped) # # gdb perf .... (gdb) r test -F 6 6: Parse event definition strings : 6.1: Test event parsing : Program received signal SIGSEGV, Segmentation fault. 
__GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
(gdb) where
#0 __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
#1 0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580
#2 0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209
#3 0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260
#4 0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0)
   at tests/parse-events.c:2272
#5 0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236
#6 0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265
#7 0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436
#8 0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559
#9 0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323
#10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377
#11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421
#12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537
(gdb)

To be honest, I am no expert on the yacc/bison/flex tool chain.
I understand a little bit about them, but that is it.

When I look at the output of perf test -Fvvvv 6 on linux-next, some things seem odd,
I marked them with 3 question marks ???:

# ./perf test -Fvvv 6
6: Parse event definition strings :
6.1: Test event parsing :
--- start ---
running test 0 'syscalls:sys_enter_openat'
Using CPUID IBM,3931,704,A01,3.7,002f
running test 1 'syscalls:*'
running test 2 'r1a'
running test 3 '1:1'
running test 4 'instructions'
No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries
Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/'
??? What is wrong here?
??? Output on linux 6.4.0rc3:
??? # ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
??? 2,965,720 instructions
???
??? 0.002026832 seconds time elapsed
???
??? 0.000056000 seconds user
??? 0.002048000 seconds sys
??? #
??? This is fine and works as expected. The s390 PMU for counters
??? has a direct mapping for this. So we end up in the s390 PMU
??? to retrieve the value.
???
??? Output on linux-next
???# ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
??? 0.65 msec task-clock # 0.250 CPUs utilized
??? 0 context-switches # 0.000 /sec
??? 0 cpu-migrations # 0.000 /sec
??? 49 page-faults # 75.375 K/sec
??? 3,367,228 cycles # 5.180 GHz
??? 2,880,270 instructions # 0.86 insn per cycle
??? <not supported> branches
??? <not supported> branch-misses
???
??? 0.002599176 seconds time elapsed
???
??? 0.000053000 seconds user
??? 0.002650000 seconds sys
???
???#
??? Somehow we end up in a different PMU. The output is the same as if
??? I do not specify an event at all. To reach the s390 specific PMU
??? I have to add it explicitly as in:
???# ./perf stat -e cpum_cf/instructions/ -- true
???
??? Performance counter stats for 'true':
???
??? 2,814,522 cpum_cf/instructions/
???
??? 0.001899881 seconds time elapsed
???
??? 0.000050000 seconds user
??? 0.001928000 seconds sys
???
???]#
No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries
Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults'
...
??? Similar output for basically all events.

No PMU found for 'cycles'running test 59 'cycles/name=name/'
No PMU found for 'name'Segmentation fault (core dumped)

Hope this helps.

PS: Should we keep the linux-perf-use mailing list as addressee? Not sure
if everybody else is interested in this?

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: perf test failures in linux-next on s390 2023-06-14 8:31 ` Thomas Richter @ 2023-06-14 14:57 ` Ian Rogers 2023-06-15 8:57 ` Thomas Richter 2023-06-15 9:39 ` Thomas Richter 0 siblings, 2 replies; 15+ messages in thread From: Ian Rogers @ 2023-06-14 14:57 UTC (permalink / raw) To: Thomas Richter Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > > On 6/13/23 16:32, Ian Rogers wrote: > > On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > >> > >> Hi all, > >> > >> I have run the perf test suite on the current 6.4rc6 kernel and see just one error: > >> # ./perf test 2>&1 | fgrep FAILED > >> fgrep: warning: fgrep is obsolescent; using grep -F > >> 42.3: BPF prologue generation : FAILED! > >> # > >> > >> However when I download the linux-next tree and build kernel and perf > >> tool with the same kernel config file, I get a bunch of failing test cases, > >> many with perf tool dumping core: > >> > >> # perf test 2>&1 | fgrep FAILED > >> fgrep: warning: fgrep is obsolescent; using grep -F > >> 6.1: Test event parsing : FAILED! > >> 10.3: Parsing of PMU event table metrics : FAILED! > >> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED! > >> 17: Setup struct perf_event_attr : FAILED! > >> 24: Number of exit events of a simple workload : FAILED! core-dump > >> 28: Use a dummy software event to keep tracking : FAILED! > >> 35: Track with sched_switch : FAILED! > >> 42.3: BPF prologue generation : FAILED! > >> 66: Parse and process metrics : FAILED! > >> 68: Event expansion for cgroups : FAILED! > >> 69.2: Perf time to TSC : FAILED! core-dump > >> 74: build id cache operations : FAILED! core-dump > >> 81: kernel lock contention analysis test : FAILED! > >> 86: Zstd perf.data compression/decompression : FAILED! core-dump > >> 87: perf record tests : FAILED! core-dump > >> 94: perf all metricgroups test : FAILED! 
> >> 95: perf all metrics test : FAILED! > >> 106: Test java symbol : FAILED! core-dump > >> # > >> > >> I am afraid this will show up pretty soon in the linux tree. > >> I am going to look into each failure in the next few days. > >> > >> What I already found out is that many test cases now fail due to the > >> event/PMU rework, here is one example: > >> > >> # perf test -Fvvvv 95 > >> 95: perf all metrics test > >> --- start --- > >> Testing cpi > >> .... > >> Metric 'transaction' not printed in: > >> Error: > >> The TX_NC_TABORT event is not supported. > >> ---- end ---- > >> perf all metrics test: FAILED! > >> # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT > >> -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT > >> # > >> > >> As can be seen, the event is definitely there and supported. > >> This same test case succeeds in the linux tree! > >> > >> Hopefully I can sort out some of the failures before this code show up > >> in the linux tree. > > > > Thanks Thomas, to be clear this is what is in > > perf-tools-next/linux-next and not 6.4? > > Ian, > > thanks for your help. > Correct, I am talking about the linux-next repo. The linux repo is fine. > > > > > Rather than try to do more complicated cases like the metrics tests, > > it makes sense to dig into why event parsing is failing. Test 6 first > > of all, could you give output? > > > > Thanks, > > Ian > > > We discussed some aspects of this about two weeks ago, but last week > I was on vacation and now I resumed my work on linux-next. > We run the linux-next perf test suite every night and I am concerned > and would like to get this sorted out before it hits Linux 6.5. 
> > Here is the output on my linux-next tree built yesterday: > # uname -a > Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \ > SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux > # ./perf test -F 6 > 6: Parse event definition strings : > 6.1: Test event parsing :Segmentation fault (core dumped) > # > # gdb perf > .... > (gdb) r test -F 6 > 6: Parse event definition strings : > 6.1: Test event parsing : > Program received signal SIGSEGV, Segmentation fault. > __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47 > (gdb) where > #0 __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47 > #1 0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580 > #2 0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209 > #3 0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260 > #4 0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0) > at tests/parse-events.c:2272 > #5 0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236 > #6 0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265 > #7 0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436 > #8 0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559 > #9 0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323 > #10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377 > #11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421 > #12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537 > (gdb) > > To be honest, I am no expert on the yacc/bison/flex tool chain. > I understand a little bit about them, but that is it. 
> > When I look at the output of perf test -Fvvvv 6 on linux-next, some things seem odd, > I marked them with 3 question masks ???: > > # ./perf test -Fvvv 6 > 6: Parse event definition strings : > 6.1: Test event parsing : > --- start --- > running test 0 'syscalls:sys_enter_openat' > Using CPUID IBM,3931,704,A01,3.7,002f > running test 1 'syscalls:*' > running test 2 'r1a' > running test 3 '1:1' > running test 4 'instructions' > No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries > Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/' > ??? What is wrong here? > ??? Output on linux 6.4.0rc3: > ??? # ./perf stat -e instructions -- true > ??? > ??? Performance counter stats for 'true': > ??? > ??? 2,965,720 instructions > ??? > ??? 0.002026832 seconds time elapsed > ??? > ??? 0.000056000 seconds user > ??? 0.002048000 seconds sys > ??? # > ??? This is fine and works as expected. The s390 PMU for counters > ??? has a direct mapping for this. So we end up in the s390 PMU > ??? to retrieve the value. > ??? > ??? Output on linux-next > ???# ./perf stat -e instructions -- true > ??? > ??? Performance counter stats for 'true': > ??? > ??? 0.65 msec task-clock # 0.250 CPUs utilized > ??? 0 context-switches # 0.000 /sec > ??? 0 cpu-migrations # 0.000 /sec > ??? 49 page-faults # 75.375 K/sec > ??? 3,367,228 cycles # 5.180 GHz > ??? 2,880,270 instructions # 0.86 insn per cycle > ??? <not supported> branches > ??? <not supported> branch-misses > ??? > ??? 0.002599176 seconds time elapsed > ??? > ??? 0.000053000 seconds user > ??? 0.002650000 seconds sys > ??? > ???# > ??? Somehow we end up in a different PMU. The output is the same as if > ??? I do not specify an event at all. To reach the s390 specific PMU > ??? I have to add it explicitly as in: > ???# ./perf stat -e cpum_cf/instructions/ -- true > ??? > ??? Performance counter stats for 'true': > ??? > ??? 2,814,522 cpum_cf/instructions/ > ??? > ??? 
0.001899881 seconds time elapsed
> ???
> ??? 0.000050000 seconds user
> ??? 0.001928000 seconds sys
> ???
> ???]#
> No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries
> Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults'
> ...
> ??? Similar output for basicly all events.
>
> No PMU found for 'cycles'running test 59 'cycles/name=name/'
> No PMU found for 'name'Segmentation fault (core dumped)
>
> Hope this helps.
>
> PS: Should we keep the linux-perf-use mailing list as addressee? Not sure
> if everybody else is interested in this?

Smaller list is okay. Could you send me a zip of the sysfs
(/sys/devices) ? At least one issue is that the code didn't find a
core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we
spoke about this before for s390 and there are >1. The issue here is
that the test found 0, and we're trying to use PMUs in the code now as
a way to sort events. There's code/comment in util/pmu.c:

'''
/**
 * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
 * sysfs on some platforms like ARM or Intel hybrid. Looking for
 * possible the cpus file in sysfs files to identify whether this is a
 * core device.
 * @name: The PMU name such as "cpu_atom".
 */
static int is_sysfs_pmu_core(const char *name)
{
	char path[PATH_MAX];

	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
		return 0;
	return file_available(path);
}
...
bool is_pmu_core(const char *name)
{
	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
}
'''

Thanks,
Ian

> --
> Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
> --
> Vorsitzender des Aufsichtsrats: Gregor Pillen
> Geschäftsführung: David Faller
> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
>

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: perf test failures in linux-next on s390 2023-06-14 14:57 ` Ian Rogers @ 2023-06-15 8:57 ` Thomas Richter 2023-06-15 9:39 ` Thomas Richter 1 sibling, 0 replies; 15+ messages in thread From: Thomas Richter @ 2023-06-15 8:57 UTC (permalink / raw) To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar [-- Attachment #1: Type: text/plain, Size: 12355 bytes --] On 6/14/23 16:57, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: >> >> On 6/13/23 16:32, Ian Rogers wrote: >>> On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote: >>>> >>>> Hi all, >>>> >>>> I have run the perf test suite on the current 6.4rc6 kernel and see just one error: >>>> # ./perf test 2>&1 | fgrep FAILED >>>> fgrep: warning: fgrep is obsolescent; using grep -F >>>> 42.3: BPF prologue generation : FAILED! >>>> # >>>> >>>> However when I download the linux-next tree and build kernel and perf >>>> tool with the same kernel config file, I get a bunch of failing test cases, >>>> many with perf tool dumping core: >>>> >>>> # perf test 2>&1 | fgrep FAILED >>>> fgrep: warning: fgrep is obsolescent; using grep -F >>>> 6.1: Test event parsing : FAILED! >>>> 10.3: Parsing of PMU event table metrics : FAILED! >>>> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED! >>>> 17: Setup struct perf_event_attr : FAILED! >>>> 24: Number of exit events of a simple workload : FAILED! core-dump >>>> 28: Use a dummy software event to keep tracking : FAILED! >>>> 35: Track with sched_switch : FAILED! >>>> 42.3: BPF prologue generation : FAILED! >>>> 66: Parse and process metrics : FAILED! >>>> 68: Event expansion for cgroups : FAILED! >>>> 69.2: Perf time to TSC : FAILED! core-dump >>>> 74: build id cache operations : FAILED! core-dump >>>> 81: kernel lock contention analysis test : FAILED! >>>> 86: Zstd perf.data compression/decompression : FAILED! core-dump >>>> 87: perf record tests : FAILED! 
core-dump >>>> 94: perf all metricgroups test : FAILED! >>>> 95: perf all metrics test : FAILED! >>>> 106: Test java symbol : FAILED! core-dump >>>> # >>>> >>>> I am afraid this will show up pretty soon in the linux tree. >>>> I am going to look into each failure in the next few days. >>>> >>>> What I already found out is that many test cases now fail due to the >>>> event/PMU rework, here is one example: >>>> >>>> # perf test -Fvvvv 95 >>>> 95: perf all metrics test >>>> --- start --- >>>> Testing cpi >>>> .... >>>> Metric 'transaction' not printed in: >>>> Error: >>>> The TX_NC_TABORT event is not supported. >>>> ---- end ---- >>>> perf all metrics test: FAILED! >>>> # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT >>>> -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT >>>> # >>>> >>>> As can be seen, the event is definitely there and supported. >>>> This same test case succeeds in the linux tree! >>>> >>>> Hopefully I can sort out some of the failures before this code show up >>>> in the linux tree. >>> >>> Thanks Thomas, to be clear this is what is in >>> perf-tools-next/linux-next and not 6.4? >> >> Ian, >> >> thanks for your help. >> Correct, I am talking about the linux-next repo. The linux repo is fine. >> >>> >>> Rather than try to do more complicated cases like the metrics tests, >>> it makes sense to dig into why event parsing is failing. Test 6 first >>> of all, could you give output? >>> >>> Thanks, >>> Ian >>> >> We discussed some aspects of this about two weeks ago, but last week >> I was on vacation and now I resumed my work on linux-next. >> We run the linux-next perf test suite every night and I am concerned >> and would like to get this sorted out before it hits Linux 6.5. 
>> >> Here is the output on my linux-next tree built yesterday: >> # uname -a >> Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \ >> SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux >> # ./perf test -F 6 >> 6: Parse event definition strings : >> 6.1: Test event parsing :Segmentation fault (core dumped) >> # >> # gdb perf >> .... >> (gdb) r test -F 6 >> 6: Parse event definition strings : >> 6.1: Test event parsing : >> Program received signal SIGSEGV, Segmentation fault. >> __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47 >> (gdb) where >> #0 __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47 >> #1 0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580 >> #2 0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209 >> #3 0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260 >> #4 0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0) >> at tests/parse-events.c:2272 >> #5 0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236 >> #6 0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265 >> #7 0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436 >> #8 0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559 >> #9 0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323 >> #10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377 >> #11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421 >> #12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537 >> (gdb) >> >> To be honest, I am no expert on the yacc/bison/flex tool chain. >> I understand a little bit about them, but that is it. 
>> >> When I look at the output of perf test -Fvvvv 6 on linux-next, some things seem odd, >> I marked them with 3 question masks ???: >> >> # ./perf test -Fvvv 6 >> 6: Parse event definition strings : >> 6.1: Test event parsing : >> --- start --- >> running test 0 'syscalls:sys_enter_openat' >> Using CPUID IBM,3931,704,A01,3.7,002f >> running test 1 'syscalls:*' >> running test 2 'r1a' >> running test 3 '1:1' >> running test 4 'instructions' >> No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries >> Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/' >> ??? What is wrong here? >> ??? Output on linux 6.4.0rc3: >> ??? # ./perf stat -e instructions -- true >> ??? >> ??? Performance counter stats for 'true': >> ??? >> ??? 2,965,720 instructions >> ??? >> ??? 0.002026832 seconds time elapsed >> ??? >> ??? 0.000056000 seconds user >> ??? 0.002048000 seconds sys >> ??? # >> ??? This is fine and works as expected. The s390 PMU for counters >> ??? has a direct mapping for this. So we end up in the s390 PMU >> ??? to retrieve the value. >> ??? >> ??? Output on linux-next >> ???# ./perf stat -e instructions -- true >> ??? >> ??? Performance counter stats for 'true': >> ??? >> ??? 0.65 msec task-clock # 0.250 CPUs utilized >> ??? 0 context-switches # 0.000 /sec >> ??? 0 cpu-migrations # 0.000 /sec >> ??? 49 page-faults # 75.375 K/sec >> ??? 3,367,228 cycles # 5.180 GHz >> ??? 2,880,270 instructions # 0.86 insn per cycle >> ??? <not supported> branches >> ??? <not supported> branch-misses >> ??? >> ??? 0.002599176 seconds time elapsed >> ??? >> ??? 0.000053000 seconds user >> ??? 0.002650000 seconds sys >> ??? >> ???# >> ??? Somehow we end up in a different PMU. The output is the same as if >> ??? I do not specify an event at all. To reach the s390 specific PMU >> ??? I have to add it explicitly as in: >> ???# ./perf stat -e cpum_cf/instructions/ -- true >> ??? >> ??? 
Performance counter stats for 'true': >> ??? >> ??? 2,814,522 cpum_cf/instructions/ >> ??? >> ??? 0.001899881 seconds time elapsed >> ??? >> ??? 0.000050000 seconds user >> ??? 0.001928000 seconds sys >> ??? >> ???]# >> No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries >> Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults' >> ... >> ??? Similar output for basicly all events. >> >> No PMU found for 'cycles'running test 59 'cycles/name=name/' >> No PMU found for 'name'Segmentation fault (core dumped) >> >> Hope this helps. >> >> PS: Should we keep the linux-perf-use mailing list as addressee? Not sure >> if everybody else is interested in this? > > Smaller list is okay. Could you send me a zip of the sysfs > (/sys/devices) ? At least one issue is that the code didn't find a > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we > spoke about this before for s390 and there are >1. The issue here is > that the test found 0, and we're trying to use PMUs in the code now as > a way to sort events. There's code/comment in util/pmu.c: > > ''' > /** > * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in > * sysfs on some platforms like ARM or Intel hybrid. Looking for > * possible the cpus file in sysfs files to identify whether this is a > * core device. > * @name: The PMU name such as "cpu_atom". > */ > static int is_sysfs_pmu_core(const char *name) > { > char path[PATH_MAX]; > > if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus")) > return 0; > return file_available(path); > } > ... > bool is_pmu_core(const char *name) > { > return !strcmp(name, "cpu") || is_sysfs_pmu_core(name); > } > ''' > > Thanks, > Ian > Thanks for refreshing my memory. 
With the s390 core PMU named /sys/devices/cpum_cf, this fix was missing:

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index fe64ad292d36..6142e4710a2f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1419,7 +1419,7 @@ void perf_pmu__del_formats(struct list_head *formats)
 
 bool is_pmu_core(const char *name)
 {
-	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
+	return !strcmp(name, "cpu") || !strcmp(name, "cpum_cf") || is_sysfs_pmu_core(name);
 }
 
 bool perf_pmu__supports_legacy_cache(const struct perf_pmu *pmu)

With that fix applied, the test succeeds:

# ./perf test -F 6
  6: Parse event definition strings                  :
  6.1: Test event parsing                            : Ok
  6.2: Parsing of all PMU events from sysfs          : Ok
  6.3: Parsing of given PMU events from sysfs        : Ok
  6.4: Parsing of aliased events from sysfs          : Skip (no aliases in sysfs)
  6.5: Parsing of aliased events                     : Ok
  6.6: Parsing of terms (event modifiers)            : Ok
#

I tried to send my /sys/devices tree before as zip or tgz, but that mail got
deleted by your mailer because it contained compressed data.
So I am sending you an ls -lR of our 5 PMUs on s390 as a text attachment, and
here are the type values:

# for i in /sys/devices/cpum_[sc]f* /sys/devices/pai_*; do echo PMU $i PMU_TYPE $(cat $i/type); done
PMU /sys/devices/cpum_cf PMU_TYPE 8
PMU /sys/devices/cpum_cf_diag PMU_TYPE 9
PMU /sys/devices/cpum_sf PMU_TYPE 4
PMU /sys/devices/pai_crypto PMU_TYPE 10
PMU /sys/devices/pai_ext PMU_TYPE 11
#

For the /sys/devices/PMU tree see attachment sysfs-s390.txt

Thanks a lot for your help.
-- Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany -- Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: David Faller Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 [-- Attachment #2: sysfs-s390.txt --] [-- Type: text/plain, Size: 19744 bytes --] /sys/devices/cpum_cf: total 0 drwxr-xr-x 2 root root 0 Jun 14 13:03 events drwxr-xr-x 2 root root 0 Jun 14 13:03 format -rw-r--r-- 1 root root 4096 Jun 15 10:48 perf_event_mux_interval_ms lrwxrwxrwx 1 root root 0 Jun 15 10:48 subsystem -> ../../bus/event_source -r--r--r-- 1 root root 4096 Jun 14 13:03 type -rw-r--r-- 1 root root 4096 Jun 15 10:48 uevent /sys/devices/cpum_cf/events: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 AES_BLOCKED_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 AES_BLOCKED_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 AES_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 AES_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 BCD_DFP_EXECUTION_SLOTS -r--r--r-- 1 root root 4096 Jun 14 13:03 CPU_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 CRSTE_1MB_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_OFF_DRAWER -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_OFF_DRAWER_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_IV -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_DRAWER -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_DRAWER_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_MODULE -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_MODULE_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_IV 
-r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_BLOCKED_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_BLOCKED_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 DECIMAL_INSTRUCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_ACCESS -r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CC -r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CCFINISH -r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_GPAGE_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_MISSES -r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_BLOCKED_CYCLES_COUNT -r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_BLOCKED_FUNCTION_COUNT -r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_CYCLES_COUNT -r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_FUNCTION_COUNT -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_OFF_DRAWER -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_OFF_DRAWER_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_IV -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_DRAWER -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_DRAWER_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_MODULE -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_MODULE_MEMORY -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_IV -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_IV -r--r--r-- 1 
root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_IV -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_CHIP_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_DRAWER_HIT -r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_IV -r--r--r-- 1 root root 4096 Jun 14 13:03 INSTRUCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 ITLB2_MISSES -r--r--r-- 1 root root 4096 Jun 14 13:03 ITLB2_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1C_TLB2_MISSES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_DIR_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_PENALTY_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_RO_EXCL_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1I_DIR_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 L1I_PENALTY_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 LAST_HOST_TRANSLATIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 MT_DIAG_CYCLES_ONE_THR_ACTIVE -r--r--r-- 1 root root 4096 Jun 14 13:03 MT_DIAG_CYCLES_TWO_THR_ACTIVE -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_COMPLETIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_HOLD_LOCK -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_INVOCATIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_WAIT_LOCK -r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_BLOCKED_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_BLOCKED_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 PROBLEM_STATE_CPU_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 PROBLEM_STATE_INSTRUCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_BLOCKED_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_BLOCKED_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_CYCLES -r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_FUNCTIONS -r--r--r-- 1 root root 4096 Jun 14 13:03 SORTL -r--r--r-- 1 root root 4096 Jun 14 13:03 
TLB2_CRSTE_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 TLB2_ENGINES_BUSY -r--r--r-- 1 root root 4096 Jun 14 13:03 TLB2_PTE_WRITES -r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TABORT_NO_SPECIAL -r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TABORT_SPECIAL -r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TEND -r--r--r-- 1 root root 4096 Jun 14 13:03 TX_NC_TABORT -r--r--r-- 1 root root 4096 Jun 14 13:03 TX_NC_TEND -r--r--r-- 1 root root 4096 Jun 14 13:03 VX_BCD_EXECUTION_SLOTS /sys/devices/cpum_cf/format: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 event /sys/devices/cpum_cf_diag: total 0 drwxr-xr-x 2 root root 0 Jun 14 15:43 events drwxr-xr-x 2 root root 0 Jun 14 15:43 format -rw-r--r-- 1 root root 4096 Jun 15 10:53 perf_event_mux_interval_ms lrwxrwxrwx 1 root root 0 Jun 15 10:53 subsystem -> ../../bus/event_source -r--r--r-- 1 root root 4096 Jun 14 15:43 type -rw-r--r-- 1 root root 4096 Jun 15 10:53 uevent /sys/devices/cpum_cf_diag/events: total 0 -r--r--r-- 1 root root 4096 Jun 14 15:43 CF_DIAG /sys/devices/cpum_cf_diag/format: total 0 -r--r--r-- 1 root root 4096 Jun 14 15:43 event /sys/devices/cpum_sf: total 0 drwxr-xr-x 2 root root 0 Jun 14 13:03 events drwxr-xr-x 2 root root 0 Jun 14 13:03 format -rw-r--r-- 1 root root 4096 Jun 15 10:48 perf_event_mux_interval_ms lrwxrwxrwx 1 root root 0 Jun 15 10:48 subsystem -> ../../bus/event_source -r--r--r-- 1 root root 4096 Jun 14 13:03 type -rw-r--r-- 1 root root 4096 Jun 15 10:48 uevent /sys/devices/cpum_sf/events: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 SF_CYCLES_BASIC -r--r--r-- 1 root root 4096 Jun 14 13:03 SF_CYCLES_BASIC_DIAG /sys/devices/cpum_sf/format: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 event /sys/devices/pai_crypto: total 0 drwxr-xr-x 2 root root 0 Jun 14 12:12 events drwxr-xr-x 2 root root 0 Jun 14 13:03 format -rw-r--r-- 1 root root 4096 Jun 15 10:49 perf_event_mux_interval_ms lrwxrwxrwx 1 root root 0 Jun 15 10:49 subsystem -> ../../bus/event_source -r--r--r-- 1 root root 4096 Jun 
14 13:03 type -rw-r--r-- 1 root root 4096 Jun 15 10:49 uevent /sys/devices/pai_crypto/events: total 0 -r--r--r-- 1 root root 4096 Jun 14 15:43 CRYPTO_ALL -r--r--r-- 1 root root 4096 Jun 15 10:33 IBM_RESERVED_155 -r--r--r-- 1 root root 4096 Jun 15 10:33 IBM_RESERVED_156 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P384 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P521 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P384 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P521 -r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_EDDSA_SIGN_ED25519 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_SIGN_ED448 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_VERIFY_ED25519 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_VERIFY_ED448 -r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_ENCRYPTED_ECDSA_SIGN_P256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_ECDSA_SIGN_P384 -r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_ENCRYPTED_ECDSA_SIGN_P521 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_EDDSA_SIGN_ED25519 -r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_EDDSA_SIGN_ED448 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_GHASH -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_1 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_224 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_256 -r--r--r-- 1 root root 4096 Jun 14 15:43 KIMD_SHA3_384 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_512 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_512 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHAKE_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHAKE_256 -r--r--r-- 1 root root 4096 Jun 14 13:03 KLMD_SHA_1 -r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA3_224 -r--r--r-- 1 root root 
4096 Jun 15 10:33 KLMD_SHA3_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA3_384 -r--r--r-- 1 root root 4096 Jun 14 15:43 KLMD_SHA3_512 -r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA_512 -r--r--r-- 1 root root 4096 Jun 14 15:43 KLMD_SHAKE_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHAKE_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_256 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMA_GCM_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMC_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_DEA 
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_PRNG -r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_TDEA_128 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMC_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_DEA -r--r--r-- 1 root root 4096 Jun 14 15:43 KMCTR_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMCTR_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_AES_192 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_DEA -r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 
KMF_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_AES_192 -r--r--r-- 1 root root 4096 Jun 14 15:43 KMO_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_DEA -r--r--r-- 1 root root 4096 Jun 14 15:43 KMO_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_TDEA_192 -r--r--r-- 1 root root 4096 Jun 14 15:43 KM_XTS_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_DEA -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256A -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_DEA -r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 
10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_TDEA_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_TDEA_192 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_AES_256 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_ENCRYPTED_AES_128 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_ENCRYPTED_AES_256 -r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_SCALAR_MULTIPLY_ED25519 -r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_SCALAR_MULTIPLY_ED448 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P256 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P384 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P521 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_X25519 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_X448 -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_128_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_192_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_256_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_DEA_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_ED25519_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_ED448_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P256_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P384_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P521_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_TDEA_128_KEY -r--r--r-- 1 root root 4096 Jun 14 15:43 PCKMO_ENCRYPT_TDEA_192_KEY -r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_SHA_512_DRNG -r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_TRNG -r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_TRNG_QUERY_RAW_TO_CONDITIONED_RATIO 
/sys/devices/pai_crypto/format: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 event /sys/devices/pai_ext: total 0 drwxr-xr-x 2 root root 0 Jun 14 13:03 events drwxr-xr-x 2 root root 0 Jun 14 13:03 format -rw-r--r-- 1 root root 4096 Jun 15 10:49 perf_event_mux_interval_ms lrwxrwxrwx 1 root root 0 Jun 15 10:49 subsystem -> ../../bus/event_source -r--r--r-- 1 root root 4096 Jun 14 13:03 type -rw-r--r-- 1 root root 4096 Jun 15 10:49 uevent /sys/devices/pai_ext/events: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_1MFRAME -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_2GFRAME -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ACCESSEXCEPT -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ADD -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ALL -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_AVGPOOL2D -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_BATCHNORM -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_CONVOLUTION -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_DIV -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_EXP -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_GRUACT -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_IBM_RESERVED_9 -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LARGEDIM -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LOG -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LSTMACT -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MATMUL_OP -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MATMUL_OP_BCAST23 -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MAX -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MAXPOOL2D -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MIN -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MUL -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_RELU -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SIGMOID -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SMALLBATCH -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SMALLTENSOR -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SOFTMAX -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SUB -r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_TANH 
/sys/devices/pai_ext/format: total 0 -r--r--r-- 1 root root 4096 Jun 14 13:03 event ^ permalink raw reply related [flat|nested] 15+ messages in thread
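The attachment above shows no "cpus" entry under /sys/devices/cpum_cf, which would explain why the quoted is_sysfs_pmu_core() check does not recognise it as a core PMU. What follows is only a rough sketch of that lookup combined with the name fallback from the patch. It does not use the perf tool's own helpers: pmu_has_cpus_file(), pmu_looks_like_core() and the devices_dir parameter are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of perf): returns true if
 * <devices_dir>/<name>/cpus exists and is readable.
 */
bool pmu_has_cpus_file(const char *devices_dir, const char *name)
{
	char path[4096];
	int n;

	assert(devices_dir != NULL && name != NULL);
	n = snprintf(path, sizeof(path), "%s/%s/cpus", devices_dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return false;	/* path did not fit into the buffer */
	return access(path, R_OK) == 0;
}

/*
 * Hypothetical stand-in for the patched is_pmu_core(): accept the
 * well-known names first, then fall back to the "cpus" file lookup.
 */
bool pmu_looks_like_core(const char *devices_dir, const char *name)
{
	return !strcmp(name, "cpu") || !strcmp(name, "cpum_cf") ||
	       pmu_has_cpus_file(devices_dir, name);
}
```

On the s390 tree listed above, pmu_looks_like_core("/sys/devices", "cpum_cf") would be true only because of the explicit name comparison, while pai_crypto and pai_ext would be rejected since they have neither a matching name nor a cpus file.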
* Re: perf test failures in linux-next on s390 2023-06-14 14:57 ` Ian Rogers 2023-06-15 8:57 ` Thomas Richter @ 2023-06-15 9:39 ` Thomas Richter 2023-06-15 14:34 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 15+ messages in thread From: Thomas Richter @ 2023-06-15 9:39 UTC (permalink / raw) To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar On 6/14/23 16:57, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: .... > > Smaller list is okay. Could you send me a zip of the sysfs > (/sys/devices) ? At least one issue is that the code didn't find a > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we > spoke about this before for s390 and there are >1. The issue here is > that the test found 0, and we're trying to use PMUs in the code now as > a way to sort events. There's code/comment in util/pmu.c: > > ''' > /** > * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in > * sysfs on some platforms like ARM or Intel hybrid. Looking for > * possible the cpus file in sysfs files to identify whether this is a > * core device. > * @name: The PMU name such as "cpu_atom". > */ > static int is_sysfs_pmu_core(const char *name) > { > char path[PATH_MAX]; > > if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus")) > return 0; > return file_available(path); > } > ... 
> bool is_pmu_core(const char *name)
> {
>         return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> }
> '''
>
> Thanks,
> Ian
>

Maybe we should scan the directory

[linux-next]# ll /sys/bus/event_source/devices
total 0
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
[linux-next]#

This directory lists the PMUs available on s390, maybe this is true for
other platforms...

Just my 2 cents

>> --
>> Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
>> --
>> Vorsitzender des Aufsichtsrats: Gregor Pillen
>> Geschäftsführung: David Faller
>> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
>>

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 15+ messages in thread
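The directory scan suggested in the message above could look roughly like the following. This is a minimal sketch using plain POSIX directory calls, not the perf tool's actual PMU discovery code. The function name list_pmus() and its parameters are assumptions made for illustration, and the sketch does not decide which of the entries are core PMUs.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print every entry of an event_source devices
 * directory, one name per line. Returns the number of entries printed,
 * or -1 if the directory cannot be opened.
 */
int list_pmus(const char *dir_path, FILE *out)
{
	struct dirent *ent;
	int count = 0;
	DIR *dir;

	assert(dir_path != NULL && out != NULL);
	dir = opendir(dir_path);
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* Skip the "." and ".." pseudo entries. */
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;
		fprintf(out, "%s\n", ent->d_name);
		count++;
	}
	closedir(dir);
	return count;
}
```

Calling list_pmus("/sys/bus/event_source/devices", stdout) on the s390 system shown above would be expected to print the seven names from the ll output, in whatever order readdir() returns them.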
* Re: perf test failures in linux-next on s390 2023-06-15 9:39 ` Thomas Richter @ 2023-06-15 14:34 ` Arnaldo Carvalho de Melo 2023-06-16 14:23 ` Ian Rogers 0 siblings, 1 reply; 15+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-06-15 14:34 UTC (permalink / raw) To: Thomas Richter Cc: Ian Rogers, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon Ccing the ARM people too: Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu: > On 6/14/23 16:57, Ian Rogers wrote: > > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > > .... > > > > > Smaller list is okay. Could you send me a zip of the sysfs > > (/sys/devices) ? At least one issue is that the code didn't find a > > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we > > spoke about this before for s390 and there are >1. The issue here is > > that the test found 0, and we're trying to use PMUs in the code now as > > a way to sort events. There's code/comment in util/pmu.c: > > > > ''' > > /** > > * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in > > * sysfs on some platforms like ARM or Intel hybrid. Looking for > > * possible the cpus file in sysfs files to identify whether this is a > > * core device. > > * @name: The PMU name such as "cpu_atom". > > */ > > static int is_sysfs_pmu_core(const char *name) > > { > > char path[PATH_MAX]; > > > > if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus")) > > return 0; > > return file_available(path); > > } > > ... 
> > bool is_pmu_core(const char *name) > > { > > return !strcmp(name, "cpu") || is_sysfs_pmu_core(name); > > } > > ''' > > Maybe we should scan the directory > > [linux-next]# ll /sys/bus/event_source/devices > total 0 > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf -> ../../../devices/cpum_cf > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_sf -> ../../../devices/cpum_sf > lrwxrwxrwx 1 root root 0 Jun 2 15:11 kprobe -> ../../../devices/kprobe > lrwxrwxrwx 1 root root 0 Jun 2 15:11 software -> ../../../devices/software > lrwxrwxrwx 1 root root 0 Jun 2 15:11 tracepoint -> ../../../devices/tracepoint > lrwxrwxrwx 1 root root 0 Jun 2 15:11 uprobe -> ../../../devices/uprobe > [linux-next]# > > This directory lists the PMUs available on s390, maybe this is true for > other platform... I noticed this on an arm64 board: acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt Performance counter stats for 'ls': <not supported> armv8_cortex_a72/cycles:u/ <not supported> armv8_cortex_a53/cycles:u/ <not supported> armv8_cortex_a72/instructions:u/ <not supported> armv8_cortex_a53/instructions:u/ 0.009192788 seconds time elapsed 0.000000000 seconds user 0.009411000 seconds sys acme@roc-rk3399-pc:~/git/perf-tools-next$ root@roc-rk3399-pc:~# ls -la /sys/bus/event_source/devices total 0 drwxr-xr-x 2 root root 0 Jan 1 1970 . drwxr-xr-x 4 root root 0 Jan 1 1970 .. 
lrwxrwxrwx 1 root root 0 Jan 1 1970 armv8_cortex_a53 -> ../../../devices/armv8_cortex_a53 lrwxrwxrwx 1 root root 0 Jan 1 1970 armv8_cortex_a72 -> ../../../devices/armv8_cortex_a72 lrwxrwxrwx 1 root root 0 Jan 1 1970 breakpoint -> ../../../devices/breakpoint lrwxrwxrwx 1 root root 0 Jun 14 21:40 cs_etm -> ../../../devices/cs_etm lrwxrwxrwx 1 root root 0 Jan 1 1970 software -> ../../../devices/software lrwxrwxrwx 1 root root 0 Jan 1 1970 tracepoint -> ../../../devices/tracepoint lrwxrwxrwx 1 root root 0 Jan 1 1970 uprobe -> ../../../devices/uprobe root@roc-rk3399-pc:~# running perf test now: Linux roc-rk3399-pc 6.1.0-rc5-00123-g4dd7ff4a0311 #2 SMP PREEMPT Wed Nov 16 19:55:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux root@roc-rk3399-pc:~# perf test 1: vmlinux symtab matches kallsyms : Ok 2: Detect openat syscall event : Ok 3: Detect openat syscall event on all cpus : Ok 4: mmap interface tests : 4.1: Read samples using the mmap interface : Ok 4.2: User space counter reading of instructions : Skip (permissions) 4.3: User space counter reading of cycles : Skip (permissions) 5: Test data source output : Ok 6: Parse event definition strings : 6.1: Test event parsing : FAILED! 
6.2: Parsing of all PMU events from sysfs : Ok 6.3: Parsing of given PMU events from sysfs : Ok 6.4: Parsing of aliased events from sysfs : Skip (no aliases in sysfs) 6.5: Parsing of aliased events : Ok 6.6: Parsing of terms (event modifiers) : Ok 7: Simple expression parser : Ok 8: PERF_RECORD_* events & perf_sample fields : Ok 9: Parse perf pmu format : Ok 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 10.5: Parsing of metric thresholds with fake PMUs : Ok 11: DSO data read : Ok 12: DSO data cache : Ok 13: DSO data reopen : Ok 14: Roundtrip evsel->name : Ok 15: Parse sched tracepoints fields : Ok 16: syscalls:sys_enter_openat event fields : Ok 17: Setup struct perf_event_attr : Skip 18: Match and link multiple hists : Ok 19: 'import perf' in python : FAILED! 20: Breakpoint overflow signal handler : Skip 21: Breakpoint overflow sampling : Skip 22: Breakpoint accounting : Ok 23: Watchpoint : 23.1: Read Only Watchpoint : Ok 23.2: Write Only Watchpoint : Ok 23.3: Read / Write Watchpoint : Ok 23.4: Modify Watchpoint : ... 
acme@roc-rk3399-pc:~/git/perf-tools-next$ cat /proc/cpuinfo processor : 0 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 4 processor : 1 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 4 processor : 2 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 4 processor : 3 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd03 CPU revision : 4 processor : 4 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 2 processor : 5 BogoMIPS : 48.00 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part : 0xd08 CPU revision : 2 acme@roc-rk3399-pc:~/git/perf-tools-next$ root@roc-rk3399-pc:~# dmidecode # dmidecode 3.3 Getting SMBIOS data from sysfs. SMBIOS 3.0 present. 7 structures occupying 287 bytes. Table at 0xF0E3C020. 
Handle 0x0000, DMI type 0, 24 bytes BIOS Information Vendor: U-Boot Version: 2022.10-rc5+ Release Date: 10/01/2022 ROM Size: 64 kB Characteristics: PCI is supported BIOS is upgradeable Selectable boot is supported Targeted content distribution is supported UEFI is supported BIOS Revision: 22.10 Handle 0x0001, DMI type 1, 27 bytes System Information Manufacturer: libre-computer Product Name: roc-rk3399-pc Version: Not Specified Serial Number: b03c01a7179278b7 UUID: 63333062-3130-3761-3137-393237386237 Wake-up Type: Reserved SKU Number: Not Specified Family: Not Specified Handle 0x0002, DMI type 2, 14 bytes Base Board Information Manufacturer: libre-computer Product Name: roc-rk3399-pc Version: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Features: Board is a hosting board Location In Chassis: Not Specified Chassis Handle: 0x0000 Type: Motherboard Handle 0x0003, DMI type 3, 21 bytes Chassis Information Manufacturer: libre-computer Type: Desktop Lock: Not Present Version: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Boot-up State: Safe Power Supply State: Safe Thermal State: Safe Security Status: None OEM Information: 0x00000000 Height: Unspecified Number Of Power Cords: Unspecified Contained Elements: 0 Handle 0x0004, DMI type 4, 48 bytes Processor Information Socket Designation: Not Specified Type: Central Processor Family: Unknown Manufacturer: Unknown ID: 00 00 00 00 00 00 00 00 Version: Unknown Voltage: Unknown External Clock: Unknown Max Speed: Unknown Current Speed: Unknown Status: Unpopulated Upgrade: None L1 Cache Handle: Not Provided L2 Cache Handle: Not Provided L3 Cache Handle: Not Provided Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Characteristics: None Handle 0x0005, DMI type 32, 11 bytes System Boot Information Status: No errors detected Handle 0x0006, DMI type 127, 4 bytes End Of Table root@roc-rk3399-pc:~# ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: perf test failures in linux-next on s390 2023-06-15 14:34 ` Arnaldo Carvalho de Melo @ 2023-06-16 14:23 ` Ian Rogers 2023-06-16 14:36 ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo 0 siblings, 1 reply; 15+ messages in thread From: Ian Rogers @ 2023-06-16 14:23 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Ccing the ARM people too: > > Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu: > > On 6/14/23 16:57, Ian Rogers wrote: > > > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > > > > .... > > > > > > > > Smaller list is okay. Could you send me a zip of the sysfs > > > (/sys/devices) ? At least one issue is that the code didn't find a > > > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we > > > spoke about this before for s390 and there are >1. The issue here is > > > that the test found 0, and we're trying to use PMUs in the code now as > > > a way to sort events. There's code/comment in util/pmu.c: > > > > > > ''' > > > /** > > > * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in > > > * sysfs on some platforms like ARM or Intel hybrid. Looking for > > > * possible the cpus file in sysfs files to identify whether this is a > > > * core device. > > > * @name: The PMU name such as "cpu_atom". > > > */ > > > static int is_sysfs_pmu_core(const char *name) > > > { > > > char path[PATH_MAX]; > > > > > > if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus")) > > > return 0; > > > return file_available(path); > > > } > > > ... 
> > > bool is_pmu_core(const char *name) > > > { > > > return !strcmp(name, "cpu") || is_sysfs_pmu_core(name); > > > } > > > ''' > > > > Maybe we should scan the directory > > > > [linux-next]# ll /sys/bus/event_source/devices > > total 0 > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf -> ../../../devices/cpum_cf > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_sf -> ../../../devices/cpum_sf > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 kprobe -> ../../../devices/kprobe > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 software -> ../../../devices/software > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 tracepoint -> ../../../devices/tracepoint > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 uprobe -> ../../../devices/uprobe > > [linux-next]# > > > > This directory lists the PMUs available on s390, maybe this is true for > > other platform... > > I noticed this on an arm64 board: > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > Performance counter stats for 'ls': > > <not supported> armv8_cortex_a72/cycles:u/ > <not supported> armv8_cortex_a53/cycles:u/ > <not supported> armv8_cortex_a72/instructions:u/ > <not supported> armv8_cortex_a53/instructions:u/ I tested on a raspberry pi and perf-tools-next is working there. I suspect the issue here is the heterogeneous PMU. The cycles event is converted into a perf_event_attr with type 0 and config 0. When there are heterogeneous PMUs then we try to use the extended type to say we want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 and 10 respectively. 
With heterogeneous encodings the type in the perf_event_attr remains as 0
and the config becomes 9 << 32 and 10 << 32. I suspect your kernel is
seeing the extended type information and not handling it, hence the error.

We add in the extended type for hardware and legacy cache events in
the parse events code:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511

The addition of the extended type happens if
perf_pmus__supports_extended_type() returns true, its implementation
is:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480

bool perf_pmus__supports_extended_type(void)
{
	return perf_pmus__num_core_pmus() > 1;
}

Previously on heterogeneous ARM the extended type wouldn't be encoded
and I believe the event was opened on the PMU of the current CPU only.
This is a bug because you will not count events on all PMUs. We can
make perf_pmus__supports_extended_type return false on ARM which
should bring back the previous behavior - or do some kind of dynamic
detection using perf_event_open. We could do some kind of ARM quirk
workaround behavior, for example, I suspect
/sys/bus/event_source/devices/armv8_cortex_a53/events and
/sys/bus/event_source/devices/armv8_cortex_a72/events both contain a
cycles event. If we used a raw rather than hardware type encoding then
the wildcarding should work. Unfortunately there are many encodings
with extended type and sysfs won't have them all.
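As a rough illustration of the encoding described above, the extended type can be computed as in the following sketch. It is only a sketch: extended_hw_config, PMU_TYPE_SHIFT and HW_EVENT_MASK are names made up for this example (the two macros are meant to mirror PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK from the kernel uapi header), and the PMU type numbers 9 and 10 are the hypothetical ones used in the text, not values read from a real system.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK. */
#define PMU_TYPE_SHIFT 32
#define HW_EVENT_MASK  0xffffffffULL

/*
 * Build the config for a type 0 (hardware) event that should be counted
 * on one specific core PMU. pmu_type is the number the kernel assigned to
 * that PMU; hw_id is the generic event id, e.g. 0 for cycles.
 */
uint64_t extended_hw_config(uint32_t pmu_type, uint64_t hw_id)
{
	/* The generic event id must fit in the low 32 bits. */
	assert((hw_id & ~HW_EVENT_MASK) == 0);
	return ((uint64_t)pmu_type << PMU_TYPE_SHIFT) | hw_id;
}
```

With the example numbers, extended_hw_config(9, 0) gives 0x900000000 and extended_hw_config(10, 0) gives 0xa00000000, while the attr type stays 0 in both cases.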
Thanks, Ian > 0.009192788 seconds time elapsed > > 0.000000000 seconds user > 0.009411000 seconds sys > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ > > root@roc-rk3399-pc:~# ls -la /sys/bus/event_source/devices > total 0 > drwxr-xr-x 2 root root 0 Jan 1 1970 . > drwxr-xr-x 4 root root 0 Jan 1 1970 .. > lrwxrwxrwx 1 root root 0 Jan 1 1970 armv8_cortex_a53 -> ../../../devices/armv8_cortex_a53 > lrwxrwxrwx 1 root root 0 Jan 1 1970 armv8_cortex_a72 -> ../../../devices/armv8_cortex_a72 > lrwxrwxrwx 1 root root 0 Jan 1 1970 breakpoint -> ../../../devices/breakpoint > lrwxrwxrwx 1 root root 0 Jun 14 21:40 cs_etm -> ../../../devices/cs_etm > lrwxrwxrwx 1 root root 0 Jan 1 1970 software -> ../../../devices/software > lrwxrwxrwx 1 root root 0 Jan 1 1970 tracepoint -> ../../../devices/tracepoint > lrwxrwxrwx 1 root root 0 Jan 1 1970 uprobe -> ../../../devices/uprobe > root@roc-rk3399-pc:~# > > running perf test now: > > Linux roc-rk3399-pc 6.1.0-rc5-00123-g4dd7ff4a0311 #2 SMP PREEMPT Wed Nov 16 19:55:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux > root@roc-rk3399-pc:~# perf test > 1: vmlinux symtab matches kallsyms : Ok > 2: Detect openat syscall event : Ok > 3: Detect openat syscall event on all cpus : Ok > 4: mmap interface tests : > 4.1: Read samples using the mmap interface : Ok > 4.2: User space counter reading of instructions : Skip (permissions) > 4.3: User space counter reading of cycles : Skip (permissions) > 5: Test data source output : Ok > 6: Parse event definition strings : > 6.1: Test event parsing : FAILED! 
> 6.2: Parsing of all PMU events from sysfs : Ok > 6.3: Parsing of given PMU events from sysfs : Ok > 6.4: Parsing of aliased events from sysfs : Skip (no aliases in sysfs) > 6.5: Parsing of aliased events : Ok > 6.6: Parsing of terms (event modifiers) : Ok > 7: Simple expression parser : Ok > 8: PERF_RECORD_* events & perf_sample fields : Ok > 9: Parse perf pmu format : Ok > 10: PMU events : > 10.1: PMU event table sanity : Ok > 10.2: PMU event map aliases : Ok > 10.3: Parsing of PMU event table metrics : Ok > 10.4: Parsing of PMU event table metrics with fake PMUs : Ok > 10.5: Parsing of metric thresholds with fake PMUs : Ok > 11: DSO data read : Ok > 12: DSO data cache : Ok > 13: DSO data reopen : Ok > 14: Roundtrip evsel->name : Ok > 15: Parse sched tracepoints fields : Ok > 16: syscalls:sys_enter_openat event fields : Ok > 17: Setup struct perf_event_attr : Skip > 18: Match and link multiple hists : Ok > 19: 'import perf' in python : FAILED! > 20: Breakpoint overflow signal handler : Skip > 21: Breakpoint overflow sampling : Skip > 22: Breakpoint accounting : Ok > 23: Watchpoint : > 23.1: Read Only Watchpoint : Ok > 23.2: Write Only Watchpoint : Ok > 23.3: Read / Write Watchpoint : Ok > 23.4: Modify Watchpoint : > ... 
^ permalink raw reply [flat|nested] 15+ messages in thread
* Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 14:23 ` Ian Rogers @ 2023-06-16 14:36 ` Arnaldo Carvalho de Melo 2023-06-16 14:44 ` Arnaldo Carvalho de Melo 2023-06-19 10:04 ` Thomas Richter 0 siblings, 2 replies; 15+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-06-16 14:36 UTC (permalink / raw) To: Ian Rogers Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon Em Fri, Jun 16, 2023 at 07:23:30AM -0700, Ian Rogers escreveu: > On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Ccing the ARM people too: > > Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu: > > > On 6/14/23 16:57, Ian Rogers wrote: > > > > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: > > > > bool is_pmu_core(const char *name) > > > > { > > > > return !strcmp(name, "cpu") || is_sysfs_pmu_core(name); > > > > } > > > Maybe we should scan the directory > > > [linux-next]# ll /sys/bus/event_source/devices > > > total 0 > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf -> ../../../devices/cpum_cf > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_sf -> ../../../devices/cpum_sf > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 kprobe -> ../../../devices/kprobe > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 software -> ../../../devices/software > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 tracepoint -> ../../../devices/tracepoint > > > lrwxrwxrwx 1 root root 0 Jun 2 15:11 uprobe -> ../../../devices/uprobe > > > [linux-next]# > > > This directory lists the PMUs available on s390, maybe this is true for > > > other platform... 
> > I noticed this on an arm64 board: > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > Performance counter stats for 'ls': > > <not supported> armv8_cortex_a72/cycles:u/ > > <not supported> armv8_cortex_a53/cycles:u/ > > <not supported> armv8_cortex_a72/instructions:u/ > > <not supported> armv8_cortex_a53/instructions:u/ > I tested on a raspberry pi and perf-tools-next is working there. I > suspect the issue here is the heterogeneous PMU. The cycles event is > converted into a perf_event_attr with type 0 and config 0. When there > are heterogeneous PMUs then we try to use the extended type to say we > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 > and 10 respectively. With heterogeneous encodings the type in the The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed): root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1 Using CPUID 0x00000000410fd080 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: size 136 config 0x800000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -2 Warning: cycles event is not supported by the kernel. 
------------------------------------------------------------ perf_event_attr: size 136 config 0x700000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -2 Warning: cycles event is not supported by the kernel. failed to read counter cycles failed to read counter cycles Performance counter stats for 'sleep 1': <not supported> armv8_cortex_a72/cycles/ <not supported> armv8_cortex_a53/cycles/ 1.011406938 seconds time elapsed 0.000000000 seconds user 0.010886000 seconds sys root@roc-rk3399-pc:~# > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 << > 32. I suspect your kernel is seeing the extended type information and > not handling it, hence the error. looks this is the case indeed > We add in the extended type for hardware and legacy cache events in > the parse events code: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435 > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239 > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478 > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511 > > The addition of the extended type happens if > perf_pmus__supports_extended_type() returns true, its implementation > is: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480 > bool perf_pmus__supports_extended_type(void) > { > return perf_pmus__num_core_pmus() > 1; > } > > Previously on heterogeneous ARM the extended type wouldn't be encoded > and I believe the event was opened on the PMU of 
the current CPU only. I think that is the case, haven't checked so far tho. > This is a bug because you will not count events on all PMUs. We can > make perf_pmus__supports_extended_type return false on ARM which > should bring back the previous behavior - or do some kind of dynamic simplest first step, trying it. > detection using perf_event_open. We could do some kind of ARM quirk > workaround behavior, for example, I suspect > /sys/bus/event_source/devices/armv8_cortex_a53/events and > /sys/bus/event_source/devices/armv8_cortex_a72/events both contain a > cycles event. If we used a raw rather than hardware type encoding then > the wildcarding should work. Unfortunately there are many encodings > with extended type and sysfs won't have them all. > > Thanks, > Ian -- - Arnaldo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 14:36 ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo @ 2023-06-16 14:44 ` Arnaldo Carvalho de Melo 2023-06-16 16:28 ` Ian Rogers 2023-06-19 10:04 ` Thomas Richter 1 sibling, 1 reply; 15+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-06-16 14:44 UTC (permalink / raw) To: Ian Rogers Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu: > > > I noticed this on an arm64 board: > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls > > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > > > Performance counter stats for 'ls': > > > > <not supported> armv8_cortex_a72/cycles:u/ > > > <not supported> armv8_cortex_a53/cycles:u/ > > > <not supported> armv8_cortex_a72/instructions:u/ > > > <not supported> armv8_cortex_a53/instructions:u/ > > > I tested on a raspberry pi and perf-tools-next is working there. I > > suspect the issue here is the heterogeneous PMU. The cycles event is > > converted into a perf_event_attr with type 0 and config 0. When there > > are heterogeneous PMUs then we try to use the extended type to say we > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 > > and 10 respectively. 
With heterogeneous encodings the type in the > > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed): > > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1 > Using CPUID 0x00000000410fd080 > Control descriptor is not initialized > ------------------------------------------------------------ > perf_event_attr: > size 136 > config 0x800000000 > ------------------------------------------------------------ > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -2 > Warning: > cycles event is not supported by the kernel. > ------------------------------------------------------------ > perf_event_attr: > size 136 > config 0x700000000 > ------------------------------------------------------------ > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -2 > Warning: > cycles event is not supported by the kernel. > failed to read counter cycles > failed to read counter cycles > > Performance counter stats for 'sleep 1': > > <not supported> armv8_cortex_a72/cycles/ > <not supported> armv8_cortex_a53/cycles/ > > 1.011406938 seconds time elapsed > > 0.000000000 seconds user > 0.010886000 seconds sys > > > root@roc-rk3399-pc:~# > > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 << > > 32. I suspect your kernel is seeing the extended type information and > > not handling it, hence the error. 
> > looks this is the case indeed > > > We add in the extended type for hardware and legacy cache events in > > the parse events code: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435 > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239 > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478 > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511 > > > > The addition of the extended type happens if > > perf_pmus__supports_extended_type() returns true, its implementation > > is: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480 > > bool perf_pmus__supports_extended_type(void) > > { > > return perf_pmus__num_core_pmus() > 1; > > } > > > > Previously on heterogeneous ARM the extended type wouldn't be encoded > > and I believe the event was opened on the PMU of the current CPU only. > > I think that is the case, haven't checked so far tho. > > > This is a bug because you will not count events on all PMUs. We can > > make perf_pmus__supports_extended_type return false on ARM which > > should bring back the previous behavior - or do some kind of dynamic > > simplest first step, trying it. 
Spot on:

acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls
COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  rust  samples  scripts  security  sound  tools  usr  virt

 Performance counter stats for 'ls':

      9.01 msec task-clock:u        #    0.401 CPUs utilized
         0      context-switches:u  #    0.000 /sec
         0      cpu-migrations:u    #    0.000 /sec
        84      page-faults:u       #    9.320 K/sec
   1188641      cycles:u            #    0.132 GHz
    601132      instructions:u      #    0.51  insn per cycle
     64768      branches:u          #    7.186 M/sec
     11680      branch-misses:u     #   18.03% of all branches

   0.022502514 seconds time elapsed

   0.000000000 seconds user
   0.022946000 seconds sys

acme@roc-rk3399-pc:~/git/perf-tools-next$
acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls
COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ]
acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist
cycles:Pu
dummy:HGu
acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan
	libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000)
acme@roc-rk3399-pc:~/git/perf-tools-next$

With the following patch. Do you want to submit it or may I add it as is
using an edited discussion in this thread as the commit log message?

Thanks!

- Arnaldo

diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index a2032c1b7644..9af961105a64 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
 
 bool perf_pmus__supports_extended_type(void)
 {
+#if defined(__aarch64__)
+	// We can't use the extended type information where the PMU number
+	// is encoded in the upper perf_event_attr::config bits (<< 32).
+	return false;
+#else
 	return perf_pmus__num_core_pmus() > 1;
+#endif
 }
 
 struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)

^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 14:44 ` Arnaldo Carvalho de Melo @ 2023-06-16 16:28 ` Ian Rogers 2023-06-16 16:53 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 15+ messages in thread From: Ian Rogers @ 2023-06-16 16:28 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon On Fri, Jun 16, 2023 at 7:44 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu: > > > > I noticed this on an arm64 board: > > > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls > > > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > > > > > Performance counter stats for 'ls': > > > > > > <not supported> armv8_cortex_a72/cycles:u/ > > > > <not supported> armv8_cortex_a53/cycles:u/ > > > > <not supported> armv8_cortex_a72/instructions:u/ > > > > <not supported> armv8_cortex_a53/instructions:u/ > > > > > I tested on a raspberry pi and perf-tools-next is working there. I > > > suspect the issue here is the heterogeneous PMU. The cycles event is > > > converted into a perf_event_attr with type 0 and config 0. When there > > > are heterogeneous PMUs then we try to use the extended type to say we > > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say > > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 > > > and 10 respectively. 
With heterogeneous encodings the type in the > > > > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed): > > > > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1 > > Using CPUID 0x00000000410fd080 > > Control descriptor is not initialized > > ------------------------------------------------------------ > > perf_event_attr: > > size 136 > > config 0x800000000 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > > sys_perf_event_open failed, error -2 > > Warning: > > cycles event is not supported by the kernel. > > ------------------------------------------------------------ > > perf_event_attr: > > size 136 > > config 0x700000000 > > ------------------------------------------------------------ > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > > sys_perf_event_open failed, error -2 > > Warning: > > cycles event is not supported by the kernel. > > failed to read counter cycles > > failed to read counter cycles > > > > Performance counter stats for 'sleep 1': > > > > <not supported> armv8_cortex_a72/cycles/ > > <not supported> armv8_cortex_a53/cycles/ > > > > 1.011406938 seconds time elapsed > > > > 0.000000000 seconds user > > 0.010886000 seconds sys > > > > > > root@roc-rk3399-pc:~# > > > > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 << > > > 32. I suspect your kernel is seeing the extended type information and > > > not handling it, hence the error. 
> > > > looks this is the case indeed > > > > > We add in the extended type for hardware and legacy cache events in > > > the parse events code: > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435 > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239 > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478 > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511 > > > > > > The addition of the extended type happens if > > > perf_pmus__supports_extended_type() returns true, its implementation > > > is: > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480 > > > bool perf_pmus__supports_extended_type(void) > > > { > > > return perf_pmus__num_core_pmus() > 1; > > > } > > > > > > Previously on heterogeneous ARM the extended type wouldn't be encoded > > > and I believe the event was opened on the PMU of the current CPU only. > > > > I think that is the case, haven't checked so far tho. > > > > > This is a bug because you will not count events on all PMUs. We can > > > make perf_pmus__supports_extended_type return false on ARM which > > > should bring back the previous behavior - or do some kind of dynamic > > > > simplest first step, trying it. 
> > Spot on: > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net rust samples scripts security sound tools usr virt > > Performance counter stats for 'ls': > > 9.01 msec task-clock:u # 0.401 CPUs utilized > 0 context-switches:u # 0.000 /sec > 0 cpu-migrations:u # 0.000 /sec > 84 page-faults:u # 9.320 K/sec > 1188641 cycles:u # 0.132 GHz > 601132 instructions:u # 0.51 insn per cycle > 64768 branches:u # 7.186 M/sec > 11680 branch-misses:u # 18.03% of all branches > > 0.022502514 seconds time elapsed > > 0.000000000 seconds user > 0.022946000 seconds sys > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ] > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist > cycles:Pu > dummy:HGu > acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan > libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000) > acme@roc-rk3399-pc:~/git/perf-tools-next$ > > With the following patch. Do you want to submit it or may I add it as is > using an edited discussion in this thread as the commit log message? > > Thanks! > > - Arnaldo Hi Arnaldo, presumably with the #ifdef you just get 1 PMU - shame. 
I think rather than do an #ifdef we can do something like call is_event_supported:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
so:

bool perf_pmus__supports_extended_type(void)
{
	struct perf_pmu *pmu = NULL;

	if (perf_pmus__num_core_pmus() <= 1)
		return false;

	/* Probe a cycles event with the PMU type encoded in the upper config bits. */
	while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
		return is_event_supported(PERF_TYPE_HARDWARE,
					  PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
	}

	return false;
}

We probably don't want to do this for each call of
perf_pmus__supports_extended_type so you could use a static and
pthread_once, etc.

This would mean if this regression is introduced elsewhere than ARM it
will self heal. It will also mean that when ARM supports extended types
in the kernel, they will get the normal heterogeneous behavior.

Thanks,
Ian

> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index a2032c1b7644..9af961105a64 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
>
> bool perf_pmus__supports_extended_type(void)
> {
> +#if defined(__aarch64__)
> + // We can't use the extended type information where the PMU number
> + // is encoded in the upper perf_event_attr::type bits. (<< 32).
> + return false;
> +#else
> return perf_pmus__num_core_pmus() > 1;
> +#endif
> }
>
> struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)

^ permalink raw reply [flat|nested] 15+ messages in thread
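The "static and pthread_once" idea mentioned above can be sketched in isolation as follows. This is only an illustration under stated assumptions: probe_extended_type() is a hypothetical stand-in for the real probe, and the probe_calls counter exists purely to make the caching visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Counts how often the probe runs; only here to make the caching visible. */
static int probe_calls;

/* Hypothetical stand-in for the real, comparatively expensive probe. */
static bool probe_extended_type(void)
{
	probe_calls++;
	return true;
}

static bool cached_result;
static pthread_once_t probe_once = PTHREAD_ONCE_INIT;

/* Init routine handed to pthread_once(); it runs at most once per process. */
static void init_cached_result(void)
{
	cached_result = probe_extended_type();
}

/* First call runs the probe, later calls return the stored answer. */
static bool supports_extended_type_cached(void)
{
	int rc = pthread_once(&probe_once, init_cached_result);

	assert(rc == 0); /* pthread_once() returns 0 on success */
	(void)rc;        /* avoid an unused-variable warning with NDEBUG */
	return cached_result;
}
```

Calling supports_extended_type_cached() repeatedly, including from several threads, leaves probe_calls at 1, so the kernel would be asked only once per process.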
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 16:28 ` Ian Rogers @ 2023-06-16 16:53 ` Arnaldo Carvalho de Melo 2023-06-16 21:47 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 15+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-06-16 16:53 UTC (permalink / raw) To: Ian Rogers Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon Em Fri, Jun 16, 2023 at 09:28:12AM -0700, Ian Rogers escreveu: > On Fri, Jun 16, 2023 at 7:44 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu: > > > > > I noticed this on an arm64 board: > > > > > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls > > > > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > > > > > > > Performance counter stats for 'ls': > > > > > > > > <not supported> armv8_cortex_a72/cycles:u/ > > > > > <not supported> armv8_cortex_a53/cycles:u/ > > > > > <not supported> armv8_cortex_a72/instructions:u/ > > > > > <not supported> armv8_cortex_a53/instructions:u/ > > > > > > > I tested on a raspberry pi and perf-tools-next is working there. I > > > > suspect the issue here is the heterogeneous PMU. The cycles event is > > > > converted into a perf_event_attr with type 0 and config 0. When there > > > > are heterogeneous PMUs then we try to use the extended type to say we > > > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say > > > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 > > > > and 10 respectively. 
With heterogeneous encodings the type in the > > > > > > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed): > > > > > > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1 > > > Using CPUID 0x00000000410fd080 > > > Control descriptor is not initialized > > > ------------------------------------------------------------ > > > perf_event_attr: > > > size 136 > > > config 0x800000000 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > > > sys_perf_event_open failed, error -2 > > > Warning: > > > cycles event is not supported by the kernel. > > > ------------------------------------------------------------ > > > perf_event_attr: > > > size 136 > > > config 0x700000000 > > > ------------------------------------------------------------ > > > sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8 > > > sys_perf_event_open failed, error -2 > > > Warning: > > > cycles event is not supported by the kernel. > > > failed to read counter cycles > > > failed to read counter cycles > > > > > > Performance counter stats for 'sleep 1': > > > > > > <not supported> armv8_cortex_a72/cycles/ > > > <not supported> armv8_cortex_a53/cycles/ > > > > > > 1.011406938 seconds time elapsed > > > > > > 0.000000000 seconds user > > > 0.010886000 seconds sys > > > > > > > > > root@roc-rk3399-pc:~# > > > > > > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 << > > > > 32. I suspect your kernel is seeing the extended type information and > > > > not handling it, hence the error. 
> > > > > > looks this is the case indeed > > > > > > > We add in the extended type for hardware and legacy cache events in > > > > the parse events code: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435 > > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239 > > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478 > > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511 > > > > > > > > The addition of the extended type happens if > > > > perf_pmus__supports_extended_type() returns true, its implementation > > > > is: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480 > > > > bool perf_pmus__supports_extended_type(void) > > > > { > > > > return perf_pmus__num_core_pmus() > 1; > > > > } > > > > > > > > Previously on heterogeneous ARM the extended type wouldn't be encoded > > > > and I believe the event was opened on the PMU of the current CPU only. > > > > > > I think that is the case, haven't checked so far tho. > > > > > > > This is a bug because you will not count events on all PMUs. We can > > > > make perf_pmus__supports_extended_type return false on ARM which > > > > should bring back the previous behavior - or do some kind of dynamic > > > > > > simplest first step, trying it. 
> > > > Spot on: > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net rust samples scripts security sound tools usr virt > > > > Performance counter stats for 'ls': > > > > 9.01 msec task-clock:u # 0.401 CPUs utilized > > 0 context-switches:u # 0.000 /sec > > 0 cpu-migrations:u # 0.000 /sec > > 84 page-faults:u # 9.320 K/sec > > 1188641 cycles:u # 0.132 GHz > > 601132 instructions:u # 0.51 insn per cycle > > 64768 branches:u # 7.186 M/sec > > 11680 branch-misses:u # 18.03% of all branches > > > > 0.022502514 seconds time elapsed > > > > 0.000000000 seconds user > > 0.022946000 seconds sys > > > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls > > COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ] > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist > > cycles:Pu > > dummy:HGu > > acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan > > libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000) > > acme@roc-rk3399-pc:~/git/perf-tools-next$ > > > > With the following patch. Do you want to submit it or may I add it as is > > using an edited discussion in this thread as the commit log message? > > > > Thanks! > > > > - Arnaldo > > Hi Arnaldo, > > presumably with the #ifdef you just get 1 PMU - shame. 
I think rather > than do an #ifdef we can do something like call is_event_supported: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 > so: > bool perf_pmus__supports_extended_type(void) > struct perf_pmu *pmu = NULL; > if (perf_pmus__num_core_pmus() <= 1) > return false; > while((pmu = perf_pmus__scan_core(pmu) != NULL) { > return is_event_supported(PERF_TYPE_HARDWARE, > PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT); > } > return false; > } > We probably don't want to do this for each call of > perf_pmus__supports_extended_type so you could use a static and > pthread_once, etc. > > This would mean if this regression is introduced elsewhere than ARM it > will self heal. It will also mean that when ARM support extended types > in the kernel, they will get the normal heterogeneous behavior. That looks better, I'll try it when I get back to my office. - Arnaldo > Thanks, > Ian > > > diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c > > index a2032c1b7644..9af961105a64 100644 > > --- a/tools/perf/util/pmus.c > > +++ b/tools/perf/util/pmus.c > > @@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void) > > > > bool perf_pmus__supports_extended_type(void) > > { > > +#if defined(__aarch64__) > > + // We can't use the extended type information where the PMU number > > + // is encoded in the upper perf_event_attr::type bits. (<< 32). > > + return false; > > +#else > > return perf_pmus__num_core_pmus() > 1; > > +#endif > > } > > > > struct perf_pmu *evsel__find_pmu(const struct evsel *evsel) -- - Arnaldo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 16:53 ` Arnaldo Carvalho de Melo @ 2023-06-16 21:47 ` Arnaldo Carvalho de Melo 2023-06-16 22:09 ` Ian Rogers 0 siblings, 1 reply; 15+ messages in thread From: Arnaldo Carvalho de Melo @ 2023-06-16 21:47 UTC (permalink / raw) To: Ian Rogers Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon Em Fri, Jun 16, 2023 at 01:53:41PM -0300, Arnaldo Carvalho de Melo escreveu: > > presumably with the #ifdef you just get 1 PMU - shame. I think rather > > than do an #ifdef we can do something like call is_event_supported: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 > > so: > > bool perf_pmus__supports_extended_type(void) > > struct perf_pmu *pmu = NULL; > > if (perf_pmus__num_core_pmus() <= 1) > > return false; > > while((pmu = perf_pmus__scan_core(pmu) != NULL) { > > return is_event_supported(PERF_TYPE_HARDWARE, > > PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT); > > } > > return false; > > } > > We probably don't want to do this for each call of > > perf_pmus__supports_extended_type so you could use a static and > > pthread_once, etc. > > > > This would mean if this regression is introduced elsewhere than ARM it > > will self heal. It will also mean that when ARM support extended types > > in the kernel, they will get the normal heterogeneous behavior. > > That looks better, I'll try it when I get back to my office. End result, Ack? 
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c index a2032c1b7644..d891d72c824e 100644 --- a/tools/perf/util/pmus.c +++ b/tools/perf/util/pmus.c @@ -4,6 +4,7 @@ #include <subcmd/pager.h> #include <sys/types.h> #include <dirent.h> +#include <pthread.h> #include <string.h> #include <unistd.h> #include "debug.h" @@ -492,9 +493,35 @@ int perf_pmus__num_core_pmus(void) return count; } +static bool __perf_pmus__supports_extended_type(void) +{ + struct perf_pmu *pmu = NULL; + + if (perf_pmus__num_core_pmus() <= 1) + return false; + + while ((pmu = perf_pmus__scan_core(pmu)) != NULL) { + if (!is_event_supported(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT))) + return false; + } + + return true; +} + +static bool perf_pmus__do_support_extended_type; + +static void perf_pmus__init_supports_extended_type(void) +{ + perf_pmus__do_support_extended_type = __perf_pmus__supports_extended_type(); +} + bool perf_pmus__supports_extended_type(void) { - return perf_pmus__num_core_pmus() > 1; + static pthread_once_t extended_type_once = PTHREAD_ONCE_INIT; + + pthread_once(&extended_type_once, perf_pmus__init_supports_extended_type); + + return perf_pmus__do_support_extended_type; } struct perf_pmu *evsel__find_pmu(const struct evsel *evsel) diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c index 7a5f87392720..a7566edc86a3 100644 --- a/tools/perf/util/print-events.c +++ b/tools/perf/util/print-events.c @@ -229,7 +229,7 @@ void print_sdt_events(const struct print_callbacks *print_cb, void *print_state) strlist__delete(sdtlist); } -static bool is_event_supported(u8 type, u64 config) +bool is_event_supported(u8 type, u64 config) { bool ret = true; int open_return; diff --git a/tools/perf/util/print-events.h b/tools/perf/util/print-events.h index e75a3d7e3fe3..d7fab411e75c 100644 --- a/tools/perf/util/print-events.h +++ b/tools/perf/util/print-events.h @@ -3,6 +3,7 @@ #define __PERF_PRINT_EVENTS_H #include 
<linux/perf_event.h> +#include <linux/types.h> #include <stdbool.h> struct event_symbol; @@ -36,5 +37,6 @@ void print_symbol_events(const struct print_callbacks *print_cb, void *print_sta unsigned int max); void print_tool_events(const struct print_callbacks *print_cb, void *print_state); void print_tracepoint_events(const struct print_callbacks *print_cb, void *print_state); +bool is_event_supported(u8 type, u64 config); #endif /* __PERF_PRINT_EVENTS_H */ ^ permalink raw reply related [flat|nested] 15+ messages in thread
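The patch above reports extended type support only when there is more than one core PMU and every one of them accepts a probe event. A minimal sketch of that decision follows, with hypothetical names throughout: the probe is passed in as a function pointer in place of is_event_supported(), and the literals 0 and 32 are assumed to correspond to PERF_TYPE_HARDWARE / PERF_COUNT_HW_CPU_CYCLES and PERF_PMU_TYPE_SHIFT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe signature: true if the kernel accepts the event. */
typedef bool (*probe_fn)(uint32_t type, uint64_t config);

/*
 * Extended types count as usable only when there is more than one core PMU
 * and the probe succeeds for each of them.
 */
static bool all_core_pmus_accept_ext_type(const uint32_t *pmu_types, size_t n,
					   probe_fn probe)
{
	assert(probe != NULL);

	if (n <= 1)
		return false;

	for (size_t i = 0; i < n; i++) {
		/* cycles (id 0) with the PMU type in the upper 32 bits */
		uint64_t config = 0 | ((uint64_t)pmu_types[i] << 32);

		if (!probe(0 /* hardware event type */, config))
			return false;
	}
	return true;
}

/* Illustrative probe modelling a kernel that rejects the upper config bits. */
static bool probe_rejects_upper_bits(uint32_t type, uint64_t config)
{
	(void)type;
	return (config >> 32) == 0;
}

/* Illustrative probe modelling a kernel that accepts every request. */
static bool probe_accepts_everything(uint32_t type, uint64_t config)
{
	(void)type;
	(void)config;
	return true;
}
```

With the PMU types 8 and 7 from the rk3399 trace, the rejecting probe makes the function return false, which corresponds to falling back to the non-extended encoding.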
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 21:47 ` Arnaldo Carvalho de Melo @ 2023-06-16 22:09 ` Ian Rogers 0 siblings, 0 replies; 15+ messages in thread From: Ian Rogers @ 2023-06-16 22:09 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon On Fri, Jun 16, 2023 at 2:47 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Fri, Jun 16, 2023 at 01:53:41PM -0300, Arnaldo Carvalho de Melo escreveu: > > > presumably with the #ifdef you just get 1 PMU - shame. I think rather > > > than do an #ifdef we can do something like call is_event_supported: > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232 > > > so: > > > bool perf_pmus__supports_extended_type(void) > > > struct perf_pmu *pmu = NULL; > > > if (perf_pmus__num_core_pmus() <= 1) > > > return false; > > > while((pmu = perf_pmus__scan_core(pmu) != NULL) { > > > return is_event_supported(PERF_TYPE_HARDWARE, > > > PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT); > > > } > > > return false; > > > } > > > We probably don't want to do this for each call of > > > perf_pmus__supports_extended_type so you could use a static and > > > pthread_once, etc. > > > > > > This would mean if this regression is introduced elsewhere than ARM it > > > will self heal. It will also mean that when ARM support extended types > > > in the kernel, they will get the normal heterogeneous behavior. > > > > That looks better, I'll try it when I get back to my office. > > End result, Ack? 
> Acked-by: Ian Rogers <irogers@google.com> Thanks, Ian > diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c > index a2032c1b7644..d891d72c824e 100644 > --- a/tools/perf/util/pmus.c > +++ b/tools/perf/util/pmus.c > @@ -4,6 +4,7 @@ > #include <subcmd/pager.h> > #include <sys/types.h> > #include <dirent.h> > +#include <pthread.h> > #include <string.h> > #include <unistd.h> > #include "debug.h" > @@ -492,9 +493,35 @@ int perf_pmus__num_core_pmus(void) > return count; > } > > +static bool __perf_pmus__supports_extended_type(void) > +{ > + struct perf_pmu *pmu = NULL; > + > + if (perf_pmus__num_core_pmus() <= 1) > + return false; > + > + while ((pmu = perf_pmus__scan_core(pmu)) != NULL) { > + if (!is_event_supported(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT))) > + return false; > + } > + > + return true; > +} > + > +static bool perf_pmus__do_support_extended_type; > + > +static void perf_pmus__init_supports_extended_type(void) > +{ > + perf_pmus__do_support_extended_type = __perf_pmus__supports_extended_type(); > +} > + > bool perf_pmus__supports_extended_type(void) > { > - return perf_pmus__num_core_pmus() > 1; > + static pthread_once_t extended_type_once = PTHREAD_ONCE_INIT; > + > + pthread_once(&extended_type_once, perf_pmus__init_supports_extended_type); > + > + return perf_pmus__do_support_extended_type; > } > > struct perf_pmu *evsel__find_pmu(const struct evsel *evsel) > diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c > index 7a5f87392720..a7566edc86a3 100644 > --- a/tools/perf/util/print-events.c > +++ b/tools/perf/util/print-events.c > @@ -229,7 +229,7 @@ void print_sdt_events(const struct print_callbacks *print_cb, void *print_state) > strlist__delete(sdtlist); > } > > -static bool is_event_supported(u8 type, u64 config) > +bool is_event_supported(u8 type, u64 config) > { > bool ret = true; > int open_return; > diff --git a/tools/perf/util/print-events.h 
b/tools/perf/util/print-events.h > index e75a3d7e3fe3..d7fab411e75c 100644 > --- a/tools/perf/util/print-events.h > +++ b/tools/perf/util/print-events.h > @@ -3,6 +3,7 @@ > #define __PERF_PRINT_EVENTS_H > > #include <linux/perf_event.h> > +#include <linux/types.h> > #include <stdbool.h> > > struct event_symbol; > @@ -36,5 +37,6 @@ void print_symbol_events(const struct print_callbacks *print_cb, void *print_sta > unsigned int max); > void print_tool_events(const struct print_callbacks *print_cb, void *print_state); > void print_tracepoint_events(const struct print_callbacks *print_cb, void *print_state); > +bool is_event_supported(u8 type, u64 config); > > #endif /* __PERF_PRINT_EVENTS_H */ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390 2023-06-16 14:36 ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo 2023-06-16 14:44 ` Arnaldo Carvalho de Melo @ 2023-06-19 10:04 ` Thomas Richter 1 sibling, 0 replies; 15+ messages in thread From: Thomas Richter @ 2023-06-19 10:04 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers Cc: linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry, Will Deacon On 6/16/23 16:36, Arnaldo Carvalho de Melo wrote: > Em Fri, Jun 16, 2023 at 07:23:30AM -0700, Ian Rogers escreveu: >> On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: >>> Ccing the ARM people too: >>> Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu: >>>> On 6/14/23 16:57, Ian Rogers wrote: >>>>> On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote: >>>>> bool is_pmu_core(const char *name) >>>>> { >>>>> return !strcmp(name, "cpu") || is_sysfs_pmu_core(name); >>>>> } > >>>> Maybe we should scan the directory > >>>> [linux-next]# ll /sys/bus/event_source/devices >>>> total 0 >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf -> ../../../devices/cpum_cf >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 cpum_sf -> ../../../devices/cpum_sf >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 kprobe -> ../../../devices/kprobe >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 software -> ../../../devices/software >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 tracepoint -> ../../../devices/tracepoint >>>> lrwxrwxrwx 1 root root 0 Jun 2 15:11 uprobe -> ../../../devices/uprobe >>>> [linux-next]# > >>>> This directory lists the PMUs available on s390, maybe this is true for >>>> other platform... 
> >>> I noticed this on an arm64 board: > >>> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls >>> COPYING CREDITS Documentation Kbuild Kconfig LICENSES MAINTAINERS Makefile README arch block certs crypto drivers fs include init io_uring ipc kernel lib mm net perf.data rust samples scripts security sound tools usr virt > >>> Performance counter stats for 'ls': > >>> <not supported> armv8_cortex_a72/cycles:u/ >>> <not supported> armv8_cortex_a53/cycles:u/ >>> <not supported> armv8_cortex_a72/instructions:u/ >>> <not supported> armv8_cortex_a53/instructions:u/ > >> I tested on a raspberry pi and perf-tools-next is working there. I >> suspect the issue here is the heterogeneous PMU. The cycles event is >> converted into a perf_event_attr with type 0 and config 0. When there >> are heterogeneous PMUs then we try to use the extended type to say we >> want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say >> the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9 >> and 10 respectively. 
With heterogeneous encodings the type in the
>
> The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed):
>
> root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> Using CPUID 0x00000000410fd080
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> size 136
> config 0x800000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 13885 cpu -1 group_fd -1 flags 0x8
> sys_perf_event_open failed, error -2
> Warning:

On s390, with the above patch applied and the latest git pull of linux-next from this morning, I get this result:

# ./perf test -F 6
6: Parse event definition strings :
6.1: Test event parsing : Ok
6.2: Parsing of all PMU events from sysfs : Ok
6.3: Parsing of given PMU events from sysfs : Ok
6.4: Parsing of aliased events from sysfs : Skip (no aliases in sysfs)
6.5: Parsing of aliased events : Ok
6.6: Parsing of terms (event modifiers) : Ok
#

However, the perf_event_attr::config member does not change, as can be seen in this trace:

# ./perf stat -e cycles -vv true
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
size 136
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 6510 cpu -1 group_fd -1 flags 0x8 = 3
cycles: -1: 2646065 510719 510719
cycles: 2646065 510719 510719

Performance counter stats for 'true':

2,646,065 cycles

0.002084266 seconds time elapsed

0.000052000 seconds user
0.002107000 seconds sys
#

Thanks for fixing this...
-- Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany -- Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: David Faller Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 ^ permalink raw reply [flat|nested] 15+ messages in thread
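The directory scan suggested in the quoted part of this message can be sketched with plain opendir()/readdir(). This is a rough illustration, not perf's implementation: the function name is hypothetical, and it only lists the entries without deciding which of them are core PMUs.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/*
 * Print each entry under dir (normally /sys/bus/event_source/devices) and
 * return how many were found, or -1 if the directory cannot be opened.
 */
static int list_event_source_pmus(const char *dir)
{
	DIR *d;
	struct dirent *ent;
	int count = 0;

	assert(dir != NULL);

	d = opendir(dir);
	if (d == NULL)
		return -1;

	while ((ent = readdir(d)) != NULL) {
		if (ent->d_name[0] == '.')
			continue; /* skip ".", ".." and hidden entries */
		printf("%s\n", ent->d_name);
		count++;
	}
	closedir(d);
	return count;
}
```

On the s390 system shown in the quoted listing, calling it with "/sys/bus/event_source/devices" would be expected to print names such as cpum_cf, cpum_sf and software. A caller would still need a rule for picking the core PMUs out of that list.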
end of thread, other threads:[~2023-06-19 10:04 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-06-13 12:54 perf test failures in linux-next on s390 Thomas Richter 2023-06-13 14:32 ` Ian Rogers 2023-06-14 8:31 ` Thomas Richter 2023-06-14 14:57 ` Ian Rogers 2023-06-15 8:57 ` Thomas Richter 2023-06-15 9:39 ` Thomas Richter 2023-06-15 14:34 ` Arnaldo Carvalho de Melo 2023-06-16 14:23 ` Ian Rogers 2023-06-16 14:36 ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo 2023-06-16 14:44 ` Arnaldo Carvalho de Melo 2023-06-16 16:28 ` Ian Rogers 2023-06-16 16:53 ` Arnaldo Carvalho de Melo 2023-06-16 21:47 ` Arnaldo Carvalho de Melo 2023-06-16 22:09 ` Ian Rogers 2023-06-19 10:04 ` Thomas Richter