From: Thomas Richter <tmricht@linux.ibm.com>
To: Ian Rogers <irogers@google.com>
Cc: "linux-perf-use." <linux-perf-users@vger.kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Sumanth Korikkar <sumanthk@linux.ibm.com>
Subject: Re: perf test failures in linux-next on s390
Date: Wed, 14 Jun 2023 10:31:55 +0200
Message-ID: <e7f0930a-a9fc-d768-a472-bd9af6fafdf5@linux.ibm.com>
In-Reply-To: <CAP-5=fU+0VXckQiq3E8yqaySNZ+-DDZahEd1OY0uKPWnfFsafg@mail.gmail.com>
On 6/13/23 16:32, Ian Rogers wrote:
> On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>>
>> Hi all,
>>
>> I have run the perf test suite on the current 6.4rc6 kernel and see just one error:
>> # ./perf test 2>&1 | fgrep FAILED
>> fgrep: warning: fgrep is obsolescent; using grep -F
>> 42.3: BPF prologue generation : FAILED!
>> #
>>
>> However when I download the linux-next tree and build kernel and perf
>> tool with the same kernel config file, I get a bunch of failing test cases,
>> many with perf tool dumping core:
>>
>> # perf test 2>&1 | fgrep FAILED
>> fgrep: warning: fgrep is obsolescent; using grep -F
>> 6.1: Test event parsing : FAILED!
>> 10.3: Parsing of PMU event table metrics : FAILED!
>> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>> 17: Setup struct perf_event_attr : FAILED!
>> 24: Number of exit events of a simple workload : FAILED! core-dump
>> 28: Use a dummy software event to keep tracking : FAILED!
>> 35: Track with sched_switch : FAILED!
>> 42.3: BPF prologue generation : FAILED!
>> 66: Parse and process metrics : FAILED!
>> 68: Event expansion for cgroups : FAILED!
>> 69.2: Perf time to TSC : FAILED! core-dump
>> 74: build id cache operations : FAILED! core-dump
>> 81: kernel lock contention analysis test : FAILED!
>> 86: Zstd perf.data compression/decompression : FAILED! core-dump
>> 87: perf record tests : FAILED! core-dump
>> 94: perf all metricgroups test : FAILED!
>> 95: perf all metrics test : FAILED!
>> 106: Test java symbol : FAILED! core-dump
>> #
>>
>> I am afraid this will show up pretty soon in the linux tree.
>> I am going to look into each failure in the next few days.
>>
>> What I already found out is that many test cases now fail due to the
>> event/PMU rework, here is one example:
>>
>> # perf test -Fvvvv 95
>> 95: perf all metrics test
>> --- start ---
>> Testing cpi
>> ....
>> Metric 'transaction' not printed in:
>> Error:
>> The TX_NC_TABORT event is not supported.
>> ---- end ----
>> perf all metrics test: FAILED!
>> # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT
>> -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT
>> #
>>
>> As can be seen, the event is definitely there and supported.
>> This same test case succeeds in the linux tree!
>>
>> Hopefully I can sort out some of the failures before this code show up
>> in the linux tree.
>
> Thanks Thomas, to be clear this is what is in
> perf-tools-next/linux-next and not 6.4?
Ian,
thanks for your help.
Correct, I am talking about the linux-next repo. The linux repo is fine.
>
> Rather than try to do more complicated cases like the metrics tests,
> it makes sense to dig into why event parsing is failing. Test 6 first
> of all, could you give output?
>
> Thanks,
> Ian
>
We discussed some aspects of this about two weeks ago. I was on vacation
last week and have now resumed my work on linux-next.
We run the linux-next perf test suite every night. I am concerned about
these failures and would like to get them sorted out before they reach Linux 6.5.
Here is the output on my linux-next tree built yesterday:
# uname -a
Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \
SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux
# ./perf test -F 6
6: Parse event definition strings :
6.1: Test event parsing :Segmentation fault (core dumped)
#
# gdb perf
....
(gdb) r test -F 6
6: Parse event definition strings :
6.1: Test event parsing :
Program received signal SIGSEGV, Segmentation fault.
__GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
(gdb) where
#0 __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
#1 0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580
#2 0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209
#3 0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260
#4 0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0)
at tests/parse-events.c:2272
#5 0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236
#6 0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265
#7 0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436
#8 0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559
#9 0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323
#10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377
#11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421
#12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537
(gdb)
To be honest, I am no expert on the yacc/bison/flex tool chain.
I understand a little bit about them, but that is it.
When I look at the output of perf test -Fvvv 6 on linux-next, some things seem odd.
I marked them with three question marks (???):
# ./perf test -Fvvv 6
6: Parse event definition strings :
6.1: Test event parsing :
--- start ---
running test 0 'syscalls:sys_enter_openat'
Using CPUID IBM,3931,704,A01,3.7,002f
running test 1 'syscalls:*'
running test 2 'r1a'
running test 3 '1:1'
running test 4 'instructions'
No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries
Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/'
??? What is wrong here?
??? Output on linux 6.4.0rc3:
??? # ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
??? 2,965,720 instructions
???
??? 0.002026832 seconds time elapsed
???
??? 0.000056000 seconds user
??? 0.002048000 seconds sys
??? #
??? This is fine and works as expected. The s390 PMU for counters
??? has a direct mapping for this. So we end up in the s390 PMU
??? to retrieve the value.
???
??? Output on linux-next:
??? # ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
??? 0.65 msec task-clock # 0.250 CPUs utilized
??? 0 context-switches # 0.000 /sec
??? 0 cpu-migrations # 0.000 /sec
??? 49 page-faults # 75.375 K/sec
??? 3,367,228 cycles # 5.180 GHz
??? 2,880,270 instructions # 0.86 insn per cycle
??? <not supported> branches
??? <not supported> branch-misses
???
??? 0.002599176 seconds time elapsed
???
??? 0.000053000 seconds user
??? 0.002650000 seconds sys
???
??? #
??? Somehow we end up in a different PMU. The output is the same as if
??? I do not specify an event at all. To reach the s390-specific PMU
??? I have to name it explicitly, as in:
??? # ./perf stat -e cpum_cf/instructions/ -- true
???
??? Performance counter stats for 'true':
???
??? 2,814,522 cpum_cf/instructions/
???
??? 0.001899881 seconds time elapsed
???
??? 0.000050000 seconds user
??? 0.001928000 seconds sys
???
??? #
No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries
Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults'
...
??? Similar output for basically all events.
No PMU found for 'cycles'running test 59 'cycles/name=name/'
No PMU found for 'name'Segmentation fault (core dumped)
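To see which PMU actually exposes a given event name, the sysfs lookup shown earlier for TX_NC_TABORT can be generalized. The following is only a minimal shell sketch and not part of perf: the function name find_event_pmus and its optional second argument (an alternative base directory, useful for trying it on a fake tree) are made up for illustration, and it assumes that each PMU appears under /sys/bus/event_source/devices with an events subdirectory.

```shell
# Print the name of every PMU whose sysfs events directory contains EVENT.
# Usage: find_event_pmus EVENT [BASE_DIR]
# find_event_pmus is a hypothetical helper, not a perf command.
find_event_pmus() {
    event=$1
    base=${2:-/sys/bus/event_source/devices}
    for dir in "$base"/*/events; do
        # Skip PMUs that do not list this event (also covers an unmatched glob).
        [ -e "$dir/$event" ] || continue
        pmu=${dir%/events}            # strip the trailing /events
        printf '%s\n' "${pmu##*/}"    # keep only the PMU name
    done
}
```

Running `find_event_pmus instructions` and `find_event_pmus TX_NC_TABORT` on the affected system should list the PMUs that the parser could resolve these names to. An empty result would mean that no PMU advertises the event in sysfs at all.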
Hope this helps.
PS: Should we keep the linux-perf-users mailing list on Cc? I am not sure
whether everybody else is interested in this.
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Böblingen / Commercial register: Amtsgericht Stuttgart, HRB 243294