* Re: Problem testing with S390x under QEMU on x86_64
       [not found] ` <ZsU3GdK5t6KEOr0g@kodidev-ubuntu>
@ 2024-08-24 23:21   ` Tony Ambardar
  2024-08-25 20:23     ` Yonghong Song
From: Tony Ambardar @ 2024-08-24 23:21 UTC
To: Ilya Leoshkevich
Cc: bpf, linux-s390, llvm, Alexei Starovoitov, Nathan Chancellor,
    Nick Desaulniers, Bill Wendling, Justin Stitt

On Tue, Aug 20, 2024 at 05:38:49PM -0700, Tony Ambardar wrote:
>
> Hi Ilya,
>
> Thanks for following up. As it happens, I did this the day before out of
> desperation after trying various kernel config and rootfs changes
> with no luck, and can confirm the system runs faster and without the
> kernel crashes noted above. Certainly the latest QEMU seems mandatory.
>
> The good news is that 99% of tests with my cross-compiled test_progs
> work as expected out of the box, and some of the failing ones helped
> troubleshoot a few hidden libbpf issues. I'll outline the remaining
> failures for your feedback and comparison with native-built tests.
>
> I used the command line:
> ./test_progs -d get_stack_raw_tp,stacktrace_build_id,verifier_iterating_callbacks,tailcalls
>

[snip]

> Aside from the tests above, I see only 3 failing tests:
>
> All error logs:
> test_map_ptr:PASS:skel_open 0 nsec
> test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
> #165 map_ptr:FAIL
> subtest_userns:PASS:socketpair 0 nsec
> subtest_userns:PASS:fork 0 nsec
> recvfd:PASS:recvmsg 0 nsec
> recvfd:PASS:cmsg_null 0 nsec
> recvfd:PASS:cmsg_len 0 nsec
> recvfd:PASS:cmsg_level 0 nsec
> recvfd:PASS:cmsg_type 0 nsec
> parent:PASS:recv_bpffs_fd 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_cmds 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_maps 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_progs 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_attachs 0 nsec
> parent:PASS:materialize_bpffs_fd 0 nsec
> sendfd:PASS:sendmsg 0 nsec
> parent:PASS:send_mnt_fd 0 nsec
> recvfd:PASS:recvmsg 0 nsec
> recvfd:PASS:cmsg_null 0 nsec
> recvfd:PASS:cmsg_len 0 nsec
> recvfd:PASS:cmsg_level 0 nsec
> recvfd:PASS:cmsg_type 0 nsec
> parent:PASS:recv_token_fd 0 nsec
> parent:FAIL:waitpid_child unexpected error: 22 (errno 3)
> #402/9 token/obj_priv_implicit_token_envvar:FAIL
> #402 token:FAIL
> libbpf: prog 'on_event': BPF program load failed: Bad address
> libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
> The sequence of 8193 jumps is too complex.
> verification time 2816240 usec
> stack depth 360
> processed 116096 insns (limit 1000000) max_states_per_insn 1 total_states 5061 peak_states 5061 mark_read 2540
> -- END PROG LOAD LOG --
> libbpf: prog 'on_event': failed to load: -14
> libbpf: failed to load object 'pyperf600.bpf.o'
> scale_test:FAIL:expect_success unexpected error: -14 (errno 14)
> #525 verif_scale_pyperf600:FAIL
> Summary: 559/4166 PASSED, 98 SKIPPED, 3 FAILED
>

Hi Ilya,

A brief update with some good news: the 3 test failures above have been
resolved and all expected tests now pass on QEMU/s390x under x86_64.

Test '#165 map_ptr:FAIL' was a bug in my light-skeleton code, and fixed in
my patch series v2:
https://lore.kernel.org/bpf/cover.1724313164.git.tony.ambardar@gmail.com/

Test '#402/9 token/obj_priv_implicit_token_envvar:FAIL' was a problem in my
rootfs configuration and now passes after resolving.

Test '#525 verif_scale_pyperf600:FAIL' was caused by clang miscompilation
exposed by my use of clang-19 and clang-20. The test passes when built
with clang-17 (used by BPF CI) or clang-18 which I switched to use.

One symptom of the problem is easily seen by manually compiling:

$ clang-18 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o

$ clang-19 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o

$ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
  [27] .symtab  SYMTAB  0000000000000000 1739d0 01ad60 18  1 4572  8
  [27] .symtab  SYMTAB  0000000000000000 14f048 0001e0 18  1   12  8

Notice that the .symtab has shrunk by ~200X for example going to clang-19!
(CCing llvm maintainers)


Kind regards,
Tony
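
For readers following the error logs quoted above, a small sketch that maps the logged error numbers to their symbolic names. It assumes the host running the script uses the same errno numbering as the Linux guest that produced the log (the log's own "Bad address" text for -14 is consistent with that). The labels in the `logged` dictionary are made up for illustration; only the numbers come from the log. libbpf returns failures as negative errno values, hence the `abs()`.

```python
import errno
import os

# Error numbers copied from the test log quoted above.
# The dictionary keys are illustrative labels, not names used by test_progs.
logged = {
    "map_ptr skel_load": -22,
    "token waitpid_child (errno field)": 3,
    "pyperf600 prog load": -14,
}


def describe(code: int) -> str:
    """Return 'NAME (message)' for a positive or negative errno value."""
    n = abs(code)  # libbpf reports errors as -errno
    name = errno.errorcode.get(n, "UNKNOWN")
    return f"{name} ({os.strerror(n)})"


for label, code in logged.items():
    print(f"{label}: {code} -> {describe(code)}")
```

On a Linux host this reports EINVAL for -22, ESRCH for 3, and EFAULT for -14. The names only classify the failures; the root causes are the ones given in the message above.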
* Re: Problem testing with S390x under QEMU on x86_64
  2024-08-24 23:21 ` Problem testing with S390x under QEMU on x86_64 Tony Ambardar
@ 2024-08-25 20:23   ` Yonghong Song
  2024-08-26 10:50     ` Tony Ambardar
From: Yonghong Song @ 2024-08-25 20:23 UTC
To: Tony Ambardar, Ilya Leoshkevich
Cc: bpf, linux-s390, llvm, Alexei Starovoitov, Nathan Chancellor,
    Nick Desaulniers, Bill Wendling, Justin Stitt

On 8/24/24 4:21 PM, Tony Ambardar wrote:
> On Tue, Aug 20, 2024 at 05:38:49PM -0700, Tony Ambardar wrote:
>> Hi Ilya,
>>
>> Thanks for following up. As it happens, I did this the day before out of
>> desperation after trying various kernel config and rootfs changes
>> with no luck, and can confirm the system runs faster and without the
>> kernel crashes noted above. Certainly the latest QEMU seems mandatory.
>>
>> The good news is that 99% of tests with my cross-compiled test_progs
>> work as expected out of the box, and some of the failing ones helped
>> troubleshoot a few hidden libbpf issues. I'll outline the remaining
>> failures for your feedback and comparison with native-built tests.
>>
>> I used the command line:
>> ./test_progs -d get_stack_raw_tp,stacktrace_build_id,verifier_iterating_callbacks,tailcalls
>>
> [snip]
>
>> Aside from the tests above, I see only 3 failing tests:
>>
>> All error logs:
>> test_map_ptr:PASS:skel_open 0 nsec
>> test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
>> #165 map_ptr:FAIL
>> subtest_userns:PASS:socketpair 0 nsec
>> subtest_userns:PASS:fork 0 nsec
>> recvfd:PASS:recvmsg 0 nsec
>> recvfd:PASS:cmsg_null 0 nsec
>> recvfd:PASS:cmsg_len 0 nsec
>> recvfd:PASS:cmsg_level 0 nsec
>> recvfd:PASS:cmsg_type 0 nsec
>> parent:PASS:recv_bpffs_fd 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_cmds 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_maps 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_progs 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_attachs 0 nsec
>> parent:PASS:materialize_bpffs_fd 0 nsec
>> sendfd:PASS:sendmsg 0 nsec
>> parent:PASS:send_mnt_fd 0 nsec
>> recvfd:PASS:recvmsg 0 nsec
>> recvfd:PASS:cmsg_null 0 nsec
>> recvfd:PASS:cmsg_len 0 nsec
>> recvfd:PASS:cmsg_level 0 nsec
>> recvfd:PASS:cmsg_type 0 nsec
>> parent:PASS:recv_token_fd 0 nsec
>> parent:FAIL:waitpid_child unexpected error: 22 (errno 3)
>> #402/9 token/obj_priv_implicit_token_envvar:FAIL
>> #402 token:FAIL
>> libbpf: prog 'on_event': BPF program load failed: Bad address
>> libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>> The sequence of 8193 jumps is too complex.
>> verification time 2816240 usec
>> stack depth 360
>> processed 116096 insns (limit 1000000) max_states_per_insn 1 total_states 5061 peak_states 5061 mark_read 2540
>> -- END PROG LOAD LOG --
>> libbpf: prog 'on_event': failed to load: -14
>> libbpf: failed to load object 'pyperf600.bpf.o'
>> scale_test:FAIL:expect_success unexpected error: -14 (errno 14)
>> #525 verif_scale_pyperf600:FAIL
>> Summary: 559/4166 PASSED, 98 SKIPPED, 3 FAILED
>>
> Hi Ilya,
>
> A brief update with some good news: the 3 test failures above have been
> resolved and all expected tests now pass on QEMU/s390x under x86_64.
>
> Test '#165 map_ptr:FAIL' was a bug in my light-skeleton code, and fixed in
> my patch series v2:
> https://lore.kernel.org/bpf/cover.1724313164.git.tony.ambardar@gmail.com/
>
> Test '#402/9 token/obj_priv_implicit_token_envvar:FAIL' was a problem in my
> rootfs configuration and now passes after resolving.
>
> Test '#525 verif_scale_pyperf600:FAIL' was caused by clang miscompilation
> exposed by my use of clang-19 and clang-20. The test passes when built
> with clang-17 (used by BPF CI) or clang-18 which I switched to use.

x86 has the same issue where clang19 generated code will cause verification
failure. Eduard is working on this.

>
> One symptom of the problem is easily seen by manually compiling:
>
> $ clang-18 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o
>
> $ clang-19 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o
>
> $ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
>   [27] .symtab  SYMTAB  0000000000000000 1739d0 01ad60 18  1 4572  8
>   [27] .symtab  SYMTAB  0000000000000000 14f048 0001e0 18  1   12  8
>
> Notice that the .symtab has shrunk by ~200X for example going to clang-19!
> (CCing llvm maintainers)

This is a known issue. In llvm18, all labels (to identify basic blocks) are in symbol table.
Those labels are removed from symbol table in llvm19.

>
>
> Kind regards,
> Tony
>
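
The two `.symtab` rows quoted above carry enough information to check the size difference and to see which kind of symbol disappeared. The sketch below is illustrative only. It assumes the usual `readelf -S` column order (Name, Type, Address, Off, Size, ES, Flg, Lk, Inf, Al), with an empty Flg column, Size and ES printed in hex, and Inf printed in decimal. For a symbol table section, the ELF `sh_info` field is the index of the first non-local symbol, so it equals the number of local entries (including the reserved null entry at index 0). The helper name `parse_symtab_row` is hypothetical.

```python
# The two .symtab rows from the llvm-readelf -S output quoted above.
ROWS = {
    "clang-18": "[27] .symtab SYMTAB 0000000000000000 1739d0 01ad60 18 1 4572 8",
    "clang-19": "[27] .symtab SYMTAB 0000000000000000 14f048 0001e0 18 1 12 8",
}


def parse_symtab_row(row: str) -> dict:
    """Derive symbol counts from one readelf -S row for .symtab."""
    # Tokens after "[Nr]": Name Type Address Off Size ES Lk Inf Al
    # (Flg is empty for .symtab, so it does not show up as a token).
    fields = row.split("]", 1)[1].split()
    _name, _type, _addr, _off, size, entsize, _link, info, _align = fields
    size_bytes = int(size, 16)       # section size in bytes (hex column)
    entry_bytes = int(entsize, 16)   # 0x18 = 24 bytes per Elf64_Sym entry
    total = size_bytes // entry_bytes
    local = int(info)                # sh_info: index of first non-local symbol
    return {
        "bytes": size_bytes,
        "symbols": total,
        "local": local,
        "global": total - local,
    }


stats = {compiler: parse_symtab_row(row) for compiler, row in ROWS.items()}
for compiler, s in stats.items():
    print(compiler, s)

ratio = stats["clang-18"]["bytes"] / stats["clang-19"]["bytes"]
print(f"size ratio: {ratio:.0f}x")
```

Under these assumptions the clang-18 object has 4580 symbol entries (4572 local) and the clang-19 object has 20 (12 local), a size ratio of 229x. Both objects keep 8 non-local symbols, so the whole reduction is in local symbols, which is consistent with basic-block labels no longer being emitted into the symbol table.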
* Re: Problem testing with S390x under QEMU on x86_64
  2024-08-25 20:23 ` Yonghong Song
@ 2024-08-26 10:50   ` Tony Ambardar
From: Tony Ambardar @ 2024-08-26 10:50 UTC
To: Yonghong Song
Cc: Ilya Leoshkevich, bpf, linux-s390, llvm, Alexei Starovoitov,
    Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt

On Sun, Aug 25, 2024 at 01:23:51PM -0700, Yonghong Song wrote:
>
> On 8/24/24 4:21 PM, Tony Ambardar wrote:

[snip]

> >
> > Test '#525 verif_scale_pyperf600:FAIL' was caused by clang miscompilation
> > exposed by my use of clang-19 and clang-20. The test passes when built
> > with clang-17 (used by BPF CI) or clang-18 which I switched to use.
>
> x86 has the same issue where clang19 generated code will cause verification
> failure. Eduard is working on this.
>
> >
> > One symptom of the problem is easily seen by manually compiling:
> >
> > $ clang-18 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o
> >
> > $ clang-19 -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o
> >
> > $ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
> >   [27] .symtab  SYMTAB  0000000000000000 1739d0 01ad60 18  1 4572  8
> >   [27] .symtab  SYMTAB  0000000000000000 14f048 0001e0 18  1   12  8
> >
> > Notice that the .symtab has shrunk by ~200X for example going to clang-19!
> > (CCing llvm maintainers)
>
> This is a known issue. In llvm18, all labels (to identify basic blocks) are in symbol table.
> Those labels are removed from symbol table in llvm19.

Glad to hear this is a known issue being looked at now. A quick search on my
part found nothing, so sorry for the noise and thanks for clarifying.

>
> >
> >
> > Kind regards,
> > Tony
> >