Building the Linux kernel with Clang and LLVM
* Re: Problem testing with S390x under QEMU on x86_64
From: Tony Ambardar @ 2024-08-24 23:21 UTC
  To: Ilya Leoshkevich
  Cc: bpf, linux-s390, llvm, Alexei Starovoitov, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt

On Tue, Aug 20, 2024 at 05:38:49PM -0700, Tony Ambardar wrote:
> 
> Hi Ilya,
> 
> Thanks for following up. As it happens, I did this the day before out of
> desperation after trying various kernel config and rootfs changes
> with no luck, and can confirm the system runs faster and without the
> kernel crashes noted above. Certainly the latest QEMU seems mandatory.
> 
> The good news is that 99% of tests with my cross-compiled test_progs
> work as expected out of the box, and some of the failing ones helped
> troubleshoot a few hidden libbpf issues. I'll outline the remaining
> failures for your feedback and comparison with native-built tests.
> 
> I used the command line:
>     ./test_progs -d get_stack_raw_tp,stacktrace_build_id,verifier_iterating_callbacks,tailcalls
> 

[snip]

> Aside from the tests above, I see only 3 failing tests:
> 
> All error logs:
> test_map_ptr:PASS:skel_open 0 nsec
> test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
> #165     map_ptr:FAIL
> subtest_userns:PASS:socketpair 0 nsec
> subtest_userns:PASS:fork 0 nsec
> recvfd:PASS:recvmsg 0 nsec
> recvfd:PASS:cmsg_null 0 nsec
> recvfd:PASS:cmsg_len 0 nsec
> recvfd:PASS:cmsg_level 0 nsec
> recvfd:PASS:cmsg_type 0 nsec
> parent:PASS:recv_bpffs_fd 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_cmds 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_maps 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_progs 0 nsec
> materialize_bpffs_fd:PASS:fs_cfg_attachs 0 nsec
> parent:PASS:materialize_bpffs_fd 0 nsec
> sendfd:PASS:sendmsg 0 nsec
> parent:PASS:send_mnt_fd 0 nsec
> recvfd:PASS:recvmsg 0 nsec
> recvfd:PASS:cmsg_null 0 nsec
> recvfd:PASS:cmsg_len 0 nsec
> recvfd:PASS:cmsg_level 0 nsec
> recvfd:PASS:cmsg_type 0 nsec
> parent:PASS:recv_token_fd 0 nsec
> parent:FAIL:waitpid_child unexpected error: 22 (errno 3)
> #402/9   token/obj_priv_implicit_token_envvar:FAIL
> #402     token:FAIL
> libbpf: prog 'on_event': BPF program load failed: Bad address
> libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
> The sequence of 8193 jumps is too complex.
> verification time 2816240 usec
> stack depth 360
> processed 116096 insns (limit 1000000) max_states_per_insn 1 total_states 5061 peak_states 5061 mark_read 2540
> -- END PROG LOAD LOG --
> libbpf: prog 'on_event': failed to load: -14
> libbpf: failed to load object 'pyperf600.bpf.o'
> scale_test:FAIL:expect_success unexpected error: -14 (errno 14)
> #525     verif_scale_pyperf600:FAIL
> Summary: 559/4166 PASSED, 98 SKIPPED, 3 FAILED
> 
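
For readers comparing against their own runs, the failing entries and errno
values in the log above can be pulled out mechanically. The following is only
a minimal sketch: the helper names failed_tests and errno_names are
hypothetical, the LOG string is an excerpt of the quoted output with the
quote prefix removed, and the regular expressions assume the line shapes
seen in this particular log rather than any documented test_progs format.

```python
import errno
import re

# Excerpt of the quoted test_progs output (assumption: quote prefix stripped).
LOG = """\
test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
#165     map_ptr:FAIL
parent:FAIL:waitpid_child unexpected error: 22 (errno 3)
#402/9   token/obj_priv_implicit_token_envvar:FAIL
#402     token:FAIL
scale_test:FAIL:expect_success unexpected error: -14 (errno 14)
#525     verif_scale_pyperf600:FAIL
"""

# "#<num>  <name>:FAIL" marks a failed top-level test; subtests such as
# "#402/9" do not match because of the "/9" after the test number.
FAILED_TEST = re.compile(r"^#(\d+)\s+(\S+):FAIL$")
# "(errno N)" carries the errno value reported by the failing check.
ERRNO = re.compile(r"\(errno (\d+)\)")


def failed_tests(log):
    """Map test number to test name for each failed top-level test."""
    found = {}
    for line in log.splitlines():
        match = FAILED_TEST.match(line)
        if match:
            found[int(match.group(1))] = match.group(2)
    return found


def errno_names(log):
    """Return the sorted symbolic names of all errno values in the log."""
    return sorted({errno.errorcode[int(n)] for n in ERRNO.findall(log)})


if __name__ == "__main__":
    print(failed_tests(LOG))
    print(errno_names(LOG))
```

On Linux the three errno values decode to EINVAL (22), ESRCH (3) and
EFAULT (14); the last one matches the "Bad address" string libbpf prints
for the pyperf600 load failure.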

Hi Ilya,

A brief update with some good news: the 3 test failures above have been
resolved and all expected tests now pass on QEMU/s390x under x86_64.

Test '#165 map_ptr:FAIL' was a bug in my light-skeleton code, and fixed in
my patch series v2:
https://lore.kernel.org/bpf/cover.1724313164.git.tony.ambardar@gmail.com/

Test '#402/9 token/obj_priv_implicit_token_envvar:FAIL' was a problem in my
rootfs configuration and now passes after I fixed it.

Test '#525 verif_scale_pyperf600:FAIL' was caused by a clang miscompilation
exposed by my use of clang-19 and clang-20. The test passes when built
with clang-17 (used by BPF CI) or clang-18, which I have switched to.

One symptom of the problem is easily seen by manually compiling:

$ clang-18  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o

$ clang-19  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o

$ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
  [27] .symtab           SYMTAB          0000000000000000 1739d0 01ad60 18      1 4572  8
  [27] .symtab           SYMTAB          0000000000000000 14f048 0001e0 18      1  12  8

Notice, for example, that .symtab shrinks by ~200X (0x1ad60 vs. 0x1e0 bytes)
when going to clang-19!
(CCing llvm maintainers)
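
The size change can be checked directly from the two llvm-readelf rows
above. This is a rough sketch rather than a general readelf parser: the
helper name symtab_stats is hypothetical, and the column positions assume
the Flg column is empty, as it is in these two rows. For a SYMTAB section,
the Inf column (sh_info) is the index of the first non-local symbol, so it
also gives the number of local symbols.

```python
import re

# The two .symtab rows reported by llvm-readelf -S in this message.
ROWS = {
    "clang-18": "[27] .symtab           SYMTAB          0000000000000000 1739d0 01ad60 18      1 4572  8",
    "clang-19": "[27] .symtab           SYMTAB          0000000000000000 14f048 0001e0 18      1  12  8",
}


def symtab_stats(row):
    """Return byte size, entry count and local-symbol count for one row."""
    # Drop the "[Nr]" index, which may contain a space for small indices.
    fields = re.sub(r"^\s*\[\s*\d+\]\s*", "", row).split()
    # Remaining fields: Name Type Address Off Size ES Lk Inf Al
    # (assumption: the Flg column is empty and so produces no field).
    size = int(fields[4], 16)      # Size column, hexadecimal bytes
    entsize = int(fields[5], 16)   # ES column, hexadecimal bytes per entry
    local_count = int(fields[7])   # Inf column, index of first non-local symbol
    return {"bytes": size, "entries": size // entsize, "locals": local_count}


if __name__ == "__main__":
    old = symtab_stats(ROWS["clang-18"])
    new = symtab_stats(ROWS["clang-19"])
    print(old, new, old["bytes"] / new["bytes"])
```

With these rows the table goes from 109920 bytes (4580 entries, 4572 of
them local) to 480 bytes (20 entries, 12 local), a factor of 229. The
whole drop of 4560 entries is in local symbols; the number of non-local
symbols stays at 8.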


Kind regards,
Tony


* Re: Problem testing with S390x under QEMU on x86_64
From: Yonghong Song @ 2024-08-25 20:23 UTC
  To: Tony Ambardar, Ilya Leoshkevich
  Cc: bpf, linux-s390, llvm, Alexei Starovoitov, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt


On 8/24/24 4:21 PM, Tony Ambardar wrote:
> On Tue, Aug 20, 2024 at 05:38:49PM -0700, Tony Ambardar wrote:
>> Hi Ilya,
>>
>> Thanks for following up. As it happens, I did this the day before out of
>> desperation after trying various kernel config and rootfs changes
>> with no luck, and can confirm the system runs faster and without the
>> kernel crashes noted above. Certainly the latest QEMU seems mandatory.
>>
>> The good news is that 99% of tests with my cross-compiled test_progs
>> work as expected out of the box, and some of the failing ones helped
>> troubleshoot a few hidden libbpf issues. I'll outline the remaining
>> failures for your feedback and comparison with native-built tests.
>>
>> I used the command line:
>>      ./test_progs -d get_stack_raw_tp,stacktrace_build_id,verifier_iterating_callbacks,tailcalls
>>
> [snip]
>
>> Aside from the tests above, I see only 3 failing tests:
>>
>> All error logs:
>> test_map_ptr:PASS:skel_open 0 nsec
>> test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
>> #165     map_ptr:FAIL
>> subtest_userns:PASS:socketpair 0 nsec
>> subtest_userns:PASS:fork 0 nsec
>> recvfd:PASS:recvmsg 0 nsec
>> recvfd:PASS:cmsg_null 0 nsec
>> recvfd:PASS:cmsg_len 0 nsec
>> recvfd:PASS:cmsg_level 0 nsec
>> recvfd:PASS:cmsg_type 0 nsec
>> parent:PASS:recv_bpffs_fd 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_cmds 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_maps 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_progs 0 nsec
>> materialize_bpffs_fd:PASS:fs_cfg_attachs 0 nsec
>> parent:PASS:materialize_bpffs_fd 0 nsec
>> sendfd:PASS:sendmsg 0 nsec
>> parent:PASS:send_mnt_fd 0 nsec
>> recvfd:PASS:recvmsg 0 nsec
>> recvfd:PASS:cmsg_null 0 nsec
>> recvfd:PASS:cmsg_len 0 nsec
>> recvfd:PASS:cmsg_level 0 nsec
>> recvfd:PASS:cmsg_type 0 nsec
>> parent:PASS:recv_token_fd 0 nsec
>> parent:FAIL:waitpid_child unexpected error: 22 (errno 3)
>> #402/9   token/obj_priv_implicit_token_envvar:FAIL
>> #402     token:FAIL
>> libbpf: prog 'on_event': BPF program load failed: Bad address
>> libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG --
>> The sequence of 8193 jumps is too complex.
>> verification time 2816240 usec
>> stack depth 360
>> processed 116096 insns (limit 1000000) max_states_per_insn 1 total_states 5061 peak_states 5061 mark_read 2540
>> -- END PROG LOAD LOG --
>> libbpf: prog 'on_event': failed to load: -14
>> libbpf: failed to load object 'pyperf600.bpf.o'
>> scale_test:FAIL:expect_success unexpected error: -14 (errno 14)
>> #525     verif_scale_pyperf600:FAIL
>> Summary: 559/4166 PASSED, 98 SKIPPED, 3 FAILED
>>
> Hi Ilya,
>
> A brief update with some good news: the 3 test failures above have been
> resolved and all expected tests now pass on QEMU/s390x under x86_64.
>
> Test '#165 map_ptr:FAIL' was a bug in my light-skeleton code, and fixed in
> my patch series v2:
> https://lore.kernel.org/bpf/cover.1724313164.git.tony.ambardar@gmail.com/
>
> Test '#402/9 token/obj_priv_implicit_token_envvar:FAIL' was a problem in my
> rootfs configuration and now passes after resolving.
>
> Test '#525 verif_scale_pyperf600:FAIL' was caused by clang miscompilation
> exposed by my use of clang-19 and clang-20. The test passes when built
> with clang-17 (used by BPF CI) or clang-18 which I switched to use.

x86 has the same issue: clang19-generated code causes a verification
failure. Eduard is working on this.

>
> One symptom of the problem is easily seen by manually compiling:
>
> $ clang-18  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o
>
> $ clang-19  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o
>
> $ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
>    [27] .symtab           SYMTAB          0000000000000000 1739d0 01ad60 18      1 4572  8
>    [27] .symtab           SYMTAB          0000000000000000 14f048 0001e0 18      1  12  8
>
> Notice that the .symtab has shrunk by ~200X for example going to clang-19!
> (CCing llvm maintainers)

This is a known issue. In llvm18, all labels (identifying basic blocks) are
in the symbol table. In llvm19, those labels are removed from the symbol table.

>
>
> Kind regards,
> Tony
>


* Re: Problem testing with S390x under QEMU on x86_64
From: Tony Ambardar @ 2024-08-26 10:50 UTC
  To: Yonghong Song
  Cc: Ilya Leoshkevich, bpf, linux-s390, llvm, Alexei Starovoitov,
	Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt

On Sun, Aug 25, 2024 at 01:23:51PM -0700, Yonghong Song wrote:
> 
> On 8/24/24 4:21 PM, Tony Ambardar wrote:

[snip]

> > 
> > Test '#525 verif_scale_pyperf600:FAIL' was caused by clang miscompilation
> > exposed by my use of clang-19 and clang-20. The test passes when built
> > with clang-17 (used by BPF CI) or clang-18 which I switched to use.
> 
> x86 has the same issue where clang19 generated code will cause verification
> failure. Eduard is working on this.
> 
> > 
> > One symptom of the problem is easily seen by manually compiling:
> > 
> > $ clang-18  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-18/lib/clang/18/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang18.bpf.o
> > 
> > $ clang-19  -g -Wall -Werror -D__TARGET_ARCH_s390 -mbig-endian -Itools/testing/selftests/bpf/tools/include -Itools/testing/selftests/bpf -Itools/include/uapi -Itools/testing/selftests/usr/include -Wno-compare-distinct-pointer-types -idirafter /usr/lib/llvm-19/lib/clang/19/include -idirafter /usr/local/include -idirafter /usr/lib/gcc-cross/s390x-linux-gnu/11/../../../../s390x-linux-gnu/include -idirafter /usr/include/s390x-linux-gnu -idirafter /usr/include -DENABLE_ATOMICS_TESTS -O2 --target=bpfeb -c tools/testing/selftests/bpf/progs/pyperf600.c -mcpu=v3 -o pyperf600.clang19.bpf.o
> > 
> > $ llvm-readelf-18 -S pyperf600.clang{18,19}.bpf.o |grep .symtab
> >    [27] .symtab           SYMTAB          0000000000000000 1739d0 01ad60 18      1 4572  8
> >    [27] .symtab           SYMTAB          0000000000000000 14f048 0001e0 18      1  12  8
> > 
> > Notice that the .symtab has shrunk by ~200X for example going to clang-19!
> > (CCing llvm maintainers)
> 
> This is a known issue. In llvm18, all labels (to identify basic blocks) are in symbol table.
> Those labels are removed from symbol table in llvm19.

Glad to hear this is a known issue being looked at now. A quick search on
my part found nothing, so sorry for the noise and thanks for clarifying.

> 
> > 
> > 
> > Kind regards,
> > Tony
> > 

