From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Tony Ambardar <tony.ambardar@gmail.com>
Cc: bpf@vger.kernel.org, linux-s390@vger.kernel.org,
Alexei Starovoitov <ast@kernel.org>
Subject: Re: Problem testing with S390x under QEMU on x86_64
Date: Wed, 21 Aug 2024 19:28:29 +0200
Message-ID: <aa8fe2731224ffdb6d64a014e3e02740c50010cd.camel@linux.ibm.com>
In-Reply-To: <ZsU3GdK5t6KEOr0g@kodidev-ubuntu>
On Tue, 2024-08-20 at 17:38 -0700, Tony Ambardar wrote:
[...]
> I used the command line:
> ./test_progs -d get_stack_raw_tp,stacktrace_build_id,verifier_iterating_callbacks,tailcalls
>
> which includes the current DENYLIST.s390x as well as 'tailcalls', which
> is also excluded by the kernel-patches/bpf s390x CI. I note the CI
> excludes several more tests that seem to work. Any idea why that is?
>
> For reference, the issue with 'tailcalls/tailcall_hierarchy_count' is an
> RCU stall and kernel hang:
>
> root@(none):/usr/libexec/kselftests-bpf# ./test_progs -v --debug -n 332/19
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> test_tailcall_hierarchy_count:PASS:load obj 0 nsec
> test_tailcall_hierarchy_count:PASS:find entry prog 0 nsec
> test_tailcall_hierarchy_count:PASS:prog_fd 0 nsec
> test_tailcall_hierarchy_count:PASS:find jmp_table 0 nsec
> test_tailcall_hierarchy_count:PASS:map_fd 0 nsec
> test_tailcall_hierarchy_count:PASS:update jmp_table 0 nsec
> test_tailcall_hierarchy_count:PASS:find data_map 0 nsec
> test_tailcall_hierarchy_count:PASS:open fentry_obj file 0 nsec
> test_tailcall_hierarchy_count:PASS:find fentry prog 0 nsec
> test_tailcall_hierarchy_count:PASS:set_attach_target subprog_tail 0 nsec
> test_tailcall_hierarchy_count:PASS:load fentry_obj 0 nsec
> test_tailcall_hierarchy_count:PASS:attach_trace 0 nsec
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu: 0-....: (1 GPs behind) idle=4eb4/1/0x4000000000000000 softirq=527/528 fqs=1050
> rcu: (t=2100 jiffies g=-379 q=20 ncpus=2)
> CPU: 0 UID: 0 PID: 84 Comm: test_progs Tainted: G O 6.10.0-12706-g853081e84612-dirty #111
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU 8561 QEMU (KVM/Linux)
> Krnl PSW : 0704f00180000000 000003ffe00f8fca (lock_release+0xf2/0x190)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
> Krnl GPRS: 00000000b298dd12 0000000000000000 000002f23fd767c8 000003ffe1848800
>            0000000000000001 0000037fe034edbc 0000037fe034fd74 0000000000000001
>            0700037fe034edc8 000003ffe0249e48 000003ffe1848800 000003ffe19ba7c8
>            000003ff9f7a7f90 0000037fe034ef00 000003ffe00f8f96 0000037fe034ed78
> Krnl Code: 000003ffe00f8fbe: a7820300 tmhh %r8,768
>            000003ffe00f8fc2: a7840004 brc 8,000003ffe00f8fca
>           #000003ffe00f8fc6: ad03f0a0 stosm 160(%r15),3
>           >000003ffe00f8fca: eb8ff0a80004 lmg %r8,%r15,168(%r15)
>            000003ffe00f8fd0: 07fe bcr 15,%r14
>            000003ffe00f8fd2: c0e500011057 brasl %r14,000003ffe011b080
>            000003ffe00f8fd8: ec26ffa6007e cij %r2,0,6,000003ffe00f8f24
>            000003ffe00f8fde: c01000b78b96 larl %r1,000003ffe17ea70a
> Call Trace:
> [<000003ffe00f8fca>] lock_release+0xf2/0x190
> ([<000003ffe00f8f96>] lock_release+0xbe/0x190)
> [<000003ffe0249ea4>] __bpf_prog_exit_recur+0x5c/0x68
> [<000003ff6001e0b0>] bpf_trampoline_73014444060+0xb0/0xd2
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ff60024d2a>] bpf_prog_eb7edc599e93dcc8_entry+0x72/0xc8
> [<000003ff60024d2a>] bpf_prog_eb7edc599e93dcc8_entry+0x72/0xc8
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ff60024d14>] bpf_prog_eb7edc599e93dcc8_entry+0x5c/0xc8
> [<000003ffe084ecee>] bpf_test_run+0x216/0x3a8
> [<000003ffe084f9cc>] bpf_prog_test_run_skb+0x21c/0x630
> [<000003ffe0202ad2>] __sys_bpf+0x7ea/0xbb0
> [<000003ffe0203114>] __s390x_sys_bpf+0x44/0

Thanks for the detailed analysis! I will need to port

commit 116e04ba1459fc08f80cf27b8c9f9f188be0fcb2
Author: Leon Hwang <hffilwlqm@gmail.com>
Date:   Sun Jul 14 20:39:00 2024 +0800

    bpf, x64: Fix tailcall hierarchy

to s390x to fix this.

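
A toy model of why a broken tail call hierarchy can stall a CPU, as in the RCU stall quoted above. It rests on two assumptions that are not stated in this thread: that the entry program calls a tail-calling subprogram twice, and that the bug is the tail call counter being copied into each subprogram call instead of being shared across the whole hierarchy. TOY_LIMIT and all function names are hypothetical, and nothing here is kernel or JIT code.

```python
# Toy model only: TOY_LIMIT and all function names are hypothetical,
# and nothing here is kernel or JIT code.
TOY_LIMIT = 10  # deliberately small so the copied-counter case finishes


def runs_with_copied_counter(limit=TOY_LIMIT):
    """Each subprog call starts from the caller's copy of the counter."""
    runs = 0

    def entry(cnt):
        nonlocal runs
        runs += 1
        subprog_tail(cnt)  # first subprog call
        subprog_tail(cnt)  # second call sees the same, un-incremented value

    def subprog_tail(cnt):
        if cnt >= limit:
            return  # tail call refused
        entry(cnt + 1)  # models the tail call back into entry

    entry(0)
    return runs


def runs_with_shared_counter(limit=TOY_LIMIT):
    """All frames update one counter, as if they shared a pointer to it."""
    runs = 0
    cnt = 0

    def entry():
        nonlocal runs
        runs += 1
        subprog_tail()
        subprog_tail()

    def subprog_tail():
        nonlocal cnt
        if cnt >= limit:
            return  # tail call refused
        cnt += 1
        entry()

    entry()
    return runs


if __name__ == "__main__":
    print(runs_with_copied_counter())  # 2**(TOY_LIMIT + 1) - 1 = 2047
    print(runs_with_shared_counter())  # TOY_LIMIT + 1 = 11
```

In this model the copied counter makes the number of entry runs grow exponentially with the limit, while the shared counter keeps it linear. With a realistic limit the exponential case would effectively never finish, which would be consistent with a hang.
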
> Another curiosity is with 'uprobe_multi_test/attach_uprobe_fails',
> which usually succeeds but generates an inode warning in
> kernel/events/uprobes.c: (with cross-compiled and native test_progs)
>
> #416 uprobe_autoattach:OK
> ref_ctr_offset mismatch. inode: 0x73c7 offset: 0x3c9b78
> ref_ctr_offset(old): 0x464d7be ref_ctr_offset(new): 0x464d7bc
> #417/1 uprobe_multi_test/skel_api:OK
> #417/2 uprobe_multi_test/attach_api_pattern:OK
> #417/3 uprobe_multi_test/attach_api_syms:OK
> #417/4 uprobe_multi_test/link_api:OK
> #417/5 uprobe_multi_test/bench_uprobe:OK
> #417/6 uprobe_multi_test/bench_usdt:OK
> #417/7 uprobe_multi_test/attach_api_fails:OK
> #417/8 uprobe_multi_test/attach_uprobe_fails:OK
> #417/9 uprobe_multi_test/consumers:OK
> #417 uprobe_multi_test:OK
>
> but occasionally I see this kernel fault:
>
> #416 uprobe_autoattach:OK
> User process fault: interruption code 0001 ilc:1 in test_progs[3c9ba2,2aa3b580000+cc5000]
> CPU: 0 UID: 0 PID: 165 Comm: new_name Tainted: G OE 6.10.0-12707-g8189b8007d01 #114
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU 8561 QEMU (KVM/Linux)
> User PSW : 0705000180000000 000002aa3b949ba2
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 RI:0 EA:3
> User GPRS: cccccccccccccccd 0000000000000000 000003ffbe080000 0000000000000000
>            000003ffbeb74828 0000000000000006 0000000000000000 000002aa3c245928
>            000003ffbeb2cbc0 000003ffbeb2d020 0000000000000003 000003ffdb379f20
>            000003ffbeb2cf98 0000000000000000 000002aa3b94a400 000003ffdb379f20
> User Code:>000002aa3b949ba2: 0000 illegal
>            000002aa3b949ba4: 0700 bcr 0,%r0
>            000002aa3b949ba6: b3cd00b0 lgdr %r11,%f0
>            000002aa3b949baa: 07fe bcr 15,%r14
>            000002aa3b949bac: 0707 bcr 0,%r7
>            000002aa3b949bae: 0707 bcr 0,%r7
>            000002aa3b949bb0: ebbff0580024 stmg %r11,%r15,88(%r15)
>            000002aa3b949bb6: e3f0ff48ff71 lay %r15,-184(%r15)
> Last Breaking-Event-Address:
> [<000002aa3b94a3fa>] test_progs[3ca3fa,2aa3b580000+cc5000]
>
>
> Have you seen this fault before? Is the inode warning expected by the
> test?

Yes, the inode warning is expected. It is caused by this part of the test:

    /* attach fail due to wrong ref_ctr_offs on one of the uprobes */
    attach_uprobe_fail_refctr(skel);

The fault is a user fault, not a kernel fault. I could not reproduce it
on a real s390x machine. This may be an emulation problem, since
apparently the kernel does not recognize that "0000 illegal" is an
uprobe. Quite some time ago I fixed a similar issue in this area;
perhaps this is a new flavour. I will investigate.

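
As a rough illustration of the ref_ctr_offset warning quoted above: the sketch below assumes that a second uprobe registered at the same inode and offset, but with a different ref_ctr_offset, is rejected with a warning. UprobeRegistry and register() are hypothetical names, not kernel APIs; the numbers are taken from the quoted log.

```python
# Toy model only: UprobeRegistry and register() are hypothetical names,
# not kernel APIs. The numbers come from the warning quoted in this thread.
class UprobeRegistry:
    def __init__(self):
        self._ref_ctr = {}  # (inode, offset) -> ref_ctr_offset
        self.warnings = []

    def register(self, inode, offset, ref_ctr_offset):
        """Return True on success, False if the ref_ctr_offset conflicts."""
        key = (inode, offset)
        # The first registration for a given (inode, offset) wins.
        old = self._ref_ctr.setdefault(key, ref_ctr_offset)
        if old != ref_ctr_offset:
            self.warnings.append(
                f"ref_ctr_offset mismatch. inode: {inode:#x} "
                f"offset: {offset:#x} "
                f"ref_ctr_offset(old): {old:#x} "
                f"ref_ctr_offset(new): {ref_ctr_offset:#x}"
            )
            return False  # the attach fails, which the subtest expects
        return True


if __name__ == "__main__":
    reg = UprobeRegistry()
    print(reg.register(0x73C7, 0x3C9B78, 0x464D7BE))  # first attach succeeds
    print(reg.register(0x73C7, 0x3C9B78, 0x464D7BC))  # conflicting attach fails
    print(reg.warnings[0])
```

Under that assumption, the warning is a side effect of the test deliberately provoking a failed attach, not a sign of a problem by itself.
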
[...]
Best regards,
Ilya
Thread overview: 8+ messages
2024-08-17 21:57 Problem testing with S390x under QEMU on x86_64 Tony Ambardar
2024-08-19 9:15 ` Ilya Leoshkevich
2024-08-21 0:38 ` Tony Ambardar
2024-08-21 17:28 ` Ilya Leoshkevich [this message]
2024-08-23 13:29 ` Leon Hwang
2024-08-24 23:21 ` Tony Ambardar
2024-08-25 20:23 ` Yonghong Song
2024-08-26 10:50 ` Tony Ambardar