* Re: kernel lockup on bpf selftests module_attach
2025-08-09 8:15 kernel lockup on bpf selftests module_attach Vincent Li
@ 2025-08-09 3:03 ` Huacai Chen
2025-08-09 3:48 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Huacai Chen @ 2025-08-09 3:03 UTC (permalink / raw)
To: Vincent Li; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
Hi, Vincent,
On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> Hi Folks,
>
> Hengqi mentioned offline that the loongarch kernel locked up when
> running the full bpf selftests, so I went ahead and ran make run_tests to
> perform the full bpf selftest, and I observed the lockup too. It appears
> the lockup happens when running the module_attach test, which includes
> testing fentry, so this could be related to the trampoline patch series.
> For example, if I just run ./test_progs -t module_attach, the kernel
> locks up immediately.
Is this a regression caused by the latest trampoline patches? Or in
other words, does vanilla 6.16 have this problem?
Huacai
>
> As a side note, if I put the module_attach test in
> tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> it is not skipped.
>
> Thanks
>
> Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-09 3:03 ` Huacai Chen
@ 2025-08-09 3:48 ` Vincent Li
2025-08-09 5:03 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-09 3:48 UTC (permalink / raw)
To: Huacai Chen; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Vincent,
>
> On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > Hengqi mentioned offline that the loongarch kernel locked up when
> > running the full bpf selftests, so I went ahead and ran make run_tests to
> > perform the full bpf selftest, and I observed the lockup too. It appears
> > the lockup happens when running the module_attach test, which includes
> > testing fentry, so this could be related to the trampoline patch series.
> > For example, if I just run ./test_progs -t module_attach, the kernel
> > locks up immediately.
> Is this a regression caused by the latest trampoline patches? Or in
> other words, does vanilla 6.16 have this problem?
>
I suspect this is caused by the latest trampoline patches because
module_attach tests the fentry feature for kernel module functions,
and I believe Chenghao and I only tested the fentry feature on
non-module kernel functions. I can try a kernel without the trampoline
patches and will let you know the result.
> Huacai
>
> >
> > As a side note, if I put the module_attach test in
> > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > it is not skipped.
> >
> > Thanks
> >
> > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-09 3:48 ` Vincent Li
@ 2025-08-09 5:03 ` Vincent Li
2025-08-09 6:02 ` Huacai Chen
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-09 5:03 UTC (permalink / raw)
To: Huacai Chen; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Vincent,
> >
> > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > Hi Folks,
> > >
> > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > running the full bpf selftests, so I went ahead and ran make run_tests to
> > > perform the full bpf selftest, and I observed the lockup too. It appears
> > > the lockup happens when running the module_attach test, which includes
> > > testing fentry, so this could be related to the trampoline patch series.
> > > For example, if I just run ./test_progs -t module_attach, the kernel
> > > locks up immediately.
> > Is this a regression caused by the latest trampoline patches? Or in
> > other words, does vanilla 6.16 have this problem?
> >
>
> I suspect this is caused by the latest trampoline patches because
> module_attach tests the fentry feature for kernel module functions,
> and I believe Chenghao and I only tested the fentry feature on
> non-module kernel functions. I can try a kernel without the trampoline
> patches and will let you know the result.
>
I reverted the trampoline patches from the loongarch-next branch and ran
./test_progs -t module_attach; it simply errors out with the fentry
feature not supported:
[root@fedora bpf]# ./test_progs -t module_attach
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
#205 module_attach:FAIL
All error logs:
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
#205 module_attach:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
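As an aside, the -524 in the skel_attach failure is the kernel-internal ENOTSUPP errno; userspace <errno.h> has no name for it, so libbpf reports the raw value. A quick sanity check of the mapping, assuming the usual value of 524 from the kernel's include/linux/errno.h:

```shell
# ENOTSUPP (524) exists only in the kernel's include/linux/errno.h;
# userspace errno.h has no such name, so the raw -524 leaks through.
ENOTSUPP=524
echo "$(( -ENOTSUPP ))"    # prints: -524
```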
I also tested the loongarch-next branch with the trampoline patch series
using a kernel config that does not hard-lock the machine, so I can run
dmesg to check the kernel error log; ./test_progs -t module_attach
results in the kernel log below:
[ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
[ 419.728620] CPU 70475748 Unable to handle kernel paging request at
virtual address 0000000800000024, era == 90000000041d5854, ra ==
90000000041d5848
[ 419.728629] Oops[#1]:
[ 419.728632] CPU 70475748 Unable to handle kernel paging request at
virtual address 0000000000000018, era == 9000000005750268, ra ==
9000000004163938
[ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 441.305380] rcu: 5-...0: (29 ticks this GP)
idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
[ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
[ 441.305390] Sending NMI from CPU 4 to CPUs 5:
[ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
[ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
time, OOM is now expected behavior.
[ 451.305502] rcu: RCU grace-period kthread stack dump:
[ 451.305504] task:rcu_preempt state:R stack:0 pid:15
tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
[ 451.305510] Stack : 9000000100467e80 0000000000000402
0000000000000010 90000001003b0680
[ 451.305519] 90000000058e0000 0000000000000000
0000000000000040 9000000006c2dfd0
[ 451.305526] 900000000578c9b0 0000000000000001
9000000006b21000 0000000000000005
[ 451.305533] 00000001000093a8 00000001000093a8
0000000000000000 0000000000000004
[ 451.305540] 90000000058f04e0 0000000000000000
0000000000000002 b793724be1dfb2b8
[ 451.305547] 00000001000093a9 b793724be1dfb2b8
000000000000003f 9000000006c2dfd0
[ 451.305554] 9000000006c30c18 0000000000000005
9000000006b0e000 9000000006b21000
[ 451.305560] 9000000100453c98 90000001003aff80
9000000006c31140 900000000578c9b0
[ 451.305567] 00000001000093a8 9000000005794d3c
00000000000000b4 0000000000000000
[ 451.305574] 90000000024021b8 00000001000093a8
9000000004284f20 000000000a400001
[ 451.305581] ...
[ 451.305584] Call Trace:
[ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
[ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
[ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
[ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
[ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
[ 451.305614] [<90000000041be704>] kthread+0x144/0x238
[ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
[ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
[ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
[ 451.305633] Sending NMI from CPU 4 to CPUs 1:
[ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
[ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
[ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
[ 451.306669] Sending NMI from CPU 6 to CPUs 5:
[ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
So this is related to the trampoline patches for sure, unless I am missing something.
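For a rough sense of the timeline, the counts in the RCU stall report above are in jiffies; assuming CONFIG_HZ=250 (the loongarch defconfig value, an assumption worth checking against your .config), they convert to seconds as:

```shell
# Convert the RCU stall report's jiffies counts to seconds, assuming
# CONFIG_HZ=250 (verify with: grep '^CONFIG_HZ=' .config).
HZ=250
echo "stall detected after $(( 5252 / HZ ))s"    # t=5252 jiffies -> 21s
echo "GP kthread starved for $(( 2499 / HZ ))s"  # 2499 jiffies -> 9s
```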
> > Huacai
> >
> > >
> > > As a side note, if I put the module_attach test in
> > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > it is not skipped.
> > >
> > > Thanks
> > >
> > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-09 5:03 ` Vincent Li
@ 2025-08-09 6:02 ` Huacai Chen
2025-08-09 19:11 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Huacai Chen @ 2025-08-09 6:02 UTC (permalink / raw)
To: Vincent Li; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
Hi, Chenghao,
Please take a look.
Huacai
On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Vincent,
> > >
> > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > Hi Folks,
> > > >
> > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > running the full bpf selftests, so I went ahead and ran make run_tests to
> > > > perform the full bpf selftest, and I observed the lockup too. It appears
> > > > the lockup happens when running the module_attach test, which includes
> > > > testing fentry, so this could be related to the trampoline patch series.
> > > > For example, if I just run ./test_progs -t module_attach, the kernel
> > > > locks up immediately.
> > > Is this a regression caused by the latest trampoline patches? Or in
> > > other words, does vanilla 6.16 have this problem?
> > >
> >
> > I suspect this is caused by the latest trampoline patches because
> > module_attach tests the fentry feature for kernel module functions,
> > and I believe Chenghao and I only tested the fentry feature on
> > non-module kernel functions. I can try a kernel without the trampoline
> > patches and will let you know the result.
> >
>
> I reverted the trampoline patches from the loongarch-next branch and ran
> ./test_progs -t module_attach; it simply errors out with the fentry
> feature not supported:
>
> [root@fedora bpf]# ./test_progs -t module_attach
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> test_module_attach:PASS:skel_load 0 nsec
> libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> #205 module_attach:FAIL
>
> All error logs:
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> test_module_attach:PASS:skel_load 0 nsec
> libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> #205 module_attach:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
>
> I also tested the loongarch-next branch with the trampoline patch series
> using a kernel config that does not hard-lock the machine, so I can run
> dmesg to check the kernel error log; ./test_progs -t module_attach
> results in the kernel log below:
>
> [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> virtual address 0000000800000024, era == 90000000041d5854, ra ==
> 90000000041d5848
> [ 419.728629] Oops[#1]:
> [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> virtual address 0000000000000018, era == 9000000005750268, ra ==
> 9000000004163938
> [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> time, OOM is now expected behavior.
> [ 451.305502] rcu: RCU grace-period kthread stack dump:
> [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> [ 451.305510] Stack : 9000000100467e80 0000000000000402
> 0000000000000010 90000001003b0680
> [ 451.305519] 90000000058e0000 0000000000000000
> 0000000000000040 9000000006c2dfd0
> [ 451.305526] 900000000578c9b0 0000000000000001
> 9000000006b21000 0000000000000005
> [ 451.305533] 00000001000093a8 00000001000093a8
> 0000000000000000 0000000000000004
> [ 451.305540] 90000000058f04e0 0000000000000000
> 0000000000000002 b793724be1dfb2b8
> [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> 000000000000003f 9000000006c2dfd0
> [ 451.305554] 9000000006c30c18 0000000000000005
> 9000000006b0e000 9000000006b21000
> [ 451.305560] 9000000100453c98 90000001003aff80
> 9000000006c31140 900000000578c9b0
> [ 451.305567] 00000001000093a8 9000000005794d3c
> 00000000000000b4 0000000000000000
> [ 451.305574] 90000000024021b8 00000001000093a8
> 9000000004284f20 000000000a400001
> [ 451.305581] ...
> [ 451.305584] Call Trace:
> [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
>
> [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
>
> So this is related to the trampoline patches for sure, unless I am missing something.
>
> > > Huacai
> > >
> > > >
> > > > As a side note, if I put the module_attach test in
> > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > it is not skipped.
> > > >
> > > > Thanks
> > > >
> > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* kernel lockup on bpf selftests module_attach
@ 2025-08-09 8:15 Vincent Li
2025-08-09 3:03 ` Huacai Chen
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-09 8:15 UTC (permalink / raw)
To: loongarch; +Cc: Hengqi Chen, Chenghao Duan, Tiezhu Yang, Huacai Chen
Hi Folks,
Hengqi mentioned offline that the loongarch kernel locked up when
running the full bpf selftests, so I went ahead and ran make run_tests to
perform the full bpf selftest, and I observed the lockup too. It appears
the lockup happens when running the module_attach test, which includes
testing fentry, so this could be related to the trampoline patch series.
For example, if I just run ./test_progs -t module_attach, the kernel
locks up immediately.
As a side note, if I put the module_attach test in
tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
it is not skipped.
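On the DENYLIST point: as far as I can tell, test_progs does not read tools/testing/selftests/bpf/DENYLIST itself; that file is consumed by the CI wrapper scripts. A sketch of skipping a test in a direct run instead, assuming a test_progs build with the -d/--deny option (the test names in the sample file are placeholders):

```shell
# test_progs takes a deny list via -d on the command line rather than
# reading the DENYLIST file (which the CI wrapper scripts consume), e.g.:
#   ./test_progs -d module_attach
# A DENYLIST-style file (one test name per line; names below are
# placeholders) can also be joined into a single -d argument:
printf 'module_attach\nsome_other_test\n' > /tmp/denylist
paste -sd, /tmp/denylist    # prints: module_attach,some_other_test
```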
Thanks
Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-09 6:02 ` Huacai Chen
@ 2025-08-09 19:11 ` Vincent Li
2025-08-10 17:39 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-09 19:11 UTC (permalink / raw)
To: Huacai Chen; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Chenghao,
>
> Please take a look.
>
> Huacai
>
I reverted the tail call count fix patches and the struct_ops
trampoline patch from the loongarch-next branch, kept the rest of the
trampoline patches, and the module_attach test experienced the same
issue, so this is definitely a trampoline patches issue.
> On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Vincent,
> > > >
> > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > Hi Folks,
> > > > >
> > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > running the full bpf selftests, so I went ahead and ran make run_tests to
> > > > > perform the full bpf selftest, and I observed the lockup too. It appears
> > > > > the lockup happens when running the module_attach test, which includes
> > > > > testing fentry, so this could be related to the trampoline patch series.
> > > > > For example, if I just run ./test_progs -t module_attach, the kernel
> > > > > locks up immediately.
> > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > other words, does vanilla 6.16 have this problem?
> > > >
> > >
> > > I suspect this is caused by the latest trampoline patches because
> > > module_attach tests the fentry feature for kernel module functions,
> > > and I believe Chenghao and I only tested the fentry feature on
> > > non-module kernel functions. I can try a kernel without the trampoline
> > > patches and will let you know the result.
> > >
> >
> > I reverted the trampoline patches from the loongarch-next branch and ran
> > ./test_progs -t module_attach; it simply errors out with the fentry
> > feature not supported:
> >
> > [root@fedora bpf]# ./test_progs -t module_attach
> > test_module_attach:PASS:skel_open 0 nsec
> > test_module_attach:PASS:set_attach_target 0 nsec
> > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > test_module_attach:PASS:skel_load 0 nsec
> > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > #205 module_attach:FAIL
> >
> > All error logs:
> > test_module_attach:PASS:skel_open 0 nsec
> > test_module_attach:PASS:set_attach_target 0 nsec
> > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > test_module_attach:PASS:skel_load 0 nsec
> > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > #205 module_attach:FAIL
> > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> >
> > I also tested the loongarch-next branch with the trampoline patch series
> > using a kernel config that does not hard-lock the machine, so I can run
> > dmesg to check the kernel error log; ./test_progs -t module_attach
> > results in the kernel log below:
> >
> > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > 90000000041d5848
> > [ 419.728629] Oops[#1]:
> > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > 9000000004163938
> > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > time, OOM is now expected behavior.
> > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > 0000000000000010 90000001003b0680
> > [ 451.305519] 90000000058e0000 0000000000000000
> > 0000000000000040 9000000006c2dfd0
> > [ 451.305526] 900000000578c9b0 0000000000000001
> > 9000000006b21000 0000000000000005
> > [ 451.305533] 00000001000093a8 00000001000093a8
> > 0000000000000000 0000000000000004
> > [ 451.305540] 90000000058f04e0 0000000000000000
> > 0000000000000002 b793724be1dfb2b8
> > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > 000000000000003f 9000000006c2dfd0
> > [ 451.305554] 9000000006c30c18 0000000000000005
> > 9000000006b0e000 9000000006b21000
> > [ 451.305560] 9000000100453c98 90000001003aff80
> > 9000000006c31140 900000000578c9b0
> > [ 451.305567] 00000001000093a8 9000000005794d3c
> > 00000000000000b4 0000000000000000
> > [ 451.305574] 90000000024021b8 00000001000093a8
> > 9000000004284f20 000000000a400001
> > [ 451.305581] ...
> > [ 451.305584] Call Trace:
> > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> >
> > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> >
> > So this is related to the trampoline patches for sure, unless I am missing something.
> >
> > > > Huacai
> > > >
> > > > >
> > > > > As a side note, if I put the module_attach test in
> > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > it is not skipped.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-09 19:11 ` Vincent Li
@ 2025-08-10 17:39 ` Vincent Li
2025-08-12 8:34 ` Chenghao Duan
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-10 17:39 UTC (permalink / raw)
To: Huacai Chen; +Cc: loongarch, Hengqi Chen, Chenghao Duan, Tiezhu Yang
Hi Chenghao,
On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Chenghao,
> >
> > Please take a look.
> >
> > Huacai
> >
> I reverted the tail call count fix patches and the struct_ops
> trampoline patch from the loongarch-next branch, kept the rest of the
> trampoline patches, and the module_attach test experienced the same
> issue, so this is definitely a trampoline patches issue.
>
I attempted to isolate which test in module_attach triggers the
"Unable to handle kernel paging request..." error; it appears to be
this one in "prog_tests/module_attach.c":

ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");

You can try commenting out the other tests in
"prog_tests/module_attach.c" and rerunning the test; it might help
isolate the issue.
> > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Vincent,
> > > > >
> > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > Hi Folks,
> > > > > >
> > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > running the full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > perform the full bpf selftest, and I observed the lockup too. It appears
> > > > > > the lockup happens when running the module_attach test, which includes
> > > > > > testing fentry, so this could be related to the trampoline patch series.
> > > > > > For example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > locks up immediately.
> > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > other words, does vanilla 6.16 have this problem?
> > > > >
> > > >
> > > > I suspect this is caused by the latest trampoline patches because
> > > > module_attach tests the fentry feature for kernel module functions,
> > > > and I believe Chenghao and I only tested the fentry feature on
> > > > non-module kernel functions. I can try a kernel without the trampoline
> > > > patches and will let you know the result.
> > > >
> > >
> > > I reverted the trampoline patches from the loongarch-next branch and ran
> > > ./test_progs -t module_attach; it simply errors out with the fentry
> > > feature not supported:
> > >
> > > [root@fedora bpf]# ./test_progs -t module_attach
> > > test_module_attach:PASS:skel_open 0 nsec
> > > test_module_attach:PASS:set_attach_target 0 nsec
> > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > test_module_attach:PASS:skel_load 0 nsec
> > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > #205 module_attach:FAIL
> > >
> > > All error logs:
> > > test_module_attach:PASS:skel_open 0 nsec
> > > test_module_attach:PASS:set_attach_target 0 nsec
> > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > test_module_attach:PASS:skel_load 0 nsec
> > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > #205 module_attach:FAIL
> > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > >
> > > I also tested the loongarch-next branch with the trampoline patch series
> > > using a kernel config that does not hard-lock the machine, so I can run
> > > dmesg to check the kernel error log; ./test_progs -t module_attach
> > > results in the kernel log below:
> > >
> > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > 90000000041d5848
> > > [ 419.728629] Oops[#1]:
> > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > 9000000004163938
> > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > time, OOM is now expected behavior.
> > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > 0000000000000010 90000001003b0680
> > > [ 451.305519] 90000000058e0000 0000000000000000
> > > 0000000000000040 9000000006c2dfd0
> > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > 9000000006b21000 0000000000000005
> > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > 0000000000000000 0000000000000004
> > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > 0000000000000002 b793724be1dfb2b8
> > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > 000000000000003f 9000000006c2dfd0
> > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > 9000000006b0e000 9000000006b21000
> > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > 9000000006c31140 900000000578c9b0
> > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > 00000000000000b4 0000000000000000
> > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > 9000000004284f20 000000000a400001
> > > [ 451.305581] ...
> > > [ 451.305584] Call Trace:
> > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > >
> > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > >
> > > So this is related to the trampoline patches for sure, unless I am missing something.
> > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > > As a side note, if I put the module_attach test in
> > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > it is not skipped.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-10 17:39 ` Vincent Li
@ 2025-08-12 8:34 ` Chenghao Duan
2025-08-12 13:42 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Chenghao Duan @ 2025-08-12 8:34 UTC (permalink / raw)
To: Vincent Li; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> Hi Chenghao,
>
> On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Chenghao,
> > >
> > > Please take a look.
> > >
> > > Huacai
> > >
> > I reverted the tail call count fix patches and the struct_ops
> > trampoline patch from the loongarch-next branch, kept the rest of the
> > trampoline patches, and the module_attach test experienced the same
> > issue, so this is definitely a trampoline patches issue.
> >
>
> I attempted to isolate which test in module_attach triggers the
> "Unable to handle kernel paging request..." error, it appears to be
> this one in "prog_tests/module_attach.c"
>
> ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
>
> you can try to comment out other tests in "prog_tests/module_attach.c"
> and perform the test, it might help isolate the issue.
>
Hi Vincent,
My test results differ from yours. Could there be other differences
between our setups? I am using the latest code from the loongarch-next
branch.
[root@localhost bpf]# ./test_progs -v -t module_attach
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
WATCHDOG: test case module_attach executes for 10 seconds...
libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
bpf_testmod_test_read() is not modifiable
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
libbpf: failed to load object 'test_module_attach'
libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
test_module_attach:FAIL:skel_load failed to load skeleton
#205 module_attach:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
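[Editor's sketch] The "bpf_testmod_test_read() is not modifiable" rejection above points at a config difference rather than the trampoline code: a BPF_MODIFY_RETURN program can only target functions on the error-injection allow list (bpf_testmod marks bpf_testmod_test_read with ALLOW_ERROR_INJECTION), and that list only exists when error injection is compiled in. One option worth comparing between the two setups, inferred from the error message and not confirmed in this thread:

```
# Without this, within_error_injection_list() rejects every fmod_ret
# target and the verifier logs "<func>() is not modifiable".
CONFIG_FUNCTION_ERROR_INJECTION=y
```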
Chenghao
>
> > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Vincent,
> > > > > >
> > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Folks,
> > > > > > >
> > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > lockup immediately.
> > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > >
> > > > >
> > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > module_attach is to test the fentry feature for kernel module
> > > > > functions, I believe Chenghao and I only tested the fentry feature for
> > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > patches and will let you know the result.
> > > > >
> > > >
> > > > I reverted trampoline patches from loongarch-next branch and run
> > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > feature not supported
> > > >
> > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > test_module_attach:PASS:skel_open 0 nsec
> > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > test_module_attach:PASS:skel_load 0 nsec
> > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > #205 module_attach:FAIL
> > > >
> > > > All error logs:
> > > > test_module_attach:PASS:skel_open 0 nsec
> > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > test_module_attach:PASS:skel_load 0 nsec
> > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > #205 module_attach:FAIL
> > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > >
> > > > I also tested loongarch-next branch with the trampoline patch series
> > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > log, ./test_progs -t module_attach result in below kernel log:
> > > >
> > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > 90000000041d5848
> > > > [ 419.728629] Oops[#1]:
> > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > 9000000004163938
> > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > time, OOM is now expected behavior.
> > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > 0000000000000010 90000001003b0680
> > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > 0000000000000040 9000000006c2dfd0
> > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > 9000000006b21000 0000000000000005
> > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > 0000000000000000 0000000000000004
> > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > 0000000000000002 b793724be1dfb2b8
> > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > 000000000000003f 9000000006c2dfd0
> > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > 9000000006b0e000 9000000006b21000
> > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > 9000000006c31140 900000000578c9b0
> > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > 00000000000000b4 0000000000000000
> > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > 9000000004284f20 000000000a400001
> > > > [ 451.305581] ...
> > > > [ 451.305584] Call Trace:
> > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > >
> > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > >
> > > > So related to trampoline patches for sure unless I am missing something.
> > > >
> > > > > > Huacai
> > > > > >
> > > > > > >
> > > > > > > A side note, if I put the module_attach test in
> > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > the module_attach test is not skipped.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > Vincent
* Re: kernel lockup on bpf selftests module_attach
2025-08-12 8:34 ` Chenghao Duan
@ 2025-08-12 13:42 ` Vincent Li
2025-08-14 12:00 ` Chenghao Duan
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-12 13:42 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > Hi Chenghao,
> >
> > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Chenghao,
> > > >
> > > > Please take a look.
> > > >
> > > > Huacai
> > > >
> > > I reverted loongson-next branch tailcall count fix patches, struct
> > > ops trampoline patch, keep the rest of trampoline patches,
> > > module_attach test experienced the same issue, so definitely
> > > trampoline patches issue.
> > >
> >
> > I attempted to isolate which test in module_attach triggers the
> > "Unable to handle kernel paging request..." error, it appears to be
> > this one in "prog_tests/module_attach.c"
> >
> > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> >
> > you can try to comment out other tests in "prog_tests/module_attach.c"
> > and perform the test, it might help isolate the issue.
> >
>
> Hi Vincent,
>
> The results I tested are different from yours. Could there be other
> differences between us? I am using the latest code of the loongarch-next
> branch.
>
> [root@localhost bpf]# ./test_progs -v -t module_attach
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> WATCHDOG: test case module_attach executes for 10 seconds...
> libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> bpf_testmod_test_read() is not modifiable
> processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> -- END PROG LOAD LOG --
> libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> libbpf: failed to load object 'test_module_attach'
> libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> test_module_attach:FAIL:skel_load failed to load skeleton
> #205 module_attach:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> Successfully unloaded bpf_testmod.ko.
>
I built and ran the most recent loongarch-next kernel too. Can you try
my config https://www.bpfire.net/download/loongfire/config.txt? I am
on Fedora; here are the steps I use to build the kernel, boot it, and
run the test.
1. Check the branch:
[root@fedora linux-loongson]# git branch
* loongarch-next
master
no-tailcall
no-trampoline
2. Build the kernel and reboot:
cp config.txt .config; make clean; make -j6; make modules_install;
make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
3. After reboot and login, build the bpf selftests, run the
module_attach test, and check the kernel log with dmesg:
cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
>
>
> Chenghao
>
> >
> > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Vincent,
> > > > > > >
> > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi Folks,
> > > > > > > >
> > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > lockup immediately.
> > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > >
> > > > > >
> > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > functions, I believe Chenghao and I only tested the fentry feature for
> > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > patches and will let you know the result.
> > > > > >
> > > > >
> > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > feature not supported
> > > > >
> > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > #205 module_attach:FAIL
> > > > >
> > > > > All error logs:
> > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > #205 module_attach:FAIL
> > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > >
> > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > >
> > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > 90000000041d5848
> > > > > [ 419.728629] Oops[#1]:
> > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > 9000000004163938
> > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > time, OOM is now expected behavior.
> > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > 0000000000000010 90000001003b0680
> > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > 0000000000000040 9000000006c2dfd0
> > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > 9000000006b21000 0000000000000005
> > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > 0000000000000000 0000000000000004
> > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > 0000000000000002 b793724be1dfb2b8
> > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > 000000000000003f 9000000006c2dfd0
> > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > 9000000006b0e000 9000000006b21000
> > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > 9000000006c31140 900000000578c9b0
> > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > 00000000000000b4 0000000000000000
> > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > 9000000004284f20 000000000a400001
> > > > > [ 451.305581] ...
> > > > > [ 451.305584] Call Trace:
> > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > >
> > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > >
> > > > > So related to trampoline patches for sure unless I am missing something.
> > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > > >
> > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > the module_attach test is not skipped.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > Vincent
* Re: kernel lockup on bpf selftests module_attach
2025-08-12 13:42 ` Vincent Li
@ 2025-08-14 12:00 ` Chenghao Duan
2025-08-14 13:42 ` Vincent Li
2025-08-21 15:04 ` Vincent Li
0 siblings, 2 replies; 18+ messages in thread
From: Chenghao Duan @ 2025-08-14 12:00 UTC (permalink / raw)
To: Vincent Li; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > Hi Chenghao,
> > >
> > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Chenghao,
> > > > >
> > > > > Please take a look.
> > > > >
> > > > > Huacai
> > > > >
> > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > module_attach test experienced the same issue, so definitely
> > > > trampoline patches issue.
> > > >
> > >
> > > I attempted to isolate which test in module_attach triggers the
> > > "Unable to handle kernel paging request..." error, it appears to be
> > > this one in "prog_tests/module_attach.c"
> > >
> > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > >
> > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > and perform the test, it might help isolate the issue.
> > >
> >
> > Hi Vincent,
> >
> > The results I tested are different from yours. Could there be other
> > differences between us? I am using the latest code of the loongarch-next
> > branch.
> >
> > [root@localhost bpf]# ./test_progs -v -t module_attach
> > bpf_testmod.ko is already unloaded.
> > Loading bpf_testmod.ko...
> > Successfully loaded bpf_testmod.ko.
> > test_module_attach:PASS:skel_open 0 nsec
> > test_module_attach:PASS:set_attach_target 0 nsec
> > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > WATCHDOG: test case module_attach executes for 10 seconds...
> > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > bpf_testmod_test_read() is not modifiable
> > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > -- END PROG LOAD LOG --
> > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > libbpf: failed to load object 'test_module_attach'
> > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > test_module_attach:FAIL:skel_load failed to load skeleton
> > #205 module_attach:FAIL
> > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > Successfully unloaded bpf_testmod.ko.
> >
>
> I build and run the most recent loongarch-next kernel too, can you try
> my config https://www.bpfire.net/download/loongfire/config.txt? I am
> on fedora, here are the steps I build, run the kernel, and run the
> test
>
> 1, check branch
> [root@fedora linux-loongson]# git branch
> * loongarch-next
> master
> no-tailcall
> no-trampoline
>
> 2, build kernel and reboot
> cp config.txt .config; make clean; make -j6; make modules_install;
> make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
>
> 3, after reboot and login, build bpf selftests, run module_attach
> test, dmesg to check kernel log
> cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
>
Hi Vincent,
I tried the config you provided, but the test results I obtained are as
follows. I also specifically ran modify_return to verify that the patch
works, while the module_attach test returns -EOPNOTSUPP.
[root@localhost bpf]# ./test_progs -v -t modify_return
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
run_test:PASS:skel_load 0 nsec
run_test:PASS:modify_return__attach failed 0 nsec
run_test:PASS:test_run 0 nsec
run_test:PASS:test_run ret 0 nsec
run_test:PASS:modify_return side_effect 0 nsec
run_test:PASS:modify_return fentry_result 0 nsec
run_test:PASS:modify_return fexit_result 0 nsec
run_test:PASS:modify_return fmod_ret_result 0 nsec
run_test:PASS:modify_return fentry_result2 0 nsec
run_test:PASS:modify_return fexit_result2 0 nsec
run_test:PASS:modify_return fmod_ret_result2 0 nsec
run_test:PASS:skel_load 0 nsec
run_test:PASS:modify_return__attach failed 0 nsec
run_test:PASS:test_run 0 nsec
run_test:PASS:test_run ret 0 nsec
run_test:PASS:modify_return side_effect 0 nsec
run_test:PASS:modify_return fentry_result 0 nsec
run_test:PASS:modify_return fexit_result 0 nsec
run_test:PASS:modify_return fmod_ret_result 0 nsec
run_test:PASS:modify_return fentry_result2 0 nsec
run_test:PASS:modify_return fexit_result2 0 nsec
run_test:PASS:modify_return fmod_ret_result2 0 nsec
#200 modify_return:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
Successfully unloaded bpf_testmod.ko.
[root@localhost bpf]# ./test_progs -v -t module_attach
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -95
#201 module_attach:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
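[Editor's note] The kprobe_multi failure above is a separate, pre-existing gap rather than a trampoline regression: kprobe.multi links attach through fprobe, and when fprobe is not available the attach path returns -EOPNOTSUPP. Another knob worth comparing between the two configs, again inferred from the error code rather than confirmed in this thread:

```
# kprobe_multi link attachment is only compiled in when fprobe support
# is enabled; otherwise it fails with -EOPNOTSUPP.
CONFIG_FPROBE=y
```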
Chenghao
>
> >
> >
> > Chenghao
> >
> > >
> > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > Hi, Vincent,
> > > > > > > >
> > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi Folks,
> > > > > > > > >
> > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > lockup immediately.
> > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > > >
> > > > > > >
> > > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > > functions, I believe Chenghao and I only tested the fentry feature for
> > > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > > patches and will let you know the result.
> > > > > > >
> > > > > >
> > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > feature not supported
> > > > > >
> > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > #205 module_attach:FAIL
> > > > > >
> > > > > > All error logs:
> > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > #205 module_attach:FAIL
> > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > >
> > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > >
> > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > 90000000041d5848
> > > > > > [ 419.728629] Oops[#1]:
> > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > 9000000004163938
> > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > time, OOM is now expected behavior.
> > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > 0000000000000010 90000001003b0680
> > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > 9000000006b21000 0000000000000005
> > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > 0000000000000000 0000000000000004
> > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > 9000000006b0e000 9000000006b21000
> > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > 9000000006c31140 900000000578c9b0
> > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > 00000000000000b4 0000000000000000
> > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > 9000000004284f20 000000000a400001
> > > > > > [ 451.305581] ...
> > > > > > [ 451.305584] Call Trace:
> > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > >
> > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > >
> > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > >
> > > > > > > > Huacai
> > > > > > > >
> > > > > > > > >
> > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > the module_attach test is not skipped.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > Vincent
* Re: kernel lockup on bpf selftests module_attach
2025-08-14 12:00 ` Chenghao Duan
@ 2025-08-14 13:42 ` Vincent Li
2025-08-14 13:47 ` Vincent Li
2025-08-21 15:04 ` Vincent Li
1 sibling, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-14 13:42 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > Hi Chenghao,
> > > >
> > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Chenghao,
> > > > > >
> > > > > > Please take a look.
> > > > > >
> > > > > > Huacai
> > > > > >
> > > > > I reverted the loongarch-next branch tailcall count fix patches and the struct
> > > > > ops trampoline patch, kept the rest of the trampoline patches, and the
> > > > > module_attach test experienced the same issue, so it is definitely a
> > > > > trampoline patches issue.
> > > > >
> > > >
> > > > I attempted to isolate which test in module_attach triggers the
> > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > this one in "prog_tests/module_attach.c"
> > > >
> > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > >
> > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > and perform the test, it might help isolate the issue.
> > > >
> > >
> > > Hi Vincent,
> > >
> > > The results I tested are different from yours. Could there be other
> > > differences between us? I am using the latest code of the loongarch-next
> > > branch.
> > >
> > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > bpf_testmod.ko is already unloaded.
> > > Loading bpf_testmod.ko...
> > > Successfully loaded bpf_testmod.ko.
> > > test_module_attach:PASS:skel_open 0 nsec
> > > test_module_attach:PASS:set_attach_target 0 nsec
> > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > bpf_testmod_test_read() is not modifiable
> > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > -- END PROG LOAD LOG --
> > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > libbpf: failed to load object 'test_module_attach'
> > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > #205 module_attach:FAIL
> > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > Successfully unloaded bpf_testmod.ko.
> > >
> >
> > I built and ran the most recent loongarch-next kernel too; can you try
> > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > on Fedora. Here are the steps I use to build the kernel, boot it, and
> > run the test:
> >
> > 1, check branch
> > [root@fedora linux-loongson]# git branch
> > * loongarch-next
> > master
> > no-tailcall
> > no-trampoline
> >
> > 2, build kernel and reboot
> > cp config.txt .config; make clean; make -j6; make modules_install;
> > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> >
> > 3, after reboot and login, build bpf selftests, run module_attach
> > test, dmesg to check kernel log
> > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> >
>
> Hi Vincent,
>
> I tried to apply the config you provided, but the test results I
> obtained are as follows. I also specifically tested "modify_return" to verify
> the effectiveness of the patch, and the module_attach test returns -EOPNOTSUPP.
>
> [root@localhost bpf]# ./test_progs -v -t modify_return
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> run_test:PASS:skel_load 0 nsec
> run_test:PASS:modify_return__attach failed 0 nsec
> run_test:PASS:test_run 0 nsec
> run_test:PASS:test_run ret 0 nsec
> run_test:PASS:modify_return side_effect 0 nsec
> run_test:PASS:modify_return fentry_result 0 nsec
> run_test:PASS:modify_return fexit_result 0 nsec
> run_test:PASS:modify_return fmod_ret_result 0 nsec
> run_test:PASS:modify_return fentry_result2 0 nsec
> run_test:PASS:modify_return fexit_result2 0 nsec
> run_test:PASS:modify_return fmod_ret_result2 0 nsec
> run_test:PASS:skel_load 0 nsec
> run_test:PASS:modify_return__attach failed 0 nsec
> run_test:PASS:test_run 0 nsec
> run_test:PASS:test_run ret 0 nsec
> run_test:PASS:modify_return side_effect 0 nsec
> run_test:PASS:modify_return fentry_result 0 nsec
> run_test:PASS:modify_return fexit_result 0 nsec
> run_test:PASS:modify_return fmod_ret_result 0 nsec
> run_test:PASS:modify_return fentry_result2 0 nsec
> run_test:PASS:modify_return fexit_result2 0 nsec
> run_test:PASS:modify_return fmod_ret_result2 0 nsec
> #200 modify_return:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> Successfully unloaded bpf_testmod.ko.
> [root@localhost bpf]# ./test_progs -v -t module_attach
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> test_module_attach:PASS:skel_load 0 nsec
> libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> #201 module_attach:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> Successfully unloaded bpf_testmod.ko.
>
This is what I got with the addition of -v. It appears you failed at
skel_attach; maybe your libbpf is outdated and does not support
kprobe_multi? My libbpf is 1.5:
/usr/lib64/libbpf.so.1.5.0
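
If it helps to compare environments, something like the sketch below can pull the version out of an installed libbpf shared object's file name. The path is just what my Fedora install uses, and version_from_soname is a made-up helper name, so adjust both for your setup:

```shell
# Parse "libbpf.so.MAJOR.MINOR.PATCH" out of a library path.
# The /usr/lib64 path is an assumption (Fedora layout); adjust per distro.
version_from_soname() {
    echo "$1" | sed -n 's/.*libbpf\.so\.\([0-9][0-9.]*\)$/\1/p'
}

ver=$(version_from_soname /usr/lib64/libbpf.so.1.5.0)
echo "libbpf version: ${ver:-unknown}"
```

I believe test_progs is normally linked against the in-tree tools/lib/bpf rather than the system library, so the system copy is only a rough hint about what the selftests binary actually supports.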
[root@fedora bpf]# ./test_progs -v -t module_attach
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
test_module_attach:PASS:skel_attach 0 nsec
trigger_module_test_read:PASS:testmod_file_open 0 nsec
WATCHDOG: test case module_attach executes for 10 seconds...
WATCHDOG: test case module_attach executes for 120 seconds,
terminating with SIGSEGV
>
> Chenghao
>
> >
> > >
> > >
> > > Chenghao
> > >
> > > >
> > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Vincent,
> > > > > > > > >
> > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Folks,
> > > > > > > > > >
> > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > on fentry, so this could be related to the trampoline patch series. For
> > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > locks up immediately.
> > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > other words, does vanilla 6.16 have this problem?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I suspect this is caused by the latest trampoline patches because
> > > > > > > > module_attach tests the fentry feature for kernel module
> > > > > > > > functions; I believe Chenghao and I only tested the fentry feature for
> > > > > > > > non-module kernel functions. I can try a kernel without the trampoline
> > > > > > > > patches and will let you know the result.
> > > > > > > >
> > > > > > >
> > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > feature not supported
> > > > > > >
> > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > #205 module_attach:FAIL
> > > > > > >
> > > > > > > All error logs:
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > #205 module_attach:FAIL
> > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > >
> > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > >
> > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > 90000000041d5848
> > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > 9000000004163938
> > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > time, OOM is now expected behavior.
> > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > 0000000000000010 90000001003b0680
> > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > 9000000006b21000 0000000000000005
> > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > 0000000000000000 0000000000000004
> > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > 00000000000000b4 0000000000000000
> > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > 9000000004284f20 000000000a400001
> > > > > > > [ 451.305581] ...
> > > > > > > [ 451.305584] Call Trace:
> > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > >
> > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > >
> > > > > > > So this is related to the trampoline patches for sure, unless I am missing something.
> > > > > > >
> > > > > > > > > Huacai
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-14 13:42 ` Vincent Li
@ 2025-08-14 13:47 ` Vincent Li
0 siblings, 0 replies; 18+ messages in thread
From: Vincent Li @ 2025-08-14 13:47 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 14, 2025 at 6:42 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > Hi Chenghao,
> > > > >
> > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Chenghao,
> > > > > > >
> > > > > > > Please take a look.
> > > > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > I reverted the loongarch-next branch tailcall count fix patches and the struct
> > > > > > ops trampoline patch, kept the rest of the trampoline patches, and the
> > > > > > module_attach test experienced the same issue, so it is definitely a
> > > > > > trampoline patches issue.
> > > > > >
> > > > >
> > > > > I attempted to isolate which test in module_attach triggers the
> > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > this one in "prog_tests/module_attach.c"
> > > > >
> > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > >
> > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > and perform the test, it might help isolate the issue.
> > > > >
> > > >
> > > > Hi Vincent,
> > > >
> > > > The results I tested are different from yours. Could there be other
> > > > differences between us? I am using the latest code of the loongarch-next
> > > > branch.
> > > >
> > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > bpf_testmod.ko is already unloaded.
> > > > Loading bpf_testmod.ko...
> > > > Successfully loaded bpf_testmod.ko.
> > > > test_module_attach:PASS:skel_open 0 nsec
> > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > bpf_testmod_test_read() is not modifiable
> > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > -- END PROG LOAD LOG --
> > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > libbpf: failed to load object 'test_module_attach'
> > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > #205 module_attach:FAIL
> > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > Successfully unloaded bpf_testmod.ko.
> > > >
> > >
> > > I built and ran the most recent loongarch-next kernel too; can you try
> > > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > > on Fedora. Here are the steps I use to build the kernel, boot it, and
> > > run the test:
> > >
> > > 1, check branch
> > > [root@fedora linux-loongson]# git branch
> > > * loongarch-next
> > > master
> > > no-tailcall
> > > no-trampoline
> > >
> > > 2, build kernel and reboot
> > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > >
> > > 3, after reboot and login, build bpf selftests, run module_attach
> > > test, dmesg to check kernel log
> > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > >
> >
> > Hi Vincent,
> >
> > I tried to apply the config you provided, but the test results I
> > obtained are as follows. I also specifically tested "modify_return" to verify
> > the effectiveness of the patch, and the module_attach test returns -EOPNOTSUPP.
> >
> > [root@localhost bpf]# ./test_progs -v -t modify_return
> > bpf_testmod.ko is already unloaded.
> > Loading bpf_testmod.ko...
> > Successfully loaded bpf_testmod.ko.
> > run_test:PASS:skel_load 0 nsec
> > run_test:PASS:modify_return__attach failed 0 nsec
> > run_test:PASS:test_run 0 nsec
> > run_test:PASS:test_run ret 0 nsec
> > run_test:PASS:modify_return side_effect 0 nsec
> > run_test:PASS:modify_return fentry_result 0 nsec
> > run_test:PASS:modify_return fexit_result 0 nsec
> > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > run_test:PASS:modify_return fentry_result2 0 nsec
> > run_test:PASS:modify_return fexit_result2 0 nsec
> > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > run_test:PASS:skel_load 0 nsec
> > run_test:PASS:modify_return__attach failed 0 nsec
> > run_test:PASS:test_run 0 nsec
> > run_test:PASS:test_run ret 0 nsec
> > run_test:PASS:modify_return side_effect 0 nsec
> > run_test:PASS:modify_return fentry_result 0 nsec
> > run_test:PASS:modify_return fexit_result 0 nsec
> > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > run_test:PASS:modify_return fentry_result2 0 nsec
> > run_test:PASS:modify_return fexit_result2 0 nsec
> > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > #200 modify_return:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > Successfully unloaded bpf_testmod.ko.
> > [root@localhost bpf]# ./test_progs -v -t module_attach
> > bpf_testmod.ko is already unloaded.
> > Loading bpf_testmod.ko...
> > Successfully loaded bpf_testmod.ko.
> > test_module_attach:PASS:skel_open 0 nsec
> > test_module_attach:PASS:set_attach_target 0 nsec
> > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > test_module_attach:PASS:skel_load 0 nsec
> > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > #201 module_attach:FAIL
> > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > Successfully unloaded bpf_testmod.ko.
> >
> This is what I got with the addition of -v. It appears you failed at
> skel_attach; maybe your libbpf is outdated and does not support
> kprobe_multi? My libbpf is 1.5:
>
> /usr/lib64/libbpf.so.1.5.0
>
Could you also double-check that you have the below in your .config? I
recall kprobe_multi requires CONFIG_FPROBE.
[root@fedora linux-loongson]# grep 'KPROBE' .config
CONFIG_KPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
CONFIG_BPF_KPROBE_OVERRIDE=y
# CONFIG_KPROBE_EVENT_GEN_TEST is not set
[root@fedora linux-loongson]# grep 'FPROBE' .config
CONFIG_FPROBE=y
CONFIG_FPROBE_EVENTS=y
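
To save a round trip, a quick sanity check along these lines can confirm a .config enables those options before building. The option list is just copied from my grep above (treating CONFIG_FPROBE as the key requirement for kprobe_multi is my recollection, not something I verified in the sources), and check_cfg is an illustrative helper name, not anything from the tree:

```shell
# Check that a kernel .config enables the options kprobe_multi is thought
# to need (list taken from the grep output above; not exhaustive).
check_cfg() {
    missing=0
    for opt in CONFIG_KPROBES CONFIG_KPROBES_ON_FTRACE CONFIG_FPROBE; do
        grep -q "^${opt}=y" "$1" || { echo "${opt} missing"; missing=1; }
    done
    [ "$missing" -eq 0 ] && echo "kprobe_multi prerequisites present"
}

# Demo against a generated sample config standing in for the real .config:
printf 'CONFIG_KPROBES=y\nCONFIG_KPROBES_ON_FTRACE=y\nCONFIG_FPROBE=y\n' > sample.config
check_cfg sample.config
```

Running it against the real .config before `make` would catch a missing FPROBE dependency without waiting for a full selftest cycle.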
> [root@fedora bpf]# ./test_progs -v -t module_attach
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> test_module_attach:PASS:skel_load 0 nsec
> test_module_attach:PASS:skel_attach 0 nsec
> trigger_module_test_read:PASS:testmod_file_open 0 nsec
> WATCHDOG: test case module_attach executes for 10 seconds...
> WATCHDOG: test case module_attach executes for 120 seconds,
> terminating with SIGSEGV
>
> >
> > Chenghao
> >
> > >
> > > >
> > > >
> > > > Chenghao
> > > >
> > > > >
> > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Vincent,
> > > > > > > > > >
> > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Folks,
> > > > > > > > > > >
> > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > on fentry, so this could be related to the trampoline patch series. For
> > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > locks up immediately.
> > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > other words, does vanilla 6.16 have this problem?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I suspect this is caused by the latest trampoline patches because
> > > > > > > > > module_attach tests the fentry feature for kernel module
> > > > > > > > > functions; I believe Chenghao and I only tested the fentry feature for
> > > > > > > > > non-module kernel functions. I can try a kernel without the trampoline
> > > > > > > > > patches and will let you know the result.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > > feature not supported
> > > > > > > >
> > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > #205 module_attach:FAIL
> > > > > > > >
> > > > > > > > All error logs:
> > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > #205 module_attach:FAIL
> > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > >
> > > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > > >
> > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > 90000000041d5848
> > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > 9000000004163938
> > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > time, OOM is now expected behavior.
> > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > [ 451.305581] ...
> > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > >
> > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > >
> > > > > > > > So this is related to the trampoline patches for sure, unless I am missing something.
> > > > > > > >
> > > > > > > > > > Huacai
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-14 12:00 ` Chenghao Duan
2025-08-14 13:42 ` Vincent Li
@ 2025-08-21 15:04 ` Vincent Li
2025-08-22 3:11 ` Chenghao Duan
1 sibling, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-21 15:04 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > Hi Chenghao,
> > > >
> > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Chenghao,
> > > > > >
> > > > > > Please take a look.
> > > > > >
> > > > > > Huacai
> > > > > >
> > > > > I reverted the loongarch-next branch tailcall count fix patches and the struct
> > > > > ops trampoline patch, kept the rest of the trampoline patches, and the
> > > > > module_attach test experienced the same issue, so it is definitely a
> > > > > trampoline patches issue.
> > > > >
> > > >
> > > > I attempted to isolate which test in module_attach triggers the
> > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > this one in "prog_tests/module_attach.c"
> > > >
> > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > >
> > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > and perform the test, it might help isolate the issue.
> > > >
> > >
> > > Hi Vincent,
> > >
> > > The results I tested are different from yours. Could there be other
> > > differences between us? I am using the latest code of the loongarch-next
> > > branch.
> > >
> > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > bpf_testmod.ko is already unloaded.
> > > Loading bpf_testmod.ko...
> > > Successfully loaded bpf_testmod.ko.
> > > test_module_attach:PASS:skel_open 0 nsec
> > > test_module_attach:PASS:set_attach_target 0 nsec
> > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > bpf_testmod_test_read() is not modifiable
> > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > -- END PROG LOAD LOG --
> > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > libbpf: failed to load object 'test_module_attach'
> > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > #205 module_attach:FAIL
> > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > Successfully unloaded bpf_testmod.ko.
> > >
> >
> > I build and run the most recent loongarch-next kernel too, can you try
> > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > on fedora, here are the steps I build, run the kernel, and run the
> > test
> >
> > 1, check branch
> > [root@fedora linux-loongson]# git branch
> > * loongarch-next
> > master
> > no-tailcall
> > no-trampoline
> >
> > 2, build kernel and reboot
> > cp config.txt .config; make clean; make -j6; make modules_install;
> > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> >
> > 3, after reboot and login, build bpf selftests, run module_attach
> > test, dmesg to check kernel log
> > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> >
>
> Hi Vincent,
>
> I tried to refer to the config you provided, but the test results I
> obtained are as follows. I also specifically tested "modify" to verify
> the effectiveness of the patch, and the test of module_attach returns -EOPNOTSUPP.
>
> [root@localhost bpf]# ./test_progs -v -t modify_return
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> run_test:PASS:skel_load 0 nsec
> run_test:PASS:modify_return__attach failed 0 nsec
> run_test:PASS:test_run 0 nsec
> run_test:PASS:test_run ret 0 nsec
> run_test:PASS:modify_return side_effect 0 nsec
> run_test:PASS:modify_return fentry_result 0 nsec
> run_test:PASS:modify_return fexit_result 0 nsec
> run_test:PASS:modify_return fmod_ret_result 0 nsec
> run_test:PASS:modify_return fentry_result2 0 nsec
> run_test:PASS:modify_return fexit_result2 0 nsec
> run_test:PASS:modify_return fmod_ret_result2 0 nsec
> run_test:PASS:skel_load 0 nsec
> run_test:PASS:modify_return__attach failed 0 nsec
> run_test:PASS:test_run 0 nsec
> run_test:PASS:test_run ret 0 nsec
> run_test:PASS:modify_return side_effect 0 nsec
> run_test:PASS:modify_return fentry_result 0 nsec
> run_test:PASS:modify_return fexit_result 0 nsec
> run_test:PASS:modify_return fmod_ret_result 0 nsec
> run_test:PASS:modify_return fentry_result2 0 nsec
> run_test:PASS:modify_return fexit_result2 0 nsec
> run_test:PASS:modify_return fmod_ret_result2 0 nsec
> #200 modify_return:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> Successfully unloaded bpf_testmod.ko.
> [root@localhost bpf]# ./test_progs -v -t module_attach
> bpf_testmod.ko is already unloaded.
> Loading bpf_testmod.ko...
> Successfully loaded bpf_testmod.ko.
> test_module_attach:PASS:skel_open 0 nsec
> test_module_attach:PASS:set_attach_target 0 nsec
> test_module_attach:PASS:set_attach_target_explicit 0 nsec
> test_module_attach:PASS:skel_load 0 nsec
> libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
The -EOPNOTSUPP comes from libbpf, but I am not sure whether a kernel
error leads to the libbpf error or it comes from libbpf itself. You can
run strace -f -s1024 -o /tmp/module_attach.txt ./test_progs -v -t
module_attach. The strace output should include the bpf() syscalls, and
I think it can tell you whether the -EOPNOTSUPP is the result of a
kernel error or of libbpf; you can share the strace log if you want.
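In the strace log, a kernel-originated failure shows up as the bpf() syscall itself returning -1 with an errno, rather than libbpf failing before reaching the syscall. A minimal sketch of filtering for that (the sample log line below is illustrative and stands in for the real /tmp/module_attach.txt):

```shell
# Simulate a captured strace log with one unrelated call and one failing
# bpf() call; in practice this file comes from the strace command above.
cat > /tmp/strace_sample.txt <<'EOF'
2037 read(16, "", 8192) = 0
2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, attach_type=BPF_TRACE_KPROBE_MULTI}}, 64) = -1 EOPNOTSUPP (Operation not supported)
EOF
# Any bpf() line matching "= -1 E..." means the errno came from the
# kernel; if no bpf() call appears at all, the error was libbpf's own.
grep -E 'bpf\(.*= -1 E' /tmp/strace_sample.txt
```

This prints only the failing BPF_LINK_CREATE line, which is exactly the evidence needed to decide kernel vs. libbpf.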
> test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> #201 module_attach:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> Successfully unloaded bpf_testmod.ko.
>
>
> Chenghao
>
> >
> > >
> > >
> > > Chenghao
> > >
> > > >
> > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Vincent,
> > > > > > > > >
> > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Folks,
> > > > > > > > > >
> > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > lockup immediately.
> > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > > > functions, I believe Changhao and I only tested the fentry feature for
> > > > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > > > patches and will let you know the result.
> > > > > > > >
> > > > > > >
> > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > feature not supported
> > > > > > >
> > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > #205 module_attach:FAIL
> > > > > > >
> > > > > > > All error logs:
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > #205 module_attach:FAIL
> > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > >
> > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > >
> > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > 90000000041d5848
> > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > 9000000004163938
> > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > time, OOM is now expected behavior.
> > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > 0000000000000010 90000001003b0680
> > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > 9000000006b21000 0000000000000005
> > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > 0000000000000000 0000000000000004
> > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > 00000000000000b4 0000000000000000
> > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > 9000000004284f20 000000000a400001
> > > > > > > [ 451.305581] ...
> > > > > > > [ 451.305584] Call Trace:
> > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > >
> > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > >
> > > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > > >
> > > > > > > > > Huacai
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-21 15:04 ` Vincent Li
@ 2025-08-22 3:11 ` Chenghao Duan
2025-08-22 5:10 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Chenghao Duan @ 2025-08-22 3:11 UTC (permalink / raw)
To: Vincent Li; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 21, 2025 at 08:04:07AM -0700, Vincent Li wrote:
> On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > Hi Chenghao,
> > > > >
> > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Chenghao,
> > > > > > >
> > > > > > > Please take a look.
> > > > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > > > module_attach test experienced the same issue, so definitely
> > > > > > trampoline patches issue.
> > > > > >
> > > > >
> > > > > I attempted to isolate which test in module_attach triggers the
> > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > this one in "prog_tests/module_attach.c"
> > > > >
> > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > >
> > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > and perform the test, it might help isolate the issue.
> > > > >
> > > >
> > > > Hi Vincent,
> > > >
> > > > The results I tested are different from yours. Could there be other
> > > > differences between us? I am using the latest code of the loongarch-next
> > > > branch.
> > > >
> > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > bpf_testmod.ko is already unloaded.
> > > > Loading bpf_testmod.ko...
> > > > Successfully loaded bpf_testmod.ko.
> > > > test_module_attach:PASS:skel_open 0 nsec
> > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > bpf_testmod_test_read() is not modifiable
> > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > -- END PROG LOAD LOG --
> > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > libbpf: failed to load object 'test_module_attach'
> > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > #205 module_attach:FAIL
> > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > Successfully unloaded bpf_testmod.ko.
> > > >
> > >
> > > I build and run the most recent loongarch-next kernel too, can you try
> > > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > > on fedora, here are the steps I build, run the kernel, and run the
> > > test
> > >
> > > 1, check branch
> > > [root@fedora linux-loongson]# git branch
> > > * loongarch-next
> > > master
> > > no-tailcall
> > > no-trampoline
> > >
> > > 2, build kernel and reboot
> > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > >
> > > 3, after reboot and login, build bpf selftests, run module_attach
> > > test, dmesg to check kernel log
> > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > >
> >
> > Hi Vincent,
> >
> > I tried to refer to the config you provided, but the test results I
> > obtained are as follows. I also specifically tested "modify" to verify
> > the effectiveness of the patch, and the test of module_attach returns -EOPNOTSUPP.
> >
> > [root@localhost bpf]# ./test_progs -v -t modify_return
> > bpf_testmod.ko is already unloaded.
> > Loading bpf_testmod.ko...
> > Successfully loaded bpf_testmod.ko.
> > run_test:PASS:skel_load 0 nsec
> > run_test:PASS:modify_return__attach failed 0 nsec
> > run_test:PASS:test_run 0 nsec
> > run_test:PASS:test_run ret 0 nsec
> > run_test:PASS:modify_return side_effect 0 nsec
> > run_test:PASS:modify_return fentry_result 0 nsec
> > run_test:PASS:modify_return fexit_result 0 nsec
> > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > run_test:PASS:modify_return fentry_result2 0 nsec
> > run_test:PASS:modify_return fexit_result2 0 nsec
> > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > run_test:PASS:skel_load 0 nsec
> > run_test:PASS:modify_return__attach failed 0 nsec
> > run_test:PASS:test_run 0 nsec
> > run_test:PASS:test_run ret 0 nsec
> > run_test:PASS:modify_return side_effect 0 nsec
> > run_test:PASS:modify_return fentry_result 0 nsec
> > run_test:PASS:modify_return fexit_result 0 nsec
> > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > run_test:PASS:modify_return fentry_result2 0 nsec
> > run_test:PASS:modify_return fexit_result2 0 nsec
> > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > #200 modify_return:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > Successfully unloaded bpf_testmod.ko.
> > [root@localhost bpf]# ./test_progs -v -t module_attach
> > bpf_testmod.ko is already unloaded.
> > Loading bpf_testmod.ko...
> > Successfully loaded bpf_testmod.ko.
> > test_module_attach:PASS:skel_open 0 nsec
> > test_module_attach:PASS:set_attach_target 0 nsec
> > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > test_module_attach:PASS:skel_load 0 nsec
> > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
>
> The -EOPNOTSUPP comes from libbpf, but I am not sure whether a kernel
> error leads to the libbpf error or it comes from libbpf itself. You can
> run strace -f -s1024 -o /tmp/module_attach.txt ./test_progs -v -t
> module_attach. The strace output should include the bpf() syscalls, and
> I think it can tell you whether the -EOPNOTSUPP is the result of a
> kernel error or of libbpf; you can share the strace log if you want.
>
2037 read(16, "", 8192) = 0
2037 close(16) = 0
2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, target_fd=0, attach_type=BPF_TRACE_KPROBE_MULTI, flags=0, kprobe_multi={flags=0, cnt=1, syms=NULL, addrs=[0xffff8000035717d0], cookies=NULL}}}, 64) = -1 EOPNOTSUPP (Operation not supported)
2037 write(1, "libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP\n", 59) = 59
2037 write(1, "libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP\n", 64) = 64
2037 write(1, "test_module_attach:FAIL:skel_attach skeleton attach failed: -95\n", 64) = 64
So the kernel does not support attach_type BPF_TRACE_KPROBE_MULTI.
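For what it's worth, in mainline the kprobe-multi link type is implemented on top of fprobe, so a kernel built without CONFIG_FPROBE rejects BPF_TRACE_KPROBE_MULTI at link creation with -EOPNOTSUPP. A quick sketch for checking the running kernel (assuming one of the usual config locations exists; the fallback messages are mine):

```shell
# Look for fprobe support in the running kernel's config; kprobe_multi
# attachment needs it. Config paths differ by distro, so try both
# common locations before giving up.
cfg="/boot/config-$(uname -r)"
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | grep CONFIG_FPROBE || echo "CONFIG_FPROBE not set"
elif [ -r "$cfg" ]; then
    grep CONFIG_FPROBE "$cfg" || echo "CONFIG_FPROBE not set"
else
    echo "kernel config not found; check the build .config instead"
fi
```

A "# CONFIG_FPROBE is not set" line (or no line at all) would be consistent with the -EOPNOTSUPP seen here.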
Chenghao
>
> > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > #201 module_attach:FAIL
> > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > Successfully unloaded bpf_testmod.ko.
> >
> >
> > Chenghao
> >
> > >
> > > >
> > > >
> > > > Chenghao
> > > >
> > > > >
> > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Vincent,
> > > > > > > > > >
> > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Folks,
> > > > > > > > > > >
> > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > lockup immediately.
> > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > > > > functions, I believe Changhao and I only tested the fentry feature for
> > > > > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > > > > patches and will let you know the result.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > > feature not supported
> > > > > > > >
> > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > #205 module_attach:FAIL
> > > > > > > >
> > > > > > > > All error logs:
> > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > #205 module_attach:FAIL
> > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > >
> > > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > > >
> > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > 90000000041d5848
> > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > 9000000004163938
> > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > time, OOM is now expected behavior.
> > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > [ 451.305581] ...
> > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > >
> > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > >
> > > > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > > > >
> > > > > > > > > > Huacai
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-22 3:11 ` Chenghao Duan
@ 2025-08-22 5:10 ` Vincent Li
2025-08-22 5:22 ` Vincent Li
0 siblings, 1 reply; 18+ messages in thread
From: Vincent Li @ 2025-08-22 5:10 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 21, 2025 at 8:11 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Thu, Aug 21, 2025 at 08:04:07AM -0700, Vincent Li wrote:
> > On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > > Hi Chenghao,
> > > > > >
> > > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > >
> > > > > > > > Hi, Chenghao,
> > > > > > > >
> > > > > > > > Please take a look.
> > > > > > > >
> > > > > > > > Huacai
> > > > > > > >
> > > > > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > > > > module_attach test experienced the same issue, so definitely
> > > > > > > trampoline patches issue.
> > > > > > >
> > > > > >
> > > > > > I attempted to isolate which test in module_attach triggers the
> > > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > > this one in "prog_tests/module_attach.c"
> > > > > >
> > > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > > >
> > > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > > and perform the test, it might help isolate the issue.
> > > > > >
> > > > >
> > > > > Hi Vincent,
> > > > >
> > > > > The results I tested are different from yours. Could there be other
> > > > > differences between us? I am using the latest code of the loongarch-next
> > > > > branch.
> > > > >
> > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > bpf_testmod.ko is already unloaded.
> > > > > Loading bpf_testmod.ko...
> > > > > Successfully loaded bpf_testmod.ko.
> > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > > bpf_testmod_test_read() is not modifiable
> > > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > > -- END PROG LOAD LOG --
> > > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > > libbpf: failed to load object 'test_module_attach'
> > > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > > #205 module_attach:FAIL
> > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > Successfully unloaded bpf_testmod.ko.
> > > > >
> > > >
> > > > I build and run the most recent loongarch-next kernel too, can you try
> > > > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > > > on fedora, here are the steps I build, run the kernel, and run the
> > > > test
> > > >
> > > > 1, check branch
> > > > [root@fedora linux-loongson]# git branch
> > > > * loongarch-next
> > > > master
> > > > no-tailcall
> > > > no-trampoline
> > > >
> > > > 2, build kernel and reboot
> > > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > > >
> > > > 3, after reboot and login, build bpf selftests, run module_attach
> > > > test, dmesg to check kernel log
> > > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > > >
> > >
> > > Hi Vincent,
> > >
> > > I tried to refer to the config you provided, but the test results I
> > > obtained are as follows. I also specifically tested "modify" to verify
> > > the effectiveness of the patch, and the test of module_attach returns -EOPNOTSUPP.
> > >
> > > [root@localhost bpf]# ./test_progs -v -t modify_return
> > > bpf_testmod.ko is already unloaded.
> > > Loading bpf_testmod.ko...
> > > Successfully loaded bpf_testmod.ko.
> > > run_test:PASS:skel_load 0 nsec
> > > run_test:PASS:modify_return__attach failed 0 nsec
> > > run_test:PASS:test_run 0 nsec
> > > run_test:PASS:test_run ret 0 nsec
> > > run_test:PASS:modify_return side_effect 0 nsec
> > > run_test:PASS:modify_return fentry_result 0 nsec
> > > run_test:PASS:modify_return fexit_result 0 nsec
> > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > run_test:PASS:skel_load 0 nsec
> > > run_test:PASS:modify_return__attach failed 0 nsec
> > > run_test:PASS:test_run 0 nsec
> > > run_test:PASS:test_run ret 0 nsec
> > > run_test:PASS:modify_return side_effect 0 nsec
> > > run_test:PASS:modify_return fentry_result 0 nsec
> > > run_test:PASS:modify_return fexit_result 0 nsec
> > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > #200 modify_return:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > Successfully unloaded bpf_testmod.ko.
> > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > bpf_testmod.ko is already unloaded.
> > > Loading bpf_testmod.ko...
> > > Successfully loaded bpf_testmod.ko.
> > > test_module_attach:PASS:skel_open 0 nsec
> > > test_module_attach:PASS:set_attach_target 0 nsec
> > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > test_module_attach:PASS:skel_load 0 nsec
> > > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> >
> > the -EOPNOTSUPP comes from libbpf, but I am not sure whether a kernel
> > error leads to the libbpf error or it originates in libbpf itself. You
> > can do: strace -f -s1024 -o /tmp/module_attach.txt ./test_progs -v -t
> > module_attach. The strace output should include the bpf syscalls, and
> > I think it can tell you whether the -EOPNOTSUPP comes from the kernel
> > or from libbpf; you can share the strace output if you want.
> >
> 2037 read(16, "", 8192) = 0
> 2037 close(16) = 0
> 2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, target_fd=0, attach_type=BPF_TRACE_KPROBE_MULTI, flags=0, kprobe_multi={flags=0, cnt=1, syms=NULL, addrs=[0xffff8000035717d0], cookies=NULL}}}, 64) = -1 EOPNOTSUPP (Operation not supported)
so the bpf() syscall's BPF_LINK_CREATE command returns '-1 EOPNOTSUPP'
exactly? I could not tell before because I thought the return value was
just '-1'
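(For reference, my reading of that strace line: "= -1 EOPNOTSUPP
(Operation not supported)" means the bpf(2) syscall returned -1 to
userspace with errno set to EOPNOTSUPP (95); inside the kernel the
handler returned -EOPNOTSUPP, which libbpf reports as -95. A minimal
sketch of the -1/errno convention, using an unrelated syscall that is
guaranteed to fail; the file name is made up:)

```shell
# openat() on a missing file returns -1 with errno = ENOENT; under
# strace this shows up as "= -1 ENOENT (No such file or directory)",
# the same shape as the bpf() line above.
cat /nonexistent-file-for-demo 2>/dev/null
echo "cat exit status: $?"
```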
> 2037 write(1, "libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP\n", 59) = 59
> 2037 write(1, "libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP\n", 64) = 64
> 2037 write(1, "test_module_attach:FAIL:skel_attach skeleton attach failed: -95\n", 64) = 64
>
> not support attach_type BPF_TRACE_KPROBE_MULTI
>
Could you share your kernel config (the .config used to compile the
kernel, or the running kernel's /boot/config-*)? I wonder whether you
actually have CONFIG_FPROBE enabled, since include/linux/fprobe.h has:
#ifdef CONFIG_FPROBE
int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter);
int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
int unregister_fprobe(struct fprobe *fp);
bool fprobe_is_registered(struct fprobe *fp);
int fprobe_count_ips_from_filter(const char *filter, const char *notfilter);
#else
static inline int register_fprobe(struct fprobe *fp, const char *filter,
				  const char *notfilter)
{
	return -EOPNOTSUPP;
}
static inline int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs,
				      int num)
{
	return -EOPNOTSUPP;
}
static inline int register_fprobe_syms(struct fprobe *fp, const char **syms,
				       int num)
{
	return -EOPNOTSUPP;
}
static inline int unregister_fprobe(struct fprobe *fp)
{
	return -EOPNOTSUPP;
}
static inline bool fprobe_is_registered(struct fprobe *fp)
{
	return false;
}
static inline int fprobe_count_ips_from_filter(const char *filter,
					       const char *notfilter)
{
	return -EOPNOTSUPP;
}
#endif
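A quick way to check is something like the sketch below (the paths are
assumptions: /boot/config-$(uname -r) is the usual Fedora location, and
/proc/config.gz only exists when CONFIG_IKCONFIG_PROC=y):

```shell
# Look for CONFIG_FPROBE in the running kernel's config; prints the
# matching lines, or a fallback note if no readable config is found.
result=""
for cfg in "/boot/config-$(uname -r)" /proc/config.gz; do
    [ -r "$cfg" ] || continue
    case "$cfg" in
        *.gz) result=$(zcat "$cfg" | grep 'CONFIG_FPROBE') ;;
        *)    result=$(grep 'CONFIG_FPROBE' "$cfg") ;;
    esac
    if [ -n "$result" ]; then break; fi
done
echo "${result:-no CONFIG_FPROBE line found (no readable config?)}"
```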
> Chenghao
>
>
> >
> > > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > > #201 module_attach:FAIL
> > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > Successfully unloaded bpf_testmod.ko.
> > >
> > >
> > > Chenghao
> > >
> > > >
> > > > >
> > > > >
> > > > > Chenghao
> > > > >
> > > > > >
> > > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi, Vincent,
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Folks,
> > > > > > > > > > > >
> > > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > > lockup immediately.
> > > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > > > > > functions, I believe Chenghao and I only tested the fentry feature for
> > > > > > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > > > > > patches and will let you know the result.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > > > feature not supported
> > > > > > > > >
> > > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > #205 module_attach:FAIL
> > > > > > > > >
> > > > > > > > > All error logs:
> > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > > >
> > > > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > > > >
> > > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > > 90000000041d5848
> > > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > > 9000000004163938
> > > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > > time, OOM is now expected behavior.
> > > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > > [ 451.305581] ...
> > > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > > >
> > > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > > >
> > > > > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > > > > >
> > > > > > > > > > > Huacai
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > Vincent
* Re: kernel lockup on bpf selftests module_attach
2025-08-22 5:10 ` Vincent Li
@ 2025-08-22 5:22 ` Vincent Li
2025-08-22 5:33 ` Vincent Li
2025-08-22 5:36 ` Chenghao Duan
0 siblings, 2 replies; 18+ messages in thread
From: Vincent Li @ 2025-08-22 5:22 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 21, 2025 at 10:10 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Thu, Aug 21, 2025 at 8:11 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Thu, Aug 21, 2025 at 08:04:07AM -0700, Vincent Li wrote:
> > > On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > > >
> > > > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > > > Hi Chenghao,
> > > > > > >
> > > > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Chenghao,
> > > > > > > > >
> > > > > > > > > Please take a look.
> > > > > > > > >
> > > > > > > > > Huacai
> > > > > > > > >
> > > > > > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > > > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > > > > > module_attach test experienced the same issue, so definitely
> > > > > > > > trampoline patches issue.
> > > > > > > >
> > > > > > >
> > > > > > > I attempted to isolate which test in module_attach triggers the
> > > > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > > > this one in "prog_tests/module_attach.c"
> > > > > > >
> > > > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > > > >
> > > > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > > > and perform the test, it might help isolate the issue.
> > > > > > >
> > > > > >
> > > > > > Hi Vincent,
> > > > > >
> > > > > > The results I tested are different from yours. Could there be other
> > > > > > differences between us? I am using the latest code of the loongarch-next
> > > > > > branch.
> > > > > >
> > > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > > bpf_testmod.ko is already unloaded.
> > > > > > Loading bpf_testmod.ko...
> > > > > > Successfully loaded bpf_testmod.ko.
> > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > > > bpf_testmod_test_read() is not modifiable
> > > > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > > > -- END PROG LOAD LOG --
> > > > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > > > libbpf: failed to load object 'test_module_attach'
> > > > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > > > #205 module_attach:FAIL
> > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > Successfully unloaded bpf_testmod.ko.
> > > > > >
> > > > >
> > > > > I build and run the most recent loongarch-next kernel too, can you try
> > > > > my config https://www.bpfire.net/download/loongfire/config.txt? I am
> > > > > on fedora, here are the steps I build, run the kernel, and run the
> > > > > test
> > > > >
> > > > > 1, check branch
> > > > > [root@fedora linux-loongson]# git branch
> > > > > * loongarch-next
> > > > > master
> > > > > no-tailcall
> > > > > no-trampoline
> > > > >
> > > > > 2, build kernel and reboot
> > > > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > > > >
> > > > > 3, after reboot and login, build bpf selftests, run module_attach
> > > > > test, dmesg to check kernel log
> > > > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > > > >
> > > >
> > > > Hi Vincent,
> > > >
> > > > I tried to refer to the config you provided, but the test results I
> > > > obtained are as follows. I also specifically tested "modify_return" to verify
> > > > the effectiveness of the patch, and the test of module_attach returns -EOPNOTSUPP.
> > > >
> > > > [root@localhost bpf]# ./test_progs -v -t modify_return
> > > > bpf_testmod.ko is already unloaded.
> > > > Loading bpf_testmod.ko...
> > > > Successfully loaded bpf_testmod.ko.
> > > > run_test:PASS:skel_load 0 nsec
> > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > run_test:PASS:test_run 0 nsec
> > > > run_test:PASS:test_run ret 0 nsec
> > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > run_test:PASS:skel_load 0 nsec
> > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > run_test:PASS:test_run 0 nsec
> > > > run_test:PASS:test_run ret 0 nsec
> > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > #200 modify_return:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > Successfully unloaded bpf_testmod.ko.
> > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > bpf_testmod.ko is already unloaded.
> > > > Loading bpf_testmod.ko...
> > > > Successfully loaded bpf_testmod.ko.
> > > > test_module_attach:PASS:skel_open 0 nsec
> > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > test_module_attach:PASS:skel_load 0 nsec
> > > > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > > > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> > >
> > > the -EOPNOTSUPP comes from libbpf, but I am not sure whether a kernel
> > > error leads to the libbpf error or it originates in libbpf itself. You
> > > can do: strace -f -s1024 -o /tmp/module_attach.txt ./test_progs -v -t
> > > module_attach. The strace output should include the bpf syscalls, and
> > > I think it can tell you whether the -EOPNOTSUPP comes from the kernel
> > > or from libbpf; you can share the strace output if you want.
> > >
> > 2037 read(16, "", 8192) = 0
> > 2037 close(16) = 0
> > 2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, target_fd=0, attach_type=BPF_TRACE_KPROBE_MULTI, flags=0, kprobe_multi={flags=0, cnt=1, syms=NULL, addrs=[0xffff8000035717d0], cookies=NULL}}}, 64) = -1 EOPNOTSUPP (Operation not supported)
>
> so the bpf() syscall's BPF_LINK_CREATE command returns '-1 EOPNOTSUPP'
> exactly? I could not tell before because I thought the return value was
> just '-1'
>
> > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP\n", 59) = 59
> > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP\n", 64) = 64
> > 2037 write(1, "test_module_attach:FAIL:skel_attach skeleton attach failed: -95\n", 64) = 64
> >
> > not support attach_type BPF_TRACE_KPROBE_MULTI
> >
>
> Could you share your kernel config (the .config used to compile the
> kernel, or the running kernel's /boot/config-*)? I wonder whether you
> actually have CONFIG_FPROBE enabled, since include/linux/fprobe.h has:
>
> #ifdef CONFIG_FPROBE
> int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter);
> int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
> int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
> int unregister_fprobe(struct fprobe *fp);
> bool fprobe_is_registered(struct fprobe *fp);
> int fprobe_count_ips_from_filter(const char *filter, const char *notfilter);
> #else
> static inline int register_fprobe(struct fprobe *fp, const char *filter,
> 				  const char *notfilter)
> {
> 	return -EOPNOTSUPP;
> }
> static inline int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs,
> 				      int num)
> {
> 	return -EOPNOTSUPP;
> }
> static inline int register_fprobe_syms(struct fprobe *fp, const char **syms,
> 				       int num)
> {
> 	return -EOPNOTSUPP;
> }
> static inline int unregister_fprobe(struct fprobe *fp)
> {
> 	return -EOPNOTSUPP;
> }
> static inline bool fprobe_is_registered(struct fprobe *fp)
> {
> 	return false;
> }
> static inline int fprobe_count_ips_from_filter(const char *filter,
> 					       const char *notfilter)
> {
> 	return -EOPNOTSUPP;
> }
> #endif
>
And check CONFIG_BPF_EVENTS too, since include/linux/trace_events.h has:
#ifdef CONFIG_BPF_EVENTS
unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx);
int perf_event_attach_bpf_prog(struct perf_event *event,
			       struct bpf_prog *prog, u64 bpf_cookie);
void perf_event_detach_bpf_prog(struct perf_event *event);
int perf_event_query_prog_array(struct perf_event *event, void __user *info);
struct bpf_raw_tp_link;
int bpf_probe_register(struct bpf_raw_event_map *btp,
		       struct bpf_raw_tp_link *link);
int bpf_probe_unregister(struct bpf_raw_event_map *btp,
			 struct bpf_raw_tp_link *link);
struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name);
void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp);
int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
			    u32 *fd_type, const char **buf,
			    u64 *probe_offset, u64 *probe_addr,
			    unsigned long *missed);
int bpf_kprobe_multi_link_attach(const union bpf_attr *attr,
				 struct bpf_prog *prog);
int bpf_uprobe_multi_link_attach(const union bpf_attr *attr,
				 struct bpf_prog *prog);
#else
static inline unsigned int trace_call_bpf(struct trace_event_call *call,
					  void *ctx)
{
	return 1;
}
static inline int
perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog,
			   u64 bpf_cookie)
{
	return -EOPNOTSUPP;
}
static inline void perf_event_detach_bpf_prog(struct perf_event *event) { }
static inline int
perf_event_query_prog_array(struct perf_event *event, void __user *info)
{
	return -EOPNOTSUPP;
}
struct bpf_raw_tp_link;
static inline int bpf_probe_register(struct bpf_raw_event_map *btp,
				     struct bpf_raw_tp_link *link)
{
	return -EOPNOTSUPP;
}
static inline int bpf_probe_unregister(struct bpf_raw_event_map *btp,
				       struct bpf_raw_tp_link *link)
{
	return -EOPNOTSUPP;
}
static inline struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name)
{
	return NULL;
}
static inline void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
{
}
static inline int bpf_get_perf_event_info(const struct perf_event *event,
					  u32 *prog_id, u32 *fd_type,
					  const char **buf, u64 *probe_offset,
					  u64 *probe_addr,
					  unsigned long *missed)
{
	return -EOPNOTSUPP;
}
static inline int
bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
	return -EOPNOTSUPP;
}
static inline int
bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
	return -EOPNOTSUPP;
}
#endif
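The same kind of check works for both options at once (again a sketch;
the config paths are assumptions about your setup, as before):

```shell
# Print every FPROBE/BPF_EVENTS-related line from the kernel config so
# both suspects from this thread can be ruled out in one pass.
result=""
for cfg in "/boot/config-$(uname -r)" /proc/config.gz; do
    [ -r "$cfg" ] || continue
    case "$cfg" in
        *.gz) result=$(zcat "$cfg" | grep -E 'CONFIG_(FPROBE|BPF_EVENTS)') ;;
        *)    result=$(grep -E 'CONFIG_(FPROBE|BPF_EVENTS)' "$cfg") ;;
    esac
    if [ -n "$result" ]; then break; fi
done
echo "${result:-no matching config lines found (no readable config?)}"
```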
> > Chenghao
> >
> >
> > >
> > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > > > #201 module_attach:FAIL
> > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > Successfully unloaded bpf_testmod.ko.
> > > >
> > > >
> > > > Chenghao
> > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > Chenghao
> > > > > >
> > > > > > >
> > > > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, Vincent,
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Folks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > > > lockup immediately.
> > > > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > > > another word, Does vanilla 6.16 has this problem?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I suspect this is caused by the latest trampoline patches because the
> > > > > > > > > > > module_attach is to test the fentry feature for kernel module
> > > > > > > > > > > functions, I believe Chenghao and I only tested the fentry feature for
> > > > > > > > > > > non-module kernel functions. I can try kernel without the trampoline
> > > > > > > > > > > patches and will let you know the result.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > > > > feature not supported
> > > > > > > > > >
> > > > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > >
> > > > > > > > > > All error logs:
> > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > > > >
> > > > > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > > > > >
> > > > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > > > 90000000041d5848
> > > > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > > > 9000000004163938
> > > > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > > > time, OOM is now expected behavior.
> > > > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > > > [ 451.305581] ...
> > > > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > > > >
> > > > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > > > >
> > > > > > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > > > > > >
> > > > > > > > > > > > Huacai
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-22 5:22 ` Vincent Li
@ 2025-08-22 5:33 ` Vincent Li
2025-08-22 5:36 ` Chenghao Duan
1 sibling, 0 replies; 18+ messages in thread
From: Vincent Li @ 2025-08-22 5:33 UTC (permalink / raw)
To: Chenghao Duan; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 21, 2025 at 10:22 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Thu, Aug 21, 2025 at 10:10 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Thu, Aug 21, 2025 at 8:11 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Thu, Aug 21, 2025 at 08:04:07AM -0700, Vincent Li wrote:
> > > > On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > > > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > > > >
> > > > > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > > > > Hi Chenghao,
> > > > > > > >
> > > > > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Chenghao,
> > > > > > > > > >
> > > > > > > > > > Please take a look.
> > > > > > > > > >
> > > > > > > > > > Huacai
> > > > > > > > > >
> > > > > > > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > > > > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > > > > > > module_attach test experienced the same issue, so definitely
> > > > > > > > > trampoline patches issue.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I attempted to isolate which test in module_attach triggers the
> > > > > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > > > > this one in "prog_tests/module_attach.c"
> > > > > > > >
> > > > > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > > > > >
> > > > > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > > > > and perform the test, it might help isolate the issue.
> > > > > > > >
> > > > > > >
> > > > > > > Hi Vincent,
> > > > > > >
> > > > > > > The results I tested are different from yours. Could there be other
> > > > > > > differences between us? I am using the latest code of the loongarch-next
> > > > > > > branch.
> > > > > > >
> > > > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > > > bpf_testmod.ko is already unloaded.
> > > > > > > Loading bpf_testmod.ko...
> > > > > > > Successfully loaded bpf_testmod.ko.
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > > > > bpf_testmod_test_read() is not modifiable
> > > > > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > > > > -- END PROG LOAD LOG --
> > > > > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > > > > libbpf: failed to load object 'test_module_attach'
> > > > > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > > > > #205 module_attach:FAIL
> > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > Successfully unloaded bpf_testmod.ko.
> > > > > > >
> > > > > >
> > > > > > I built and ran the most recent loongarch-next kernel too; can you
> > > > > > try my config https://www.bpfire.net/download/loongfire/config.txt?
> > > > > > I am on Fedora; here are the steps I use to build the kernel, run
> > > > > > it, and run the test:
> > > > > >
> > > > > > 1, check branch
> > > > > > [root@fedora linux-loongson]# git branch
> > > > > > * loongarch-next
> > > > > > master
> > > > > > no-tailcall
> > > > > > no-trampoline
> > > > > >
> > > > > > 2, build kernel and reboot
> > > > > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > > > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > > > > >
> > > > > > 3, after reboot and login, build bpf selftests, run module_attach
> > > > > > test, dmesg to check kernel log
> > > > > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > > > > >
> > > > >
> > > > > Hi Vincent,
> > > > >
> > > > > I tried the config you provided, but the test results I obtained are
> > > > > as follows. I also specifically tested "modify_return" to verify the
> > > > > effectiveness of the patch, and the module_attach test returns -EOPNOTSUPP.
> > > > >
> > > > > [root@localhost bpf]# ./test_progs -v -t modify_return
> > > > > bpf_testmod.ko is already unloaded.
> > > > > Loading bpf_testmod.ko...
> > > > > Successfully loaded bpf_testmod.ko.
> > > > > run_test:PASS:skel_load 0 nsec
> > > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > > run_test:PASS:test_run 0 nsec
> > > > > run_test:PASS:test_run ret 0 nsec
> > > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > > run_test:PASS:skel_load 0 nsec
> > > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > > run_test:PASS:test_run 0 nsec
> > > > > run_test:PASS:test_run ret 0 nsec
> > > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > > #200 modify_return:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > Successfully unloaded bpf_testmod.ko.
> > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > bpf_testmod.ko is already unloaded.
> > > > > Loading bpf_testmod.ko...
> > > > > Successfully loaded bpf_testmod.ko.
> > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > > > > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> > > >
> > > > the -EOPNOTSUPP is reported by libbpf, but I am not sure whether a
> > > > kernel error leads to the libbpf error or whether it comes from libbpf
> > > > itself. You can do strace -f -s1024 -o /tmp/module_attach.txt
> > > > ./test_progs -v -t module_attach. The strace output should contain the
> > > > bpf syscall, and I think it can tell you whether the -EOPNOTSUPP is the
> > > > result of a kernel error or of libbpf; you can share the strace if you want.
> > > >
> > > 2037 read(16, "", 8192) = 0
> > > 2037 close(16) = 0
> > > 2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, target_fd=0, attach_type=BPF_TRACE_KPROBE_MULTI, flags=0, kprobe_multi={flags=0, cnt=1, syms=NULL, addrs=[0xffff8000035717d0], cookies=NULL}}}, 64) = -1 EOPNOTSUPP (Operation not supported)
> >
> > so the bpf() syscall cmd BPF_LINK_CREATE returns '-1 EOPNOTSUPP' exactly?
> > I could not tell earlier because I thought the return value was just '-1'.
> >
> > > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP\n", 59) = 59
> > > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP\n", 64) = 64
> > > 2037 write(1, "test_module_attach:FAIL:skel_attach skeleton attach failed: -95\n", 64) = 64
> > >
> > > attach_type BPF_TRACE_KPROBE_MULTI is not supported
> > >
> >
> > Could you share your kernel config (the .config used for compiling, or the
> > running kernel's /boot/config-*)? I wonder whether you really have
> > CONFIG_FPROBE enabled, since include/linux/fprobe.h has:
> >
> > #ifdef CONFIG_FPROBE
> > int register_fprobe(struct fprobe *fp, const char *filter, const char
> > *notfilter);
> > int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
> > int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
> > int unregister_fprobe(struct fprobe *fp);
> > bool fprobe_is_registered(struct fprobe *fp);
> > int fprobe_count_ips_from_filter(const char *filter, const char *notfilter);
> > #else
> > static inline int register_fprobe(struct fprobe *fp, const char
> > *filter, const char *notfilter)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int register_fprobe_ips(struct fprobe *fp, unsigned long
> > *addrs, int num)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int register_fprobe_syms(struct fprobe *fp, const char
> > **syms, int num)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int unregister_fprobe(struct fprobe *fp)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline bool fprobe_is_registered(struct fprobe *fp)
> > {
> > return false;
> > }
> > static inline int fprobe_count_ips_from_filter(const char *filter,
> > const char *notfilter)
> > {
> > return -EOPNOTSUPP;
> > }
> > #endif
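A quick sanity check of both options against a saved config can be sketched
like this (illustrative only; the helper name and sample text below are mine,
not from the kernel tree):

```python
# Illustrative sketch: report which of the wanted CONFIG_* options are
# actually enabled (=y or =m) in a kernel config file such as
# /boot/config-$(uname -r) or the build tree's .config.

def enabled_options(config_text, options):
    """Return the subset of `options` set to y or m in the config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FPROBE is not set"
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if name in options and value in ("y", "m"):
            enabled.add(name)
    return enabled

sample = """\
CONFIG_BPF_EVENTS=y
# CONFIG_FPROBE is not set
CONFIG_FUNCTION_TRACER=y
"""
wanted = {"CONFIG_FPROBE", "CONFIG_BPF_EVENTS"}
missing = wanted - enabled_options(sample, wanted)
print(sorted(missing))  # -> ['CONFIG_FPROBE']
```

With a config like the sample, a missing CONFIG_FPROBE would explain the
-EOPNOTSUPP from the kprobe_multi attach.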
> >
>
> and check CONFIG_BPF_EVENTS since linux/trace_events.h has:
>
> #ifdef CONFIG_BPF_EVENTS
> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx);
> int perf_event_attach_bpf_prog(struct perf_event *event, struct
> bpf_prog *prog, u64 bpf_cookie);
> void perf_event_detach_bpf_prog(struct perf_event *event);
> int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>
> struct bpf_raw_tp_link;
> int bpf_probe_register(struct bpf_raw_event_map *btp, struct
> bpf_raw_tp_link *link);
> int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct
> bpf_raw_tp_link *link);
>
> struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name);
> void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp);
> int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> u32 *fd_type, const char **buf,
> u64 *probe_offset, u64 *probe_addr,
> unsigned long *missed);
> int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct
> bpf_prog *prog);
> int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct
> bpf_prog *prog);
> #else
> static inline unsigned int trace_call_bpf(struct trace_event_call
> *call, void *ctx)
> {
> return 1;
> }
>
> static inline int
> perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog
> *prog, u64 bpf_cookie)
> {
> return -EOPNOTSUPP;
> }
>
> static inline void perf_event_detach_bpf_prog(struct perf_event *event) { }
>
> static inline int
> perf_event_query_prog_array(struct perf_event *event, void __user *info)
> {
> return -EOPNOTSUPP;
> }
> struct bpf_raw_tp_link;
> static inline int bpf_probe_register(struct bpf_raw_event_map *btp,
> struct bpf_raw_tp_link *link)
> {
> return -EOPNOTSUPP;
> }
> static inline int bpf_probe_unregister(struct bpf_raw_event_map *btp,
> struct bpf_raw_tp_link *link)
> {
> return -EOPNOTSUPP;
> }
> static inline struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name)
> {
> return NULL;
> }
> static inline void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
> {
> }
> static inline int bpf_get_perf_event_info(const struct perf_event *event,
> u32 *prog_id, u32 *fd_type,
> const char **buf, u64 *probe_offset,
> u64 *probe_addr, unsigned
> long *missed)
> {
> return -EOPNOTSUPP;
> }
> static inline int
> bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> {
> return -EOPNOTSUPP;
> }
> static inline int
> bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> {
> return -EOPNOTSUPP;
> }
> #endif
>
FYI, here are the places where bpf_kprobe_multi_link_attach() is declared
and defined; if CONFIG_FPROBE or CONFIG_BPF_EVENTS is not defined,
bpf_kprobe_multi_link_attach() returns -EOPNOTSUPP:
Cscope tag: bpf_kprobe_multi_link_attach
#   line  filename / context / line
1    781  include/linux/trace_events.h <<GLOBAL>>
          int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
2    826  include/linux/trace_events.h <<bpf_kprobe_multi_link_attach>>
          bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
3   5606  kernel/bpf/syscall.c <<link_create>>
          ret = bpf_kprobe_multi_link_attach(attr, prog);
4   2894  kernel/trace/bpf_trace.c <<bpf_kprobe_multi_link_attach>>
          int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
5   3042  kernel/trace/bpf_trace.c <<bpf_kprobe_multi_link_attach>>
          int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> > > Chenghao
> > >
> > >
> > > >
> > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > > > > #201 module_attach:FAIL
> > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > Successfully unloaded bpf_testmod.ko.
> > > > >
> > > > >
> > > > > Chenghao
> > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Chenghao
> > > > > > >
> > > > > > > >
> > > > > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Vincent,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Folks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > > > > lockup immediately.
> > > > > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > > > > other words, does vanilla 6.16 have this problem?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I suspect this is caused by the latest trampoline patches, because
> > > > > > > > > > > > module_attach tests the fentry feature for kernel module functions;
> > > > > > > > > > > > I believe Chenghao and I only tested the fentry feature for
> > > > > > > > > > > > non-module kernel functions. I can try a kernel without the
> > > > > > > > > > > > trampoline patches and will let you know the result.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I reverted trampoline patches from loongarch-next branch and run
> > > > > > > > > > > ./test_progs -t module_attach simply just errors out with the fentry
> > > > > > > > > > > feature not supported
> > > > > > > > > > >
> > > > > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > > >
> > > > > > > > > > > All error logs:
> > > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > > > > >
> > > > > > > > > > > I also tested loongarch-next branch with the trampoline patch series
> > > > > > > > > > > with no lockup kernel config so I can run dmesg to check kernel error
> > > > > > > > > > > log, ./test_progs -t module_attach result in below kernel log:
> > > > > > > > > > >
> > > > > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > > > > 90000000041d5848
> > > > > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > > > > 9000000004163938
> > > > > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > > > > time, OOM is now expected behavior.
> > > > > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > > > > [ 451.305581] ...
> > > > > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > > > > >
> > > > > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > > > > >
> > > > > > > > > > > So related to trampoline patches for sure unless I am missing something.
> > > > > > > > > > >
> > > > > > > > > > > > > Huacai
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > A side note, if I put the module_attach test in
> > > > > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip the module_attach test,
> > > > > > > > > > > > > > the module_attach test is not skipped.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Vincent
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kernel lockup on bpf selftests module_attach
2025-08-22 5:22 ` Vincent Li
2025-08-22 5:33 ` Vincent Li
@ 2025-08-22 5:36 ` Chenghao Duan
1 sibling, 0 replies; 18+ messages in thread
From: Chenghao Duan @ 2025-08-22 5:36 UTC (permalink / raw)
To: Vincent Li; +Cc: Huacai Chen, loongarch, Hengqi Chen, Tiezhu Yang
On Thu, Aug 21, 2025 at 10:22:46PM -0700, Vincent Li wrote:
> On Thu, Aug 21, 2025 at 10:10 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Thu, Aug 21, 2025 at 8:11 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Thu, Aug 21, 2025 at 08:04:07AM -0700, Vincent Li wrote:
> > > > On Thu, Aug 14, 2025 at 5:00 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > On Tue, Aug 12, 2025 at 06:42:08AM -0700, Vincent Li wrote:
> > > > > > On Tue, Aug 12, 2025 at 1:34 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > > > >
> > > > > > > On Sun, Aug 10, 2025 at 10:39:24AM -0700, Vincent Li wrote:
> > > > > > > > Hi Chenghao,
> > > > > > > >
> > > > > > > > On Sat, Aug 9, 2025 at 12:11 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Aug 8, 2025 at 11:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Chenghao,
> > > > > > > > > >
> > > > > > > > > > Please take a look.
> > > > > > > > > >
> > > > > > > > > > Huacai
> > > > > > > > > >
> > > > > > > > > I reverted loongson-next branch tailcall count fix patches, struct
> > > > > > > > > ops trampoline patch, keep the rest of trampoline patches,
> > > > > > > > > module_attach test experienced the same issue, so definitely
> > > > > > > > > trampoline patches issue.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I attempted to isolate which test in module_attach triggers the
> > > > > > > > "Unable to handle kernel paging request..." error, it appears to be
> > > > > > > > this one in "prog_tests/module_attach.c"
> > > > > > > >
> > > > > > > > ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> > > > > > > >
> > > > > > > > you can try to comment out other tests in "prog_tests/module_attach.c"
> > > > > > > > and perform the test, it might help isolate the issue.
> > > > > > > >
> > > > > > >
> > > > > > > Hi Vincent,
> > > > > > >
> > > > > > > The results I tested are different from yours. Could there be other
> > > > > > > differences between us? I am using the latest code of the loongarch-next
> > > > > > > branch.
> > > > > > >
> > > > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > > > bpf_testmod.ko is already unloaded.
> > > > > > > Loading bpf_testmod.ko...
> > > > > > > Successfully loaded bpf_testmod.ko.
> > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > WATCHDOG: test case module_attach executes for 10 seconds...
> > > > > > > libbpf: prog 'handle_fmod_ret': BPF program load failed: -EINVAL
> > > > > > > libbpf: prog 'handle_fmod_ret': -- BEGIN PROG LOAD LOG --
> > > > > > > bpf_testmod_test_read() is not modifiable
> > > > > > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> > > > > > > -- END PROG LOAD LOG --
> > > > > > > libbpf: prog 'handle_fmod_ret': failed to load: -EINVAL
> > > > > > > libbpf: failed to load object 'test_module_attach'
> > > > > > > libbpf: failed to load BPF skeleton 'test_module_attach': -EINVAL
> > > > > > > test_module_attach:FAIL:skel_load failed to load skeleton
> > > > > > > #205 module_attach:FAIL
> > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > Successfully unloaded bpf_testmod.ko.
> > > > > > >
> > > > > >
> > > > > > I built and ran the most recent loongarch-next kernel too; can you
> > > > > > try my config https://www.bpfire.net/download/loongfire/config.txt?
> > > > > > I am on Fedora; here are the steps I use to build the kernel, run
> > > > > > it, and run the test:
> > > > > >
> > > > > > 1, check branch
> > > > > > [root@fedora linux-loongson]# git branch
> > > > > > * loongarch-next
> > > > > > master
> > > > > > no-tailcall
> > > > > > no-trampoline
> > > > > >
> > > > > > 2, build kernel and reboot
> > > > > > cp config.txt .config; make clean; make -j6; make modules_install;
> > > > > > make install; grub2-mkconfig -o /boot/grub2/grub.cfg; reboot
> > > > > >
> > > > > > 3, after reboot and login, build bpf selftests, run module_attach
> > > > > > test, dmesg to check kernel log
> > > > > > cd tools/testing/selftests/bpf; make -j6; ./test_progs -t module_attach
> > > > > >
> > > > >
> > > > > Hi Vincent,
> > > > >
> > > > > I tried the config you provided, but the test results I obtained are
> > > > > as follows. I also specifically tested "modify_return" to verify the
> > > > > effectiveness of the patch, and the module_attach test returns -EOPNOTSUPP.
> > > > >
> > > > > [root@localhost bpf]# ./test_progs -v -t modify_return
> > > > > bpf_testmod.ko is already unloaded.
> > > > > Loading bpf_testmod.ko...
> > > > > Successfully loaded bpf_testmod.ko.
> > > > > run_test:PASS:skel_load 0 nsec
> > > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > > run_test:PASS:test_run 0 nsec
> > > > > run_test:PASS:test_run ret 0 nsec
> > > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > > run_test:PASS:skel_load 0 nsec
> > > > > run_test:PASS:modify_return__attach failed 0 nsec
> > > > > run_test:PASS:test_run 0 nsec
> > > > > run_test:PASS:test_run ret 0 nsec
> > > > > run_test:PASS:modify_return side_effect 0 nsec
> > > > > run_test:PASS:modify_return fentry_result 0 nsec
> > > > > run_test:PASS:modify_return fexit_result 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result 0 nsec
> > > > > run_test:PASS:modify_return fentry_result2 0 nsec
> > > > > run_test:PASS:modify_return fexit_result2 0 nsec
> > > > > run_test:PASS:modify_return fmod_ret_result2 0 nsec
> > > > > #200 modify_return:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > Successfully unloaded bpf_testmod.ko.
> > > > > [root@localhost bpf]# ./test_progs -v -t module_attach
> > > > > bpf_testmod.ko is already unloaded.
> > > > > Loading bpf_testmod.ko...
> > > > > Successfully loaded bpf_testmod.ko.
> > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP
> > > > > libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP
> > > >
> > > > the -EOPNOTSUPP is reported by libbpf, but I am not sure whether a
> > > > kernel error leads to the libbpf error or whether it comes from libbpf
> > > > itself. You can do strace -f -s1024 -o /tmp/module_attach.txt
> > > > ./test_progs -v -t module_attach. The strace output should contain the
> > > > bpf syscall, and I think it can tell you whether the -EOPNOTSUPP is the
> > > > result of a kernel error or of libbpf; you can share the strace if you want.
> > > >
> > > 2037 read(16, "", 8192) = 0
> > > 2037 close(16) = 0
> > > 2037 bpf(BPF_LINK_CREATE, {link_create={prog_fd=61, target_fd=0, attach_type=BPF_TRACE_KPROBE_MULTI, flags=0, kprobe_multi={flags=0, cnt=1, syms=NULL, addrs=[0xffff8000035717d0], cookies=NULL}}}, 64) = -1 EOPNOTSUPP (Operation not supported)
> >
> > so the bpf() syscall cmd BPF_LINK_CREATE returns '-1 EOPNOTSUPP' exactly?
> > I could not tell earlier because I thought the return value was just '-1'.
> >
> > > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to attach: -EOPNOTSUPP\n", 59) = 59
> > > 2037 write(1, "libbpf: prog 'kprobe_multi': failed to auto-attach: -EOPNOTSUPP\n", 64) = 64
> > > 2037 write(1, "test_module_attach:FAIL:skel_attach skeleton attach failed: -95\n", 64) = 64
> > >
> > > attach_type BPF_TRACE_KPROBE_MULTI is not supported
> > >
> >
> > Could you share your kernel config (the .config used for compiling, or the
> > running kernel's /boot/config-*)? I wonder whether you really have
> > CONFIG_FPROBE enabled, since include/linux/fprobe.h has:
> >
> > #ifdef CONFIG_FPROBE
> > int register_fprobe(struct fprobe *fp, const char *filter, const char
> > *notfilter);
> > int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num);
> > int register_fprobe_syms(struct fprobe *fp, const char **syms, int num);
> > int unregister_fprobe(struct fprobe *fp);
> > bool fprobe_is_registered(struct fprobe *fp);
> > int fprobe_count_ips_from_filter(const char *filter, const char *notfilter);
> > #else
> > static inline int register_fprobe(struct fprobe *fp, const char
> > *filter, const char *notfilter)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int register_fprobe_ips(struct fprobe *fp, unsigned long
> > *addrs, int num)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int register_fprobe_syms(struct fprobe *fp, const char
> > **syms, int num)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline int unregister_fprobe(struct fprobe *fp)
> > {
> > return -EOPNOTSUPP;
> > }
> > static inline bool fprobe_is_registered(struct fprobe *fp)
> > {
> > return false;
> > }
> > static inline int fprobe_count_ips_from_filter(const char *filter,
> > const char *notfilter)
> > {
> > return -EOPNOTSUPP;
> > }
> > #endif
> >
>
> and check CONFIG_BPF_EVENTS since linux/trace_events.h has:
>
> #ifdef CONFIG_BPF_EVENTS
> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx);
> int perf_event_attach_bpf_prog(struct perf_event *event, struct
> bpf_prog *prog, u64 bpf_cookie);
> void perf_event_detach_bpf_prog(struct perf_event *event);
> int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>
> struct bpf_raw_tp_link;
> int bpf_probe_register(struct bpf_raw_event_map *btp, struct
> bpf_raw_tp_link *link);
> int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct
> bpf_raw_tp_link *link);
>
> struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name);
> void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp);
> int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
> u32 *fd_type, const char **buf,
> u64 *probe_offset, u64 *probe_addr,
> unsigned long *missed);
> int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct
> bpf_prog *prog);
> int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct
> bpf_prog *prog);
> #else
> static inline unsigned int trace_call_bpf(struct trace_event_call
> *call, void *ctx)
> {
> return 1;
> }
>
> static inline int
> perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog
> *prog, u64 bpf_cookie)
> {
> return -EOPNOTSUPP;
> }
>
> static inline void perf_event_detach_bpf_prog(struct perf_event *event) { }
>
> static inline int
> perf_event_query_prog_array(struct perf_event *event, void __user *info)
> {
> return -EOPNOTSUPP;
> }
> struct bpf_raw_tp_link;
> static inline int bpf_probe_register(struct bpf_raw_event_map *btp,
> struct bpf_raw_tp_link *link)
> {
> return -EOPNOTSUPP;
> }
> static inline int bpf_probe_unregister(struct bpf_raw_event_map *btp,
> struct bpf_raw_tp_link *link)
> {
> return -EOPNOTSUPP;
> }
> static inline struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name)
> {
> return NULL;
> }
> static inline void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
> {
> }
> static inline int bpf_get_perf_event_info(const struct perf_event *event,
> u32 *prog_id, u32 *fd_type,
> const char **buf, u64 *probe_offset,
> u64 *probe_addr, unsigned
> long *missed)
> {
> return -EOPNOTSUPP;
> }
> static inline int
> bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> {
> return -EOPNOTSUPP;
> }
> static inline int
> bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> {
> return -EOPNOTSUPP;
> }
> #endif
I checked the config, and also verified it the most straightforward way:
intentionally adding errors in each branch of the #ifdef to see which branch
actually gets compiled.
>
> > > Chenghao
> > >
> > >
> > > >
> > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -95
> > > > > #201 module_attach:FAIL
> > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > Successfully unloaded bpf_testmod.ko.
> > > > >
> > > > >
> > > > > Chenghao
> > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Chenghao
> > > > > > >
> > > > > > > >
> > > > > > > > > > On Sat, Aug 9, 2025 at 1:03 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Aug 8, 2025 at 8:48 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Aug 8, 2025 at 8:03 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Vincent,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Aug 9, 2025 at 12:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Folks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hengqi mentioned offline that the loongarch kernel locked up when
> > > > > > > > > > > > > > running full bpf selftests, so I went ahead and ran make run_tests to
> > > > > > > > > > > > > > perform full bpf selftest, I observed lockup too. It appears the
> > > > > > > > > > > > > > lockup happens when running module_attach test which includes testing
> > > > > > > > > > > > > > on fentry so this could be related to the trampoline patch series. for
> > > > > > > > > > > > > > example, if I just run ./test_progs -t module_attach, the kernel
> > > > > > > > > > > > > > lockup immediately.
> > > > > > > > > > > > > Is this a regression caused by the latest trampoline patches? Or in
> > > > > > > > > > > > > other words, does vanilla 6.16 have this problem?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I suspect this is caused by the latest trampoline patches, because
> > > > > > > > > > > > module_attach tests the fentry feature on kernel module functions,
> > > > > > > > > > > > and I believe Chenghao and I only tested fentry on non-module kernel
> > > > > > > > > > > > functions. I can try a kernel without the trampoline patches and
> > > > > > > > > > > > will let you know the result.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I reverted the trampoline patches from the loongarch-next branch, and
> > > > > > > > > > > running ./test_progs -t module_attach now simply errors out with the
> > > > > > > > > > > fentry feature not supported:
> > > > > > > > > > >
> > > > > > > > > > > [root@fedora bpf]# ./test_progs -t module_attach
> > > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > > >
> > > > > > > > > > > All error logs:
> > > > > > > > > > > test_module_attach:PASS:skel_open 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target 0 nsec
> > > > > > > > > > > test_module_attach:PASS:set_attach_target_explicit 0 nsec
> > > > > > > > > > > test_module_attach:PASS:skel_load 0 nsec
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
> > > > > > > > > > > libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
> > > > > > > > > > > test_module_attach:FAIL:skel_attach skeleton attach failed: -524
> > > > > > > > > > > #205 module_attach:FAIL
> > > > > > > > > > > Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
> > > > > > > > > > >
> > > > > > > > > > > I also tested the loongarch-next branch with the trampoline patch
> > > > > > > > > > > series under a kernel config that avoids the lockup, so I could run
> > > > > > > > > > > dmesg to check the kernel error log; ./test_progs -t module_attach
> > > > > > > > > > > results in the kernel log below:
> > > > > > > > > > >
> > > > > > > > > > > [ 417.429954] bpf_testmod: loading out-of-tree module taints kernel.
> > > > > > > > > > > [ 419.728620] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > > virtual address 0000000800000024, era == 90000000041d5854, ra ==
> > > > > > > > > > > 90000000041d5848
> > > > > > > > > > > [ 419.728629] Oops[#1]:
> > > > > > > > > > > [ 419.728632] CPU 70475748 Unable to handle kernel paging request at
> > > > > > > > > > > virtual address 0000000000000018, era == 9000000005750268, ra ==
> > > > > > > > > > > 9000000004163938
> > > > > > > > > > > [ 441.305370] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > > > > > > > > > [ 441.305380] rcu: 5-...0: (29 ticks this GP)
> > > > > > > > > > > idle=eb74/1/0x4000000000000000 softirq=72377/72379 fqs=2599
> > > > > > > > > > > [ 441.305386] rcu: (detected by 4, t=5252 jiffies, g=60333, q=186 ncpus=8)
> > > > > > > > > > > [ 441.305390] Sending NMI from CPU 4 to CPUs 5:
> > > > > > > > > > > [ 451.305494] rcu: rcu_preempt kthread starved for 2499 jiffies!
> > > > > > > > > > > g60333 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
> > > > > > > > > > > [ 451.305500] rcu: Unless rcu_preempt kthread gets sufficient CPU
> > > > > > > > > > > time, OOM is now expected behavior.
> > > > > > > > > > > [ 451.305502] rcu: RCU grace-period kthread stack dump:
> > > > > > > > > > > [ 451.305504] task:rcu_preempt state:R stack:0 pid:15
> > > > > > > > > > > tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800
> > > > > > > > > > > [ 451.305510] Stack : 9000000100467e80 0000000000000402
> > > > > > > > > > > 0000000000000010 90000001003b0680
> > > > > > > > > > > [ 451.305519] 90000000058e0000 0000000000000000
> > > > > > > > > > > 0000000000000040 9000000006c2dfd0
> > > > > > > > > > > [ 451.305526] 900000000578c9b0 0000000000000001
> > > > > > > > > > > 9000000006b21000 0000000000000005
> > > > > > > > > > > [ 451.305533] 00000001000093a8 00000001000093a8
> > > > > > > > > > > 0000000000000000 0000000000000004
> > > > > > > > > > > [ 451.305540] 90000000058f04e0 0000000000000000
> > > > > > > > > > > 0000000000000002 b793724be1dfb2b8
> > > > > > > > > > > [ 451.305547] 00000001000093a9 b793724be1dfb2b8
> > > > > > > > > > > 000000000000003f 9000000006c2dfd0
> > > > > > > > > > > [ 451.305554] 9000000006c30c18 0000000000000005
> > > > > > > > > > > 9000000006b0e000 9000000006b21000
> > > > > > > > > > > [ 451.305560] 9000000100453c98 90000001003aff80
> > > > > > > > > > > 9000000006c31140 900000000578c9b0
> > > > > > > > > > > [ 451.305567] 00000001000093a8 9000000005794d3c
> > > > > > > > > > > 00000000000000b4 0000000000000000
> > > > > > > > > > > [ 451.305574] 90000000024021b8 00000001000093a8
> > > > > > > > > > > 9000000004284f20 000000000a400001
> > > > > > > > > > > [ 451.305581] ...
> > > > > > > > > > > [ 451.305584] Call Trace:
> > > > > > > > > > > [ 451.305586] [<900000000578b868>] __schedule+0x410/0x1520
> > > > > > > > > > > [ 451.305595] [<900000000578c9ac>] schedule+0x34/0x190
> > > > > > > > > > > [ 451.305599] [<9000000005794d38>] schedule_timeout+0x98/0x140
> > > > > > > > > > > [ 451.305604] [<9000000004258f40>] rcu_gp_fqs_loop+0x5f8/0x868
> > > > > > > > > > > [ 451.305609] [<900000000425d358>] rcu_gp_kthread+0x260/0x2e0
> > > > > > > > > > > [ 451.305614] [<90000000041be704>] kthread+0x144/0x238
> > > > > > > > > > > [ 451.305619] [<9000000005787b60>] ret_from_kernel_thread+0x28/0xc8
> > > > > > > > > > > [ 451.305624] [<90000000041620e4>] ret_from_kernel_thread_asm+0xc/0x88
> > > > > > > > > > >
> > > > > > > > > > > [ 451.305630] rcu: Stack dump where RCU GP kthread last ran:
> > > > > > > > > > > [ 451.305633] Sending NMI from CPU 4 to CPUs 1:
> > > > > > > > > > > [ 451.305636] NMI backtrace for cpu 1 skipped: idling at idle_exit+0x0/0x4
> > > > > > > > > > > [ 451.306655] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > > > > > > > > CPUs/tasks: { 5-...D } 7298 jiffies s: 853 root: 0x20/.
> > > > > > > > > > > [ 451.306665] rcu: blocking rcu_node structures (internal RCU debug):
> > > > > > > > > > > [ 451.306669] Sending NMI from CPU 6 to CPUs 5:
> > > > > > > > > > > [ 451.306672] Unable to send backtrace IPI to CPU5 - perhaps it hung?
> > > > > > > > > > >
> > > > > > > > > > > So this is related to the trampoline patches for sure, unless I am missing something.
> > > > > > > > > > >
> > > > > > > > > > > > > Huacai
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > A side note: if I put the module_attach test in
> > > > > > > > > > > > > > tools/testing/selftests/bpf/DENYLIST to skip it, the
> > > > > > > > > > > > > > test is still not skipped.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Vincent
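[Regarding the DENYLIST side note: as far as I can tell, the DENYLIST files are consumed by the CI wrapper scripts rather than by test_progs itself, which would explain why editing the file has no effect on a direct run. test_progs does accept a deny list on the command line via -d; a hedged sketch of building one from a file by hand (stand-in file name and contents):]

```shell
# Sketch: turn a DENYLIST-style file (one test name per line) into the
# comma-separated list that test_progs takes via -d. The file name and
# entries below are stand-ins for demonstration.
printf 'module_attach\nfentry_test\n' > /tmp/denylist.demo
DENY="$(tr '\n' ',' < /tmp/denylist.demo | sed 's/,$//')"
echo "$DENY"
# ./test_progs -d "$DENY"   # actual run, assuming test_progs is in cwd
```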
end of thread, other threads:[~2025-08-22 5:36 UTC | newest]
Thread overview: 18+ messages
2025-08-09 8:15 kernel lockup on bpf selftests module_attach Vincent Li
2025-08-09 3:03 ` Huacai Chen
2025-08-09 3:48 ` Vincent Li
2025-08-09 5:03 ` Vincent Li
2025-08-09 6:02 ` Huacai Chen
2025-08-09 19:11 ` Vincent Li
2025-08-10 17:39 ` Vincent Li
2025-08-12 8:34 ` Chenghao Duan
2025-08-12 13:42 ` Vincent Li
2025-08-14 12:00 ` Chenghao Duan
2025-08-14 13:42 ` Vincent Li
2025-08-14 13:47 ` Vincent Li
2025-08-21 15:04 ` Vincent Li
2025-08-22 3:11 ` Chenghao Duan
2025-08-22 5:10 ` Vincent Li
2025-08-22 5:22 ` Vincent Li
2025-08-22 5:33 ` Vincent Li
2025-08-22 5:36 ` Chenghao Duan