* Re: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets
       [not found] <20250210134550.3189616-2-zhangmingyi5@huawei.com>
@ 2025-02-14  2:13 ` kernel test robot
  2025-02-14  6:23   ` Martin KaFai Lau
  0 siblings, 1 reply; 4+ messages in thread
From: kernel test robot @ 2025-02-14  2:13 UTC
  To: zhangmingyi
  Cc: oe-lkp, lkp, Xin Liu, netdev, bpf, mptcp, ast, daniel, andrii,
	martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa, linux-kernel, yanan, wuchangye, xiesongyang, liwei883,
	tianmuyang, zhangmingyi5, oliver.sang



Hello,

kernel test robot noticed "BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c" on:

commit: 8f510de3f26b2fabaf47eacd59053469e9c32754 ("[PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets")
url: https://github.com/intel-lab-lkp/linux/commits/zhangmingyi/bpf-next-Introduced-to-support-the-ULP-to-get-or-set-sockets/20250210-215203
base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/all/20250210134550.3189616-2-zhangmingyi5@huawei.com/
patch subject: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5



config: i386-randconfig-054-20250212
compiler: gcc-12
test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+-----------------------------------------------------------------------------+------------+------------+
|                                                                             | 9b6cdaf2ac | 8f510de3f2 |
+-----------------------------------------------------------------------------+------------+------------+
| BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c | 0          | 6          |
+-----------------------------------------------------------------------------+------------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202502140959.f66e2ba6-lkp@intel.com


[   71.099773][ T3759] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
[   71.101798][ T3759] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3759, name: trinity-c4
[   71.103659][ T3759] preempt_count: 0, expected: 0
[   71.104658][ T3759] RCU nest depth: 1, expected: 0
[   71.105669][ T3759] 2 locks held by trinity-c4/3759:
[ 71.106777][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625) 
[ 71.108460][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336) 
[   71.110397][ T3759] CPU: 1 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G                T  6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
[   71.110406][ T3759] Tainted: [T]=RANDSTRUCT
[   71.110407][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[   71.110410][ T3759] Call Trace:
[ 71.110416][ T3759] dump_stack_lvl (lib/dump_stack.c:123) 
[ 71.110423][ T3759] dump_stack (lib/dump_stack.c:130) 
[ 71.110428][ T3759] __might_resched (kernel/sched/core.c:8767) 
[ 71.110440][ T3759] __might_sleep (kernel/sched/core.c:8696 (discriminator 17)) 
[ 71.110446][ T3759] __mutex_lock (include/linux/kernel.h:73 kernel/locking/mutex.c:562 kernel/locking/mutex.c:730) 
[ 71.110452][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335) 
[ 71.110462][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323) 
[ 71.110470][ T3759] ? lock_sock_nested (net/core/sock.c:3653) 
[ 71.110481][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783) 
[ 71.110486][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.110494][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.110505][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166) 
[ 71.110513][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747) 
[ 71.110534][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032) 
[ 71.110542][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833) 
[ 71.110548][ T3759] sock_common_setsockopt (net/core/sock.c:3838) 
[ 71.110561][ T3759] do_sock_setsockopt (net/socket.c:2298) 
[ 71.110577][ T3759] __sys_setsockopt (net/socket.c:2323) 
[ 71.110592][ T3759] __ia32_sys_setsockopt (net/socket.c:2326) 
[ 71.110599][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367) 
[ 71.110607][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339) 
[ 71.110616][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942) 
[   71.110621][ T3759] EIP: 0xb4014092
[ 71.110626][ T3759] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 f8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 e9                	add    %ch,%cl
   4:	90                   	nop
   5:	ff                   	(bad)
   6:	ff                   	(bad)
   7:	ff                   	(bad)
   8:	ff a3 24 00 00 00    	jmp    *0x24(%rbx)
   e:	68 30 00 00 00       	push   $0x30
  13:	e9 80 ff ff ff       	jmp    0xffffffffffffff98
  18:	ff a3 f8 ff ff ff    	jmp    *-0x8(%rbx)
  1e:	66 90                	xchg   %ax,%ax
	...
  28:	cd 80                	int    $0x80
  2a:*	c3                   	ret		<-- trapping instruction
  2b:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  32:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  38:	8b 1c 24             	mov    (%rsp),%ebx
  3b:	c3                   	ret
  3c:	8d                   	.byte 0x8d
  3d:	b4 26                	mov    $0x26,%ah
	...

Code starting with the faulting instruction
===========================================
   0:	c3                   	ret
   1:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
   8:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
   e:	8b 1c 24             	mov    (%rsp),%ebx
  11:	c3                   	ret
  12:	8d                   	.byte 0x8d
  13:	b4 26                	mov    $0x26,%ah
	...
[   71.110630][ T3759] EAX: ffffffda EBX: 00000134 ECX: 00000006 EDX: 0000001f
[   71.110634][ T3759] ESI: 08fee650 EDI: 00000004 EBP: 000012cf ESP: bfc1c538
[   71.110638][ T3759] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[   71.182507][ T3759]
[   71.182999][ T3759] =============================
[   71.183907][ T3759] [ BUG: Invalid wait context ]
[   71.184819][ T3759] 6.14.0-rc1-00030-g8f510de3f26b #1 Tainted: G        W       T
[   71.186327][ T3759] -----------------------------
[   71.187265][ T3759] trinity-c4/3759 is trying to lock:
[ 71.188287][ T3759] c37b35e0 (tcpv4_prot_mutex){....}-{4:4}, at: tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[   71.189847][ T3759] other info that might help us debug this:
[   71.191018][ T3759] context-{5:5}
[   71.191678][ T3759] 2 locks held by trinity-c4/3759:
[ 71.192635][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625) 
[ 71.194220][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336) 
[   71.196078][ T3759] stack backtrace:
[   71.196797][ T3759] CPU: 0 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G        W       T  6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
[   71.196807][ T3759] Tainted: [W]=WARN, [T]=RANDSTRUCT
[   71.196809][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[   71.196812][ T3759] Call Trace:
[ 71.196818][ T3759] dump_stack_lvl (lib/dump_stack.c:123) 
[ 71.196825][ T3759] dump_stack (lib/dump_stack.c:130) 
[ 71.196830][ T3759] __lock_acquire (kernel/locking/lockdep.c:4830 kernel/locking/lockdep.c:4900 kernel/locking/lockdep.c:5178) 
[ 71.196840][ T3759] lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853) 
[ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380) 
[ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730) 
[ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335) 
[ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323) 
[ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653) 
[ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783) 
[ 71.196904][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.196909][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993) 
[ 71.196916][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166) 
[ 71.196923][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747) 
[ 71.196934][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032) 
[ 71.196939][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833) 
[ 71.196946][ T3759] sock_common_setsockopt (net/core/sock.c:3838) 
[ 71.196952][ T3759] do_sock_setsockopt (net/socket.c:2298) 
[ 71.196961][ T3759] __sys_setsockopt (net/socket.c:2323) 
[ 71.196967][ T3759] __ia32_sys_setsockopt (net/socket.c:2326) 
[ 71.196972][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367) 
[ 71.196979][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339) 
[ 71.196985][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942) 
[   71.196990][ T3759] EIP: 0xb4014092
[ 71.196995][ T3759] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 f8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 e9                	add    %ch,%cl
   4:	90                   	nop
   5:	ff                   	(bad)
   6:	ff                   	(bad)
   7:	ff                   	(bad)
   8:	ff a3 24 00 00 00    	jmp    *0x24(%rbx)
   e:	68 30 00 00 00       	push   $0x30
  13:	e9 80 ff ff ff       	jmp    0xffffffffffffff98
  18:	ff a3 f8 ff ff ff    	jmp    *-0x8(%rbx)
  1e:	66 90                	xchg   %ax,%ax
	...
  28:	cd 80                	int    $0x80
  2a:*	c3                   	ret		<-- trapping instruction
  2b:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  32:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  38:	8b 1c 24             	mov    (%rsp),%ebx
  3b:	c3                   	ret
  3c:	8d                   	.byte 0x8d
  3d:	b4 26                	mov    $0x26,%ah
	...

Code starting with the faulting instruction
===========================================
   0:	c3                   	ret
   1:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
   8:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
   e:	8b 1c 24             	mov    (%rsp),%ebx
  11:	c3                   	ret
  12:	8d                   	.byte 0x8d
  13:	b4 26                	mov    $0x26,%ah
	...
[   71.196999][ T3759] EAX: ffffffda EBX: 00000134 ECX: 00000006 EDX: 0000001f
[   71.197004][ T3759] ESI: 08fee650 EDI: 00000004 EBP: 000012cf ESP: bfc1c538
[   71.197008][ T3759] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250214/202502140959.f66e2ba6-lkp@intel.com


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets
  2025-02-14  2:13 ` [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets kernel test robot
@ 2025-02-14  6:23   ` Martin KaFai Lau
  2025-02-14 21:20     ` Jakub Kicinski
  0 siblings, 1 reply; 4+ messages in thread
From: Martin KaFai Lau @ 2025-02-14  6:23 UTC
  To: zhangmingyi
  Cc: kernel test robot, oe-lkp, lkp, Xin Liu, netdev, bpf, mptcp, ast,
	daniel, andrii, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
	jolsa, linux-kernel, yanan, wuchangye, xiesongyang, liwei883,
	tianmuyang

On 2/13/25 6:13 PM, kernel test robot wrote:
> [   71.182999][ T3759] =============================
> [   71.183907][ T3759] [ BUG: Invalid wait context ]
> [   71.184819][ T3759] 6.14.0-rc1-00030-g8f510de3f26b #1 Tainted: G        W       T
> [   71.186327][ T3759] -----------------------------
> [   71.187265][ T3759] trinity-c4/3759 is trying to lock:
> [ 71.188287][ T3759] c37b35e0 (tcpv4_prot_mutex){....}-{4:4}, at: tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [   71.189847][ T3759] other info that might help us debug this:
> [   71.191018][ T3759] context-{5:5}
> [   71.191678][ T3759] 2 locks held by trinity-c4/3759:
> [ 71.192635][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625)
> [ 71.194220][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336)
> [   71.196078][ T3759] stack backtrace:
> [   71.196797][ T3759] CPU: 0 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G        W       T  6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
> [   71.196807][ T3759] Tainted: [W]=WARN, [T]=RANDSTRUCT
> [   71.196809][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [   71.196812][ T3759] Call Trace:
> [ 71.196818][ T3759] dump_stack_lvl (lib/dump_stack.c:123)
> [ 71.196825][ T3759] dump_stack (lib/dump_stack.c:130)
> [ 71.196830][ T3759] __lock_acquire (kernel/locking/lockdep.c:4830 kernel/locking/lockdep.c:4900 kernel/locking/lockdep.c:5178)
> [ 71.196840][ T3759] lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853)
> [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
> [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
> [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
> [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
> [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
> [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)

This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.

Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot 
sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will 
take a mutex. This is a blocker :(
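
To make the reasoning above concrete, here is a minimal userspace sketch in C.
It is not kernel code: every model_* name is a hypothetical stand-in invented
for illustration, and the check only imitates the "RCU nest depth: 1,
expected: 0" condition printed in the report. It assumes, as suggested above,
that the ULP setup path ends up inside an RCU read-side section when it
reaches the mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the context fields the splat reports. */
struct model_ctx {
	int preempt_count;   /* "preempt_count: 0, expected: 0" */
	int rcu_nest_depth;  /* "RCU nest depth: 1, expected: 0" */
	bool irqs_disabled;  /* "irqs_disabled(): 0" */
};

/* A sleeping lock may only be taken when nothing forbids blocking. */
static bool model_may_sleep(const struct model_ctx *c)
{
	return c->preempt_count == 0 && c->rcu_nest_depth == 0 &&
	       !c->irqs_disabled;
}

static void model_rcu_read_lock(struct model_ctx *c)
{
	c->rcu_nest_depth++;
}

static void model_rcu_read_unlock(struct model_ctx *c)
{
	assert(c->rcu_nest_depth > 0); /* unbalanced unlock would be a bug */
	c->rcu_nest_depth--;
}

/* Returns 0 if the lock could be taken, -1 if the context is invalid. */
static int model_mutex_lock(const struct model_ctx *c, const char *name)
{
	if (!model_may_sleep(c)) {
		fprintf(stderr,
			"BUG: sleeping lock %s taken, RCU nest depth %d\n",
			name, c->rcu_nest_depth);
		return -1;
	}
	return 0;
}

/* Models the failing shape: mutex acquired inside the read-side section. */
static int model_set_ulp_under_rcu(struct model_ctx *c)
{
	int ret;

	model_rcu_read_lock(c);
	ret = model_mutex_lock(c, "tcpv4_prot_mutex");
	model_rcu_read_unlock(c);
	return ret;
}

/* Models the same call made from a context that is allowed to sleep. */
static int model_set_ulp_outside_rcu(struct model_ctx *c)
{
	return model_mutex_lock(c, "tcpv4_prot_mutex");
}
```

Calling model_set_ulp_under_rcu() on a zero-initialized struct model_ctx
reports the problem and returns -1, while model_set_ulp_outside_rcu() returns
0. The second half of the point above is that a non-sleepable bpf prog is in
the first situation no matter how patch 1 arranges its own locking.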

> [ 71.196904][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196909][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196916][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166)
> [ 71.196923][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747)
> [ 71.196934][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032)
> [ 71.196939][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833)
> [ 71.196946][ T3759] sock_common_setsockopt (net/core/sock.c:3838)
> [ 71.196952][ T3759] do_sock_setsockopt (net/socket.c:2298)
> [ 71.196961][ T3759] __sys_setsockopt (net/socket.c:2323)
> [ 71.196967][ T3759] __ia32_sys_setsockopt (net/socket.c:2326)
> [ 71.196972][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367)
> [ 71.196979][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339)
> [ 71.196985][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942)


* Re: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets
  2025-02-14  6:23   ` Martin KaFai Lau
@ 2025-02-14 21:20     ` Jakub Kicinski
  2025-02-14 22:11       ` Martin KaFai Lau
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2025-02-14 21:20 UTC
  To: Martin KaFai Lau
  Cc: zhangmingyi, kernel test robot, oe-lkp, lkp, Xin Liu, netdev, bpf,
	mptcp, ast, daniel, andrii, song, yhs, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, linux-kernel, yanan, wuchangye, xiesongyang,
	liwei883, tianmuyang

On Thu, 13 Feb 2025 22:23:39 -0800 Martin KaFai Lau wrote:
> On 2/13/25 6:13 PM, kernel test robot wrote:
> > [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> > [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
> > [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
> > [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> > [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
> > [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
> > [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
> > [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)  
> 
> This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.
> 
> Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot 
> sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will 
> take a mutex. This is a blocker :(

Oh, kbuild bot was nice enough to CC netdev, it wasn't CCed on 
the submission.

I'd really rather we didn't allow setting ULP from BPF unless there 
is a strong and clear use case. The ULP configuration and stacking
is a source of many bugs. And the use case here AFAIU is to allow
attaching some ULP from an OOT module to a socket, which I think
won't make core BPF folks happy either, right?


* Re: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets
  2025-02-14 21:20     ` Jakub Kicinski
@ 2025-02-14 22:11       ` Martin KaFai Lau
  0 siblings, 0 replies; 4+ messages in thread
From: Martin KaFai Lau @ 2025-02-14 22:11 UTC
  To: Jakub Kicinski
  Cc: zhangmingyi, kernel test robot, oe-lkp, lkp, Xin Liu, netdev, bpf,
	mptcp, ast, daniel, andrii, song, yhs, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, linux-kernel, yanan, wuchangye, xiesongyang,
	liwei883, tianmuyang

On 2/14/25 1:20 PM, Jakub Kicinski wrote:
> On Thu, 13 Feb 2025 22:23:39 -0800 Martin KaFai Lau wrote:
>> On 2/13/25 6:13 PM, kernel test robot wrote:
>>> [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
>>> [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
>>> [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
>>> [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
>>> [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
>>> [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
>>> [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
>>> [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)
>>
>> This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.
>>
>> Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot
>> sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will
>> take a mutex. This is a blocker :(
> 
> Oh, kbuild bot was nice enough to CC netdev, it wasn't CCed on
> the submission.

Ah. I also didn't notice netdev was not cc-ed. Will pay attention in the future.

> 
> I'd really rather we didn't allow setting ULP from BPF unless there
> is a strong and clear use case. The ULP configuration and stacking
> is a source of many bugs. And the use case here AFAIU is to allow
> attaching some ULP from an OOT module to a socket, which I think
> won't make core BPF folks happy either, right?

If the in-tree ulp does not work, there is little reason to do it for the 
out-of-tree module only.

My question on the ulp use case went unanswered in v1, so we can assume it is
for an out-of-tree ulp only. I also asked to replace the "smc" ulp testing with
a more realistic "tls" ulp test to see how it goes first. It does not work, as
the bot reported.


