The Linux Kernel Mailing List
* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
       [not found] <69ea344f.a00a0220.17a17.0040.GAE@google.com>
@ 2026-05-06 13:48 ` syzbot
  2026-05-06 14:28   ` Zhu Yanjun
  2026-05-07  1:30   ` Hillf Danton
  2026-05-07  3:52 ` syzbot
  1 sibling, 2 replies; 21+ messages in thread
From: syzbot @ 2026-05-06 13:48 UTC
  To: akpm, arjan, davem, dsahern, edumazet, horms, jgg, kuba, kuni1840,
	kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, yanjun.zhu, zyjzyj2000

syzbot has found a reproducer for the following issue on:

HEAD commit:    74fe02ce122a Merge tag 'wq-for-7.1-rc2-fixes' of git://git..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16e895ce580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a613ba580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-74fe02ce.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c0a591d96864/vmlinux-74fe02ce.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9f94fb623cd1/bzImage-74fe02ce.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com

Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
CPU: 3 UID: 0 PID: 5986 Comm: syz.3.20 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
RSP: 0018:ffffc9000391f180 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
Call Trace:
 <TASK>
 udp_tunnel_sock_release+0x68/0x80 net/ipv4/udp_tunnel_core.c:202
 rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
 rxe_sock_put+0xae/0x130 drivers/infiniband/sw/rxe/rxe_net.c:639
 rxe_net_del+0x83/0x120 drivers/infiniband/sw/rxe/rxe_net.c:660
 rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
 nldev_dellink+0x289/0x3c0 drivers/infiniband/core/nldev.c:1849
 rdma_nl_rcv_msg+0x392/0x6f0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2cb/0x410 drivers/infiniband/core/netlink.c:239
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x585/0x850 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:787 [inline]
 __sock_sendmsg net/socket.c:802 [inline]
 ____sys_sendmsg+0x9e1/0xb70 net/socket.c:2698
 ___sys_sendmsg+0x190/0x1e0 net/socket.c:2752
 __sys_sendmsg+0x170/0x220 net/socket.c:2784
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x10b/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f602db9cdd9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f602d1fe028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f602de16090 RCX: 00007f602db9cdd9
RDX: 0000000000000000 RSI: 00002000000002c0 RDI: 0000000000000007
RBP: 00007f602dc32d69 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f602de16128 R14: 00007f602de16090 R15: 00007ffc1d89c428
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
RSP: 0018:ffffc9000391f180 EFLAGS: 00010202

RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
   0:	fc                   	cld
   1:	ff                   	lcall  (bad)
   2:	df 48 89             	fisttps -0x77(%rax)
   5:	fa                   	cli
   6:	48 c1 ea 03          	shr    $0x3,%rdx
   a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   e:	75 33                	jne    0x43
  10:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  17:	fc ff df
  1a:	4c 8b 63 20          	mov    0x20(%rbx),%r12
  1e:	49 8d 7c 24 68       	lea    0x68(%r12),%rdi
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
  2e:	75 1a                	jne    0x4a
  30:	49 8b 44 24 68       	mov    0x68(%r12),%rax
  35:	89 ee                	mov    %ebp,%esi
  37:	48 89 df             	mov    %rbx,%rdi
  3a:	5b                   	pop    %rbx
  3b:	5d                   	pop    %rbp
  3c:	41 5c                	pop    %r12
  3e:	ff e0                	jmp    *%rax


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.


* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-06 13:48 ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) syzbot
@ 2026-05-06 14:28   ` Zhu Yanjun
  2026-05-06 15:19     ` Kuniyuki Iwashima
  2026-05-07  1:30   ` Hillf Danton
  1 sibling, 1 reply; 21+ messages in thread
From: Zhu Yanjun @ 2026-05-06 14:28 UTC
  To: syzbot, akpm, arjan, davem, dsahern, edumazet, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, zyjzyj2000
  Cc: Kuniyuki Iwashima


On 2026/5/6 6:48, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit:    74fe02ce122a Merge tag 'wq-for-7.1-rc2-fixes' of git://git..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16e895ce580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
> dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a613ba580000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-74fe02ce.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/c0a591d96864/vmlinux-74fe02ce.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/9f94fb623cd1/bzImage-74fe02ce.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]

Thanks a lot. IIRC, a fix for this problem is already in progress. The patch is at
https://patchwork.kernel.org/project/linux-rdma/patch/20260424013759.728288-1-kuniyu@google.com/

Hi, Kuniyuki Iwashima

I think you are working on the fix for this problem. I hope that we can
see your commit very soon.

Zhu Yanjun

> CPU: 3 UID: 0 PID: 5986 Comm: syz.3.20 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
> Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
> RSP: 0018:ffffc9000391f180 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
> RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
> RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
> R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
> FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
> Call Trace:
>   <TASK>
>   udp_tunnel_sock_release+0x68/0x80 net/ipv4/udp_tunnel_core.c:202
>   rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>   rxe_sock_put+0xae/0x130 drivers/infiniband/sw/rxe/rxe_net.c:639
>   rxe_net_del+0x83/0x120 drivers/infiniband/sw/rxe/rxe_net.c:660
>   rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
>   nldev_dellink+0x289/0x3c0 drivers/infiniband/core/nldev.c:1849
>   rdma_nl_rcv_msg+0x392/0x6f0 drivers/infiniband/core/netlink.c:195
>   rdma_nl_rcv_skb.constprop.0.isra.0+0x2cb/0x410 drivers/infiniband/core/netlink.c:239
>   netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
>   netlink_unicast+0x585/0x850 net/netlink/af_netlink.c:1344
>   netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
>   sock_sendmsg_nosec net/socket.c:787 [inline]
>   __sock_sendmsg net/socket.c:802 [inline]
>   ____sys_sendmsg+0x9e1/0xb70 net/socket.c:2698
>   ___sys_sendmsg+0x190/0x1e0 net/socket.c:2752
>   __sys_sendmsg+0x170/0x220 net/socket.c:2784
>   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>   do_syscall_64+0x10b/0xf80 arch/x86/entry/syscall_64.c:94
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f602db9cdd9
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f602d1fe028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007f602de16090 RCX: 00007f602db9cdd9
> RDX: 0000000000000000 RSI: 00002000000002c0 RDI: 0000000000000007
> RBP: 00007f602dc32d69 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f602de16128 R14: 00007f602de16090 R15: 00007ffc1d89c428
>   </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3785
> Code: fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 33 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 20 49 8d 7c 24 68 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 1a 49 8b 44 24 68 89 ee 48 89 df 5b 5d 41 5c ff e0
> RSP: 0018:ffffc9000391f180 EFLAGS: 00010202
>
> RAX: dffffc0000000000 RBX: ffff88802a2a0040 RCX: ffffffff8b8b72bd
> RDX: 000000000000000d RSI: ffffffff89553b32 RDI: 0000000000000068
> RBP: 0000000000000002 R08: 0000000000000001 R09: fffff52000723dfc
> R10: ffffc9000391efe7 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8880311b8000 R14: 0000000000000002 R15: 0000000000000018
> FS:  00007f602d1fe6c0(0000) GS:ffff8880d6675000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000561c522a6000 CR3: 000000002e99e000 CR4: 0000000000352ef0
> ----------------
> Code disassembly (best guess):
>     0:	fc                   	cld
>     1:	ff                   	lcall  (bad)
>     2:	df 48 89             	fisttps -0x77(%rax)
>     5:	fa                   	cli
>     6:	48 c1 ea 03          	shr    $0x3,%rdx
>     a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
>     e:	75 33                	jne    0x43
>    10:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
>    17:	fc ff df
>    1a:	4c 8b 63 20          	mov    0x20(%rbx),%r12
>    1e:	49 8d 7c 24 68       	lea    0x68(%r12),%rdi
>    23:	48 89 fa             	mov    %rdi,%rdx
>    26:	48 c1 ea 03          	shr    $0x3,%rdx
> * 2a:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
>    2e:	75 1a                	jne    0x4a
>    30:	49 8b 44 24 68       	mov    0x68(%r12),%rax
>    35:	89 ee                	mov    %ebp,%esi
>    37:	48 89 df             	mov    %rbx,%rdi
>    3a:	5b                   	pop    %rbx
>    3b:	5d                   	pop    %rbp
>    3c:	41 5c                	pop    %r12
>    3e:	ff e0                	jmp    *%rax
>
>
> ---
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.

-- 
Best Regards,
Yanjun.Zhu



* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-06 14:28   ` Zhu Yanjun
@ 2026-05-06 15:19     ` Kuniyuki Iwashima
  0 siblings, 0 replies; 21+ messages in thread
From: Kuniyuki Iwashima @ 2026-05-06 15:19 UTC
  To: Zhu Yanjun
  Cc: syzbot, akpm, arjan, davem, dsahern, edumazet, horms, jgg, kuba,
	kuni1840, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, zyjzyj2000

On Wed, May 6, 2026 at 7:28 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>
>
> On 2026/5/6 6:48, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
> >
> > HEAD commit:    74fe02ce122a Merge tag 'wq-for-7.1-rc2-fixes' of git://git..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16e895ce580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
> > dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> > compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a613ba580000
> >
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-74fe02ce.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/c0a591d96864/vmlinux-74fe02ce.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/9f94fb623cd1/bzImage-74fe02ce.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
>
> Thanks a lot. IIRC, a fix for this problem is already in progress. The patch is at
> https://patchwork.kernel.org/project/linux-rdma/patch/20260424013759.728288-1-kuniyu@google.com/
>
> Hi, Kuniyuki Iwashima
>
> I think you are working on the fix for this problem. I hope that we can
> see your commit very soon.

Yes, I was sidetracked but will respin v3 this week.


* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-06 13:48 ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) syzbot
  2026-05-06 14:28   ` Zhu Yanjun
@ 2026-05-07  1:30   ` Hillf Danton
  2026-05-07  1:57     ` syzbot
  1 sibling, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2026-05-07  1:30 UTC
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

> Date: Wed, 06 May 2026 06:48:30 -0700
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    74fe02ce122a Merge tag 'wq-for-7.1-rc2-fixes' of git://git..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16e895ce580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
> dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13a613ba580000

#syz test

--- x/net/socket.c
+++ y/net/socket.c
@@ -3782,7 +3782,11 @@ EXPORT_SYMBOL(kernel_getpeername);
 
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
 {
-	return READ_ONCE(sock->ops)->shutdown(sock, how);
+	const struct proto_ops *ops = READ_ONCE(sock->ops);
+	if (ops)
+		return ops->shutdown(sock, how);
+	else
+		return 0;
 }
 EXPORT_SYMBOL(kernel_sock_shutdown);
 
--


* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-07  1:30   ` Hillf Danton
@ 2026-05-07  1:57     ` syzbot
  0 siblings, 0 replies; 21+ messages in thread
From: syzbot @ 2026-05-07  1:57 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

lost connection to test machine



syzkaller login: qemu-system-x86_64: ahci: PRDT length for NCQ command (0x0) is smaller than the requested size (0x1cc000)
Warning: Permanently added '[localhost]:12337' (ED25519) to the list of known hosts.
[   97.096781][   T10] cfg80211: failed to load regulatory.db
[  152.152171][ T1025] ata1.00: exception Emask 0x0 SAct 0x800 SErr 0x0 action 0x6 frozen
[  152.155707][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.158413][ T1025] ata1.00: cmd 61/60:58:36:81:04/0e:00:00:00:00/40 tag 11 ncq dma 1884160 ou
[  152.158413][ T1025]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  152.167703][ T1025] ata1.00: status: { DRDY }
[  152.169908][ T1025] ata1: hard resetting link
[  152.494622][ T1025] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[  152.499274][ T1025] ata1.00: configured for UDMA/100
[  152.502602][ T1025] ata1: EH complete
qemu-system-x86_64: ahci: PRDT length for NCQ command (0x0) is smaller than the requested size (0xc2000)
[  152.536671][ T1025] ata1.00: Read log 0x10 page 0x00 failed, Emask 0x1
[  152.539788][ T1025] ata1: failed to read log page 10h (errno=-5)
[  152.543143][ T1025] ata1.00: NCQ disabled due to excessive errors
[  152.546416][ T1025] ata1.00: exception Emask 0x1 SAct 0xfc00 SErr 0x0 action 0x0
[  152.549623][ T1025] ata1.00: irq_stat 0x41000000
[  152.552697][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.555276][ T1025] ata1.00: cmd 61/10:50:36:01:05/0c:00:00:00:00/40 tag 10 ncq dma 1581056 ou
[  152.555276][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.564262][ T1025] ata1.00: status: { DRDY }
[  152.567499][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.570300][ T1025] ata1.00: cmd 61/b0:58:46:0d:05/03:00:00:00:00/40 tag 11 ncq dma 483328 out
[  152.570300][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.578750][ T1025] ata1.00: status: { DRDY }
[  152.580940][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.584242][ T1025] ata1.00: cmd 61/c8:60:f6:10:05/05:00:00:00:00/40 tag 12 ncq dma 757760 out
[  152.584242][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.591624][ T1025] ata1.00: status: { DRDY }
[  152.594145][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.597218][ T1025] ata1.00: cmd 61/f0:68:be:16:05/02:00:00:00:00/40 tag 13 ncq dma 385024 out
[  152.597218][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.605201][ T1025] ata1.00: status: { DRDY }
[  152.607461][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.610119][ T1025] ata1.00: cmd 61/10:70:ae:19:05/06:00:00:00:00/40 tag 14 ncq dma 794624 out
[  152.610119][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.617854][ T1025] ata1.00: status: { DRDY }
[  152.620399][ T1025] ata1.00: failed command: WRITE FPDMA QUEUED
[  152.623555][ T1025] ata1.00: cmd 61/f8:78:be:1f:05/02:00:00:00:00/40 tag 15 ncq dma 389120 out
[  152.623555][ T1025]          res 50/00:00:00:00:00/00:00:00:00:00/00 Emask 0x1 (device error)
[  152.632127][ T1025] ata1.00: status: { DRDY }
[  152.635455][ T1025] ata1.00: configured for UDMA/100
[  152.638343][ T1025] ata1: EH complete
qemu-system-x86_64: hw/ide/core.c:934: ide_dma_cb: Assertion `prep_size >= 0 && prep_size <= n * 512' failed.
Connection to localhost closed by remote host.


syzkaller build log:
go env (err=<nil>)
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE='auto'
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2539546880=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/syzkaller/jobs/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOMODCACHE='/syzkaller/jobs/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/syzkaller/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.26.0'
GOWORK=''
PKG_CONFIG='pkg-config'

git status (err=<nil>)
HEAD detached at 23ad3581d162
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=23ad3581d162728720256cdd0a99f8702ec9c4c5 -X github.com/google/syzkaller/prog.gitRevisionDate=20260506-081407"  ./sys/syz-sysgen | grep -q false || go install -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=23ad3581d162728720256cdd0a99f8702ec9c4c5 -X github.com/google/syzkaller/prog.gitRevisionDate=20260506-081407"  ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=23ad3581d162728720256cdd0a99f8702ec9c4c5 -X github.com/google/syzkaller/prog.gitRevisionDate=20260506-081407"  -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include   -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"23ad3581d162728720256cdd0a99f8702ec9c4c5\"
go: downloading golang.org/x/sync v0.20.0
go: downloading go.opentelemetry.io/otel/sdk v1.43.0
go: downloading google.golang.org/grpc v1.80.0
go: downloading go.opentelemetry.io/otel v1.43.0
go: downloading go.opentelemetry.io/otel/trace v1.43.0
go: downloading google.golang.org/genproto/googleapis/api v0.0.0-20260401024825-9d38bb4040a9
go: downloading golang.org/x/net v0.52.0
go: downloading google.golang.org/genproto/googleapis/rpc v0.0.0-20260401024825-9d38bb4040a9
go: downloading github.com/ianlancetaylor/demangle v0.0.0-20260505044615-1ff4bf46051f
go: downloading go.opentelemetry.io/otel/sdk/metric v1.43.0
go: downloading go.opentelemetry.io/otel/metric v1.43.0
go: downloading golang.org/x/crypto v0.49.0
go: downloading golang.org/x/text v0.35.0
go: downloading github.com/go-jose/go-jose/v4 v4.1.4
/usr/bin/ld: /tmp/ccMZeaB8.o: in function `Connection::Connect(char const*, char const*)':
executor.cc:(.text._ZN10Connection7ConnectEPKcS1_[_ZN10Connection7ConnectEPKcS1_]+0x386): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
./tools/check-syzos.sh 2>/dev/null



Tested on:

commit:         5862221f Merge tag 'parisc-for-7.1-rc3' of git://git.k..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=7f195f6be48c12ec
dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1583eece580000



* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
       [not found] <69ea344f.a00a0220.17a17.0040.GAE@google.com>
  2026-05-06 13:48 ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) syzbot
@ 2026-05-07  3:52 ` syzbot
  2026-05-07 10:12   ` Edward Adam Davis
                     ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: syzbot @ 2026-05-07  3:52 UTC
  To: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, yanjun.zhu, zyjzyj2000

syzbot has found a reproducer for the following issue on:

HEAD commit:    735d2f48cada Add linux-next specific files for 20260506
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14f0e56a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a88880f0f312e277
dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=125c9f6c580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=166580ec580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e65b731bdb98/disk-735d2f48.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/60db2f3d3f2f/vmlinux-735d2f48.xz
kernel image: https://storage.googleapis.com/syzbot-assets/55da282f7ab4/bzImage-735d2f48.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com

rdma_rxe: rxe_newlink: failed to add lo
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000004: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
CPU: 1 UID: 0 PID: 5982 Comm: syz.3.20 Not tainted syzkaller #0 PREEMPT_{RT,(full)} 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:kernel_sock_shutdown+0x2a/0x70 net/socket.c:3803
Code: f3 0f 1e fa 41 57 41 56 41 54 53 89 f3 49 89 fe 49 bc 00 00 00 00 00 fc ff df e8 e1 25 c5 f8 4d 8d 7e 20 4c 89 f8 48 c1 e8 03 <42> 80 3c 20 00 74 08 4c 89 ff e8 27 bf 2e f9 4d 8b 3f 49 83 c7 68
RSP: 0018:ffffc900015ef090 EFLAGS: 00010202
RAX: 0000000000000004 RBX: 0000000000000002 RCX: ffff88802dd89ec0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: ffffed1007cc8979 R12: dffffc0000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000020
FS:  000055556d432500(0000) GS:ffff888125dca000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b34563fff CR3: 0000000042b1c000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
 rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
 rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
 rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
 rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
 nldev_dellink+0x304/0x3d0 drivers/infiniband/core/nldev.c:1849
 rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:-1 [inline]
 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
 rdma_nl_rcv+0x6d7/0xa10 drivers/infiniband/core/netlink.c:259
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x780/0x920 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1895
 sock_sendmsg_nosec+0x112/0x150 net/socket.c:797
 __sock_sendmsg net/socket.c:812 [inline]
 ____sys_sendmsg+0x55c/0x870 net/socket.c:2716
 ___sys_sendmsg+0x2a5/0x360 net/socket.c:2770
 __sys_sendmsg net/socket.c:2802 [inline]
 __do_sys_sendmsg net/socket.c:2807 [inline]
 __se_sys_sendmsg net/socket.c:2805 [inline]
 __x64_sys_sendmsg+0x1c3/0x2a0 net/socket.c:2805
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f89172fcdd9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe8bf8c018 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f8917575fa0 RCX: 00007f89172fcdd9
RDX: 0000000000000000 RSI: 00002000000002c0 RDI: 0000000000000006
RBP: 00007f8917392d69 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8917575fac R14: 00007f8917575fa0 R15: 00007f8917575fa0
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:kernel_sock_shutdown+0x2a/0x70 net/socket.c:3803
Code: f3 0f 1e fa 41 57 41 56 41 54 53 89 f3 49 89 fe 49 bc 00 00 00 00 00 fc ff df e8 e1 25 c5 f8 4d 8d 7e 20 4c 89 f8 48 c1 e8 03 <42> 80 3c 20 00 74 08 4c 89 ff e8 27 bf 2e f9 4d 8b 3f 49 83 c7 68
RSP: 0018:ffffc900015ef090 EFLAGS: 00010202
RAX: 0000000000000004 RBX: 0000000000000002 RCX: ffff88802dd89ec0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: ffffed1007cc8979 R12: dffffc0000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000020
FS:  000055556d432500(0000) GS:ffff888125dca000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000042b1c000 CR4: 00000000003526f0
----------------
Code disassembly (best guess):
   0:	f3 0f 1e fa          	endbr64
   4:	41 57                	push   %r15
   6:	41 56                	push   %r14
   8:	41 54                	push   %r12
   a:	53                   	push   %rbx
   b:	89 f3                	mov    %esi,%ebx
   d:	49 89 fe             	mov    %rdi,%r14
  10:	49 bc 00 00 00 00 00 	movabs $0xdffffc0000000000,%r12
  17:	fc ff df
  1a:	e8 e1 25 c5 f8       	call   0xf8c52600
  1f:	4d 8d 7e 20          	lea    0x20(%r14),%r15
  23:	4c 89 f8             	mov    %r15,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	42 80 3c 20 00       	cmpb   $0x0,(%rax,%r12,1) <-- trapping instruction
  2f:	74 08                	je     0x39
  31:	4c 89 ff             	mov    %r15,%rdi
  34:	e8 27 bf 2e f9       	call   0xf92ebf60
  39:	4d 8b 3f             	mov    (%r15),%r15
  3c:	49 83 c7 68          	add    $0x68,%r15


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-07  3:52 ` syzbot
@ 2026-05-07 10:12   ` Edward Adam Davis
  2026-05-07 12:02     ` syzbot
  2026-05-07 12:50   ` [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink() Edward Adam Davis
  2026-05-14  5:15   ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) Zhu Yanjun
  2 siblings, 1 reply; 21+ messages in thread
From: Edward Adam Davis @ 2026-05-07 10:12 UTC (permalink / raw)
  To: syzbot+d8f76778263ab65c2b21; +Cc: linux-kernel, syzkaller-bugs

#syz test

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 96c745d5bac4..3cb3cb7629fe 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -1816,6 +1816,8 @@ static int nldev_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return err;
 }
 
+static DEFINE_MUTEX(nldev_dellink_mutex);
+
 static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			  struct netlink_ext_ack *extack)
 {
@@ -1846,7 +1848,9 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	 * implicitly scoped to the driver supporting dynamic link deletion like RXE.
 	 */
 	if (device->link_ops && device->link_ops->dellink) {
+		mutex_lock(&nldev_dellink_mutex);
 		err = device->link_ops->dellink(device);
+		mutex_unlock(&nldev_dellink_mutex);
 		if (err)
 			return err;
 	}


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-07 10:12   ` Edward Adam Davis
@ 2026-05-07 12:02     ` syzbot
  0 siblings, 0 replies; 21+ messages in thread
From: syzbot @ 2026-05-07 12:02 UTC (permalink / raw)
  To: eadavis, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com

Tested on:

commit:         735d2f48 Add linux-next specific files for 20260506
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15d2c196580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a88880f0f312e277
dashboard link: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=101a5f48580000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-07  3:52 ` syzbot
  2026-05-07 10:12   ` Edward Adam Davis
@ 2026-05-07 12:50   ` Edward Adam Davis
  2026-05-07 13:25     ` Zhu Yanjun
  2026-05-13 18:17     ` Leon Romanovsky
  2026-05-14  5:15   ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) Zhu Yanjun
  2 siblings, 2 replies; 21+ messages in thread
From: Edward Adam Davis @ 2026-05-07 12:50 UTC (permalink / raw)
  To: syzbot+d8f76778263ab65c2b21
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, yanjun.zhu, zyjzyj2000

We must serialize calls to nldev_dellink() or risk a crash as syzbot
reported:

KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
Call Trace:
 udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
 rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
 rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
 rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
 rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
 
Fixes: a60e3f3d6fba ("RDMA/nldev: Add dellink function pointer")
Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 drivers/infiniband/core/nldev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 96c745d5bac4..3cb3cb7629fe 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -1816,6 +1816,8 @@ static int nldev_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return err;
 }
 
+static DEFINE_MUTEX(nldev_dellink_mutex);
+
 static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			  struct netlink_ext_ack *extack)
 {
@@ -1846,7 +1848,9 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	 * implicitly scoped to the driver supporting dynamic link deletion like RXE.
 	 */
 	if (device->link_ops && device->link_ops->dellink) {
+		mutex_lock(&nldev_dellink_mutex);
 		err = device->link_ops->dellink(device);
+		mutex_unlock(&nldev_dellink_mutex);
 		if (err)
 			return err;
 	}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-07 12:50   ` [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink() Edward Adam Davis
@ 2026-05-07 13:25     ` Zhu Yanjun
  2026-05-07 13:40       ` Edward Adam Davis
  2026-05-13 18:17     ` Leon Romanovsky
  1 sibling, 1 reply; 21+ messages in thread
From: Zhu Yanjun @ 2026-05-07 13:25 UTC (permalink / raw)
  To: Edward Adam Davis, syzbot+d8f76778263ab65c2b21,
	yanjun.zhu@linux.dev
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzkaller-bugs, zyjzyj2000


On 2026/5/7 5:50, Edward Adam Davis wrote:
> We must serialize calls to nldev_dellink() or risk a crash as syzbot
> reported:
>
> KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
> Call Trace:
>   udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
>   rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>   rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
>   rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
>   rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
>   
> Fixes: a60e3f3d6fba ("RDMA/nldev: Add dellink function pointer")
> Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>

Thanks a lot. This looks like a good solution. Since the issue is
reproducible, have you sent this commit to syzbot for verification?

Thanks,

Zhu Yanjun

> ---
>   drivers/infiniband/core/nldev.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
> index 96c745d5bac4..3cb3cb7629fe 100644
> --- a/drivers/infiniband/core/nldev.c
> +++ b/drivers/infiniband/core/nldev.c
> @@ -1816,6 +1816,8 @@ static int nldev_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
>   	return err;
>   }
>   
> +static DEFINE_MUTEX(nldev_dellink_mutex);
> +
>   static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
>   			  struct netlink_ext_ack *extack)
>   {
> @@ -1846,7 +1848,9 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
>   	 * implicitly scoped to the driver supporting dynamic link deletion like RXE.
>   	 */
>   	if (device->link_ops && device->link_ops->dellink) {
> +		mutex_lock(&nldev_dellink_mutex);
>   		err = device->link_ops->dellink(device);
> +		mutex_unlock(&nldev_dellink_mutex);
>   		if (err)
>   			return err;
>   	}

-- 
Best Regards,
Yanjun.Zhu


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-07 13:25     ` Zhu Yanjun
@ 2026-05-07 13:40       ` Edward Adam Davis
  2026-05-07 14:11         ` Zhu Yanjun
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Adam Davis @ 2026-05-07 13:40 UTC (permalink / raw)
  To: yanjun.zhu
  Cc: akpm, arjan, davem, dsahern, eadavis, edumazet, hdanton, horms,
	jgg, kuba, kuni1840, kuniyu, leon, linux-kernel, linux-rdma,
	netdev, pabeni, syzbot+d8f76778263ab65c2b21, syzkaller-bugs,
	zyjzyj2000

On Thu, 7 May 2026 06:25:54 -0700, Zhu Yanjun wrote:
> > We must serialize calls to nldev_dellink() or risk a crash as syzbot
> > reported:
> >
> > KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
> > Call Trace:
> >   udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> >   rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> >   rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> >   rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> >   rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> >
> > Fixes: a60e3f3d6fba ("RDMA/nldev: Add dellink function pointer")
> > Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
> > Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> 
> Thanks a lot. This looks like a good solution. Since the issue is
> reproducible,
> 
> have you sent this commit to syzbot for verification?
The patch has been verified by syzbot.

BR,
Edward


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-07 13:40       ` Edward Adam Davis
@ 2026-05-07 14:11         ` Zhu Yanjun
  0 siblings, 0 replies; 21+ messages in thread
From: Zhu Yanjun @ 2026-05-07 14:11 UTC (permalink / raw)
  To: Edward Adam Davis, yanjun.zhu@linux.dev
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, zyjzyj2000


On 2026/5/7 6:40, Edward Adam Davis wrote:
> On Thu, 7 May 2026 06:25:54 -0700, Zhu Yanjun wrote:
>>> We must serialize calls to nldev_dellink() or risk a crash as syzbot
>>> reported:
>>>
>>> KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
>>> Call Trace:
>>>    udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
>>>    rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>>>    rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
>>>    rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
>>>    rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
>>>
>>> Fixes: a60e3f3d6fba ("RDMA/nldev: Add dellink function pointer")
>>> Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
>>> Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
>>> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
>> Thanks a lot. This looks like a good solution. Since the issue is
>> reproducible,
>>
>> have you sent this commit to syzbot for verification?
> The patch has been verified by syzbot.

Thanks a lot.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

>
> BR,
> Edward
>
-- 
Best Regards,
Yanjun.Zhu


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-07 12:50   ` [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink() Edward Adam Davis
  2026-05-07 13:25     ` Zhu Yanjun
@ 2026-05-13 18:17     ` Leon Romanovsky
  2026-05-13 23:46       ` Jason Gunthorpe
  1 sibling, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2026-05-13 18:17 UTC (permalink / raw)
  To: syzbot+d8f76778263ab65c2b21, Edward Adam Davis
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuniyu, linux-kernel, linux-rdma, netdev, pabeni, syzkaller-bugs,
	yanjun.zhu, zyjzyj2000, Kuniyuki Iwashima


On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> We must serialize calls to nldev_dellink() or risk a crash as syzbot
> reported:
> 
> Call Trace:
>  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
>  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
>  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
>  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> 
> [...]

Applied, thanks!

[1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
      https://git.kernel.org/rdma/rdma/c/0b28000b64f40d

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-13 18:17     ` Leon Romanovsky
@ 2026-05-13 23:46       ` Jason Gunthorpe
  2026-05-14  7:31         ` Edward Adam Davis
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2026-05-13 23:46 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: syzbot+d8f76778263ab65c2b21, Edward Adam Davis, akpm, arjan,
	davem, dsahern, edumazet, hdanton, horms, kuba, kuniyu,
	linux-kernel, linux-rdma, netdev, pabeni, syzkaller-bugs,
	yanjun.zhu, zyjzyj2000

On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
> 
> On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> > We must serialize calls to nldev_dellink() or risk a crash as syzbot
> > reported:
> > 
> > Call Trace:
> >  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> >  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> >  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> >  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> >  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> > 
> > [...]
> 
> Applied, thanks!
> 
> [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
>       https://git.kernel.org/rdma/rdma/c/0b28000b64f40d

This seems like a rxe bug, I would have expected the lock to be inside
rxe to protect its racy implementation of rxe_net_del(), which looks
like it is possibly also triggered by NETDEV_UNREGISTER...

ie it should not change nldev_dellink().

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4)
  2026-05-07  3:52 ` syzbot
  2026-05-07 10:12   ` Edward Adam Davis
  2026-05-07 12:50   ` [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink() Edward Adam Davis
@ 2026-05-14  5:15   ` Zhu Yanjun
  2 siblings, 0 replies; 21+ messages in thread
From: Zhu Yanjun @ 2026-05-14  5:15 UTC (permalink / raw)
  To: syzbot, akpm, arjan, davem, dsahern, edumazet, hdanton, horms,
	jgg, kuba, kuni1840, kuniyu, leon, linux-kernel, linux-rdma,
	netdev, pabeni, syzkaller-bugs, zyjzyj2000

syz test: https://github.com/zhuyj/linux null-ptr-deref_kernel_sock_shutdown


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-13 23:46       ` Jason Gunthorpe
@ 2026-05-14  7:31         ` Edward Adam Davis
  2026-05-14 11:50           ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Adam Davis @ 2026-05-14  7:31 UTC (permalink / raw)
  To: jgg
  Cc: akpm, arjan, davem, dsahern, eadavis, edumazet, hdanton, horms,
	kuba, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, yanjun.zhu,
	zyjzyj2000

On Wed, 13 May 2026 20:46:55 -0300, Jason Gunthorpe wrote:
> On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
> >
> > On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> > > We must serialize calls to nldev_dellink() or risk a crash as syzbot
> > > reported:
> > >
> > > Call Trace:
> > >  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> > >  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> > >  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> > >  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> > >  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> > >
> > > [...]
> >
> > Applied, thanks!
> >
> > [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
> >       https://git.kernel.org/rdma/rdma/c/0b28000b64f40d
> 
> This seems like a rxe bug, I would have expected the lock to be inside
> rxe to protect its racy implementation of rxe_net_del(), which looks
> like it is possibly also triggered by NETDEV_UNREGISTER...
No, it was triggered by RDMA_NLDEV_CMD_DELLINK, you can see the "call trace".
> 
> ie it should not change nldev_dellink().
While this could be fixed within RXE, the same issue affects all other
RXE-like submodules when they subsequently support the "dellink" interface,
therefore, handling this within nldev_dellink() is relatively more appropriate.

Edward


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-14  7:31         ` Edward Adam Davis
@ 2026-05-14 11:50           ` Jason Gunthorpe
  2026-05-14 13:58             ` David Ahern
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 11:50 UTC (permalink / raw)
  To: Edward Adam Davis
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, kuba,
	kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, yanjun.zhu,
	zyjzyj2000

On Thu, May 14, 2026 at 03:31:22PM +0800, Edward Adam Davis wrote:
> On Wed, 13 May 2026 20:46:55 -0300, Jason Gunthorpe wrote:
> > On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
> > >
> > > On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> > > > We must serialize calls to nldev_dellink() or risk a crash as syzbot
> > > > reported:
> > > >
> > > > Call Trace:
> > > >  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> > > >  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> > > >  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> > > >  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> > > >  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> > > >
> > > > [...]
> > >
> > > Applied, thanks!
> > >
> > > [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
> > >       https://git.kernel.org/rdma/rdma/c/0b28000b64f40d
> > 
> > This seems like a rxe bug, I would have expected the lock to be inside
> > rxe to protect its racy implementation of rxe_net_del(), which looks
> > like it is possibly also triggered by NETDEV_UNREGISTER...
> No, it was triggered by RDMA_NLDEV_CMD_DELLINK, you can see the "call trace".
> > 
> > ie it should not change nldev_dellink().
> While this could be fixed within RXE, the same issue affects all other
> RXE-like submodules when they subsequently support the "dellink" interface,
> therefore, handling this within nldev_dellink() is relatively more appropriate.

Why would other modules have an issue? The problem is rxe's racy
refcounting scheme for its lazy socket creation. There is nothing
wrong with nldev, and now you've created some nasty BKL in the nldev
code to fix rxe while ignoring its other races.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-14 11:50           ` Jason Gunthorpe
@ 2026-05-14 13:58             ` David Ahern
  2026-05-14 14:14               ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: David Ahern @ 2026-05-14 13:58 UTC (permalink / raw)
  To: Jason Gunthorpe, Edward Adam Davis
  Cc: akpm, arjan, davem, edumazet, hdanton, horms, kuba, kuniyu, leon,
	linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, yanjun.zhu,
	zyjzyj2000

On 5/14/26 5:50 AM, Jason Gunthorpe wrote:
> On Thu, May 14, 2026 at 03:31:22PM +0800, Edward Adam Davis wrote:
>> On Wed, 13 May 2026 20:46:55 -0300, Jason Gunthorpe wrote:
>>> On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
>>>>
>>>> On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
>>>>> We must serialize calls to nldev_dellink() or risk a crash as syzbot
>>>>> reported:
>>>>>
>>>>> Call Trace:
>>>>>  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
>>>>>  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>>>>>  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
>>>>>  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
>>>>>  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
>>>>>
>>>>> [...]
>>>>
>>>> Applied, thanks!
>>>>
>>>> [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
>>>>       https://git.kernel.org/rdma/rdma/c/0b28000b64f40d
>>>
>>> This seems like a rxe bug, I would have expected the lock to be inside
>>> rxe to protect its racy implementation of rxe_net_del(), which looks
>>> like it is possibly also triggered by NETDEV_UNREGISTER...
>> No, it was triggered by RDMA_NLDEV_CMD_DELLINK, you can see the "call trace".

That is not Jason's point. Code-wise:

rxe_dellink -> rxe_net_del

netdev NETDEV_UNREGISTER:
 rxe_notify -> rxe_net_del

both can lead to the same problem
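
A minimal userspace sketch of that shape, for illustration only. It
assumes the shared resource is a single lazily created socket pointer;
every demo_* name is hypothetical and is not an rxe or kernel API, and
a pthread mutex stands in for whatever lock the driver would choose.
The point is only that the lock sits inside the shared teardown, so
both callers are covered without either of them taking a lock:

```c
/* Userspace sketch only; all demo_* names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_sock {
	int fd;
};

struct demo_net {
	pthread_mutex_t lock;	/* owned by the driver, not by its callers */
	struct demo_sock *sk;	/* shared socket, torn down at most once */
	int releases;		/* counts real releases, for the demo */
};

static void demo_net_init(struct demo_net *n)
{
	int ret = pthread_mutex_init(&n->lock, NULL);

	assert(ret == 0);
	n->sk = malloc(sizeof(*n->sk));
	assert(n->sk != NULL);
	n->sk->fd = 3;
	n->releases = 0;
}

/* Shared teardown: safe from any path, and safe to call twice. */
static void demo_net_del(struct demo_net *n)
{
	struct demo_sock *sk;

	pthread_mutex_lock(&n->lock);
	sk = n->sk;		/* claim the pointer under the lock */
	n->sk = NULL;		/* a later caller sees NULL and does nothing */
	if (sk)
		n->releases++;
	pthread_mutex_unlock(&n->lock);

	free(sk);		/* free(NULL) is a no-op */
}

/* Path 1: stands in for the dellink request. */
static void *demo_dellink(void *arg)
{
	demo_net_del(arg);
	return NULL;
}

/* Path 2: stands in for the netdev unregister notifier. */
static void *demo_notify_unregister(void *arg)
{
	demo_net_del(arg);
	return NULL;
}

/* Run both paths concurrently; return how many real releases happened. */
static int demo_run_race(void)
{
	struct demo_net n;
	pthread_t a, b;
	int ret;

	demo_net_init(&n);
	ret = pthread_create(&a, NULL, demo_dellink, &n);
	assert(ret == 0);
	ret = pthread_create(&b, NULL, demo_notify_unregister, &n);
	assert(ret == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&n.lock);
	return n.releases;
}
```

With the lock in demo_net_del(), demo_run_race() reports exactly one
release whichever path wins. A lock taken only in the dellink caller
would leave the notifier path uncovered.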

>>>
>>> ie it should not change nldev_dellink().
>> While this could be fixed within RXE, the same issue affects all other
>> RXE-like submodules when they subsequently support the "dellink" interface,
>> therefore, handling this within nldev_dellink() is relatively more appropriate.
> 
> Why would other modules have an issue? The problem is rxe's racy
> refcounting scheme for its lazy socket creation. There is nothing
> wrong with nldev, and now you've created some nasty BKL in the nldev
> code to fix rxe while ignoring its other races.

+1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-14 13:58             ` David Ahern
@ 2026-05-14 14:14               ` Jason Gunthorpe
  2026-05-14 14:26                 ` David Ahern
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2026-05-14 14:14 UTC (permalink / raw)
  To: David Ahern
  Cc: Edward Adam Davis, akpm, arjan, davem, edumazet, hdanton, horms,
	kuba, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, yanjun.zhu,
	zyjzyj2000

On Thu, May 14, 2026 at 07:58:18AM -0600, David Ahern wrote:
> On 5/14/26 5:50 AM, Jason Gunthorpe wrote:
> > On Thu, May 14, 2026 at 03:31:22PM +0800, Edward Adam Davis wrote:
> >> On Wed, 13 May 2026 20:46:55 -0300, Jason Gunthorpe wrote:
> >>> On Wed, May 13, 2026 at 02:17:28PM -0400, Leon Romanovsky wrote:
> >>>>
> >>>> On Thu, 07 May 2026 20:50:10 +0800, Edward Adam Davis wrote:
> >>>>> We must serialize calls to nldev_dellink() or risk a crash as syzbot
> >>>>> reported:
> >>>>>
> >>>>> Call Trace:
> >>>>>  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> >>>>>  rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
> >>>>>  rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
> >>>>>  rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
> >>>>>  rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
> >>>>>
> >>>>> [...]
> >>>>
> >>>> Applied, thanks!
> >>>>
> >>>> [1/1] RDMA/nldev: add mutual exclusion in nldev_dellink()
> >>>>       https://git.kernel.org/rdma/rdma/c/0b28000b64f40d
> >>>
> >>> This seems like a rxe bug, I would have expected the lock to be inside
> >>> rxe to protect its racy implementation of rxe_net_del(), which looks
> >>> like it is possibly also triggered by NETDEV_UNREGISTER...
> >> No, it was triggered by RDMA_NLDEV_CMD_DELLINK, you can see the "call trace".
> 
> That is not Jason's point. Code-wise:
> 
> rxe_dellink -> rxe_net_del
> 
> netdev NETDEV_UNREGISTER:
>  rxe_notify -> rxe_net_del
> 
> both can lead to the same problem
> 
> >>>
> >>> ie it should not change nldev_dellink().
> >> While this could be fixed within RXE, the same issue affects all other
> >> RXE-like submodules when they subsequently support the "dellink" interface,
> >> therefore, handling this within nldev_dellink() is relatively more appropriate.
> > 
> > > Why would other modules have an issue? The problem is rxe's racy
> > refcounting scheme for its lazy socket creation. There is nothing
> > wrong with nldev, and now you've created some nasty BKL in the nldev
> > code to fix rxe while ignoring its other races.
> 
> +1

Edward, please come up with a fixup on top of this, since it was already
applied.

Jason
 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-14 14:14               ` Jason Gunthorpe
@ 2026-05-14 14:26                 ` David Ahern
  2026-05-14 15:46                   ` Zhu Yanjun
  0 siblings, 1 reply; 21+ messages in thread
From: David Ahern @ 2026-05-14 14:26 UTC (permalink / raw)
  To: Jason Gunthorpe, Zhu Yanjun
  Cc: Edward Adam Davis, akpm, arjan, davem, edumazet, hdanton, horms,
	kuba, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, yanjun.zhu,
	zyjzyj2000

On 5/14/26 8:14 AM, Jason Gunthorpe wrote:
> 
> Edward, please come with a fixup on top of this since it was already
> applied
> 

Zhu Yanjun: As author of the patch that introduced the bug and
maintainer of the rxe code, why have you not addressed this problem? It
has been well known for many weeks now and multiple people have
attempted fixes. Seems like you need to step up and take care of it.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
  2026-05-14 14:26                 ` David Ahern
@ 2026-05-14 15:46                   ` Zhu Yanjun
  0 siblings, 0 replies; 21+ messages in thread
From: Zhu Yanjun @ 2026-05-14 15:46 UTC (permalink / raw)
  To: David Ahern, Jason Gunthorpe, Zhu Yanjun
  Cc: Edward Adam Davis, akpm, arjan, davem, edumazet, hdanton, horms,
	kuba, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, zyjzyj2000


On 2026/5/14 7:26, David Ahern wrote:
> On 5/14/26 8:14 AM, Jason Gunthorpe wrote:
>> Edward, please come with a fixup on top of this since it was already
>> applied
>>
> Zhu Yanjun: As author of the patch that introduced the bug and
> maintainer of the rxe code, why have you not addressed this problem? It
> has been well known for many weeks now and multiple people have
I am aware of the issue and have been following the discussion and
proposed fixes.

I did not want to rush a change without fully understanding the
implications for RXE behavior and existing users. I am currently reviewing
the proposed approaches and working on a proper fix.

I appreciate everyone who helped investigate and test the issue.

Zhu Yanjun


> attempted fixes. Seems like you need to step up and take care of it.
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2026-05-14 15:48 UTC | newest]

Thread overview: 21+ messages
     [not found] <69ea344f.a00a0220.17a17.0040.GAE@google.com>
2026-05-06 13:48 ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) syzbot
2026-05-06 14:28   ` Zhu Yanjun
2026-05-06 15:19     ` Kuniyuki Iwashima
2026-05-07  1:30   ` Hillf Danton
2026-05-07  1:57     ` syzbot
2026-05-07  3:52 ` syzbot
2026-05-07 10:12   ` Edward Adam Davis
2026-05-07 12:02     ` syzbot
2026-05-07 12:50   ` [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink() Edward Adam Davis
2026-05-07 13:25     ` Zhu Yanjun
2026-05-07 13:40       ` Edward Adam Davis
2026-05-07 14:11         ` Zhu Yanjun
2026-05-13 18:17     ` Leon Romanovsky
2026-05-13 23:46       ` Jason Gunthorpe
2026-05-14  7:31         ` Edward Adam Davis
2026-05-14 11:50           ` Jason Gunthorpe
2026-05-14 13:58             ` David Ahern
2026-05-14 14:14               ` Jason Gunthorpe
2026-05-14 14:26                 ` David Ahern
2026-05-14 15:46                   ` Zhu Yanjun
2026-05-14  5:15   ` [syzbot] [rdma] general protection fault in kernel_sock_shutdown (4) Zhu Yanjun
