* [syzbot] [net?] possible deadlock in inet6_getname
@ 2026-02-13 12:15 syzbot
2026-02-13 17:26 ` Eric Dumazet
2026-02-16 11:32 ` Fernando Fernandez Mancera
0 siblings, 2 replies; 14+ messages in thread
From: syzbot @ 2026-02-13 12:15 UTC (permalink / raw)
To: davem, dsahern, edumazet, horms, kuba, linux-kernel, netdev,
pabeni, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 57be33f85e36 nfc: nxp-nci: remove interrupt trigger type
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12f2165a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=2c36acc86fd56a9d
dashboard link: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1584065a580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16f2165a580000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/77aaa9b6c846/disk-57be33f8.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2617a56e7118/vmlinux-57be33f8.xz
kernel image: https://storage.googleapis.com/syzbot-assets/4fd173f33f5f/bzImage-57be33f8.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com
============================================
WARNING: possible recursive locking detected
syzkaller #0 Not tainted
--------------------------------------------
kworker/u8:6/2985 is trying to acquire lock:
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
but task is already holding lock:
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(k-sk_lock-AF_INET6);
lock(k-sk_lock-AF_INET6);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u8:6/2985:
#0: ffff888033131948 ((wq_completion)krds_cp_wq#1/0){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3232 [inline]
#0: ffff888033131948 ((wq_completion)krds_cp_wq#1/0){+.+.}-{0:0}, at: process_scheduled_works+0x9d4/0x17a0 kernel/workqueue.c:3340
#1: ffffc9000b8a7bc0 ((work_completion)(&(&cp->cp_send_w)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3233 [inline]
#1: ffffc9000b8a7bc0 ((work_completion)(&(&cp->cp_send_w)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa0f/0x17a0 kernel/workqueue.c:3340
#2: ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
#2: ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
#3: ffff88807a07abc8 (k-clock-AF_INET6){++.-}-{3:3}, at: rds_tcp_data_ready+0x113/0x950 net/rds/tcp_recv.c:320
stack backtrace:
CPU: 1 UID: 0 PID: 2985 Comm: kworker/u8:6 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
Workqueue: krds_cp_wq#1/0 rds_send_worker
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
check_deadlock kernel/locking/lockdep.c:3093 [inline]
validate_chain kernel/locking/lockdep.c:3895 [inline]
__lock_acquire+0x253f/0x2cf0 kernel/locking/lockdep.c:5237
lock_acquire+0x106/0x330 kernel/locking/lockdep.c:5868
lock_sock_nested+0x48/0x100 net/core/sock.c:3780
lock_sock include/net/sock.h:1709 [inline]
inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
rds_recv_hs_exthdrs+0x60f/0x7c0 net/rds/recv.c:265
rds_recv_incoming+0x9f6/0x12d0 net/rds/recv.c:389
rds_tcp_data_recv+0x7f1/0xa40 net/rds/tcp_recv.c:243
__tcp_read_sock+0x196/0x970 net/ipv4/tcp.c:1702
rds_tcp_read_sock net/rds/tcp_recv.c:277 [inline]
rds_tcp_data_ready+0x369/0x950 net/rds/tcp_recv.c:331
tcp_rcv_established+0x19e9/0x2670 net/ipv4/tcp_input.c:6675
tcp_v6_do_rcv+0x8eb/0x1ba0 net/ipv6/tcp_ipv6.c:1609
sk_backlog_rcv include/net/sock.h:1185 [inline]
__release_sock+0x1b8/0x3a0 net/core/sock.c:3213
release_sock+0x5f/0x1f0 net/core/sock.c:3795
rds_send_xmit+0x207e/0x28d0 net/rds/send.c:480
rds_send_worker+0x7d/0x2e0 net/rds/threads.c:200
process_one_work kernel/workqueue.c:3257 [inline]
process_scheduled_works+0xaec/0x17a0 kernel/workqueue.c:3340
worker_thread+0xda6/0x1360 kernel/workqueue.c:3421
kthread+0x726/0x8b0 kernel/kthread.c:463
ret_from_fork+0x51b/0xa40 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
BUG: sleeping function called from invalid context at net/core/sock.c:3782
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2985, name: kworker/u8:6
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 UID: 0 PID: 2985 Comm: kworker/u8:6 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
Workqueue: krds_cp_wq#1/0 rds_send_worker
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
__might_resched+0x378/0x4d0 kernel/sched/core.c:8829
lock_sock_nested+0x5d/0x100 net/core/sock.c:3782
lock_sock include/net/sock.h:1709 [inline]
inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
rds_recv_hs_exthdrs+0x60f/0x7c0 net/rds/recv.c:265
rds_recv_incoming+0x9f6/0x12d0 net/rds/recv.c:389
rds_tcp_data_recv+0x7f1/0xa40 net/rds/tcp_recv.c:243
__tcp_read_sock+0x196/0x970 net/ipv4/tcp.c:1702
rds_tcp_read_sock net/rds/tcp_recv.c:277 [inline]
rds_tcp_data_ready+0x369/0x950 net/rds/tcp_recv.c:331
tcp_rcv_established+0x19e9/0x2670 net/ipv4/tcp_input.c:6675
tcp_v6_do_rcv+0x8eb/0x1ba0 net/ipv6/tcp_ipv6.c:1609
sk_backlog_rcv include/net/sock.h:1185 [inline]
__release_sock+0x1b8/0x3a0 net/core/sock.c:3213
release_sock+0x5f/0x1f0 net/core/sock.c:3795
rds_send_xmit+0x207e/0x28d0 net/rds/send.c:480
rds_send_worker+0x7d/0x2e0 net/rds/threads.c:200
process_one_work kernel/workqueue.c:3257 [inline]
process_scheduled_works+0xaec/0x17a0 kernel/workqueue.c:3340
worker_thread+0xda6/0x1360 kernel/workqueue.c:3421
kthread+0x726/0x8b0 kernel/kthread.c:463
ret_from_fork+0x51b/0xa40 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
BUG: scheduling while atomic: kworker/u8:6/2985/0x00000202
INFO: lockdep is turned off.
Modules linked in:
Preemption disabled at:
[<0000000000000000>] 0x0
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-13 12:15 [syzbot] [net?] possible deadlock in inet6_getname syzbot
@ 2026-02-13 17:26 ` Eric Dumazet
2026-02-13 18:51 ` Gerd Rausch
2026-02-16 11:32 ` Fernando Fernandez Mancera
From: Eric Dumazet @ 2026-02-13 17:26 UTC (permalink / raw)
To: syzbot, Gerd Rausch
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On Fri, Feb 13, 2026 at 1:15 PM syzbot
<syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    57be33f85e36 nfc: nxp-nci: remove interrupt trigger type
> git tree:       net-next
> dashboard link: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
>
[...]
> ============================================
> WARNING: possible recursive locking detected
> syzkaller #0 Not tainted
> --------------------------------------------
> kworker/u8:6/2985 is trying to acquire lock:
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
>
> but task is already holding lock:
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
[...]
>
Gerd, please take a look, thanks.
commit 9d27a0fb122f19b6d01d02f4b4f429ca28811ace
Author: Gerd Rausch <gerd.rausch@oracle.com>
Date: Mon Feb 2 22:57:23 2026 -0700
net/rds: Trigger rds_send_ping() more than once
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-13 17:26 ` Eric Dumazet
@ 2026-02-13 18:51 ` Gerd Rausch
2026-02-14 18:25 ` Fernando Fernandez Mancera
From: Gerd Rausch @ 2026-02-13 18:51 UTC (permalink / raw)
To: Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Hi,
On 2026-02-13 09:26, Eric Dumazet wrote:
> On Fri, Feb 13, 2026 at 1:15 PM syzbot
> <syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com> wrote:
>>
>>
[...]
>> ============================================
>> WARNING: possible recursive locking detected
>> syzkaller #0 Not tainted
>> --------------------------------------------
>> kworker/u8:6/2985 is trying to acquire lock:
>> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
>> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
>>
>> but task is already holding lock:
>> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
>> ffff88807a07aa20 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_sock_set_cork+0x2c/0x2e0 net/ipv4/tcp.c:3694
>>
[...]
>> lock_sock_nested+0x48/0x100 net/core/sock.c:3780
>> lock_sock include/net/sock.h:1709 [inline]
>> inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
>> rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
>> rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
>> rds_recv_hs_exthdrs+0x60f/0x7c0 net/rds/recv.c:265
>> rds_recv_incoming+0x9f6/0x12d0 net/rds/recv.c:389
>> rds_tcp_data_recv+0x7f1/0xa40 net/rds/tcp_recv.c:243
>> __tcp_read_sock+0x196/0x970 net/ipv4/tcp.c:1702
>> rds_tcp_read_sock net/rds/tcp_recv.c:277 [inline]
>> rds_tcp_data_ready+0x369/0x950 net/rds/tcp_recv.c:331
>> tcp_rcv_established+0x19e9/0x2670 net/ipv4/tcp_input.c:6675
>> tcp_v6_do_rcv+0x8eb/0x1ba0 net/ipv6/tcp_ipv6.c:1609
>> sk_backlog_rcv include/net/sock.h:1185 [inline]
>> __release_sock+0x1b8/0x3a0 net/core/sock.c:3213
>
[...]
> Gerd, please take a look, thanks.
>
> commit 9d27a0fb122f19b6d01d02f4b4f429ca28811ace
> Author: Gerd Rausch <gerd.rausch@oracle.com>
> Date: Mon Feb 2 22:57:23 2026 -0700
>
> net/rds: Trigger rds_send_ping() more than once
Syzbot is right:

inet6_getname() acquires a lock_sock() that was already held:
__release_sock() is about to give it up, but before
doing so, it handles the backlog receives & callbacks.

We just need to figure out a way to obtain the peer's port number
without ending up in such a recursive-lock scenario.
Thanks for forwarding this,
Gerd
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-13 18:51 ` Gerd Rausch
@ 2026-02-14 18:25 ` Fernando Fernandez Mancera
2026-02-17 16:59 ` Gerd Rausch
From: Fernando Fernandez Mancera @ 2026-02-14 18:25 UTC (permalink / raw)
To: Gerd Rausch, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On 2/13/26 7:51 PM, Gerd Rausch wrote:
> Hi,
>
> On 2026-02-13 09:26, Eric Dumazet wrote:
>> [...]
>> Gerd, please take a look, thanks.
>>
>> commit 9d27a0fb122f19b6d01d02f4b4f429ca28811ace
>> Author: Gerd Rausch <gerd.rausch@oracle.com>
>> Date: Mon Feb 2 22:57:23 2026 -0700
>>
>> net/rds: Trigger rds_send_ping() more than once
>
> Syzbot is right:
>
> inet6_getname() acquires a lock_sock() that was already held:
> __release_sock() is about to give it up, but before
> doing so, it handles the backlog receives & callbacks.
>
> Just need to figure out a way to obtain the peer's port number,
> without ending up in such a recursive lock scenario.
>
Hi,
Shouldn't this be enough?
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 6fb5c928b8fd..a36e5dfd6c66 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
 static int
 rds_tcp_get_peer_sport(struct socket *sock)
 {
-	union {
-		struct sockaddr_storage storage;
-		struct sockaddr addr;
-		struct sockaddr_in sin;
-		struct sockaddr_in6 sin6;
-	} saddr;
-	int sport;
-
-	if (kernel_getpeername(sock, &saddr.addr) >= 0) {
-		switch (saddr.addr.sa_family) {
-		case AF_INET:
-			sport = ntohs(saddr.sin.sin_port);
-			break;
-		case AF_INET6:
-			sport = ntohs(saddr.sin6.sin6_port);
-			break;
-		default:
-			sport = -1;
-		}
-	} else {
-		sport = -1;
-	}
+	struct sock *sk = sock->sk;
+
+	if (!sk)
+		return -1;
-	return sport;
+	return ntohs(inet_sk(sk)->inet_dport);
 }
It would be safe from the rds_tcp_accept_one() path, as the new_sock
has a reference count of 1 and no other component should be able to
release it.

In the rds_tcp_conn_slots_available() path, fan-out can only be
performed from the receive path; AFAIU, if data is being processed
from the socket, we should always be holding the socket lock.

If these premises are not correct, we can always make this conditional.
But getting rid of the kernel_getpeername() call is a win
performance-wise, too.
I am testing this against the syzbot report/reproducer.
Thanks,
Fernando.
> Thanks for forwarding this,
>
> Gerd
>
>
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-13 12:15 [syzbot] [net?] possible deadlock in inet6_getname syzbot
2026-02-13 17:26 ` Eric Dumazet
@ 2026-02-16 11:32 ` Fernando Fernandez Mancera
2026-02-16 11:45 ` syzbot
From: Fernando Fernandez Mancera @ 2026-02-16 11:32 UTC (permalink / raw)
To: syzbot, davem, dsahern, edumazet, horms, kuba, linux-kernel,
netdev, pabeni, syzkaller-bugs
On 2/13/26 1:15 PM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    57be33f85e36 nfc: nxp-nci: remove interrupt trigger type
> git tree:       net-next
> dashboard link: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
>
[...]
> ============================================
> WARNING: possible recursive locking detected
[...]
> lock_sock include/net/sock.h:1709 [inline]
> inet6_getname+0x15d/0x650 net/ipv6/af_inet6.c:533
> rds_tcp_get_peer_sport net/rds/tcp_listen.c:70 [inline]
> rds_tcp_conn_slots_available+0x288/0x470 net/rds/tcp_listen.c:149
[...]
>
#syz test
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 6fb5c928b8fd..a36e5dfd6c66 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
 static int
 rds_tcp_get_peer_sport(struct socket *sock)
 {
-	union {
-		struct sockaddr_storage storage;
-		struct sockaddr addr;
-		struct sockaddr_in sin;
-		struct sockaddr_in6 sin6;
-	} saddr;
-	int sport;
-
-	if (kernel_getpeername(sock, &saddr.addr) >= 0) {
-		switch (saddr.addr.sa_family) {
-		case AF_INET:
-			sport = ntohs(saddr.sin.sin_port);
-			break;
-		case AF_INET6:
-			sport = ntohs(saddr.sin6.sin6_port);
-			break;
-		default:
-			sport = -1;
-		}
-	} else {
-		sport = -1;
-	}
+	struct sock *sk = sock->sk;
+
+	if (!sk)
+		return -1;
-	return sport;
+	return ntohs(inet_sk(sk)->inet_dport);
 }
 /* rds_tcp_accept_one_path(): if accepting on cp_index > 0, make sure the
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-16 11:32 ` Fernando Fernandez Mancera
@ 2026-02-16 11:45 ` syzbot
0 siblings, 0 replies; 14+ messages in thread
From: syzbot @ 2026-02-16 11:45 UTC (permalink / raw)
To: davem, dsahern, edumazet, fmancera, horms, kuba, linux-kernel,
netdev, pabeni, syzkaller-bugs
Hello,
syzbot tried to test the proposed patch but the build/boot failed:
failed to apply patch:
checking file net/rds/tcp_listen.c
Hunk #1 FAILED at 59.
1 out of 1 hunk FAILED
Tested on:
commit: 37a93dd5 Merge tag 'net-next-7.0' of git://git.kernel...
git tree: net-next
kernel config: https://syzkaller.appspot.com/x/.config?x=2c36acc86fd56a9d
dashboard link: https://syzkaller.appspot.com/bug?extid=5efae91f60932839f0a5
compiler:
patch: https://syzkaller.appspot.com/x/patch.diff?x=14aef15a580000
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-14 18:25 ` Fernando Fernandez Mancera
@ 2026-02-17 16:59 ` Gerd Rausch
2026-02-17 17:03 ` Fernando Fernandez Mancera
0 siblings, 1 reply; 14+ messages in thread
From: Gerd Rausch @ 2026-02-17 16:59 UTC (permalink / raw)
To: Fernando Fernandez Mancera, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Hi,
On 2026-02-14 10:25, Fernando Fernandez Mancera wrote:
> --- a/net/rds/tcp_listen.c
> +++ b/net/rds/tcp_listen.c
> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
> static int
> rds_tcp_get_peer_sport(struct socket *sock)
> {
> - union {
[...]
> - } else {
> - sport = -1;
> - }
> + struct sock *sk = sock->sk;
> +
> + if (!sk)
> + return -1;
>
> - return sport;
> + return ntohs(inet_sk(sk)->inet_dport);
> }
>
> It would be safe from the rds_tcp_accept_one() path, as the new_sock has a reference count of 1 and no other component should be able to release it.
>
> In the rds_tcp_conn_slots_available() path, fan-out can only be performed from the receive path; AFAIU, if data is being processed from the socket we should always be holding a lock.
>
> If these premises are not correct, we can always make this conditional. But getting rid of the kernel_getpeername() call is beneficial performance-wise too.
>
rds_tcp_conn_slots_available() can also be called from rds_conn_shutdown(),
where no "rds_tcp.ko" backend specific lock is held.
This is very solvable though.
Worst case, we can distinguish between the paths where lock_sock()
is already held and those where it isn't.
> I am testing this against the syzbot report/reproducer.
>
Thanks,
Gerd
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 16:59 ` Gerd Rausch
@ 2026-02-17 17:03 ` Fernando Fernandez Mancera
2026-02-17 17:13 ` Gerd Rausch
0 siblings, 1 reply; 14+ messages in thread
From: Fernando Fernandez Mancera @ 2026-02-17 17:03 UTC (permalink / raw)
To: Gerd Rausch, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On 2/17/26 5:59 PM, Gerd Rausch wrote:
> Hi,
>
> On 2026-02-14 10:25, Fernando Fernandez Mancera wrote:
>> --- a/net/rds/tcp_listen.c
>> +++ b/net/rds/tcp_listen.c
>> @@ -59,30 +59,12 @@ void rds_tcp_keepalive(struct socket *sock)
>> static int
>> rds_tcp_get_peer_sport(struct socket *sock)
>> {
>> - union {
> [...]
>> - } else {
>> - sport = -1;
>> - }
>> + struct sock *sk = sock->sk;
>> +
>> + if (!sk)
>> + return -1;
>>
>> - return sport;
>> + return ntohs(inet_sk(sk)->inet_dport);
>> }
>>
>> It would be safe from rds_tcp_accept_one() path as the new_sock has a
>> reference count of 1 and no other component should be to release it.
>>
>> In rds_tcp_conn_slots_available() path, fan-out can be only performed
>> from receive path, AFAIU if data is being processed from the socket we
>> should always be holding a lock.
>>
>> If these premises are not correct, we can always make this
>> conditional. But getting rid of the kernel_getpeername() call is
>> performance-wise too.
>>
>
> rds_tcp_conn_slots_available() can also be called from rds_conn_shutdown(),
> where no "rds_tcp.ko" backend specific lock is held.
>
AFAICS, from the rds_conn_shutdown() path, rds_tcp_conn_slots_available()
is called with the fan-out argument set to false, so there is no need to
get the peer source port.
I think that should be fine.
> This is very solvable though.
>
> Worst case, we can distinguish between the paths where a lock_sock()
> is already held from those that don't.
>
>> I am testing this against the syzbot report/reproducer.
>>
>
> Thanks,
>
> Gerd
>
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 17:03 ` Fernando Fernandez Mancera
@ 2026-02-17 17:13 ` Gerd Rausch
2026-02-17 17:19 ` Fernando Fernandez Mancera
0 siblings, 1 reply; 14+ messages in thread
From: Gerd Rausch @ 2026-02-17 17:13 UTC (permalink / raw)
To: Fernando Fernandez Mancera, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Hi,
On 2026-02-17 09:03, Fernando Fernandez Mancera wrote:
>> rds_tcp_conn_slots_available() can also be called from rds_conn_shutdown(),
>> where no "rds_tcp.ko" backend specific lock is held.
>>
>
>
> AFAICS, from rds_conn_shutdown() path rds_tcp_conn_slots_available() is called with fan-out argument as false. Therefore, no need to get the peer source port.
>
> I think that should be fine.
>
True, but IMHO subtle and error-prone.
If someone were to change the code to pass in "fan_out == true"
from a context not already holding a socket lock,
would they remember to change rds_tcp_conn_slots_available()
to acquire that lock?
Thanks,
Gerd
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 17:13 ` Gerd Rausch
@ 2026-02-17 17:19 ` Fernando Fernandez Mancera
2026-02-17 17:28 ` Gerd Rausch
0 siblings, 1 reply; 14+ messages in thread
From: Fernando Fernandez Mancera @ 2026-02-17 17:19 UTC (permalink / raw)
To: Gerd Rausch, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On 2/17/26 6:13 PM, Gerd Rausch wrote:
> Hi,
>
> On 2026-02-17 09:03, Fernando Fernandez Mancera wrote:
>>> rds_tcp_conn_slots_available() can also be called from
>>> rds_conn_shutdown(),
>>> where no "rds_tcp.ko" backend specific lock is held.
>>>
>>
>>
>> AFAICS, from rds_conn_shutdown() path rds_tcp_conn_slots_available()
>> is called with fan-out argument as false. Therefore, no need to get
>> the peer source port.
>>
>> I think that should be fine.
>>
>
> True, but IMHO subtle and error-prone.
>
> If someone were to change the code to pass in "fan_out == true"
> from a context not already holding a socket lock,
> would they remember to change rds_tcp_conn_slots_available()
> to acquire that lock?
>
Usually the kernel requires the developer to understand when they need
to acquire a lock. Anyway, what would you suggest? Checking whether we
have already acquired the lock and taking it conditionally?
Thanks,
Fernando.
> Thanks,
>
> Gerd
>
>
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 17:19 ` Fernando Fernandez Mancera
@ 2026-02-17 17:28 ` Gerd Rausch
2026-02-17 18:58 ` Gerd Rausch
0 siblings, 1 reply; 14+ messages in thread
From: Gerd Rausch @ 2026-02-17 17:28 UTC (permalink / raw)
To: Fernando Fernandez Mancera, Eric Dumazet, syzbot
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On 2026-02-17 09:19, Fernando Fernandez Mancera wrote:
> On 2/17/26 6:13 PM, Gerd Rausch wrote:
>> If someone were to change the code to pass in "fan_out == true"
>> from a context not already holding a socket lock,
>> would they remember to change rds_tcp_conn_slots_available()
>> to acquire that lock?
>>
>
> Usually kernel requires the developer to understand when do they need to acquire a lock or not.
> Anyway, what would you suggest? To check whether we have acquired the lock or not and do it conditionally?
Something along those lines.
Either a "_locked" version of "conn_slots_available" function pointer,
a parameter, or a check.
What you suggested is fine though for the immediate syzbot need.
Thanks,
Gerd
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 17:28 ` Gerd Rausch
@ 2026-02-17 18:58 ` Gerd Rausch
2026-02-17 20:26 ` Fernando Fernandez Mancera
0 siblings, 1 reply; 14+ messages in thread
From: Gerd Rausch @ 2026-02-17 18:58 UTC (permalink / raw)
To: Fernando Fernandez Mancera, Eric Dumazet, syzbot,
Allison Henderson
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Hi,
+Allison
On 2026-02-17 09:28, Gerd Rausch wrote:
> On 2026-02-17 09:19, Fernando Fernandez Mancera wrote:
>> On 2/17/26 6:13 PM, Gerd Rausch wrote:
>>> If someone were to change the code to pass in "fan_out == true"
>>> from a context not already holding a socket lock,
>>> would they remember to change rds_tcp_conn_slots_available()
>>> to acquire that lock?
>>>
>>
>> Usually kernel requires the developer to understand when do they need to acquire a lock or not.
>> Anyway, what would you suggest? To check whether we have acquired the lock or not and do it conditionally?
>
> Something along those lines.
>
> Either a "_locked" version of "conn_slots_available" function pointer,
> a parameter, or a check.
>
> What you suggested is fine though for the immediate syzbot need.
>
FWIW, in UEK (the modified Linux kernel that Oracle uses),
this issue reported by syzbot was addressed differently:
https://github.com/oracle/linux-uek/commit/a94abc444c487
The commit under discussion here originally came from:
https://github.com/oracle/linux-uek/commit/ebf71f5b6c29c
and pre-dates:
9dfc685e0262 ("inet: remove races in inet{6}_getname()")
which is why UEK didn't have this deadlock issue reported by syzbot.
If the objective is to keep divergence of UEK and Upstream to a minimum,
it may be worthwhile to consider adopting the same fix we carry in UEK.
Thanks,
Gerd
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 18:58 ` Gerd Rausch
@ 2026-02-17 20:26 ` Fernando Fernandez Mancera
2026-02-17 21:57 ` Allison Henderson
0 siblings, 1 reply; 14+ messages in thread
From: Fernando Fernandez Mancera @ 2026-02-17 20:26 UTC (permalink / raw)
To: Gerd Rausch, Eric Dumazet, syzbot, Allison Henderson
Cc: davem, dsahern, horms, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
On 2/17/26 7:58 PM, Gerd Rausch wrote:
> Hi,
>
> +Allison
>
> On 2026-02-17 09:28, Gerd Rausch wrote:
>> On 2026-02-17 09:19, Fernando Fernandez Mancera wrote:
>>> On 2/17/26 6:13 PM, Gerd Rausch wrote:
>>>> If someone were to change the code to pass in "fan_out == true"
>>>> from a context not already holding a socket lock,
>>>> would they remember to change rds_tcp_conn_slots_available()
>>>> to acquire that lock?
>>>>
>>>
>>> Usually kernel requires the developer to understand when do they need
>>> to acquire a lock or not.
>>> Anyway, what would you suggest? To check whether we have acquired the
>>> lock or not and do it conditionally?
>>
>> Something along those lines.
>>
>> Either a "_locked" version of "conn_slots_available" function pointer,
>> a parameter, or a check.
>>
>> What you suggested is fine though for the immediate syzbot need.
>>
>
> FWIW, in UEK (the modified Linux kernel that Oracle uses),
> this issue reported by syzbot was addressed differently:
>
> https://github.com/oracle/linux-uek/commit/a94abc444c487
>
> The commit under discussion here originally came from:
> https://github.com/oracle/linux-uek/commit/ebf71f5b6c29c
>
> and pre-dates:
> 9dfc685e0262 ("inet: remove races in inet{6}_getname()")
>
> which is why UEK didn't have this deadlock issue reported by syszbot.
>
> If the objective is to keep divergence of UEK and Upstream to a minimum,
> it may be worthwhile to consider adopting the same fix we carry in UEK.
>
I see, I don't really have a strong opinion here, although the solution
in UEK seems a bit overkill to me: RDS does not really need all the
information that inet_getname() provides. It just needs the peer source
port, and that can be read without acquiring the lock (because it is
already held on these paths).
The ideal solution to me would be the patch proposed here [1] plus a
comment on the shutdown path to remind the developer that acquiring the
lock is needed to enable fan-out. Having said that, I cannot tell
whether keeping the divergence between UEK and upstream to a minimum or
being more efficient matters more here. I am leaving this dilemma to you
and Allison and any other maintainer who would like to chime in.
Just let me know so I can send a v2 of my proposed patch with a more
detailed commit message and also a comment on the mentioned path.
[1] https://lore.kernel.org/netdev/20260216120804.14840-1-fmancera@suse.de/
Thanks,
Fernando.
> Thanks,
>
> Gerd
>
>
* Re: [syzbot] [net?] possible deadlock in inet6_getname
2026-02-17 20:26 ` Fernando Fernandez Mancera
@ 2026-02-17 21:57 ` Allison Henderson
0 siblings, 0 replies; 14+ messages in thread
From: Allison Henderson @ 2026-02-17 21:57 UTC (permalink / raw)
To: fmancera@suse.de, Gerd Rausch, edumazet@google.com,
syzbot+5efae91f60932839f0a5@syzkaller.appspotmail.com
Cc: davem@davemloft.net, pabeni@redhat.com, dsahern@kernel.org,
linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
horms@kernel.org, netdev@vger.kernel.org, kuba@kernel.org
On Tue, 2026-02-17 at 21:26 +0100, Fernando Fernandez Mancera wrote:
> On 2/17/26 7:58 PM, Gerd Rausch wrote:
> > Hi,
> >
> > +Allison
> >
> > On 2026-02-17 09:28, Gerd Rausch wrote:
> > > On 2026-02-17 09:19, Fernando Fernandez Mancera wrote:
> > > > On 2/17/26 6:13 PM, Gerd Rausch wrote:
> > > > > If someone were to change the code to pass in "fan_out == true"
> > > > > from a context not already holding a socket lock,
> > > > > would they remember to change rds_tcp_conn_slots_available()
> > > > > to acquire that lock?
> > > > >
> > > >
> > > > Usually kernel requires the developer to understand when do they need
> > > > to acquire a lock or not.
> > > > Anyway, what would you suggest? To check whether we have acquired the
> > > > lock or not and do it conditionally?
> > >
> > > Something along those lines.
> > >
> > > Either a "_locked" version of "conn_slots_available" function pointer,
> > > a parameter, or a check.
> > >
> > > What you suggested is fine though for the immediate syzbot need.
> > >
> >
> > FWIW, in UEK (the modified Linux kernel that Oracle uses),
> > this issue reported by syzbot was addressed differently:
> >
> > https://github.com/oracle/linux-uek/commit/a94abc444c487
> >
> > The commit under discussion here originally came from:
> > https://github.com/oracle/linux-uek/commit/ebf71f5b6c29c
> >
> > and pre-dates:
> > 9dfc685e0262 ("inet: remove races in inet{6}_getname()")
> >
> > which is why UEK didn't have this deadlock issue reported by syszbot.
> >
> > If the objective is to keep divergence of UEK and Upstream to a minimum,
> > it may be worthwhile to consider adopting the same fix we carry in UEK.
> >
>
> I see, I don't really have a strong opinion here. Although, the solution
> at UEK seems a bit overkill to me. As RDS do not really need all the
> information that inet_getname() provides. It just needs the source port
> and it can be used without acquiring the lock (because it was already done).
>
> The ideal solution to me would be the patch proposed here [1] + a
> comment on the shutdown path to remember the developer that acquiring a
> lock is needed to enable fan-out. Having said that, I cannot tell if it
> would be better to keep divergence of UEK and Upstream or be more
> efficient. I am leaving this dilemma to you and Allison and any other
> maintainer that would like to chime in.
>
> Just let me know so I can a send a v2 of my proposed patch with a more
> detailed commit message and also a comment on the mentioned path.
>
> [1] https://lore.kernel.org/netdev/20260216120804.14840-1-fmancera@suse.de/
>
> Thanks,
> Fernando.
Hi all,
Let's move forward with Fernando's fix here. I will withdraw "net/rds: Use proper peer port number even when not
connected" from the fan-out improvement series since it conflicts. It looks like "fix recursive lock in
rds_tcp_conn_slots_available" eliminates the recursive lock issue entirely, so that will take care of the syzbot issue
for now, and then I can move forward with the rest of the fan-out patches in the next merge window.
Fernando, please go ahead and send your v2 with the extra comment details.
Thanks all,
Allison
>
> > Thanks,
> >
> > Gerd
> >
> >
>
end of thread, other threads:[~2026-02-17 21:57 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-13 12:15 [syzbot] [net?] possible deadlock in inet6_getname syzbot
2026-02-13 17:26 ` Eric Dumazet
2026-02-13 18:51 ` Gerd Rausch
2026-02-14 18:25 ` Fernando Fernandez Mancera
2026-02-17 16:59 ` Gerd Rausch
2026-02-17 17:03 ` Fernando Fernandez Mancera
2026-02-17 17:13 ` Gerd Rausch
2026-02-17 17:19 ` Fernando Fernandez Mancera
2026-02-17 17:28 ` Gerd Rausch
2026-02-17 18:58 ` Gerd Rausch
2026-02-17 20:26 ` Fernando Fernandez Mancera
2026-02-17 21:57 ` Allison Henderson
2026-02-16 11:32 ` Fernando Fernandez Mancera
2026-02-16 11:45 ` syzbot