* [syzbot] [rdma?] possible deadlock in siw_create_listen (2)
@ 2024-09-26 13:34 syzbot
2024-10-04 16:10 ` Bernard Metzler
0 siblings, 1 reply; 4+ messages in thread
From: syzbot @ 2024-09-26 13:34 UTC (permalink / raw)
To: bmt, jgg, leon, linux-kernel, linux-rdma, netdev, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 5f5673607153 Merge branch 'for-next/core' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=149fdca9980000
kernel config: https://syzkaller.appspot.com/x/.config?x=dedbcb1ff4387972
dashboard link: https://syzkaller.appspot.com/bug?extid=3eb27595de9aa3cf63c3
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/40172aed5414/disk-5f567360.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/58372f305e9d/vmlinux-5f567360.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d2aae6fa798f/Image-5f567360.gz.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3eb27595de9aa3cf63c3@syzkaller.appspotmail.com
iwpm_register_pid: Unable to send a nlmsg (client = 2)
======================================================
WARNING: possible circular locking dependency detected
6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
------------------------------------------------------
syz.4.157/7931 is trying to acquire lock:
ffff0000ee056458 (sk_lock-AF_INET){+.+.}-{0:0}, at: siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
but task is already holding lock:
ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4 drivers/infiniband/core/cma.c:5354
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (lock#7){+.+.}-{3:3}:
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
cma_init+0x2c/0x158 drivers/infiniband/core/cma.c:5438
do_one_initcall+0x24c/0x9c0 init/main.c:1267
do_initcall_level+0x154/0x214 init/main.c:1329
do_initcalls+0x58/0xac init/main.c:1345
do_basic_setup+0x8c/0xa0 init/main.c:1364
kernel_init_freeable+0x324/0x478 init/main.c:1578
kernel_init+0x24/0x2a0 init/main.c:1467
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
-> #2 (rtnl_mutex){+.+.}-{3:3}:
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79
do_ip_setsockopt+0xe8c/0x346c net/ipv4/ip_sockglue.c:1077
ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
tcp_setsockopt+0xcc/0xe8 net/ipv4/tcp.c:3768
sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
smc_setsockopt+0x204/0x10fc net/smc/af_smc.c:3072
do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
__sys_setsockopt+0x128/0x1a8 net/socket.c:2347
__do_sys_setsockopt net/socket.c:2356 [inline]
__se_sys_setsockopt net/socket.c:2353 [inline]
__arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
-> #1 (&smc->clcsock_release_lock){+.+.}-{3:3}:
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
smc_switch_to_fallback+0x48/0xa80 net/smc/af_smc.c:902
smc_sendmsg+0xfc/0x9f8 net/smc/af_smc.c:2779
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
__sys_sendto+0x374/0x4f4 net/socket.c:2204
__do_sys_sendto net/socket.c:2216 [inline]
__se_sys_sendto net/socket.c:2212 [inline]
__arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
-> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
lock_sock_nested net/core/sock.c:3543 [inline]
lock_sock include/net/sock.h:1607 [inline]
sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
enable_device_and_get+0x1a8/0x3e8 drivers/infiniband/core/device.c:1338
ib_register_device+0xe40/0x108c drivers/infiniband/core/device.c:1426
siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x56c/0x840 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x26c/0x33c net/socket.c:2680
__do_sys_sendmsg net/socket.c:2689 [inline]
__se_sys_sendmsg net/socket.c:2687 [inline]
__arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
other info that might help us debug this:
Chain exists of:
sk_lock-AF_INET --> rtnl_mutex --> lock#7
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(lock#7);
lock(rtnl_mutex);
lock(lock#7);
lock(sk_lock-AF_INET);
*** DEADLOCK ***
6 locks held by syz.4.157/7931:
#0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:164 [inline]
#0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
#0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at: rdma_nl_rcv+0x330/0x858 drivers/infiniband/core/netlink.c:259
#1: ffff800091c0e870 (link_ops_rwsem){++++}-{3:3}, at: nldev_newlink+0x358/0x4fc drivers/infiniband/core/nldev.c:1784
#2: ffff800091bff210 (devices_rwsem){++++}-{3:3}, at: enable_device_and_get+0x104/0x3e8 drivers/infiniband/core/device.c:1328
#3: ffff800091bff510 (clients_rwsem){++++}-{3:3}, at: enable_device_and_get+0x160/0x3e8 drivers/infiniband/core/device.c:1336
#4: ffff0000d61505d0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x424/0x7d0 drivers/infiniband/core/device.c:725
#5: ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4 drivers/infiniband/core/cma.c:5354
stack backtrace:
CPU: 0 UID: 0 PID: 7931 Comm: syz.4.157 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
dump_stack+0x1c/0x28 lib/dump_stack.c:128
print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2059
check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2186
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
lock_sock_nested net/core/sock.c:3543 [inline]
lock_sock include/net/sock.h:1607 [inline]
sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
enable_device_and_get+0x1a8/0x3e8 drivers/infiniband/core/device.c:1338
ib_register_device+0xe40/0x108c drivers/infiniband/core/device.c:1426
siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x56c/0x840 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x26c/0x33c net/socket.c:2680
__do_sys_sendmsg net/socket.c:2689 [inline]
__se_sys_sendmsg net/socket.c:2687 [inline]
__arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98
overlay: ./file0 is not a directory
xt_nfacct: accounting object `sy\x05' does not exists
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* RE: [syzbot] [rdma?] possible deadlock in siw_create_listen (2)
2024-09-26 13:34 [syzbot] [rdma?] possible deadlock in siw_create_listen (2) syzbot
@ 2024-10-04 16:10 ` Bernard Metzler
2024-10-05 1:20 ` Jason Gunthorpe
0 siblings, 1 reply; 4+ messages in thread
From: Bernard Metzler @ 2024-10-04 16:10 UTC (permalink / raw)
To: jgg@ziepe.ca, leon@kernel.org, linux-rdma@vger.kernel.org
> -----Original Message-----
> From: syzbot <syzbot+3eb27595de9aa3cf63c3@syzkaller.appspotmail.com>
> Sent: Thursday, September 26, 2024 3:34 PM
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org;
> linux-kernel@vger.kernel.org; linux-rdma@vger.kernel.org;
> netdev@vger.kernel.org; syzkaller-bugs@googlegroups.com
> Subject: [EXTERNAL] [syzbot] [rdma?] possible deadlock in siw_create_listen
> (2)
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 5f5673607153 Merge branch 'for-next/core' into for-kernelci
> git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=149fdca9980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=dedbcb1ff4387972
> dashboard link: https://syzkaller.appspot.com/bug?extid=3eb27595de9aa3cf63c3
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for
> Debian) 2.40
> userspace arch: arm64
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/40172aed5414/disk-5f567360.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/58372f305e9d/vmlinux-5f567360.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/d2aae6fa798f/Image-5f567360.gz.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the
> commit:
> Reported-by: syzbot+3eb27595de9aa3cf63c3@syzkaller.appspotmail.com
>
> iwpm_register_pid: Unable to send a nlmsg (client = 2)
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
> ------------------------------------------------------
> syz.4.157/7931 is trying to acquire lock:
> ffff0000ee056458 (sk_lock-AF_INET){+.+.}-{0:0}, at:
> siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
>
> but task is already holding lock:
> ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4
> drivers/infiniband/core/cma.c:5354
>
> which lock already depends on the new lock.
>
Could one please help me to understand this situation?
cma.c:5354
mutex_lock(&lock);
list_add_tail(&cma_dev->list, &dev_list);
list_for_each_entry(id_priv, &listen_any_list, listen_any_item) {
ret = cma_listen_on_dev(id_priv, cma_dev, &to_destroy);
if (ret)
goto free_listen;
}
mutex_unlock(&lock);
siw_cm.c:1776
sock_set_reuseaddr(s->sk);
...which calls lock_sock(sk) on a freshly created socket.
I don't see the dependency between the global cma lock and the socket lock.
Any help appreciated!
Thanks,
Bernard.
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (lock#7){+.+.}-{3:3}:
> __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
> __mutex_lock kernel/locking/mutex.c:752 [inline]
> mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
> cma_init+0x2c/0x158 drivers/infiniband/core/cma.c:5438
> do_one_initcall+0x24c/0x9c0 init/main.c:1267
> do_initcall_level+0x154/0x214 init/main.c:1329
> do_initcalls+0x58/0xac init/main.c:1345
> do_basic_setup+0x8c/0xa0 init/main.c:1364
> kernel_init_freeable+0x324/0x478 init/main.c:1578
> kernel_init+0x24/0x2a0 init/main.c:1467
> ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
>
> -> #2 (rtnl_mutex){+.+.}-{3:3}:
> __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
> __mutex_lock kernel/locking/mutex.c:752 [inline]
> mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
> rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79
> do_ip_setsockopt+0xe8c/0x346c net/ipv4/ip_sockglue.c:1077
> ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
> tcp_setsockopt+0xcc/0xe8 net/ipv4/tcp.c:3768
> sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
> smc_setsockopt+0x204/0x10fc net/smc/af_smc.c:3072
> do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
> __sys_setsockopt+0x128/0x1a8 net/socket.c:2347
> __do_sys_setsockopt net/socket.c:2356 [inline]
> __se_sys_setsockopt net/socket.c:2353 [inline]
> __arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
> __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
> el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
> el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
>
> -> #1 (&smc->clcsock_release_lock){+.+.}-{3:3}:
> __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
> __mutex_lock kernel/locking/mutex.c:752 [inline]
> mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
> smc_switch_to_fallback+0x48/0xa80 net/smc/af_smc.c:902
> smc_sendmsg+0xfc/0x9f8 net/smc/af_smc.c:2779
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg net/socket.c:745 [inline]
> __sys_sendto+0x374/0x4f4 net/socket.c:2204
> __do_sys_sendto net/socket.c:2216 [inline]
> __se_sys_sendto net/socket.c:2212 [inline]
> __arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
> __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
> el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
> el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
>
> -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
> check_prev_add kernel/locking/lockdep.c:3133 [inline]
> check_prevs_add kernel/locking/lockdep.c:3252 [inline]
> validate_chain kernel/locking/lockdep.c:3868 [inline]
> __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
> lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
> lock_sock_nested net/core/sock.c:3543 [inline]
> lock_sock include/net/sock.h:1607 [inline]
> sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
> siw_create_listen+0x164/0xd70
> drivers/infiniband/sw/siw/siw_cm.c:1776
> iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
> cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
> rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
> cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
> cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
> add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
> enable_device_and_get+0x1a8/0x3e8
> drivers/infiniband/core/device.c:1338
> ib_register_device+0xe40/0x108c
> drivers/infiniband/core/device.c:1426
> siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
> siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
> nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
> rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
> rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
> netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
> netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg net/socket.c:745 [inline]
> ____sys_sendmsg+0x56c/0x840 net/socket.c:2597
> ___sys_sendmsg net/socket.c:2651 [inline]
> __sys_sendmsg+0x26c/0x33c net/socket.c:2680
> __do_sys_sendmsg net/socket.c:2689 [inline]
> __se_sys_sendmsg net/socket.c:2687 [inline]
> __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
> __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
> el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
> el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
>
> other info that might help us debug this:
>
> Chain exists of:
> sk_lock-AF_INET --> rtnl_mutex --> lock#7
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(lock#7);
> lock(rtnl_mutex);
> lock(lock#7);
> lock(sk_lock-AF_INET);
>
> *** DEADLOCK ***
>
> 6 locks held by syz.4.157/7931:
> #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:164 [inline]
> #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
> #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv+0x330/0x858 drivers/infiniband/core/netlink.c:259
> #1: ffff800091c0e870 (link_ops_rwsem){++++}-{3:3}, at:
> nldev_newlink+0x358/0x4fc drivers/infiniband/core/nldev.c:1784
> #2: ffff800091bff210 (devices_rwsem){++++}-{3:3}, at:
> enable_device_and_get+0x104/0x3e8 drivers/infiniband/core/device.c:1328
> #3: ffff800091bff510 (clients_rwsem){++++}-{3:3}, at:
> enable_device_and_get+0x160/0x3e8 drivers/infiniband/core/device.c:1336
> #4: ffff0000d61505d0 (&device->client_data_rwsem){++++}-{3:3}, at:
> add_client_context+0x424/0x7d0 drivers/infiniband/core/device.c:725
> #5: ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4
> drivers/infiniband/core/cma.c:5354
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 7931 Comm: syz.4.157 Not tainted 6.11.0-rc7-syzkaller-
> g5f5673607153 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 08/06/2024
> Call trace:
> dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
> show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
> __dump_stack lib/dump_stack.c:93 [inline]
> dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
> dump_stack+0x1c/0x28 lib/dump_stack.c:128
> print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2059
> check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2186
> check_prev_add kernel/locking/lockdep.c:3133 [inline]
> check_prevs_add kernel/locking/lockdep.c:3252 [inline]
> validate_chain kernel/locking/lockdep.c:3868 [inline]
> __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
> lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
> lock_sock_nested net/core/sock.c:3543 [inline]
> lock_sock include/net/sock.h:1607 [inline]
> sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
> siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
> iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
> cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
> rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
> cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
> cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
> add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
> enable_device_and_get+0x1a8/0x3e8 drivers/infiniband/core/device.c:1338
> ib_register_device+0xe40/0x108c drivers/infiniband/core/device.c:1426
> siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
> siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
> nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
> rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
> rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
> netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
> netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg net/socket.c:745 [inline]
> ____sys_sendmsg+0x56c/0x840 net/socket.c:2597
> ___sys_sendmsg net/socket.c:2651 [inline]
> __sys_sendmsg+0x26c/0x33c net/socket.c:2680
> __do_sys_sendmsg net/socket.c:2689 [inline]
> __se_sys_sendmsg net/socket.c:2687 [inline]
> __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
> __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
> el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
> do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
> el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
> el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
> el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
> infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98
> overlay: ./file0 is not a directory
> xt_nfacct: accounting object `sy\x05' does not exists
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
* Re: [syzbot] [rdma?] possible deadlock in siw_create_listen (2)
2024-10-04 16:10 ` Bernard Metzler
@ 2024-10-05 1:20 ` Jason Gunthorpe
2024-10-05 17:34 ` Bernard Metzler
0 siblings, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2024-10-05 1:20 UTC (permalink / raw)
To: Bernard Metzler; +Cc: leon@kernel.org, linux-rdma@vger.kernel.org
On Fri, Oct 04, 2024 at 04:10:31PM +0000, Bernard Metzler wrote:
> Could one please help me to understand this situation?
> cma.c:5354
>
> mutex_lock(&lock);
> list_add_tail(&cma_dev->list, &dev_list);
> list_for_each_entry(id_priv, &listen_any_list, listen_any_item) {
> ret = cma_listen_on_dev(id_priv, cma_dev, &to_destroy);
> if (ret)
> goto free_listen;
> }
> mutex_unlock(&lock);
>
> siw_cm.c:1776
> sock_set_reuseaddr(s->sk);
>
> ...which calls lock_sock(sk) on a freshly created socket.
I think this is an smc bug, and lockdep is getting confused about what
to report due to all the different locks.
smc_setsockopt(), eventually via ip_setsockopt(), does:
mutex_lock(&smc->clcsock_release_lock);
if (needs_rtnl)
rtnl_lock();
sockopt_lock_sock(sk);
mutex_unlock(&smc->clcsock_release_lock);
smc_sendmsg() does
lock_sock(sk);
mutex_lock(&smc->clcsock_release_lock);
Which is classic deadlock locking.
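To make the ordering problem above concrete, here is a minimal userspace sketch (plain C, not kernel code). It models each code path as the sequence of locks it takes, using the order given in this message, and checks whether any pair of locks is taken in opposite order by the two paths. The names `lock_id`, `order_of`, `paths_invert`, `setsockopt_path` and `sendmsg_path` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical IDs for the three locks named in the message above. */
enum lock_id { SK_LOCK, CLCSOCK_RELEASE_LOCK, RTNL_MUTEX };

/* Position of 'lock' in an acquisition sequence, or -1 if it is absent. */
static int order_of(const enum lock_id *path, size_t len, enum lock_id lock)
{
    for (size_t i = 0; i < len; i++)
        if (path[i] == lock)
            return (int)i;
    return -1;
}

/* True if some pair of locks is taken in opposite order by the two paths. */
static bool paths_invert(const enum lock_id *a, size_t a_len,
                         const enum lock_id *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < a_len; i++) {
        for (size_t j = i + 1; j < a_len; j++) {
            /* Path a takes a[i] before a[j]; look both up in path b. */
            int first = order_of(b, b_len, a[i]);
            int second = order_of(b, b_len, a[j]);
            if (first >= 0 && second >= 0 && second < first)
                return true;
        }
    }
    return false;
}

/* Acquisition order as described for smc_setsockopt(). */
static const enum lock_id setsockopt_path[] = {
    CLCSOCK_RELEASE_LOCK, RTNL_MUTEX, SK_LOCK
};

/* Acquisition order as described for smc_sendmsg(). */
static const enum lock_id sendmsg_path[] = {
    SK_LOCK, CLCSOCK_RELEASE_LOCK
};
```

Calling `paths_invert(setsockopt_path, 3, sendmsg_path, 2)` reports an inversion, because one path takes the release lock before the socket lock and the other takes them the other way round. Comparing a path against itself reports none.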
That the CMA gets involved here seems like wrong reporting because
syzkaller put those lock chains into it.
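As a rough illustration of how the CMA lock can end up in the report at all, the sketch below treats lock classes as nodes in a "taken while held" graph and refuses a new edge if it would close a cycle. This is only a simplified userspace model in the spirit of lockdep, not its actual implementation. The type and function names (`lock_class`, `dep_graph`, `dep_graph_add`, `reachable`) are hypothetical, and reading the edges off the #0..#3 entries of the quoted report is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock-class IDs; the names follow the quoted report. */
enum lock_class {
    SK_LOCK_AF_INET,
    CLCSOCK_RELEASE_LOCK,
    RTNL_MUTEX,
    CMA_LOCK,               /* "lock#7" in the report */
    NR_CLASSES
};

struct dep_graph {
    /* edge[a][b] is true if class b was taken while class a was held. */
    bool edge[NR_CLASSES][NR_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++) {
        if (g->edge[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record that 'taken' was acquired while 'held' was held.
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
static bool dep_graph_add(struct dep_graph *g, int held, int taken)
{
    bool seen[NR_CLASSES] = { false };

    assert(held >= 0 && held < NR_CLASSES);
    assert(taken >= 0 && taken < NR_CLASSES);
    if (reachable(g, taken, held, seen))
        return false;
    g->edge[held][taken] = true;
    return true;
}
```

Feeding it sk_lock-AF_INET -> clcsock_release_lock, clcsock_release_lock -> rtnl_mutex and rtnl_mutex -> lock#7 in that order, all three edges are accepted. The fourth edge, lock#7 -> sk_lock-AF_INET (the siw_create_listen() path), is rejected because it closes the cycle. Since the graph is keyed by lock class rather than by individual socket, a freshly created socket still lands on the same node as every other AF_INET socket.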
I guess this is a dup of
https://lore.kernel.org/netdev/00000000000093078f0622583e6e@google.com/T/
Or at least that should be fixed before looking at this
Jason
* RE: [syzbot] [rdma?] possible deadlock in siw_create_listen (2)
2024-10-05 1:20 ` Jason Gunthorpe
@ 2024-10-05 17:34 ` Bernard Metzler
0 siblings, 0 replies; 4+ messages in thread
From: Bernard Metzler @ 2024-10-05 17:34 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: leon@kernel.org, linux-rdma@vger.kernel.org
> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Saturday, October 5, 2024 3:21 AM
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: leon@kernel.org; linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] Re: [syzbot] [rdma?] possible deadlock in
> siw_create_listen (2)
>
> On Fri, Oct 04, 2024 at 04:10:31PM +0000, Bernard Metzler wrote:
>
> > Could one please help me to understand this situation?
> > cma.c:5354
> >
> > mutex_lock(&lock);
> > list_add_tail(&cma_dev->list, &dev_list);
> > list_for_each_entry(id_priv, &listen_any_list, listen_any_item) {
> > ret = cma_listen_on_dev(id_priv, cma_dev, &to_destroy);
> > if (ret)
> > goto free_listen;
> > }
> > mutex_unlock(&lock);
> >
> > siw_cm.c:1776
> > sock_set_reuseaddr(s->sk);
> >
> > ...which calls lock_sock(sk) on a freshly created socket.
>
> I think this is a smc bug, and lockdep is getting confused about what
> to report due to all the different locks.
>
> smc_setsockopt() eventually in ip_setsockopt() does:
>
> mutex_lock(&smc->clcsock_release_lock);
>
> if (needs_rtnl)
> rtnl_lock();
> sockopt_lock_sock(sk);
> mutex_unlock(&smc->clcsock_release_lock);
>
>
> smc_sendmsg() does
>
> lock_sock(sk);
> mutex_lock(&smc->clcsock_release_lock);
>
> Which is classic deadlock locking.
>
Thank you for helping to clarify this. That would make much more sense.
So blaming
> siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
... isn't quite right. It doesn't deal with the SMC lock,
but locks a just created socket via
>> -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
>> check_prev_add kernel/locking/lockdep.c:3133 [inline]
>> check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>> validate_chain kernel/locking/lockdep.c:3868 [inline]
>> __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
>> lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
>> lock_sock_nested net/core/sock.c:3543 [inline]
>> lock_sock include/net/sock.h:1607 [inline]
>> sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
>> siw_create_listen+0x164/0xd70
> That the CMA gets involved here seems like wrong reporting because
> syzkaller put those lock chains into it.
>
> I guess this is a dup of
>
> https://lore.kernel.org/netdev/00000000000093078f0622583e6e@google.com/T/
>
> Or at least that should be fixed before looking at this
>
Sounds reasonable...
Thanks!
Bernard.