* [syzbot] [net?] possible deadlock in team_del_slave (3)
@ 2024-04-26 11:59 syzbot
2024-04-26 14:17 ` Hillf Danton
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: syzbot @ 2024-04-26 11:59 UTC (permalink / raw)
To: davem, edumazet, jiri, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
kernel config: https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11919365180000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
------------------------------------------------------
syz-executor419/5074 is trying to acquire lock:
ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
but task is already holding lock:
ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5951 [inline]
ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
__dev_open+0x2d3/0x450 net/core/dev.c:1430
dev_open+0xae/0x1b0 net/core/dev.c:1466
team_port_add drivers/net/team/team.c:1214 [inline]
team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
do_set_master net/core/rtnetlink.c:2685 [inline]
do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
-> #0 (team->team_lock_key){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
call_netdevice_notifiers net/core/dev.c:2002 [inline]
unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
unregister_netdevice_many net/core/dev.c:11154 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
unregister_netdevice include/linux/netdevice.h:3115 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key);
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key);
*** DEADLOCK ***
3 locks held by syz-executor419/5074:
#0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
#1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
#2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
stack backtrace:
CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
call_netdevice_notifiers net/core/dev.c:2002 [inline]
unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
unregister_netdevice_many net/core/dev.c:11154 [inline]
unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
unregister_netdevice include/linux/netdevice.h:3115 [inline]
_cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f963cb981a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
</TASK>
team0: Port device wlan0 removed
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
@ 2024-04-26 14:17 ` Hillf Danton
2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Hillf Danton @ 2024-04-26 14:17 UTC (permalink / raw)
To: syzbot; +Cc: edumazet, linux-kernel, netdev, Boqun Feng, syzkaller-bugs
On Fri, 26 Apr 2024 04:59:32 -0700
> syzbot found the following issue on:
>
> HEAD commit: 480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
> kernel config: https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
> dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11919365180000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
> ------------------------------------------------------
> syz-executor419/5074 is trying to acquire lock:
> ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>
> but task is already holding lock:
> ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> wiphy_lock include/net/cfg80211.h:5951 [inline]
> ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
> __dev_open+0x2d3/0x450 net/core/dev.c:1430
ASSERT_RTNL();
> dev_open+0xae/0x1b0 net/core/dev.c:1466
> team_port_add drivers/net/team/team.c:1214 [inline]
> team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
> do_set_master net/core/rtnetlink.c:2685 [inline]
> do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
> rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
> rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
> -> #0 (team->team_lock_key){+.+.}-{3:3}:
> check_prev_add kernel/locking/lockdep.c:3134 [inline]
> check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
> notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
> call_netdevice_notifiers net/core/dev.c:2002 [inline]
> unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
> unregister_netdevice_many net/core/dev.c:11154 [inline]
> unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
> unregister_netdevice include/linux/netdevice.h:3115 [inline]
> _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
> ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
ASSERT_RTNL();
lockdep_assert_wiphy(sdata->local->hw.wiphy);
Given ASSERT_RTNL() on both sides, difficult to understand the
deadlock reported.
> ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
> rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
> cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
> genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
> genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
> genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key);
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key);
>
> *** DEADLOCK ***
>
> 3 locks held by syz-executor419/5074:
> #0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
> #1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
> #2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
> stack backtrace:
> CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
> check_prev_add kernel/locking/lockdep.c:3134 [inline]
> check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
> notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
> call_netdevice_notifiers net/core/dev.c:2002 [inline]
> unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
> unregister_netdevice_many net/core/dev.c:11154 [inline]
> unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
> unregister_netdevice include/linux/netdevice.h:3115 [inline]
> _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
> ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
> ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
> rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
> cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
> genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
> genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
> genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
> RIP: 0033:0x7f963cb981a9
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
> RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
> RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
> R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
> </TASK>
> team0: Port device wlan0 removed
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
2024-04-26 14:17 ` Hillf Danton
@ 2024-07-03 14:51 ` Jeongjun Park
2024-07-03 15:18 ` Michal Kubiak
2024-07-04 10:15 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jiri Pirko
2024-07-06 4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
3 siblings, 1 reply; 11+ messages in thread
From: Jeongjun Park @ 2024-07-03 14:51 UTC (permalink / raw)
To: jiri
Cc: syzbot+705c61d60b091ef42c04, davem, edumazet, kuba, linux-kernel,
netdev, pabeni, syzkaller-bugs, Jeongjun Park
CPU0 CPU1
---- ----
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key#4);
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key#4);
Deadlock occurs due to the above scenario. Therefore,
modify the code as shown in the patch below to prevent deadlock.
Regards,
Jeongjun Park.
Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
drivers/net/team/team_core.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..3ac82df876b0 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
struct netlink_ext_ack *extack)
{
struct team *team = netdev_priv(dev);
- int err;
+ int err, locked;
- mutex_lock(&team->lock);
+ locked = mutex_trylock(&team->lock);
err = team_port_add(team, port_dev, extack);
- mutex_unlock(&team->lock);
+ if (locked)
+ mutex_unlock(&team->lock);
if (!err)
netdev_change_features(dev);
@@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
{
struct team *team = netdev_priv(dev);
- int err;
+ int err, locked;
- mutex_lock(&team->lock);
+ locked = mutex_trylock(&team->lock);
err = team_port_del(team, port_dev);
- mutex_unlock(&team->lock);
+ if (locked)
+ mutex_unlock(&team->lock);
if (err)
return err;
--
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-03 15:18 ` Michal Kubiak
2024-07-03 16:02 ` Jeongjun Park
0 siblings, 1 reply; 11+ messages in thread
From: Michal Kubiak @ 2024-07-03 15:18 UTC (permalink / raw)
To: Jeongjun Park
Cc: jiri, syzbot+705c61d60b091ef42c04, davem, edumazet, kuba,
linux-kernel, netdev, pabeni, syzkaller-bugs
On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> CPU0 CPU1
> ---- ----
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key#4);
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key#4);
>
> Deadlock occurs due to the above scenario. Therefore,
> modify the code as shown in the patch below to prevent deadlock.
>
> Regards,
> Jeongjun Park.
The commit message should contain the patch description only (without
salutations, etc.).
>
> Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---
> drivers/net/team/team_core.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> index ab1935a4aa2c..3ac82df876b0 100644
> --- a/drivers/net/team/team_core.c
> +++ b/drivers/net/team/team_core.c
> @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> struct netlink_ext_ack *extack)
> {
> struct team *team = netdev_priv(dev);
> - int err;
> + int err, locked;
>
> - mutex_lock(&team->lock);
> + locked = mutex_trylock(&team->lock);
> err = team_port_add(team, port_dev, extack);
> - mutex_unlock(&team->lock);
> + if (locked)
> + mutex_unlock(&team->lock);
This is not correct usage of 'mutex_trylock()' API. In such a case you
could as well remove the lock completely from that part of code.
If "mutex_trylock()" returns false it means the mutex cannot be taken
(because it was already taken by other thread), so you should not modify
the resources that were expected to be protected by the mutex.
In other words, there is a risk of modifying resources using
"team_port_add()" by several threads at a time.
>
> if (!err)
> netdev_change_features(dev);
> @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> {
> struct team *team = netdev_priv(dev);
> - int err;
> + int err, locked;
>
> - mutex_lock(&team->lock);
> + locked = mutex_trylock(&team->lock);
> err = team_port_del(team, port_dev);
> - mutex_unlock(&team->lock);
> + if (locked)
> + mutex_unlock(&team->lock);
The same story as in case of "team_add_slave()".
>
> if (err)
> return err;
> --
>
The patch does not seem to be a correct solution to remove a deadlock.
Most probably a synchronization design needs an inspection.
If you really want to use "mutex_trylock()" API, please consider several
attempts of taking the mutex, but never modify the protected resources when
the mutex is not taken successfully.
Thanks,
Michal
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
2024-07-03 15:18 ` Michal Kubiak
@ 2024-07-03 16:02 ` Jeongjun Park
2024-07-03 16:30 ` Eric Dumazet
0 siblings, 1 reply; 11+ messages in thread
From: Jeongjun Park @ 2024-07-03 16:02 UTC (permalink / raw)
To: michal.kubiak
Cc: aha310510, davem, edumazet, jiri, kuba, linux-kernel, netdev,
pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=Y, Size: 4675 bytes --]
>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> > CPU0 CPU1
> > ---- ----
> > lock(&rdev->wiphy.mtx);
> > lock(team->team_lock_key#4);
> > lock(&rdev->wiphy.mtx);
> > lock(team->team_lock_key#4);
> >
> > Deadlock occurs due to the above scenario. Therefore,
> > modify the code as shown in the patch below to prevent deadlock.
> >
> > Regards,
> > Jeongjun Park.
>
> The commit message should contain the patch description only (without
> salutations, etc.).
>
> >
> > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > ---
> > drivers/net/team/team_core.c | 14 ++++++++------
> > 1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..3ac82df876b0 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > struct netlink_ext_ack *extack)
> > {
> > struct team *team = netdev_priv(dev);
> > - int err;
> > + int err, locked;
> >
> > - mutex_lock(&team->lock);
> > + locked = mutex_trylock(&team->lock);
> > err = team_port_add(team, port_dev, extack);
> > - mutex_unlock(&team->lock);
> > + if (locked)
> > + mutex_unlock(&team->lock);
>
> This is not correct usage of 'mutex_trylock()' API. In such a case you
> could as well remove the lock completely from that part of code.
> If "mutex_trylock()" returns false it means the mutex cannot be taken
> (because it was already taken by other thread), so you should not modify
> the resources that were expected to be protected by the mutex.
> In other words, there is a risk of modifying resources using
> "team_port_add()" by several threads at a time.
>
> >
> > if (!err)
> > netdev_change_features(dev);
> > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > {
> > struct team *team = netdev_priv(dev);
> > - int err;
> > + int err, locked;
> >
> > - mutex_lock(&team->lock);
> > + locked = mutex_trylock(&team->lock);
> > err = team_port_del(team, port_dev);
> > - mutex_unlock(&team->lock);
> > + if (locked)
> > + mutex_unlock(&team->lock);
>
> The same story as in case of "team_add_slave()".
>
> >
> > if (err)
> > return err;
> > --
> >
>
> The patch does not seem to be a correct solution to remove a deadlock.
> Most probably a synchronization design needs an inspection.
> If you really want to use "mutex_trylock()" API, please consider several
> attempts of taking the mutex, but never modify the protected resources when
> the mutex is not taken successfully.
>
Thanks for your comment. I rewrote the patch based on those comments.
This time, we modified it to return an error so that resources are not
modified when a race situation occurs. We would appreciate your
feedback on what this patch would be like.
> Thanks,
> Michal
>
>
Regards,
Jeongjun Park
---
drivers/net/team/team_core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
struct team *team = netdev_priv(dev);
int err;
- mutex_lock(&team->lock);
+ if (!mutex_trylock(&team->lock))
+ return -EBUSY;
err = team_port_add(team, port_dev, extack);
mutex_unlock(&team->lock);
@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
struct team *team = netdev_priv(dev);
int err;
- mutex_lock(&team->lock);
+ if (!mutex_trylock(&team->lock))
+ return -EBUSY;
err = team_port_del(team, port_dev);
mutex_unlock(&team->lock);
--
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
2024-07-03 16:02 ` Jeongjun Park
@ 2024-07-03 16:30 ` Eric Dumazet
2024-07-05 15:17 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
2024-07-05 15:19 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
0 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2024-07-03 16:30 UTC (permalink / raw)
To: Jeongjun Park
Cc: michal.kubiak, davem, jiri, kuba, linux-kernel, netdev, pabeni,
syzbot+705c61d60b091ef42c04, syzkaller-bugs
On Wed, Jul 3, 2024 at 6:02 PM Jeongjun Park <aha310510@gmail.com> wrote:
>
> >
> > On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> > > CPU0 CPU1
> > > ---- ----
> > > lock(&rdev->wiphy.mtx);
> > > lock(team->team_lock_key#4);
> > > lock(&rdev->wiphy.mtx);
> > > lock(team->team_lock_key#4);
> > >
> > > Deadlock occurs due to the above scenario. Therefore,
> > > modify the code as shown in the patch below to prevent deadlock.
> > >
> > > Regards,
> > > Jeongjun Park.
> >
> > The commit message should contain the patch description only (without
> > salutations, etc.).
> >
> > >
> > > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> > > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > > ---
> > > drivers/net/team/team_core.c | 14 ++++++++------
> > > 1 file changed, 8 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > > index ab1935a4aa2c..3ac82df876b0 100644
> > > --- a/drivers/net/team/team_core.c
> > > +++ b/drivers/net/team/team_core.c
> > > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > > struct netlink_ext_ack *extack)
> > > {
> > > struct team *team = netdev_priv(dev);
> > > - int err;
> > > + int err, locked;
> > >
> > > - mutex_lock(&team->lock);
> > > + locked = mutex_trylock(&team->lock);
> > > err = team_port_add(team, port_dev, extack);
> > > - mutex_unlock(&team->lock);
> > > + if (locked)
> > > + mutex_unlock(&team->lock);
> >
> > This is not correct usage of 'mutex_trylock()' API. In such a case you
> > could as well remove the lock completely from that part of code.
> > If "mutex_trylock()" returns false it means the mutex cannot be taken
> > (because it was already taken by other thread), so you should not modify
> > the resources that were expected to be protected by the mutex.
> > In other words, there is a risk of modifying resources using
> > "team_port_add()" by several threads at a time.
> >
> > >
> > > if (!err)
> > > netdev_change_features(dev);
> > > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > > static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > > {
> > > struct team *team = netdev_priv(dev);
> > > - int err;
> > > + int err, locked;
> > >
> > > - mutex_lock(&team->lock);
> > > + locked = mutex_trylock(&team->lock);
> > > err = team_port_del(team, port_dev);
> > > - mutex_unlock(&team->lock);
> > > + if (locked)
> > > + mutex_unlock(&team->lock);
> >
> > The same story as in case of "team_add_slave()".
> >
> > >
> > > if (err)
> > > return err;
> > > --
> > >
> >
> > The patch does not seem to be a correct solution to remove a deadlock.
> > Most probably a synchronization design needs an inspection.
> > If you really want to use "mutex_trylock()" API, please consider several
> > attempts of taking the mutex, but never modify the protected resources when
> > the mutex is not taken successfully.
> >
>
> Thanks for your comment. I rewrote the patch based on those comments.
> This time, we modified it to return an error so that resources are not
> modified when a race situation occurs. We would appreciate your
> feedback on what this patch would be like.
>
> > Thanks,
> > Michal
> >
> >
>
> Regards,
> Jeongjun Park
>
> ---
> drivers/net/team/team_core.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> index ab1935a4aa2c..43d7c73b25aa 100644
> --- a/drivers/net/team/team_core.c
> +++ b/drivers/net/team/team_core.c
> @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> struct team *team = netdev_priv(dev);
> int err;
>
> - mutex_lock(&team->lock);
> + if (!mutex_trylock(&team->lock))
> + return -EBUSY;
> err = team_port_add(team, port_dev, extack);
> mutex_unlock(&team->lock);
>
> @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> struct team *team = netdev_priv(dev);
> int err;
>
> - mutex_lock(&team->lock);
> + if (!mutex_trylock(&team->lock))
> + return -EBUSY;
> err = team_port_del(team, port_dev);
> mutex_unlock(&team->lock);
>
> --
Failing team_del_slave() is not an option. It will add various issues.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
2024-04-26 14:17 ` Hillf Danton
2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-04 10:15 ` Jiri Pirko
2024-07-06 4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
3 siblings, 0 replies; 11+ messages in thread
From: Jiri Pirko @ 2024-07-04 10:15 UTC (permalink / raw)
To: syzbot; +Cc: davem, edumazet, kuba, linux-kernel, netdev, pabeni,
syzkaller-bugs
Fri, Apr 26, 2024 at 01:59:32PM CEST, syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com wrote:
>Hello,
>
>syzbot found the following issue on:
>
>HEAD commit: 480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
>git tree: upstream
>console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
>kernel config: https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
>dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
>compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
>C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11919365180000
>
>Downloadable assets:
>disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
>vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
>kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz
>
>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
>
>======================================================
>WARNING: possible circular locking dependency detected
>6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
>------------------------------------------------------
>syz-executor419/5074 is trying to acquire lock:
>ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>
>but task is already holding lock:
>ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
>which lock already depends on the new lock.
>
>
>the existing dependency chain (in reverse order) is:
>
>-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> wiphy_lock include/net/cfg80211.h:5951 [inline]
> ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
> __dev_open+0x2d3/0x450 net/core/dev.c:1430
> dev_open+0xae/0x1b0 net/core/dev.c:1466
> team_port_add drivers/net/team/team.c:1214 [inline]
> team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
> do_set_master net/core/rtnetlink.c:2685 [inline]
> do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
> rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
> rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
>-> #0 (team->team_lock_key){+.+.}-{3:3}:
> check_prev_add kernel/locking/lockdep.c:3134 [inline]
> check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
> notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
> call_netdevice_notifiers net/core/dev.c:2002 [inline]
> unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
> unregister_netdevice_many net/core/dev.c:11154 [inline]
> unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
> unregister_netdevice include/linux/netdevice.h:3115 [inline]
> _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
> ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
> ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
> rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
> cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
> genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
> genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
> genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
I wonder, since we already rely on rtnl in lots of team code, perhaps we
can remove team->lock completely and convert the rest of the code to be
protected by rtnl lock as well.
>
>other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key);
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key);
>
> *** DEADLOCK ***
>
>3 locks held by syz-executor419/5074:
> #0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
> #1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
> #2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
>stack backtrace:
>CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
>Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
> check_prev_add kernel/locking/lockdep.c:3134 [inline]
> check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
> notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
> call_netdevice_notifiers net/core/dev.c:2002 [inline]
> unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
> unregister_netdevice_many net/core/dev.c:11154 [inline]
> unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
> unregister_netdevice include/linux/netdevice.h:3115 [inline]
> _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
> ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
> ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
> rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
> cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
> genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
> genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
> genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>RIP: 0033:0x7f963cb981a9
>Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
>RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
>RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
>R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
>R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
> </TASK>
>team0: Port device wlan0 removed
>
>
>---
>This report is generated by a bot. It may contain errors.
>See https://goo.gl/tpsmEJ for more information about syzbot.
>syzbot engineers can be reached at syzkaller@googlegroups.com.
>
>syzbot will keep track of this issue. See:
>https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
>If the report is already addressed, let syzbot know by replying with:
>#syz fix: exact-commit-title
>
>If you want syzbot to run the reproducer, reply with:
>#syz test: git://repo/address.git branch-or-commit-hash
>If you attach or paste a git patch, syzbot will apply it before testing.
>
>If you want to overwrite report's subsystems, reply with:
>#syz set subsystems: new-subsystem
>(See the list of subsystem names on the web dashboard)
>
>If the report is a duplicate of another one, reply with:
>#syz dup: exact-subject-of-another-report
>
>If you want to undo deduplication, reply with:
>#syz undup
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
2024-07-03 16:30 ` Eric Dumazet
@ 2024-07-05 15:17 ` Jeongjun Park
2024-07-05 15:19 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
1 sibling, 0 replies; 11+ messages in thread
From: Jeongjun Park @ 2024-07-05 15:17 UTC (permalink / raw)
To: edumazet
Cc: aha310510, davem, jiri, kuba, linux-kernel, michal.kubiak, netdev,
pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs
> >
> > Thanks for your comment. I rewrote the patch based on those comments.
> > This time, we modified it to return an error so that resources are not
> > modified when a race situation occurs. We would appreciate your
> > feedback on what this patch would be like.
> >
> > > Thanks,
> > > Michal
> > >
> > >
> >
> > Regards,
> > Jeongjun Park
> >
> > ---
> > drivers/net/team/team_core.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..43d7c73b25aa 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > struct team *team = netdev_priv(dev);
> > int err;
> >
> > - mutex_lock(&team->lock);
> > + if (!mutex_trylock(&team->lock))
> > + return -EBUSY;
> > err = team_port_add(team, port_dev, extack);
> > mutex_unlock(&team->lock);
> >
> > @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > struct team *team = netdev_priv(dev);
> > int err;
> >
> > - mutex_lock(&team->lock);
> > + if (!mutex_trylock(&team->lock))
> > + return -EBUSY;
> > err = team_port_del(team, port_dev);
> > mutex_unlock(&team->lock);
> >
> > --
>
> Failing team_del_slave() is not an option. It will add various issues.
Thank you for comment.
So, how about briefly releasing the lock before calling dev_open()
in team_port_add() and then locking it again? dev_open() does not use
&team, so disabling it briefly will not cause any major problems.
Regards,
Jeongjun Park
---
drivers/net/team/team_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
goto err_port_enter;
}
+ mutex_unlock(&team->lock);
err = dev_open(port_dev, extack);
+ mutex_lock(&team->lock);
if (err) {
netdev_dbg(dev, "Device %s opening failed\n",
portname);
--
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
2024-07-03 16:30 ` Eric Dumazet
2024-07-05 15:17 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-05 15:19 ` Jeongjun Park
1 sibling, 0 replies; 11+ messages in thread
From: Jeongjun Park @ 2024-07-05 15:19 UTC (permalink / raw)
To: edumazet
Cc: aha310510, davem, jiri, kuba, linux-kernel, michal.kubiak, netdev,
pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs
> >
> > Thanks for your comment. I rewrote the patch based on those comments.
> > This time, we modified it to return an error so that resources are not
> > modified when a race situation occurs. We would appreciate your
> > feedback on what this patch would be like.
> >
> > > Thanks,
> > > Michal
> > >
> > >
> >
> > Regards,
> > Jeongjun Park
> >
> > ---
> > drivers/net/team/team_core.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..43d7c73b25aa 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > struct team *team = netdev_priv(dev);
> > int err;
> >
> > - mutex_lock(&team->lock);
> > + if (!mutex_trylock(&team->lock))
> > + return -EBUSY;
> > err = team_port_add(team, port_dev, extack);
> > mutex_unlock(&team->lock);
> >
> > @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > struct team *team = netdev_priv(dev);
> > int err;
> >
> > - mutex_lock(&team->lock);
> > + if (!mutex_trylock(&team->lock))
> > + return -EBUSY;
> > err = team_port_del(team, port_dev);
> > mutex_unlock(&team->lock);
> >
> > --
>
> Failing team_del_slave() is not an option. It will add various issues.
Thank you for comment.
So, how about briefly releasing the lock before calling dev_open()
in team_port_add() and then locking it again? dev_open() does not use
&team, so disabling it briefly will not cause any major problems.
Regards,
Jeongjun Park
---
drivers/net/team/team_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
goto err_port_enter;
}
+ mutex_unlock(&team->lock);
err = dev_open(port_dev, extack);
+ mutex_lock(&team->lock);
if (err) {
netdev_dbg(dev, "Device %s opening failed\n",
portname);
--
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
` (2 preceding siblings ...)
2024-07-04 10:15 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jiri Pirko
@ 2024-07-06 4:13 ` Jeongjun Park
2024-07-06 15:01 ` Stephen Hemminger
3 siblings, 1 reply; 11+ messages in thread
From: Jeongjun Park @ 2024-07-06 4:13 UTC (permalink / raw)
To: jiri
Cc: syzbot+705c61d60b091ef42c04, davem, edumazet, kuba, linux-kernel,
netdev, pabeni, syzkaller-bugs, Jeongjun Park
CPU0 CPU1
---- ----
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key#4);
lock(&rdev->wiphy.mtx);
lock(team->team_lock_key#4);
Deadlock occurs due to the above scenario. Therefore, you can prevent
deadlock by briefly releasing the lock before calling dev_open() in
team_port_add() and locking it again after it returns.
Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
drivers/net/team/team_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
goto err_port_enter;
}
+ mutex_unlock(&team->lock);
err = dev_open(port_dev, extack);
+ mutex_lock(&team->lock);
if (err) {
netdev_dbg(dev, "Device %s opening failed\n",
portname);
--
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave
2024-07-06 4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-06 15:01 ` Stephen Hemminger
0 siblings, 0 replies; 11+ messages in thread
From: Stephen Hemminger @ 2024-07-06 15:01 UTC (permalink / raw)
To: Jeongjun Park
Cc: jiri, syzbot+705c61d60b091ef42c04, davem, edumazet, kuba,
linux-kernel, netdev, pabeni, syzkaller-bugs
On Sat, 6 Jul 2024 13:13:29 +0900
Jeongjun Park <aha310510@gmail.com> wrote:
> CPU0 CPU1
> ---- ----
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key#4);
> lock(&rdev->wiphy.mtx);
> lock(team->team_lock_key#4);
>
> Deadlock occurs due to the above scenario. Therefore, you can prevent
> deadlock by briefly releasing the lock before calling dev_open() in
> team_port_add() and locking it again after it returns.
>
> Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---
But if you drop the lock the actual data structures might have changed.
Usually not a good idea,
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-07-06 15:01 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
2024-04-26 14:17 ` Hillf Danton
2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-03 15:18 ` Michal Kubiak
2024-07-03 16:02 ` Jeongjun Park
2024-07-03 16:30 ` Eric Dumazet
2024-07-05 15:17 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
2024-07-05 15:19 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-04 10:15 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jiri Pirko
2024-07-06 4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-06 15:01 ` Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).