public inbox for linux-kernel@vger.kernel.org
* [syzbot] [net?] possible deadlock in team_del_slave (3)
@ 2024-04-26 11:59 syzbot
  2024-04-26 14:17 ` Hillf Danton
                   ` (14 more replies)
  0 siblings, 15 replies; 32+ messages in thread
From: syzbot @ 2024-04-26 11:59 UTC
  To: davem, edumazet, jiri, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11919365180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
------------------------------------------------------
syz-executor419/5074 is trying to acquire lock:
ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988

but task is already holding lock:
ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       wiphy_lock include/net/cfg80211.h:5951 [inline]
       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
       __dev_open+0x2d3/0x450 net/core/dev.c:1430
       dev_open+0xae/0x1b0 net/core/dev.c:1466
       team_port_add drivers/net/team/team.c:1214 [inline]
       team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
       do_set_master net/core/rtnetlink.c:2685 [inline]
       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
       rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
       rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
       ___sys_sendmsg net/socket.c:2638 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

-> #0 (team->team_lock_key){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
       team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
       notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
       call_netdevice_notifiers net/core/dev.c:2002 [inline]
       unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
       unregister_netdevice_many net/core/dev.c:11154 [inline]
       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
       unregister_netdevice include/linux/netdevice.h:3115 [inline]
       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
       ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
       ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
       rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
       cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
       genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
       genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
       ___sys_sendmsg net/socket.c:2638 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key);

 *** DEADLOCK ***

3 locks held by syz-executor419/5074:
 #0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
 #1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
 #2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389

stack backtrace:
CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
 team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
 notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
 call_netdevice_notifiers net/core/dev.c:2002 [inline]
 unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
 unregister_netdevice_many net/core/dev.c:11154 [inline]
 unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
 unregister_netdevice include/linux/netdevice.h:3115 [inline]
 _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
 ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
 ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
 rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
 cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
 genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
 netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
 netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
 ___sys_sendmsg net/socket.c:2638 [inline]
 __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7f963cb981a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
 </TASK>
team0: Port device wlan0 removed


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
@ 2024-04-26 14:17 ` Hillf Danton
  2024-07-03 11:25 ` Jeongjun Park
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Hillf Danton @ 2024-04-26 14:17 UTC
  To: syzbot; +Cc: edumazet, linux-kernel, netdev, Boqun Feng, syzkaller-bugs

On Fri, 26 Apr 2024 04:59:32 -0700 syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
> git tree:       upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
> dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11919365180000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
> ------------------------------------------------------
> syz-executor419/5074 is trying to acquire lock:
> ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> 
> but task is already holding lock:
> ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        wiphy_lock include/net/cfg80211.h:5951 [inline]
>        ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
>        __dev_open+0x2d3/0x450 net/core/dev.c:1430

	ASSERT_RTNL();

>        dev_open+0xae/0x1b0 net/core/dev.c:1466
>        team_port_add drivers/net/team/team.c:1214 [inline]
>        team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
>        do_set_master net/core/rtnetlink.c:2685 [inline]
>        do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
>        rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
>        rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
>        netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
>        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
>        netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
>        netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
>        sock_sendmsg_nosec net/socket.c:730 [inline]
>        __sock_sendmsg+0x221/0x270 net/socket.c:745
>        ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
>        ___sys_sendmsg net/socket.c:2638 [inline]
>        __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
>        do_syscall_64+0xfb/0x240
>        entry_SYSCALL_64_after_hwframe+0x6d/0x75
> 
> -> #0 (team->team_lock_key){+.+.}-{3:3}:
>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>        team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
>        notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>        call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>        call_netdevice_notifiers net/core/dev.c:2002 [inline]
>        unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
>        unregister_netdevice_many net/core/dev.c:11154 [inline]
>        unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
>        unregister_netdevice include/linux/netdevice.h:3115 [inline]
>        _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
>        ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242

	ASSERT_RTNL();
	lockdep_assert_wiphy(sdata->local->hw.wiphy);

Given ASSERT_RTNL() on both sides, it is difficult to understand the
reported deadlock.

>        ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
>        rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
>        cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
>        genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
>        genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
>        genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
>        netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
>        genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
>        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
>        netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
>        netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
>        sock_sendmsg_nosec net/socket.c:730 [inline]
>        __sock_sendmsg+0x221/0x270 net/socket.c:745
>        ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
>        ___sys_sendmsg net/socket.c:2638 [inline]
>        __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
>        do_syscall_64+0xfb/0x240
>        entry_SYSCALL_64_after_hwframe+0x6d/0x75
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&rdev->wiphy.mtx);
>                                lock(team->team_lock_key);
>                                lock(&rdev->wiphy.mtx);
>   lock(team->team_lock_key);
> 
>  *** DEADLOCK ***
> 
> 3 locks held by syz-executor419/5074:
>  #0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
>  #1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
>  #2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
> 
> stack backtrace:
> CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
>  check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
>  check_prev_add kernel/locking/lockdep.c:3134 [inline]
>  check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>  validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>  __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>  lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>  __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>  __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>  team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>  team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
>  notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>  call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>  call_netdevice_notifiers net/core/dev.c:2002 [inline]
>  unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
>  unregister_netdevice_many net/core/dev.c:11154 [inline]
>  unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
>  unregister_netdevice include/linux/netdevice.h:3115 [inline]
>  _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
>  ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
>  ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
>  rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
>  cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
>  genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
>  genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
>  genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
>  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
>  netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
>  netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
>  netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
>  sock_sendmsg_nosec net/socket.c:730 [inline]
>  __sock_sendmsg+0x221/0x270 net/socket.c:745
>  ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
>  ___sys_sendmsg net/socket.c:2638 [inline]
>  __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
>  do_syscall_64+0xfb/0x240
>  entry_SYSCALL_64_after_hwframe+0x6d/0x75
> RIP: 0033:0x7f963cb981a9
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
> RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
> RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
> R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
>  </TASK>
> team0: Port device wlan0 removed

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
  2024-04-26 14:17 ` Hillf Danton
@ 2024-07-03 11:25 ` Jeongjun Park
  2024-07-03 13:41   ` syzbot
  2024-07-03 13:44 ` Jeongjun Park
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-03 11:25 UTC
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-03 11:25 ` Jeongjun Park
@ 2024-07-03 13:41   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-03 13:41 UTC
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00061-ge9d22f7a6655 #0 Not tainted
------------------------------------------------------
kworker/u8:4/61 is trying to acquire lock:
ffff888023524d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x700 net/mac80211/iface.c:2280

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       wiphy_lock include/net/cfg80211.h:5966 [inline]
       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
       __dev_open+0x2d3/0x450 net/core/dev.c:1472
       dev_open+0xae/0x1b0 net/core/dev.c:1508
       team_port_add drivers/net/team/team_core.c:1216 [inline]
       team_add_slave+0x9b3/0x2750 drivers/net/team/team_core.c:1976
       do_set_master net/core/rtnetlink.c:2701 [inline]
       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
       __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
       rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
       rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
       ___sys_sendmsg net/socket.c:2639 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (team->team_lock_key#4){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
       team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
       notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
       call_netdevice_notifiers net/core/dev.c:2044 [inline]
       unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
       unregister_netdevice_many net/core/dev.c:11277 [inline]
       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
       unregister_netdevice include/linux/netdevice.h:3119 [inline]
       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
       ieee80211_remove_interfaces+0x4db/0x700 net/mac80211/iface.c:2305
       ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1658
       mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
       hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
       ops_exit_list net/core/net_namespace.c:173 [inline]
       cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
       process_one_work kernel/workqueue.c:3248 [inline]
       process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
       worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

 *** DEADLOCK ***

5 locks held by kworker/u8:4/61:
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
 #1: ffffc900015c7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
 #1: ffffc900015c7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
 #2: ffffffff8f5da690 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
 #3: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1651
 #4: ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
 #4: ffff8880226b0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x700 net/mac80211/iface.c:2280

stack backtrace:
CPU: 0 PID: 61 Comm: kworker/u8:4 Not tainted 6.10.0-rc6-syzkaller-00061-ge9d22f7a6655 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
 team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
 call_netdevice_notifiers net/core/dev.c:2044 [inline]
 unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
 unregister_netdevice_many net/core/dev.c:11277 [inline]
 unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
 unregister_netdevice include/linux/netdevice.h:3119 [inline]
 _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
 ieee80211_remove_interfaces+0x4db/0x700 net/mac80211/iface.c:2305
 ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1658
 mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
 hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
 ops_exit_list net/core/net_namespace.c:173 [inline]
 cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
 process_one_work kernel/workqueue.c:3248 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
 worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
netdevsim netdevsim4 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim4 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed


Tested on:

commit:         e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14efde81980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
  2024-04-26 14:17 ` Hillf Danton
  2024-07-03 11:25 ` Jeongjun Park
@ 2024-07-03 13:44 ` Jeongjun Park
  2024-07-03 14:19   ` syzbot
  2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-03 13:44 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 drivers/net/team/team_core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..3ac82df876b0 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
                          struct netlink_ext_ack *extack)
 {
        struct team *team = netdev_priv(dev);
-       int err;
+       int err, locked;
 
-       mutex_lock(&team->lock);
+       locked = mutex_trylock(&team->lock);
        err = team_port_add(team, port_dev, extack);
-       mutex_unlock(&team->lock);
+       if (locked)
+               mutex_unlock(&team->lock);
 
        if (!err)
                netdev_change_features(dev);
@@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 {
        struct team *team = netdev_priv(dev);
-       int err;
+       int err, locked;
 
-       mutex_lock(&team->lock);
+       locked = mutex_trylock(&team->lock);
        err = team_port_del(team, port_dev);
-       mutex_unlock(&team->lock);
+       if (locked)
+               mutex_unlock(&team->lock);
 
        if (err)
                return err;
--

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-03 13:44 ` Jeongjun Park
@ 2024-07-03 14:19   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-03 14:19 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com

Tested on:

commit:         e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16dbf4e1980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16fc5485980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (2 preceding siblings ...)
  2024-07-03 13:44 ` Jeongjun Park
@ 2024-07-03 14:51 ` Jeongjun Park
  2024-07-03 15:18   ` Michal Kubiak
  2024-07-03 15:51 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-03 14:51 UTC (permalink / raw)
  To: jiri
  Cc: syzbot+705c61d60b091ef42c04, davem, edumazet, kuba, linux-kernel,
	netdev, pabeni, syzkaller-bugs, Jeongjun Park

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

Deadlock occurs due to the above scenario. Therefore,
modify the code as shown in the patch below to prevent deadlock.
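
The two edges of this cycle can be read off the syzbot report:
team_add_slave() takes the team lock and then reaches wiphy.mtx through
dev_open() and ieee80211_open(), while nl80211_del_interface() holds
wiphy.mtx and reaches team_del_slave() through the netdevice notifier.
As a rough illustration only, a toy model of how such a cycle can be
detected is sketched below. The lock IDs and the helper names
(record_acquire, reachable) are hypothetical, and the real lockdep
implementation is far more elaborate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock IDs standing in for the two lock classes. */
enum { LOCK_WIPHY_MTX, LOCK_TEAM, NR_LOCKS };

/* dep[a][b] is true once some path has taken b while holding a. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is acquired while held is held".
 * Returns false (and records nothing) if the new edge would close a
 * cycle, which is the "possible circular locking dependency" case.
 */
static bool record_acquire(int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	if (reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

In this model, recording team lock -> wiphy.mtx first succeeds, and the
later attempt to record wiphy.mtx -> team lock is rejected, which
mirrors the order in which the report lists the two chains.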

Regards,
Jeongjun Park.

Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
 drivers/net/team/team_core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..3ac82df876b0 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
                          struct netlink_ext_ack *extack)
 {
        struct team *team = netdev_priv(dev);
-       int err;
+       int err, locked;
 
-       mutex_lock(&team->lock);
+       locked = mutex_trylock(&team->lock);
        err = team_port_add(team, port_dev, extack);
-       mutex_unlock(&team->lock);
+       if (locked)
+               mutex_unlock(&team->lock);
 
        if (!err)
                netdev_change_features(dev);
@@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 {
        struct team *team = netdev_priv(dev);
-       int err;
+       int err, locked;
 
-       mutex_lock(&team->lock);
+       locked = mutex_trylock(&team->lock);
        err = team_port_del(team, port_dev);
-       mutex_unlock(&team->lock);
+       if (locked)
+               mutex_unlock(&team->lock);
 
        if (err)
                return err;
--

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-03 15:18   ` Michal Kubiak
  2024-07-03 16:02     ` Jeongjun Park
  0 siblings, 1 reply; 32+ messages in thread
From: Michal Kubiak @ 2024-07-03 15:18 UTC (permalink / raw)
  To: Jeongjun Park
  Cc: jiri, syzbot+705c61d60b091ef42c04, davem, edumazet, kuba,
	linux-kernel, netdev, pabeni, syzkaller-bugs

On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
>        CPU0                    CPU1
>        ----                    ----
>   lock(&rdev->wiphy.mtx);
>                                lock(team->team_lock_key#4);
>                                lock(&rdev->wiphy.mtx);
>   lock(team->team_lock_key#4);
> 
> Deadlock occurs due to the above scenario. Therefore,
> modify the code as shown in the patch below to prevent deadlock.
> 
> Regards,
> Jeongjun Park.

The commit message should contain the patch description only (without
salutations, etc.).

> 
> Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---
>  drivers/net/team/team_core.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> index ab1935a4aa2c..3ac82df876b0 100644
> --- a/drivers/net/team/team_core.c
> +++ b/drivers/net/team/team_core.c
> @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
>                           struct netlink_ext_ack *extack)
>  {
>         struct team *team = netdev_priv(dev);
> -       int err;
> +       int err, locked;
>  
> -       mutex_lock(&team->lock);
> +       locked = mutex_trylock(&team->lock);
>         err = team_port_add(team, port_dev, extack);
> -       mutex_unlock(&team->lock);
> +       if (locked)
> +               mutex_unlock(&team->lock);

This is not a correct use of the mutex_trylock() API. Written this way,
you could just as well remove the lock from this code path entirely.
If mutex_trylock() returns false, the mutex could not be taken (because
another thread already holds it), so you must not modify the resources
that the mutex is supposed to protect. In other words, several threads
could end up modifying those resources through team_port_add() at the
same time.

>  
>         if (!err)
>                 netdev_change_features(dev);
> @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
>  static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
>  {
>         struct team *team = netdev_priv(dev);
> -       int err;
> +       int err, locked;
>  
> -       mutex_lock(&team->lock);
> +       locked = mutex_trylock(&team->lock);
>         err = team_port_del(team, port_dev);
> -       mutex_unlock(&team->lock);
> +       if (locked)
> +               mutex_unlock(&team->lock);

The same story as in case of "team_add_slave()".

>  
>         if (err)
>                 return err;
> --
> 

The patch does not look like a correct way to remove the deadlock.
Most probably the synchronization design itself needs to be reviewed.
If you really want to use the mutex_trylock() API, please consider
retrying the lock a few times, but never modify the protected resources
when the mutex has not been taken.
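
A minimal userspace sketch of that rule, for illustration only and not a
proposed fix for the team driver: the toy lock below is built on C11
atomic_flag as a stand-in for struct mutex, and every name in it
(toy_mutex, toy_team, toy_port_add, ...) is hypothetical. Like the
kernel's mutex_trylock(), toy_trylock() returns true on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mutex. */
struct toy_mutex {
	atomic_flag held;
};

/* Hypothetical stand-in for struct team: one lock, one protected field. */
struct toy_team {
	struct toy_mutex lock;
	int port_count;		/* protected by lock */
};

static void toy_team_init(struct toy_team *t)
{
	atomic_flag_clear(&t->lock.held);
	t->port_count = 0;
}

/* Returns true if the lock was taken, false if it was already held. */
static bool toy_trylock(struct toy_mutex *m)
{
	/* test_and_set returns the previous state: false means we got it. */
	return !atomic_flag_test_and_set(&m->held);
}

static void toy_unlock(struct toy_mutex *m)
{
	atomic_flag_clear(&m->held);
}

/*
 * Try to take the lock up to max_tries times. The protected field is
 * only touched while the lock is held; otherwise -EBUSY is returned
 * and nothing is modified.
 */
static int toy_port_add(struct toy_team *t, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (!toy_trylock(&t->lock))
			continue;
		t->port_count++;
		toy_unlock(&t->lock);
		return 0;
	}
	return -EBUSY;
}
```

The point of the sketch is only that the failure branch leaves
port_count alone; whether returning an error is acceptable to the
caller is a separate question.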

Thanks,
Michal



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (3 preceding siblings ...)
  2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-03 15:51 ` Jeongjun Park
  2024-07-03 16:35   ` syzbot
  2024-07-04 10:15 ` Jiri Pirko
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-03 15:51 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 drivers/net/team/team_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_add(team, port_dev, extack);
        mutex_unlock(&team->lock);
 
@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_del(team, port_dev);
        mutex_unlock(&team->lock);
 
--

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-07-03 15:18   ` Michal Kubiak
@ 2024-07-03 16:02     ` Jeongjun Park
  2024-07-03 16:30       ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-03 16:02 UTC (permalink / raw)
  To: michal.kubiak
  Cc: aha310510, davem, edumazet, jiri, kuba, linux-kernel, netdev,
	pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs

>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&rdev->wiphy.mtx);
> >                                lock(team->team_lock_key#4);
> >                                lock(&rdev->wiphy.mtx);
> >   lock(team->team_lock_key#4);
> >
> > Deadlock occurs due to the above scenario. Therefore,
> > modify the code as shown in the patch below to prevent deadlock.
> >
> > Regards,
> > Jeongjun Park.
>
> The commit message should contain the patch description only (without
> salutations, etc.).
>
> >
> > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > ---
> >  drivers/net/team/team_core.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..3ac82df876b0 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >                           struct netlink_ext_ack *extack)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> > 
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_add(team, port_dev, extack);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> This is not a correct use of the mutex_trylock() API. Written this way,
> you could just as well remove the lock from this code path entirely.
> If mutex_trylock() returns false, the mutex could not be taken (because
> another thread already holds it), so you must not modify the resources
> that the mutex is supposed to protect. In other words, several threads
> could end up modifying those resources through team_port_add() at the
> same time.
>
> > 
> >         if (!err)
> >                 netdev_change_features(dev);
> > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >  static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> > 
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_del(team, port_dev);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> The same story as in case of "team_add_slave()".
>
> > 
> >         if (err)
> >                 return err;
> > --
> >
>
> The patch does not look like a correct way to remove the deadlock.
> Most probably the synchronization design itself needs to be reviewed.
> If you really want to use the mutex_trylock() API, please consider
> retrying the lock a few times, but never modify the protected resources
> when the mutex has not been taken.
>

Thanks for your comments. I rewrote the patch based on them.
This version returns -EBUSY when the mutex cannot be taken, so the
protected resources are not modified when a race occurs. I would
appreciate your feedback on this version of the patch.

> Thanks,
> Michal
>
>

Regards,
Jeongjun Park

---
 drivers/net/team/team_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_add(team, port_dev, extack);
        mutex_unlock(&team->lock);
 
@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_del(team, port_dev);
        mutex_unlock(&team->lock);
 
--

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-07-03 16:02     ` Jeongjun Park
@ 2024-07-03 16:30       ` Eric Dumazet
  2024-07-05 15:17         ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
  2024-07-05 15:19         ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2024-07-03 16:30 UTC (permalink / raw)
  To: Jeongjun Park
  Cc: michal.kubiak, davem, jiri, kuba, linux-kernel, netdev, pabeni,
	syzbot+705c61d60b091ef42c04, syzkaller-bugs

On Wed, Jul 3, 2024 at 6:02 PM Jeongjun Park <aha310510@gmail.com> wrote:
>
> >
> > On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(&rdev->wiphy.mtx);
> > >                                lock(team->team_lock_key#4);
> > >                                lock(&rdev->wiphy.mtx);
> > >   lock(team->team_lock_key#4);
> > >
> > > Deadlock occurs due to the above scenario. Therefore,
> > > modify the code as shown in the patch below to prevent deadlock.
> > >
> > > Regards,
> > > Jeongjun Park.
> >
> > The commit message should contain the patch description only (without
> > salutations, etc.).
> >
> > >
> > > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> > > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > > ---
> > >  drivers/net/team/team_core.c | 14 ++++++++------
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > > index ab1935a4aa2c..3ac82df876b0 100644
> > > --- a/drivers/net/team/team_core.c
> > > +++ b/drivers/net/team/team_core.c
> > > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > >                           struct netlink_ext_ack *extack)
> > >  {
> > >         struct team *team = netdev_priv(dev);
> > > -       int err;
> > > +       int err, locked;
> > >
> > > -       mutex_lock(&team->lock);
> > > +       locked = mutex_trylock(&team->lock);
> > >         err = team_port_add(team, port_dev, extack);
> > > -       mutex_unlock(&team->lock);
> > > +       if (locked)
> > > +               mutex_unlock(&team->lock);
> >
> > This is not a correct use of the mutex_trylock() API. Written this way,
> > you could just as well remove the lock from this code path entirely.
> > If mutex_trylock() returns false, the mutex could not be taken (because
> > another thread already holds it), so you must not modify the resources
> > that the mutex is supposed to protect. In other words, several threads
> > could end up modifying those resources through team_port_add() at the
> > same time.
> >
> > >
> > >         if (!err)
> > >                 netdev_change_features(dev);
> > > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> > >  static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> > >  {
> > >         struct team *team = netdev_priv(dev);
> > > -       int err;
> > > +       int err, locked;
> > >
> > > -       mutex_lock(&team->lock);
> > > +       locked = mutex_trylock(&team->lock);
> > >         err = team_port_del(team, port_dev);
> > > -       mutex_unlock(&team->lock);
> > > +       if (locked)
> > > +               mutex_unlock(&team->lock);
> >
> > The same story as in case of "team_add_slave()".
> >
> > >
> > >         if (err)
> > >                 return err;
> > > --
> > >
> >
> > The patch does not look like a correct way to remove the deadlock.
> > Most probably the synchronization design itself needs to be reviewed.
> > If you really want to use the mutex_trylock() API, please consider
> > retrying the lock a few times, but never modify the protected resources
> > when the mutex has not been taken.
> >
>
> Thanks for your comments. I rewrote the patch based on them.
> This version returns -EBUSY when the mutex cannot be taken, so the
> protected resources are not modified when a race occurs. I would
> appreciate your feedback on this version of the patch.
>
> > Thanks,
> > Michal
> >
> >
>
> Regards,
> Jeongjun Park
>
> ---
>  drivers/net/team/team_core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> index ab1935a4aa2c..43d7c73b25aa 100644
> --- a/drivers/net/team/team_core.c
> +++ b/drivers/net/team/team_core.c
> @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
>         struct team *team = netdev_priv(dev);
>         int err;
>
> -       mutex_lock(&team->lock);
> +       if (!mutex_trylock(&team->lock))
> +               return -EBUSY;
>         err = team_port_add(team, port_dev, extack);
>         mutex_unlock(&team->lock);
>
> @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
>         struct team *team = netdev_priv(dev);
>         int err;
>
> -       mutex_lock(&team->lock);
> +       if (!mutex_trylock(&team->lock))
> +               return -EBUSY;
>         err = team_port_del(team, port_dev);
>         mutex_unlock(&team->lock);
>
> --

Failing team_del_slave() is not an option. It will add various issues.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-03 15:51 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-03 16:35   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-03 16:35 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com

Tested on:

commit:         e9d22f7a Merge tag 'linux_kselftest-fixes-6.10-rc7' of..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14125485980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1489b399980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (4 preceding siblings ...)
  2024-07-03 15:51 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-04 10:15 ` Jiri Pirko
  2024-07-04 10:43 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Jiri Pirko @ 2024-07-04 10:15 UTC (permalink / raw)
  To: syzbot; +Cc: davem, edumazet, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

Fri, Apr 26, 2024 at 01:59:32PM CEST, syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com wrote:
>Hello,
>
>syzbot found the following issue on:
>
>HEAD commit:    480e035fc4c7 Merge tag 'drm-next-2024-03-13' of https://gi..
>git tree:       upstream
>console+strace: https://syzkaller.appspot.com/x/log.txt?x=1662179e180000
>kernel config:  https://syzkaller.appspot.com/x/.config?x=1e5b814e91787669
>dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
>compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1058e7b9180000
>C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11919365180000
>
>Downloadable assets:
>disk image: https://storage.googleapis.com/syzbot-assets/5f73b6ef963d/disk-480e035f.raw.xz
>vmlinux: https://storage.googleapis.com/syzbot-assets/46c949396aad/vmlinux-480e035f.xz
>kernel image: https://storage.googleapis.com/syzbot-assets/e3b4d0f5a5f8/bzImage-480e035f.xz
>
>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
>
>======================================================
>WARNING: possible circular locking dependency detected
>6.8.0-syzkaller-08073-g480e035fc4c7 #0 Not tainted
>------------------------------------------------------
>syz-executor419/5074 is trying to acquire lock:
>ffff888023dc4d20 (team->team_lock_key){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>
>but task is already holding lock:
>ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
>which lock already depends on the new lock.
>
>
>the existing dependency chain (in reverse order) is:
>
>-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
>       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>       wiphy_lock include/net/cfg80211.h:5951 [inline]
>       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
>       __dev_open+0x2d3/0x450 net/core/dev.c:1430
>       dev_open+0xae/0x1b0 net/core/dev.c:1466
>       team_port_add drivers/net/team/team.c:1214 [inline]
>       team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
>       do_set_master net/core/rtnetlink.c:2685 [inline]
>       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
>       rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3185
>       rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
>       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
>       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
>       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
>       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
>       sock_sendmsg_nosec net/socket.c:730 [inline]
>       __sock_sendmsg+0x221/0x270 net/socket.c:745
>       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
>       ___sys_sendmsg net/socket.c:2638 [inline]
>       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
>       do_syscall_64+0xfb/0x240
>       entry_SYSCALL_64_after_hwframe+0x6d/0x75
>
>-> #0 (team->team_lock_key){+.+.}-{3:3}:
>       check_prev_add kernel/locking/lockdep.c:3134 [inline]
>       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>       team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
>       team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
>       notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>       call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>       call_netdevice_notifiers net/core/dev.c:2002 [inline]
>       unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
>       unregister_netdevice_many net/core/dev.c:11154 [inline]
>       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
>       unregister_netdevice include/linux/netdevice.h:3115 [inline]
>       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
>       ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
>       ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
>       rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
>       cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
>       genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
>       genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
>       genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
>       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
>       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
>       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
>       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
>       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
>       sock_sendmsg_nosec net/socket.c:730 [inline]
>       __sock_sendmsg+0x221/0x270 net/socket.c:745
>       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
>       ___sys_sendmsg net/socket.c:2638 [inline]
>       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
>       do_syscall_64+0xfb/0x240
>       entry_SYSCALL_64_after_hwframe+0x6d/0x75

Since we already rely on rtnl in lots of team code, I wonder whether we
could remove team->lock completely and convert the rest of the code to be
protected by the rtnl lock as well.



>
>other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
>       CPU0                    CPU1
>       ----                    ----
>  lock(&rdev->wiphy.mtx);
>                               lock(team->team_lock_key);
>                               lock(&rdev->wiphy.mtx);
>  lock(team->team_lock_key);
>
> *** DEADLOCK ***
>
>3 locks held by syz-executor419/5074:
> #0: ffffffff8f3f1a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1216
> #1: ffffffff8f38ce88 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0 net/wireless/nl80211.c:16401
> #2: ffff88802a210768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: nl80211_del_interface+0x11a/0x140 net/wireless/nl80211.c:4389
>
>stack backtrace:
>CPU: 1 PID: 5074 Comm: syz-executor419 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
>Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
> check_prev_add kernel/locking/lockdep.c:3134 [inline]
> check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
> __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
> team_del_slave+0x32/0x1d0 drivers/net/team/team.c:1988
> team_device_event+0x200/0x5b0 drivers/net/team/team.c:3029
> notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
> call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
> call_netdevice_notifiers net/core/dev.c:2002 [inline]
> unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
> unregister_netdevice_many net/core/dev.c:11154 [inline]
> unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
> unregister_netdevice include/linux/netdevice.h:3115 [inline]
> _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
> ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
> ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
> rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
> cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
> genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
> genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
> genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
> netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
> genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x221/0x270 net/socket.c:745
> ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
> ___sys_sendmsg net/socket.c:2638 [inline]
> __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
> do_syscall_64+0xfb/0x240
> entry_SYSCALL_64_after_hwframe+0x6d/0x75
>RIP: 0033:0x7f963cb981a9
>Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>RSP: 002b:00007ffdde1419a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>RAX: ffffffffffffffda RBX: 00007f963cbe53f6 RCX: 00007f963cb981a9
>RDX: 0000000000000000 RSI: 0000000020000400 RDI: 0000000000000004
>RBP: 00007f963cc17440 R08: 0000000000000000 R09: 0000000000000000
>R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000031
>R13: 0000000000000003 R14: 0000000000050012 R15: 00007ffdde141a02
> </TASK>
>team0: Port device wlan0 removed
>
>
>---
>This report is generated by a bot. It may contain errors.
>See https://goo.gl/tpsmEJ for more information about syzbot.
>syzbot engineers can be reached at syzkaller@googlegroups.com.
>
>syzbot will keep track of this issue. See:
>https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
>If the report is already addressed, let syzbot know by replying with:
>#syz fix: exact-commit-title
>
>If you want syzbot to run the reproducer, reply with:
>#syz test: git://repo/address.git branch-or-commit-hash
>If you attach or paste a git patch, syzbot will apply it before testing.
>
>If you want to overwrite report's subsystems, reply with:
>#syz set subsystems: new-subsystem
>(See the list of subsystem names on the web dashboard)
>
>If the report is a duplicate of another one, reply with:
>#syz dup: exact-subject-of-another-report
>
>If you want to undo deduplication, reply with:
>#syz undup


* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (5 preceding siblings ...)
  2024-07-04 10:15 ` Jiri Pirko
@ 2024-07-04 10:43 ` Jeongjun Park
  2024-07-04 10:45 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Jeongjun Park @ 2024-07-04 10:43 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

>
> On Wed, Jul 03, 2024 at 11:51:59PM +0900, Jeongjun Park wrote:
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&rdev->wiphy.mtx);
> >                                lock(team->team_lock_key#4);
> >                                lock(&rdev->wiphy.mtx);
> >   lock(team->team_lock_key#4);
> >
> > Deadlock occurs due to the above scenario. Therefore,
> > modify the code as shown in the patch below to prevent deadlock.
> >
> > Regards,
> > Jeongjun Park.
>
> The commit message should contain the patch description only (without
> salutations, etc.).
>
> >
> > Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> > Fixes: 61dc3461b954 ("team: convert overall spinlock to mutex")
> > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > ---
> >  drivers/net/team/team_core.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..3ac82df876b0 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1970,11 +1970,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >                           struct netlink_ext_ack *extack)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> > 
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_add(team, port_dev, extack);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> This is not correct usage of 'mutex_trylock()' API. In such a case you
> could as well remove the lock completely from that part of code.
> If "mutex_trylock()" returns false it means the mutex cannot be taken
> (because it was already taken by other thread), so you should not modify
> the resources that were expected to be protected by the mutex.
> In other words, there is a risk of modifying resources using
> "team_port_add()" by several threads at a time.
>
> > 
> >         if (!err)
> >                 netdev_change_features(dev);
> > @@ -1985,11 +1986,12 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >  static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> >  {
> >         struct team *team = netdev_priv(dev);
> > -       int err;
> > +       int err, locked;
> > 
> > -       mutex_lock(&team->lock);
> > +       locked = mutex_trylock(&team->lock);
> >         err = team_port_del(team, port_dev);
> > -       mutex_unlock(&team->lock);
> > +       if (locked)
> > +               mutex_unlock(&team->lock);
>
> The same story as in case of "team_add_slave()".
>
> > 
> >         if (err)
> >                 return err;
> > --
> >
>
> The patch does not seem to be a correct solution to remove a deadlock.
> Most probably a synchronization design needs an inspection.
> If you really want to use "mutex_trylock()" API, please consider several
> attempts of taking the mutex, but never modify the protected resources when
> the mutex is not taken successfully.
>

Thanks for your comments. I rewrote the patch based on them.
This version returns an error when the lock is contended, so the
protected resources are not modified in that case. I would appreciate
your feedback on this version of the patch.

> Thanks,
> Michal
>
>

Regards,
Jeongjun Park

---
 drivers/net/team/team_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..43d7c73b25aa 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_add(team, port_dev, extack);
        mutex_unlock(&team->lock);
 
@@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
        struct team *team = netdev_priv(dev);
        int err;
 
-       mutex_lock(&team->lock);
+       if (!mutex_trylock(&team->lock))
+               return -EBUSY;
        err = team_port_del(team, port_dev);
        mutex_unlock(&team->lock);
 
--
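
As a rough user-space illustration of the trylock-and-bail pattern used in
the patch above (not kernel code): the sketch assumes POSIX threads, and
struct demo_team and demo_add_port() are hypothetical names standing in for
struct team and team_add_slave(). It only shows that the protected state is
left untouched when the lock is contended; it does not show whether
returning -EBUSY is acceptable to the rtnetlink callers.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct team: one lock guarding one counter. */
struct demo_team {
	pthread_mutex_t lock;
	int nr_ports;
};

/*
 * Add a port only if the lock can be taken without blocking.
 * On contention, return -EBUSY and leave the protected state untouched.
 */
static int demo_add_port(struct demo_team *team)
{
	if (pthread_mutex_trylock(&team->lock) != 0)
		return -EBUSY;

	team->nr_ports++;	/* modified only while the lock is held */
	pthread_mutex_unlock(&team->lock);
	return 0;
}
```

With the lock free, demo_add_port() returns 0 and increments nr_ports; while
the lock is held elsewhere, it returns -EBUSY and nr_ports is unchanged.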


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (6 preceding siblings ...)
  2024-07-04 10:43 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-04 10:45 ` Jeongjun Park
  2024-07-04 16:07   ` syzbot
  2024-07-04 11:02 ` Jeongjun Park
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-04 10:45 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 drivers/net/team/team_core.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..a12366fd420c 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1142,31 +1142,37 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 	char *portname = port_dev->name;
 	int err;
 
+	rtnl_lock();
+
 	if (port_dev->flags & IFF_LOOPBACK) {
 		NL_SET_ERR_MSG(extack, "Loopback device can't be added as a team port");
 		netdev_err(dev, "Device %s is loopback device. Loopback devices can't be added as a team port\n",
 			   portname);
-		return -EINVAL;
+		err = -EINVAL;
+		goto err_out;
 	}
 
 	if (netif_is_team_port(port_dev)) {
 		NL_SET_ERR_MSG(extack, "Device is already a port of a team device");
 		netdev_err(dev, "Device %s is already a port "
 				"of a team device\n", portname);
-		return -EBUSY;
+		err = -EBUSY;
+		goto err_out;
 	}
 
 	if (dev == port_dev) {
 		NL_SET_ERR_MSG(extack, "Cannot enslave team device to itself");
 		netdev_err(dev, "Cannot enslave team device to itself\n");
-		return -EINVAL;
+		err = -EINVAL;
+		goto err_out;
 	}
 
 	if (netdev_has_upper_dev(dev, port_dev)) {
 		NL_SET_ERR_MSG(extack, "Device is already an upper device of the team interface");
 		netdev_err(dev, "Device %s is already an upper device of the team interface\n",
 			   portname);
-		return -EBUSY;
+		err = -EBUSY;
+		goto err_out;
 	}
 
 	if (port_dev->features & NETIF_F_VLAN_CHALLENGED &&
@@ -1174,7 +1180,8 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		NL_SET_ERR_MSG(extack, "Device is VLAN challenged and team device has VLAN set up");
 		netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n",
 			   portname);
-		return -EPERM;
+		err = -EPERM;
+		goto err_out;
 	}
 
 	err = team_dev_type_check_change(dev, port_dev);
@@ -1185,13 +1192,16 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		NL_SET_ERR_MSG(extack, "Device is up. Set it down before adding it as a team port");
 		netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
 			   portname);
-		return -EBUSY;
+		err = -EBUSY;
+		goto err_out;
 	}
 
 	port = kzalloc(sizeof(struct team_port) + team->mode->port_priv_size,
 		       GFP_KERNEL);
-	if (!port)
-		return -ENOMEM;
+	if (!port) {
+		err = -ENOMEM;
+		goto err_out;
+	}
 
 	port->dev = port_dev;
 	port->team = team;
@@ -1213,7 +1223,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		goto err_port_enter;
 	}
 
+	mutex_unlock(&team->lock);
 	err = dev_open(port_dev, extack);
+	mutex_lock(&team->lock);
 	if (err) {
 		netdev_dbg(dev, "Device %s opening failed\n",
 			   portname);
@@ -1292,6 +1304,7 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 
 	netdev_info(dev, "Port device %s added\n", portname);
 
+	rtnl_unlock();
 	return 0;
 
 err_set_slave_promisc:
@@ -1321,7 +1334,8 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 
 err_set_mtu:
 	kfree(port);
-
+err_out:
+	rtnl_unlock();
 	return err;
 }
 
--


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (7 preceding siblings ...)
  2024-07-04 10:45 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-04 11:02 ` Jeongjun Park
  2024-07-04 16:28   ` syzbot
  2024-07-06  4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-04 11:02 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 drivers/net/team/team_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		goto err_port_enter;
 	}
 
+	mutex_unlock(&team->lock);
 	err = dev_open(port_dev, extack);
+	mutex_lock(&team->lock);
 	if (err) {
 		netdev_dbg(dev, "Device %s opening failed\n",
 			   portname);
--


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-04 10:45 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-04 16:07   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-04 16:07 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

possible deadlock in team_add_slave

bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
============================================
WARNING: possible recursive locking detected
6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0 Not tainted
--------------------------------------------
syz-executor.0/5159 is trying to acquire lock:
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_port_add drivers/net/team/team_core.c:1145 [inline]
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_add_slave+0xdd/0x2720 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180 net/core/rtnetlink.c:6632

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(rtnl_mutex);
  lock(rtnl_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by syz-executor.0/5159:
 #0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
 #0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180 net/core/rtnetlink.c:6632
 #1: ffff88806a00cd20 (team->team_lock_key){+.+.}-{3:3}, at: team_add_slave+0xb0/0x2720 drivers/net/team/team_core.c:1989

stack backtrace:
CPU: 0 PID: 5159 Comm: syz-executor.0 Not tainted 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_deadlock kernel/locking/lockdep.c:3062 [inline]
 validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3856
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 team_port_add drivers/net/team/team_core.c:1145 [inline]
 team_add_slave+0xdd/0x2720 drivers/net/team/team_core.c:1990
 do_set_master net/core/rtnetlink.c:2701 [inline]
 do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
 __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
 rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
 rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
 netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
 netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
 __do_sys_sendto net/socket.c:2204 [inline]
 __se_sys_sendto net/socket.c:2200 [inline]
 __x64_sys_sendto+0xde/0x100 net/socket.c:2200
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe59307ed43
Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 80 3d c1 91 10 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
RSP: 002b:00007fe5932df648 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007fe593ce4620 RCX: 00007fe59307ed43
RDX: 0000000000000028 RSI: 00007fe593ce4670 RDI: 0000000000000003
RBP: 0000000000000001 R08: 00007fe5932df664 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
R13: 0000000000000000 R14: 00007fe593ce4670 R15: 0000000000000000
 </TASK>
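
The report above shows team_port_add() trying to take rtnl_mutex while
rtnetlink_rcv_msg() already holds it in the same task. A minimal user-space
sketch of that self-deadlock follows, assuming POSIX threads and using an
error-checking mutex as a stand-in so that the second lock attempt fails
instead of hanging. demo_relock() is a hypothetical name, and the kernel
mutex does not behave this way (it is lockdep that reports the problem).

```c
/* Request the POSIX.1-2008 API so PTHREAD_MUTEX_ERRORCHECK is visible. */
#define _XOPEN_SOURCE 700

#include <errno.h>
#include <pthread.h>

/*
 * Lock an error-checking mutex twice from the same thread and return
 * the result of the second attempt.
 */
static int demo_relock(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int ret;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&m, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&m);		/* outer holder, like rtnetlink_rcv_msg() */
	ret = pthread_mutex_lock(&m);	/* inner attempt, like the added rtnl_lock() */

	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);
	return ret;
}
```

With an error-checking mutex the second pthread_mutex_lock() returns EDEADLK
rather than blocking, which makes the nested acquisition visible.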


Warning: Permanently added '10.128.0.29' (ED25519) to the list of known hosts.
2024/07/04 16:06:06 ignoring optional flag "sandboxArg"="0"
2024/07/04 16:06:07 parsed 1 programs
[   63.126006][ T5090] cgroup: Unknown subsys name 'net'
[   63.411712][ T5090] cgroup: Unknown subsys name 'rlimit'
[   64.609660][ T5092] Adding 124996k swap on ./swap-file.  Priority:0 extents:1 across:124996k 
[   65.064470][   T53] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[   65.072894][   T53] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[   65.080962][   T53] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[   65.090842][   T53] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[   65.103105][   T53] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3
[   65.121545][   T53] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[   65.438608][ T1046] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   65.457546][ T1046] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   65.495887][ T2472] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   65.504409][ T2472] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   66.784406][ T5159] chnl_net:caif_netlink_parms(): no params data found
[   66.879723][ T5159] bridge0: port 1(bridge_slave_0) entered blocking state
[   66.889205][ T5159] bridge0: port 1(bridge_slave_0) entered disabled state
[   66.896897][ T5159] bridge_slave_0: entered allmulticast mode
[   66.905071][ T5159] bridge_slave_0: entered promiscuous mode
[   66.914541][ T5159] bridge0: port 2(bridge_slave_1) entered blocking state
[   66.922253][ T5159] bridge0: port 2(bridge_slave_1) entered disabled state
[   66.929529][ T5159] bridge_slave_1: entered allmulticast mode
[   66.937325][ T5159] bridge_slave_1: entered promiscuous mode
[   66.974384][ T5159] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[   66.986385][ T5159] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[   67.014300][ T5159] 
[   67.016708][ T5159] ============================================
[   67.022882][ T5159] WARNING: possible recursive locking detected
[   67.029164][ T5159] 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0 Not tainted
[   67.036959][ T5159] --------------------------------------------
[   67.043287][ T5159] syz-executor.0/5159 is trying to acquire lock:
[   67.049949][ T5159] ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: team_add_slave+0xdd/0x2720
[   67.058734][ T5159] 
[   67.058734][ T5159] but task is already holding lock:
[   67.066374][ T5159] ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180
[   67.075446][ T5159] 
[   67.075446][ T5159] other info that might help us debug this:
[   67.083669][ T5159]  Possible unsafe locking scenario:
[   67.083669][ T5159] 
[   67.091184][ T5159]        CPU0
[   67.094442][ T5159]        ----
[   67.097721][ T5159]   lock(rtnl_mutex);
[   67.101746][ T5159]   lock(rtnl_mutex);
[   67.105832][ T5159] 
[   67.105832][ T5159]  *** DEADLOCK ***
[   67.105832][ T5159] 
[   67.113981][ T5159]  May be due to missing lock nesting notation
[   67.113981][ T5159] 
[   67.122398][ T5159] 2 locks held by syz-executor.0/5159:
[   67.127935][ T5159]  #0: ffffffff8f5e6ec8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180
[   67.137584][ T5159]  #1: ffff88806a00cd20 (team->team_lock_key){+.+.}-{3:3}, at: team_add_slave+0xb0/0x2720
[   67.147698][ T5159] 
[   67.147698][ T5159] stack backtrace:
[   67.153714][ T5159] CPU: 0 PID: 5159 Comm: syz-executor.0 Not tainted 6.10.0-rc6-syzkaller-00069-g795c58e4c7fc-dirty #0
[   67.164638][ T5159] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[   67.174864][ T5159] Call Trace:
[   67.178320][ T5159]  <TASK>
[   67.181235][ T5159]  dump_stack_lvl+0x241/0x360
[   67.185997][ T5159]  ? __pfx_dump_stack_lvl+0x10/0x10
[   67.191370][ T5159]  ? print_deadlock_bug+0x479/0x620
[   67.196552][ T5159]  validate_chain+0x15d3/0x5900
[   67.201409][ T5159]  ? __pfx_validate_chain+0x10/0x10
[   67.206707][ T5159]  ? stack_trace_save+0x118/0x1d0
[   67.211816][ T5159]  ? __pfx_stack_trace_save+0x10/0x10
[   67.217296][ T5159]  ? lockdep_unlock+0x16a/0x300
[   67.222144][ T5159]  ? mark_lock+0x9a/0x350
[   67.226480][ T5159]  __lock_acquire+0x1346/0x1fd0
[   67.231609][ T5159]  lock_acquire+0x1ed/0x550
[   67.236105][ T5159]  ? team_add_slave+0xdd/0x2720
[   67.240978][ T5159]  ? __pfx_lock_acquire+0x10/0x10
[   67.246022][ T5159]  ? __pfx___might_resched+0x10/0x10
[   67.251353][ T5159]  ? __pfx___mutex_trylock_common+0x10/0x10
[   67.257275][ T5159]  __mutex_lock+0x136/0xd70
[   67.261783][ T5159]  ? team_add_slave+0xdd/0x2720
[   67.266645][ T5159]  ? team_add_slave+0xdd/0x2720
[   67.271654][ T5159]  ? __pfx___mutex_lock+0x10/0x10
[   67.276846][ T5159]  team_add_slave+0xdd/0x2720
[   67.281508][ T5159]  ? __pfx_lock_acquire+0x10/0x10
[   67.286605][ T5159]  ? deref_stack_reg+0x1c7/0x260
[   67.291535][ T5159]  ? __pfx_team_add_slave+0x10/0x10
[   67.296823][ T5159]  ? is_bpf_text_address+0x285/0x2a0
[   67.302187][ T5159]  ? is_bpf_text_address+0x26/0x2a0
[   67.307484][ T5159]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[   67.313666][ T5159]  ? kernel_text_address+0xa7/0xe0
[   67.318792][ T5159]  ? __kernel_text_address+0xd/0x40
[   67.323994][ T5159]  ? unwind_get_return_address+0x91/0xc0
[   67.329730][ T5159]  ? mutex_is_locked+0x12/0x50
[   67.334513][ T5159]  do_setlink+0xe70/0x41f0
[   67.339161][ T5159]  ? stack_trace_save+0x118/0x1d0
[   67.344184][ T5159]  ? __pfx_stack_trace_save+0x10/0x10
[   67.349548][ T5159]  ? __pfx_do_setlink+0x10/0x10
[   67.354458][ T5159]  ? __nla_validate_parse+0x26ce/0x3090
[   67.360197][ T5159]  ? kmalloc_trace_noprof+0x19c/0x2c0
[   67.365569][ T5159]  ? rtnl_newlink+0xf2/0x20a0
[   67.370328][ T5159]  ? __pfx___nla_validate_parse+0x10/0x10
[   67.376043][ T5159]  ? validate_linkmsg+0x71e/0x900
[   67.381225][ T5159]  rtnl_newlink+0x180b/0x20a0
[   67.385888][ T5159]  ? rtnl_newlink+0x4f1/0x20a0
[   67.390647][ T5159]  ? __pfx_rtnl_newlink+0x10/0x10
[   67.395719][ T5159]  ? __pfx___mutex_trylock_common+0x10/0x10
[   67.401615][ T5159]  ? rcu_is_watching+0x15/0xb0
[   67.406401][ T5159]  ? trace_contention_end+0x3c/0x120
[   67.411876][ T5159]  ? __mutex_lock+0x2ef/0xd70
[   67.416760][ T5159]  ? __pfx_lock_release+0x10/0x10
[   67.421801][ T5159]  ? __pfx_rtnl_newlink+0x10/0x10
[   67.426918][ T5159]  rtnetlink_rcv_msg+0x89b/0x1180
[   67.432098][ T5159]  ? rtnetlink_rcv_msg+0x208/0x1180
[   67.437351][ T5159]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   67.442824][ T5159]  ? is_bpf_text_address+0x285/0x2a0
[   67.448107][ T5159]  ? __pfx_validate_chain+0x10/0x10
[   67.453468][ T5159]  ? __pfx_validate_chain+0x10/0x10
[   67.458685][ T5159]  ? arch_stack_walk+0x16d/0x1b0
[   67.463679][ T5159]  ? mark_lock+0x9a/0x350
[   67.468000][ T5159]  ? __pfx_validate_chain+0x10/0x10
[   67.473274][ T5159]  ? __lock_acquire+0x1346/0x1fd0
[   67.478279][ T5159]  ? mark_lock+0x9a/0x350
[   67.482618][ T5159]  ? __lock_acquire+0x1346/0x1fd0
[   67.487750][ T5159]  netlink_rcv_skb+0x1e3/0x430
[   67.492526][ T5159]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   67.497973][ T5159]  ? __pfx_netlink_rcv_skb+0x10/0x10
[   67.503358][ T5159]  ? netlink_deliver_tap+0x2e/0x1b0
[   67.508562][ T5159]  netlink_unicast+0x7ea/0x980
[   67.513366][ T5159]  ? __pfx_netlink_unicast+0x10/0x10
[   67.518654][ T5159]  ? __virt_addr_valid+0x183/0x520
[   67.524039][ T5159]  ? __check_object_size+0x49c/0x900
[   67.529325][ T5159]  ? bpf_lsm_netlink_send+0x9/0x10
[   67.534430][ T5159]  netlink_sendmsg+0x8db/0xcb0
[   67.539209][ T5159]  ? __pfx_netlink_sendmsg+0x10/0x10
[   67.544486][ T5159]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[   67.550487][ T5159]  ? aa_sock_msg_perm+0x91/0x160
[   67.555506][ T5159]  ? bpf_lsm_socket_sendmsg+0x9/0x10
[   67.560779][ T5159]  ? security_socket_sendmsg+0x87/0xb0
[   67.566234][ T5159]  ? __pfx_netlink_sendmsg+0x10/0x10
[   67.571514][ T5159]  __sock_sendmsg+0x221/0x270
[   67.576197][ T5159]  __sys_sendto+0x3a4/0x4f0
[   67.580871][ T5159]  ? __pfx___sys_sendto+0x10/0x10
[   67.585926][ T5159]  ? lockdep_hardirqs_on_prepare+0x43d/0x780
[   67.591896][ T5159]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[   67.598304][ T5159]  __x64_sys_sendto+0xde/0x100
[   67.603056][ T5159]  do_syscall_64+0xf3/0x230
[   67.607636][ T5159]  ? clear_bhb_loop+0x35/0x90
[   67.612394][ T5159]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   67.618397][ T5159] RIP: 0033:0x7fe59307ed43
[   67.622867][ T5159] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 80 3d c1 91 10 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
[   67.642562][ T5159] RSP: 002b:00007fe5932df648 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[   67.650973][ T5159] RAX: ffffffffffffffda RBX: 00007fe593ce4620 RCX: 00007fe59307ed43
[   67.659020][ T5159] RDX: 0000000000000028 RSI: 00007fe593ce4670 RDI: 0000000000000003
[   67.666978][ T5159] RBP: 0000000000000001 R08: 00007fe5932df664 R09: 000000000000000c
[   67.675020][ T5159] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[   67.683433][ T5159] R13: 0000000000000000 R14: 00007fe593ce4670 R15: 0000000000000000
[   67.691430][ T5159]  </TASK>
[   72.393400][ T1249] ieee802154 phy0 wpan0: encryption failed: -22
[   72.399745][ T1249] ieee802154 phy1 wpan1: encryption failed: -22


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2557619195=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at edc5149ad2
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
go fmt ./sys/... >/dev/null
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"edc5149ad2ab7a38db6b3bcb1b594e0264a92163\"



Tested on:

commit:         795c58e4 Merge tag 'trace-v6.10-rc6' of git://git.kern..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=13e99bae980000



* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-04 11:02 ` Jeongjun Park
@ 2024-07-04 16:28   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-04 16:28 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com

Tested on:

commit:         795c58e4 Merge tag 'trace-v6.10-rc6' of git://git.kern..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13bf4581980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=104920a5980000

Note: testing is done by a robot and is best-effort only.


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-03 16:30       ` Eric Dumazet
@ 2024-07-05 15:17         ` Jeongjun Park
  2024-07-05 15:19         ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
  1 sibling, 0 replies; 32+ messages in thread
From: Jeongjun Park @ 2024-07-05 15:17 UTC (permalink / raw)
  To: edumazet
  Cc: aha310510, davem, jiri, kuba, linux-kernel, michal.kubiak, netdev,
	pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs

> >
> > Thanks for your comment. I rewrote the patch based on those comments.
> > This time, we modified it to return an error so that resources are not
> > modified when a race situation occurs. We would appreciate your
> > feedback on what this patch would be like.
> >
> > > Thanks,
> > > Michal
> > >
> > >
> >
> > Regards,
> > Jeongjun Park
> >
> > ---
> >  drivers/net/team/team_core.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..43d7c73b25aa 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >         struct team *team = netdev_priv(dev);
> >         int err;
> >
> > -       mutex_lock(&team->lock);
> > +       if (!mutex_trylock(&team->lock))
> > +               return -EBUSY;
> >         err = team_port_add(team, port_dev, extack);
> >         mutex_unlock(&team->lock);
> >
> > @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> >         struct team *team = netdev_priv(dev);
> >         int err;
> >
> > -       mutex_lock(&team->lock);
> > +       if (!mutex_trylock(&team->lock))
> > +               return -EBUSY;
> >         err = team_port_del(team, port_dev);
> >         mutex_unlock(&team->lock);
> >
> > --
>
> Failing team_del_slave() is not an option. It will add various issues.

Thank you for your comment.

So, how about briefly releasing the lock before calling dev_open()
in team_port_add() and then taking it again? dev_open() does not touch
the team structure, so dropping team->lock briefly will not cause any
major problems.

Regards,
Jeongjun Park

---
 drivers/net/team/team_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		goto err_port_enter;
 	}
 
+	mutex_unlock(&team->lock);
 	err = dev_open(port_dev, extack);
+	mutex_lock(&team->lock);
 	if (err) {
 		netdev_dbg(dev, "Device %s opening failed\n",
 			   portname);
--


* Re: [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-07-03 16:30       ` Eric Dumazet
  2024-07-05 15:17         ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
@ 2024-07-05 15:19         ` Jeongjun Park
  1 sibling, 0 replies; 32+ messages in thread
From: Jeongjun Park @ 2024-07-05 15:19 UTC (permalink / raw)
  To: edumazet
  Cc: aha310510, davem, jiri, kuba, linux-kernel, michal.kubiak, netdev,
	pabeni, syzbot+705c61d60b091ef42c04, syzkaller-bugs

> >
> > Thanks for your comment. I rewrote the patch based on those comments.
> > This time, we modified it to return an error so that resources are not
> > modified when a race situation occurs. We would appreciate your
> > feedback on what this patch would be like.
> >
> > > Thanks,
> > > Michal
> > >
> > >
> >
> > Regards,
> > Jeongjun Park
> >
> > ---
> >  drivers/net/team/team_core.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..43d7c73b25aa 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1972,7 +1972,8 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
> >         struct team *team = netdev_priv(dev);
> >         int err;
> >
> > -       mutex_lock(&team->lock);
> > +       if (!mutex_trylock(&team->lock))
> > +               return -EBUSY;
> >         err = team_port_add(team, port_dev, extack);
> >         mutex_unlock(&team->lock);
> >
> > @@ -1987,7 +1988,8 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> >         struct team *team = netdev_priv(dev);
> >         int err;
> >
> > -       mutex_lock(&team->lock);
> > +       if (!mutex_trylock(&team->lock))
> > +               return -EBUSY;
> >         err = team_port_del(team, port_dev);
> >         mutex_unlock(&team->lock);
> >
> > --
>
> Failing team_del_slave() is not an option. It will add various issues.

Thank you for your comment.

So, how about briefly releasing the lock before calling dev_open()
in team_port_add() and then taking it again? dev_open() does not touch
the team structure, so dropping team->lock briefly will not cause any
major problems.

Regards,
Jeongjun Park

---
 drivers/net/team/team_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		goto err_port_enter;
 	}
 
+	mutex_unlock(&team->lock);
 	err = dev_open(port_dev, extack);
+	mutex_lock(&team->lock);
 	if (err) {
 		netdev_dbg(dev, "Device %s opening failed\n",
 			   portname);
--


* [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (8 preceding siblings ...)
  2024-07-04 11:02 ` Jeongjun Park
@ 2024-07-06  4:13 ` Jeongjun Park
  2024-07-06 15:01   ` Stephen Hemminger
  2024-07-07  6:00 ` [PATCH] change list_del to list_del_init in ieee80211_remove_interfaces Jeongjun Park
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-06  4:13 UTC (permalink / raw)
  To: jiri
  Cc: syzbot+705c61d60b091ef42c04, davem, edumazet, kuba, linux-kernel,
	netdev, pabeni, syzkaller-bugs, Jeongjun Park

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

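A deadlock can occur in the scenario above: team_port_add() calls
dev_open() with team->lock held, and ieee80211_open() then takes
wiphy.mtx, while the unregister path takes wiphy.mtx first and then
team->lock in team_del_slave(). Prevent this by briefly releasing
team->lock before calling dev_open() in team_port_add() and taking it
again after dev_open() returns.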

Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
 drivers/net/team/team_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index ab1935a4aa2c..245566a1875d 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -1213,7 +1213,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
 		goto err_port_enter;
 	}
 
+	mutex_unlock(&team->lock);
 	err = dev_open(port_dev, extack);
+	mutex_lock(&team->lock);
 	if (err) {
 		netdev_dbg(dev, "Device %s opening failed\n",
 			   portname);
--
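
For readers following the lock-order argument, here is a minimal
userspace sketch of the kind of check that produces a "possible circular
locking dependency" report. It is not the kernel's lockdep
implementation. The enum names, order[] table, reachable() and
record_acquire() are hypothetical and exist only for this illustration.
Each "B taken while holding A" event is recorded as an edge A -> B, and
a new edge is refused if it would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { LOCK_WIPHY_MTX, LOCK_TEAM, NR_LOCK_CLASSES };

/* order[a][b] is true once some path has taken b while holding a. */
static bool order[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (order[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquired 'taken' while holding 'held'".
 * Returns false, without recording, if the new edge would close a cycle.
 */
static bool record_acquire(int held, int taken)
{
	bool seen[NR_LOCK_CLASSES];

	assert(held >= 0 && held < NR_LOCK_CLASSES);
	assert(taken >= 0 && taken < NR_LOCK_CLASSES);

	memset(seen, 0, sizeof(seen));
	if (reachable(taken, held, seen))
		return false;	/* taken -> ... -> held exists: inversion */
	order[held][taken] = true;
	return true;
}
```

In this model, recording (LOCK_TEAM, LOCK_WIPHY_MTX) for the
team_port_add() -> dev_open() path succeeds, and a later
(LOCK_WIPHY_MTX, LOCK_TEAM) from the unregister path is rejected, which
mirrors the CPU0/CPU1 table above.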


* Re: [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave
  2024-07-06  4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-06 15:01   ` Stephen Hemminger
  0 siblings, 0 replies; 32+ messages in thread
From: Stephen Hemminger @ 2024-07-06 15:01 UTC (permalink / raw)
  To: Jeongjun Park
  Cc: jiri, syzbot+705c61d60b091ef42c04, davem, edumazet, kuba,
	linux-kernel, netdev, pabeni, syzkaller-bugs

On Sat,  6 Jul 2024 13:13:29 +0900
Jeongjun Park <aha310510@gmail.com> wrote:

>        CPU0                    CPU1
>        ----                    ----
>   lock(&rdev->wiphy.mtx);
>                                lock(team->team_lock_key#4);
>                                lock(&rdev->wiphy.mtx);
>   lock(team->team_lock_key#4);
> 
> Deadlock occurs due to the above scenario. Therefore, you can prevent
> deadlock by briefly releasing the lock before calling dev_open() in
> team_port_add() and locking it again after it returns.
> 
> Reported-and-tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
> Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
> Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> ---

But if you drop the lock, the actual data structures might have changed
by the time you take it again. Usually not a good idea.
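
A minimal userspace sketch of that concern, assuming a pthread mutex and
a generation counter. struct guarded, guarded_remove_port() and
call_unlocked() are hypothetical names for this illustration. They are
not taken from the team driver and this is not a proposed fix. The point
is only that code which drops a lock around a call has to recheck state
after re-acquiring it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a structure protected by a single mutex. */
struct guarded {
	pthread_mutex_t lock;
	unsigned long generation;	/* bumped on every change under lock */
	int nports;
};

#define GUARDED_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0 }

/* Any writer: modifies the structure under the lock, bumps generation. */
static void guarded_remove_port(struct guarded *g)
{
	pthread_mutex_lock(&g->lock);
	if (g->nports > 0)
		g->nports--;
	g->generation++;
	pthread_mutex_unlock(&g->lock);
}

/*
 * Called with g->lock held; returns with it held.
 * Drops the lock around slow_call() and returns -EAGAIN if the structure
 * changed in the meantime, so the caller cannot act on stale state.
 */
static int call_unlocked(struct guarded *g,
			 void (*slow_call)(struct guarded *))
{
	unsigned long seen = g->generation;

	assert(slow_call != NULL);
	pthread_mutex_unlock(&g->lock);
	slow_call(g);			/* may take other locks here */
	pthread_mutex_lock(&g->lock);

	return g->generation == seen ? 0 : -EAGAIN;
}

/* A slow call that leaves the structure alone. */
static void slow_call_quiet(struct guarded *g)
{
	(void)g;
}
```

Passing guarded_remove_port as slow_call simulates a writer slipping in
while the lock is dropped, and call_unlocked() then reports -EAGAIN. A
caller that ignores that result continues with whatever it cached before
the unlock.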


* [PATCH] change list_del to list_del_init in ieee80211_remove_interfaces
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (9 preceding siblings ...)
  2024-07-06  4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
@ 2024-07-07  6:00 ` Jeongjun Park
  2024-07-07  6:23   ` [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
  2024-07-07  6:02 ` Jeongjun Park
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-07  6:00 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 net/mac80211/iface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b935bb5d8ed1..7ac4a62ed536 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -2301,7 +2301,7 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
 			ieee80211_vif_cfg_change_notify(sdata,
 							BSS_CHANGED_ARP_FILTER);
 
-		list_del(&sdata->list);
+		list_del_init(&sdata->list);
 		cfg80211_unregister_wdev(&sdata->wdev);
 
 		if (!netdev)
--


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (10 preceding siblings ...)
  2024-07-07  6:00 ` [PATCH] change list_del to list_del_init in ieee80211_remove_interfaces Jeongjun Park
@ 2024-07-07  6:02 ` Jeongjun Park
  2024-07-07  6:44   ` syzbot
  2024-07-07  6:06 ` Jeongjun Park
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-07  6:02 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 net/mac80211/iface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b935bb5d8ed1..7ac4a62ed536 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -2301,7 +2301,7 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
 			ieee80211_vif_cfg_change_notify(sdata,
 							BSS_CHANGED_ARP_FILTER);
 
-		list_del(&sdata->list);
+		list_del_init(&sdata->list);
 		cfg80211_unregister_wdev(&sdata->wdev);
 
 		if (!netdev)
--


* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (11 preceding siblings ...)
  2024-07-07  6:02 ` Jeongjun Park
@ 2024-07-07  6:06 ` Jeongjun Park
  2024-07-07  7:04   ` syzbot
  2025-05-14 13:18 ` [syzbot] " syzbot
  2025-05-16 13:55 ` syzbot
  14 siblings, 1 reply; 32+ messages in thread
From: Jeongjun Park @ 2024-07-07  6:06 UTC (permalink / raw)
  To: syzbot+705c61d60b091ef42c04; +Cc: linux-kernel, syzkaller-bugs

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

---
 net/mac80211/iface.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 7ac4a62ed536..e55b1c2654ab 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -2286,6 +2286,8 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
 	list_splice_init(&local->interfaces, &unreg_list);
 	mutex_unlock(&local->iflist_mtx);
 
+	wiphy_unlock(local->hw.wiphy);
+
 	list_for_each_entry_safe(sdata, tmp, &unreg_list, list) {
 		bool netdev = sdata->dev;
 
@@ -2307,7 +2309,6 @@ void ieee80211_remove_interfaces(struct ieee80211_local *local)
 		if (!netdev)
 			kfree(sdata);
 	}
-	wiphy_unlock(local->hw.wiphy);
 }
 
 static int netdev_notify(struct notifier_block *nb,
-- 
2.34.1



* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-07  6:00 ` [PATCH] change list_del to list_del_init in ieee80211_remove_interfaces Jeongjun Park
@ 2024-07-07  6:23   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-07  6:23 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0 Not tainted
------------------------------------------------------
kworker/u8:0/11 is trying to acquire lock:
ffff888023258d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       wiphy_lock include/net/cfg80211.h:5966 [inline]
       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
       __dev_open+0x2d3/0x450 net/core/dev.c:1472
       dev_open+0xae/0x1b0 net/core/dev.c:1508
       team_port_add drivers/net/team/team_core.c:1216 [inline]
       team_add_slave+0x9b3/0x2750 drivers/net/team/team_core.c:1976
       do_set_master net/core/rtnetlink.c:2701 [inline]
       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
       __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
       rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
       rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
       ___sys_sendmsg net/socket.c:2639 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (team->team_lock_key#4){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
       team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
       notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
       call_netdevice_notifiers net/core/dev.c:2044 [inline]
       unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
       unregister_netdevice_many net/core/dev.c:11277 [inline]
       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
       unregister_netdevice include/linux/netdevice.h:3119 [inline]
       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
       ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
       ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
       mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
       hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
       ops_exit_list net/core/net_namespace.c:173 [inline]
       cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
       process_one_work kernel/workqueue.c:3248 [inline]
       process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
       worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

 *** DEADLOCK ***

5 locks held by kworker/u8:0/11:
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
 #1: ffffc90000107d00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
 #1: ffffc90000107d00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
 #2: ffffffff8f5da590 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
 #3: ffffffff8f5e6dc8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1652
 #4: ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
 #4: ffff88801eed0768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

stack backtrace:
CPU: 1 PID: 11 Comm: kworker/u8:0 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
 team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
 call_netdevice_notifiers net/core/dev.c:2044 [inline]
 unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
 unregister_netdevice_many net/core/dev.c:11277 [inline]
 unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
 unregister_netdevice include/linux/netdevice.h:3119 [inline]
 _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
 ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
 ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
 mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
 hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
 ops_exit_list net/core/net_namespace.c:173 [inline]
 cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
 process_one_work kernel/workqueue.c:3248 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
 worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
netdevsim netdevsim1 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim0 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed


Tested on:

commit:         c6653f49 Merge tag 'powerpc-6.10-4' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1037d4a5980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1063e781980000



* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-07  6:02 ` Jeongjun Park
@ 2024-07-07  6:44   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-07  6:44 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_del_slave

bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0 Not tainted
------------------------------------------------------
kworker/u8:5/1042 is trying to acquire lock:
ffff88802e894d20 (team->team_lock_key#4){+.+.}-{3:3}, at: team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990

but task is already holding lock:
ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&rdev->wiphy.mtx){+.+.}-{3:3}:
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       wiphy_lock include/net/cfg80211.h:5966 [inline]
       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
       __dev_open+0x2d3/0x450 net/core/dev.c:1472
       dev_open+0xae/0x1b0 net/core/dev.c:1508
       team_port_add drivers/net/team/team_core.c:1216 [inline]
       team_add_slave+0x9b3/0x2750 drivers/net/team/team_core.c:1976
       do_set_master net/core/rtnetlink.c:2701 [inline]
       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
       __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
       rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
       rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
       ___sys_sendmsg net/socket.c:2639 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (team->team_lock_key#4){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
       team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
       notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
       call_netdevice_notifiers net/core/dev.c:2044 [inline]
       unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
       unregister_netdevice_many net/core/dev.c:11277 [inline]
       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
       unregister_netdevice include/linux/netdevice.h:3119 [inline]
       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
       ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
       ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
       mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
       hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
       ops_exit_list net/core/net_namespace.c:173 [inline]
       cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
       process_one_work kernel/workqueue.c:3248 [inline]
       process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
       worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#4);
                               lock(&rdev->wiphy.mtx);
  lock(team->team_lock_key#4);

 *** DEADLOCK ***

5 locks held by kworker/u8:5/1042:
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3223 [inline]
 #0: ffff888015ed5948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3329
 #1: ffffc900041ffd00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3224 [inline]
 #1: ffffc900041ffd00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3329
 #2: ffffffff8f5da590 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
 #3: ffffffff8f5e6dc8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_unregister_hw+0x55/0x2c0 net/mac80211/main.c:1652
 #4: ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5966 [inline]
 #4: ffff88807e3f8768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_remove_interfaces+0x12b/0x6f0 net/mac80211/iface.c:2280

stack backtrace:
CPU: 0 PID: 1042 Comm: kworker/u8:5 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 team_del_slave+0x32/0x1d0 drivers/net/team/team_core.c:1990
 team_device_event+0x200/0x5b0 drivers/net/team/team_core.c:2984
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
 call_netdevice_notifiers net/core/dev.c:2044 [inline]
 unregister_netdevice_many_notify+0xd75/0x16b0 net/core/dev.c:11219
 unregister_netdevice_many net/core/dev.c:11277 [inline]
 unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11156
 unregister_netdevice include/linux/netdevice.h:3119 [inline]
 _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
 ieee80211_remove_interfaces+0x4cd/0x6f0 net/mac80211/iface.c:2305
 ieee80211_unregister_hw+0x5d/0x2c0 net/mac80211/main.c:1659
 mac80211_hwsim_del_radio+0x2c2/0x4c0 drivers/net/wireless/virtual/mac80211_hwsim.c:5576
 hwsim_exit_net+0x5c1/0x670 drivers/net/wireless/virtual/mac80211_hwsim.c:6453
 ops_exit_list net/core/net_namespace.c:173 [inline]
 cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
 process_one_work kernel/workqueue.c:3248 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
 worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim1 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
netdevsim netdevsim3 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bridge_slave_1: left allmulticast mode
bridge_slave_1: left promiscuous mode
bridge0: port 2(bridge_slave_1) entered disabled state
bridge_slave_0: left allmulticast mode
bridge_slave_0: left promiscuous mode
bridge0: port 1(bridge_slave_0) entered disabled state
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
team0: Port device wlan1 removed
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
hsr_slave_0: left promiscuous mode
hsr_slave_1: left promiscuous mode
batman_adv: batadv0: Interface deactivated: batadv_slave_0
batman_adv: batadv0: Removing interface: batadv_slave_0
batman_adv: batadv0: Interface deactivated: batadv_slave_1
batman_adv: batadv0: Removing interface: batadv_slave_1
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
veth1_macvtap: left promiscuous mode
veth0_macvtap: left promiscuous mode
veth1_vlan: left promiscuous mode
veth0_vlan: left promiscuous mode
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed
team0 (unregistering): Port device team_slave_1 removed
team0 (unregistering): Port device team_slave_0 removed


Tested on:

commit:         c6653f49 Merge tag 'powerpc-6.10-4' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14a88376980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=12446f81980000
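
The lockdep report above boils down to two code paths taking the same pair of locks in opposite order: chain entry #1 records wiphy.mtx being taken under the team lock (team_add_slave -> dev_open -> ieee80211_open), and entry #0 is the attempt to take the team lock while wiphy.mtx is held (ieee80211_remove_interfaces -> ... -> team_del_slave). Below is a minimal user-space sketch of that check. It is not lockdep's algorithm: the tracker, toy_acquire(), toy_release() and the two path_*() helpers are hypothetical names made up for illustration, and only a direct two-lock inversion is detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two lock classes named in the report. */
enum lock_class { TEAM_LOCK, WIPHY_MTX, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = {
	"team->team_lock_key", "&rdev->wiphy.mtx",
};

/* depends[a][b] becomes true once b was acquired while a was held. */
static bool depends[NR_CLASSES][NR_CLASSES];
static bool held[NR_CLASSES];

/* Returns false if taking 'c' now would invert a recorded order. */
static bool toy_acquire(enum lock_class c)
{
	assert(c < NR_CLASSES);
	for (int h = 0; h < NR_CLASSES; h++) {
		if (!held[h] || h == (int)c)
			continue;
		if (depends[c][h]) {
			/* 'h' was previously taken under 'c': reverse order. */
			printf("possible circular dependency: %s held, taking %s\n",
			       class_name[h], class_name[c]);
			return false;
		}
		depends[h][c] = true;
	}
	held[c] = true;
	return true;
}

static void toy_release(enum lock_class c)
{
	held[c] = false;
}

/* Models chain entry #1: team lock first, then wiphy.mtx. */
bool path_add_port(void)
{
	bool ok = toy_acquire(TEAM_LOCK) && toy_acquire(WIPHY_MTX);

	toy_release(WIPHY_MTX);
	toy_release(TEAM_LOCK);
	return ok;
}

/* Models chain entry #0: wiphy.mtx first, then team lock. */
bool path_remove_iface(void)
{
	bool ok = toy_acquire(WIPHY_MTX) && toy_acquire(TEAM_LOCK);

	toy_release(TEAM_LOCK);
	toy_release(WIPHY_MTX);
	return ok;
}
```

Calling path_add_port() first records the team -> wiphy order; a later path_remove_iface() is then flagged, even though the two paths never ran concurrently. That mirrors why lockdep warns about a "possible" deadlock rather than an observed one.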


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
  2024-07-07  6:06 ` Jeongjun Park
@ 2024-07-07  7:04   ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2024-07-07  7:04 UTC (permalink / raw)
  To: aha310510, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

. Setting the MTU to 1560 would solve the problem.
[   65.749523][ T5188] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   65.794737][ T5188] hsr_slave_0: entered promiscuous mode
[   65.800990][ T5188] hsr_slave_1: entered promiscuous mode
[   65.807988][ T5188] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[   65.816311][ T5188] Cannot create hsr debugfs directory
[   65.949582][ T5188] netdevsim netdevsim2 netdevsim0: renamed from eth0
[   65.959631][ T5188] netdevsim netdevsim2 netdevsim1: renamed from eth1
[   65.969541][ T5188] netdevsim netdevsim2 netdevsim2: renamed from eth2
[   65.979136][ T5188] netdevsim netdevsim2 netdevsim3: renamed from eth3
[   66.059668][ T5188] 8021q: adding VLAN 0 to HW filter on device bond0
[   66.079370][ T5188] 8021q: adding VLAN 0 to HW filter on device team0
[   66.091102][ T5170] bridge0: port 1(bridge_slave_0) entered blocking state
[   66.098400][ T5170] bridge0: port 1(bridge_slave_0) entered forwarding state
[   66.116765][ T5170] bridge0: port 2(bridge_slave_1) entered blocking state
[   66.124047][ T5170] bridge0: port 2(bridge_slave_1) entered forwarding state
[   66.159855][ T5188] hsr0: Slave A (hsr_slave_0) is not up; please bring it up to get a fully working HSR network
[   66.170938][ T5188] hsr0: Slave B (hsr_slave_1) is not up; please bring it up to get a fully working HSR network
[   66.209690][ T5188] 8021q: adding VLAN 0 to HW filter on device batadv0
[   66.250323][ T5188] veth0_vlan: entered promiscuous mode
[   66.267593][ T5188] veth1_vlan: entered promiscuous mode
[   66.295077][ T5188] veth0_macvtap: entered promiscuous mode
[   66.305000][ T5188] veth1_macvtap: entered promiscuous mode
[   66.321220][ T5188] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_0
[   66.334443][ T5188] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[   66.347433][ T5188] batman_adv: batadv0: Interface activated: batadv_slave_0
[   66.364590][ T5188] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3f) already exists on: batadv_slave_1
[   66.375515][ T5188] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[   66.387573][ T5188] batman_adv: batadv0: Interface activated: batadv_slave_1
[   66.399784][ T5188] netdevsim netdevsim2 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[   66.408658][ T5188] netdevsim netdevsim2 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[   66.417776][ T5188] netdevsim netdevsim2 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[   66.427420][ T5188] netdevsim netdevsim2 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[   66.499007][  T134] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   66.507906][  T134] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   66.537892][   T51] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   66.546807][   T51] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   69.815443][   T12] bridge_slave_1: left allmulticast mode
[   69.821395][   T12] bridge_slave_1: left promiscuous mode
[   69.844241][   T12] bridge0: port 2(bridge_slave_1) entered disabled state
[   69.859472][   T12] bridge_slave_0: left allmulticast mode
[   69.867932][   T12] bridge_slave_0: left promiscuous mode
[   69.873969][   T12] bridge0: port 1(bridge_slave_0) entered disabled state
[   70.095084][   T12] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[   70.107948][   T12] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[   70.118543][   T12] bond0 (unregistering): Released all slaves
[   70.239777][   T12] hsr_slave_0: left promiscuous mode
[   70.246490][   T12] hsr_slave_1: left promiscuous mode
[   70.255808][   T12] batman_adv: batadv0: Interface deactivated: batadv_slave_0
[   70.265531][   T12] batman_adv: batadv0: Removing interface: batadv_slave_0
[   70.283319][   T12] batman_adv: batadv0: Interface deactivated: batadv_slave_1
[   70.290777][   T12] batman_adv: batadv0: Removing interface: batadv_slave_1
[   70.316399][   T12] veth1_macvtap: left promiscuous mode
[   70.323105][   T12] veth0_macvtap: left promiscuous mode
[   70.328757][   T12] veth1_vlan: left promiscuous mode
[   70.335500][   T12] veth0_vlan: left promiscuous mode
[   70.689807][   T12] team0 (unregistering): Port device team_slave_1 removed
[   70.718877][   T12] team0 (unregistering): Port device team_slave_0 removed
[   71.258531][   T12] netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   71.857737][ T1248] ieee802154 phy0 wpan0: encryption failed: -22
[   71.871811][ T1248] ieee802154 phy1 wpan1: encryption failed: -22
[   72.118696][   T12] netdevsim netdevsim2 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   72.177872][   T12] netdevsim netdevsim2 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   72.269874][   T12] netdevsim netdevsim2 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   72.413143][   T12] bridge_slave_1: left allmulticast mode
[   72.418935][   T12] bridge_slave_1: left promiscuous mode
[   72.427494][   T12] bridge0: port 2(bridge_slave_1) entered disabled state
[   72.440417][   T12] bridge_slave_0: left allmulticast mode
[   72.448397][   T12] bridge_slave_0: left promiscuous mode
[   72.455186][   T12] bridge0: port 1(bridge_slave_0) entered disabled state
[   72.738209][   T12] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[   72.749757][   T12] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[   72.760471][   T12] bond0 (unregistering): Released all slaves
[   73.080824][   T12] ------------[ cut here ]------------
[   73.086430][   T12] WARNING: CPU: 0 PID: 12 at net/wireless/core.c:1197 _cfg80211_unregister_wdev+0x46d/0x560
[   73.096902][   T12] Modules linked in:
[   73.100847][   T12] CPU: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
[   73.111921][   T12] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[   73.122420][   T12] Workqueue: netns cleanup_net
[   73.127239][   T12] RIP: 0010:_cfg80211_unregister_wdev+0x46d/0x560
[   73.134251][   T12] Code: 0f b6 04 38 84 c0 0f 85 ec 00 00 00 41 80 65 00 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 04 07 c5 f6 90 <0f> 0b 90 e9 61 fc ff ff e8 f6 06 c5 f6 c6 05 2c ac c6 04 01 90 48
[   73.154633][   T12] RSP: 0018:ffffc90000117798 EFLAGS: 00010293
[   73.160724][   T12] RAX: ffffffff8ad120ac RBX: 0000000000000000 RCX: ffff8880176c5a00
[   73.169098][   T12] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   73.177679][   T12] RBP: ffff888068618000 R08: ffffffff8ad11d02 R09: 1ffffffff1ebcdac
[   73.186247][   T12] R10: dffffc0000000000 R11: fffffbfff1ebcdad R12: 0000000000000001
[   73.194767][   T12] R13: ffff88801cf7ccb0 R14: ffff888068618700 R15: dffffc0000000000
[   73.203072][   T12] FS:  0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
[   73.212323][   T12] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   73.218895][   T12] CR2: 000055f5b8852950 CR3: 000000000e132000 CR4: 00000000003506f0
[   73.227244][   T12] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   73.235570][   T12] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   73.243988][   T12] Call Trace:
[   73.247277][   T12]  <TASK>
[   73.250193][   T12]  ? __warn+0x163/0x4e0
[   73.254816][   T12]  ? _cfg80211_unregister_wdev+0x46d/0x560
[   73.260665][   T12]  ? report_bug+0x2b3/0x500
[   73.265522][   T12]  ? _cfg80211_unregister_wdev+0x46d/0x560
[   73.271592][   T12]  ? handle_bug+0x3e/0x70
[   73.276016][   T12]  ? exc_invalid_op+0x1a/0x50
[   73.280956][   T12]  ? asm_exc_invalid_op+0x1a/0x20
[   73.286396][   T12]  ? _cfg80211_unregister_wdev+0xc2/0x560
[   73.292211][   T12]  ? _cfg80211_unregister_wdev+0x46c/0x560
[   73.298035][   T12]  ? _cfg80211_unregister_wdev+0x46d/0x560
[   73.303878][   T12]  ? _cfg80211_unregister_wdev+0x46c/0x560
[   73.309708][   T12]  ieee80211_remove_interfaces+0x525/0x720
[   73.315641][   T12]  ? ieee80211_unregister_hw+0x55/0x2c0
[   73.321227][   T12]  ? __pfx_ieee80211_remove_interfaces+0x10/0x10
[   73.327613][   T12]  ieee80211_unregister_hw+0x5d/0x2c0
[   73.333058][   T12]  mac80211_hwsim_del_radio+0x2c2/0x4c0
[   73.338612][   T12]  ? __pfx_mac80211_hwsim_del_radio+0x10/0x10
[   73.344761][   T12]  hwsim_exit_net+0x5c1/0x670
[   73.349450][   T12]  ? __pfx_hwsim_exit_net+0x10/0x10
[   73.354809][   T12]  ? __ip_vs_dev_cleanup_batch+0x239/0x260
[   73.360650][   T12]  cleanup_net+0x802/0xcc0
[   73.365129][   T12]  ? __pfx_cleanup_net+0x10/0x10
[   73.370092][   T12]  ? process_scheduled_works+0x945/0x1830
[   73.375946][   T12]  process_scheduled_works+0xa2c/0x1830
[   73.381548][   T12]  ? __pfx_process_scheduled_works+0x10/0x10
[   73.387971][   T12]  ? assign_work+0x364/0x3d0
[   73.392920][   T12]  worker_thread+0x86d/0xd50
[   73.397535][   T12]  ? __kthread_parkme+0x169/0x1d0
[   73.402621][   T12]  ? __pfx_worker_thread+0x10/0x10
[   73.407745][   T12]  kthread+0x2f0/0x390
[   73.411959][   T12]  ? __pfx_worker_thread+0x10/0x10
[   73.417079][   T12]  ? __pfx_kthread+0x10/0x10
[   73.421677][   T12]  ret_from_fork+0x4b/0x80
[   73.426151][   T12]  ? __pfx_kthread+0x10/0x10
[   73.430832][   T12]  ret_from_fork_asm+0x1a/0x30
[   73.435683][   T12]  </TASK>
[   73.438739][   T12] Kernel panic - not syncing: kernel: panic_on_warn set ...
[   73.446004][   T12] CPU: 0 PID: 12 Comm: kworker/u8:1 Not tainted 6.10.0-rc6-syzkaller-00223-gc6653f49e4fd-dirty #0
[   73.456604][   T12] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[   73.466655][   T12] Workqueue: netns cleanup_net
[   73.471414][   T12] Call Trace:
[   73.474709][   T12]  <TASK>
[   73.477655][   T12]  dump_stack_lvl+0x241/0x360
[   73.482346][   T12]  ? __pfx_dump_stack_lvl+0x10/0x10
[   73.487561][   T12]  ? __pfx__printk+0x10/0x10
[   73.492183][   T12]  ? _printk+0xd5/0x120
[   73.496355][   T12]  ? vscnprintf+0x5d/0x90
[   73.500704][   T12]  panic+0x349/0x860
[   73.504805][   T12]  ? __warn+0x172/0x4e0
[   73.509075][   T12]  ? __pfx_panic+0x10/0x10
[   73.513494][   T12]  ? show_trace_log_lvl+0x4e6/0x520
[   73.518728][   T12]  ? ret_from_fork_asm+0x1a/0x30
[   73.523686][   T12]  __warn+0x346/0x4e0
[   73.527674][   T12]  ? _cfg80211_unregister_wdev+0x46d/0x560
[   73.533569][   T12]  report_bug+0x2b3/0x500
[   73.537892][   T12]  ? _cfg80211_unregister_wdev+0x46d/0x560
[   73.543696][   T12]  handle_bug+0x3e/0x70
[   73.547850][   T12]  exc_invalid_op+0x1a/0x50
[   73.552387][   T12]  asm_exc_invalid_op+0x1a/0x20
[   73.557264][   T12] RIP: 0010:_cfg80211_unregister_wdev+0x46d/0x560
[   73.563690][   T12] Code: 0f b6 04 38 84 c0 0f 85 ec 00 00 00 41 80 65 00 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 04 07 c5 f6 90 <0f> 0b 90 e9 61 fc ff ff e8 f6 06 c5 f6 c6 05 2c ac c6 04 01 90 48
[   73.583293][   T12] RSP: 0018:ffffc90000117798 EFLAGS: 00010293
[   73.589358][   T12] RAX: ffffffff8ad120ac RBX: 0000000000000000 RCX: ffff8880176c5a00
[   73.597318][   T12] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   73.605284][   T12] RBP: ffff888068618000 R08: ffffffff8ad11d02 R09: 1ffffffff1ebcdac
[   73.613250][   T12] R10: dffffc0000000000 R11: fffffbfff1ebcdad R12: 0000000000000001
[   73.621239][   T12] R13: ffff88801cf7ccb0 R14: ffff888068618700 R15: dffffc0000000000
[   73.629242][   T12]  ? _cfg80211_unregister_wdev+0xc2/0x560
[   73.635048][   T12]  ? _cfg80211_unregister_wdev+0x46c/0x560
[   73.640870][   T12]  ? _cfg80211_unregister_wdev+0x46c/0x560
[   73.646705][   T12]  ieee80211_remove_interfaces+0x525/0x720
[   73.652529][   T12]  ? ieee80211_unregister_hw+0x55/0x2c0
[   73.658195][   T12]  ? __pfx_ieee80211_remove_interfaces+0x10/0x10
[   73.664568][   T12]  ieee80211_unregister_hw+0x5d/0x2c0
[   73.669955][   T12]  mac80211_hwsim_del_radio+0x2c2/0x4c0
[   73.675512][   T12]  ? __pfx_mac80211_hwsim_del_radio+0x10/0x10
[   73.681583][   T12]  hwsim_exit_net+0x5c1/0x670
[   73.686317][   T12]  ? __pfx_hwsim_exit_net+0x10/0x10
[   73.691541][   T12]  ? __ip_vs_dev_cleanup_batch+0x239/0x260
[   73.697357][   T12]  cleanup_net+0x802/0xcc0
[   73.701800][   T12]  ? __pfx_cleanup_net+0x10/0x10
[   73.706737][   T12]  ? process_scheduled_works+0x945/0x1830
[   73.712531][   T12]  process_scheduled_works+0xa2c/0x1830
[   73.718085][   T12]  ? __pfx_process_scheduled_works+0x10/0x10
[   73.724060][   T12]  ? assign_work+0x364/0x3d0
[   73.728641][   T12]  worker_thread+0x86d/0xd50
[   73.733244][   T12]  ? __kthread_parkme+0x169/0x1d0
[   73.738257][   T12]  ? __pfx_worker_thread+0x10/0x10
[   73.743365][   T12]  kthread+0x2f0/0x390
[   73.747424][   T12]  ? __pfx_worker_thread+0x10/0x10
[   73.752533][   T12]  ? __pfx_kthread+0x10/0x10
[   73.757113][   T12]  ret_from_fork+0x4b/0x80
[   73.761518][   T12]  ? __pfx_kthread+0x10/0x10
[   73.766100][   T12]  ret_from_fork_asm+0x1a/0x30
[   73.770872][   T12]  </TASK>
[   73.774152][   T12] Kernel Offset: disabled
[   73.778533][   T12] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.4'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2801630060=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at edc5149ad2
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
go fmt ./sys/... >/dev/null
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=edc5149ad2ab7a38db6b3bcb1b594e0264a92163 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20240621-090414'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"edc5149ad2ab7a38db6b3bcb1b594e0264a92163\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=10e59fbe980000


Tested on:

commit:         c6653f49 Merge tag 'powerpc-6.10-4' of git://git.kerne..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=864caee5f78cab51
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=116cb7c1980000


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [syzbot] Re: possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (12 preceding siblings ...)
  2024-07-07  6:06 ` Jeongjun Park
@ 2025-05-14 13:18 ` syzbot
  2025-05-16 13:55 ` syzbot
  14 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2025-05-14 13:18 UTC (permalink / raw)
  To: linux-kernel

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org.

***

Subject: Re: possible deadlock in team_del_slave (3)
Author: penguin-kernel@i-love.sakura.ne.jp

#syz test

diff --git a/net/wireless/core.c b/net/wireless/core.c
index dcce326fdb8c..ad812d4be773 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1236,6 +1236,21 @@ void wiphy_rfkill_set_hw_state_reason(struct wiphy *wiphy, bool blocked,
 }
 EXPORT_SYMBOL(wiphy_rfkill_set_hw_state_reason);
 
+struct netdev_unregister_work {
+	struct work_struct work;
+	struct net_device *netdev;
+};
+
+static void cfg80211_unregister_netdevice_work(struct work_struct *work)
+{
+	struct netdev_unregister_work *w = container_of(work, struct netdev_unregister_work, work);
+
+	rtnl_lock();
+	unregister_netdevice(w->netdev);
+	rtnl_unlock();
+	kfree(w);
+}
+
 static void _cfg80211_unregister_wdev(struct wireless_dev *wdev,
 				      bool unregister_netdev)
 {
@@ -1252,8 +1267,14 @@ static void _cfg80211_unregister_wdev(struct wireless_dev *wdev,
 
 	if (wdev->netdev) {
 		sysfs_remove_link(&wdev->netdev->dev.kobj, "phy80211");
-		if (unregister_netdev)
-			unregister_netdevice(wdev->netdev);
+		if (unregister_netdev) {
+			struct netdev_unregister_work *w
+				= kmalloc(sizeof(*w), GFP_KERNEL | __GFP_NOFAIL);
+
+			INIT_WORK(&w->work, cfg80211_unregister_netdevice_work);
+			w->netdev = wdev->netdev;
+			schedule_work(&w->work);
+		}
 	}
 
 	list_del_rcu(&wdev->list);



* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
       [not found] <80fd38b6-6cb2-4470-8531-60ee0e332787@I-love.SAKURA.ne.jp>
@ 2025-05-14 19:51 ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2025-05-14 19:51 UTC (permalink / raw)
  To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

length: 249 > 9
[   91.962498][ T5129] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[   91.974784][ T5129] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[   91.983891][ T5129] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[   92.271242][ T5925] chnl_net:caif_netlink_parms(): no params data found
[   92.341639][ T5925] bridge0: port 1(bridge_slave_0) entered blocking state
[   92.349003][ T5925] bridge0: port 1(bridge_slave_0) entered disabled state
[   92.356210][ T5925] bridge_slave_0: entered allmulticast mode
[   92.364780][ T5925] bridge_slave_0: entered promiscuous mode
[   92.377104][ T5925] bridge0: port 2(bridge_slave_1) entered blocking state
[   92.384348][ T5925] bridge0: port 2(bridge_slave_1) entered disabled state
[   92.392377][ T5925] bridge_slave_1: entered allmulticast mode
[   92.400665][ T5925] bridge_slave_1: entered promiscuous mode
[   92.433442][ T5925] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[   92.444942][ T5925] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[   92.480353][ T5925] team0: Port device team_slave_0 added
[   92.489803][ T5925] team0: Port device team_slave_1 added
[   92.522393][ T5925] batman_adv: batadv0: Adding interface: batadv_slave_0
[   92.529430][ T5925] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   92.556059][ T5925] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[   92.569051][ T5925] batman_adv: batadv0: Adding interface: batadv_slave_1
[   92.576693][ T5925] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1560 would solve the problem.
[   92.602926][ T5925] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   92.658607][ T5925] hsr_slave_0: entered promiscuous mode
[   92.665128][ T5925] hsr_slave_1: entered promiscuous mode
[   92.671651][ T5925] debugfs: Directory 'hsr0' with parent 'hsr' already present!
[   92.680672][ T5925] Cannot create hsr debugfs directory
[   92.835258][ T5925] netdevsim netdevsim0 netdevsim0: renamed from eth0
[   92.846166][ T5925] netdevsim netdevsim0 netdevsim1: renamed from eth1
[   92.857198][ T5925] netdevsim netdevsim0 netdevsim2: renamed from eth2
[   92.867926][ T5925] netdevsim netdevsim0 netdevsim3: renamed from eth3
[   92.902977][ T5925] bridge0: port 2(bridge_slave_1) entered blocking state
[   92.910414][ T5925] bridge0: port 2(bridge_slave_1) entered forwarding state
[   92.918452][ T5925] bridge0: port 1(bridge_slave_0) entered blocking state
[   92.925571][ T5925] bridge0: port 1(bridge_slave_0) entered forwarding state
[   92.983142][ T5925] 8021q: adding VLAN 0 to HW filter on device bond0
[   93.000293][ T3504] bridge0: port 1(bridge_slave_0) entered disabled state
[   93.009888][ T3504] bridge0: port 2(bridge_slave_1) entered disabled state
[   93.024748][ T5925] 8021q: adding VLAN 0 to HW filter on device team0
[   93.040019][ T4982] bridge0: port 1(bridge_slave_0) entered blocking state
[   93.047149][ T4982] bridge0: port 1(bridge_slave_0) entered forwarding state
[   93.063879][ T3504] bridge0: port 2(bridge_slave_1) entered blocking state
[   93.071208][ T3504] bridge0: port 2(bridge_slave_1) entered forwarding state
[   93.115899][ T5925] hsr0: Slave B (hsr_slave_1) is not up; please bring it up to get a fully working HSR network
[   93.259512][ T5925] 8021q: adding VLAN 0 to HW filter on device batadv0
[   93.305822][ T5925] veth0_vlan: entered promiscuous mode
[   93.319389][ T5925] veth1_vlan: entered promiscuous mode
[   93.348544][ T5925] veth0_macvtap: entered promiscuous mode
[   93.359245][ T5925] veth1_macvtap: entered promiscuous mode
[   93.378814][ T5925] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3e) already exists on: batadv_slave_0
[   93.391022][ T5925] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[   93.403633][ T5925] batman_adv: batadv0: Interface activated: batadv_slave_0
[   93.417993][ T5925] batman_adv: The newly added mac address (aa:aa:aa:aa:aa:3f) already exists on: batadv_slave_1
[   93.429061][ T5925] batman_adv: It is strongly recommended to keep mac addresses unique to avoid problems!
[   93.440932][ T5925] batman_adv: batadv0: Interface activated: batadv_slave_1
[   93.455386][ T5925] netdevsim netdevsim0 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[   93.464709][ T5925] netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[   93.473626][ T5925] netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[   93.482664][ T5925] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[   93.554050][   T53] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   93.567636][   T53] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   93.601893][ T3504] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   93.610487][ T3504] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   94.005771][   T24] ------------[ cut here ]------------
[   94.011540][   T24] WARNING: CPU: 1 PID: 24 at net/wireless/core.c:1759 wiphy_delayed_work_cancel+0x8a/0xb0
[   94.021614][   T24] Modules linked in:
[   94.025672][   T24] CPU: 1 UID: 0 PID: 24 Comm: kworker/1:0 Not tainted 6.15.0-rc6-syzkaller-g1a80a098c606-dirty #0 PREEMPT(full) 
[   94.037920][   T24] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
[   94.048197][   T24] Workqueue: events cfg80211_unregister_netdevice_work
[   94.055274][   T24] RIP: 0010:wiphy_delayed_work_cancel+0x8a/0xb0
[   94.061715][   T24] Code: e8 8b ce 1f f7 eb 05 e8 84 ce 1f f7 48 8d 7b 20 e8 7b 92 0d f7 4c 89 f7 48 89 de 5b 41 5e 5d e9 ec f5 ff ff e8 67 ce 1f f7 90 <0f> 0b 90 eb dd 48 c7 c1 50 01 7e 8f 80 e1 07 80 c1 03 38 c1 7c 90
[   94.082068][   T24] RSP: 0018:ffffc900001e7740 EFLAGS: 00010293
[   94.088498][   T24] RAX: ffffffff8aa02299 RBX: ffff88806a2b96a8 RCX: ffff88801de80000
[   94.096806][   T24] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   94.104806][   T24] RBP: 0000000000000000 R08: ffffffff8f7dd277 R09: 1ffffffff1efba4e
[   94.113238][   T24] R10: dffffc0000000000 R11: ffffffff8ac5eb30 R12: ffff88806a2b8d80
[   94.121555][   T24] R13: 1ffff1100fab81d0 R14: ffff88807d5c0700 R15: ffffc900001e77a0
[   94.130303][   T24] FS:  0000000000000000(0000) GS:ffff8881261fb000(0000) knlGS:0000000000000000
[   94.139304][   T24] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   94.145902][   T24] CR2: 000000c003a86000 CR3: 0000000032fcc000 CR4: 00000000003526f0
[   94.154171][   T24] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   94.162228][   T24] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   94.170393][   T24] Call Trace:
[   94.173704][   T24]  <TASK>
[   94.176763][   T24]  ieee80211_free_keys+0xff/0x650
[   94.181832][   T24]  ? __pfx_ieee80211_free_keys+0x10/0x10
[   94.187557][   T24]  ? ip6_route_dev_notify+0x9a/0x5b0
[   94.192880][   T24]  ? notifier_call_chain+0x3bf/0x3e0
[   94.198251][   T24]  ieee80211_teardown_sdata+0x52/0x140
[   94.203758][   T24]  ? __pfx_ieee80211_uninit+0x10/0x10
[   94.209438][   T24]  unregister_netdevice_many_notify+0x1c34/0x2330
[   94.215922][   T24]  ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[   94.222857][   T24]  ? rcu_is_watching+0x15/0xb0
[   94.228043][   T24]  ? __mutex_lock+0xa6d/0xe80
[   94.232839][   T24]  ? __mutex_lock+0x51b/0xe80
[   94.237793][   T24]  ? cfg80211_unregister_netdevice_work+0x12/0x50
[   94.244253][   T24]  unregister_netdevice_queue+0x33c/0x380
[   94.250165][   T24]  ? __pfx_unregister_netdevice_queue+0x10/0x10
[   94.256526][   T24]  ? _raw_spin_unlock_irq+0x23/0x50
[   94.261863][   T24]  cfg80211_unregister_netdevice_work+0x3d/0x50
[   94.268224][   T24]  ? process_scheduled_works+0x9ec/0x17a0
[   94.274067][   T24]  process_scheduled_works+0xadb/0x17a0
[   94.279838][   T24]  ? __pfx_process_scheduled_works+0x10/0x10
[   94.286074][   T24]  worker_thread+0x8a0/0xda0
[   94.290884][   T24]  kthread+0x70e/0x8a0
[   94.295232][   T24]  ? __pfx_worker_thread+0x10/0x10
[   94.300455][   T24]  ? __pfx_kthread+0x10/0x10
[   94.305085][   T24]  ? __pfx_kthread+0x10/0x10
[   94.310237][   T24]  ? _raw_spin_unlock_irq+0x23/0x50
[   94.315469][   T24]  ? lockdep_hardirqs_on+0x9c/0x150
[   94.320777][   T24]  ? __pfx_kthread+0x10/0x10
[   94.325392][   T24]  ret_from_fork+0x4b/0x80
[   94.329953][   T24]  ? __pfx_kthread+0x10/0x10
[   94.334582][   T24]  ret_from_fork_asm+0x1a/0x30
[   94.339462][   T24]  </TASK>
[   94.342497][   T24] Kernel panic - not syncing: kernel: panic_on_warn set ...
[   94.349804][   T24] CPU: 1 UID: 0 PID: 24 Comm: kworker/1:0 Not tainted 6.15.0-rc6-syzkaller-g1a80a098c606-dirty #0 PREEMPT(full) 
[   94.361729][   T24] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
[   94.371900][   T24] Workqueue: events cfg80211_unregister_netdevice_work
[   94.378882][   T24] Call Trace:
[   94.382189][   T24]  <TASK>
[   94.385131][   T24]  dump_stack_lvl+0x99/0x250
[   94.389748][   T24]  ? __asan_memcpy+0x40/0x70
[   94.394367][   T24]  ? __pfx_dump_stack_lvl+0x10/0x10
[   94.399599][   T24]  ? __pfx__printk+0x10/0x10
[   94.404215][   T24]  panic+0x2db/0x790
[   94.408122][   T24]  ? __pfx_panic+0x10/0x10
[   94.412710][   T24]  ? show_trace_log_lvl+0x4fb/0x550
[   94.418012][   T24]  ? ret_from_fork_asm+0x1a/0x30
[   94.422967][   T24]  __warn+0x31b/0x4b0
[   94.426949][   T24]  ? wiphy_delayed_work_cancel+0x8a/0xb0
[   94.432591][   T24]  ? wiphy_delayed_work_cancel+0x8a/0xb0
[   94.438228][   T24]  report_bug+0x2be/0x4f0
[   94.442557][   T24]  ? wiphy_delayed_work_cancel+0x8a/0xb0
[   94.448292][   T24]  ? wiphy_delayed_work_cancel+0x8a/0xb0
[   94.453947][   T24]  ? wiphy_delayed_work_cancel+0x8c/0xb0
[   94.459670][   T24]  handle_bug+0x84/0x160
[   94.463913][   T24]  exc_invalid_op+0x1a/0x50
[   94.468417][   T24]  asm_exc_invalid_op+0x1a/0x20
[   94.473260][   T24] RIP: 0010:wiphy_delayed_work_cancel+0x8a/0xb0
[   94.479505][   T24] Code: e8 8b ce 1f f7 eb 05 e8 84 ce 1f f7 48 8d 7b 20 e8 7b 92 0d f7 4c 89 f7 48 89 de 5b 41 5e 5d e9 ec f5 ff ff e8 67 ce 1f f7 90 <0f> 0b 90 eb dd 48 c7 c1 50 01 7e 8f 80 e1 07 80 c1 03 38 c1 7c 90
[   94.499114][   T24] RSP: 0018:ffffc900001e7740 EFLAGS: 00010293
[   94.505185][   T24] RAX: ffffffff8aa02299 RBX: ffff88806a2b96a8 RCX: ffff88801de80000
[   94.513345][   T24] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   94.521331][   T24] RBP: 0000000000000000 R08: ffffffff8f7dd277 R09: 1ffffffff1efba4e
[   94.529328][   T24] R10: dffffc0000000000 R11: ffffffff8ac5eb30 R12: ffff88806a2b8d80
[   94.537300][   T24] R13: 1ffff1100fab81d0 R14: ffff88807d5c0700 R15: ffffc900001e77a0
[   94.545286][   T24]  ? __pfx_ieee80211_uninit+0x10/0x10
[   94.550760][   T24]  ? wiphy_delayed_work_cancel+0x89/0xb0
[   94.556521][   T24]  ieee80211_free_keys+0xff/0x650
[   94.561637][   T24]  ? __pfx_ieee80211_free_keys+0x10/0x10
[   94.567799][   T24]  ? ip6_route_dev_notify+0x9a/0x5b0
[   94.573181][   T24]  ? notifier_call_chain+0x3bf/0x3e0
[   94.578474][   T24]  ieee80211_teardown_sdata+0x52/0x140
[   94.583931][   T24]  ? __pfx_ieee80211_uninit+0x10/0x10
[   94.589304][   T24]  unregister_netdevice_many_notify+0x1c34/0x2330
[   94.595760][   T24]  ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[   94.602532][   T24]  ? rcu_is_watching+0x15/0xb0
[   94.607405][   T24]  ? __mutex_lock+0xa6d/0xe80
[   94.612101][   T24]  ? __mutex_lock+0x51b/0xe80
[   94.616880][   T24]  ? cfg80211_unregister_netdevice_work+0x12/0x50
[   94.623402][   T24]  unregister_netdevice_queue+0x33c/0x380
[   94.629238][   T24]  ? __pfx_unregister_netdevice_queue+0x10/0x10
[   94.635492][   T24]  ? _raw_spin_unlock_irq+0x23/0x50
[   94.640690][   T24]  cfg80211_unregister_netdevice_work+0x3d/0x50
[   94.646947][   T24]  ? process_scheduled_works+0x9ec/0x17a0
[   94.652684][   T24]  process_scheduled_works+0xadb/0x17a0
[   94.658268][   T24]  ? __pfx_process_scheduled_works+0x10/0x10
[   94.664533][   T24]  worker_thread+0x8a0/0xda0
[   94.669246][   T24]  kthread+0x70e/0x8a0
[   94.673316][   T24]  ? __pfx_worker_thread+0x10/0x10
[   94.678423][   T24]  ? __pfx_kthread+0x10/0x10
[   94.683024][   T24]  ? __pfx_kthread+0x10/0x10
[   94.687621][   T24]  ? _raw_spin_unlock_irq+0x23/0x50
[   94.692817][   T24]  ? lockdep_hardirqs_on+0x9c/0x150
[   94.698104][   T24]  ? __pfx_kthread+0x10/0x10
[   94.702694][   T24]  ret_from_fork+0x4b/0x80
[   94.707203][   T24]  ? __pfx_kthread+0x10/0x10
[   94.711802][   T24]  ret_from_fork_asm+0x1a/0x30
[   94.716597][   T24]  </TASK>
[   94.720039][   T24] Kernel Offset: disabled
[   94.724403][   T24] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
GO111MODULE='auto'
GOARCH='amd64'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/syzkaller/jobs-2/linux/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.linux-amd64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/syzkaller/jobs-2/linux/gopath/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.7.linux-amd64/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.7'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/syzkaller/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4040498717=/tmp/go-build -gno-record-gcc-switches'

git status (err=<nil>)
HEAD detached at ce7952f4e36
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=ce7952f4e369f2440b2bc369868df305c42bf7d6 -X 'github.com/google/syzkaller/prog.gitRevisionDate=20250430-132727'" -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include   -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"ce7952f4e369f2440b2bc369868df305c42bf7d6\"
/usr/bin/ld: /tmp/ccLDnfUR.o: in function `Connection::Connect(char const*, char const*)':
executor.cc:(.text._ZN10Connection7ConnectEPKcS1_[_ZN10Connection7ConnectEPKcS1_]+0x104): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=112cd6f4580000


Tested on:

commit:         1a80a098 Merge tag 'execve-v6.15-rc7' of git://git.ker..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=5929ac65be9baf3c
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1284cf68580000



* Re: [syzbot] Re: possible deadlock in team_del_slave (3)
  2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
                   ` (13 preceding siblings ...)
  2025-05-14 13:18 ` [syzbot] " syzbot
@ 2025-05-16 13:55 ` syzbot
  14 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2025-05-16 13:55 UTC (permalink / raw)
  To: linux-kernel

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org.

***

Subject: Re: possible deadlock in team_del_slave (3)
Author: penguin-kernel@i-love.sakura.ne.jp

#syz test

diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
index d8fc0c79745d..96bbe146b884 100644
--- a/drivers/net/team/team_core.c
+++ b/drivers/net/team/team_core.c
@@ -933,7 +933,7 @@ static bool team_port_find(const struct team *team,
  * Enable/disable port by adding to enabled port hashlist and setting
  * port->index (Might be racy so reader could see incorrect ifindex when
  * processing a flying packet, but that is not a problem). Write guarded
- * by team->lock.
+ * by RTNL.
  */
 static void team_port_enable(struct team *team,
 			     struct team_port *port)
@@ -1660,8 +1660,6 @@ static int team_init(struct net_device *dev)
 		goto err_options_register;
 	netif_carrier_off(dev);
 
-	lockdep_register_key(&team->team_lock_key);
-	__mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key);
 	netdev_lockdep_set_classes(dev);
 
 	return 0;
@@ -1682,7 +1680,7 @@ static void team_uninit(struct net_device *dev)
 	struct team_port *port;
 	struct team_port *tmp;
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	list_for_each_entry_safe(port, tmp, &team->port_list, list)
 		team_port_del(team, port->dev);
 
@@ -1691,9 +1689,8 @@ static void team_uninit(struct net_device *dev)
 	team_mcast_rejoin_fini(team);
 	team_notify_peers_fini(team);
 	team_queue_override_fini(team);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 	netdev_change_features(dev);
-	lockdep_unregister_key(&team->team_lock_key);
 }
 
 static void team_destructor(struct net_device *dev)
@@ -1814,11 +1811,11 @@ static int team_set_mac_address(struct net_device *dev, void *p)
 	if (dev->type == ARPHRD_ETHER && !is_valid_ether_addr(addr->sa_data))
 		return -EADDRNOTAVAIL;
 	dev_addr_set(dev, addr->sa_data);
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	list_for_each_entry(port, &team->port_list, list)
 		if (team->ops.port_change_dev_addr)
 			team->ops.port_change_dev_addr(team, port);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 	return 0;
 }
 
@@ -1832,7 +1829,7 @@ static int team_change_mtu(struct net_device *dev, int new_mtu)
 	 * Alhough this is reader, it's guarded by team lock. It's not possible
 	 * to traverse list in reverse under rcu_read_lock
 	 */
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	team->port_mtu_change_allowed = true;
 	list_for_each_entry(port, &team->port_list, list) {
 		err = dev_set_mtu(port->dev, new_mtu);
@@ -1843,7 +1840,7 @@ static int team_change_mtu(struct net_device *dev, int new_mtu)
 		}
 	}
 	team->port_mtu_change_allowed = false;
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	WRITE_ONCE(dev->mtu, new_mtu);
 
@@ -1853,7 +1850,7 @@ static int team_change_mtu(struct net_device *dev, int new_mtu)
 	list_for_each_entry_continue_reverse(port, &team->port_list, list)
 		dev_set_mtu(port->dev, dev->mtu);
 	team->port_mtu_change_allowed = false;
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	return err;
 }
@@ -1907,20 +1904,20 @@ static int team_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
 	 * Alhough this is reader, it's guarded by team lock. It's not possible
 	 * to traverse list in reverse under rcu_read_lock
 	 */
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	list_for_each_entry(port, &team->port_list, list) {
 		err = vlan_vid_add(port->dev, proto, vid);
 		if (err)
 			goto unwind;
 	}
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	return 0;
 
 unwind:
 	list_for_each_entry_continue_reverse(port, &team->port_list, list)
 		vlan_vid_del(port->dev, proto, vid);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	return err;
 }
@@ -1930,10 +1927,10 @@ static int team_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid)
 	struct team *team = netdev_priv(dev);
 	struct team_port *port;
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	list_for_each_entry(port, &team->port_list, list)
 		vlan_vid_del(port->dev, proto, vid);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	return 0;
 }
@@ -1955,9 +1952,9 @@ static void team_netpoll_cleanup(struct net_device *dev)
 {
 	struct team *team = netdev_priv(dev);
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	__team_netpoll_cleanup(team);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 }
 
 static int team_netpoll_setup(struct net_device *dev)
@@ -1966,7 +1963,7 @@ static int team_netpoll_setup(struct net_device *dev)
 	struct team_port *port;
 	int err = 0;
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	list_for_each_entry(port, &team->port_list, list) {
 		err = __team_port_enable_netpoll(port);
 		if (err) {
@@ -1974,7 +1971,7 @@ static int team_netpoll_setup(struct net_device *dev)
 			break;
 		}
 	}
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 	return err;
 }
 #endif
@@ -1985,9 +1982,9 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev,
 	struct team *team = netdev_priv(dev);
 	int err;
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	err = team_port_add(team, port_dev, extack);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	if (!err)
 		netdev_change_features(dev);
@@ -2000,18 +1997,13 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 	struct team *team = netdev_priv(dev);
 	int err;
 
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	err = team_port_del(team, port_dev);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 
 	if (err)
 		return err;
 
-	if (netif_is_team_master(port_dev)) {
-		lockdep_unregister_key(&team->team_lock_key);
-		lockdep_register_key(&team->team_lock_key);
-		lockdep_set_class(&team->lock, &team->team_lock_key);
-	}
 	netdev_change_features(dev);
 
 	return err;
@@ -2319,13 +2311,13 @@ static struct team *team_nl_team_get(struct genl_info *info)
 	}
 
 	team = netdev_priv(dev);
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	return team;
 }
 
 static void team_nl_team_put(struct team *team)
 {
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 	dev_put(team->dev);
 }
 
@@ -2961,11 +2953,9 @@ static void __team_port_change_port_removed(struct team_port *port)
 
 static void team_port_change_check(struct team_port *port, bool linkup)
 {
-	struct team *team = port->team;
-
-	mutex_lock(&team->lock);
+	ASSERT_RTNL();
 	__team_port_change_check(port, linkup);
-	mutex_unlock(&team->lock);
+	ASSERT_RTNL();
 }
 
 
diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
index e0f599e2a51d..4e133451f4d6 100644
--- a/drivers/net/team/team_mode_activebackup.c
+++ b/drivers/net/team/team_mode_activebackup.c
@@ -68,7 +68,7 @@ static void ab_active_port_get(struct team *team, struct team_gsetter_ctx *ctx)
 	struct team_port *active_port;
 
 	active_port = rcu_dereference_protected(ab_priv(team)->active_port,
-						lockdep_is_held(&team->lock));
+						rtnl_is_locked());
 	if (active_port)
 		ctx->data.u32_val = active_port->dev->ifindex;
 	else
diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c
index 00f8989c29c0..79ac52d086e0 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -302,7 +302,7 @@ static int lb_bpf_func_set(struct team *team, struct team_gsetter_ctx *ctx)
 		/* Clear old filter data */
 		__fprog_destroy(lb_priv->ex->orig_fprog);
 		orig_fp = rcu_dereference_protected(lb_priv->fp,
-						lockdep_is_held(&team->lock));
+						    rtnl_is_locked());
 	}
 
 	rcu_assign_pointer(lb_priv->fp, fp);
@@ -325,7 +325,7 @@ static void lb_bpf_func_free(struct team *team)
 
 	__fprog_destroy(lb_priv->ex->orig_fprog);
 	fp = rcu_dereference_protected(lb_priv->fp,
-				       lockdep_is_held(&team->lock));
+				       rtnl_is_locked());
 	bpf_prog_destroy(fp);
 }
 
@@ -336,7 +336,7 @@ static void lb_tx_method_get(struct team *team, struct team_gsetter_ctx *ctx)
 	char *name;
 
 	func = rcu_dereference_protected(lb_priv->select_tx_port_func,
-					 lockdep_is_held(&team->lock));
+					 rtnl_is_locked());
 	name = lb_select_tx_port_get_name(func);
 	BUG_ON(!name);
 	ctx->data.str_val = name;
@@ -471,6 +471,7 @@ static void lb_stats_refresh(struct work_struct *work)
 	bool changed = false;
 	int i;
 	int j;
+	bool locked;
 
 	lb_priv_ex = container_of(work, struct lb_priv_ex,
 				  stats.refresh_dw.work);
@@ -478,7 +479,8 @@ static void lb_stats_refresh(struct work_struct *work)
 	team = lb_priv_ex->team;
 	lb_priv = get_lb_priv(team);
 
-	if (!mutex_trylock(&team->lock)) {
+	locked = rtnl_is_locked();
+	if (!locked && !rtnl_trylock()) {
 		schedule_delayed_work(&lb_priv_ex->stats.refresh_dw, 0);
 		return;
 	}
@@ -515,7 +517,8 @@ static void lb_stats_refresh(struct work_struct *work)
 	schedule_delayed_work(&lb_priv_ex->stats.refresh_dw,
 			      (lb_priv_ex->stats.refresh_interval * HZ) / 10);
 
-	mutex_unlock(&team->lock);
+	if (!locked)
+		rtnl_unlock();
 }
 
 static void lb_stats_refresh_interval_get(struct team *team,
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index cdc684e04a2f..ce97d891cf72 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -191,8 +191,6 @@ struct team {
 
 	const struct header_ops *header_ops_cache;
 
-	struct mutex lock; /* used for overall locking, e.g. port lists write */
-
 	/*
 	 * List of enabled ports and their count
 	 */
@@ -223,7 +221,6 @@ struct team {
 		atomic_t count_pending;
 		struct delayed_work dw;
 	} mcast_rejoin;
-	struct lock_class_key team_lock_key;
 	long mode_priv[TEAM_MODE_PRIV_LONGS];
 };
 



* Re: [syzbot] [net?] possible deadlock in team_del_slave (3)
       [not found] <9d977b09-72b6-40bb-bfb6-c8884b2436e0@I-love.SAKURA.ne.jp>
@ 2025-05-16 15:15 ` syzbot
  0 siblings, 0 replies; 32+ messages in thread
From: syzbot @ 2025-05-16 15:15 UTC (permalink / raw)
  To: linux-kernel, penguin-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Tested-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com

Tested on:

commit:         fee3e843 Merge tag 'bcachefs-2025-05-15' of git://evil..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1493f6f4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=ea35e429f965296e
dashboard link: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
compiler:       Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=148676f4580000

Note: testing is done by a robot and is best-effort only.



Thread overview: 32+ messages
2024-04-26 11:59 [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
2024-04-26 14:17 ` Hillf Danton
2024-07-03 11:25 ` Jeongjun Park
2024-07-03 13:41   ` syzbot
2024-07-03 13:44 ` Jeongjun Park
2024-07-03 14:19   ` syzbot
2024-07-03 14:51 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-03 15:18   ` Michal Kubiak
2024-07-03 16:02     ` Jeongjun Park
2024-07-03 16:30       ` Eric Dumazet
2024-07-05 15:17         ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
2024-07-05 15:19         ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-03 15:51 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
2024-07-03 16:35   ` syzbot
2024-07-04 10:15 ` Jiri Pirko
2024-07-04 10:43 ` [PATCH net] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-04 10:45 ` [syzbot] [net?] possible deadlock in team_del_slave (3) Jeongjun Park
2024-07-04 16:07   ` syzbot
2024-07-04 11:02 ` Jeongjun Park
2024-07-04 16:28   ` syzbot
2024-07-06  4:13 ` [PATCH net,v2] team: Fix ABBA deadlock caused by race in team_del_slave Jeongjun Park
2024-07-06 15:01   ` Stephen Hemminger
2024-07-07  6:00 ` [PATCH] change list_del to list_del_init in ieee80211_remove_interfaces Jeongjun Park
2024-07-07  6:23   ` [syzbot] [net?] possible deadlock in team_del_slave (3) syzbot
2024-07-07  6:02 ` Jeongjun Park
2024-07-07  6:44   ` syzbot
2024-07-07  6:06 ` Jeongjun Park
2024-07-07  7:04   ` syzbot
2025-05-14 13:18 ` [syzbot] " syzbot
2025-05-16 13:55 ` syzbot
     [not found] <80fd38b6-6cb2-4470-8531-60ee0e332787@I-love.SAKURA.ne.jp>
2025-05-14 19:51 ` [syzbot] [net?] " syzbot
     [not found] <9d977b09-72b6-40bb-bfb6-c8884b2436e0@I-love.SAKURA.ne.jp>
2025-05-16 15:15 ` syzbot
