netdev.vger.kernel.org archive mirror
* [syzbot] [net?] possible deadlock in rtnl_newlink
@ 2025-05-29 10:32 syzbot
  2025-05-29 15:59 ` Stanislav Fomichev
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2025-05-29 10:32 UTC (permalink / raw)
  To: davem, edumazet, horms, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    b1427432d3b6 Merge tag 'iommu-fixes-v6.15-rc7' of git://gi..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=161ef5f4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9fd1c9848687d742
dashboard link: https://syzkaller.appspot.com/bug?extid=846bb38dc67fe62cc733
compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12d21170580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17d9a8e8580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-b1427432.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/47b0c66c70d9/vmlinux-b1427432.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a2df6bfabd3c/bzImage-b1427432.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+846bb38dc67fe62cc733@syzkaller.appspotmail.com

ifb0: entered allmulticast mode
ifb1: entered allmulticast mode
======================================================
WARNING: possible circular locking dependency detected
6.15.0-rc7-syzkaller-00144-gb1427432d3b6 #0 Not tainted
------------------------------------------------------
syz-executor216/5313 is trying to acquire lock:
ffff888033f496f0 ((work_completion)(&adapter->reset_task)){+.+.}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
ffff888033f496f0 ((work_completion)(&adapter->reset_task)){+.+.}-{0:0}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
ffff888033f496f0 ((work_completion)(&adapter->reset_task)){+.+.}-{0:0}, at: start_flush_work kernel/workqueue.c:4150 [inline]
ffff888033f496f0 ((work_completion)(&adapter->reset_task)){+.+.}-{0:0}, at: __flush_work+0xd2/0xbc0 kernel/workqueue.c:4208

but task is already holding lock:
ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:80 [inline]
ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock net/core/rtnetlink.c:341 [inline]
ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x8db/0x1c70 net/core/rtnetlink.c:4064

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.}-{4:4}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5866
       __mutex_lock_common kernel/locking/mutex.c:601 [inline]
       __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:746
       e1000_reset_task+0x56/0xc0 drivers/net/ethernet/intel/e1000/e1000_main.c:3512
       process_one_work kernel/workqueue.c:3238 [inline]
       process_scheduled_works+0xadb/0x17a0 kernel/workqueue.c:3319
       worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
       kthread+0x70e/0x8a0 kernel/kthread.c:464
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #0 ((work_completion)(&adapter->reset_task)){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3166 [inline]
       check_prevs_add kernel/locking/lockdep.c:3285 [inline]
       validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3909
       __lock_acquire+0xaac/0xd20 kernel/locking/lockdep.c:5235
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5866
       touch_work_lockdep_map kernel/workqueue.c:3922 [inline]
       start_flush_work kernel/workqueue.c:4176 [inline]
       __flush_work+0x6b8/0xbc0 kernel/workqueue.c:4208
       __cancel_work_sync+0xbe/0x110 kernel/workqueue.c:4364
       e1000_down+0x402/0x6b0 drivers/net/ethernet/intel/e1000/e1000_main.c:526
       e1000_close+0x17b/0xa10 drivers/net/ethernet/intel/e1000/e1000_main.c:1448
       __dev_close_many+0x361/0x6f0 net/core/dev.c:1702
       __dev_close net/core/dev.c:1714 [inline]
       __dev_change_flags+0x2c7/0x6d0 net/core/dev.c:9352
       netif_change_flags+0x88/0x1a0 net/core/dev.c:9417
       do_setlink+0xcb9/0x40d0 net/core/rtnetlink.c:3152
       rtnl_group_changelink net/core/rtnetlink.c:3783 [inline]
       __rtnl_newlink net/core/rtnetlink.c:3937 [inline]
       rtnl_newlink+0x149f/0x1c70 net/core/rtnetlink.c:4065
       rtnetlink_rcv_msg+0x7cc/0xb70 net/core/rtnetlink.c:6955
       netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
       netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
       netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339
       netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
       sock_sendmsg_nosec net/socket.c:712 [inline]
       __sock_sendmsg+0x21c/0x270 net/socket.c:727
       ____sys_sendmsg+0x505/0x830 net/socket.c:2566
       ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2620
       __sys_sendmsg net/socket.c:2652 [inline]
       __do_sys_sendmsg net/socket.c:2657 [inline]
       __se_sys_sendmsg net/socket.c:2655 [inline]
       __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2655
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock((work_completion)(&adapter->reset_task));
                               lock(rtnl_mutex);
  lock((work_completion)(&adapter->reset_task));

 *** DEADLOCK ***

2 locks held by syz-executor216/5313:
 #0: ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock net/core/rtnetlink.c:80 [inline]
 #0: ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_nets_lock net/core/rtnetlink.c:341 [inline]
 #0: ffffffff8f2fab48 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x8db/0x1c70 net/core/rtnetlink.c:4064
 #1: ffffffff8df3dee0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
 #1: ffffffff8df3dee0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
 #1: ffffffff8df3dee0 (rcu_read_lock){....}-{1:3}, at: start_flush_work kernel/workqueue.c:4150 [inline]
 #1: ffffffff8df3dee0 (rcu_read_lock){....}-{1:3}, at: __flush_work+0xd2/0xbc0 kernel/workqueue.c:4208

stack backtrace:
CPU: 0 UID: 0 PID: 5313 Comm: syz-executor216 Not tainted 6.15.0-rc7-syzkaller-00144-gb1427432d3b6 #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2079
 check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2211
 check_prev_add kernel/locking/lockdep.c:3166 [inline]
 check_prevs_add kernel/locking/lockdep.c:3285 [inline]
 validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3909
 __lock_acquire+0xaac/0xd20 kernel/locking/lockdep.c:5235
 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5866
 touch_work_lockdep_map kernel/workqueue.c:3922 [inline]
 start_flush_work kernel/workqueue.c:4176 [inline]
 __flush_work+0x6b8/0xbc0 kernel/workqueue.c:4208
 __cancel_work_sync+0xbe/0x110 kernel/workqueue.c:4364
 e1000_down+0x402/0x6b0 drivers/net/ethernet/intel/e1000/e1000_main.c:526
 e1000_close+0x17b/0xa10 drivers/net/ethernet/intel/e1000/e1000_main.c:1448
 __dev_close_many+0x361/0x6f0 net/core/dev.c:1702
 __dev_close net/core/dev.c:1714 [inline]
 __dev_change_flags+0x2c7/0x6d0 net/core/dev.c:9352
 netif_change_flags+0x88/0x1a0 net/core/dev.c:9417
 do_setlink+0xcb9/0x40d0 net/core/rtnetlink.c:3152
 rtnl_group_changelink net/core/rtnetlink.c:3783 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3937 [inline]
 rtnl_newlink+0x149f/0x1c70 net/core/rtnetlink.c:4065
 rtnetlink_rcv_msg+0x7cc/0xb70 net/core/rtnetlink.c:6955
 netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534
 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
 netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339
 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
 sock_sendmsg_nosec net/socket.c:712 [inline]
 __sock_sendmsg+0x21c/0x270 net/socket.c:727
 ____sys_sendmsg+0x505/0x830 net/socket.c:2566
 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2620
 __sys_sendmsg net/socket.c:2652 [inline]
 __do_sys_sendmsg net/socket.c:2657 [inline]
 __se_sys_sendmsg net/socket.c:2655 [inline]
 __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2655
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f09c1caf4a9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f09c1c47198 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f09c1d39318 RCX: 00007f09c1caf4a9
RDX: 0000000000000000 RSI: 0000200000000140 RDI: 0000000000000005
RBP: 00007f09c1d39310 R08: 0000000000000008 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000246 R12: 00007f09c1d060ac
R13: 000000000000006e R14: 0000200000000080 R15: 0000200000000150
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [net?] possible deadlock in rtnl_newlink
  2025-05-29 10:32 [syzbot] [net?] possible deadlock in rtnl_newlink syzbot
@ 2025-05-29 15:59 ` Stanislav Fomichev
  2025-05-29 16:10   ` Jakub Kicinski
  0 siblings, 1 reply; 6+ messages in thread
From: Stanislav Fomichev @ 2025-05-29 15:59 UTC (permalink / raw)
  To: syzbot
  Cc: davem, edumazet, horms, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

On 05/29, syzbot wrote:
> [... full syzbot report trimmed; quoted in full earlier in this thread ...]
>
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(rtnl_mutex);
>                                lock((work_completion)(&adapter->reset_task));
>                                lock(rtnl_mutex);
>   lock((work_completion)(&adapter->reset_task));

So this is an internal WQ entry lock that is being reordered with the
rtnl lock. But looking at process_one_work, I don't see actual locks,
mostly lock_map_acquire/lock_map_release calls that enforce some
internal WQ invariants. Not sure what to do with it; will try to read
more.

* Re: [syzbot] [net?] possible deadlock in rtnl_newlink
  2025-05-29 15:59 ` Stanislav Fomichev
@ 2025-05-29 16:10   ` Jakub Kicinski
  2025-05-29 16:45     ` Stanislav Fomichev
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Kicinski @ 2025-05-29 16:10 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: syzbot, davem, edumazet, horms, linux-kernel, netdev, pabeni,
	syzkaller-bugs

On Thu, 29 May 2025 08:59:43 -0700 Stanislav Fomichev wrote:
> So this is an internal WQ entry lock that is being reordered with the
> rtnl lock. But looking at process_one_work, I don't see actual locks,
> mostly lock_map_acquire/lock_map_release calls that enforce some
> internal WQ invariants. Not sure what to do with it; will try to read
> more.

Basically a flush_work() happens while holding rtnl_lock,
but the work itself takes that lock. It's a driver bug.

* Re: [syzbot] [net?] possible deadlock in rtnl_newlink
  2025-05-29 16:10   ` Jakub Kicinski
@ 2025-05-29 16:45     ` Stanislav Fomichev
  2025-05-29 23:54       ` Joe Damato
  0 siblings, 1 reply; 6+ messages in thread
From: Stanislav Fomichev @ 2025-05-29 16:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: syzbot, davem, edumazet, horms, linux-kernel, netdev, pabeni,
	syzkaller-bugs

On 05/29, Jakub Kicinski wrote:
> On Thu, 29 May 2025 08:59:43 -0700 Stanislav Fomichev wrote:
> > So this is an internal WQ entry lock that is being reordered with the
> > rtnl lock. But looking at process_one_work, I don't see actual locks,
> > mostly lock_map_acquire/lock_map_release calls that enforce some
> > internal WQ invariants. Not sure what to do with it; will try to read
> > more.
> 
> Basically a flush_work() happens while holding rtnl_lock,
> but the work itself takes that lock. It's a driver bug.

e400c7444d84 ("e1000: Hold RTNL when e1000_down can be called")?
I think similar things (but wrt the netdev instance lock) are happening
with iavf: iavf_remove calls cancel_work_sync while holding the
instance lock, and the work callbacks grab the instance lock as well :-/

* Re: [syzbot] [net?] possible deadlock in rtnl_newlink
  2025-05-29 16:45     ` Stanislav Fomichev
@ 2025-05-29 23:54       ` Joe Damato
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Damato @ 2025-05-29 23:54 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Jakub Kicinski, syzbot, davem, edumazet, horms, linux-kernel,
	netdev, pabeni, syzkaller-bugs

On Thu, May 29, 2025 at 09:45:10AM -0700, Stanislav Fomichev wrote:
> On 05/29, Jakub Kicinski wrote:
> > On Thu, 29 May 2025 08:59:43 -0700 Stanislav Fomichev wrote:
> > > So this is an internal WQ entry lock that is being reordered with the
> > > rtnl lock. But looking at process_one_work, I don't see actual locks,
> > > mostly lock_map_acquire/lock_map_release calls that enforce some
> > > internal WQ invariants. Not sure what to do with it; will try to read
> > > more.
> > 
> > Basically a flush_work() happens while holding rtnl_lock,
> > but the work itself takes that lock. It's a driver bug.
> 
> e400c7444d84 ("e1000: Hold RTNL when e1000_down can be called")?
> I think similar things (but wrt the netdev instance lock) are happening
> with iavf: iavf_remove calls cancel_work_sync while holding the
> instance lock, and the work callbacks grab the instance lock as well :-/

I think this is probably the same thread as:

 https://lore.kernel.org/netdev/CAP=Rh=OEsn4y_2LvkO3UtDWurKcGPnZ_NPSXK=FbgygNXL37Sw@mail.gmail.com/

I posted a response there about how the problem might be avoided
(based on my rough reading of the driver code), but I'm still
thinking it over.

* Re: [syzbot] [net?] possible deadlock in rtnl_newlink
       [not found] <20250531011248.2445-1-hdanton@sina.com>
@ 2025-05-31  1:33 ` syzbot
  0 siblings, 0 replies; 6+ messages in thread
From: syzbot @ 2025-05-31  1:33 UTC (permalink / raw)
  To: edumazet, hdanton, jdamato, john.cs.hey, linux-kernel, netdev,
	stfomichev, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
no output from test machine



Tested on:

commit:         0f70f5b0 Merge tag 'pull-automount' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15927ff4580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8a01551457d63a4b
dashboard link: https://syzkaller.appspot.com/bug?extid=846bb38dc67fe62cc733
compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17b8ded4580000


end of thread, newest message: 2025-05-31  1:33 UTC

Thread overview: 6+ messages
2025-05-29 10:32 [syzbot] [net?] possible deadlock in rtnl_newlink syzbot
2025-05-29 15:59 ` Stanislav Fomichev
2025-05-29 16:10   ` Jakub Kicinski
2025-05-29 16:45     ` Stanislav Fomichev
2025-05-29 23:54       ` Joe Damato
     [not found] <20250531011248.2445-1-hdanton@sina.com>
2025-05-31  1:33 ` syzbot
