The Linux Kernel Mailing List
* [syzbot] [net?] possible deadlock in rlb_choose_channel (2)
@ 2026-05-13  8:46 syzbot
  2026-05-13 14:41 ` Jay Vosburgh
  0 siblings, 1 reply; 2+ messages in thread
From: syzbot @ 2026-05-13  8:46 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, jv, kuba, linux-kernel, netdev,
	pabeni, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    c21b90f77687 x86/CPU/AMD: Prevent improper isolation of sh..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10ec7dba580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4caf64b1ee83dac0
dashboard link: https://syzkaller.appspot.com/bug?extid=1db58dbbccbf93c65c83
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2f3edabe3b67/disk-c21b90f7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/539b63753e79/vmlinux-c21b90f7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/48e6e7cbc4ca/bzImage-c21b90f7.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com

ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
============================================
WARNING: possible recursive locking detected
syzkaller #0 Tainted: G             L     
--------------------------------------------
kworker/u8:3/47 is trying to acquire lock:
ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock include/linux/spinlock.h:342 [inline]
ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562

but task is already holding lock:
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&bond->mode_lock);
  lock(&bond->mode_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by kworker/u8:3/47:
 #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3277 [inline]
 #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_scheduled_works+0xa35/0x1860 kernel/workqueue.c:3385
 #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3278 [inline]
 #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa70/0x1860 kernel/workqueue.c:3385
 #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
 #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
 #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_alb_monitor+0xf8/0x17e0 drivers/net/bonding/bond_alb.c:1546
 #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
 #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
 #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
 #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
 #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
 #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: arp_xmit+0x23/0x270 net/ipv4/arp.c:663
 #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
 #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: rcu_read_lock_bh include/linux/rcupdate.h:891 [inline]
 #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x2b6/0x3950 net/core/dev.c:4791
 #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
 #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
 #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_start_xmit+0xb4/0x1900 drivers/net/bonding/bond_main.c:5591

stack backtrace:
CPU: 0 UID: 0 PID: 47 Comm: kworker/u8:3 Tainted: G             L      syzkaller #0 PREEMPT(full) 
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Workqueue: bond5 bond_alb_monitor
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
 check_deadlock kernel/locking/lockdep.c:3093 [inline]
 validate_chain kernel/locking/lockdep.c:3895 [inline]
 __lock_acquire+0x253f/0x2cf0 kernel/locking/lockdep.c:5237
 lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5868
 __raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
 _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:158
 spin_lock include/linux/spinlock.h:342 [inline]
 rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
 rlb_arp_xmit drivers/net/bonding/bond_alb.c:680 [inline]
 bond_xmit_alb_slave_get+0x1071/0x20a0 drivers/net/bonding/bond_alb.c:1493
 bond_alb_xmit+0x24/0x40 drivers/net/bonding/bond_alb.c:1528
 __bond_start_xmit drivers/net/bonding/bond_main.c:5569 [inline]
 bond_start_xmit+0x6a2/0x1900 drivers/net/bonding/bond_main.c:5593
 __netdev_start_xmit include/linux/netdevice.h:5368 [inline]
 netdev_start_xmit include/linux/netdevice.h:5377 [inline]
 xmit_one net/core/dev.c:3888 [inline]
 dev_hard_start_xmit+0x2cd/0x830 net/core/dev.c:3904
 __dev_queue_xmit+0x14d9/0x3950 net/core/dev.c:4870
 NF_HOOK+0x33a/0x3c0 include/linux/netfilter.h:-1
 arp_xmit+0x16c/0x270 net/ipv4/arp.c:665
 rlb_update_client+0x2a8/0x6b0 drivers/net/bonding/bond_alb.c:455
 rlb_update_rx_clients drivers/net/bonding/bond_alb.c:473 [inline]
 bond_alb_monitor+0xf6a/0x17e0 drivers/net/bonding/bond_alb.c:1618
 process_one_work kernel/workqueue.c:3302 [inline]
 process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3385
 worker_thread+0xa53/0xfc0 kernel/workqueue.c:3466
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [net?] possible deadlock in rlb_choose_channel (2)
  2026-05-13  8:46 [syzbot] [net?] possible deadlock in rlb_choose_channel (2) syzbot
@ 2026-05-13 14:41 ` Jay Vosburgh
  0 siblings, 0 replies; 2+ messages in thread
From: Jay Vosburgh @ 2026-05-13 14:41 UTC (permalink / raw)
  To: syzbot
  Cc: andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
	pabeni, syzkaller-bugs

syzbot <syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com> wrote:

>Hello,
>
>syzbot found the following issue on:
>
>HEAD commit:    c21b90f77687 x86/CPU/AMD: Prevent improper isolation of sh..
>git tree:       upstream
>console output: https://syzkaller.appspot.com/x/log.txt?x=10ec7dba580000
>kernel config:  https://syzkaller.appspot.com/x/.config?x=4caf64b1ee83dac0
>dashboard link: https://syzkaller.appspot.com/bug?extid=1db58dbbccbf93c65c83
>compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>
>Unfortunately, I don't have any reproducer for this issue yet.
>
>Downloadable assets:
>disk image: https://storage.googleapis.com/syzbot-assets/2f3edabe3b67/disk-c21b90f7.raw.xz
>vmlinux: https://storage.googleapis.com/syzbot-assets/539b63753e79/vmlinux-c21b90f7.xz
>kernel image: https://storage.googleapis.com/syzbot-assets/48e6e7cbc4ca/bzImage-c21b90f7.xz
>
>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>Reported-by: syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com
>
>ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
>ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
>============================================
>WARNING: possible recursive locking detected
>syzkaller #0 Tainted: G             L     
>--------------------------------------------
>kworker/u8:3/47 is trying to acquire lock:
>ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock include/linux/spinlock.h:342 [inline]
>ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
>
>but task is already holding lock:
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
>
>other info that might help us debug this:
> Possible unsafe locking scenario:
>
>       CPU0
>       ----
>  lock(&bond->mode_lock);
>  lock(&bond->mode_lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
>7 locks held by kworker/u8:3/47:
> #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3277 [inline]
> #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_scheduled_works+0xa35/0x1860 kernel/workqueue.c:3385
> #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3278 [inline]
> #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa70/0x1860 kernel/workqueue.c:3385
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_alb_monitor+0xf8/0x17e0 drivers/net/bonding/bond_alb.c:1546
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: arp_xmit+0x23/0x270 net/ipv4/arp.c:663
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: rcu_read_lock_bh include/linux/rcupdate.h:891 [inline]
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x2b6/0x3950 net/core/dev.c:4791
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_start_xmit+0xb4/0x1900 drivers/net/bonding/bond_main.c:5591
>
>stack backtrace:
>CPU: 0 UID: 0 PID: 47 Comm: kworker/u8:3 Tainted: G             L      syzkaller #0 PREEMPT(full) 
>Tainted: [L]=SOFTLOCKUP
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
>Workqueue: bond5 bond_alb_monitor
>Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
> check_deadlock kernel/locking/lockdep.c:3093 [inline]
> validate_chain kernel/locking/lockdep.c:3895 [inline]
> __lock_acquire+0x253f/0x2cf0 kernel/locking/lockdep.c:5237
> lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5868
> __raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
> _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:158
> spin_lock include/linux/spinlock.h:342 [inline]
> rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
> rlb_arp_xmit drivers/net/bonding/bond_alb.c:680 [inline]
> bond_xmit_alb_slave_get+0x1071/0x20a0 drivers/net/bonding/bond_alb.c:1493
> bond_alb_xmit+0x24/0x40 drivers/net/bonding/bond_alb.c:1528
> __bond_start_xmit drivers/net/bonding/bond_main.c:5569 [inline]
> bond_start_xmit+0x6a2/0x1900 drivers/net/bonding/bond_main.c:5593
> __netdev_start_xmit include/linux/netdevice.h:5368 [inline]
> netdev_start_xmit include/linux/netdevice.h:5377 [inline]
> xmit_one net/core/dev.c:3888 [inline]
> dev_hard_start_xmit+0x2cd/0x830 net/core/dev.c:3904
> __dev_queue_xmit+0x14d9/0x3950 net/core/dev.c:4870
> NF_HOOK+0x33a/0x3c0 include/linux/netfilter.h:-1
> arp_xmit+0x16c/0x270 net/ipv4/arp.c:665
> rlb_update_client+0x2a8/0x6b0 drivers/net/bonding/bond_alb.c:455
> rlb_update_rx_clients drivers/net/bonding/bond_alb.c:473 [inline]
> bond_alb_monitor+0xf6a/0x17e0 drivers/net/bonding/bond_alb.c:1618

	Just looking at the stack, I suspect that this is either a false
positive, or the NF_HOOK action (a netfilter rule) is reinjecting the
ARP packet into the same bond that created it.

	If the packet is being reinjected into the same interface that
generated it in rlb_update_client, then I believe the above would be the
expected behavior.

	On the other hand, if the network configuration is nested bonds,
then the rlb_arp_xmit -> rlb_choose_channel call path above would be
operating on a different instance of the bond->mode_lock, and would not
actually deadlock.

	-J

> process_one_work kernel/workqueue.c:3302 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3385
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3466
> kthread+0x388/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
>
>
>---
>This report is generated by a bot. It may contain errors.
>See https://goo.gl/tpsmEJ for more information about syzbot.
>syzbot engineers can be reached at syzkaller@googlegroups.com.
>
>syzbot will keep track of this issue. See:
>https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
>If the report is already addressed, let syzbot know by replying with:
>#syz fix: exact-commit-title
>
>If you want to overwrite report's subsystems, reply with:
>#syz set subsystems: new-subsystem
>(See the list of subsystem names on the web dashboard)
>
>If the report is a duplicate of another one, reply with:
>#syz dup: exact-subject-of-another-report
>
>If you want to undo deduplication, reply with:
>#syz undup

---
	-Jay Vosburgh, jv@jvosburgh.net

