* [syzbot] [net?] possible deadlock in rlb_choose_channel (2)
From: syzbot @ 2026-05-13 8:46 UTC
To: andrew+netdev, davem, edumazet, jv, kuba, linux-kernel, netdev,
pabeni, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: c21b90f77687 x86/CPU/AMD: Prevent improper isolation of sh..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10ec7dba580000
kernel config: https://syzkaller.appspot.com/x/.config?x=4caf64b1ee83dac0
dashboard link: https://syzkaller.appspot.com/bug?extid=1db58dbbccbf93c65c83
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2f3edabe3b67/disk-c21b90f7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/539b63753e79/vmlinux-c21b90f7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/48e6e7cbc4ca/bzImage-c21b90f7.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com
ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
============================================
WARNING: possible recursive locking detected
syzkaller #0 Tainted: G L
--------------------------------------------
kworker/u8:3/47 is trying to acquire lock:
ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock include/linux/spinlock.h:342 [inline]
ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
but task is already holding lock:
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&bond->mode_lock);
  lock(&bond->mode_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation
7 locks held by kworker/u8:3/47:
#0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3277 [inline]
#0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_scheduled_works+0xa35/0x1860 kernel/workqueue.c:3385
#1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3278 [inline]
#1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa70/0x1860 kernel/workqueue.c:3385
#2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
#2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_alb_monitor+0xf8/0x17e0 drivers/net/bonding/bond_alb.c:1546
#3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
#3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
#3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
#4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
#4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: arp_xmit+0x23/0x270 net/ipv4/arp.c:663
#5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
#5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: rcu_read_lock_bh include/linux/rcupdate.h:891 [inline]
#5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x2b6/0x3950 net/core/dev.c:4791
#6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
#6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_start_xmit+0xb4/0x1900 drivers/net/bonding/bond_main.c:5591
stack backtrace:
CPU: 0 UID: 0 PID: 47 Comm: kworker/u8:3 Tainted: G L syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Workqueue: bond5 bond_alb_monitor
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
check_deadlock kernel/locking/lockdep.c:3093 [inline]
validate_chain kernel/locking/lockdep.c:3895 [inline]
__lock_acquire+0x253f/0x2cf0 kernel/locking/lockdep.c:5237
lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5868
__raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:158
spin_lock include/linux/spinlock.h:342 [inline]
rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
rlb_arp_xmit drivers/net/bonding/bond_alb.c:680 [inline]
bond_xmit_alb_slave_get+0x1071/0x20a0 drivers/net/bonding/bond_alb.c:1493
bond_alb_xmit+0x24/0x40 drivers/net/bonding/bond_alb.c:1528
__bond_start_xmit drivers/net/bonding/bond_main.c:5569 [inline]
bond_start_xmit+0x6a2/0x1900 drivers/net/bonding/bond_main.c:5593
__netdev_start_xmit include/linux/netdevice.h:5368 [inline]
netdev_start_xmit include/linux/netdevice.h:5377 [inline]
xmit_one net/core/dev.c:3888 [inline]
dev_hard_start_xmit+0x2cd/0x830 net/core/dev.c:3904
__dev_queue_xmit+0x14d9/0x3950 net/core/dev.c:4870
NF_HOOK+0x33a/0x3c0 include/linux/netfilter.h:-1
arp_xmit+0x16c/0x270 net/ipv4/arp.c:665
rlb_update_client+0x2a8/0x6b0 drivers/net/bonding/bond_alb.c:455
rlb_update_rx_clients drivers/net/bonding/bond_alb.c:473 [inline]
bond_alb_monitor+0xf6a/0x17e0 drivers/net/bonding/bond_alb.c:1618
process_one_work kernel/workqueue.c:3302 [inline]
process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3385
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3466
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [net?] possible deadlock in rlb_choose_channel (2)
From: Jay Vosburgh @ 2026-05-13 14:41 UTC
To: syzbot
Cc: andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
pabeni, syzkaller-bugs
syzbot <syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com> wrote:
>Hello,
>
>syzbot found the following issue on:
>
>HEAD commit: c21b90f77687 x86/CPU/AMD: Prevent improper isolation of sh..
>git tree: upstream
>console output: https://syzkaller.appspot.com/x/log.txt?x=10ec7dba580000
>kernel config: https://syzkaller.appspot.com/x/.config?x=4caf64b1ee83dac0
>dashboard link: https://syzkaller.appspot.com/bug?extid=1db58dbbccbf93c65c83
>compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>
>Unfortunately, I don't have any reproducer for this issue yet.
>
>Downloadable assets:
>disk image: https://storage.googleapis.com/syzbot-assets/2f3edabe3b67/disk-c21b90f7.raw.xz
>vmlinux: https://storage.googleapis.com/syzbot-assets/539b63753e79/vmlinux-c21b90f7.xz
>kernel image: https://storage.googleapis.com/syzbot-assets/48e6e7cbc4ca/bzImage-c21b90f7.xz
>
>IMPORTANT: if you fix the issue, please add the following tag to the commit:
>Reported-by: syzbot+1db58dbbccbf93c65c83@syzkaller.appspotmail.com
>
>ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
>ip6_tunnel: ip6tnl1 xmit: Local address not yet configured!
>============================================
>WARNING: possible recursive locking detected
>syzkaller #0 Tainted: G L
>--------------------------------------------
>kworker/u8:3/47 is trying to acquire lock:
>ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock include/linux/spinlock.h:342 [inline]
>ffff88807a618e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
>
>but task is already holding lock:
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
>ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
>
>other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&bond->mode_lock);
> lock(&bond->mode_lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
>7 locks held by kworker/u8:3/47:
> #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3277 [inline]
> #0: ffff8880516b7140 ((wq_completion)bond5#2){+.+.}-{0:0}, at: process_scheduled_works+0xa35/0x1860 kernel/workqueue.c:3385
> #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3278 [inline]
> #1: ffffc90000b77c40 ((work_completion)(&(&bond->alb_work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0xa70/0x1860 kernel/workqueue.c:3385
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #2: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_alb_monitor+0xf8/0x17e0 drivers/net/bonding/bond_alb.c:1546
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: spin_lock_bh include/linux/spinlock.h:348 [inline]
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: rlb_update_rx_clients drivers/net/bonding/bond_alb.c:466 [inline]
> #3: ffff88807ffa0e98 (&bond->mode_lock){+.-.}-{3:3}, at: bond_alb_monitor+0xe8a/0x17e0 drivers/net/bonding/bond_alb.c:1618
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #4: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: arp_xmit+0x23/0x270 net/ipv4/arp.c:663
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: rcu_read_lock_bh include/linux/rcupdate.h:891 [inline]
> #5: ffffffff8e95cdc0 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x2b6/0x3950 net/core/dev.c:4791
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
> #6: ffffffff8e95cd60 (rcu_read_lock){....}-{1:3}, at: bond_start_xmit+0xb4/0x1900 drivers/net/bonding/bond_main.c:5591
>
>stack backtrace:
>CPU: 0 UID: 0 PID: 47 Comm: kworker/u8:3 Tainted: G L syzkaller #0 PREEMPT(full)
>Tainted: [L]=SOFTLOCKUP
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
>Workqueue: bond5 bond_alb_monitor
>Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_deadlock_bug+0x279/0x290 kernel/locking/lockdep.c:3041
> check_deadlock kernel/locking/lockdep.c:3093 [inline]
> validate_chain kernel/locking/lockdep.c:3895 [inline]
> __lock_acquire+0x253f/0x2cf0 kernel/locking/lockdep.c:5237
> lock_acquire+0x106/0x350 kernel/locking/lockdep.c:5868
> __raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
> _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:158
> spin_lock include/linux/spinlock.h:342 [inline]
> rlb_choose_channel+0x37/0x19a0 drivers/net/bonding/bond_alb.c:562
> rlb_arp_xmit drivers/net/bonding/bond_alb.c:680 [inline]
> bond_xmit_alb_slave_get+0x1071/0x20a0 drivers/net/bonding/bond_alb.c:1493
> bond_alb_xmit+0x24/0x40 drivers/net/bonding/bond_alb.c:1528
> __bond_start_xmit drivers/net/bonding/bond_main.c:5569 [inline]
> bond_start_xmit+0x6a2/0x1900 drivers/net/bonding/bond_main.c:5593
> __netdev_start_xmit include/linux/netdevice.h:5368 [inline]
> netdev_start_xmit include/linux/netdevice.h:5377 [inline]
> xmit_one net/core/dev.c:3888 [inline]
> dev_hard_start_xmit+0x2cd/0x830 net/core/dev.c:3904
> __dev_queue_xmit+0x14d9/0x3950 net/core/dev.c:4870
> NF_HOOK+0x33a/0x3c0 include/linux/netfilter.h:-1
> arp_xmit+0x16c/0x270 net/ipv4/arp.c:665
> rlb_update_client+0x2a8/0x6b0 drivers/net/bonding/bond_alb.c:455
> rlb_update_rx_clients drivers/net/bonding/bond_alb.c:473 [inline]
> bond_alb_monitor+0xf6a/0x17e0 drivers/net/bonding/bond_alb.c:1618
Just looking at the stack, I suspect that this is either a false
positive, or the NF_HOOK action (a netfilter rule) is reinjecting the
ARP packet into the same bond that created it.
If the packet is being reinjected to the same interface that
generated it in rlb_update_client, then I believe the above would be the
expected behavior.
On the other hand, if the network configuration is nested bonds,
then the rlb_arp_xmit -> rlb_choose_channel call path above would be
operating on a different instance of the bond->mode_lock, and would not
actually deadlock.
-J
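The distinction drawn above, between re-taking the same bond->mode_lock and taking the mode_lock of a different bond, can be illustrated with a small userspace sketch. It is a simplified model and not the kernel's lockdep code: the names held_lock_model, task_locks, task_acquire, same_class_held, same_instance_held and mode_lock_class are all hypothetical. It assumes only that the checker compares lock classes rather than lock instances, which is why two different bonds can still trigger the warning. The two addresses are the ones printed in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one held lock: a class key shared by every lock
 * initialised at the same site, plus the address of this instance. */
struct held_lock_model {
	const void *class_key;
	uint64_t instance;
};

#define MAX_HELD 8

/* Hypothetical per-task stack of held locks. */
struct task_locks {
	struct held_lock_model held[MAX_HELD];
	size_t depth;
};

/* Stand-in for the single class shared by every bond->mode_lock. */
static const char mode_lock_class = 0;

/* Record that the task now holds a lock of the given class and address. */
static inline void task_acquire(struct task_locks *t, const void *class_key,
				uint64_t instance)
{
	assert(t->depth < MAX_HELD);
	t->held[t->depth].class_key = class_key;
	t->held[t->depth].instance = instance;
	t->depth++;
}

/* Class-based check: true if any held lock has the same class. In this
 * model, this is the condition that would be reported as possible
 * recursive locking. */
static inline bool same_class_held(const struct task_locks *t,
				   const void *class_key)
{
	for (size_t i = 0; i < t->depth; i++)
		if (t->held[i].class_key == class_key)
			return true;
	return false;
}

/* Instance-based check: true only if this exact lock is already held,
 * which is the case that would really self-deadlock on a spinlock. */
static inline bool same_instance_held(const struct task_locks *t,
				      uint64_t instance)
{
	for (size_t i = 0; i < t->depth; i++)
		if (t->held[i].instance == instance)
			return true;
	return false;
}
```

With the held lock recorded as 0xffff88807ffa0e98 and the lock being acquired as 0xffff88807a618e98, same_class_held() is true while same_instance_held() is false. That would be consistent with the nested-bond case, although the sketch says nothing about what the actual network configuration was.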
> process_one_work kernel/workqueue.c:3302 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3385
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3466
> kthread+0x388/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
---
-Jay Vosburgh, jv@jvosburgh.net