netdev.vger.kernel.org archive mirror
* [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
@ 2025-12-20  0:58 syzbot
  2025-12-21 23:16 ` Florian Westphal
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2025-12-20  0:58 UTC
  To: coreteam, davem, edumazet, fw, horms, kadlec, kuba, linux-kernel,
	netdev, netfilter-devel, pabeni, pablo, phil, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    8f0b4cce4481 Linux 6.19-rc1
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=104f2d92580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a11e0f726bfb6765
dashboard link: https://syzkaller.appspot.com/bug?extid=ff16b505ec9152e5f448
compiler:       gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-8f0b4cce.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/64c9a36f3f29/vmlinux-8f0b4cce.xz
kernel image: https://storage.googleapis.com/syzbot-assets/27a5e8a8a4b8/bzImage-8f0b4cce.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
syz.3.970/9330 is trying to acquire lock:
ffff888012d4ccd8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491

but task is already holding lock:
ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/mutex.c:614 [inline]
       __mutex_lock+0x1aa/0x1ca0 kernel/locking/mutex.c:776
       __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404
       netlink_dump_start include/linux/netlink.h:341 [inline]
       ip_set_dump+0x17f/0x210 net/netfilter/ipset/ip_set_core.c:1717
       nfnetlink_rcv_msg+0x9fc/0x1200 net/netfilter/nfnetlink.c:302
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       nfnetlink_rcv+0x1b3/0x430 net/netfilter/nfnetlink.c:669
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (nfnl_subsys_ipset){+.+.}-{4:4}:
       __mutex_lock_common kernel/locking/mutex.c:614 [inline]
       __mutex_lock+0x1aa/0x1ca0 kernel/locking/mutex.c:776
       ip_set_nfnl_get_byindex+0x7c/0x290 net/netfilter/ipset/ip_set_core.c:909
       set_target_v1_checkentry+0x1ac/0x570 net/netfilter/xt_set.c:313
       xt_check_target+0x27c/0xa40 net/netfilter/x_tables.c:1038
       nft_target_init+0x459/0x7d0 net/netfilter/nft_compat.c:267
       nf_tables_newexpr net/netfilter/nf_tables_api.c:3527 [inline]
       nf_tables_newrule+0xedd/0x2910 net/netfilter/nf_tables_api.c:4358
       nfnetlink_rcv_batch+0x190d/0x2350 net/netfilter/nfnetlink.c:526
       nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:649 [inline]
       nfnetlink_rcv+0x3c1/0x430 net/netfilter/nfnetlink.c:667
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&nft_net->commit_mutex){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3165 [inline]
       check_prevs_add kernel/locking/lockdep.c:3284 [inline]
       validate_chain kernel/locking/lockdep.c:3908 [inline]
       __lock_acquire+0x1669/0x2890 kernel/locking/lockdep.c:5237
       lock_acquire kernel/locking/lockdep.c:5868 [inline]
       lock_acquire+0x179/0x330 kernel/locking/lockdep.c:5825
       __mutex_lock_common kernel/locking/mutex.c:614 [inline]
       __mutex_lock+0x1aa/0x1ca0 kernel/locking/mutex.c:776
       nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
       netlink_dump+0x539/0xd30 net/netlink/af_netlink.c:2325
       __netlink_dump_start+0x6d6/0x990 net/netlink/af_netlink.c:2440
       netlink_dump_start include/linux/netlink.h:341 [inline]
       nft_netlink_dump_start_rcu+0x81/0x1f0 net/netfilter/nf_tables_api.c:1286
       nf_tables_getobj_reset+0x56b/0x6b0 net/netfilter/nf_tables_api.c:8626
       nfnetlink_rcv_msg+0x583/0x1200 net/netfilter/nfnetlink.c:290
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       nfnetlink_rcv+0x1b3/0x430 net/netfilter/nfnetlink.c:669
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
  &nft_net->commit_mutex --> nfnl_subsys_ipset --> nlk_cb_mutex-NETFILTER

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(nlk_cb_mutex-NETFILTER);
                               lock(nfnl_subsys_ipset);
                               lock(nlk_cb_mutex-NETFILTER);
  lock(&nft_net->commit_mutex);

 *** DEADLOCK ***

1 lock held by syz.3.970/9330:
 #0: ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404

stack backtrace:
CPU: 0 UID: 0 PID: 9330 Comm: syz.3.970 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_circular_bug+0x275/0x340 kernel/locking/lockdep.c:2043
 check_noncircular+0x146/0x160 kernel/locking/lockdep.c:2175
 check_prev_add kernel/locking/lockdep.c:3165 [inline]
 check_prevs_add kernel/locking/lockdep.c:3284 [inline]
 validate_chain kernel/locking/lockdep.c:3908 [inline]
 __lock_acquire+0x1669/0x2890 kernel/locking/lockdep.c:5237
 lock_acquire kernel/locking/lockdep.c:5868 [inline]
 lock_acquire+0x179/0x330 kernel/locking/lockdep.c:5825
 __mutex_lock_common kernel/locking/mutex.c:614 [inline]
 __mutex_lock+0x1aa/0x1ca0 kernel/locking/mutex.c:776
 nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
 netlink_dump+0x539/0xd30 net/netlink/af_netlink.c:2325
 __netlink_dump_start+0x6d6/0x990 net/netlink/af_netlink.c:2440
 netlink_dump_start include/linux/netlink.h:341 [inline]
 nft_netlink_dump_start_rcu+0x81/0x1f0 net/netfilter/nf_tables_api.c:1286
 nf_tables_getobj_reset+0x56b/0x6b0 net/netfilter/nf_tables_api.c:8626
 nfnetlink_rcv_msg+0x583/0x1200 net/netfilter/nfnetlink.c:290
 netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
 nfnetlink_rcv+0x1b3/0x430 net/netfilter/nfnetlink.c:669
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg net/socket.c:742 [inline]
 ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
 __sys_sendmsg+0x16d/0x220 net/socket.c:2678
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb7e7b8f7c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb7e8a9c038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fb7e7de5fa0 RCX: 00007fb7e7b8f7c9
RDX: 0000000004004004 RSI: 0000200000000140 RDI: 0000000000000003
RBP: 00007fb7e7c13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb7e7de6038 R14: 00007fb7e7de5fa0 R15: 00007fffe518fab8
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-20  0:58 [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj syzbot
@ 2025-12-21 23:16 ` Florian Westphal
  2025-12-22  9:21   ` Pablo Neira Ayuso
  2025-12-23 12:32   ` Jozsef Kadlecsik
  0 siblings, 2 replies; 7+ messages in thread
From: Florian Westphal @ 2025-12-21 23:16 UTC
  To: syzbot
  Cc: coreteam, davem, edumazet, horms, kadlec, kuba, linux-kernel,
	netdev, netfilter-devel, pabeni, pablo, phil, syzkaller-bugs

syzbot <syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com> wrote:
> syz.3.970/9330 is trying to acquire lock:
> ffff888012d4ccd8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
> 
> but task is already holding lock:
> ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404
> 
> which lock already depends on the new lock.

I think this is a real bug:

CPU0: 'nft reset'.
CPU1: 'ipset list' (anything in ipset doing a netlink dump op)
CPU2: 'iptables-nft -A ... -m set ...'

... can result in:

CPU0                    CPU1                            CPU2
----                    ----                            ----
lock(nlk_cb_mutex-NETFILTER);
                        lock(nfnl_subsys_ipset);
                                                       lock(&nft_net->commit_mutex);
                        lock(nlk_cb_mutex-NETFILTER);
                                                       lock(nfnl_subsys_ipset);
lock(&nft_net->commit_mutex);

CPU0 is waiting for CPU2 to release the transaction mutex.
CPU1 is waiting for CPU0 to release the netlink dump mutex.
CPU2 is waiting for CPU1 to release the ipset subsys mutex.
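
The three waits above form a closed loop, which is why no CPU can make
progress. A minimal sketch of that check, assuming each CPU waits on
exactly one other CPU; the `waits_for` table and the `find_wait_cycle`
helper are illustrative names, not kernel code:

```python
# Wait-for relation copied from the three-CPU scenario:
# each key is blocked on a lock currently held by its value.
waits_for = {
    "CPU0": "CPU2",  # wants commit_mutex (transaction mutex)
    "CPU1": "CPU0",  # wants nlk_cb_mutex (netlink dump mutex)
    "CPU2": "CPU1",  # wants nfnl_subsys_ipset (ipset subsys mutex)
}


def find_wait_cycle(waits_for, start):
    """Follow waits-for edges from start; return the cycle found, or None."""
    path = []
    seen = set()
    node = start
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # chain ended at a CPU that is not waiting
    # node was visited before: the cycle is the path from its first visit.
    return path[path.index(node):] + [node]


cycle = find_wait_cycle(waits_for, "CPU0")
print(" -> ".join(cycle))
```

Removing any one edge from the table (for example, CPU0 no longer
needing the transaction mutex) makes the helper return None.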

This bug was added when 'nft reset' started to grab the transaction
mutex from the dump callback path in nf_tables.

Not yet sure how to avoid it.
Maybe we could get rid of 'lock(nfnl_subsys_ipset);'
from the xt_set module call paths.

Or add a new lock (spinlock?) to protect the 'reset' object info
instead of using the transaction mutex.

I haven't given it much thought yet and will likely not
investigate further for the next two weeks.


* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-21 23:16 ` Florian Westphal
@ 2025-12-22  9:21   ` Pablo Neira Ayuso
  2025-12-22  9:30     ` Pablo Neira Ayuso
  2025-12-23 12:32   ` Jozsef Kadlecsik
  1 sibling, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2025-12-22  9:21 UTC
  To: Florian Westphal
  Cc: syzbot, coreteam, davem, edumazet, horms, kadlec, kuba,
	linux-kernel, netdev, netfilter-devel, pabeni, phil,
	syzkaller-bugs

On Mon, Dec 22, 2025 at 12:16:53AM +0100, Florian Westphal wrote:
> syzbot <syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com> wrote:
> > syz.3.970/9330 is trying to acquire lock:
> > ffff888012d4ccd8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
> > 
> > but task is already holding lock:
> > ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404
> > 
> > which lock already depends on the new lock.
> 
> I think this is a real bug:

Yes, I think so too, it was a bad idea to use the commit_mutex for this.

> CPU0: 'nft reset'.
> CPU1: 'ipset list' (anything in ipset doing a netlink dump op)
> CPU2: 'iptables-nft -A ... -m set ...'
> 
> ... can result in:
> 
> CPU0                    CPU1                            CPU2
> ----                    ----                            ----
> lock(nlk_cb_mutex-NETFILTER);
>                         lock(nfnl_subsys_ipset);
>                                                        lock(&nft_net->commit_mutex);
>                         lock(nlk_cb_mutex-NETFILTER);
>                                                        lock(nfnl_subsys_ipset);
> lock(&nft_net->commit_mutex);
> 
> CPU0 is waiting for CPU2 to release transaction mutex.
> CPU1 is waiting for CPU0 to release the netlink dump mutex
> CPU2 is waiting for CPU1 to release the ipset subsys mutex
> 
> This bug was added when 'nft reset' started to grab the transaction
> mutex from the dump callback path in nf_tables.
> 
> Not yet sure how to avoid it.
> Maybe we could get rid of 'lock(nfnl_subsys_ipset);'
> from the xt_set module call paths.
> 
> Or add a new lock (spinlock?) to protect the 'reset' object info
> instead of using the transaction mutex.
> 
> I haven't given it much thought yet and will likely not
> investigate further for the next two weeks.




* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-22  9:21   ` Pablo Neira Ayuso
@ 2025-12-22  9:30     ` Pablo Neira Ayuso
  2025-12-22 11:16       ` Florian Westphal
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2025-12-22  9:30 UTC
  To: Florian Westphal
  Cc: syzbot, coreteam, davem, edumazet, horms, kadlec, kuba,
	linux-kernel, netdev, netfilter-devel, pabeni, phil,
	syzkaller-bugs

Sorry, I pressed send too fast... see below.

On Mon, Dec 22, 2025 at 10:22:02AM +0100, Pablo Neira Ayuso wrote:
> On Mon, Dec 22, 2025 at 12:16:53AM +0100, Florian Westphal wrote:
> > syzbot <syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com> wrote:
> > > syz.3.970/9330 is trying to acquire lock:
> > > ffff888012d4ccd8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
> > > 
> > > but task is already holding lock:
> > > ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404
> > > 
> > > which lock already depends on the new lock.
> > 
> > I think this is a real bug:
> 
> Yes, I think so too, it was a bad idea to use the commit_mutex for this.
> 
> > CPU0: 'nft reset'.
> > CPU1: 'ipset list' (anything in ipset doing a netlink dump op)
> > CPU2: 'iptables-nft -A ... -m set ...'
> > 
> > ... can result in:
> > 
> > CPU0                    CPU1                            CPU2
> > ----                    ----                            ----
> > lock(nlk_cb_mutex-NETFILTER);
> >                         lock(nfnl_subsys_ipset);
> >                                                        lock(&nft_net->commit_mutex);
> >                         lock(nlk_cb_mutex-NETFILTER);
> >                                                        lock(nfnl_subsys_ipset);
> > lock(&nft_net->commit_mutex);

Would it work to use a separate mutex for reset itself?

> > CPU0 is waiting for CPU2 to release transaction mutex.
> > CPU1 is waiting for CPU0 to release the netlink dump mutex
> > CPU2 is waiting for CPU1 to release the ipset subsys mutex
> > 
> > This bug was added when 'nft reset' started to grab the transaction
> > mutex from the dump callback path in nf_tables.
> > 
> > Not yet sure how to avoid it.
> > Maybe we could get rid of 'lock(nfnl_subsys_ipset);'
> > from the xt_set module call paths.
> > 
> > Or add a new lock (spinlock?) to protect the 'reset' object info
> > instead of using the transaction mutex.
> > 
> > I haven't given it much thought yet and will likely not
> > investigate further for the next two weeks.
> 
> 


* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-22  9:30     ` Pablo Neira Ayuso
@ 2025-12-22 11:16       ` Florian Westphal
  0 siblings, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2025-12-22 11:16 UTC
  To: Pablo Neira Ayuso
  Cc: syzbot, coreteam, davem, edumazet, horms, kadlec, kuba,
	linux-kernel, netdev, netfilter-devel, pabeni, phil,
	syzkaller-bugs

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > CPU0: 'nft reset'.
> > > CPU1: 'ipset list' (anything in ipset doing a netlink dump op)
> > > CPU2: 'iptables-nft -A ... -m set ...'
> > > 
> > > ... can result in:
> > > 
> > > CPU0                    CPU1                            CPU2
> > > ----                    ----                            ----
> > > lock(nlk_cb_mutex-NETFILTER);
> > >                         lock(nfnl_subsys_ipset);
> > >                                                        lock(&nft_net->commit_mutex);
> > >                         lock(nlk_cb_mutex-NETFILTER);
> > >                                                        lock(nfnl_subsys_ipset);
> > > lock(&nft_net->commit_mutex);
> 
> Would it work to use a separate mutex for reset itself?

I think so, yes. Its only job is to prevent concurrent reset actions;
the objects themselves are protected by RCU.

Parallel add/removal should be fine.
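
What "prevent concurrent reset actions" means can be illustrated with
a small userspace sketch. It assumes a single counter-like object and
a dedicated lock that is taken only around the read-and-zero step;
`Counter`, `reset_lock` and `dump_and_reset` are hypothetical names,
not the nf_tables implementation, and the sketch ignores concurrent
packet-path updates:

```python
import threading


class Counter:
    """Stand-in for a stateful object such as a counter (hypothetical)."""

    def __init__(self, packets):
        self.packets = packets


# Hypothetical dedicated lock: serializes resets only, nothing else.
reset_lock = threading.Lock()


def dump_and_reset(counter):
    """Report the current value and zero it as one serialized step."""
    with reset_lock:
        value = counter.packets
        counter.packets = 0
    return value


counter = Counter(packets=1000)
results = []


def worker():
    # list.append is atomic in CPython, so no extra lock is needed here.
    results.append(dump_and_reset(counter))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Exactly one resetter observes the 1000 packets; the others see 0.
print(sorted(results))
```

Without the lock, two resetters could both read 1000 before either
writes 0, so the same packets would be reported twice.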


* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-21 23:16 ` Florian Westphal
  2025-12-22  9:21   ` Pablo Neira Ayuso
@ 2025-12-23 12:32   ` Jozsef Kadlecsik
  2025-12-23 13:14     ` Florian Westphal
  1 sibling, 1 reply; 7+ messages in thread
From: Jozsef Kadlecsik @ 2025-12-23 12:32 UTC
  To: Florian Westphal
  Cc: syzbot, coreteam, davem, edumazet, horms, kuba, linux-kernel,
	netdev, netfilter-devel, pabeni, pablo, phil, syzkaller-bugs

Hi,

On Mon, 22 Dec 2025, Florian Westphal wrote:

> syzbot <syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com> wrote:
> > syz.3.970/9330 is trying to acquire lock:
> > ffff888012d4ccd8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_dumpreset_obj+0x6f/0xa0 net/netfilter/nf_tables_api.c:8491
> > 
> > but task is already holding lock:
> > ffff88802bce36f0 (nlk_cb_mutex-NETFILTER){+.+.}-{4:4}, at: __netlink_dump_start+0x150/0x990 net/netlink/af_netlink.c:2404
> > 
> > which lock already depends on the new lock.
> 
> I think this is a real bug:
> 
> CPU0: 'nft reset'.
> CPU1: 'ipset list' (anything in ipset doing a netlink dump op)
> CPU2: 'iptables-nft -A ... -m set ...'
> 
> ... can result in:
> 
> CPU0                    CPU1                            CPU2
> ----                    ----                            ----
> lock(nlk_cb_mutex-NETFILTER);
>                         lock(nfnl_subsys_ipset);
>                                                        lock(&nft_net->commit_mutex);
>                         lock(nlk_cb_mutex-NETFILTER);
>                                                        lock(nfnl_subsys_ipset);
> lock(&nft_net->commit_mutex);
> 
> CPU0 is waiting for CPU2 to release transaction mutex.
> CPU1 is waiting for CPU0 to release the netlink dump mutex
> CPU2 is waiting for CPU1 to release the ipset subsys mutex
> 
> This bug was added when 'nft reset' started to grab the transaction
> mutex from the dump callback path in nf_tables.
> 
> Not yet sure how to avoid it.
> Maybe we could get rid of 'lock(nfnl_subsys_ipset);'
> from the xt_set module call paths.

I don't know how calling it could be avoided: userspace commands (ipset + 
iptables checkentry using ipset match/target) are serialized by 
nfnl_subsys_ipset.

Is there a way to force acquiring nlk_cb_mutex-NETFILTER first and then 
nfnl_subsys_ipset when doing a netlink dump?

> Or add a new lock (spinlock?) to protect the 'reset' object info
> instead of using the transaction mutex.
> 
> I haven't given it much thought yet and will likely not
> investigate further for the next two weeks.

Best regards,
Jozsef
-- 
E-mail : kadlec@netfilter.org, kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
Address: Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj
  2025-12-23 12:32   ` Jozsef Kadlecsik
@ 2025-12-23 13:14     ` Florian Westphal
  0 siblings, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2025-12-23 13:14 UTC
  To: Jozsef Kadlecsik
  Cc: syzbot, coreteam, davem, edumazet, horms, kuba, linux-kernel,
	netdev, netfilter-devel, pabeni, pablo, phil, syzkaller-bugs

Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> wrote:
> > Not yet sure how to avoid it.
> > Maybe we could get rid of 'lock(nfnl_subsys_ipset);'
> > from the xt_set module call paths.
> 
> I don't know how calling it could be avoided: userspace commands (ipset +
> iptables checkentry using ipset match/target) are serialized by
> nfnl_subsys_ipset.

Ok, thanks Jozsef.  In that case it's much simpler to leave ipset
alone and add a new reset serialization mutex in nf_tables.
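
Why a new mutex would break the reported chain can be checked on the
lock-order graph itself. The sketch below takes the three "held, then
acquired" edges from the lockdep report and compares them with a
variant in which the dump path takes a hypothetical `reset_mutex`
instead of commit_mutex. It assumes nothing else is ever acquired
while `reset_mutex` is held; `has_cycle` is an illustrative helper,
not lockdep's algorithm:

```python
def has_cycle(edges):
    """Return True if the directed lock-order graph contains a cycle."""
    graph = {}
    for held, taken in edges:
        graph.setdefault(held, set()).add(taken)
        graph.setdefault(taken, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = dict.fromkeys(graph, WHITE)

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # back edge: nxt is already on the path
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# (held, then acquired) pairs from the report.
before = [
    ("commit_mutex", "nfnl_subsys_ipset"),   # nf_tables_newrule -> xt_set
    ("nfnl_subsys_ipset", "nlk_cb_mutex"),   # ip_set_dump -> netlink dump
    ("nlk_cb_mutex", "commit_mutex"),        # nf_tables_dumpreset_obj
]

# Assumed fix: the dump callback takes a dedicated reset mutex instead.
after = before[:2] + [("nlk_cb_mutex", "reset_mutex")]

print(has_cycle(before), has_cycle(after))
```

In the second graph `reset_mutex` is a leaf, so the ipset paths keep
their existing ordering and no loop remains.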


end of thread, other threads:[~2025-12-23 13:14 UTC | newest]

Thread overview: 7+ messages
2025-12-20  0:58 [syzbot] [netfilter?] possible deadlock in nf_tables_dumpreset_obj syzbot
2025-12-21 23:16 ` Florian Westphal
2025-12-22  9:21   ` Pablo Neira Ayuso
2025-12-22  9:30     ` Pablo Neira Ayuso
2025-12-22 11:16       ` Florian Westphal
2025-12-23 12:32   ` Jozsef Kadlecsik
2025-12-23 13:14     ` Florian Westphal
