public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
@ 2024-07-01 20:19 syzbot
  2024-07-02  9:39 ` [syzbot] " syzbot
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: syzbot @ 2024-07-01 20:19 UTC (permalink / raw)
  To: coreteam, davem, edumazet, kadlec, kuba, linux-kernel, netdev,
	netfilter-devel, pabeni, pablo, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    1c5fc27bc48a Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=103b1da9980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=148791c6980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12cecb1e980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9672225af907/disk-1c5fc27b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0f14d163a914/vmlinux-1c5fc27b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ec6c331e6a6e/bzImage-1c5fc27b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-use-after-free in nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
BUG: KASAN: slab-use-after-free in nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45

CPU: 1 PID: 45 Comm: kworker/1:1 Not tainted 6.10.0-rc5-syzkaller-01137-g1c5fc27bc48a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: events nf_tables_trans_destroy_work
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
 nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
 nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
 process_one_work kernel/workqueue.c:3248 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3329
 worker_thread+0x86d/0xd50 kernel/workqueue.c:3409
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>

Allocated by task 6682:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
 kasan_kmalloc include/linux/kasan.h:211 [inline]
 kmalloc_trace_noprof+0x19c/0x2c0 mm/slub.c:4154
 kmalloc_noprof include/linux/slab.h:660 [inline]
 kzalloc_noprof include/linux/slab.h:778 [inline]
 nf_tables_newtable+0x52e/0x1dc0 net/netfilter/nf_tables_api.c:1465
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:522 [inline]
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:644 [inline]
 nfnetlink_rcv+0x1427/0x2a90 net/netfilter/nfnetlink.c:662
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x7f0/0x990 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
 ___sys_sendmsg net/socket.c:2639 [inline]
 __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6682:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2196 [inline]
 slab_free mm/slub.c:4438 [inline]
 kfree+0x149/0x360 mm/slub.c:4559
 nf_tables_table_destroy net/netfilter/nf_tables_api.c:1672 [inline]
 __nft_release_table+0xe80/0xf40 net/netfilter/nf_tables_api.c:11517
 nft_rcv_nl_event+0x55f/0x6d0 net/netfilter/nf_tables_api.c:11576
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:388
 netlink_release+0x11a6/0x1b10 net/netlink/af_netlink.c:787
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa27/0x27e0 kernel/exit.c:874
 do_group_exit+0x207/0x2c0 kernel/exit.c:1023
 __do_sys_exit_group kernel/exit.c:1034 [inline]
 __se_sys_exit_group kernel/exit.c:1032 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1032
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Last potentially related work creation:
 kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
 insert_work+0x3e/0x330 kernel/workqueue.c:2208
 __queue_work+0xc16/0xee0 kernel/workqueue.c:2360
 queue_work_on+0x1c2/0x380 kernel/workqueue.c:2411
 queue_work include/linux/workqueue.h:621 [inline]
 schedule_work include/linux/workqueue.h:682 [inline]
 __rhashtable_remove_fast_one include/linux/rhashtable.h:1069 [inline]
 __rhashtable_remove_fast include/linux/rhashtable.h:1093 [inline]
 rhltable_remove+0x1097/0x1160 include/linux/rhashtable.h:1144
 nft_chain_del net/netfilter/nf_tables_api.c:9781 [inline]
 __nft_release_table+0xc57/0xf40 net/netfilter/nf_tables_api.c:11513
 nft_rcv_nl_event+0x55f/0x6d0 net/netfilter/nf_tables_api.c:11576
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:388
 netlink_release+0x11a6/0x1b10 net/netlink/af_netlink.c:787
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa27/0x27e0 kernel/exit.c:874
 do_group_exit+0x207/0x2c0 kernel/exit.c:1023
 __do_sys_exit_group kernel/exit.c:1034 [inline]
 __se_sys_exit_group kernel/exit.c:1032 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1032
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88802b005000
 which belongs to the cache kmalloc-cg-512 of size 512
The buggy address is located 452 bytes inside of
 freed 512-byte region [ffff88802b005000, ffff88802b005200)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2b004
head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffefff(slab)
raw: 00fff00000000040 ffff88801504f140 ffffea0000b63700 0000000000000002
raw: 0000000000000000 0000000080100010 00000001ffffefff 0000000000000000
head: 00fff00000000040 ffff88801504f140 ffffea0000b63700 0000000000000002
head: 0000000000000000 0000000080100010 00000001ffffefff 0000000000000000
head: 00fff00000000002 ffffea0000ac0101 ffffffffffffffff 0000000000000000
head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 4764, tgid 4764 (dhcpcd), ts 34584428687, free_ts 34557820760
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1473
 prep_new_page mm/page_alloc.c:1481 [inline]
 get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3425
 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4683
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 alloc_slab_page+0x5f/0x120 mm/slub.c:2265
 allocate_slab+0x5a/0x2f0 mm/slub.c:2428
 new_slab mm/slub.c:2481 [inline]
 ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3667
 __slab_alloc+0x58/0xa0 mm/slub.c:3757
 __slab_alloc_node mm/slub.c:3810 [inline]
 slab_alloc_node mm/slub.c:3990 [inline]
 __do_kmalloc_node mm/slub.c:4122 [inline]
 kmalloc_node_track_caller_noprof+0x281/0x440 mm/slub.c:4143
 kmalloc_reserve+0x111/0x2a0 net/core/skbuff.c:605
 __alloc_skb+0x1f3/0x440 net/core/skbuff.c:674
 alloc_skb include/linux/skbuff.h:1320 [inline]
 alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6524
 sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2815
 unix_dgram_sendmsg+0x6d3/0x1f80 net/unix/af_unix.c:2030
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
 __do_sys_sendto net/socket.c:2204 [inline]
 __se_sys_sendto net/socket.c:2200 [inline]
 __x64_sys_sendto+0xde/0x100 net/socket.c:2200
page last free pid 4764 tgid 4764 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1093 [inline]
 free_unref_page+0xd22/0xea0 mm/page_alloc.c:2588
 stack_depot_save_flags+0x6f6/0x830 lib/stackdepot.c:666
 kasan_save_stack mm/kasan/common.c:48 [inline]
 kasan_save_track+0x51/0x80 mm/kasan/common.c:68
 unpoison_slab_object mm/kasan/common.c:312 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slub.c:3940 [inline]
 slab_alloc_node mm/slub.c:4002 [inline]
 kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4045
 kmalloc_reserve+0xa8/0x2a0 net/core/skbuff.c:583
 __alloc_skb+0x1f3/0x440 net/core/skbuff.c:674
 alloc_skb include/linux/skbuff.h:1320 [inline]
 __ip6_append_data+0x2ba6/0x4070 net/ipv6/ip6_output.c:1648
 ip6_append_data+0x264/0x3a0 net/ipv6/ip6_output.c:1835
 rawv6_sendmsg+0x18f1/0x23c0 net/ipv6/raw.c:919
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x1a6/0x270 net/socket.c:745
 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
 ___sys_sendmsg net/socket.c:2639 [inline]
 __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
 ffff88802b005080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88802b005100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88802b005180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff88802b005200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88802b005280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
@ 2024-07-02  9:39 ` syzbot
  2024-07-02 11:28 ` Hillf Danton
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-02  9:39 UTC (permalink / raw)
  To: linux-kernel

For archival purposes, forwarding an incoming command email to
linux-kernel@vger.kernel.org.

***

Subject: Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
Author: fw@strlen.de

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf-next.git flush_uncond

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
  2024-07-02  9:39 ` [syzbot] " syzbot
@ 2024-07-02 11:28 ` Hillf Danton
  2024-07-02 13:57   ` syzbot
  2024-07-02 13:14 ` Edward Adam Davis
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-02 11:28 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Mon, 01 Jul 2024 13:19:15 -0700
> syzbot found the following issue on:
> 
> HEAD commit:    1c5fc27bc48a Merge tag 'nf-next-24-06-28' of git://git.ker..
> git tree:       net-next
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12cecb1e980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  1c5fc27bc48a

--- x/include/net/netfilter/nf_tables.h
+++ y/include/net/netfilter/nf_tables.h
@@ -1281,6 +1281,7 @@ struct nft_table {
 	u64				hgenerator;
 	u64				handle;
 	u32				use;
+	atomic_t			cnt;
 	u16				family:6,
 					flags:8,
 					genmask:2;
--- x/net/netfilter/nf_tables_api.c
+++ y/net/netfilter/nf_tables_api.c
@@ -162,6 +162,8 @@ static struct nft_trans *nft_trans_alloc
 
 	trans->net = ctx->net;
 	trans->table = ctx->table;
+	if (trans->table)
+		atomic_inc(&trans->table->cnt);
 	trans->seq = ctx->seq;
 	trans->flags = ctx->flags;
 	trans->report = ctx->report;
@@ -1498,6 +1500,7 @@ static int nf_tables_newtable(struct sk_
 	if (err < 0)
 		goto err_trans;
 
+	atomic_inc(&table->cnt);
 	list_add_tail_rcu(&table->list, &nft_net->tables);
 	return 0;
 err_trans:
@@ -1663,6 +1666,8 @@ static int nf_tables_deltable(struct sk_
 
 static void nf_tables_table_destroy(struct nft_table *table)
 {
+	if (!atomic_dec_and_test(&table->cnt))
+		return;
 	if (WARN_ON(table->use > 0))
 		return;
 
@@ -9532,7 +9537,6 @@ static void nft_commit_release(struct nf
 	switch (trans->msg_type) {
 	case NFT_MSG_DELTABLE:
 	case NFT_MSG_DESTROYTABLE:
-		nf_tables_table_destroy(trans->table);
 		break;
 	case NFT_MSG_NEWCHAIN:
 		free_percpu(nft_trans_chain_stats(trans));
@@ -9572,6 +9576,9 @@ static void nft_commit_release(struct nf
 		break;
 	}
 
+	if (trans->table)
+		nf_tables_table_destroy(trans->table);
+
 	if (trans->put_net)
 		put_net(trans->net);
 
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
  2024-07-02  9:39 ` [syzbot] " syzbot
  2024-07-02 11:28 ` Hillf Danton
@ 2024-07-02 13:14 ` Edward Adam Davis
  2024-07-02 14:13   ` syzbot
  2024-07-02 14:08 ` [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier Florian Westphal
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Edward Adam Davis @ 2024-07-02 13:14 UTC (permalink / raw)
  To: syzbot+4fd66a69358fc15ae2ad; +Cc: linux-kernel, syzkaller-bugs

Unbalanced increase and decrease of use in the table

#syz test: net-next 1c5fc27bc48a

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 70d0bad029fd..b77b764955a0 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1498,6 +1498,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
 	if (err < 0)
 		goto err_trans;
 
+	nft_use_inc_restore(&table->use);
 	list_add_tail_rcu(&table->list, &nft_net->tables);
 	return 0;
 err_trans:


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
       [not found] <20240702093924.GA23593@breakpoint.cc>
@ 2024-07-02 13:33 ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-02 13:33 UTC (permalink / raw)
  To: fw, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com

Tested on:

commit:         f2a491c9 netfilter: nf_tables: unconditionally flush p..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf-next.git flush_uncond
console output: https://syzkaller.appspot.com/x/log.txt?x=1626b9c6980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e78fc116033e0ab7
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-02 11:28 ` Hillf Danton
@ 2024-07-02 13:57   ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-02 13:57 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com

Tested on:

commit:         1c5fc27b Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=10e91b25980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=104f8466980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
                   ` (2 preceding siblings ...)
  2024-07-02 13:14 ` Edward Adam Davis
@ 2024-07-02 14:08 ` Florian Westphal
  2024-07-03 10:35   ` Hillf Danton
  2024-07-02 14:55 ` Edward Adam Davis
  2024-07-06 23:13 ` Hillf Danton
  5 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-02 14:08 UTC (permalink / raw)
  To: netfilter-devel
  Cc: linux-kernel, netdev, syzkaller-bugs, Florian Westphal,
	syzbot+4fd66a69358fc15ae2ad

syzbot reports:

KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831
KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530
KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45
[..]
Workqueue: events nf_tables_trans_destroy_work
Call Trace:
 nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
 nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
 nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597

Problem is that the notifier does a conditional flush, but its possible
that the table-to-be-removed is still referenced by transactions being
processed by the worker, so we need to flush unconditionally.

We could make the flush_work depend on whether we found a table to delete
in nf-next to avoid the flush for most cases.

AFAICS this problem is only exposed in nf-next, with
commit e169285f8c56 ("netfilter: nf_tables: do not store nft_ctx in transaction objects"),
with this commit applied there is an unconditional fetch of
table->family which is whats triggering the above splat.

Fixes: 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier")
Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 02d75aefaa8e..683f6a4518ee 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -11552,8 +11552,7 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event,
 
 	gc_seq = nft_gc_seq_begin(nft_net);
 
-	if (!list_empty(&nf_tables_destroy_list))
-		nf_tables_trans_destroy_flush_work();
+	nf_tables_trans_destroy_flush_work();
 again:
 	list_for_each_entry(table, &nft_net->tables, list) {
 		if (nft_table_has_owner(table) &&
-- 
2.44.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-02 13:14 ` Edward Adam Davis
@ 2024-07-02 14:13   ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-02 14:13 UTC (permalink / raw)
  To: eadavis, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in __nft_release_table

------------[ cut here ]------------
WARNING: CPU: 1 PID: 5870 at net/netfilter/nf_tables_api.c:1667 nf_tables_table_destroy net/netfilter/nf_tables_api.c:1667 [inline]
WARNING: CPU: 1 PID: 5870 at net/netfilter/nf_tables_api.c:1667 __nft_release_table+0xed3/0xf40 net/netfilter/nf_tables_api.c:11518
Modules linked in:
CPU: 1 PID: 5870 Comm: syz.3.18 Not tainted 6.10.0-rc5-syzkaller-01137-g1c5fc27bc48a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
RIP: 0010:nf_tables_table_destroy net/netfilter/nf_tables_api.c:1667 [inline]
RIP: 0010:__nft_release_table+0xed3/0xf40 net/netfilter/nf_tables_api.c:11518
Code: 8b 04 25 28 00 00 00 48 3b 84 24 e0 00 00 00 75 72 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 6e f3 e0 f7 90 <0f> 0b 90 eb a8 89 d9 80 e1 07 fe c1 38 c1 0f 8c fd f1 ff ff 48 89
RSP: 0018:ffffc900045e78a0 EFLAGS: 00010293
RAX: ffffffff89b53392 RBX: 0000000000000001 RCX: ffff88802a273c00
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffc900045e79d0 R08: ffffffff89b532c3 R09: 1ffffffff25f78ca
R10: dffffc0000000000 R11: fffffbfff25f78cb R12: dffffc0000000000
R13: ffff88801ee19100 R14: ffff88801181e970 R15: ffff88801181e9c0
FS:  00005555817aa500(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000200 CR3: 000000001c2b0000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 nft_rcv_nl_event+0x55f/0x6d0 net/netfilter/nf_tables_api.c:11577
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:388
 netlink_release+0x11a6/0x1b10 net/netlink/af_netlink.c:787
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f91b1775b99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff8ddfaf58 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 0000000000019328 RCX: 00007f91b1775b99
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: ffffffffffffffff R08: 0000000000000001 R09: 000000048ddfb26f
R10: 00007f91b1600000 R11: 0000000000000246 R12: 00007f91b1903fac
R13: 0000000000000032 R14: 00007f91b1905aa0 R15: 00007f91b1903fa0
 </TASK>


Tested on:

commit:         1c5fc27b Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=107e8ec6980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=137e0f1e980000


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
                   ` (3 preceding siblings ...)
  2024-07-02 14:08 ` [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier Florian Westphal
@ 2024-07-02 14:55 ` Edward Adam Davis
  2024-07-02 15:28   ` syzbot
  2024-07-06 23:13 ` Hillf Danton
  5 siblings, 1 reply; 27+ messages in thread
From: Edward Adam Davis @ 2024-07-02 14:55 UTC (permalink / raw)
  To: syzbot+4fd66a69358fc15ae2ad; +Cc: linux-kernel, syzkaller-bugs

Unbalanced increase and decrease of use in the table

#syz test: net-next 1c5fc27bc48a

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 70d0bad029fd..33139e1b6d46 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1498,6 +1498,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
 	if (err < 0)
 		goto err_trans;
 
+	nft_use_inc_restore(&table->use);
 	list_add_tail_rcu(&table->list, &nft_net->tables);
 	return 0;
 err_trans:
@@ -9529,6 +9530,9 @@ static void nft_commit_release(struct nft_trans *trans)
 
 	nft_ctx_update(&ctx, trans);
 
+	if (trans->table)
+		nft_use_dec(&trans->table->use);
+
 	switch (trans->msg_type) {
 	case NFT_MSG_DELTABLE:
 	case NFT_MSG_DESTROYTABLE:


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-02 14:55 ` Edward Adam Davis
@ 2024-07-02 15:28   ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-02 15:28 UTC (permalink / raw)
  To: eadavis, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in __nft_release_table

------------[ cut here ]------------
WARNING: CPU: 1 PID: 5991 at net/netfilter/nf_tables_api.c:1667 nf_tables_table_destroy net/netfilter/nf_tables_api.c:1667 [inline]
WARNING: CPU: 1 PID: 5991 at net/netfilter/nf_tables_api.c:1667 __nft_release_table+0xed3/0xf40 net/netfilter/nf_tables_api.c:11521
Modules linked in:
CPU: 1 PID: 5991 Comm: syz.1.21 Not tainted 6.10.0-rc5-syzkaller-01137-g1c5fc27bc48a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
RIP: 0010:nf_tables_table_destroy net/netfilter/nf_tables_api.c:1667 [inline]
RIP: 0010:__nft_release_table+0xed3/0xf40 net/netfilter/nf_tables_api.c:11521
Code: 8b 04 25 28 00 00 00 48 3b 84 24 e0 00 00 00 75 72 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 9e f2 e0 f7 90 <0f> 0b 90 eb a8 89 d9 80 e1 07 fe c1 38 c1 0f 8c fd f1 ff ff 48 89
RSP: 0018:ffffc9000465f8a0 EFLAGS: 00010293
RAX: ffffffff89b53462 RBX: 0000000000000001 RCX: ffff8880258d8000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffc9000465f9d0 R08: ffffffff89b53393 R09: 1ffffffff25f78ca
R10: dffffc0000000000 R11: fffffbfff25f78cb R12: dffffc0000000000
R13: ffff88802be8fb00 R14: ffff8880644edd70 R15: ffff8880644eddc0
FS:  000055556d3a6500(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6829251378 CR3: 000000002a83e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 nft_rcv_nl_event+0x55f/0x6d0 net/netfilter/nf_tables_api.c:11580
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:388
 netlink_release+0x11a6/0x1b10 net/netlink/af_netlink.c:787
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f052bf75b99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd273700b8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 0000000000018813 RCX: 00007f052bf75b99
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: ffffffffffffffff R08: 0000000000000001 R09: 00000004273703cf
R10: 00007f052be00000 R11: 0000000000000246 R12: 00007f052c103fac
R13: 0000000000000032 R14: 00007f052c105aa0 R15: 00007f052c103fa0
 </TASK>


Tested on:

commit:         1c5fc27b Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=120dd466980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=11b97f05980000


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-02 14:08 ` [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier Florian Westphal
@ 2024-07-03 10:35   ` Hillf Danton
  2024-07-03 10:52     ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-03 10:35 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Tue,  2 Jul 2024 16:08:14 +0200 Florian Westphal <fw@strlen.de>
> syzbot reports:
> 
> KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831
> KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530
> KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
> Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45
> [..]
> Workqueue: events nf_tables_trans_destroy_work
> Call Trace:
>  nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline]
>  nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline]
>  nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597
> 
> Problem is that the notifier does a conditional flush, but its possible
> that the table-to-be-removed is still referenced by transactions being
> processed by the worker, so we need to flush unconditionally.
> 
Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
if trans outlives table.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-03 10:35   ` Hillf Danton
@ 2024-07-03 10:52     ` Florian Westphal
  2024-07-03 12:09       ` Hillf Danton
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-03 10:52 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, netfilter-devel, linux-kernel, netdev,
	syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> if trans outlives table.

trans must never outlive table.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-03 10:52     ` Florian Westphal
@ 2024-07-03 12:09       ` Hillf Danton
  2024-07-03 13:01         ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-03 12:09 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > if trans outlives table.
> 
> trans must never outlive table.
> 
What is preventing trans from being freed after closing sock, given
trans is freed in workqueue?

	close sock
	queue work

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-03 12:09       ` Hillf Danton
@ 2024-07-03 13:01         ` Florian Westphal
  2024-07-04 10:35           ` Hillf Danton
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-03 13:01 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, netfilter-devel, linux-kernel, netdev,
	syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > if trans outlives table.
> > 
> > trans must never outlive table.
> > 
> What is preventing trans from being freed after closing sock, given
> trans is freed in workqueue?
> 
> 	close sock
> 	queue work

The notifier acquires the transaction mutex, locking out all other
transactions, so no further transactions requests referencing
the table can be queued.

The work queue is flushed before potentially ripping the table
out.  After this, no transactions referencing the table can exist
anymore; the only transactions than can still be queued are those
coming from a different netns, and tables are scoped per netns.

Table is torn down.  Transaction mutex is released.

Next transaction from userspace can't find the table anymore (its gone),
so no more transactions can be queued for this table.

As I wrote in the commit message, the flush is dumb, this should first
walk to see if there is a matching table to be torn down, and then flush
work queue once before tearing the table down.

But its better to clearly split bug fix and such a change.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-03 13:01         ` Florian Westphal
@ 2024-07-04 10:35           ` Hillf Danton
  2024-07-04 10:54             ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-04 10:35 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > > Hillf Danton <hdanton@sina.com> wrote:
> > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > if trans outlives table.
> > > 
> > > trans must never outlive table.
> > > 
> > What is preventing trans from being freed after closing sock, given
> > trans is freed in workqueue?
> > 
> > 	close sock
> > 	queue work
> 
> The notifier acquires the transaction mutex, locking out all other
> transactions, so no further transactions requests referencing
> the table can be queued.
> 
As per the syzbot report, trans->table could be instantiated before
notifier acquires the transaction mutex. And in fact the lock helps
trans outlive table even with your patch.

	cpu1			cpu2
	---			---
	transB->table = A
				lock trans mutex
				flush work
				free A
				unlock trans mutex

	queue work to free transB

> The work queue is flushed before potentially ripping the table
> out.  After this, no transactions referencing the table can exist
> anymore; the only transactions than can still be queued are those
> coming from a different netns, and tables are scoped per netns.
> 
> Table is torn down.  Transaction mutex is released.
> 
> Next transaction from userspace can't find the table anymore (its gone),
> so no more transactions can be queued for this table.
> 
> As I wrote in the commit message, the flush is dumb, this should first
> walk to see if there is a matching table to be torn down, and then flush
> work queue once before tearing the table down.
> 
> But its better to clearly split bug fix and such a change.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-04 10:35           ` Hillf Danton
@ 2024-07-04 10:54             ` Florian Westphal
  2024-07-05 10:48               ` Hillf Danton
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-04 10:54 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, netfilter-devel, linux-kernel, netdev,
	syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > > > Hillf Danton <hdanton@sina.com> wrote:
> > > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > > if trans outlives table.
> > > > 
> > > > trans must never outlive table.
> > > > 
> > > What is preventing trans from being freed after closing sock, given
> > > trans is freed in workqueue?
> > > 
> > > 	close sock
> > > 	queue work
> > 
> > The notifier acquires the transaction mutex, locking out all other
> > transactions, so no further transactions requests referencing
> > the table can be queued.
> > 
> As per the syzbot report, trans->table could be instantiated before
> notifier acquires the transaction mutex. And in fact the lock helps
> trans outlive table even with your patch.
> 
> 	cpu1			cpu2
> 	---			---
> 	transB->table = A
> 				lock trans mutex
> 				flush work
> 				free A
> 				unlock trans mutex
> 
> 	queue work to free transB

Can you show a crash reproducer or explain how this assign
and queueing happens unordered wrt. cpu2?

This should look like this:

 	cpu1			cpu2
 	---			---
	lock trans mutex
  				lock trans mutex -> blocks
 	transB->table = A
  	queue work to free transB
	unlock trans mutex
				lock trans mutex returns
  				flush work
  				free A
  				unlock trans mutex

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-04 10:54             ` Florian Westphal
@ 2024-07-05 10:48               ` Hillf Danton
  2024-07-05 11:02                 ` Florian Westphal
  2024-07-05 11:18                 ` [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
  0 siblings, 2 replies; 27+ messages in thread
From: Hillf Danton @ 2024-07-05 10:48 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Thu, 4 Jul 2024 12:54:18 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@strlen.de>
> > > Hillf Danton <hdanton@sina.com> wrote:
> > > > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@strlen.de>
> > > > > Hillf Danton <hdanton@sina.com> wrote:
> > > > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > > > if trans outlives table.
> > > > > 
> > > > > trans must never outlive table.
> > > > > 
> > > > What is preventing trans from being freed after closing sock, given
> > > > trans is freed in workqueue?
> > > > 
> > > > 	close sock
> > > > 	queue work
> > > 
> > > The notifier acquires the transaction mutex, locking out all other
> > > transactions, so no further transactions requests referencing
> > > the table can be queued.
> > > 
> > As per the syzbot report, trans->table could be instantiated before
> > notifier acquires the transaction mutex. And in fact the lock helps
> > trans outlive table even with your patch.
> > 
> > 	cpu1			cpu2
> > 	---			---
> > 	transB->table = A
> > 				lock trans mutex
> > 				flush work
> > 				free A
> > 				unlock trans mutex
> > 
> > 	queue work to free transB
> 
> Can you show a crash reproducer or explain how this assign
> and queueing happens unordered wrt. cpu2?
> 
Not so difficult.

> This should look like this:
> 
>  	cpu1			cpu2
>  	---			---
> 	lock trans mutex
>   				lock trans mutex -> blocks
>  	transB->table = A
>   	queue work to free transB
> 	unlock trans mutex
> 				lock trans mutex returns
>   				flush work
>   				free A
>   				unlock trans mutex
> 
If your patch is correct, it should survive a warning.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  1c5fc27bc48a

--- x/net/netfilter/nf_tables_api.c
+++ y/net/netfilter/nf_tables_api.c
@@ -11552,9 +11552,10 @@ static int nft_rcv_nl_event(struct notif
 
 	gc_seq = nft_gc_seq_begin(nft_net);
 
-	if (!list_empty(&nf_tables_destroy_list))
-		nf_tables_trans_destroy_flush_work();
+	nf_tables_trans_destroy_flush_work();
 again:
+	WARN_ON(!list_empty(&nft_net->commit_list));
+
 	list_for_each_entry(table, &nft_net->tables, list) {
 		if (nft_table_has_owner(table) &&
 		    n->portid == table->nlpid) {
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-05 10:48               ` Hillf Danton
@ 2024-07-05 11:02                 ` Florian Westphal
  2024-07-07  7:56                   ` Hillf Danton
  2024-07-05 11:18                 ` [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
  1 sibling, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-05 11:02 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, netfilter-devel, linux-kernel, netdev,
	syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> > 				lock trans mutex returns
> >   				flush work
> >   				free A
> >   				unlock trans mutex
> > 
> If your patch is correct, it should survive a warning.
> 
> #syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  1c5fc27bc48a
> 
> --- x/net/netfilter/nf_tables_api.c
> +++ y/net/netfilter/nf_tables_api.c
> @@ -11552,9 +11552,10 @@ static int nft_rcv_nl_event(struct notif
>  
>  	gc_seq = nft_gc_seq_begin(nft_net);
>  
> -	if (!list_empty(&nf_tables_destroy_list))
> -		nf_tables_trans_destroy_flush_work();
> +	nf_tables_trans_destroy_flush_work();
>  again:
> +	WARN_ON(!list_empty(&nft_net->commit_list));
> +

You could officially submit this patch to nf-next, this
is a slow path and the transaction list must be empty here.

I think this change might be useful as it also documents
this requirement.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-05 10:48               ` Hillf Danton
  2024-07-05 11:02                 ` Florian Westphal
@ 2024-07-05 11:18                 ` syzbot
  1 sibling, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-05 11:18 UTC (permalink / raw)
  To: fw, hdanton, linux-kernel, netdev, netfilter-devel,
	syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com

Tested on:

commit:         1c5fc27b Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=152db3d1980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1395cd71980000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
                   ` (4 preceding siblings ...)
  2024-07-02 14:55 ` Edward Adam Davis
@ 2024-07-06 23:13 ` Hillf Danton
  2024-07-07  0:21   ` syzbot
  5 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-06 23:13 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  1c5fc27bc48a

--- x/net/netfilter/nf_tables_api.c
+++ y/net/netfilter/nf_tables_api.c
@@ -11552,9 +11552,10 @@ static int nft_rcv_nl_event(struct notif
 
 	gc_seq = nft_gc_seq_begin(nft_net);
 
-	if (!list_empty(&nf_tables_destroy_list))
-		nf_tables_trans_destroy_flush_work();
+	nf_tables_trans_destroy_flush_work();
 again:
+	WARN_ON(!list_empty(&nft_net->commit_list) || !list_empty(&nf_tables_destroy_list));
+
 	list_for_each_entry(table, &nft_net->tables, list) {
 		if (nft_table_has_owner(table) &&
 		    n->portid == table->nlpid) {
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work
  2024-07-06 23:13 ` Hillf Danton
@ 2024-07-07  0:21   ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2024-07-07  0:21 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in nft_rcv_nl_event

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6134 at net/netfilter/nf_tables_api.c:11557 nft_rcv_nl_event+0x1e2/0x730 net/netfilter/nf_tables_api.c:11557
Modules linked in:
CPU: 0 PID: 6134 Comm: syz.2.44 Not tainted 6.10.0-rc5-syzkaller-01137-g1c5fc27bc48a-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
RIP: 0010:nft_rcv_nl_event+0x1e2/0x730 net/netfilter/nf_tables_api.c:11557
Code: 6c 24 40 4c 89 7c 24 60 48 8b 44 24 78 42 80 3c 20 00 74 08 4c 89 ff e8 9c dc 46 f8 49 8b 07 4c 39 f8 74 0b e8 6f 0e e1 f7 90 <0f> 0b 90 eb 1c 48 8b 05 a2 34 b2 05 48 c7 c1 40 4d 67 8f 48 39 c8
RSP: 0018:ffffc90003eb79e0 EFLAGS: 00010293
RAX: ffffffff89b51c9f RBX: ffff88807abfac00 RCX: ffff888029719e00
RDX: 0000000000000000 RSI: ffffffff8bcabb40 RDI: ffffffff8c1fee00
RBP: ffffc90003eb7b28 R08: ffffffff92fbc657 R09: 1ffffffff25f78ca
R10: dffffc0000000000 R11: fffffbfff25f78cb R12: dffffc0000000000
R13: ffff888023bbd700 R14: ffff88807abfac00 R15: ffff88807abfac10
FS:  00005555803a6500(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000200 CR3: 000000007a250000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93
 blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:388
 netlink_release+0x11a6/0x1b10 net/netlink/af_netlink.c:787
 __sock_release net/socket.c:659 [inline]
 sock_close+0xbc/0x240 net/socket.c:1421
 __fput+0x406/0x8b0 fs/file_table.c:422
 task_work_run+0x24f/0x310 kernel/task_work.c:180
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x168/0x370 kernel/entry/common.c:218
 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f48f9d75b99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffecf0981a8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 000000000001d197 RCX: 00007f48f9d75b99
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: ffffffffffffffff R08: 0000000000000001 R09: 00000004cf0984bf
R10: 00007f48f9c00000 R11: 0000000000000246 R12: 00007f48f9f03fac
R13: 0000000000000032 R14: 00007f48f9f05aa0 R15: 00007f48f9f03fa0
 </TASK>


Tested on:

commit:         1c5fc27b Merge tag 'nf-next-24-06-28' of git://git.ker..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=16077369980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=5264b58fdff6e881
dashboard link: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=112dca6e980000


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-05 11:02                 ` Florian Westphal
@ 2024-07-07  7:56                   ` Hillf Danton
  2024-07-07  8:08                     ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-07  7:56 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Fri, 5 Jul 2024 13:02:18 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > > 				lock trans mutex returns
> > >   				flush work
> > >   				free A
> > >   				unlock trans mutex
> > > 
> > If your patch is correct, it should survive a warning.
> > 
> > #syz test https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git  1c5fc27bc48a
> > 
> > --- x/net/netfilter/nf_tables_api.c
> > +++ y/net/netfilter/nf_tables_api.c
> > @@ -11552,9 +11552,10 @@ static int nft_rcv_nl_event(struct notif
> >  
> >  	gc_seq = nft_gc_seq_begin(nft_net);
> >  
> > -	if (!list_empty(&nf_tables_destroy_list))
> > -		nf_tables_trans_destroy_flush_work();
> > +	nf_tables_trans_destroy_flush_work();
> >  again:
> > +	WARN_ON(!list_empty(&nft_net->commit_list));
> > +
> 
> You could officially submit this patch to nf-next, this
> is a slow path and the transaction list must be empty here.
> 
> I think this change might be useful as it also documents
> this requirement.

Yes it is boy and the current reproducer triggered another warning [1,2].

[1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/
[2] https://lore.kernel.org/lkml/000000000000a42289061c9d452e@google.com/

And feel free to take a look at fput() and sock_put() for instance of
freeing slab pieces asyncally.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-07  7:56                   ` Hillf Danton
@ 2024-07-07  8:08                     ` Florian Westphal
  2024-07-08 10:56                       ` Hillf Danton
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-07  8:08 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, netfilter-devel, linux-kernel, netdev,
	syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> > I think this change might be useful as it also documents
> > this requirement.
> 
> Yes it is boy and the current reproducer triggered another warning [1,2].
> 
> [1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/

The WARN is incorrect.  The destroy list can be non-empty; i already
tried to explain why.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-07  8:08                     ` Florian Westphal
@ 2024-07-08 10:56                       ` Hillf Danton
  2024-07-08 11:58                         ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-08 10:56 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Tejun Heo, netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Sun, 7 Jul 2024 10:08:24 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > > I think this change might be useful as it also documents
> > > this requirement.
> > 
> > Yes it is boy and the current reproducer triggered another warning [1,2].
> > 
> > [1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/
> 
> The WARN is incorrect.  The destroy list can be non-empty; i already
> tried to explain why.
>
That warning as-is could be false positive but it could be triggered with a
single netns.

	cpu1		cpu2		cpu3
	---		---		---
					nf_tables_trans_destroy_work()
					spin_lock(&nf_tables_destroy_list_lock);

					// 1) clear the destroy list
					list_splice_init(&nf_tables_destroy_list, &head);
					spin_unlock(&nf_tables_destroy_list_lock);

			nf_tables_commit_release()
			spin_lock(&nf_tables_destroy_list_lock);

			// 2) refill the destroy list
			list_splice_tail_init(&nft_net->commit_list, &nf_tables_destroy_list);
			spin_unlock(&nf_tables_destroy_list_lock);
			schedule_work(&trans_destroy_work);
			mutex_unlock(&nft_net->commit_mutex);

	nft_rcv_nl_event()
	mutex_lock(&nft_net->commit_mutex);
	flush_work(&trans_destroy_work);
	  start_flush_work()
	    insert_wq_barrier()
	    /*
	     * If @target is currently being executed, schedule the
	     * barrier to the worker; otherwise, put it after @target.
	     */

	// 3) flush work ends with the refilled destroy list left intact
	tear tables down


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-08 10:56                       ` Hillf Danton
@ 2024-07-08 11:58                         ` Florian Westphal
  2024-07-08 12:17                           ` Hillf Danton
  0 siblings, 1 reply; 27+ messages in thread
From: Florian Westphal @ 2024-07-08 11:58 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, Tejun Heo, netfilter-devel, linux-kernel,
	netdev, syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> On Sun, 7 Jul 2024 10:08:24 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > > I think this change might be useful as it also documents
> > > > this requirement.
> > > 
> > > Yes it is boy and the current reproducer triggered another warning [1,2].
> > > 
> > > [1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/
> > 
> > The WARN is incorrect.  The destroy list can be non-empty; i already
> > tried to explain why.
> >
> That warning as-is could be false positive but it could be triggered with a
> single netns.

How?

> 	cpu1		cpu2		cpu3
> 	---		---		---
> 					nf_tables_trans_destroy_work()
> 					spin_lock(&nf_tables_destroy_list_lock);
> 
> 					// 1) clear the destroy list
> 					list_splice_init(&nf_tables_destroy_list, &head);
> 					spin_unlock(&nf_tables_destroy_list_lock);
> 
> 			nf_tables_commit_release()
> 			spin_lock(&nf_tables_destroy_list_lock);
> 
> 			// 2) refill the destroy list
> 			list_splice_tail_init(&nft_net->commit_list, &nf_tables_destroy_list);
> 			spin_unlock(&nf_tables_destroy_list_lock);
> 			schedule_work(&trans_destroy_work);
> 			mutex_unlock(&nft_net->commit_mutex);

So you're saying work can be IDLE after schedule_work()?

I'm not following at all.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-08 11:58                         ` Florian Westphal
@ 2024-07-08 12:17                           ` Hillf Danton
  2024-07-08 12:43                             ` Florian Westphal
  0 siblings, 1 reply; 27+ messages in thread
From: Hillf Danton @ 2024-07-08 12:17 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Tejun Heo, netfilter-devel, linux-kernel, netdev, syzkaller-bugs,
	syzbot+4fd66a69358fc15ae2ad

On Mon, 8 Jul 2024 13:58:31 +0200 Florian Westphal <fw@strlen.de>
> Hillf Danton <hdanton@sina.com> wrote:
> > On Sun, 7 Jul 2024 10:08:24 +0200 Florian Westphal <fw@strlen.de>
> > > Hillf Danton <hdanton@sina.com> wrote:
> > > > > I think this change might be useful as it also documents
> > > > > this requirement.
> > > > 
> > > > Yes it is boy and the current reproducer triggered another warning [1,2].
> > > > 
> > > > [1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/
> > > 
> > > The WARN is incorrect.  The destroy list can be non-empty; i already
> > > tried to explain why.
> > >
> > That warning as-is could be false positive but it could be triggered with a
> > single netns.
> 
> How?
> 
You saw the below cpu diagram, no?

> > 	cpu1		cpu2		cpu3
> > 	---		---		---
> > 					nf_tables_trans_destroy_work()
> > 					spin_lock(&nf_tables_destroy_list_lock);
> > 
> > 					// 1) clear the destroy list
> > 					list_splice_init(&nf_tables_destroy_list, &head);
> > 					spin_unlock(&nf_tables_destroy_list_lock);
> > 
> > 			nf_tables_commit_release()
> > 			spin_lock(&nf_tables_destroy_list_lock);
> > 
> > 			// 2) refill the destroy list
> > 			list_splice_tail_init(&nft_net->commit_list, &nf_tables_destroy_list);
> > 			spin_unlock(&nf_tables_destroy_list_lock);
> > 			schedule_work(&trans_destroy_work);
> > 			mutex_unlock(&nft_net->commit_mutex);
> 
> So you're saying work can be IDLE after schedule_work()?
> 
I got your point but difficult to explain you. In simple words,
like runqueue, workqueue has latency.

> I'm not following at all.

This does not matter but is why I added tj to the cc list.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier
  2024-07-08 12:17                           ` Hillf Danton
@ 2024-07-08 12:43                             ` Florian Westphal
  0 siblings, 0 replies; 27+ messages in thread
From: Florian Westphal @ 2024-07-08 12:43 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Florian Westphal, Tejun Heo, netfilter-devel, linux-kernel,
	netdev, syzkaller-bugs, syzbot+4fd66a69358fc15ae2ad

Hillf Danton <hdanton@sina.com> wrote:
> On Mon, 8 Jul 2024 13:58:31 +0200 Florian Westphal <fw@strlen.de>
> > Hillf Danton <hdanton@sina.com> wrote:
> > > On Sun, 7 Jul 2024 10:08:24 +0200 Florian Westphal <fw@strlen.de>
> > > > Hillf Danton <hdanton@sina.com> wrote:
> > > > > > I think this change might be useful as it also documents
> > > > > > this requirement.
> > > > > 
> > > > > Yes it is boy and the current reproducer triggered another warning [1,2].
> > > > > 
> > > > > [1] https://lore.kernel.org/lkml/20240706231332.3261-1-hdanton@sina.com/
> > > > 
> > > > The WARN is incorrect.  The destroy list can be non-empty; i already
> > > > tried to explain why.
> > > >
> > > That warning as-is could be false positive but it could be triggered with a
> > > single netns.
> > 
> > How?
> > 
> You saw the below cpu diagram, no?

It did not explain the problem in a way I understand.

>	cpu1		cpu2		cpu3
>	---		---		---
>					nf_tables_trans_destroy_work()
>					spin_lock(&nf_tables_destroy_list_lock);
>
>					// 1) clear the destroy list
>					list_splice_init(&nf_tables_destroy_list, &head);
>					spin_unlock(&nf_tables_destroy_list_lock);

This means @work is running on cpu3 and made a snapshot of the list.
I don't even understand how thats relevant, but OK.

>			nf_tables_commit_release()
>			spin_lock(&nf_tables_destroy_list_lock);
>			// 2) refill the destroy list
>			list_splice_tail_init(&nft_net->commit_list, &nf_tables_destroy_list);
>			spin_unlock(&nf_tables_destroy_list_lock);
>			schedule_work(&trans_destroy_work);
>			mutex_unlock(&nft_net->commit_mutex);

Means CPU2 has added transaction structures that could
reference @table to list.

It also called schedule_work BEFORE releasing the mutex and
after placing entries on destroy list.

> nft_rcv_nl_event()
> mutex_lock(&nft_net->commit_mutex);
> flush_work(&trans_destroy_work);

Means cpu1 serializes vs. cpu2, @work
was scheduled.

flush_work() must only return if @work is idle, without
any other pending execution.

If it gets scheduled again right after flush_work
returns that is NOT a problem, as I tried to explain several times.

We hold the transaction mutex, only a different netns can queue more
work, and such foreign netns can only see struct nft_table structures
that are private to their namespaces.

> // 3) flush work ends with the refilled destroy list left intact
> tear tables down

Again, I do not understand how its possible.

The postcondition after flush_work returns is:

1. nf_tables_destroy_list must be empty, UNLESS its from unrelated
   net namespaces, they cannot see the tables we're tearing down in 3),
   so they cannot reference them.

2. nf_tables_trans_destroy_work() is NOT running, unless its
   processing entries queued by other netns, after flush work
   returned.


cpu2 does:
   -> add trans->table to @nf_tables_destroy_list
   -> unlock list spinlock
   -> schedule_work
   -> unlock mutex

cpu1 does:
 -> lock mutex
 -> flush work

You say its not enough and that trans->table queued by cpu2 can still
be on @nf_tables_destroy_list.

I say flush_work after taking the mutex guarantees strans->table has been
processed by @work in all cases.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-07-08 12:43 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-01 20:19 [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
2024-07-02  9:39 ` [syzbot] " syzbot
2024-07-02 11:28 ` Hillf Danton
2024-07-02 13:57   ` syzbot
2024-07-02 13:14 ` Edward Adam Davis
2024-07-02 14:13   ` syzbot
2024-07-02 14:08 ` [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier Florian Westphal
2024-07-03 10:35   ` Hillf Danton
2024-07-03 10:52     ` Florian Westphal
2024-07-03 12:09       ` Hillf Danton
2024-07-03 13:01         ` Florian Westphal
2024-07-04 10:35           ` Hillf Danton
2024-07-04 10:54             ` Florian Westphal
2024-07-05 10:48               ` Hillf Danton
2024-07-05 11:02                 ` Florian Westphal
2024-07-07  7:56                   ` Hillf Danton
2024-07-07  8:08                     ` Florian Westphal
2024-07-08 10:56                       ` Hillf Danton
2024-07-08 11:58                         ` Florian Westphal
2024-07-08 12:17                           ` Hillf Danton
2024-07-08 12:43                             ` Florian Westphal
2024-07-05 11:18                 ` [syzbot] [netfilter?] KASAN: slab-use-after-free Read in nf_tables_trans_destroy_work syzbot
2024-07-02 14:55 ` Edward Adam Davis
2024-07-02 15:28   ` syzbot
2024-07-06 23:13 ` Hillf Danton
2024-07-07  0:21   ` syzbot
     [not found] <20240702093924.GA23593@breakpoint.cc>
2024-07-02 13:33 ` syzbot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox