public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 6.1 90/97] netfilter: nf_tables: do not defer rule destruction via call_rcu
Date: Tue, 20 May 2025 15:50:55 +0200	[thread overview]
Message-ID: <20250520125804.192426591@linuxfoundation.org> (raw)
In-Reply-To: <20250520125800.653047540@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.

nf_tables_chain_destroy can sleep, it can't be used from call_rcu
callbacks.

Moreover, nf_tables_rule_release() is only safe for error unwinding,
while transaction mutex is held and the to-be-desroyed rule was not
exposed to either dataplane or dumps, as it deactives+frees without
the required synchronize_rcu() in-between.

nft_rule_expr_deactivate() callbacks will change ->use counters
of other chains/sets, see e.g. nft_lookup .deactivate callback, these
must be serialized via transaction mutex.

Also add a few lockdep asserts to make this more explicit.

Calling synchronize_rcu() isn't ideal, but fixing this without is hard
and way more intrusive.  As-is, we can get:

WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..
Workqueue: events nf_tables_trans_destroy_work
RIP: 0010:nft_set_destroy+0x3fe/0x5c0
Call Trace:
 <TASK>
 nf_tables_trans_destroy_work+0x6b7/0xad0
 process_one_work+0x64a/0xce0
 worker_thread+0x613/0x10d0

In case the synchronize_rcu becomes an issue, we can explore alternatives.

One way would be to allocate nft_trans_rule objects + one nft_trans_chain
object, deactivate the rules + the chain and then defer the freeing to the
nft destroy workqueue.  We'd still need to keep the synchronize_rcu path as
a fallback to handle -ENOMEM corner cases though.

Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/netfilter/nf_tables.h |    3 ---
 net/netfilter/nf_tables_api.c     |   32 +++++++++++++++-----------------
 2 files changed, 15 insertions(+), 20 deletions(-)

--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1062,7 +1062,6 @@ struct nft_chain {
 	char				*name;
 	u16				udlen;
 	u8				*udata;
-	struct rcu_head			rcu_head;
 
 	/* Only used during control plane commit phase: */
 	struct nft_rule_blob		*blob_next;
@@ -1205,7 +1204,6 @@ static inline void nft_use_inc_restore(u
  *	@sets: sets in the table
  *	@objects: stateful objects in the table
  *	@flowtables: flow tables in the table
- *	@net: netnamespace this table belongs to
  *	@hgenerator: handle generator state
  *	@handle: table handle
  *	@use: number of chain references to this table
@@ -1221,7 +1219,6 @@ struct nft_table {
 	struct list_head		sets;
 	struct list_head		objects;
 	struct list_head		flowtables;
-	possible_net_t			net;
 	u64				hgenerator;
 	u64				handle;
 	u32				use;
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1413,7 +1413,6 @@ static int nf_tables_newtable(struct sk_
 	INIT_LIST_HEAD(&table->sets);
 	INIT_LIST_HEAD(&table->objects);
 	INIT_LIST_HEAD(&table->flowtables);
-	write_pnet(&table->net, net);
 	table->family = family;
 	table->flags = flags;
 	table->handle = ++nft_net->table_handle;
@@ -3511,8 +3510,11 @@ void nf_tables_rule_destroy(const struct
 	kfree(rule);
 }
 
+/* can only be used if rule is no longer visible to dumps */
 static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule)
 {
+	lockdep_commit_lock_is_held(ctx->net);
+
 	nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE);
 	nf_tables_rule_destroy(ctx, rule);
 }
@@ -5248,6 +5250,8 @@ void nf_tables_deactivate_set(const stru
 			      struct nft_set_binding *binding,
 			      enum nft_trans_phase phase)
 {
+	lockdep_commit_lock_is_held(ctx->net);
+
 	switch (phase) {
 	case NFT_TRANS_PREPARE_ERROR:
 		nft_set_trans_unbind(ctx, set);
@@ -10674,19 +10678,6 @@ static void __nft_release_basechain_now(
 	nf_tables_chain_destroy(ctx->chain);
 }
 
-static void nft_release_basechain_rcu(struct rcu_head *head)
-{
-	struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
-	struct nft_ctx ctx = {
-		.family	= chain->table->family,
-		.chain	= chain,
-		.net	= read_pnet(&chain->table->net),
-	};
-
-	__nft_release_basechain_now(&ctx);
-	put_net(ctx.net);
-}
-
 int __nft_release_basechain(struct nft_ctx *ctx)
 {
 	struct nft_rule *rule;
@@ -10701,11 +10692,18 @@ int __nft_release_basechain(struct nft_c
 	nft_chain_del(ctx->chain);
 	nft_use_dec(&ctx->table->use);
 
-	if (maybe_get_net(ctx->net))
-		call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
-	else
+	if (!maybe_get_net(ctx->net)) {
 		__nft_release_basechain_now(ctx);
+		return 0;
+	}
+
+	/* wait for ruleset dumps to complete.  Owning chain is no longer in
+	 * lists, so new dumps can't find any of these rules anymore.
+	 */
+	synchronize_rcu();
 
+	__nft_release_basechain_now(ctx);
+	put_net(ctx->net);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(__nft_release_basechain);



  parent reply	other threads:[~2025-05-20 13:59 UTC|newest]

Thread overview: 107+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-20 13:49 [PATCH 6.1 00/97] 6.1.140-rc1 review Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 01/97] binfmt: Fix whitespace issues Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 02/97] binfmt_elf: Support segments with 0 filesz and misaligned starts Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 03/97] binfmt_elf: elf_bss no longer used by load_elf_binary() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 04/97] selftests/exec: load_address: conform test to TAP format output Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 05/97] binfmt_elf: Leave a gap between .bss and brk Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 06/97] selftests/exec: Build both static and non-static load_address tests Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 07/97] binfmt_elf: Calculate total_size earlier Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 08/97] binfmt_elf: Honor PT_LOAD alignment for static PIE Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 09/97] binfmt_elf: Move brk for static PIE even if ASLR disabled Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 10/97] platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 11/97] tracing: probes: Fix a possible race in trace_probe_log APIs Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 12/97] tpm: tis: Double the timeout B to 4s Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 13/97] iio: adc: ad7266: Fix potential timestamp alignment issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 14/97] drm/amd: Stop evicting resources on APUs in suspend Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 15/97] drm/amdgpu: Fix the runtime resume failure issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 16/97] drm/amdgpu: trigger flr_work if reading pf2vf data failed Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 17/97] drm/amd: Add Suspend/Hibernate notification callback support Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 18/97] Revert "drm/amd: Stop evicting resources on APUs in suspend" Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 19/97] iio: adc: ad7768-1: Fix insufficient alignment of timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 20/97] iio: chemical: sps30: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 21/97] clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 22/97] RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 23/97] HID: thrustmaster: fix memory leak in thrustmaster_interrupts() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 24/97] HID: uclogic: Add NULL check in uclogic_input_configured() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 25/97] nfs: handle failure of nfs_get_lock_context in unlock path Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 26/97] spi: loopback-test: Do not split 1024-byte hexdumps Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 27/97] net_sched: Flush gso_skb list too during ->change() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 28/97] net: mctp: Ensure keys maintain only one ref to corresponding dev Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 29/97] net: cadence: macb: Fix a possible deadlock in macb_halt_tx Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 30/97] net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 31/97] nvme-pci: make nvme_pci_npages_prp() __always_inline Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 32/97] nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 33/97] ALSA: sh: SND_AICA should depend on SH_DMA_API Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 34/97] net/mlx5e: Disable MACsec offload for uplink representor profile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 35/97] qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 36/97] regulator: max20086: fix invalid memory access Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 37/97] octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 38/97] net/tls: fix kernel panic when alloc_page failed Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 39/97] NFSv4/pnfs: Reset the layout state after a layoutreturn Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 40/97] dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 41/97] LoongArch: Fix MAX_REG_OFFSET calculation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 42/97] btrfs: fix discard worker infinite loop after disabling discard Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 43/97] drm/amd/display: Correct the reply value when AUX write incomplete Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 44/97] drm/amd/display: Avoid flooding unnecessary info messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 45/97] ACPI: PPTT: Fix processor subtable walk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 46/97] ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 47/97] ALSA: usb-audio: Add sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 48/97] ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 49/97] dma-buf: insert memory barrier before updating num_fences Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 50/97] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 51/97] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 52/97] hv_netvsc: Remove rmsg_pgcnt Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 53/97] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 54/97] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 55/97] ftrace: Fix preemption accounting for stacktrace trigger command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 56/97] ftrace: Fix preemption accounting for stacktrace filter command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 57/97] tracing: samples: Initialize trace_array_printk() with the correct function Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 58/97] phy: Fix error handling in tegra_xusb_port_init Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 59/97] phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 60/97] phy: renesas: rcar-gen3-usb2: Set timing registers only once Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 61/97] scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 62/97] smb: client: fix memory leak during error handling for POSIX mkdir Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 63/97] wifi: mt76: disable napi on driver removal Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 64/97] net: qede: Initialize qede_ll_ops with designated initializer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 65/97] dmaengine: ti: k3-udma: Add missing locking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 66/97] dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 67/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 68/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 69/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 70/97] dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 71/97] dmaengine: idxd: Add missing cleanups in cleanup internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 72/97] dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 73/97] dmaengine: idxd: fix memory leak in error handling path of idxd_alloc Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 74/97] dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 75/97] usb: typec: ucsi: displayport: Fix deadlock Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 76/97] usb: typec: altmodes/displayport: create sysfs nodes as drivers default device attribute group Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 77/97] usb: typec: fix potential array underflow in ucsi_ccg_sync_control() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 78/97] usb: typec: fix pm usage counter imbalance " Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 79/97] selftests/mm: compaction_test: support platform with huge mount of memory Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 80/97] mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 81/97] riscv: mm: Fix the out of bound issue of vmemmap address Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 82/97] bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 83/97] bpf, arm64: Fix address emission with tag-based KASAN enabled Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 84/97] LoongArch: Explicitly specify code model in Makefile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 85/97] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 86/97] sctp: add mutual exclusion in proc_sctp_do_udp_port() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 87/97] btrfs: dont BUG_ON() when 0 reference count at btrfs_lookup_extent_info() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 88/97] netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 89/97] netfilter: nf_tables: wait for rcu grace period on net_device removal Greg Kroah-Hartman
2025-05-20 13:50 ` Greg Kroah-Hartman [this message]
2025-05-20 13:50 ` [PATCH 6.1 91/97] arm64/sme: Always exit sme_alloc() early with existing storage Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 92/97] platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 93/97] bnxt_en: Fix receive ring space parameters when XDP is active Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 94/97] ipv6: Fix potential uninit-value access in __ip6_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 95/97] ipv4: Fix uninit-value access in __ip_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 96/97] spi: cadence-qspi: fix pointer reference in runtime PM hooks Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 97/97] drm/amdgpu: fix pm notifier handling Greg Kroah-Hartman
2025-05-20 18:28 ` [PATCH 6.1 00/97] 6.1.140-rc1 review Florian Fainelli
2025-05-20 21:19 ` Shuah Khan
2025-05-21  1:46 ` Ron Economos
2025-05-21  8:31 ` Jon Hunter
2025-05-21 11:01 ` Naresh Kamboju
2025-05-21 11:46 ` Mark Brown
2025-05-21 14:23 ` Peter Schneider
2025-05-21 15:38 ` Pavel Machek
2025-05-22  5:07 ` Hardik Garg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250520125804.192426591@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox