All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 5.15 58/59] netfilter: nf_tables: do not defer rule destruction via call_rcu
Date: Tue, 20 May 2025 15:50:49 +0200	[thread overview]
Message-ID: <20250520125756.144291818@linuxfoundation.org> (raw)
In-Reply-To: <20250520125753.836407405@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.

nf_tables_chain_destroy can sleep, it can't be used from call_rcu
callbacks.

Moreover, nf_tables_rule_release() is only safe for error unwinding,
while transaction mutex is held and the to-be-desroyed rule was not
exposed to either dataplane or dumps, as it deactives+frees without
the required synchronize_rcu() in-between.

nft_rule_expr_deactivate() callbacks will change ->use counters
of other chains/sets, see e.g. nft_lookup .deactivate callback, these
must be serialized via transaction mutex.

Also add a few lockdep asserts to make this more explicit.

Calling synchronize_rcu() isn't ideal, but fixing this without is hard
and way more intrusive.  As-is, we can get:

WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..
Workqueue: events nf_tables_trans_destroy_work
RIP: 0010:nft_set_destroy+0x3fe/0x5c0
Call Trace:
 <TASK>
 nf_tables_trans_destroy_work+0x6b7/0xad0
 process_one_work+0x64a/0xce0
 worker_thread+0x613/0x10d0

In case the synchronize_rcu becomes an issue, we can explore alternatives.

One way would be to allocate nft_trans_rule objects + one nft_trans_chain
object, deactivate the rules + the chain and then defer the freeing to the
nft destroy workqueue.  We'd still need to keep the synchronize_rcu path as
a fallback to handle -ENOMEM corner cases though.

Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/netfilter/nf_tables.h |    3 ---
 net/netfilter/nf_tables_api.c     |   32 +++++++++++++++-----------------
 2 files changed, 15 insertions(+), 20 deletions(-)

--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1028,7 +1028,6 @@ struct nft_chain {
 	char				*name;
 	u16				udlen;
 	u8				*udata;
-	struct rcu_head			rcu_head;
 
 	/* Only used during control plane commit phase: */
 	struct nft_rule			**rules_next;
@@ -1171,7 +1170,6 @@ static inline void nft_use_inc_restore(u
  *	@sets: sets in the table
  *	@objects: stateful objects in the table
  *	@flowtables: flow tables in the table
- *	@net: netnamespace this table belongs to
  *	@hgenerator: handle generator state
  *	@handle: table handle
  *	@use: number of chain references to this table
@@ -1187,7 +1185,6 @@ struct nft_table {
 	struct list_head		sets;
 	struct list_head		objects;
 	struct list_head		flowtables;
-	possible_net_t			net;
 	u64				hgenerator;
 	u64				handle;
 	u32				use;
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1360,7 +1360,6 @@ static int nf_tables_newtable(struct sk_
 	INIT_LIST_HEAD(&table->sets);
 	INIT_LIST_HEAD(&table->objects);
 	INIT_LIST_HEAD(&table->flowtables);
-	write_pnet(&table->net, net);
 	table->family = family;
 	table->flags = flags;
 	table->handle = ++nft_net->table_handle;
@@ -3441,8 +3440,11 @@ void nf_tables_rule_destroy(const struct
 	kfree(rule);
 }
 
+/* can only be used if rule is no longer visible to dumps */
 static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule)
 {
+	lockdep_commit_lock_is_held(ctx->net);
+
 	nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE);
 	nf_tables_rule_destroy(ctx, rule);
 }
@@ -5178,6 +5180,8 @@ void nf_tables_deactivate_set(const stru
 			      struct nft_set_binding *binding,
 			      enum nft_trans_phase phase)
 {
+	lockdep_commit_lock_is_held(ctx->net);
+
 	switch (phase) {
 	case NFT_TRANS_PREPARE_ERROR:
 		nft_set_trans_unbind(ctx, set);
@@ -10440,19 +10444,6 @@ static void __nft_release_basechain_now(
 	nf_tables_chain_destroy(ctx->chain);
 }
 
-static void nft_release_basechain_rcu(struct rcu_head *head)
-{
-	struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
-	struct nft_ctx ctx = {
-		.family	= chain->table->family,
-		.chain	= chain,
-		.net	= read_pnet(&chain->table->net),
-	};
-
-	__nft_release_basechain_now(&ctx);
-	put_net(ctx.net);
-}
-
 int __nft_release_basechain(struct nft_ctx *ctx)
 {
 	struct nft_rule *rule;
@@ -10467,11 +10458,18 @@ int __nft_release_basechain(struct nft_c
 	nft_chain_del(ctx->chain);
 	nft_use_dec(&ctx->table->use);
 
-	if (maybe_get_net(ctx->net))
-		call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
-	else
+	if (!maybe_get_net(ctx->net)) {
 		__nft_release_basechain_now(ctx);
+		return 0;
+	}
+
+	/* wait for ruleset dumps to complete.  Owning chain is no longer in
+	 * lists, so new dumps can't find any of these rules anymore.
+	 */
+	synchronize_rcu();
 
+	__nft_release_basechain_now(ctx);
+	put_net(ctx->net);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(__nft_release_basechain);



  parent reply	other threads:[~2025-05-20 13:54 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-20 13:49 [PATCH 5.15 00/59] 5.15.184-rc1 review Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 01/59] platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 02/59] tracing: probes: Fix a possible race in trace_probe_log APIs Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 03/59] iio: adc: ad7768-1: Fix insufficient alignment of timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 04/59] iio: chemical: sps30: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 05/59] RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 06/59] nfs: handle failure of nfs_get_lock_context in unlock path Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 07/59] spi: loopback-test: Do not split 1024-byte hexdumps Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 08/59] net_sched: Flush gso_skb list too during ->change() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 09/59] net: cadence: macb: Fix a possible deadlock in macb_halt_tx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 10/59] net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 11/59] ALSA: sh: SND_AICA should depend on SH_DMA_API Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 12/59] qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 13/59] NFSv4/pnfs: Reset the layout state after a layoutreturn Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 14/59] x86,nospec: Simplify {JMP,CALL}_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 15/59] x86/speculation: Simplify and make CALL_NOSPEC consistent Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 16/59] x86/speculation: Add a conditional CS prefix to CALL_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 17/59] x86/speculation: Remove the extra #ifdef around CALL_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 18/59] Documentation: x86/bugs/its: Add ITS documentation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 19/59] x86/its: Enumerate Indirect Target Selection (ITS) bug Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 20/59] x86/its: Add support for ITS-safe indirect thunk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 21/59] x86/alternative: Optimize returns patching Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 22/59] x86/alternatives: Remove faulty optimization Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 23/59] x86/its: Add support for ITS-safe return thunk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 24/59] x86/its: Enable Indirect Target Selection mitigation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 25/59] x86/its: Add "vmexit" option to skip mitigation on some CPUs Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 26/59] x86/its: Align RETs in BHB clear sequence to avoid thunking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 27/59] x86/its: Use dynamic thunks for indirect branches Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 28/59] x86/its: Fix build errors when CONFIG_MODULES=n Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 29/59] x86/its: FineIBT-paranoid vs ITS Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 30/59] dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 31/59] btrfs: fix discard worker infinite loop after disabling discard Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 32/59] ACPI: PPTT: Fix processor subtable walk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 33/59] ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 34/59] ALSA: usb-audio: Add sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 35/59] ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 36/59] ftrace: Fix preemption accounting for stacktrace trigger command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 37/59] ftrace: Fix preemption accounting for stacktrace filter command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 38/59] tracing: samples: Initialize trace_array_printk() with the correct function Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 39/59] phy: Fix error handling in tegra_xusb_port_init Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 40/59] phy: renesas: rcar-gen3-usb2: Set timing registers only once Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 41/59] wifi: mt76: disable napi on driver removal Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 42/59] dmaengine: ti: k3-udma: Add missing locking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 43/59] dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 44/59] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 45/59] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 46/59] block: fix direct io NOWAIT flag not work Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 47/59] clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 48/59] usb: typec: ucsi: displayport: Fix deadlock Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 49/59] usb: typec: altmodes/displayport: create sysfs nodes as drivers default device attribute group Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 50/59] usb: typec: fix potential array underflow in ucsi_ccg_sync_control() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 51/59] usb: typec: fix pm usage counter imbalance " Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 52/59] selftests/mm: compaction_test: support platform with huge mount of memory Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 53/59] sctp: add mutual exclusion in proc_sctp_do_udp_port() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 54/59] btrfs: dont BUG_ON() when 0 reference count at btrfs_lookup_extent_info() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 55/59] btrfs: do not clean up repair bio if submit fails Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 56/59] netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 57/59] netfilter: nf_tables: wait for rcu grace period on net_device removal Greg Kroah-Hartman
2025-05-20 13:50 ` Greg Kroah-Hartman [this message]
2025-05-20 13:50 ` [PATCH 5.15 59/59] ice: arfs: fix use-after-free when freeing @rx_cpu_rmap Greg Kroah-Hartman
2025-05-20 18:19 ` [PATCH 5.15 00/59] 5.15.184-rc1 review Florian Fainelli
2025-05-20 22:46 ` Shuah Khan
2025-05-21  1:53 ` Ron Economos
2025-05-21  3:16 ` Vijayendra Suman
2025-05-21  8:30 ` Jon Hunter
2025-05-21 12:39 ` Naresh Kamboju
2025-05-21 18:54 ` Mark Brown
2025-05-21 19:10 ` Alexandre Chartre
2025-05-21 21:25   ` Pawan Gupta
2025-05-22  5:09 ` Hardik Garg
2025-05-23  9:25 ` Guenter Roeck
2025-05-27 19:31   ` Richard Narron
2025-05-28  0:55     ` Pawan Gupta
2025-05-29  4:49       ` Richard Narron
2025-05-29 17:40         ` Pawan Gupta
2025-05-30  5:11           ` Greg Kroah-Hartman
2025-05-30  5:21             ` Pawan Gupta
2025-05-30  6:04             ` Pawan Gupta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250520125756.144291818@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.