From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com,
Florian Westphal <fw@strlen.de>,
Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 5.15 58/59] netfilter: nf_tables: do not defer rule destruction via call_rcu
Date: Tue, 20 May 2025 15:50:49 +0200 [thread overview]
Message-ID: <20250520125756.144291818@linuxfoundation.org> (raw)
In-Reply-To: <20250520125753.836407405@linuxfoundation.org>
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal <fw@strlen.de>
commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.
nf_tables_chain_destroy can sleep, it can't be used from call_rcu
callbacks.
Moreover, nf_tables_rule_release() is only safe for error unwinding,
while transaction mutex is held and the to-be-desroyed rule was not
exposed to either dataplane or dumps, as it deactives+frees without
the required synchronize_rcu() in-between.
nft_rule_expr_deactivate() callbacks will change ->use counters
of other chains/sets, see e.g. nft_lookup .deactivate callback, these
must be serialized via transaction mutex.
Also add a few lockdep asserts to make this more explicit.
Calling synchronize_rcu() isn't ideal, but fixing this without is hard
and way more intrusive. As-is, we can get:
WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..
Workqueue: events nf_tables_trans_destroy_work
RIP: 0010:nft_set_destroy+0x3fe/0x5c0
Call Trace:
<TASK>
nf_tables_trans_destroy_work+0x6b7/0xad0
process_one_work+0x64a/0xce0
worker_thread+0x613/0x10d0
In case the synchronize_rcu becomes an issue, we can explore alternatives.
One way would be to allocate nft_trans_rule objects + one nft_trans_chain
object, deactivate the rules + the chain and then defer the freeing to the
nft destroy workqueue. We'd still need to keep the synchronize_rcu path as
a fallback to handle -ENOMEM corner cases though.
Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/netfilter/nf_tables.h | 3 ---
net/netfilter/nf_tables_api.c | 32 +++++++++++++++-----------------
2 files changed, 15 insertions(+), 20 deletions(-)
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1028,7 +1028,6 @@ struct nft_chain {
char *name;
u16 udlen;
u8 *udata;
- struct rcu_head rcu_head;
/* Only used during control plane commit phase: */
struct nft_rule **rules_next;
@@ -1171,7 +1170,6 @@ static inline void nft_use_inc_restore(u
* @sets: sets in the table
* @objects: stateful objects in the table
* @flowtables: flow tables in the table
- * @net: netnamespace this table belongs to
* @hgenerator: handle generator state
* @handle: table handle
* @use: number of chain references to this table
@@ -1187,7 +1185,6 @@ struct nft_table {
struct list_head sets;
struct list_head objects;
struct list_head flowtables;
- possible_net_t net;
u64 hgenerator;
u64 handle;
u32 use;
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1360,7 +1360,6 @@ static int nf_tables_newtable(struct sk_
INIT_LIST_HEAD(&table->sets);
INIT_LIST_HEAD(&table->objects);
INIT_LIST_HEAD(&table->flowtables);
- write_pnet(&table->net, net);
table->family = family;
table->flags = flags;
table->handle = ++nft_net->table_handle;
@@ -3441,8 +3440,11 @@ void nf_tables_rule_destroy(const struct
kfree(rule);
}
+/* can only be used if rule is no longer visible to dumps */
static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule)
{
+ lockdep_commit_lock_is_held(ctx->net);
+
nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE);
nf_tables_rule_destroy(ctx, rule);
}
@@ -5178,6 +5180,8 @@ void nf_tables_deactivate_set(const stru
struct nft_set_binding *binding,
enum nft_trans_phase phase)
{
+ lockdep_commit_lock_is_held(ctx->net);
+
switch (phase) {
case NFT_TRANS_PREPARE_ERROR:
nft_set_trans_unbind(ctx, set);
@@ -10440,19 +10444,6 @@ static void __nft_release_basechain_now(
nf_tables_chain_destroy(ctx->chain);
}
-static void nft_release_basechain_rcu(struct rcu_head *head)
-{
- struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
- struct nft_ctx ctx = {
- .family = chain->table->family,
- .chain = chain,
- .net = read_pnet(&chain->table->net),
- };
-
- __nft_release_basechain_now(&ctx);
- put_net(ctx.net);
-}
-
int __nft_release_basechain(struct nft_ctx *ctx)
{
struct nft_rule *rule;
@@ -10467,11 +10458,18 @@ int __nft_release_basechain(struct nft_c
nft_chain_del(ctx->chain);
nft_use_dec(&ctx->table->use);
- if (maybe_get_net(ctx->net))
- call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
- else
+ if (!maybe_get_net(ctx->net)) {
__nft_release_basechain_now(ctx);
+ return 0;
+ }
+
+ /* wait for ruleset dumps to complete. Owning chain is no longer in
+ * lists, so new dumps can't find any of these rules anymore.
+ */
+ synchronize_rcu();
+ __nft_release_basechain_now(ctx);
+ put_net(ctx->net);
return 0;
}
EXPORT_SYMBOL_GPL(__nft_release_basechain);
next prev parent reply other threads:[~2025-05-20 13:54 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 13:49 [PATCH 5.15 00/59] 5.15.184-rc1 review Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 01/59] platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 02/59] tracing: probes: Fix a possible race in trace_probe_log APIs Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 03/59] iio: adc: ad7768-1: Fix insufficient alignment of timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 04/59] iio: chemical: sps30: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 05/59] RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 06/59] nfs: handle failure of nfs_get_lock_context in unlock path Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 07/59] spi: loopback-test: Do not split 1024-byte hexdumps Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 5.15 08/59] net_sched: Flush gso_skb list too during ->change() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 09/59] net: cadence: macb: Fix a possible deadlock in macb_halt_tx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 10/59] net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 11/59] ALSA: sh: SND_AICA should depend on SH_DMA_API Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 12/59] qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 13/59] NFSv4/pnfs: Reset the layout state after a layoutreturn Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 14/59] x86,nospec: Simplify {JMP,CALL}_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 15/59] x86/speculation: Simplify and make CALL_NOSPEC consistent Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 16/59] x86/speculation: Add a conditional CS prefix to CALL_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 17/59] x86/speculation: Remove the extra #ifdef around CALL_NOSPEC Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 18/59] Documentation: x86/bugs/its: Add ITS documentation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 19/59] x86/its: Enumerate Indirect Target Selection (ITS) bug Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 20/59] x86/its: Add support for ITS-safe indirect thunk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 21/59] x86/alternative: Optimize returns patching Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 22/59] x86/alternatives: Remove faulty optimization Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 23/59] x86/its: Add support for ITS-safe return thunk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 24/59] x86/its: Enable Indirect Target Selection mitigation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 25/59] x86/its: Add "vmexit" option to skip mitigation on some CPUs Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 26/59] x86/its: Align RETs in BHB clear sequence to avoid thunking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 27/59] x86/its: Use dynamic thunks for indirect branches Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 28/59] x86/its: Fix build errors when CONFIG_MODULES=n Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 29/59] x86/its: FineIBT-paranoid vs ITS Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 30/59] dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 31/59] btrfs: fix discard worker infinite loop after disabling discard Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 32/59] ACPI: PPTT: Fix processor subtable walk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 33/59] ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 34/59] ALSA: usb-audio: Add sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 35/59] ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 36/59] ftrace: Fix preemption accounting for stacktrace trigger command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 37/59] ftrace: Fix preemption accounting for stacktrace filter command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 38/59] tracing: samples: Initialize trace_array_printk() with the correct function Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 39/59] phy: Fix error handling in tegra_xusb_port_init Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 40/59] phy: renesas: rcar-gen3-usb2: Set timing registers only once Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 41/59] wifi: mt76: disable napi on driver removal Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 42/59] dmaengine: ti: k3-udma: Add missing locking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 43/59] dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 44/59] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 45/59] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 46/59] block: fix direct io NOWAIT flag not work Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 47/59] clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 48/59] usb: typec: ucsi: displayport: Fix deadlock Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 49/59] usb: typec: altmodes/displayport: create sysfs nodes as drivers default device attribute group Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 50/59] usb: typec: fix potential array underflow in ucsi_ccg_sync_control() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 51/59] usb: typec: fix pm usage counter imbalance " Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 52/59] selftests/mm: compaction_test: support platform with huge mount of memory Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 53/59] sctp: add mutual exclusion in proc_sctp_do_udp_port() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 54/59] btrfs: dont BUG_ON() when 0 reference count at btrfs_lookup_extent_info() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 55/59] btrfs: do not clean up repair bio if submit fails Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 56/59] netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 5.15 57/59] netfilter: nf_tables: wait for rcu grace period on net_device removal Greg Kroah-Hartman
2025-05-20 13:50 ` Greg Kroah-Hartman [this message]
2025-05-20 13:50 ` [PATCH 5.15 59/59] ice: arfs: fix use-after-free when freeing @rx_cpu_rmap Greg Kroah-Hartman
2025-05-20 18:19 ` [PATCH 5.15 00/59] 5.15.184-rc1 review Florian Fainelli
2025-05-20 22:46 ` Shuah Khan
2025-05-21 1:53 ` Ron Economos
2025-05-21 3:16 ` Vijayendra Suman
2025-05-21 8:30 ` Jon Hunter
2025-05-21 12:39 ` Naresh Kamboju
2025-05-21 18:54 ` Mark Brown
2025-05-21 19:10 ` Alexandre Chartre
2025-05-21 21:25 ` Pawan Gupta
2025-05-22 5:09 ` Hardik Garg
2025-05-23 9:25 ` Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250520125756.144291818@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).