From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com,
Florian Westphal <fw@strlen.de>,
Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 6.1 90/97] netfilter: nf_tables: do not defer rule destruction via call_rcu
Date: Tue, 20 May 2025 15:50:55 +0200 [thread overview]
Message-ID: <20250520125804.192426591@linuxfoundation.org> (raw)
In-Reply-To: <20250520125800.653047540@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal <fw@strlen.de>
commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.
nf_tables_chain_destroy can sleep, it can't be used from call_rcu
callbacks.
Moreover, nf_tables_rule_release() is only safe for error unwinding,
while transaction mutex is held and the to-be-desroyed rule was not
exposed to either dataplane or dumps, as it deactives+frees without
the required synchronize_rcu() in-between.
nft_rule_expr_deactivate() callbacks will change ->use counters
of other chains/sets, see e.g. nft_lookup .deactivate callback, these
must be serialized via transaction mutex.
Also add a few lockdep asserts to make this more explicit.
Calling synchronize_rcu() isn't ideal, but fixing this without is hard
and way more intrusive. As-is, we can get:
WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..
Workqueue: events nf_tables_trans_destroy_work
RIP: 0010:nft_set_destroy+0x3fe/0x5c0
Call Trace:
<TASK>
nf_tables_trans_destroy_work+0x6b7/0xad0
process_one_work+0x64a/0xce0
worker_thread+0x613/0x10d0
In case the synchronize_rcu becomes an issue, we can explore alternatives.
One way would be to allocate nft_trans_rule objects + one nft_trans_chain
object, deactivate the rules + the chain and then defer the freeing to the
nft destroy workqueue. We'd still need to keep the synchronize_rcu path as
a fallback to handle -ENOMEM corner cases though.
Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/netfilter/nf_tables.h | 3 ---
net/netfilter/nf_tables_api.c | 32 +++++++++++++++-----------------
2 files changed, 15 insertions(+), 20 deletions(-)
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1062,7 +1062,6 @@ struct nft_chain {
char *name;
u16 udlen;
u8 *udata;
- struct rcu_head rcu_head;
/* Only used during control plane commit phase: */
struct nft_rule_blob *blob_next;
@@ -1205,7 +1204,6 @@ static inline void nft_use_inc_restore(u
* @sets: sets in the table
* @objects: stateful objects in the table
* @flowtables: flow tables in the table
- * @net: netnamespace this table belongs to
* @hgenerator: handle generator state
* @handle: table handle
* @use: number of chain references to this table
@@ -1221,7 +1219,6 @@ struct nft_table {
struct list_head sets;
struct list_head objects;
struct list_head flowtables;
- possible_net_t net;
u64 hgenerator;
u64 handle;
u32 use;
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1413,7 +1413,6 @@ static int nf_tables_newtable(struct sk_
INIT_LIST_HEAD(&table->sets);
INIT_LIST_HEAD(&table->objects);
INIT_LIST_HEAD(&table->flowtables);
- write_pnet(&table->net, net);
table->family = family;
table->flags = flags;
table->handle = ++nft_net->table_handle;
@@ -3511,8 +3510,11 @@ void nf_tables_rule_destroy(const struct
kfree(rule);
}
+/* can only be used if rule is no longer visible to dumps */
static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule)
{
+ lockdep_commit_lock_is_held(ctx->net);
+
nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE);
nf_tables_rule_destroy(ctx, rule);
}
@@ -5248,6 +5250,8 @@ void nf_tables_deactivate_set(const stru
struct nft_set_binding *binding,
enum nft_trans_phase phase)
{
+ lockdep_commit_lock_is_held(ctx->net);
+
switch (phase) {
case NFT_TRANS_PREPARE_ERROR:
nft_set_trans_unbind(ctx, set);
@@ -10674,19 +10678,6 @@ static void __nft_release_basechain_now(
nf_tables_chain_destroy(ctx->chain);
}
-static void nft_release_basechain_rcu(struct rcu_head *head)
-{
- struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
- struct nft_ctx ctx = {
- .family = chain->table->family,
- .chain = chain,
- .net = read_pnet(&chain->table->net),
- };
-
- __nft_release_basechain_now(&ctx);
- put_net(ctx.net);
-}
-
int __nft_release_basechain(struct nft_ctx *ctx)
{
struct nft_rule *rule;
@@ -10701,11 +10692,18 @@ int __nft_release_basechain(struct nft_c
nft_chain_del(ctx->chain);
nft_use_dec(&ctx->table->use);
- if (maybe_get_net(ctx->net))
- call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
- else
+ if (!maybe_get_net(ctx->net)) {
__nft_release_basechain_now(ctx);
+ return 0;
+ }
+
+ /* wait for ruleset dumps to complete. Owning chain is no longer in
+ * lists, so new dumps can't find any of these rules anymore.
+ */
+ synchronize_rcu();
+ __nft_release_basechain_now(ctx);
+ put_net(ctx->net);
return 0;
}
EXPORT_SYMBOL_GPL(__nft_release_basechain);
next prev parent reply other threads:[~2025-05-20 13:59 UTC|newest]
Thread overview: 107+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 13:49 [PATCH 6.1 00/97] 6.1.140-rc1 review Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 01/97] binfmt: Fix whitespace issues Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 02/97] binfmt_elf: Support segments with 0 filesz and misaligned starts Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 03/97] binfmt_elf: elf_bss no longer used by load_elf_binary() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 04/97] selftests/exec: load_address: conform test to TAP format output Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 05/97] binfmt_elf: Leave a gap between .bss and brk Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 06/97] selftests/exec: Build both static and non-static load_address tests Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 07/97] binfmt_elf: Calculate total_size earlier Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 08/97] binfmt_elf: Honor PT_LOAD alignment for static PIE Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 09/97] binfmt_elf: Move brk for static PIE even if ASLR disabled Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 10/97] platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 11/97] tracing: probes: Fix a possible race in trace_probe_log APIs Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 12/97] tpm: tis: Double the timeout B to 4s Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 13/97] iio: adc: ad7266: Fix potential timestamp alignment issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 14/97] drm/amd: Stop evicting resources on APUs in suspend Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 15/97] drm/amdgpu: Fix the runtime resume failure issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 16/97] drm/amdgpu: trigger flr_work if reading pf2vf data failed Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 17/97] drm/amd: Add Suspend/Hibernate notification callback support Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 18/97] Revert "drm/amd: Stop evicting resources on APUs in suspend" Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 19/97] iio: adc: ad7768-1: Fix insufficient alignment of timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 20/97] iio: chemical: sps30: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 21/97] clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 22/97] RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 23/97] HID: thrustmaster: fix memory leak in thrustmaster_interrupts() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 24/97] HID: uclogic: Add NULL check in uclogic_input_configured() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 25/97] nfs: handle failure of nfs_get_lock_context in unlock path Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 26/97] spi: loopback-test: Do not split 1024-byte hexdumps Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 27/97] net_sched: Flush gso_skb list too during ->change() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 28/97] net: mctp: Ensure keys maintain only one ref to corresponding dev Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 29/97] net: cadence: macb: Fix a possible deadlock in macb_halt_tx Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 30/97] net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 31/97] nvme-pci: make nvme_pci_npages_prp() __always_inline Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 32/97] nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 33/97] ALSA: sh: SND_AICA should depend on SH_DMA_API Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 34/97] net/mlx5e: Disable MACsec offload for uplink representor profile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 35/97] qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 36/97] regulator: max20086: fix invalid memory access Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 37/97] octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 38/97] net/tls: fix kernel panic when alloc_page failed Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 39/97] NFSv4/pnfs: Reset the layout state after a layoutreturn Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 40/97] dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 41/97] LoongArch: Fix MAX_REG_OFFSET calculation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 42/97] btrfs: fix discard worker infinite loop after disabling discard Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 43/97] drm/amd/display: Correct the reply value when AUX write incomplete Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 44/97] drm/amd/display: Avoid flooding unnecessary info messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 45/97] ACPI: PPTT: Fix processor subtable walk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 46/97] ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 47/97] ALSA: usb-audio: Add sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 48/97] ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 49/97] dma-buf: insert memory barrier before updating num_fences Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 50/97] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 51/97] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 52/97] hv_netvsc: Remove rmsg_pgcnt Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 53/97] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 54/97] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 55/97] ftrace: Fix preemption accounting for stacktrace trigger command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 56/97] ftrace: Fix preemption accounting for stacktrace filter command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 57/97] tracing: samples: Initialize trace_array_printk() with the correct function Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 58/97] phy: Fix error handling in tegra_xusb_port_init Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 59/97] phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 60/97] phy: renesas: rcar-gen3-usb2: Set timing registers only once Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 61/97] scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 62/97] smb: client: fix memory leak during error handling for POSIX mkdir Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 63/97] wifi: mt76: disable napi on driver removal Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 64/97] net: qede: Initialize qede_ll_ops with designated initializer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 65/97] dmaengine: ti: k3-udma: Add missing locking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 66/97] dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 67/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 68/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 69/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 70/97] dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 71/97] dmaengine: idxd: Add missing cleanups in cleanup internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 72/97] dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 73/97] dmaengine: idxd: fix memory leak in error handling path of idxd_alloc Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 74/97] dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 75/97] usb: typec: ucsi: displayport: Fix deadlock Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 76/97] usb: typec: altmodes/displayport: create sysfs nodes as drivers default device attribute group Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 77/97] usb: typec: fix potential array underflow in ucsi_ccg_sync_control() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 78/97] usb: typec: fix pm usage counter imbalance " Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 79/97] selftests/mm: compaction_test: support platform with huge mount of memory Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 80/97] mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 81/97] riscv: mm: Fix the out of bound issue of vmemmap address Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 82/97] bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 83/97] bpf, arm64: Fix address emission with tag-based KASAN enabled Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 84/97] LoongArch: Explicitly specify code model in Makefile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 85/97] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 86/97] sctp: add mutual exclusion in proc_sctp_do_udp_port() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 87/97] btrfs: dont BUG_ON() when 0 reference count at btrfs_lookup_extent_info() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 88/97] netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 89/97] netfilter: nf_tables: wait for rcu grace period on net_device removal Greg Kroah-Hartman
2025-05-20 13:50 ` Greg Kroah-Hartman [this message]
2025-05-20 13:50 ` [PATCH 6.1 91/97] arm64/sme: Always exit sme_alloc() early with existing storage Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 92/97] platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 93/97] bnxt_en: Fix receive ring space parameters when XDP is active Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 94/97] ipv6: Fix potential uninit-value access in __ip6_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 95/97] ipv4: Fix uninit-value access in __ip_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 96/97] spi: cadence-qspi: fix pointer reference in runtime PM hooks Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 97/97] drm/amdgpu: fix pm notifier handling Greg Kroah-Hartman
2025-05-20 18:28 ` [PATCH 6.1 00/97] 6.1.140-rc1 review Florian Fainelli
2025-05-20 21:19 ` Shuah Khan
2025-05-21 1:46 ` Ron Economos
2025-05-21 8:31 ` Jon Hunter
2025-05-21 11:01 ` Naresh Kamboju
2025-05-21 11:46 ` Mark Brown
2025-05-21 14:23 ` Peter Schneider
2025-05-21 15:38 ` Pavel Machek
2025-05-22 5:07 ` Hardik Garg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250520125804.192426591@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.