From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 6.1 89/97] netfilter: nf_tables: wait for rcu grace period on net_device removal
Date: Tue, 20 May 2025 15:50:54 +0200 [thread overview]
Message-ID: <20250520125804.153728936@linuxfoundation.org> (raw)
In-Reply-To: <20250520125800.653047540@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso <pablo@netfilter.org>
commit c03d278fdf35e73dd0ec543b9b556876b9d9a8dc upstream.
8c873e219970 ("netfilter: core: free hooks with call_rcu") removed
synchronize_net() call when unregistering basechain hook, however,
net_device removal event handler for the NFPROTO_NETDEV was not updated
to wait for RCU grace period.
Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks
on net_device removal") does not remove basechain rules on device
removal, I was hinted to remove rules on net_device removal later, see
5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on
netdevice removal").
Although NETDEV_UNREGISTER event is guaranteed to be handled after
synchronize_net() call, this path needs to wait for rcu grace period via
rcu callback to release basechain hooks if netns is alive because an
ongoing netlink dump could be in progress (sockets hold a reference on
the netns).
Note that nf_tables_pre_exit_net() unregisters and releases basechain
hooks but it is possible to see NETDEV_UNREGISTER at a later stage in
the netns exit path, eg. veth peer device in another netns:
cleanup_net()
default_device_exit_batch()
unregister_netdevice_many_notify()
notifier_call_chain()
nf_tables_netdev_event()
__nft_release_basechain()
In this particular case, same rule of thumb applies: if netns is alive,
then wait for rcu grace period because netlink dump in the other netns
could be in progress. Otherwise, if the other netns is going away then
no netlink dump can be in progress and basechain hooks can be released
inmediately.
While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
validation, which should not ever happen.
Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/net/netfilter/nf_tables.h | 4 +++
net/netfilter/nf_tables_api.c | 41 +++++++++++++++++++++++++++++++-------
2 files changed, 38 insertions(+), 7 deletions(-)
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1045,6 +1045,7 @@ struct nft_rule_blob {
* @use: number of jump references to this chain
* @flags: bitmask of enum nft_chain_flags
* @name: name of the chain
+ * @rcu_head: rcu head for deferred release
*/
struct nft_chain {
struct nft_rule_blob __rcu *blob_gen_0;
@@ -1061,6 +1062,7 @@ struct nft_chain {
char *name;
u16 udlen;
u8 *udata;
+ struct rcu_head rcu_head;
/* Only used during control plane commit phase: */
struct nft_rule_blob *blob_next;
@@ -1203,6 +1205,7 @@ static inline void nft_use_inc_restore(u
* @sets: sets in the table
* @objects: stateful objects in the table
* @flowtables: flow tables in the table
+ * @net: netnamespace this table belongs to
* @hgenerator: handle generator state
* @handle: table handle
* @use: number of chain references to this table
@@ -1218,6 +1221,7 @@ struct nft_table {
struct list_head sets;
struct list_head objects;
struct list_head flowtables;
+ possible_net_t net;
u64 hgenerator;
u64 handle;
u32 use;
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1413,6 +1413,7 @@ static int nf_tables_newtable(struct sk_
INIT_LIST_HEAD(&table->sets);
INIT_LIST_HEAD(&table->objects);
INIT_LIST_HEAD(&table->flowtables);
+ write_pnet(&table->net, net);
table->family = family;
table->flags = flags;
table->handle = ++nft_net->table_handle;
@@ -10662,22 +10663,48 @@ int nft_data_dump(struct sk_buff *skb, i
}
EXPORT_SYMBOL_GPL(nft_data_dump);
-int __nft_release_basechain(struct nft_ctx *ctx)
+static void __nft_release_basechain_now(struct nft_ctx *ctx)
{
struct nft_rule *rule, *nr;
- if (WARN_ON(!nft_is_base_chain(ctx->chain)))
- return 0;
-
- nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain);
list_for_each_entry_safe(rule, nr, &ctx->chain->rules, list) {
list_del(&rule->list);
- nft_use_dec(&ctx->chain->use);
nf_tables_rule_release(ctx, rule);
}
+ nf_tables_chain_destroy(ctx->chain);
+}
+
+static void nft_release_basechain_rcu(struct rcu_head *head)
+{
+ struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head);
+ struct nft_ctx ctx = {
+ .family = chain->table->family,
+ .chain = chain,
+ .net = read_pnet(&chain->table->net),
+ };
+
+ __nft_release_basechain_now(&ctx);
+ put_net(ctx.net);
+}
+
+int __nft_release_basechain(struct nft_ctx *ctx)
+{
+ struct nft_rule *rule;
+
+ if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain)))
+ return 0;
+
+ nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain);
+ list_for_each_entry(rule, &ctx->chain->rules, list)
+ nft_use_dec(&ctx->chain->use);
+
nft_chain_del(ctx->chain);
nft_use_dec(&ctx->table->use);
- nf_tables_chain_destroy(ctx->chain);
+
+ if (maybe_get_net(ctx->net))
+ call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu);
+ else
+ __nft_release_basechain_now(ctx);
return 0;
}
next prev parent reply other threads:[~2025-05-20 13:59 UTC|newest]
Thread overview: 107+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 13:49 [PATCH 6.1 00/97] 6.1.140-rc1 review Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 01/97] binfmt: Fix whitespace issues Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 02/97] binfmt_elf: Support segments with 0 filesz and misaligned starts Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 03/97] binfmt_elf: elf_bss no longer used by load_elf_binary() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 04/97] selftests/exec: load_address: conform test to TAP format output Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 05/97] binfmt_elf: Leave a gap between .bss and brk Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 06/97] selftests/exec: Build both static and non-static load_address tests Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 07/97] binfmt_elf: Calculate total_size earlier Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 08/97] binfmt_elf: Honor PT_LOAD alignment for static PIE Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 09/97] binfmt_elf: Move brk for static PIE even if ASLR disabled Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 10/97] platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 11/97] tracing: probes: Fix a possible race in trace_probe_log APIs Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 12/97] tpm: tis: Double the timeout B to 4s Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 13/97] iio: adc: ad7266: Fix potential timestamp alignment issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 14/97] drm/amd: Stop evicting resources on APUs in suspend Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 15/97] drm/amdgpu: Fix the runtime resume failure issue Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 16/97] drm/amdgpu: trigger flr_work if reading pf2vf data failed Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 17/97] drm/amd: Add Suspend/Hibernate notification callback support Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 18/97] Revert "drm/amd: Stop evicting resources on APUs in suspend" Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 19/97] iio: adc: ad7768-1: Fix insufficient alignment of timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 20/97] iio: chemical: sps30: use aligned_s64 for timestamp Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 21/97] clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 22/97] RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 23/97] HID: thrustmaster: fix memory leak in thrustmaster_interrupts() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 24/97] HID: uclogic: Add NULL check in uclogic_input_configured() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 25/97] nfs: handle failure of nfs_get_lock_context in unlock path Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 26/97] spi: loopback-test: Do not split 1024-byte hexdumps Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 27/97] net_sched: Flush gso_skb list too during ->change() Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 28/97] net: mctp: Ensure keys maintain only one ref to corresponding dev Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 29/97] net: cadence: macb: Fix a possible deadlock in macb_halt_tx Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 30/97] net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 31/97] nvme-pci: make nvme_pci_npages_prp() __always_inline Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 32/97] nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 33/97] ALSA: sh: SND_AICA should depend on SH_DMA_API Greg Kroah-Hartman
2025-05-20 13:49 ` [PATCH 6.1 34/97] net/mlx5e: Disable MACsec offload for uplink representor profile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 35/97] qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 36/97] regulator: max20086: fix invalid memory access Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 37/97] octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 38/97] net/tls: fix kernel panic when alloc_page failed Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 39/97] NFSv4/pnfs: Reset the layout state after a layoutreturn Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 40/97] dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 41/97] LoongArch: Fix MAX_REG_OFFSET calculation Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 42/97] btrfs: fix discard worker infinite loop after disabling discard Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 43/97] drm/amd/display: Correct the reply value when AUX write incomplete Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 44/97] drm/amd/display: Avoid flooding unnecessary info messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 45/97] ACPI: PPTT: Fix processor subtable walk Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 46/97] ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 47/97] ALSA: usb-audio: Add sample rate quirk for Audioengine D1 Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 48/97] ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 49/97] dma-buf: insert memory barrier before updating num_fences Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 50/97] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 51/97] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 52/97] hv_netvsc: Remove rmsg_pgcnt Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 53/97] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 54/97] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 55/97] ftrace: Fix preemption accounting for stacktrace trigger command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 56/97] ftrace: Fix preemption accounting for stacktrace filter command Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 57/97] tracing: samples: Initialize trace_array_printk() with the correct function Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 58/97] phy: Fix error handling in tegra_xusb_port_init Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 59/97] phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 60/97] phy: renesas: rcar-gen3-usb2: Set timing registers only once Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 61/97] scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 62/97] smb: client: fix memory leak during error handling for POSIX mkdir Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 63/97] wifi: mt76: disable napi on driver removal Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 64/97] net: qede: Initialize qede_ll_ops with designated initializer Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 65/97] dmaengine: ti: k3-udma: Add missing locking Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 66/97] dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 67/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 68/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 69/97] dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 70/97] dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 71/97] dmaengine: idxd: Add missing cleanups in cleanup internals Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 72/97] dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 73/97] dmaengine: idxd: fix memory leak in error handling path of idxd_alloc Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 74/97] dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 75/97] usb: typec: ucsi: displayport: Fix deadlock Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 76/97] usb: typec: altmodes/displayport: create sysfs nodes as drivers default device attribute group Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 77/97] usb: typec: fix potential array underflow in ucsi_ccg_sync_control() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 78/97] usb: typec: fix pm usage counter imbalance " Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 79/97] selftests/mm: compaction_test: support platform with huge mount of memory Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 80/97] mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 81/97] riscv: mm: Fix the out of bound issue of vmemmap address Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 82/97] bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 83/97] bpf, arm64: Fix address emission with tag-based KASAN enabled Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 84/97] LoongArch: Explicitly specify code model in Makefile Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 85/97] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 86/97] sctp: add mutual exclusion in proc_sctp_do_udp_port() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 87/97] btrfs: dont BUG_ON() when 0 reference count at btrfs_lookup_extent_info() Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 88/97] netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx Greg Kroah-Hartman
2025-05-20 13:50 ` Greg Kroah-Hartman [this message]
2025-05-20 13:50 ` [PATCH 6.1 90/97] netfilter: nf_tables: do not defer rule destruction via call_rcu Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 91/97] arm64/sme: Always exit sme_alloc() early with existing storage Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 92/97] platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 93/97] bnxt_en: Fix receive ring space parameters when XDP is active Greg Kroah-Hartman
2025-05-20 13:50 ` [PATCH 6.1 94/97] ipv6: Fix potential uninit-value access in __ip6_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 95/97] ipv4: Fix uninit-value access in __ip_make_skb() Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 96/97] spi: cadence-qspi: fix pointer reference in runtime PM hooks Greg Kroah-Hartman
2025-05-20 13:51 ` [PATCH 6.1 97/97] drm/amdgpu: fix pm notifier handling Greg Kroah-Hartman
2025-05-20 18:28 ` [PATCH 6.1 00/97] 6.1.140-rc1 review Florian Fainelli
2025-05-20 21:19 ` Shuah Khan
2025-05-21 1:46 ` Ron Economos
2025-05-21 8:31 ` Jon Hunter
2025-05-21 11:01 ` Naresh Kamboju
2025-05-21 11:46 ` Mark Brown
2025-05-21 14:23 ` Peter Schneider
2025-05-21 15:38 ` Pavel Machek
2025-05-22 5:07 ` Hardik Garg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250520125804.153728936@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox