All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Jozsef Kadlecsik <kadlec@netfilter.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 08/55] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
Date: Mon, 11 Dec 2023 19:21:18 +0100	[thread overview]
Message-ID: <20231211182012.528035596@linuxfoundation.org> (raw)
In-Reply-To: <20231211182012.263036284@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jozsef Kadlecsik <kadlec@netfilter.org>

[ Upstream commit 28628fa952fefc7f2072ce6e8016968cc452b1ba ]

Linkui Xiao reported that there's a race condition when ipset swap and destroy is
called, which can lead to crash in add/del/test element operations. Swap then
destroy are usual operations to replace a set with another one in a production
system. The issue can in some cases be reproduced with the script:

ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
ipset add hash_ip1 172.20.0.0/16
ipset add hash_ip1 192.168.0.0/16
iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
while [ 1 ]
do
	# ... Ongoing traffic...
        ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
        ipset add hash_ip2 172.20.0.0/16
        ipset swap hash_ip1 hash_ip2
        ipset destroy hash_ip2
        sleep 0.05
done

In the race case the possible order of the operations are

	CPU0			CPU1
	ip_set_test
				ipset swap hash_ip1 hash_ip2
				ipset destroy hash_ip2
	hash_net_kadt

Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
removed it, hash_net_kadt crashes.

The fix is to force ip_set_swap() to wait for all readers to finish accessing the
old set pointers by calling synchronize_rcu().

The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.

v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
    ip_set_destroy() unnecessarily when all sets are destroyed.
v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held
    and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.
    So there's no need to extend the rcu read locked area in ipset itself.

Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
Reported by: Linkui Xiao <xiaolinkui@kylinos.cn>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/ipset/ip_set_core.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 31756d1bf83e7..031bb83aed70a 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -64,6 +64,8 @@ MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET);
 	ip_set_dereference((inst)->ip_set_list)[id]
 #define ip_set_ref_netlink(inst,id)	\
 	rcu_dereference_raw((inst)->ip_set_list)[id]
+#define ip_set_dereference_nfnl(p)	\
+	rcu_dereference_check(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
 
 /* The set types are implemented in modules and registered set types
  * can be found in ip_set_type_list. Adding/deleting types is
@@ -552,15 +554,10 @@ __ip_set_put_netlink(struct ip_set *set)
 static inline struct ip_set *
 ip_set_rcu_get(struct net *net, ip_set_id_t index)
 {
-	struct ip_set *set;
 	struct ip_set_net *inst = ip_set_pernet(net);
 
-	rcu_read_lock();
-	/* ip_set_list itself needs to be protected */
-	set = rcu_dereference(inst->ip_set_list)[index];
-	rcu_read_unlock();
-
-	return set;
+	/* ip_set_list and the set pointer need to be protected */
+	return ip_set_dereference_nfnl(inst->ip_set_list)[index];
 }
 
 int
@@ -1227,6 +1224,9 @@ static int ip_set_swap(struct net *net, struct sock *ctnl, struct sk_buff *skb,
 	ip_set(inst, to_id) = from;
 	write_unlock_bh(&ip_set_ref_lock);
 
+	/* Make sure all readers of the old set pointers are completed. */
+	synchronize_rcu();
+
 	return 0;
 }
 
-- 
2.42.0




  parent reply	other threads:[~2023-12-11 18:24 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 18:21 [PATCH 4.19 00/55] 4.19.302-rc1 review Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 01/55] spi: imx: add a device specific prepare_message callback Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 02/55] spi: imx: move wml setting to later than setup_transfer Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 03/55] spi: imx: correct wml as the last sg length Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 04/55] spi: imx: mx51-ecspi: Move some initialisation to prepare_message hook Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 05/55] media: davinci: vpif_capture: fix potential double free Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 06/55] block: introduce multi-page bvec helpers Greg Kroah-Hartman
2023-12-12  5:47   ` Christoph Hellwig
2023-12-12  8:38     ` Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 07/55] hrtimers: Push pending hrtimers away from outgoing CPU earlier Greg Kroah-Hartman
2023-12-11 18:21 ` Greg Kroah-Hartman [this message]
2023-12-11 18:21 ` [PATCH 4.19 09/55] tg3: Move the [rt]x_dropped counters to tg3_napi Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 10/55] tg3: Increment tx_dropped in tg3_tso_bug() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 11/55] kconfig: fix memory leak from range properties Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 12/55] drm/amdgpu: correct chunk_ptr to a pointer to chunk Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 13/55] ipv6: fix potential NULL deref in fib6_add() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 14/55] hv_netvsc: rndis_filter needs to select NLS Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 15/55] net: arcnet: Fix RESET flag handling Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 16/55] net: arcnet: com20020 fix error handling Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 17/55] arcnet: restoring support for multiple Sohard Arcnet cards Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 18/55] ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 19/55] net: hns: fix fake link up on xge port Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 20/55] netfilter: xt_owner: Add supplementary groups option Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 21/55] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 22/55] tcp: do not accept ACK of bytes we never sent Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 23/55] RDMA/bnxt_re: Correct module description string Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 24/55] hwmon: (acpi_power_meter) Fix 4.29 MW bug Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 25/55] tracing: Fix a warning when allocating buffered events fails Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 26/55] scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 27/55] ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 28/55] ARM: dts: imx: make gpt node name generic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 29/55] ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 30/55] ALSA: pcm: fix out-of-bounds in snd_pcm_state_names Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 31/55] packet: Move reference count in packet_sock to atomic_long_t Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 32/55] nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 33/55] tracing: Always update snapshot buffer size Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 34/55] tracing: Fix incomplete locking when disabling buffered events Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 35/55] tracing: Fix a possible race " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 36/55] perf/core: Add a new read format to get a number of lost samples Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 37/55] perf: Fix perf_event_validate_size() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 38/55] gpiolib: sysfs: Fix error handling on failed export Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 39/55] usb: gadget: f_hid: fix report descriptor allocation Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 40/55] parport: Add support for Brainboxes IX/UC/PX parallel cards Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 41/55] usb: typec: class: fix typec_altmode_put_partner to put plugs Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 42/55] ARM: PL011: Fix DMA support Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 43/55] serial: sc16is7xx: address RX timeout interrupt errata Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 44/55] serial: 8250_omap: Add earlycon support for the AM654 UART controller Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 45/55] x86/CPU/AMD: Check vendor in the AMD microcode callback Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 46/55] KVM: s390/mm: Properly reset no-dat Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 47/55] nilfs2: fix missing error check for sb_set_blocksize call Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 48/55] netlink: dont call ->netlink_bind with table lock held Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 49/55] genetlink: add CAP_NET_ADMIN test for multicast bind Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 50/55] psample: Require CAP_NET_ADMIN when joining "packets" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 51/55] drop_monitor: Require CAP_SYS_ADMIN when joining "events" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 52/55] tools headers UAPI: Sync linux/perf_event.h with the kernel sources Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 53/55] IB/isert: Fix unaligned immediate-data handling Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 54/55] devcoredump : Serialize devcd_del work Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 55/55] devcoredump: Send uevent once devcd is ready Greg Kroah-Hartman
2023-12-11 21:04 ` [PATCH 4.19 00/55] 4.19.302-rc1 review Daniel Díaz
2023-12-12 10:39   ` Greg Kroah-Hartman
2023-12-12 16:14 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231211182012.528035596@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=kadlec@netfilter.org \
    --cc=pablo@netfilter.org \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.