From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83E43C61D99 for ; Wed, 22 Nov 2023 15:37:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344518AbjKVPiA (ORCPT ); Wed, 22 Nov 2023 10:38:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59220 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344527AbjKVPhe (ORCPT ); Wed, 22 Nov 2023 10:37:34 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C57361734; Wed, 22 Nov 2023 07:35:24 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB6CDC433D9; Wed, 22 Nov 2023 15:35:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1700667324; bh=Ix+7p0GeTshUCvz1IW8R8XazXW6e+BBqnWSJ2H2SN/0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=P3KPhLEfD8qWSB0qrPZOfaPGK58ahd4vaCnrAmKq7g+Hq24L5pRVgLtfOijgTo+15 Nk5CNheiIxpgwJ+QrmB+zYSNOby1NRAq9nbdBPb/YvWs/uIADklbFgsLiVXGvX3COY ii27dP1tYszdECKFWQp5ZIuRWzkPPAU6WxK+/6ItuUZH9mbhwIRbW4Pn8dXD5VLxJZ YNrEG2eIQnrua0qm9aX8tLoc4FrU+1d/RXOthYihkzKn4lwytDDlC4rJgEanwPxmP0 z4Tb/91+Sq+9mjNNaQQ62eQnFIoHISk4ZKT2V7OTllpSKCqzH9Yz/wu5aIOIbfbbTp 1paX7RFPTltow== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Jozsef Kadlecsik , Pablo Neira Ayuso , Sasha Levin , fw@strlen.de, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, justinstitt@google.com, kuniyu@amazon.com, netfilter-devel@vger.kernel.org, coreteam@netfilter.org, netdev@vger.kernel.org Subject: [PATCH AUTOSEL 5.15 3/7] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test Date: Wed, 22 Nov 2023 10:35:03 -0500 Message-ID: <20231122153512.853015-3-sashal@kernel.org> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231122153512.853015-1-sashal@kernel.org> References: <20231122153512.853015-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 5.15.139 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: netfilter-devel@vger.kernel.org From: Jozsef Kadlecsik [ Upstream commit 28628fa952fefc7f2072ce6e8016968cc452b1ba ] Linkui Xiao reported that there's a race condition when ipset swap and destroy is called, which can lead to crash in add/del/test element operations. Swap then destroy are usual operations to replace a set with another one in a production system. The issue can in some cases be reproduced with the script: ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip1 172.20.0.0/16 ipset add hash_ip1 192.168.0.0/16 iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT while [ 1 ] do # ... Ongoing traffic... ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576 ipset add hash_ip2 172.20.0.0/16 ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 sleep 0.05 done In the race case the possible order of the operations are CPU0 CPU1 ip_set_test ipset swap hash_ip1 hash_ip2 ipset destroy hash_ip2 hash_net_kadt Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy removed it, hash_net_kadt crashes. The fix is to force ip_set_swap() to wait for all readers to finish accessing the old set pointers by calling synchronize_rcu(). The first version of the patch was written by Linkui Xiao . v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden ip_set_destroy() unnecessarily when all sets are destroyed. v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair. So there's no need to extend the rcu read locked area in ipset itself. Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/ Reported by: Linkui Xiao Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/ipset/ip_set_core.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 33869db42bb6b..978014928d07a 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -61,6 +61,8 @@ MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_IPSET); ip_set_dereference((inst)->ip_set_list)[id] #define ip_set_ref_netlink(inst,id) \ rcu_dereference_raw((inst)->ip_set_list)[id] +#define ip_set_dereference_nfnl(p) \ + rcu_dereference_check(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET)) /* The set types are implemented in modules and registered set types * can be found in ip_set_type_list. Adding/deleting types is @@ -708,15 +710,10 @@ __ip_set_put_netlink(struct ip_set *set) static struct ip_set * ip_set_rcu_get(struct net *net, ip_set_id_t index) { - struct ip_set *set; struct ip_set_net *inst = ip_set_pernet(net); - rcu_read_lock(); - /* ip_set_list itself needs to be protected */ - set = rcu_dereference(inst->ip_set_list)[index]; - rcu_read_unlock(); - - return set; + /* ip_set_list and the set pointer need to be protected */ + return ip_set_dereference_nfnl(inst->ip_set_list)[index]; } static inline void @@ -1399,6 +1396,9 @@ static int ip_set_swap(struct sk_buff *skb, const struct nfnl_info *info, ip_set(inst, to_id) = from; write_unlock_bh(&ip_set_ref_lock); + /* Make sure all readers of the old set pointers are completed. */ + synchronize_rcu(); + return 0; } -- 2.42.0