All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: netfilter-devel@vger.kernel.org,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Linkui Xiao <xiaolinkui@kylinos.cn>
Subject: Re: [PATCH 1/1] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test, v2
Date: Sat, 4 Nov 2023 11:35:51 +0100	[thread overview]
Message-ID: <20231104103551.GA9892@breakpoint.cc> (raw)
In-Reply-To: <20231104100349.4184215-2-kadlec@netfilter.org>

Jozsef Kadlecsik <kadlec@netfilter.org> wrote:
> Linkui Xiao reported that there's a race condition when ipset swap and destroy is
> called, which can lead to crash in add/del/test element operations. Swap then
> destroy are usual operations to replace a set with another one in a production
> system. The issue can in some cases be reproduced with the script:

Hi Jozsef,

> iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
> while [ 1 ]
> do
> 	# ... Ongoing traffic...
>         ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
>         ipset add hash_ip2 172.20.0.0/16
>         ipset swap hash_ip1 hash_ip2
>         ipset destroy hash_ip2
>         sleep 0.05
> done
> 
> In the race case the possible order of the operations are
> 
> 	CPU0			CPU1
> 	ip_set_test
> 				ipset swap hash_ip1 hash_ip2
> 				ipset destroy hash_ip2
> 	hash_net_kadt
> 
> Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
> is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
> removed it, hash_net_kadt crashes.
> 
> The fix is to protect both the list of the sets and the set pointers in an extended RCU
> region and before exiting ip_set_swap(), wait to finish all started rcu_read_lock().
> The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.
> 
> v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
> ip_set_destroy() unnecessarily when all sets are destroyed.

Hmm.  Isn't it enough to only call synchronize_rcu() in ip_set_swap?

All netfilter hooks run with rcu_read_lock() held,
em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.

> @@ -704,13 +704,18 @@ ip_set_rcu_get(struct net *net, ip_set_id_t index)
>  	struct ip_set_net *inst = ip_set_pernet(net);
>  
>  	rcu_read_lock();
> -	/* ip_set_list itself needs to be protected */
> +	/* ip_set_list and the set pointer need to be protected */
>  	set = rcu_dereference(inst->ip_set_list)[index];
> -	rcu_read_unlock();
>  
>  	return set;
>  }

... so I don't understand why ip_set_rcu_get() has to extend
the locked section.

AFAICS there are only two type of callers:
1. rcu read lock is already held (datapath)
2. ipset nfnl subsys mutex is held

*probably* This could be changed in a separate patch to:

  -	rcu_read_lock();
  -	/* ip_set_list itself needs to be protected */
  -	set = rcu_dereference(inst->ip_set_list)[index];
  -	rcu_read_unlock();
  +	/* ip_set_list and the set pointer need to be protected */
  +	return rcu_dereference_check(inst->ip_set_list, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET)[index];

This is an example, probably better to add a small
"ip_set_dereference_nfnl" helper to hide the lockdep construct...

Not saying the patch is wrong; rcu read locks nest and
ipset locking is not simple so I might be missing something.

  reply	other threads:[~2023-11-04 10:36 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-04 10:03 [PATCH 0/1] ipset patch to fix race condition between swap/destroy and add/del/test, v2 Jozsef Kadlecsik
2023-11-04 10:03 ` [PATCH 1/1] netfilter: ipset: fix race condition between swap/destroy and kernel side " Jozsef Kadlecsik
2023-11-04 10:35   ` Florian Westphal [this message]
2023-11-08 10:46     ` Jozsef Kadlecsik
  -- strict thread matches above, loose matches on Subject: below --
2023-11-13 20:09 [PATCH 0/1] ipset patch to fix race condition between swap/destroy and add/del/test, v3 Jozsef Kadlecsik
2023-11-13 20:09 ` [PATCH 1/1] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test, v2 Jozsef Kadlecsik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231104103551.GA9892@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=kadlec@netfilter.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=xiaolinkui@kylinos.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.