public inbox for netfilter-devel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH nf] netfilter: nf_tables: clone set on flush only
@ 2026-02-25  0:13 Pablo Neira Ayuso
  2026-02-25  0:37 ` Florian Westphal
  0 siblings, 1 reply; 2+ messages in thread
From: Pablo Neira Ayuso @ 2026-02-25  0:13 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw

Restrict set clone to the flush set command in the preparation phase.
Add NFT_ITER_UPDATE_CLONE and use it for this purpose, update the rbtree
and pipapo backends to only clone the set when this iteration type is
used.

As for the existing NFT_ITER_UPDATE type, update the pipapo backend to
use the existing set clone if available, otherwise use the existing set
representation. After this update, there is no need to clone a set that
is being deleted, this includes bound anonymous set.

An alternative approach to NFT_ITER_UPDATE_CLONE is to add a .clone
interface and call it from the flush set path.

Reported-by: syzbot+4924a0edc148e8b4b342@syzkaller.appspotmail.com
Fixes: 3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h |  2 ++
 net/netfilter/nf_tables_api.c     |  2 +-
 net/netfilter/nft_set_hash.c      |  1 +
 net/netfilter/nft_set_pipapo.c    | 11 +++++++++--
 net/netfilter/nft_set_rbtree.c    |  8 +++++---
 5 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 426534a711b0..ea6f29ad7888 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -320,11 +320,13 @@ static inline void *nft_elem_priv_cast(const struct nft_elem_priv *priv)
  * @NFT_ITER_UNSPEC: unspecified, to catch errors
  * @NFT_ITER_READ: read-only iteration over set elements
  * @NFT_ITER_UPDATE: iteration under mutex to update set element state
+ * @NFT_ITER_UPDATE_CLONE: clone set before iteration under mutex to update element
  */
 enum nft_iter_type {
 	NFT_ITER_UNSPEC,
 	NFT_ITER_READ,
 	NFT_ITER_UPDATE,
+	NFT_ITER_UPDATE_CLONE,
 };
 
 struct nft_set;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 834736237b09..618f79700f75 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7906,7 +7906,7 @@ static int nft_set_flush(struct nft_ctx *ctx, struct nft_set *set, u8 genmask)
 {
 	struct nft_set_iter iter = {
 		.genmask	= genmask,
-		.type		= NFT_ITER_UPDATE,
+		.type		= NFT_ITER_UPDATE_CLONE,
 		.fn		= nft_setelem_flush,
 	};
 
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 739b992bde59..b0e571c8e3f3 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -374,6 +374,7 @@ static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
 {
 	switch (iter->type) {
 	case NFT_ITER_UPDATE:
+	case NFT_ITER_UPDATE_CLONE:
 		/* only relevant for netlink dumps which use READ type */
 		WARN_ON_ONCE(iter->skip != 0);
 
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 18e1903b1d3d..cd0d2d4ae36b 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -2145,13 +2145,20 @@ static void nft_pipapo_walk(const struct nft_ctx *ctx, struct nft_set *set,
 	const struct nft_pipapo_match *m;
 
 	switch (iter->type) {
-	case NFT_ITER_UPDATE:
+	case NFT_ITER_UPDATE_CLONE:
 		m = pipapo_maybe_clone(set);
 		if (!m) {
 			iter->err = -ENOMEM;
 			return;
 		}
-
+		nft_pipapo_do_walk(ctx, set, m, iter);
+		break;
+	case NFT_ITER_UPDATE:
+		if (priv->clone)
+			m = priv->clone;
+		else
+			m = rcu_dereference_protected(priv->match,
+						      nft_pipapo_transaction_mutex_held(set));
 		nft_pipapo_do_walk(ctx, set, m, iter);
 		break;
 	case NFT_ITER_READ:
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 644d4b916705..853ff30a208c 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -861,13 +861,15 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
 	struct nft_rbtree *priv = nft_set_priv(set);
 
 	switch (iter->type) {
-	case NFT_ITER_UPDATE:
-		lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
-
+	case NFT_ITER_UPDATE_CLONE:
 		if (nft_array_may_resize(set) < 0) {
 			iter->err = -ENOMEM;
 			break;
 		}
+		fallthrough;
+	case NFT_ITER_UPDATE:
+		lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
+
 		nft_rbtree_do_walk(ctx, set, iter);
 		break;
 	case NFT_ITER_READ:
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH nf] netfilter: nf_tables: clone set on flush only
  2026-02-25  0:13 [PATCH nf] netfilter: nf_tables: clone set on flush only Pablo Neira Ayuso
@ 2026-02-25  0:37 ` Florian Westphal
  0 siblings, 0 replies; 2+ messages in thread
From: Florian Westphal @ 2026-02-25  0:37 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Restrict set clone to the flush set command in the preparation phase.
> Add NFT_ITER_UPDATE_CLONE and use it for this purpose, update the rbtree
> and pipapo backends to only clone the set when this iteration type is
> used.
> 
> As for the existing NFT_ITER_UPDATE type, update the pipapo backend to
> use the existing set clone if available, otherwise use the existing set
> representation. After this update, there is no need to clone a set that
> is being deleted, this includes bound anonymous set.

Thanks Pablo, this looks good to me. Two small nits/suggestions:

1. Add to commit message that this affects fault injection resp.
   that this requires failing GFP_KERNEL allocation to trigger
   the WARN splat.

>  struct nft_set;
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 834736237b09..618f79700f75 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -7906,7 +7906,7 @@ static int nft_set_flush(struct nft_ctx *ctx, struct nft_set *set, u8 genmask)
>  {
>  	struct nft_set_iter iter = {
>  		.genmask	= genmask,
> -		.type		= NFT_ITER_UPDATE,
> +		.type		= NFT_ITER_UPDATE_CLONE,
>  		.fn		= nft_setelem_flush,
>  	};

2. I think it would help to add a comment to the existing
NFT_ITER_UPDATE users as to why they use UPDATE and not CLONE.

Or, alternatively, add comment here why this one needs the _CLONE
variant.

But this looks correct to me, nft_set_flush() makes clone mandatory
while others either can re-use the live copy OR are guaranteed to have
a clone, e.g. because the function is always called after a
delete/insert operation that already did the clone.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-02-25  0:37 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-25  0:13 [PATCH nf] netfilter: nf_tables: clone set on flush only Pablo Neira Ayuso
2026-02-25  0:37 ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox