From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Florian Westphal <fw@strlen.de>
Cc: Phil Sutter <phil@nwl.cc>,
	Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>,
	netdev@vger.kernel.org, Jozsef Kadlecsik <kadlec@netfilter.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	linux-kernel@vger.kernel.org
Subject: Re: Soft lock-ups caused by iptables
Date: Thu, 20 Nov 2025 22:01:54 +0100
Message-ID: <aR-BwjLjeEyq3Hfd@calendula>
In-Reply-To: <aR7grVC-kLg76kvE@strlen.de>

On Thu, Nov 20, 2025 at 10:34:46AM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > > Yes, but you also need to annotate the type of the last base chain origin,
> > > > else you might skip validation of 'chain foo' because its depth value says it's
> > > > fine but the new caller is coming from filter, not nat, and chain foo has a
> > > > masquerade expression.
> > 
> > You could also have chains being called from different levels.
> 
> But that's not an issue.  If you see a jump from c1 to c2, and c2
> has been validated for a level of 5, then you need to revalidate
> only if c1->depth >= 5.

OK, but you could also have jumps to the same chain from both filter and
nat basechains. Does the optimization below work in that case too?

Validation is twofold:

- Search for cycles.
- Ensure each expression can be called from the basechains that can reach it.
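
To make the second point concrete, here is a toy model in plain C. It is
not the nf_tables code: toy_chain, toy_validate, seen_types and the other
names are made up for illustration, and the "expression" check is reduced
to a bitmask of basechain types a chain requires. It sketches, under those
assumptions, how a depth-only cache can skip the type check and how an
additional per-chain type annotation would avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT the nf_tables code: every name here is hypothetical. */
enum toy_type { TOY_FILTER = 1 << 0, TOY_NAT = 1 << 1 };

#define TOY_STACK_SIZE	16
#define TOY_MAX_JUMPS	4

struct toy_chain {
	struct toy_chain *jumps[TOY_MAX_JUMPS];	/* NULL-terminated targets */
	unsigned int needs_type;	/* types an expression requires, 0 = any */
	unsigned int depth;		/* level of the most recent full walk */
	unsigned int seen_types;	/* basechain types already validated */
};

/*
 * Returns 0 if the chain is fine when reached at 'level' from a
 * basechain of 'type', -1 otherwise.  'use_type_cache' toggles the
 * extra annotation on top of the depth-only shortcut.
 */
static int toy_validate(struct toy_chain *c, unsigned int level,
			enum toy_type type, bool use_type_cache)
{
	size_t i;

	assert(c != NULL);

	if (level == TOY_STACK_SIZE)
		return -1;	/* jump stack exhausted, also catches cycles */

	/* Skip the walk only if depth and (optionally) type were covered. */
	if (level && c->depth >= level &&
	    (!use_type_cache || (c->seen_types & type)))
		return 0;

	if (c->needs_type && !(c->needs_type & type))
		return -1;	/* e.g. a nat-only expression seen from filter */

	for (i = 0; i < TOY_MAX_JUMPS && c->jumps[i]; i++)
		if (toy_validate(c->jumps[i], level + 1, type, use_type_cache))
			return -1;

	/* Remember what this walk covered. */
	c->depth = level;
	c->seen_types |= type;
	return 0;
}
```

With use_type_cache set to false, validating a nat basechain that jumps to
a nat-only chain and then a filter basechain that jumps to the same chain
returns 0 both times, because the second walk is skipped on depth alone.
With it set to true, the second call returns -1.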

> Do you see any issue with this? (it still lacks annotation for
> the calling basechain's type, so this cannot be applied as-is):
> 
> netfilter: nf_tables: avoid chain re-validation if possible
> 
> Consider:
> 
>       input -> j2 -> j3
>       input -> j2 -> j3
>       input -> j1 -> j2 -> j3
> 
> Then the second rule does not need to revalidate j2, and, by extension j3.
> 
> We need to validate it only for rule 3.
> 
> This is needed because chain loop detection also ensures we do not
> exceed the jump stack: Just because we know that j2 is cycle free, its
> last jump might now exceed the allowed stack.  We also need to update
> the new largest call depth for all the reachable nodes.
> 
> diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
> --- a/include/net/netfilter/nf_tables.h
> +++ b/include/net/netfilter/nf_tables.h
> @@ -1109,6 +1109,7 @@ struct nft_rule_blob {
>   *	@udlen: user data length
>   *	@udata: user data in the chain
>   *	@blob_next: rule blob pointer to the next in the chain
> + *	@depth: chain was validated for call level <= depth
>   */
>  struct nft_chain {
>  	struct nft_rule_blob		__rcu *blob_gen_0;
> @@ -1128,9 +1129,10 @@ struct nft_chain {
>  
>  	/* Only used during control plane commit phase: */
>  	struct nft_rule_blob		*blob_next;
> +	u8				depth;
>  };
>  
> -int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain);
> +int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain);
>  int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
>  			 const struct nft_set_iter *iter,
>  			 struct nft_elem_priv *elem_priv);
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -4088,15 +4088,26 @@ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *r
>   * and set lookups until either the jump limit is hit or all reachable
>   * chains have been validated.
>   */
> -int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
> +int nft_chain_validate(const struct nft_ctx *ctx, struct nft_chain *chain)
>  {
>  	struct nft_expr *expr, *last;
>  	struct nft_rule *rule;
>  	int err;
>  
> +	BUILD_BUG_ON(NFT_JUMP_STACK_SIZE > 255);
>  	if (ctx->level == NFT_JUMP_STACK_SIZE)
>  		return -EMLINK;
>  
> +	/* jumps to base chains are not allowed, this is already
> +	 * validated by nft_verdict_init().
> +	 *
> +	 * Chain must be re-validated if we are entering for first
> +	 * time or if the current jumpstack usage is higher than on
> +	 * previous check.
> +	 */
> +	if (ctx->level && chain->depth >= ctx->level)
> +		return 0;
> +
>  	list_for_each_entry(rule, &chain->rules, list) {
>  		if (fatal_signal_pending(current))
>  			return -EINTR;
> @@ -4117,6 +4128,10 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
>  		}
>  	}
>  
> +	/* Chain needs no re-validation if called again
> +	 * from a path that doesn't exceed level.
> +	 */
> +	chain->depth = ctx->level;
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(nft_chain_validate);
> @@ -4128,7 +4143,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
>  		.net	= net,
>  		.family	= table->family,
>  	};
> -	int err;
> +	int err = 0;
>  
>  	list_for_each_entry(chain, &table->chains, list) {
>  		if (!nft_is_base_chain(chain))
> @@ -4137,12 +4152,16 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
>  		ctx.chain = chain;
>  		err = nft_chain_validate(&ctx, chain);
>  		if (err < 0)
> -			return err;
> +			goto err;
>  
>  		cond_resched();
>  	}
>  
> -	return 0;
> +err:
> +	list_for_each_entry(chain, &table->chains, list)
> +		chain->depth = 0;
> +
> +	return err;
>  }
>  
>  int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
> 
