All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: Florian Westphal <fw@strlen.de>
Cc: <netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH nf-next] netfilter: nft_set_pipapo_avx2: restore performance optimization
Date: Sat, 11 Apr 2026 12:30:16 +0200 (CEST)	[thread overview]
Message-ID: <20260411123015.4a78f491@elisabeth> (raw)
In-Reply-To: <20260401110230.19226-1-fw@strlen.de>

Sorry for the late review.

On Wed,  1 Apr 2026 13:02:27 +0200
Florian Westphal <fw@strlen.de> wrote:

> The avx2 lookup routines get the next map index to process passes as a
> function argument, but this isn't bery obvious because its hidden in the

Nits: very, it's

> lookup macro.

...right, I'm not exactly proud of it, but if we don't hide that stuff
the amount of visual noise explodes. I couldn't think of any better
solution.

> In commit 17a20e09f086 ("netfilter: nft_set: remove one argument from
> lookup and update functions") I incorrectly moved the "ret" scope into
> the loop.

This took me quite a bit to review because I stupidly assumed that that
commit was the same as the "return b;" -> "ret = b;" recent change :)
...that part is my fault of course, I didn't read the reference.

I really couldn't understand how turning those into "else if" would have
anything to do with the scope "ret"... until I realised and actually
read your reference. Oops.

But I have to say this was triggered by reading the next part which had
little to do with this...

> This has no effect on the correctness, but it can (depending on map sizes)
> cause a redundant repeat of an earlier processing step.
> 
> Restore the intended 'pass map index' instead of always-0.
> Note that I did not see any change in performance numbers, so
> an alternative would be to axe this optimization and go with
> slightly simpler code instead.

This probably depends a lot on the data set we're using. I'm not sure
if the current performance tests would consistently trigger a sparse
intermediate matching bitmap (between fields) with a lot of leading
zeroes.

By the way, I don't think a branch predictor could figure that out. So
all in all I would keep the optimisation you're restoring here.

> Additionally, a recent LLM review pointed out following "bug":
>  -------------------------------------------------------------
>  >               b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  >               if (last)
>  > -                     return b;
>  > +                     ret = b;
>  >
>  >               if (unlikely(ret == -1))
>  >                       ret = b / XSAVE_YMM_SIZE;  
> 
>  Does this change introduce a logic error when last=true and no match is
>  found? The old code used 'return b;' which immediately exited the loop. The
>  new code changes this to 'ret = b;' to allow loop continuation, but when
>  last=true and nft_pipapo_avx2_refill() returns -1 (no match found), the
>  execution flow becomes:
> 
>  1. ret = -1 (from 'if (last) ret = b;')
>  2. The condition 'if (unlikely(ret == -1))' evaluates to TRUE
>  3. ret = -1 / XSAVE_YMM_SIZE = -1 / 32 = 0 (integer division)
>  4. Loop continues with ret=0
> 
>  [..]
> 
>  Should this be changed to an else-if structure instead?
>  -------------------------------------------------------------
> 
> All call sites invoke nft_pipapo_avx2_refill() only when at least one
> bit in the map is set, i.e. nft_pipapo_avx2_refill() never returns -1.

It also took me a while to understand the actual reason behind this LLM
report.

It simply seems to be inferring that your patch introduced a control
flow change in a case that would never happen anyway, so that wasn't
expected, and even though it doesn't happen now either, it reports that
it was unexpected... not exactly useful.

But anyway:

> Add a runtime debug check that fires if we'd return -1 as additional
> documentation and also make the suggested change, code might be easier
> to understand this way.

...for me it's exactly as readable as before. If you think that it
might be easier to understand this way, though, I guess that's a good
indication that it's better to change it. So, either way:

> Signed-off-by: Florian Westphal <fw@strlen.de>

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

> ---
>  net/netfilter/nft_set_pipapo_avx2.c | 35 ++++++++++++-----------------
>  1 file changed, 14 insertions(+), 21 deletions(-)
> 
> diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> index dad265807b8b..b3f105520a85 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.c
> +++ b/net/netfilter/nft_set_pipapo_avx2.c
> @@ -144,6 +144,7 @@ static void nft_pipapo_avx2_fill(unsigned long *data, int start, int len)
>   * This is an alternative implementation of pipapo_refill() suitable for usage
>   * with AVX2 lookup routines: we know there are four words to be scanned, at
>   * a given offset inside the map, for each matching iteration.
> + * The caller must ensure at least one bit in the four words is set.
>   *
>   * This function doesn't actually use any AVX2 instruction.
>   *
> @@ -179,6 +180,7 @@ static int nft_pipapo_avx2_refill(int offset, unsigned long *map,
>  	NFT_PIPAPO_AVX2_REFILL_ONE_WORD(3);
>  #undef NFT_PIPAPO_AVX2_REFILL_ONE_WORD
>  
> +	DEBUG_NET_WARN_ON_ONCE(ret < 0);
>  	return ret;
>  }
>  
> @@ -243,8 +245,7 @@ static int nft_pipapo_avx2_lookup_4b_2(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -320,8 +321,7 @@ static int nft_pipapo_avx2_lookup_4b_4(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -415,8 +415,7 @@ static int nft_pipapo_avx2_lookup_4b_8(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -506,8 +505,7 @@ static int nft_pipapo_avx2_lookup_4b_12(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -642,8 +640,7 @@ static int nft_pipapo_avx2_lookup_4b_32(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -700,8 +697,7 @@ static int nft_pipapo_avx2_lookup_8b_1(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -765,8 +761,7 @@ static int nft_pipapo_avx2_lookup_8b_2(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -840,8 +835,7 @@ static int nft_pipapo_avx2_lookup_8b_4(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -926,8 +920,7 @@ static int nft_pipapo_avx2_lookup_8b_6(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -1020,8 +1013,7 @@ static int nft_pipapo_avx2_lookup_8b_16(unsigned long *map, unsigned long *fill,
>  		b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last);
>  		if (last)
>  			ret = b;
> -
> -		if (unlikely(ret == -1))
> +		else if (unlikely(ret == -1))
>  			ret = b / XSAVE_YMM_SIZE;
>  
>  		continue;
> @@ -1143,6 +1135,7 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  	const struct nft_pipapo_field *f;
>  	unsigned long *res, *fill, *map;
>  	bool map_index;
> +	int ret = 0;
>  	int i;
>  
>  	scratch = *raw_cpu_ptr(m->scratch);
> @@ -1167,8 +1160,8 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  	nft_pipapo_for_each_field(f, i, m) {
>  		bool last = i == m->field_count - 1, first = !i;
> -		int ret = 0;
>  
> +		/* NB: previous round @ret is passed to avx2 lookup fn */
>  #define NFT_SET_PIPAPO_AVX2_LOOKUP(b, n)				\
>  		(ret = nft_pipapo_avx2_lookup_##b##b_##n(res, fill, f,	\
>  							 ret, data,	\

-- 
Stefano


  reply	other threads:[~2026-04-11 10:30 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-01 11:02 [PATCH nf-next] netfilter: nft_set_pipapo_avx2: restore performance optimization Florian Westphal
2026-04-11 10:30 ` Stefano Brivio [this message]
2026-04-11 12:14   ` Florian Westphal
2026-04-12  9:44     ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260411123015.4a78f491@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.