netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: Florian Westphal <fw@strlen.de>
Cc: <netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH RFC nf-next 1/2] netfilter: nft_set_pipapo_avx2: fix skip of expired entries
Date: Sat, 13 Sep 2025 02:13:26 +0200	[thread overview]
Message-ID: <20250913021326.52fc3ca6@elisabeth> (raw)
In-Reply-To: <20250912132004.7925-1-fw@strlen.de>

On Fri, 12 Sep 2025 15:19:59 +0200
Florian Westphal <fw@strlen.de> wrote:

> KASAN reports following splat:
> BUG: KASAN: slab-out-of-bounds in pipapo_get_avx2+0x941/0x25d0
> Read of size 1 at addr ffff88814c561be0 by task nft/3944
> 
> CPU: 7 UID: 0 PID: 3944 Comm: nft Not tainted 6.17.0-rc4+ #637 PREEMPT(full)
> Call Trace:
>  pipapo_get_avx2+0x941/0x25d0
>  ? __local_bh_enable_ip+0x116/0x1a0
>  ? pipapo_get_avx2+0xee/0x25d0
>  ? nft_pipapo_insert+0x22b/0x11b0
>  nft_pipapo_insert+0x440/0x11b0
>  nf_tables_newsetelem+0x220a/0x3a00
>  ..
> 
> This bisects down to
> 84c1da7b38d9 ("netfilter: nft_set_pipapo: use AVX2 algorithm for insertions too"),
> however, it merely uncovers this bug.
> 
> When we find a match but that match has expired or timed out, the AVX2
> implementation restarts the full match loop.
> 
> At that point, data (key element or start of register space with the key)
> has already been incremented to point to the last key field:
> out-of-bounds access occurs.

Oops.

By the way, you're referring to 'rp' here, and nothing else, right?
Could you mention that explicitly if that's the case? I have to say
that "data (...) has already been incremented" took me a while to
understand.

> The restart logic in AVX2 is different compared to the plain C
> implementation, but both should follow the same logic.

That's because I wanted to avoid calling pipapo_refill() from the AVX2
lookup path, as it was significantly slower than re-doing the full
lookup, at least for "net, port" sets which I assumed would be the most
common.

On the other hand, it should always be a corner case (right?), so I
guess simplicity / consistency should prevail.

> The C implementation just calls pipapo_refill() again to check the next
> entry.  Do the same in the AVX2 implementation.

An alternative would be to reset rp = key, but maybe what you're doing
is actually saner, see above.

> Note that with this change, due to implementation differences of
> pipapo_refill vs. nft_pipapo_avx2_refill, the refill call will return
> the same element again, then, on the next call it will move to the next
> entry as expected.  This is because avx2_refill doesn't clear the bitmap
> in the 'last' conditional.  This is harmless.
> 
> A selftest test case comes in a followup patch.
> 
> Sent as RFC tag because it needs to be revamped after net -> net-next
> merge, there are conflicting changes in these two trees at the moment.
> 
> Another alternative is to retarget this patch to nf, but given its
> a day-0 bug that only got exposed due to the use of AVX2 in insertion
> path added recently I think -next is fine.
> 
> Cc: Stefano Brivio <sbrivio@redhat.com>
> Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/netfilter/nft_set_pipapo_avx2.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> index 7559306d0aed..d97b67a4de16 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.c
> +++ b/net/netfilter/nft_set_pipapo_avx2.c
> @@ -1179,7 +1179,6 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  	nft_pipapo_avx2_prepare();
>  
> -next_match:
>  	nft_pipapo_for_each_field(f, i, m) {
>  		bool last = i == m->field_count - 1, first = !i;
>  		int ret = 0;
> @@ -1226,6 +1225,7 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  #undef NFT_SET_PIPAPO_AVX2_LOOKUP
>  
> +next_match:
>  		if (ret < 0) {
>  			scratch->map_index = map_index;
>  			kernel_fpu_end();
> @@ -1238,8 +1238,10 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  			e = f->mt[ret].e;
>  			if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
> -				     !nft_set_elem_active(&e->ext, genmask)))
> -				goto next_match;
> +				     !nft_set_elem_active(&e->ext, genmask))) {
> +				ret = pipapo_refill(res, f->bsize, f->rules, fill, f->mt, last);

It could be wrapped like this:

				ret = pipapo_refill(res, f->bsize, f->rules,
						    fill, f->mt, last);

> +				goto next_match:
> +			}
>  
>  			scratch->map_index = map_index;
>  			kernel_fpu_end();

In any case:

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

-- 
Stefano


  parent reply	other threads:[~2025-09-13  0:13 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-12 13:19 [PATCH RFC nf-next 1/2] netfilter: nft_set_pipapo_avx2: fix skip of expired entries Florian Westphal
2025-09-12 13:20 ` [PATCH RFC nf-next 2/2] selftests: netfilter: nft_concat_range.sh: add check for double-create bug Florian Westphal
2025-09-13  0:13   ` Stefano Brivio
2025-09-13  0:13 ` Stefano Brivio [this message]
2025-09-13 20:17   ` [PATCH RFC nf-next 1/2] netfilter: nft_set_pipapo_avx2: fix skip of expired entries Florian Westphal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250913021326.52fc3ca6@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).