* [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions
@ 2025-08-15 14:36 Florian Westphal
2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal
2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal
0 siblings, 2 replies; 10+ messages in thread
From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw)
To: netfilter-devel; +Cc: sbrivio, Florian Westphal
Always prefer the avx2 implementation if its available.
This greatly improves insertion performance (each insertion
checks if the new element would overlap with an existing one):
time nft -f - <<EOF
table ip pipapo {
set s {
typeof ip saddr . tcp dport
flags interval
size 800000
elements = { 10.1.1.1 - 10.1.1.4 . 3996,
[.. 800k entries elided .. ]
before:
real 1m55.993s
user 0m2.505s
sys 1m53.296s
after:
real 0m42.586s
user 0m2.554s
sys 0m39.811s
First patch does some refactoring so the common part can be reused
for both packetpath and control plane.
Second patch alters control plane to use avx2.
Florian Westphal (2):
netfilter: nft_set_pipapo_avx2: split lookup function in two parts
netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
net/netfilter/nft_set_pipapo.c | 47 ++++++++--
net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++-----------
net/netfilter/nft_set_pipapo_avx2.h | 4 +
3 files changed, 122 insertions(+), 56 deletions(-)
--
2.49.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts
2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal
@ 2025-08-15 14:36 ` Florian Westphal
2025-08-18 16:29 ` Stefano Brivio
2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal
1 sibling, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw)
To: netfilter-devel; +Cc: sbrivio, Florian Westphal
Split the main avx2 lookup function into a helper.
This is a preparation patch: followup change will use the new helper
from the insertion path if possible. This greatly improves insertion
performance when avx2 is supported.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++-----------
1 file changed, 76 insertions(+), 51 deletions(-)
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index 2f090e253caf..d512877cc211 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1133,58 +1133,35 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns
}
/**
- * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation
- * @net: Network namespace
- * @set: nftables API set representation
- * @key: nftables API element representation containing key data
+ * pipapo_get_avx2() - Lookup function for AVX2 implementation
+ * @m: storage containing the set elements
+ * @data: Key data to be matched against existing elements
+ * @genmask: If set, check that element is active in given genmask
+ * @tstamp: timestamp to check for expired elements
*
* For more details, see DOC: Theory of Operation in nft_set_pipapo.c.
*
* This implementation exploits the repetitive characteristic of the algorithm
* to provide a fast, vectorised version using the AVX2 SIMD instruction set.
*
- * Return: true on match, false otherwise.
+ * The caller MUST check that the fpu is usable.
+ * This function must be called with BH disabled.
+ *
+ * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
*/
-const struct nft_set_ext *
-nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
- const u32 *key)
+static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
+ const u8 *data, u8 genmask,
+ u64 tstamp)
{
- struct nft_pipapo *priv = nft_set_priv(set);
- const struct nft_set_ext *ext = NULL;
struct nft_pipapo_scratch *scratch;
- u8 genmask = nft_genmask_cur(net);
- const struct nft_pipapo_match *m;
const struct nft_pipapo_field *f;
- const u8 *rp = (const u8 *)key;
unsigned long *res, *fill;
bool map_index;
int i;
- local_bh_disable();
-
- if (unlikely(!irq_fpu_usable())) {
- ext = nft_pipapo_lookup(net, set, key);
-
- local_bh_enable();
- return ext;
- }
-
- m = rcu_dereference(priv->match);
-
- /* This also protects access to all data related to scratch maps.
- *
- * Note that we don't need a valid MXCSR state for any of the
- * operations we use here, so pass 0 as mask and spare a LDMXCSR
- * instruction.
- */
- kernel_fpu_begin_mask(0);
-
scratch = *raw_cpu_ptr(m->scratch);
- if (unlikely(!scratch)) {
- kernel_fpu_end();
- local_bh_enable();
+ if (unlikely(!scratch))
return NULL;
- }
map_index = scratch->map_index;
@@ -1202,7 +1179,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
#define NFT_SET_PIPAPO_AVX2_LOOKUP(b, n) \
(ret = nft_pipapo_avx2_lookup_##b##b_##n(res, fill, f, \
- ret, rp, \
+ ret, data, \
first, last))
if (likely(f->bb == 8)) {
@@ -1218,7 +1195,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
NFT_SET_PIPAPO_AVX2_LOOKUP(8, 16);
} else {
ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
- ret, rp,
+ ret, data,
first, last);
}
} else {
@@ -1234,7 +1211,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
NFT_SET_PIPAPO_AVX2_LOOKUP(4, 32);
} else {
ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
- ret, rp,
+ ret, data,
first, last);
}
}
@@ -1242,29 +1219,77 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
#undef NFT_SET_PIPAPO_AVX2_LOOKUP
- if (ret < 0)
- goto out;
+ if (ret < 0) {
+ scratch->map_index = map_index;
+ return NULL;
+ }
if (last) {
- const struct nft_set_ext *e = &f->mt[ret].e->ext;
+ struct nft_pipapo_elem *e;
- if (unlikely(nft_set_elem_expired(e) ||
- !nft_set_elem_active(e, genmask)))
+ e = f->mt[ret].e;
+ if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
+ !nft_set_elem_active(&e->ext, genmask)))
goto next_match;
- ext = e;
- goto out;
+ scratch->map_index = map_index;
+ return e;
}
+ map_index = !map_index;
swap(res, fill);
- rp += NFT_PIPAPO_GROUPS_PADDED_SIZE(f);
+ data += NFT_PIPAPO_GROUPS_PADDED_SIZE(f);
+ }
+
+ return NULL;
+}
+
+/**
+ * nft_pipapo_avx2_lookup() - Dataplane frontend for AVX2 implementation
+ * @net: Network namespace
+ * @set: nftables API set representation
+ * @key: nftables API element representation containing key data
+ *
+ * This function is called from the data path. It will search for
+ * an element matching the given key in the current active copy using
+ * the avx2 routines if the fpu is usable or fall back to the generic
+ * implementation of the algorithm otherwise.
+ *
+ * Return: ntables API extension pointer or NULL if no match.
+ */
+const struct nft_set_ext *
+nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
+ const u32 *key)
+{
+ struct nft_pipapo *priv = nft_set_priv(set);
+ u8 genmask = nft_genmask_cur(net);
+ const struct nft_pipapo_match *m;
+ const u8 *rp = (const u8 *)key;
+ const struct nft_pipapo_elem *e;
+
+ local_bh_disable();
+
+ if (unlikely(!irq_fpu_usable())) {
+ const struct nft_set_ext *ext;
+
+ ext = nft_pipapo_lookup(net, set, key);
+
+ local_bh_enable();
+ return ext;
}
-out:
- if (i % 2)
- scratch->map_index = !map_index;
+ m = rcu_dereference(priv->match);
+
+ /* This also protects access to all data related to scratch maps.
+ *
+ * Note that we don't need a valid MXCSR state for any of the
+ * operations we use here, so pass 0 as mask and spare a LDMXCSR
+ * instruction.
+ */
+ kernel_fpu_begin_mask(0);
+ e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64());
kernel_fpu_end();
local_bh_enable();
- return ext;
+ return e ? &e->ext : NULL;
}
--
2.49.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal
2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal
@ 2025-08-15 14:36 ` Florian Westphal
2025-08-18 16:32 ` Stefano Brivio
1 sibling, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw)
To: netfilter-devel; +Cc: sbrivio, Florian Westphal
Always prefer the avx2 implementation if its available.
This greatly improves insertion performance (each insertion
checks if the new element would overlap with an existing one):
time nft -f - <<EOF
table ip pipapo {
set s {
typeof ip saddr . tcp dport
flags interval
size 800000
elements = { 10.1.1.1 - 10.1.1.4 . 3996,
[.. 800k entries elided .. ]
before:
real 1m55.993s
user 0m2.505s
sys 1m53.296s
after:
real 0m42.586s
user 0m2.554s
sys 0m39.811s
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_set_pipapo.c | 47 ++++++++++++++++++++++++++---
net/netfilter/nft_set_pipapo_avx2.c | 6 ++--
net/netfilter/nft_set_pipapo_avx2.h | 4 +++
3 files changed, 49 insertions(+), 8 deletions(-)
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 9a10251228fd..affa473b13a6 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -397,7 +397,7 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules,
}
/**
- * pipapo_get() - Get matching element reference given key data
+ * pipapo_get_slow() - Get matching element reference given key data
* @m: storage containing the set elements
* @data: Key data to be matched against existing elements
* @genmask: If set, check that element is active in given genmask
@@ -414,9 +414,9 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules,
*
* Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
*/
-static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
- const u8 *data, u8 genmask,
- u64 tstamp)
+static struct nft_pipapo_elem *pipapo_get_slow(const struct nft_pipapo_match *m,
+ const u8 *data, u8 genmask,
+ u64 tstamp)
{
struct nft_pipapo_scratch *scratch;
unsigned long *res_map, *fill_map;
@@ -502,6 +502,43 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
return NULL;
}
+/**
+ * pipapo_get() - Get matching element reference given key data
+ * @m: storage containing the set elements
+ * @data: Key data to be matched against existing elements
+ * @genmask: If set, check that element is active in given genmask
+ * @tstamp: timestamp to check for expired elements
+ *
+ * This is a dispatcher function, either calling out the generic C
+ * implementation or, if available, the avx2 one.
+ * This helper is only called from the control plane, with either rcu
+ * read lock or transaction mutex held.
+ *
+ * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
+ */
+static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
+ const u8 *data, u8 genmask,
+ u64 tstamp)
+{
+ struct nft_pipapo_elem *e;
+
+ local_bh_disable();
+
+#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
+ if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) &&
+ irq_fpu_usable()) {
+ kernel_fpu_begin_mask(0);
+ e = pipapo_get_avx2(m, data, genmask, tstamp);
+ kernel_fpu_end();
+ local_bh_enable();
+ return e;
+ }
+#endif
+ e = pipapo_get_slow(m, data, genmask, tstamp);
+ local_bh_enable();
+ return e;
+}
+
/**
* nft_pipapo_lookup() - Dataplane fronted for main lookup function
* @net: Network namespace
@@ -523,7 +560,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
const struct nft_pipapo_elem *e;
m = rcu_dereference(priv->match);
- e = pipapo_get(m, (const u8 *)key, genmask, get_jiffies_64());
+ e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64());
return e ? &e->ext : NULL;
}
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index d512877cc211..7a9900f8ce2f 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1149,9 +1149,9 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns
*
* Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
*/
-static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
- const u8 *data, u8 genmask,
- u64 tstamp)
+struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
+ const u8 *data, u8 genmask,
+ u64 tstamp)
{
struct nft_pipapo_scratch *scratch;
const struct nft_pipapo_field *f;
diff --git a/net/netfilter/nft_set_pipapo_avx2.h b/net/netfilter/nft_set_pipapo_avx2.h
index dbb6aaca8a7a..c2999b63da3f 100644
--- a/net/netfilter/nft_set_pipapo_avx2.h
+++ b/net/netfilter/nft_set_pipapo_avx2.h
@@ -5,8 +5,12 @@
#include <asm/fpu/xstate.h>
#define NFT_PIPAPO_ALIGN (XSAVE_YMM_SIZE / BITS_PER_BYTE)
+struct nft_pipapo_match;
bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features,
struct nft_set_estimate *est);
+struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
+ const u8 *data, u8 genmask,
+ u64 tstamp);
#endif /* defined(CONFIG_X86_64) && !defined(CONFIG_UML) */
#endif /* _NFT_SET_PIPAPO_AVX2_H */
--
2.49.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts
2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal
@ 2025-08-18 16:29 ` Stefano Brivio
2025-08-18 18:23 ` Florian Westphal
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-08-18 16:29 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Fri, 15 Aug 2025 16:36:57 +0200
Florian Westphal <fw@strlen.de> wrote:
> Split the main avx2 lookup function into a helper.
>
> This is a preparation patch: followup change will use the new helper
> from the insertion path if possible. This greatly improves insertion
> performance when avx2 is supported.
Ah, nice. This was on my to-do list at some point, but then I thought
insertion time didn't matter as much, now clearly disproved by the
numbers from 2/2.
One actual concern and a few nits:
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
> net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++-----------
> 1 file changed, 76 insertions(+), 51 deletions(-)
>
> diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> index 2f090e253caf..d512877cc211 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.c
> +++ b/net/netfilter/nft_set_pipapo_avx2.c
> @@ -1133,58 +1133,35 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns
> }
>
> /**
> - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation
> - * @net: Network namespace
> - * @set: nftables API set representation
> - * @key: nftables API element representation containing key data
> + * pipapo_get_avx2() - Lookup function for AVX2 implementation
> + * @m: storage containing the set elements
> + * @data: Key data to be matched against existing elements
> + * @genmask: If set, check that element is active in given genmask
> + * @tstamp: timestamp to check for expired elements
Nits: Storage, Timestamp (or all lowercase, for consistency with the
other ones).
> *
> * For more details, see DOC: Theory of Operation in nft_set_pipapo.c.
> *
> * This implementation exploits the repetitive characteristic of the algorithm
> * to provide a fast, vectorised version using the AVX2 SIMD instruction set.
> *
> - * Return: true on match, false otherwise.
> + * The caller MUST check that the fpu is usable.
FPU
> + * This function must be called with BH disabled.
> + *
> + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
> */
> -const struct nft_set_ext *
> -nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
> - const u32 *key)
> +static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
> + const u8 *data, u8 genmask,
> + u64 tstamp)
> {
> - struct nft_pipapo *priv = nft_set_priv(set);
> - const struct nft_set_ext *ext = NULL;
> struct nft_pipapo_scratch *scratch;
> - u8 genmask = nft_genmask_cur(net);
> - const struct nft_pipapo_match *m;
> const struct nft_pipapo_field *f;
> - const u8 *rp = (const u8 *)key;
> unsigned long *res, *fill;
> bool map_index;
> int i;
>
> - local_bh_disable();
> -
> - if (unlikely(!irq_fpu_usable())) {
> - ext = nft_pipapo_lookup(net, set, key);
> -
> - local_bh_enable();
> - return ext;
> - }
> -
> - m = rcu_dereference(priv->match);
> -
> - /* This also protects access to all data related to scratch maps.
> - *
> - * Note that we don't need a valid MXCSR state for any of the
> - * operations we use here, so pass 0 as mask and spare a LDMXCSR
> - * instruction.
> - */
> - kernel_fpu_begin_mask(0);
> -
> scratch = *raw_cpu_ptr(m->scratch);
> - if (unlikely(!scratch)) {
> - kernel_fpu_end();
> - local_bh_enable();
> + if (unlikely(!scratch))
> return NULL;
> - }
>
> map_index = scratch->map_index;
>
> @@ -1202,7 +1179,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
>
> #define NFT_SET_PIPAPO_AVX2_LOOKUP(b, n) \
> (ret = nft_pipapo_avx2_lookup_##b##b_##n(res, fill, f, \
> - ret, rp, \
> + ret, data, \
> first, last))
>
> if (likely(f->bb == 8)) {
> @@ -1218,7 +1195,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
> NFT_SET_PIPAPO_AVX2_LOOKUP(8, 16);
> } else {
> ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
> - ret, rp,
> + ret, data,
> first, last);
> }
> } else {
> @@ -1234,7 +1211,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
> NFT_SET_PIPAPO_AVX2_LOOKUP(4, 32);
> } else {
> ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f,
> - ret, rp,
> + ret, data,
> first, last);
> }
> }
> @@ -1242,29 +1219,77 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
>
> #undef NFT_SET_PIPAPO_AVX2_LOOKUP
>
> - if (ret < 0)
> - goto out;
> + if (ret < 0) {
> + scratch->map_index = map_index;
> + return NULL;
> + }
>
> if (last) {
> - const struct nft_set_ext *e = &f->mt[ret].e->ext;
> + struct nft_pipapo_elem *e;
>
> - if (unlikely(nft_set_elem_expired(e) ||
> - !nft_set_elem_active(e, genmask)))
> + e = f->mt[ret].e;
> + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
Here's the actual concern, even if I haven't tested this: I guess you now
pass the timestamp to this function instead of getting it with each
nft_set_elem_expired() call for either correctness (it should be done at
the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64
the overhead isn't necessarily trivial).
As long as machines supporting AVX2 are concerned, we can assume in general
that BITS_PER_LONG >= 64 (even though I'm not sure about x32).
But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2
code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT
extension).
Does that risk causing a regression on non-AVX2? If it's for correctness,
I think we shouldn't care, but if it's done as an optimisation, perhaps
it's not a universal one.
> + !nft_set_elem_active(&e->ext, genmask)))
> goto next_match;
>
> - ext = e;
> - goto out;
> + scratch->map_index = map_index;
> + return e;
> }
>
> + map_index = !map_index;
> swap(res, fill);
> - rp += NFT_PIPAPO_GROUPS_PADDED_SIZE(f);
> + data += NFT_PIPAPO_GROUPS_PADDED_SIZE(f);
> + }
> +
> + return NULL;
> +}
> +
> +/**
> + * nft_pipapo_avx2_lookup() - Dataplane frontend for AVX2 implementation
> + * @net: Network namespace
> + * @set: nftables API set representation
> + * @key: nftables API element representation containing key data
> + *
> + * This function is called from the data path. It will search for
> + * an element matching the given key in the current active copy using
> + * the avx2 routines if the fpu is usable or fall back to the generic
AVX2, FPU
> + * implementation of the algorithm otherwise.
> + *
> + * Return: ntables API extension pointer or NULL if no match.
nftables
> + */
> +const struct nft_set_ext *
> +nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
> + const u32 *key)
> +{
> + struct nft_pipapo *priv = nft_set_priv(set);
> + u8 genmask = nft_genmask_cur(net);
> + const struct nft_pipapo_match *m;
> + const u8 *rp = (const u8 *)key;
> + const struct nft_pipapo_elem *e;
> +
> + local_bh_disable();
> +
> + if (unlikely(!irq_fpu_usable())) {
> + const struct nft_set_ext *ext;
> +
> + ext = nft_pipapo_lookup(net, set, key);
> +
> + local_bh_enable();
> + return ext;
> }
>
> -out:
> - if (i % 2)
> - scratch->map_index = !map_index;
I originally did it like this to avoid an assignment in the fast path,
but the amount of time I just spent figuring out why I did this shows
that perhaps setting map_index = !map_index as you do now is worth it.
> + m = rcu_dereference(priv->match);
> +
> + /* This also protects access to all data related to scratch maps.
> + *
> + * Note that we don't need a valid MXCSR state for any of the
> + * operations we use here, so pass 0 as mask and spare a LDMXCSR
> + * instruction.
> + */
> + kernel_fpu_begin_mask(0);
I would leave an empty line here so that it's clear what "This" in the
comment refers to (it always sounds a bit confusing to me that
kernel_fpu_begin_mask() actually does that, hence the comment).
> + e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64());
> kernel_fpu_end();
> local_bh_enable();
>
> - return ext;
> + return e ? &e->ext : NULL;
> }
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal
@ 2025-08-18 16:32 ` Stefano Brivio
2025-08-18 18:25 ` Florian Westphal
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-08-18 16:32 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Fri, 15 Aug 2025 16:36:58 +0200
Florian Westphal <fw@strlen.de> wrote:
> Always prefer the avx2 implementation if its available.
> This greatly improves insertion performance (each insertion
> checks if the new element would overlap with an existing one):
>
> time nft -f - <<EOF
> table ip pipapo {
> set s {
> typeof ip saddr . tcp dport
> flags interval
> size 800000
> elements = { 10.1.1.1 - 10.1.1.4 . 3996,
> [.. 800k entries elided .. ]
>
> before:
> real 1m55.993s
> user 0m2.505s
> sys 1m53.296s
>
> after:
> real 0m42.586s
> user 0m2.554s
> sys 0m39.811s
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
> net/netfilter/nft_set_pipapo.c | 47 ++++++++++++++++++++++++++---
> net/netfilter/nft_set_pipapo_avx2.c | 6 ++--
> net/netfilter/nft_set_pipapo_avx2.h | 4 +++
> 3 files changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
> index 9a10251228fd..affa473b13a6 100644
> --- a/net/netfilter/nft_set_pipapo.c
> +++ b/net/netfilter/nft_set_pipapo.c
> @@ -397,7 +397,7 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules,
> }
>
> /**
> - * pipapo_get() - Get matching element reference given key data
> + * pipapo_get_slow() - Get matching element reference given key data
> * @m: storage containing the set elements
> * @data: Key data to be matched against existing elements
> * @genmask: If set, check that element is active in given genmask
> @@ -414,9 +414,9 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules,
> *
> * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
> */
> -static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> - const u8 *data, u8 genmask,
> - u64 tstamp)
> +static struct nft_pipapo_elem *pipapo_get_slow(const struct nft_pipapo_match *m,
> + const u8 *data, u8 genmask,
> + u64 tstamp)
> {
> struct nft_pipapo_scratch *scratch;
> unsigned long *res_map, *fill_map;
> @@ -502,6 +502,43 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> return NULL;
> }
>
> +/**
> + * pipapo_get() - Get matching element reference given key data
> + * @m: storage containing the set elements
> + * @data: Key data to be matched against existing elements
> + * @genmask: If set, check that element is active in given genmask
> + * @tstamp: timestamp to check for expired elements
> + *
> + * This is a dispatcher function, either calling out the generic C
> + * implementation or, if available, the avx2 one.
> + * This helper is only called from the control plane, with either rcu
Nit: AVX2, RCU
> + * read lock or transaction mutex held.
> + *
> + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
> + */
> +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> + const u8 *data, u8 genmask,
> + u64 tstamp)
> +{
> + struct nft_pipapo_elem *e;
> +
> + local_bh_disable();
> +
> +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
> + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) &&
I don't have any straightforward idea on how to avoid introducing AVX2
stuff (even if compiled out) in the generic function, which we had
managed to avoid so far. I don't think it's a big deal, though.
> + irq_fpu_usable()) {
> + kernel_fpu_begin_mask(0);
> + e = pipapo_get_avx2(m, data, genmask, tstamp);
> + kernel_fpu_end();
> + local_bh_enable();
> + return e;
> + }
> +#endif
> + e = pipapo_get_slow(m, data, genmask, tstamp);
> + local_bh_enable();
> + return e;
> +}
> +
> /**
> * nft_pipapo_lookup() - Dataplane fronted for main lookup function
> * @net: Network namespace
> @@ -523,7 +560,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
> const struct nft_pipapo_elem *e;
>
> m = rcu_dereference(priv->match);
> - e = pipapo_get(m, (const u8 *)key, genmask, get_jiffies_64());
> + e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64());
>
> return e ? &e->ext : NULL;
> }
> diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> index d512877cc211..7a9900f8ce2f 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.c
> +++ b/net/netfilter/nft_set_pipapo_avx2.c
> @@ -1149,9 +1149,9 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns
> *
> * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
> */
> -static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
> - const u8 *data, u8 genmask,
> - u64 tstamp)
> +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
> + const u8 *data, u8 genmask,
> + u64 tstamp)
> {
> struct nft_pipapo_scratch *scratch;
> const struct nft_pipapo_field *f;
> diff --git a/net/netfilter/nft_set_pipapo_avx2.h b/net/netfilter/nft_set_pipapo_avx2.h
> index dbb6aaca8a7a..c2999b63da3f 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.h
> +++ b/net/netfilter/nft_set_pipapo_avx2.h
> @@ -5,8 +5,12 @@
> #include <asm/fpu/xstate.h>
> #define NFT_PIPAPO_ALIGN (XSAVE_YMM_SIZE / BITS_PER_BYTE)
>
> +struct nft_pipapo_match;
> bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features,
> struct nft_set_estimate *est);
> +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
> + const u8 *data, u8 genmask,
> + u64 tstamp);
> #endif /* defined(CONFIG_X86_64) && !defined(CONFIG_UML) */
>
> #endif /* _NFT_SET_PIPAPO_AVX2_H */
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts
2025-08-18 16:29 ` Stefano Brivio
@ 2025-08-18 18:23 ` Florian Westphal
2025-08-18 18:56 ` Stefano Brivio
0 siblings, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2025-08-18 18:23 UTC (permalink / raw)
To: Stefano Brivio; +Cc: netfilter-devel
Stefano Brivio <sbrivio@redhat.com> wrote:
> > - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation
> > - * @net: Network namespace
> > - * @set: nftables API set representation
> > - * @key: nftables API element representation containing key data
> > + * pipapo_get_avx2() - Lookup function for AVX2 implementation
> > + * @m: storage containing the set elements
> > + * @data: Key data to be matched against existing elements
> > + * @genmask: If set, check that element is active in given genmask
> > + * @tstamp: timestamp to check for expired elements
>
> Nits: Storage, Timestamp (or all lowercase, for consistency with the
> other ones).
Note that there is no consistency whatsoever in the kernel.
Some use upper case, some lower, some indent on same level (like done
here), some don't.
So, I don't care anymore since it will never be right.
In case i have to mangle it anyway i will "fix" it.
> > + e = f->mt[ret].e;
> > + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
>
> Here's the actual concern, even if I haven't tested this: I guess you now
> pass the timestamp to this function instead of getting it with each
> nft_set_elem_expired() call for either correctness (it should be done at
> the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64
> the overhead isn't necessarily trivial).
Its done because during insertion time should be frozen to avoid
elements timing out while transaction is in progress.
(this is unrelated to this patchset).
But in order to use this snippet from both control and data path
this has to be passed in so it can either be 'now' or 'time at
start of transaction'.
> But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2
> code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT
> extension).
>
> Does that risk causing a regression on non-AVX2? If it's for correctness,
> I think we shouldn't care, but if it's done as an optimisation, perhaps
> it's not a universal one.
Its not an optimisation. I could pass a 'is_control_plane' or
'is_packetpath' but I considered it too verbose and not needed for
correctness.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
2025-08-18 16:32 ` Stefano Brivio
@ 2025-08-18 18:25 ` Florian Westphal
2025-08-18 18:56 ` Stefano Brivio
0 siblings, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2025-08-18 18:25 UTC (permalink / raw)
To: Stefano Brivio; +Cc: netfilter-devel
Stefano Brivio <sbrivio@redhat.com> wrote:
> > +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> > + const u8 *data, u8 genmask,
> > + u64 tstamp)
> > +{
> > + struct nft_pipapo_elem *e;
> > +
> > + local_bh_disable();
> > +
> > +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
> > + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) &&
>
> I don't have any straightforward idea on how to avoid introducing AVX2
> stuff (even if compiled out) in the generic function, which we had
> managed to avoid so far. I don't think it's a big deal, though.
It could be hidden away in a static inline helper if that makes it more
acceptable.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts
2025-08-18 18:23 ` Florian Westphal
@ 2025-08-18 18:56 ` Stefano Brivio
0 siblings, 0 replies; 10+ messages in thread
From: Stefano Brivio @ 2025-08-18 18:56 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Mon, 18 Aug 2025 20:23:49 +0200
Florian Westphal <fw@strlen.de> wrote:
> Stefano Brivio <sbrivio@redhat.com> wrote:
> > > - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation
> > > - * @net: Network namespace
> > > - * @set: nftables API set representation
> > > - * @key: nftables API element representation containing key data
> > > + * pipapo_get_avx2() - Lookup function for AVX2 implementation
> > > + * @m: storage containing the set elements
> > > + * @data: Key data to be matched against existing elements
> > > + * @genmask: If set, check that element is active in given genmask
> > > + * @tstamp: timestamp to check for expired elements
> >
> > Nits: Storage, Timestamp (or all lowercase, for consistency with the
> > other ones).
>
> Note that there is no consistency whatsoever in the kernel.
> Some use upper case, some lower, some indent on same level (like done
> here), some don't.
Yeah, I just meant for consistency with the other parameter descriptions
of this function itself ("the other ones") and of these files.
> So, I don't care anymore since it will never be right.
>
> In case i have to mangle it anyway i will "fix" it.
Either way,
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
> > > + e = f->mt[ret].e;
> > > + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
> >
> > Here's the actual concern, even if I haven't tested this: I guess you now
> > pass the timestamp to this function instead of getting it with each
> > nft_set_elem_expired() call for either correctness (it should be done at
> > the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64
> > the overhead isn't necessarily trivial).
>
> Its done because during insertion time should be frozen to avoid
> elements timing out while transaction is in progress.
> (this is unrelated to this patchset).
>
> But in order to use this snippet from both control and data path
> this has to be passed in so it can either be 'now' or 'time at
> start of transaction'.
Ah, I see now, thanks for explaining!
> > But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2
> > code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT
> > extension).
> >
> > Does that risk causing a regression on non-AVX2? If it's for correctness,
> > I think we shouldn't care, but if it's done as an optimisation, perhaps
> > it's not a universal one.
>
> Its not an optimisation. I could pass a 'is_control_plane' or
> 'is_packetpath' but I considered it too verbose and not needed for
> correctness.
...and it wouldn't help with my potential concern either as
is_packetpath would always be set anyway on lookups, with or without
timeout. Never mind then.
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
2025-08-18 18:25 ` Florian Westphal
@ 2025-08-18 18:56 ` Stefano Brivio
2025-08-18 21:41 ` Florian Westphal
0 siblings, 1 reply; 10+ messages in thread
From: Stefano Brivio @ 2025-08-18 18:56 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Mon, 18 Aug 2025 20:25:02 +0200
Florian Westphal <fw@strlen.de> wrote:
> Stefano Brivio <sbrivio@redhat.com> wrote:
> > > +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> > > + const u8 *data, u8 genmask,
> > > + u64 tstamp)
> > > +{
> > > + struct nft_pipapo_elem *e;
> > > +
> > > + local_bh_disable();
> > > +
> > > +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
> > > + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) &&
> >
> > I don't have any straightforward idea on how to avoid introducing AVX2
> > stuff (even if compiled out) in the generic function, which we had
> > managed to avoid so far. I don't think it's a big deal, though.
>
> It could be hidden away in a static inline helper if that makes it more
> acceptable.
I think that would just make it... hidden. I'd rather leave it like it
is.
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
--
Stefano
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too
2025-08-18 18:56 ` Stefano Brivio
@ 2025-08-18 21:41 ` Florian Westphal
0 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2025-08-18 21:41 UTC (permalink / raw)
To: Stefano Brivio; +Cc: netfilter-devel
Stefano Brivio <sbrivio@redhat.com> wrote:
> I think that would just make it... hidden. I'd rather leave it like it
> is.
>
> Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Thanks. For the record, I did fix up the various typos in the
comments meanwhile, I'll retain your reviewed-by tag as i did
not make code changes.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-08-18 21:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal
2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal
2025-08-18 16:29 ` Stefano Brivio
2025-08-18 18:23 ` Florian Westphal
2025-08-18 18:56 ` Stefano Brivio
2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal
2025-08-18 16:32 ` Stefano Brivio
2025-08-18 18:25 ` Florian Westphal
2025-08-18 18:56 ` Stefano Brivio
2025-08-18 21:41 ` Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).