* [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions @ 2025-08-15 14:36 Florian Westphal 2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal 2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal 0 siblings, 2 replies; 10+ messages in thread From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal Always prefer the avx2 implementation if its available. This greatly improves insertion performance (each insertion checks if the new element would overlap with an existing one): time nft -f - <<EOF table ip pipapo { set s { typeof ip saddr . tcp dport flags interval size 800000 elements = { 10.1.1.1 - 10.1.1.4 . 3996, [.. 800k entries elided .. ] before: real 1m55.993s user 0m2.505s sys 1m53.296s after: real 0m42.586s user 0m2.554s sys 0m39.811s First patch does some refactoring so the common part can be reused for both packetpath and control plane. Second patch alters control plane to use avx2. Florian Westphal (2): netfilter: nft_set_pipapo_avx2: split lookup function in two parts netfilter: nft_set_pipapo: use avx2 algorithm for insertions too net/netfilter/nft_set_pipapo.c | 47 ++++++++-- net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++----------- net/netfilter/nft_set_pipapo_avx2.h | 4 + 3 files changed, 122 insertions(+), 56 deletions(-) -- 2.49.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts 2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal @ 2025-08-15 14:36 ` Florian Westphal 2025-08-18 16:29 ` Stefano Brivio 2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal 1 sibling, 1 reply; 10+ messages in thread From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal Split the main avx2 lookup function into a helper. This is a preparation patch: followup change will use the new helper from the insertion path if possible. This greatly improves insertion performance when avx2 is supported. Signed-off-by: Florian Westphal <fw@strlen.de> --- net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++----------- 1 file changed, 76 insertions(+), 51 deletions(-) diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c index 2f090e253caf..d512877cc211 100644 --- a/net/netfilter/nft_set_pipapo_avx2.c +++ b/net/netfilter/nft_set_pipapo_avx2.c @@ -1133,58 +1133,35 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns } /** - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation - * @net: Network namespace - * @set: nftables API set representation - * @key: nftables API element representation containing key data + * pipapo_get_avx2() - Lookup function for AVX2 implementation + * @m: storage containing the set elements + * @data: Key data to be matched against existing elements + * @genmask: If set, check that element is active in given genmask + * @tstamp: timestamp to check for expired elements * * For more details, see DOC: Theory of Operation in nft_set_pipapo.c. * * This implementation exploits the repetitive characteristic of the algorithm * to provide a fast, vectorised version using the AVX2 SIMD instruction set. * - * Return: true on match, false otherwise. + * The caller MUST check that the fpu is usable. + * This function must be called with BH disabled. + * + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. */ -const struct nft_set_ext * -nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, - const u32 *key) +static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, + const u8 *data, u8 genmask, + u64 tstamp) { - struct nft_pipapo *priv = nft_set_priv(set); - const struct nft_set_ext *ext = NULL; struct nft_pipapo_scratch *scratch; - u8 genmask = nft_genmask_cur(net); - const struct nft_pipapo_match *m; const struct nft_pipapo_field *f; - const u8 *rp = (const u8 *)key; unsigned long *res, *fill; bool map_index; int i; - local_bh_disable(); - - if (unlikely(!irq_fpu_usable())) { - ext = nft_pipapo_lookup(net, set, key); - - local_bh_enable(); - return ext; - } - - m = rcu_dereference(priv->match); - - /* This also protects access to all data related to scratch maps. - * - * Note that we don't need a valid MXCSR state for any of the - * operations we use here, so pass 0 as mask and spare a LDMXCSR - * instruction. - */ - kernel_fpu_begin_mask(0); - scratch = *raw_cpu_ptr(m->scratch); - if (unlikely(!scratch)) { - kernel_fpu_end(); - local_bh_enable(); + if (unlikely(!scratch)) return NULL; - } map_index = scratch->map_index; @@ -1202,7 +1179,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, #define NFT_SET_PIPAPO_AVX2_LOOKUP(b, n) \ (ret = nft_pipapo_avx2_lookup_##b##b_##n(res, fill, f, \ - ret, rp, \ + ret, data, \ first, last)) if (likely(f->bb == 8)) { @@ -1218,7 +1195,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, NFT_SET_PIPAPO_AVX2_LOOKUP(8, 16); } else { ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f, - ret, rp, + ret, data, first, last); } } else { @@ -1234,7 +1211,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, NFT_SET_PIPAPO_AVX2_LOOKUP(4, 32); } else { ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f, - ret, rp, + ret, data, first, last); } } @@ -1242,29 +1219,77 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, #undef NFT_SET_PIPAPO_AVX2_LOOKUP - if (ret < 0) - goto out; + if (ret < 0) { + scratch->map_index = map_index; + return NULL; + } if (last) { - const struct nft_set_ext *e = &f->mt[ret].e->ext; + struct nft_pipapo_elem *e; - if (unlikely(nft_set_elem_expired(e) || - !nft_set_elem_active(e, genmask))) + e = f->mt[ret].e; + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) || + !nft_set_elem_active(&e->ext, genmask))) goto next_match; - ext = e; - goto out; + scratch->map_index = map_index; + return e; } + map_index = !map_index; swap(res, fill); - rp += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); + data += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); + } + + return NULL; +} + +/** + * nft_pipapo_avx2_lookup() - Dataplane frontend for AVX2 implementation + * @net: Network namespace + * @set: nftables API set representation + * @key: nftables API element representation containing key data + * + * This function is called from the data path. It will search for + * an element matching the given key in the current active copy using + * the avx2 routines if the fpu is usable or fall back to the generic + * implementation of the algorithm otherwise. + * + * Return: ntables API extension pointer or NULL if no match. + */ +const struct nft_set_ext * +nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, + const u32 *key) +{ + struct nft_pipapo *priv = nft_set_priv(set); + u8 genmask = nft_genmask_cur(net); + const struct nft_pipapo_match *m; + const u8 *rp = (const u8 *)key; + const struct nft_pipapo_elem *e; + + local_bh_disable(); + + if (unlikely(!irq_fpu_usable())) { + const struct nft_set_ext *ext; + + ext = nft_pipapo_lookup(net, set, key); + + local_bh_enable(); + return ext; } -out: - if (i % 2) - scratch->map_index = !map_index; + m = rcu_dereference(priv->match); + + /* This also protects access to all data related to scratch maps. + * + * Note that we don't need a valid MXCSR state for any of the + * operations we use here, so pass 0 as mask and spare a LDMXCSR + * instruction. + */ + kernel_fpu_begin_mask(0); + e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64()); kernel_fpu_end(); local_bh_enable(); - return ext; + return e ? &e->ext : NULL; } -- 2.49.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts 2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal @ 2025-08-18 16:29 ` Stefano Brivio 2025-08-18 18:23 ` Florian Westphal 0 siblings, 1 reply; 10+ messages in thread From: Stefano Brivio @ 2025-08-18 16:29 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Fri, 15 Aug 2025 16:36:57 +0200 Florian Westphal <fw@strlen.de> wrote: > Split the main avx2 lookup function into a helper. > > This is a preparation patch: followup change will use the new helper > from the insertion path if possible. This greatly improves insertion > performance when avx2 is supported. Ah, nice. This was on my to-do list at some point, but then I thought insertion time didn't matter as much, now clearly disproved by the numbers from 2/2. One actual concern and a few nits: > > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > net/netfilter/nft_set_pipapo_avx2.c | 127 +++++++++++++++++----------- > 1 file changed, 76 insertions(+), 51 deletions(-) > > diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c > index 2f090e253caf..d512877cc211 100644 > --- a/net/netfilter/nft_set_pipapo_avx2.c > +++ b/net/netfilter/nft_set_pipapo_avx2.c > @@ -1133,58 +1133,35 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns > } > > /** > - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation > - * @net: Network namespace > - * @set: nftables API set representation > - * @key: nftables API element representation containing key data > + * pipapo_get_avx2() - Lookup function for AVX2 implementation > + * @m: storage containing the set elements > + * @data: Key data to be matched against existing elements > + * @genmask: If set, check that element is active in given genmask > + * @tstamp: timestamp to check for expired elements Nits: Storage, Timestamp (or all lowercase, for consistency with the other ones). > * > * For more details, see DOC: Theory of Operation in nft_set_pipapo.c. > * > * This implementation exploits the repetitive characteristic of the algorithm > * to provide a fast, vectorised version using the AVX2 SIMD instruction set. > * > - * Return: true on match, false otherwise. > + * The caller MUST check that the fpu is usable. FPU > + * This function must be called with BH disabled. > + * > + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. > */ > -const struct nft_set_ext * > -nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > - const u32 *key) > +static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, > + const u8 *data, u8 genmask, > + u64 tstamp) > { > - struct nft_pipapo *priv = nft_set_priv(set); > - const struct nft_set_ext *ext = NULL; > struct nft_pipapo_scratch *scratch; > - u8 genmask = nft_genmask_cur(net); > - const struct nft_pipapo_match *m; > const struct nft_pipapo_field *f; > - const u8 *rp = (const u8 *)key; > unsigned long *res, *fill; > bool map_index; > int i; > > - local_bh_disable(); > - > - if (unlikely(!irq_fpu_usable())) { > - ext = nft_pipapo_lookup(net, set, key); > - > - local_bh_enable(); > - return ext; > - } > - > - m = rcu_dereference(priv->match); > - > - /* This also protects access to all data related to scratch maps. > - * > - * Note that we don't need a valid MXCSR state for any of the > - * operations we use here, so pass 0 as mask and spare a LDMXCSR > - * instruction. > - */ > - kernel_fpu_begin_mask(0); > - > scratch = *raw_cpu_ptr(m->scratch); > - if (unlikely(!scratch)) { > - kernel_fpu_end(); > - local_bh_enable(); > + if (unlikely(!scratch)) > return NULL; > - } > > map_index = scratch->map_index; > > @@ -1202,7 +1179,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > > #define NFT_SET_PIPAPO_AVX2_LOOKUP(b, n) \ > (ret = nft_pipapo_avx2_lookup_##b##b_##n(res, fill, f, \ > - ret, rp, \ > + ret, data, \ > first, last)) > > if (likely(f->bb == 8)) { > @@ -1218,7 +1195,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > NFT_SET_PIPAPO_AVX2_LOOKUP(8, 16); > } else { > ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f, > - ret, rp, > + ret, data, > first, last); > } > } else { > @@ -1234,7 +1211,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > NFT_SET_PIPAPO_AVX2_LOOKUP(4, 32); > } else { > ret = nft_pipapo_avx2_lookup_slow(m, res, fill, f, > - ret, rp, > + ret, data, > first, last); > } > } > @@ -1242,29 +1219,77 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > > #undef NFT_SET_PIPAPO_AVX2_LOOKUP > > - if (ret < 0) > - goto out; > + if (ret < 0) { > + scratch->map_index = map_index; > + return NULL; > + } > > if (last) { > - const struct nft_set_ext *e = &f->mt[ret].e->ext; > + struct nft_pipapo_elem *e; > > - if (unlikely(nft_set_elem_expired(e) || > - !nft_set_elem_active(e, genmask))) > + e = f->mt[ret].e; > + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) || Here's the actual concern, even if I haven't tested this: I guess you now pass the timestamp to this function instead of getting it with each nft_set_elem_expired() call for either correctness (it should be done at the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64 the overhead isn't necessarily trivial). As long as machines supporting AVX2 are concerned, we can assume in general that BITS_PER_LONG >= 64 (even though I'm not sure about x32). But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2 code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT extension). Does that risk causing a regression on non-AVX2? If it's for correctness, I think we shouldn't care, but if it's done as an optimisation, perhaps it's not a universal one. > + !nft_set_elem_active(&e->ext, genmask))) > goto next_match; > > - ext = e; > - goto out; > + scratch->map_index = map_index; > + return e; > } > > + map_index = !map_index; > swap(res, fill); > - rp += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); > + data += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); > + } > + > + return NULL; > +} > + > +/** > + * nft_pipapo_avx2_lookup() - Dataplane frontend for AVX2 implementation > + * @net: Network namespace > + * @set: nftables API set representation > + * @key: nftables API element representation containing key data > + * > + * This function is called from the data path. It will search for > + * an element matching the given key in the current active copy using > + * the avx2 routines if the fpu is usable or fall back to the generic AVX2, FPU > + * implementation of the algorithm otherwise. > + * > + * Return: ntables API extension pointer or NULL if no match. nftables > + */ > +const struct nft_set_ext * > +nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > + const u32 *key) > +{ > + struct nft_pipapo *priv = nft_set_priv(set); > + u8 genmask = nft_genmask_cur(net); > + const struct nft_pipapo_match *m; > + const u8 *rp = (const u8 *)key; > + const struct nft_pipapo_elem *e; > + > + local_bh_disable(); > + > + if (unlikely(!irq_fpu_usable())) { > + const struct nft_set_ext *ext; > + > + ext = nft_pipapo_lookup(net, set, key); > + > + local_bh_enable(); > + return ext; > } > > -out: > - if (i % 2) > - scratch->map_index = !map_index; I originally did it like this to avoid an assignment in the fast path, but the amount of time I just spent figuring out why I did this shows that perhaps setting map_index = !map_index as you do now is worth it. > + m = rcu_dereference(priv->match); > + > + /* This also protects access to all data related to scratch maps. > + * > + * Note that we don't need a valid MXCSR state for any of the > + * operations we use here, so pass 0 as mask and spare a LDMXCSR > + * instruction. > + */ > + kernel_fpu_begin_mask(0); I would leave an empty line here so that it's clear what "This" in the comment refers to (it always sounds a bit confusing to me that kernel_fpu_begin_mask() actually does that, hence the comment). > + e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64()); > kernel_fpu_end(); > local_bh_enable(); > > - return ext; > + return e ? &e->ext : NULL; > } -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts 2025-08-18 16:29 ` Stefano Brivio @ 2025-08-18 18:23 ` Florian Westphal 2025-08-18 18:56 ` Stefano Brivio 0 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2025-08-18 18:23 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel Stefano Brivio <sbrivio@redhat.com> wrote: > > - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation > > - * @net: Network namespace > > - * @set: nftables API set representation > > - * @key: nftables API element representation containing key data > > + * pipapo_get_avx2() - Lookup function for AVX2 implementation > > + * @m: storage containing the set elements > > + * @data: Key data to be matched against existing elements > > + * @genmask: If set, check that element is active in given genmask > > + * @tstamp: timestamp to check for expired elements > > Nits: Storage, Timestamp (or all lowercase, for consistency with the > other ones). Note that there is no consistency whatsoever in the kernel. Some use upper case, some lower, some indent on same level (like done here), some don't. So, I don't care anymore since it will never be right. In case i have to mangle it anyway i will "fix" it. > > + e = f->mt[ret].e; > > + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) || > > Here's the actual concern, even if I haven't tested this: I guess you now > pass the timestamp to this function instead of getting it with each > nft_set_elem_expired() call for either correctness (it should be done at > the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64 > the overhead isn't necessarily trivial). Its done because during insertion time should be frozen to avoid elements timing out while transaction is in progress. (this is unrelated to this patchset). But in order to use this snippet from both control and data path this has to be passed in so it can either be 'now' or 'time at start of transaction'. > But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2 > code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT > extension). > > Does that risk causing a regression on non-AVX2? If it's for correctness, > I think we shouldn't care, but if it's done as an optimisation, perhaps > it's not a universal one. Its not an optimisation. I could pass a 'is_control_plane' or 'is_packetpath' but I considered it too verbose and not needed for correctness. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts 2025-08-18 18:23 ` Florian Westphal @ 2025-08-18 18:56 ` Stefano Brivio 0 siblings, 0 replies; 10+ messages in thread From: Stefano Brivio @ 2025-08-18 18:56 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Mon, 18 Aug 2025 20:23:49 +0200 Florian Westphal <fw@strlen.de> wrote: > Stefano Brivio <sbrivio@redhat.com> wrote: > > > - * nft_pipapo_avx2_lookup() - Lookup function for AVX2 implementation > > > - * @net: Network namespace > > > - * @set: nftables API set representation > > > - * @key: nftables API element representation containing key data > > > + * pipapo_get_avx2() - Lookup function for AVX2 implementation > > > + * @m: storage containing the set elements > > > + * @data: Key data to be matched against existing elements > > > + * @genmask: If set, check that element is active in given genmask > > > + * @tstamp: timestamp to check for expired elements > > > > Nits: Storage, Timestamp (or all lowercase, for consistency with the > > other ones). > > Note that there is no consistency whatsoever in the kernel. > Some use upper case, some lower, some indent on same level (like done > here), some don't. Yeah, I just meant for consistency with the other parameter descriptions of this function itself ("the other ones") and of these files. > So, I don't care anymore since it will never be right. > > In case i have to mangle it anyway i will "fix" it. Either way, Reviewed-by: Stefano Brivio <sbrivio@redhat.com> > > > + e = f->mt[ret].e; > > > + if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) || > > > > Here's the actual concern, even if I haven't tested this: I guess you now > > pass the timestamp to this function instead of getting it with each > > nft_set_elem_expired() call for either correctness (it should be done at > > the beginning of the insertion?) or as an optimisation (if BITS_PER_LONG < 64 > > the overhead isn't necessarily trivial). > > Its done because during insertion time should be frozen to avoid > elements timing out while transaction is in progress. > (this is unrelated to this patchset). > > But in order to use this snippet from both control and data path > this has to be passed in so it can either be 'now' or 'time at > start of transaction'. Ah, I see now, thanks for explaining! > > But with 2/2, you need to call get_jiffies_64() as a result, from non-AVX2 > > code, even for sets without a timeout (without NFT_SET_EXT_TIMEOUT > > extension). > > > > Does that risk causing a regression on non-AVX2? If it's for correctness, > > I think we shouldn't care, but if it's done as an optimisation, perhaps > > it's not a universal one. > > Its not an optimisation. I could pass a 'is_control_plane' or > 'is_packetpath' but I considered it too verbose and not needed for > correctness. ...and it wouldn't help with my potential concern either as is_packetpath would always be set anyway on lookups, with or without timeout. Never mind then. -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal 2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal @ 2025-08-15 14:36 ` Florian Westphal 2025-08-18 16:32 ` Stefano Brivio 1 sibling, 1 reply; 10+ messages in thread From: Florian Westphal @ 2025-08-15 14:36 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal Always prefer the avx2 implementation if its available. This greatly improves insertion performance (each insertion checks if the new element would overlap with an existing one): time nft -f - <<EOF table ip pipapo { set s { typeof ip saddr . tcp dport flags interval size 800000 elements = { 10.1.1.1 - 10.1.1.4 . 3996, [.. 800k entries elided .. ] before: real 1m55.993s user 0m2.505s sys 1m53.296s after: real 0m42.586s user 0m2.554s sys 0m39.811s Signed-off-by: Florian Westphal <fw@strlen.de> --- net/netfilter/nft_set_pipapo.c | 47 ++++++++++++++++++++++++++--- net/netfilter/nft_set_pipapo_avx2.c | 6 ++-- net/netfilter/nft_set_pipapo_avx2.h | 4 +++ 3 files changed, 49 insertions(+), 8 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 9a10251228fd..affa473b13a6 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -397,7 +397,7 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules, } /** - * pipapo_get() - Get matching element reference given key data + * pipapo_get_slow() - Get matching element reference given key data * @m: storage containing the set elements * @data: Key data to be matched against existing elements * @genmask: If set, check that element is active in given genmask @@ -414,9 +414,9 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules, * * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. */ -static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, - const u8 *data, u8 genmask, - u64 tstamp) +static struct nft_pipapo_elem *pipapo_get_slow(const struct nft_pipapo_match *m, + const u8 *data, u8 genmask, + u64 tstamp) { struct nft_pipapo_scratch *scratch; unsigned long *res_map, *fill_map; @@ -502,6 +502,43 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, return NULL; } +/** + * pipapo_get() - Get matching element reference given key data + * @m: storage containing the set elements + * @data: Key data to be matched against existing elements + * @genmask: If set, check that element is active in given genmask + * @tstamp: timestamp to check for expired elements + * + * This is a dispatcher function, either calling out the generic C + * implementation or, if available, the avx2 one. + * This helper is only called from the control plane, with either rcu + * read lock or transaction mutex held. + * + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. + */ +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, + const u8 *data, u8 genmask, + u64 tstamp) +{ + struct nft_pipapo_elem *e; + + local_bh_disable(); + +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) && + irq_fpu_usable()) { + kernel_fpu_begin_mask(0); + e = pipapo_get_avx2(m, data, genmask, tstamp); + kernel_fpu_end(); + local_bh_enable(); + return e; + } +#endif + e = pipapo_get_slow(m, data, genmask, tstamp); + local_bh_enable(); + return e; +} + /** * nft_pipapo_lookup() - Dataplane fronted for main lookup function * @net: Network namespace @@ -523,7 +560,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set, const struct nft_pipapo_elem *e; m = rcu_dereference(priv->match); - e = pipapo_get(m, (const u8 *)key, genmask, get_jiffies_64()); + e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64()); return e ? &e->ext : NULL; } diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c index d512877cc211..7a9900f8ce2f 100644 --- a/net/netfilter/nft_set_pipapo_avx2.c +++ b/net/netfilter/nft_set_pipapo_avx2.c @@ -1149,9 +1149,9 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns * * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. */ -static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, - const u8 *data, u8 genmask, - u64 tstamp) +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, + const u8 *data, u8 genmask, + u64 tstamp) { struct nft_pipapo_scratch *scratch; const struct nft_pipapo_field *f; diff --git a/net/netfilter/nft_set_pipapo_avx2.h b/net/netfilter/nft_set_pipapo_avx2.h index dbb6aaca8a7a..c2999b63da3f 100644 --- a/net/netfilter/nft_set_pipapo_avx2.h +++ b/net/netfilter/nft_set_pipapo_avx2.h @@ -5,8 +5,12 @@ #include <asm/fpu/xstate.h> #define NFT_PIPAPO_ALIGN (XSAVE_YMM_SIZE / BITS_PER_BYTE) +struct nft_pipapo_match; bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features, struct nft_set_estimate *est); +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, + const u8 *data, u8 genmask, + u64 tstamp); #endif /* defined(CONFIG_X86_64) && !defined(CONFIG_UML) */ #endif /* _NFT_SET_PIPAPO_AVX2_H */ -- 2.49.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal @ 2025-08-18 16:32 ` Stefano Brivio 2025-08-18 18:25 ` Florian Westphal 0 siblings, 1 reply; 10+ messages in thread From: Stefano Brivio @ 2025-08-18 16:32 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Fri, 15 Aug 2025 16:36:58 +0200 Florian Westphal <fw@strlen.de> wrote: > Always prefer the avx2 implementation if its available. > This greatly improves insertion performance (each insertion > checks if the new element would overlap with an existing one): > > time nft -f - <<EOF > table ip pipapo { > set s { > typeof ip saddr . tcp dport > flags interval > size 800000 > elements = { 10.1.1.1 - 10.1.1.4 . 3996, > [.. 800k entries elided .. ] > > before: > real 1m55.993s > user 0m2.505s > sys 1m53.296s > > after: > real 0m42.586s > user 0m2.554s > sys 0m39.811s > > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > net/netfilter/nft_set_pipapo.c | 47 ++++++++++++++++++++++++++--- > net/netfilter/nft_set_pipapo_avx2.c | 6 ++-- > net/netfilter/nft_set_pipapo_avx2.h | 4 +++ > 3 files changed, 49 insertions(+), 8 deletions(-) > > diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c > index 9a10251228fd..affa473b13a6 100644 > --- a/net/netfilter/nft_set_pipapo.c > +++ b/net/netfilter/nft_set_pipapo.c > @@ -397,7 +397,7 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules, > } > > /** > - * pipapo_get() - Get matching element reference given key data > + * pipapo_get_slow() - Get matching element reference given key data > * @m: storage containing the set elements > * @data: Key data to be matched against existing elements > * @genmask: If set, check that element is active in given genmask > @@ -414,9 +414,9 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules, > * > * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. > */ > -static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, > - const u8 *data, u8 genmask, > - u64 tstamp) > +static struct nft_pipapo_elem *pipapo_get_slow(const struct nft_pipapo_match *m, > + const u8 *data, u8 genmask, > + u64 tstamp) > { > struct nft_pipapo_scratch *scratch; > unsigned long *res_map, *fill_map; > @@ -502,6 +502,43 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, > return NULL; > } > > +/** > + * pipapo_get() - Get matching element reference given key data > + * @m: storage containing the set elements > + * @data: Key data to be matched against existing elements > + * @genmask: If set, check that element is active in given genmask > + * @tstamp: timestamp to check for expired elements > + * > + * This is a dispatcher function, either calling out the generic C > + * implementation or, if available, the avx2 one. > + * This helper is only called from the control plane, with either rcu Nit: AVX2, RCU > + * read lock or transaction mutex held. > + * > + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. > + */ > +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, > + const u8 *data, u8 genmask, > + u64 tstamp) > +{ > + struct nft_pipapo_elem *e; > + > + local_bh_disable(); > + > +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) > + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) && I don't have any straightforward idea on how to avoid introducing AVX2 stuff (even if compiled out) in the generic function, which we had managed to avoid so far. I don't think it's a big deal, though. > + irq_fpu_usable()) { > + kernel_fpu_begin_mask(0); > + e = pipapo_get_avx2(m, data, genmask, tstamp); > + kernel_fpu_end(); > + local_bh_enable(); > + return e; > + } > +#endif > + e = pipapo_get_slow(m, data, genmask, tstamp); > + local_bh_enable(); > + return e; > +} > + > /** > * nft_pipapo_lookup() - Dataplane fronted for main lookup function > * @net: Network namespace > @@ -523,7 +560,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set, > const struct nft_pipapo_elem *e; > > m = rcu_dereference(priv->match); > - e = pipapo_get(m, (const u8 *)key, genmask, get_jiffies_64()); > + e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64()); > > return e ? &e->ext : NULL; > } > diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c > index d512877cc211..7a9900f8ce2f 100644 > --- a/net/netfilter/nft_set_pipapo_avx2.c > +++ b/net/netfilter/nft_set_pipapo_avx2.c > @@ -1149,9 +1149,9 @@ static inline void pipapo_resmap_init_avx2(const struct nft_pipapo_match *m, uns > * > * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise. > */ > -static struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, > - const u8 *data, u8 genmask, > - u64 tstamp) > +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, > + const u8 *data, u8 genmask, > + u64 tstamp) > { > struct nft_pipapo_scratch *scratch; > const struct nft_pipapo_field *f; > diff --git a/net/netfilter/nft_set_pipapo_avx2.h b/net/netfilter/nft_set_pipapo_avx2.h > index dbb6aaca8a7a..c2999b63da3f 100644 > --- a/net/netfilter/nft_set_pipapo_avx2.h > +++ b/net/netfilter/nft_set_pipapo_avx2.h > @@ -5,8 +5,12 @@ > #include <asm/fpu/xstate.h> > #define NFT_PIPAPO_ALIGN (XSAVE_YMM_SIZE / BITS_PER_BYTE) > > +struct nft_pipapo_match; > bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features, > struct nft_set_estimate *est); > +struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m, > + const u8 *data, u8 genmask, > + u64 tstamp); > #endif /* defined(CONFIG_X86_64) && !defined(CONFIG_UML) */ > > #endif /* _NFT_SET_PIPAPO_AVX2_H */ -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-18 16:32 ` Stefano Brivio @ 2025-08-18 18:25 ` Florian Westphal 2025-08-18 18:56 ` Stefano Brivio 0 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2025-08-18 18:25 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel Stefano Brivio <sbrivio@redhat.com> wrote: > > +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, > > + const u8 *data, u8 genmask, > > + u64 tstamp) > > +{ > > + struct nft_pipapo_elem *e; > > + > > + local_bh_disable(); > > + > > +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) > > + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) && > > I don't have any straightforward idea on how to avoid introducing AVX2 > stuff (even if compiled out) in the generic function, which we had > managed to avoid so far. I don't think it's a big deal, though. It could be hidden away in a static inline helper if that makes it more acceptable. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-18 18:25 ` Florian Westphal @ 2025-08-18 18:56 ` Stefano Brivio 2025-08-18 21:41 ` Florian Westphal 0 siblings, 1 reply; 10+ messages in thread From: Stefano Brivio @ 2025-08-18 18:56 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Mon, 18 Aug 2025 20:25:02 +0200 Florian Westphal <fw@strlen.de> wrote: > Stefano Brivio <sbrivio@redhat.com> wrote: > > > +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m, > > > + const u8 *data, u8 genmask, > > > + u64 tstamp) > > > +{ > > > + struct nft_pipapo_elem *e; > > > + > > > + local_bh_disable(); > > > + > > > +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) > > > + if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) && > > > > I don't have any straightforward idea on how to avoid introducing AVX2 > > stuff (even if compiled out) in the generic function, which we had > > managed to avoid so far. I don't think it's a big deal, though. > > It could be hidden away in a static inline helper if that makes it more > acceptable. I think that would just make it... hidden. I'd rather leave it like it is. Reviewed-by: Stefano Brivio <sbrivio@redhat.com> -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too 2025-08-18 18:56 ` Stefano Brivio @ 2025-08-18 21:41 ` Florian Westphal 0 siblings, 0 replies; 10+ messages in thread From: Florian Westphal @ 2025-08-18 21:41 UTC (permalink / raw) To: Stefano Brivio; +Cc: netfilter-devel Stefano Brivio <sbrivio@redhat.com> wrote: > I think that would just make it... hidden. I'd rather leave it like it > is. > > Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Thanks. For the record, I did fix up the various typos in the comments meanwhile, I'll retain your reviewed-by tag as i did not make code changes. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-08-18 21:41 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-15 14:36 [PATCH nf-next 0/2] netfilter: nft_set_pipapo: speed up insertions Florian Westphal 2025-08-15 14:36 ` [PATCH nf-next 1/2] netfilter: nft_set_pipapo_avx2: split lookup function in two parts Florian Westphal 2025-08-18 16:29 ` Stefano Brivio 2025-08-18 18:23 ` Florian Westphal 2025-08-18 18:56 ` Stefano Brivio 2025-08-15 14:36 ` [PATCH nf-next 2/2] netfilter: nft_set_pipapo: use avx2 algorithm for insertions too Florian Westphal 2025-08-18 16:32 ` Stefano Brivio 2025-08-18 18:25 ` Florian Westphal 2025-08-18 18:56 ` Stefano Brivio 2025-08-18 21:41 ` Florian Westphal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).