* [PATCH nf 0/3] netfilter: nft_set_pipapo: map_index must be per set
@ 2024-02-06 12:23 Florian Westphal
2024-02-06 12:23 ` [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps Florian Westphal
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Florian Westphal @ 2024-02-06 12:23 UTC (permalink / raw)
To: netfilter-devel; +Cc: sbrivio, Florian Westphal
While working on speeding up pipapo rule insertions I found that the
map_index needs to be percpu and per-set, not just percpu.
At this time its possible for a pipapo set to fill the all-zero part
with ones and take the 'might have bits set' as 'start-from-zero' area.
First patch changes scratchpad area to a structure that provides
space for a per-set-and-cpu toggle and uses it of the percpu one.
Second patch prepares for patch 3, adds a new free helper.
Third patch removes the scratch_aligned pointer and makes AVX2
implementation use the exact same memory addresses for read/store of
the matching state.
Florian Westphal (3):
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: remove scratch_aligned pointer
net/netfilter/nft_set_pipapo.c | 96 +++++++++++++----------------
net/netfilter/nft_set_pipapo.h | 18 ++++--
net/netfilter/nft_set_pipapo_avx2.c | 17 +++--
3 files changed, 63 insertions(+), 68 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 10+ messages in thread* [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps 2024-02-06 12:23 [PATCH nf 0/3] netfilter: nft_set_pipapo: map_index must be per set Florian Westphal @ 2024-02-06 12:23 ` Florian Westphal 2024-02-07 14:15 ` Stefano Brivio 2024-02-06 12:23 ` [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Florian Westphal 2024-02-06 12:23 ` [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer Florian Westphal 2 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2024-02-06 12:23 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal Pipapo needs a scratchpad area to keep state during matching. This state can be large and thus cannot reside on stack. Each set preallocates percpu areas for this. On each match stage, one scratchpad half starts with all-zero and the other is inited to all-ones. At the end of each stage, the half that starts with all-ones is always zero. Before next field is tested, pointers to the two halves are swapped, i.e. resmap pointer turns into fill pointer and vice versa. After the last field has been processed, pipapo stashes the index toggle in a percpu variable, with assumption that next packet will start with the all-zero half and sets all bits in the other to 1. This isn't reliable. There can be multiple sets and we can't be sure that the upper and lower half of all set scratch map is always in sync (lookups can be conditional), so one set might have swapped, but other might not have been queried. Thus we need to keep the index per-set-and-cpu, just like the scratchpad. Note that this bug fix is incomplete, there is a related issue. avx2 and normal implementation might use slightly different areas of the map array space due to the avx2 alignment requirements, so m->scratch (generic/fallback implementation) and ->scratch_aligned (avx) may partially overlap. scratch and scratch_aligned are not distinct objects, the latter is just the aligned address of the former. After this change, write to scratch_align->map_index may write to scratch->map, so this issue becomes more prominent, we can set to 1 a bit in the supposedly-all-zero area of scratch->map[]. A followup patch will remove the scratch_aligned and makes generic and avx code use the same (aligned) area. Its done in a separate change to ease review. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Florian Westphal <fw@strlen.de> --- net/netfilter/nft_set_pipapo.c | 34 +++++++++++++++-------------- net/netfilter/nft_set_pipapo.h | 14 ++++++++++-- net/netfilter/nft_set_pipapo_avx2.c | 15 ++++++------- 3 files changed, 37 insertions(+), 26 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index efd523496be4..a8aa915f3f0b 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -342,9 +342,6 @@ #include "nft_set_pipapo_avx2.h" #include "nft_set_pipapo.h" -/* Current working bitmap index, toggled between field matches */ -static DEFINE_PER_CPU(bool, nft_pipapo_scratch_index); - /** * pipapo_refill() - For each set bit, set bits from selected mapping table item * @map: Bitmap to be scanned for set bits @@ -412,6 +409,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, const u32 *key, const struct nft_set_ext **ext) { struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_scratch *scratch; unsigned long *res_map, *fill_map; u8 genmask = nft_genmask_cur(net); const u8 *rp = (const u8 *)key; @@ -422,15 +420,17 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, local_bh_disable(); - map_index = raw_cpu_read(nft_pipapo_scratch_index); - m = rcu_dereference(priv->match); if (unlikely(!m || !*raw_cpu_ptr(m->scratch))) goto out; - res_map = *raw_cpu_ptr(m->scratch) + (map_index ? m->bsize_max : 0); - fill_map = *raw_cpu_ptr(m->scratch) + (map_index ? 0 : m->bsize_max); + scratch = *raw_cpu_ptr(m->scratch); + + map_index = scratch->map_index; + + res_map = scratch->map + (map_index ? m->bsize_max : 0); + fill_map = scratch->map + (map_index ? 0 : m->bsize_max); memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); @@ -460,7 +460,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, last); if (b < 0) { - raw_cpu_write(nft_pipapo_scratch_index, map_index); + scratch->map_index = map_index; local_bh_enable(); return false; @@ -477,7 +477,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, * current inactive bitmap is clean and can be reused as * *next* bitmap (not initial) for the next packet. */ - raw_cpu_write(nft_pipapo_scratch_index, map_index); + scratch->map_index = map_index; local_bh_enable(); return true; @@ -1121,12 +1121,12 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, int i; for_each_possible_cpu(i) { - unsigned long *scratch; + struct nft_pipapo_scratch *scratch; #ifdef NFT_PIPAPO_ALIGN - unsigned long *scratch_aligned; + void *scratch_aligned; #endif - - scratch = kzalloc_node(bsize_max * sizeof(*scratch) * 2 + + scratch = kzalloc_node(struct_size(scratch, map, + bsize_max * 2) + NFT_PIPAPO_ALIGN_HEADROOM, GFP_KERNEL, cpu_to_node(i)); if (!scratch) { @@ -1145,7 +1145,9 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, *per_cpu_ptr(clone->scratch, i) = scratch; #ifdef NFT_PIPAPO_ALIGN - scratch_aligned = NFT_PIPAPO_LT_ALIGN(scratch); + scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); + /* Need to align ->map start, not start of structure itself */ + scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); *per_cpu_ptr(clone->scratch_aligned, i) = scratch_aligned; #endif } @@ -2132,7 +2134,7 @@ static int nft_pipapo_init(const struct nft_set *set, m->field_count = field_count; m->bsize_max = 0; - m->scratch = alloc_percpu(unsigned long *); + m->scratch = alloc_percpu(struct nft_pipapo_scratch *); if (!m->scratch) { err = -ENOMEM; goto out_scratch; @@ -2141,7 +2143,7 @@ static int nft_pipapo_init(const struct nft_set *set, *per_cpu_ptr(m->scratch, i) = NULL; #ifdef NFT_PIPAPO_ALIGN - m->scratch_aligned = alloc_percpu(unsigned long *); + m->scratch_aligned = alloc_percpu(struct nft_pipapo_scratch *); if (!m->scratch_aligned) { err = -ENOMEM; goto out_free; diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h index 1040223da5fa..144b186c4caf 100644 --- a/net/netfilter/nft_set_pipapo.h +++ b/net/netfilter/nft_set_pipapo.h @@ -130,6 +130,16 @@ struct nft_pipapo_field { union nft_pipapo_map_bucket *mt; }; +/** + * struct nft_pipapo_scratch - percpu data used for lookup and matching + * @map_index Current working bitmap index, toggled between field matches + * @map store partial matching results during lookup + */ +struct nft_pipapo_scratch { + u8 map_index; + unsigned long map[]; +}; + /** * struct nft_pipapo_match - Data used for lookup and matching * @field_count Amount of fields in set @@ -142,9 +152,9 @@ struct nft_pipapo_field { struct nft_pipapo_match { int field_count; #ifdef NFT_PIPAPO_ALIGN - unsigned long * __percpu *scratch_aligned; + struct nft_pipapo_scratch * __percpu *scratch_aligned; #endif - unsigned long * __percpu *scratch; + struct nft_pipapo_scratch * __percpu *scratch; size_t bsize_max; struct rcu_head rcu; struct nft_pipapo_field f[] __counted_by(field_count); diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c index 52e0d026d30a..8cad7b2e759d 100644 --- a/net/netfilter/nft_set_pipapo_avx2.c +++ b/net/netfilter/nft_set_pipapo_avx2.c @@ -71,9 +71,6 @@ #define NFT_PIPAPO_AVX2_ZERO(reg) \ asm volatile("vpxor %ymm" #reg ", %ymm" #reg ", %ymm" #reg) -/* Current working bitmap index, toggled between field matches */ -static DEFINE_PER_CPU(bool, nft_pipapo_avx2_scratch_index); - /** * nft_pipapo_avx2_prepare() - Prepare before main algorithm body * @@ -1120,11 +1117,12 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, const u32 *key, const struct nft_set_ext **ext) { struct nft_pipapo *priv = nft_set_priv(set); - unsigned long *res, *fill, *scratch; + struct nft_pipapo_scratch *scratch; u8 genmask = nft_genmask_cur(net); const u8 *rp = (const u8 *)key; struct nft_pipapo_match *m; struct nft_pipapo_field *f; + unsigned long *res, *fill; bool map_index; int i, ret = 0; @@ -1146,10 +1144,11 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, kernel_fpu_end(); return false; } - map_index = raw_cpu_read(nft_pipapo_avx2_scratch_index); - res = scratch + (map_index ? m->bsize_max : 0); - fill = scratch + (map_index ? 0 : m->bsize_max); + map_index = scratch->map_index; + + res = scratch->map + (map_index ? m->bsize_max : 0); + fill = scratch->map + (map_index ? 0 : m->bsize_max); /* Starting map doesn't need to be set for this implementation */ @@ -1221,7 +1220,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, out: if (i % 2) - raw_cpu_write(nft_pipapo_avx2_scratch_index, !map_index); + scratch->map_index = !map_index; kernel_fpu_end(); return ret >= 0; -- 2.43.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps 2024-02-06 12:23 ` [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps Florian Westphal @ 2024-02-07 14:15 ` Stefano Brivio 2024-02-07 15:23 ` Florian Westphal 0 siblings, 1 reply; 10+ messages in thread From: Stefano Brivio @ 2024-02-07 14:15 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Tue, 6 Feb 2024 13:23:06 +0100 Florian Westphal <fw@strlen.de> wrote: > Pipapo needs a scratchpad area to keep state during matching. > This state can be large and thus cannot reside on stack. > > Each set preallocates percpu areas for this. > > On each match stage, one scratchpad half starts with all-zero and the other > is inited to all-ones. > > At the end of each stage, the half that starts with all-ones is > always zero. Before next field is tested, pointers to the two halves > are swapped, i.e. resmap pointer turns into fill pointer and vice versa. > > After the last field has been processed, pipapo stashes the > index toggle in a percpu variable, with assumption that next packet > will start with the all-zero half and sets all bits in the other to 1. > > This isn't reliable. Uh oh. In hindsight, this sounds so obvious now... > There can be multiple sets and we can't be sure that the upper > and lower half of all set scratch map is always in sync (lookups > can be conditional), so one set might have swapped, but other might > not have been queried. > > Thus we need to keep the index per-set-and-cpu, just like the > scratchpad. > > Note that this bug fix is incomplete, there is a related issue. > > avx2 and normal implementation might use slightly different areas of the > map array space due to the avx2 alignment requirements, so > m->scratch (generic/fallback implementation) and ->scratch_aligned > (avx) may partially overlap. scratch and scratch_aligned are not distinct > objects, the latter is just the aligned address of the former. > > After this change, write to scratch_align->map_index may write to > scratch->map, so this issue becomes more prominent, we can set to 1 > a bit in the supposedly-all-zero area of scratch->map[]. > > A followup patch will remove the scratch_aligned and makes generic and > avx code use the same (aligned) area. > > Its done in a separate change to ease review. > > Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") > Signed-off-by: Florian Westphal <fw@strlen.de> Minus one nit (not worth respinning) and one half-doubt below, Reviewed-by: Stefano Brivio <sbrivio@redhat.com> ...I'm still reviewing the rest. > --- > net/netfilter/nft_set_pipapo.c | 34 +++++++++++++++-------------- > net/netfilter/nft_set_pipapo.h | 14 ++++++++++-- > net/netfilter/nft_set_pipapo_avx2.c | 15 ++++++------- > 3 files changed, 37 insertions(+), 26 deletions(-) > > diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c > index efd523496be4..a8aa915f3f0b 100644 > --- a/net/netfilter/nft_set_pipapo.c > +++ b/net/netfilter/nft_set_pipapo.c > @@ -342,9 +342,6 @@ > #include "nft_set_pipapo_avx2.h" > #include "nft_set_pipapo.h" > > -/* Current working bitmap index, toggled between field matches */ > -static DEFINE_PER_CPU(bool, nft_pipapo_scratch_index); > - > /** > * pipapo_refill() - For each set bit, set bits from selected mapping table item > * @map: Bitmap to be scanned for set bits > @@ -412,6 +409,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, > const u32 *key, const struct nft_set_ext **ext) > { > struct nft_pipapo *priv = nft_set_priv(set); > + struct nft_pipapo_scratch *scratch; > unsigned long *res_map, *fill_map; > u8 genmask = nft_genmask_cur(net); > const u8 *rp = (const u8 *)key; > @@ -422,15 +420,17 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, > > local_bh_disable(); > > - map_index = raw_cpu_read(nft_pipapo_scratch_index); > - > m = rcu_dereference(priv->match); > > if (unlikely(!m || !*raw_cpu_ptr(m->scratch))) > goto out; > > - res_map = *raw_cpu_ptr(m->scratch) + (map_index ? m->bsize_max : 0); > - fill_map = *raw_cpu_ptr(m->scratch) + (map_index ? 0 : m->bsize_max); > + scratch = *raw_cpu_ptr(m->scratch); > + > + map_index = scratch->map_index; > + > + res_map = scratch->map + (map_index ? m->bsize_max : 0); > + fill_map = scratch->map + (map_index ? 0 : m->bsize_max); > > memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); > > @@ -460,7 +460,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, > b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, > last); > if (b < 0) { > - raw_cpu_write(nft_pipapo_scratch_index, map_index); > + scratch->map_index = map_index; > local_bh_enable(); > > return false; > @@ -477,7 +477,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, > * current inactive bitmap is clean and can be reused as > * *next* bitmap (not initial) for the next packet. > */ > - raw_cpu_write(nft_pipapo_scratch_index, map_index); > + scratch->map_index = map_index; > local_bh_enable(); > > return true; > @@ -1121,12 +1121,12 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, > int i; > > for_each_possible_cpu(i) { > - unsigned long *scratch; > + struct nft_pipapo_scratch *scratch; > #ifdef NFT_PIPAPO_ALIGN > - unsigned long *scratch_aligned; > + void *scratch_aligned; > #endif > - > - scratch = kzalloc_node(bsize_max * sizeof(*scratch) * 2 + > + scratch = kzalloc_node(struct_size(scratch, map, > + bsize_max * 2) + > NFT_PIPAPO_ALIGN_HEADROOM, > GFP_KERNEL, cpu_to_node(i)); > if (!scratch) { > @@ -1145,7 +1145,9 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, > *per_cpu_ptr(clone->scratch, i) = scratch; > > #ifdef NFT_PIPAPO_ALIGN > - scratch_aligned = NFT_PIPAPO_LT_ALIGN(scratch); > + scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); > + /* Need to align ->map start, not start of structure itself */ > + scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); This should be fine with the current version of NFT_PIPAPO_ALIGN_HEADROOM, but it took me quite some pondering, reasoning below if you feel like double checking. Let's say ARCH_KMALLOC_MINALIGN is 4, NFT_PIPAPO_LT_ALIGN is 32, we need 100 bytes for the scratch map (existing implementation), and the address, x, we get from kzalloc_node() is k + 28, with k aligned to 32 bytes. Then, we'll ask to allocate 32 - 4 = 28 extra bytes (NFT_PIPAPO_ALIGN_HEADROOM), that is, 128 bytes, and we'll start using the area at x + 4 (aligned to 32 bytes), with 124 bytes in front of us. With this change, and the current NFT_PIPAPO_ALIGN_HEADROOM, we'll allocate (usually) 4 bytes more, 132 bytes, and we'll start using the area at x + 4 anyway, with 128 bytes in front of us, and we could have asked to allocate less, which made me think for a moment that NFT_PIPAPO_ALIGN_HEADROOM needed to be adjusted too. However, 'scratch' at k + 28 is not the worst case: k + 32 is. At that point, we need anyway to ask for 28 extra bytes, because 'map_index' will force us to start from x + 32. > *per_cpu_ptr(clone->scratch_aligned, i) = scratch_aligned; > #endif > } > @@ -2132,7 +2134,7 @@ static int nft_pipapo_init(const struct nft_set *set, > m->field_count = field_count; > m->bsize_max = 0; > > - m->scratch = alloc_percpu(unsigned long *); > + m->scratch = alloc_percpu(struct nft_pipapo_scratch *); > if (!m->scratch) { > err = -ENOMEM; > goto out_scratch; > @@ -2141,7 +2143,7 @@ static int nft_pipapo_init(const struct nft_set *set, > *per_cpu_ptr(m->scratch, i) = NULL; > > #ifdef NFT_PIPAPO_ALIGN > - m->scratch_aligned = alloc_percpu(unsigned long *); > + m->scratch_aligned = alloc_percpu(struct nft_pipapo_scratch *); > if (!m->scratch_aligned) { > err = -ENOMEM; > goto out_free; > diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h > index 1040223da5fa..144b186c4caf 100644 > --- a/net/netfilter/nft_set_pipapo.h > +++ b/net/netfilter/nft_set_pipapo.h > @@ -130,6 +130,16 @@ struct nft_pipapo_field { > union nft_pipapo_map_bucket *mt; > }; > > +/** > + * struct nft_pipapo_scratch - percpu data used for lookup and matching > + * @map_index Current working bitmap index, toggled between field matches > + * @map store partial matching results during lookup > + */ > +struct nft_pipapo_scratch { > + u8 map_index; > + unsigned long map[]; > +}; > + > /** > * struct nft_pipapo_match - Data used for lookup and matching > * @field_count Amount of fields in set > @@ -142,9 +152,9 @@ struct nft_pipapo_field { > struct nft_pipapo_match { > int field_count; > #ifdef NFT_PIPAPO_ALIGN > - unsigned long * __percpu *scratch_aligned; > + struct nft_pipapo_scratch * __percpu *scratch_aligned; > #endif > - unsigned long * __percpu *scratch; > + struct nft_pipapo_scratch * __percpu *scratch; > size_t bsize_max; > struct rcu_head rcu; > struct nft_pipapo_field f[] __counted_by(field_count); > diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c > index 52e0d026d30a..8cad7b2e759d 100644 > --- a/net/netfilter/nft_set_pipapo_avx2.c > +++ b/net/netfilter/nft_set_pipapo_avx2.c > @@ -71,9 +71,6 @@ > #define NFT_PIPAPO_AVX2_ZERO(reg) \ > asm volatile("vpxor %ymm" #reg ", %ymm" #reg ", %ymm" #reg) > > -/* Current working bitmap index, toggled between field matches */ > -static DEFINE_PER_CPU(bool, nft_pipapo_avx2_scratch_index); > - > /** > * nft_pipapo_avx2_prepare() - Prepare before main algorithm body > * > @@ -1120,11 +1117,12 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > const u32 *key, const struct nft_set_ext **ext) > { > struct nft_pipapo *priv = nft_set_priv(set); > - unsigned long *res, *fill, *scratch; > + struct nft_pipapo_scratch *scratch; > u8 genmask = nft_genmask_cur(net); > const u8 *rp = (const u8 *)key; > struct nft_pipapo_match *m; > struct nft_pipapo_field *f; > + unsigned long *res, *fill; > bool map_index; > int i, ret = 0; > > @@ -1146,10 +1144,11 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > kernel_fpu_end(); > return false; > } > - map_index = raw_cpu_read(nft_pipapo_avx2_scratch_index); > > - res = scratch + (map_index ? m->bsize_max : 0); > - fill = scratch + (map_index ? 0 : m->bsize_max); > + map_index = scratch->map_index; > + > + res = scratch->map + (map_index ? m->bsize_max : 0); > + fill = scratch->map + (map_index ? 0 : m->bsize_max); Nit (if you respin for any reason): the existing version had one extra space to highlight the symmetry between 'res' and 'fill' in the right operand. You kept that in nft_pipapo_lookup(), but dropped it here. > > /* Starting map doesn't need to be set for this implementation */ > > @@ -1221,7 +1220,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, > > out: > if (i % 2) > - raw_cpu_write(nft_pipapo_avx2_scratch_index, !map_index); > + scratch->map_index = !map_index; > kernel_fpu_end(); > > return ret >= 0; -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps 2024-02-07 14:15 ` Stefano Brivio @ 2024-02-07 15:23 ` Florian Westphal 2024-02-07 17:27 ` Stefano Brivio 0 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2024-02-07 15:23 UTC (permalink / raw) To: Stefano Brivio; +Cc: Florian Westphal, netfilter-devel Stefano Brivio <sbrivio@redhat.com> wrote: > > This isn't reliable. > > Uh oh. In hindsight, this sounds so obvious now... Thats a recurring theme with a lot of bugs. So, no, it was not obvious. > > There can be multiple sets and we can't be sure that the upper > > and lower half of all set scratch map is always in sync (lookups > > can be conditional), so one set might have swapped, but other might > > not have been queried. > > > > Thus we need to keep the index per-set-and-cpu, just like the > > scratchpad. > > > > Note that this bug fix is incomplete, there is a related issue. > > > > avx2 and normal implementation might use slightly different areas of the > > map array space due to the avx2 alignment requirements, so > > m->scratch (generic/fallback implementation) and ->scratch_aligned > > (avx) may partially overlap. scratch and scratch_aligned are not distinct > > objects, the latter is just the aligned address of the former. > > > > After this change, write to scratch_align->map_index may write to > > scratch->map, so this issue becomes more prominent, we can set to 1 > > a bit in the supposedly-all-zero area of scratch->map[]. > > > > A followup patch will remove the scratch_aligned and makes generic and > > avx code use the same (aligned) area. > > > > Its done in a separate change to ease review. > > > > Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") > > Signed-off-by: Florian Westphal <fw@strlen.de> > > Minus one nit (not worth respinning) and one half-doubt below, > > Reviewed-by: Stefano Brivio <sbrivio@redhat.com> > > ...I'm still reviewing the rest. Thanks for reviewing. > > #ifdef NFT_PIPAPO_ALIGN > > - scratch_aligned = NFT_PIPAPO_LT_ALIGN(scratch); > > + scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); > > + /* Need to align ->map start, not start of structure itself */ > > + scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); > > This should be fine with the current version of > NFT_PIPAPO_ALIGN_HEADROOM, but it took me quite some pondering, reasoning > below if you feel like double checking. Good point. > Let's say ARCH_KMALLOC_MINALIGN is 4, NFT_PIPAPO_LT_ALIGN is 32, we > need 100 bytes for the scratch map (existing implementation), and the > address, x, we get from kzalloc_node() is k + 28, with k aligned to 32 > bytes. > Then, we'll ask to allocate 32 - 4 = 28 extra bytes > (NFT_PIPAPO_ALIGN_HEADROOM), that is, 128 bytes, and we'll start using > the area at x + 4 (aligned to 32 bytes), with 124 bytes in front of us. > > With this change, and the current NFT_PIPAPO_ALIGN_HEADROOM, we'll > allocate (usually) 4 bytes more, 132 bytes, and we'll start using the > area at x + 4 anyway, with 128 bytes in front of us, and we could have > asked to allocate less, which made me think for a moment that > NFT_PIPAPO_ALIGN_HEADROOM needed to be adjusted too. We'll allocate sizeof(long) more space (map_index), which is 4 bytes in your example. > However, 'scratch' at k + 28 is not the worst case: k + 32 is. At that > point, we need anyway to ask for 28 extra bytes, because 'map_index' > will force us to start from x + 32. Wait. k + 32 is really "k" for old code: slab gave us an aligned address. In new code, k+4 is the perfect "already-aligned" address where we would 'no-op' the address on a ARCH_KMALLOC_MINALIGN == 4 system. Lets assume we get address k, and that address is the worst possible aligned one (with minalign 4), so we align (k + 4) (&scratch->map[0]), then subtract the index/struct head, which means we store (align(k+4) - 4), which would be 28. Worst case aligned value allocator can provide for new code is k or k + 32 (make no difference): Lets say address we got from allocator is 0x4: NFT_PIPAPO_LT_ALIGN(&scratch->map); -> aligned to 32, we store 28 as start of struct, so ->map[0] is at address 32. Lets say address we got from allocator is 0x20 (32): NFT_PIPAPO_LT_ALIGN(&scratch->map); -> aligned to 64, we store 60 as start of struct, so ->map[0] at 64. In both cases ALIGN() ate 28 bytes of the allocation, which we accounted for as NFT_PIPAPO_ALIGN_HEADROOM. Maybe thats what you were saying. I could try to add/expand the comments here for the alignment calculations. > > - res = scratch + (map_index ? m->bsize_max : 0); > > - fill = scratch + (map_index ? 0 : m->bsize_max); > > + map_index = scratch->map_index; > > + > > + res = scratch->map + (map_index ? m->bsize_max : 0); > > + fill = scratch->map + (map_index ? 0 : m->bsize_max); > > Nit (if you respin for any reason): the existing version had one extra > space to highlight the symmetry between 'res' and 'fill' in the right > operand. You kept that in nft_pipapo_lookup(), but dropped it here. Oh, indeed, i'll fix this up. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps 2024-02-07 15:23 ` Florian Westphal @ 2024-02-07 17:27 ` Stefano Brivio 2024-02-07 19:24 ` Florian Westphal 0 siblings, 1 reply; 10+ messages in thread From: Stefano Brivio @ 2024-02-07 17:27 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Wed, 7 Feb 2024 16:23:28 +0100 Florian Westphal <fw@strlen.de> wrote: > Stefano Brivio <sbrivio@redhat.com> wrote: > > > This isn't reliable. > > > > Uh oh. In hindsight, this sounds so obvious now... > > Thats a recurring theme with a lot of bugs. > So, no, it was not obvious. > > > > There can be multiple sets and we can't be sure that the upper > > > and lower half of all set scratch map is always in sync (lookups > > > can be conditional), so one set might have swapped, but other might > > > not have been queried. > > > > > > Thus we need to keep the index per-set-and-cpu, just like the > > > scratchpad. > > > > > > Note that this bug fix is incomplete, there is a related issue. > > > > > > avx2 and normal implementation might use slightly different areas of the > > > map array space due to the avx2 alignment requirements, so > > > m->scratch (generic/fallback implementation) and ->scratch_aligned > > > (avx) may partially overlap. scratch and scratch_aligned are not distinct > > > objects, the latter is just the aligned address of the former. > > > > > > After this change, write to scratch_align->map_index may write to > > > scratch->map, so this issue becomes more prominent, we can set to 1 > > > a bit in the supposedly-all-zero area of scratch->map[]. > > > > > > A followup patch will remove the scratch_aligned and makes generic and > > > avx code use the same (aligned) area. > > > > > > Its done in a separate change to ease review. > > > > > > Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") > > > Signed-off-by: Florian Westphal <fw@strlen.de> > > > > Minus one nit (not worth respinning) and one half-doubt below, > > > > Reviewed-by: Stefano Brivio <sbrivio@redhat.com> > > > > ...I'm still reviewing the rest. > > Thanks for reviewing. > > > > #ifdef NFT_PIPAPO_ALIGN > > > - scratch_aligned = NFT_PIPAPO_LT_ALIGN(scratch); > > > + scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); > > > + /* Need to align ->map start, not start of structure itself */ > > > + scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); > > > > This should be fine with the current version of > > NFT_PIPAPO_ALIGN_HEADROOM, but it took me quite some pondering, reasoning > > below if you feel like double checking. > > Good point. > > > Let's say ARCH_KMALLOC_MINALIGN is 4, NFT_PIPAPO_LT_ALIGN is 32, we > > need 100 bytes for the scratch map (existing implementation), and the > > address, x, we get from kzalloc_node() is k + 28, with k aligned to 32 > > bytes. > > Then, we'll ask to allocate 32 - 4 = 28 extra bytes > > (NFT_PIPAPO_ALIGN_HEADROOM), that is, 128 bytes, and we'll start using > > the area at x + 4 (aligned to 32 bytes), with 124 bytes in front of us. > > > > With this change, and the current NFT_PIPAPO_ALIGN_HEADROOM, we'll > > allocate (usually) 4 bytes more, 132 bytes, and we'll start using the > > area at x + 4 anyway, with 128 bytes in front of us, and we could have > > asked to allocate less, which made me think for a moment that > > NFT_PIPAPO_ALIGN_HEADROOM needed to be adjusted too. > > We'll allocate sizeof(long) more space (map_index), which is 4 bytes in > your example. Oh, right, it could be 8 bytes too. Let's stick to 4 for simplicity. > > However, 'scratch' at k + 28 is not the worst case: k + 32 is. At that > > point, we need anyway to ask for 28 extra bytes, because 'map_index' > > will force us to start from x + 32. > > Wait. k + 32 is really "k" for old code: slab gave us an aligned > address. Yes, I meant that for the *new* code: k + 32 mod 32 (that is, k) is the worst case for the new code. > In new code, k+4 is the perfect "already-aligned" address where we would > 'no-op' the address on a ARCH_KMALLOC_MINALIGN == 4 system. Isn't the already aligned-address k - 4, that is, k + 28? With k + 4, we would have &scratch->map[0] at k + 8. But anyway: > Lets assume we get address k, and that address is the worst > possible aligned one (with minalign 4), so we align > (k + 4) (&scratch->map[0]), then subtract the index/struct head, > which means we store (align(k+4) - 4), which would be 28. > > Worst case aligned value allocator can provide for new code > is k or k + 32 (make no difference): > > Lets say address we got from allocator is 0x4: > > NFT_PIPAPO_LT_ALIGN(&scratch->map); -> aligned to 32, we store 28 > as start of struct, so ->map[0] is at address 32. > > Lets say address we got from allocator is 0x20 (32): > NFT_PIPAPO_LT_ALIGN(&scratch->map); -> aligned to 64, we store 60 > as start of struct, so ->map[0] at 64. > > In both cases ALIGN() ate 28 bytes of the allocation, which we accounted > for as NFT_PIPAPO_ALIGN_HEADROOM. > > Maybe thats what you were saying. I could try to add/expand the > comments here for the alignment calculations. ...yes, the rest is exactly what I meant. I'm not really satisfied of the paragraph below but maybe something on the lines of: /* Align &scratch->map (not the struct itself): the extra * %NFT_PIPAPO_ALIGN_HEADROOM bytes passed to kzalloc_node() above * guarantee we can waste up to those bytes in order to align the map * field regardless of its offset within the struct. */ -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps 2024-02-07 17:27 ` Stefano Brivio @ 2024-02-07 19:24 ` Florian Westphal 0 siblings, 0 replies; 10+ messages in thread From: Florian Westphal @ 2024-02-07 19:24 UTC (permalink / raw) To: Stefano Brivio; +Cc: Florian Westphal, netfilter-devel Stefano Brivio <sbrivio@redhat.com> wrote: > > In new code, k+4 is the perfect "already-aligned" address where we would > > 'no-op' the address on a ARCH_KMALLOC_MINALIGN == 4 system. > > Isn't the already aligned-address k - 4, that is, k + 28? With k + 4, > we would have &scratch->map[0] at k + 8. But anyway: Yes, k-4 (k+28). [..] > > Maybe thats what you were saying. I could try to add/expand the > > comments here for the alignment calculations. > > ...yes, the rest is exactly what I meant. I'm not really satisfied of > the paragraph below but maybe something on the lines of: > > /* Align &scratch->map (not the struct itself): the extra > * %NFT_PIPAPO_ALIGN_HEADROOM bytes passed to kzalloc_node() above > * guarantee we can waste up to those bytes in order to align the map > * field regardless of its offset within the struct. > */ Thanks, thats good, I'll use that. I think I'll also add a BUILD_BUG_ON to assert map offset <= NFT_PIPAPO_ALIGN_HEADROOM, just in case. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area 2024-02-06 12:23 [PATCH nf 0/3] netfilter: nft_set_pipapo: map_index must be per set Florian Westphal 2024-02-06 12:23 ` [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps Florian Westphal @ 2024-02-06 12:23 ` Florian Westphal 2024-02-07 17:29 ` Stefano Brivio 2024-02-06 12:23 ` [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer Florian Westphal 2 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2024-02-06 12:23 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal After next patch simple kfree() is not enough anymore, so add a helper for it. Signed-off-by: Florian Westphal <fw@strlen.de> --- net/netfilter/nft_set_pipapo.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index a8aa915f3f0b..3d308e31b048 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1108,6 +1108,19 @@ static void pipapo_map(struct nft_pipapo_match *m, f->mt[map[i].to + j].e = e; } +static void pipapo_free_scratch(const struct nft_pipapo_match *m, unsigned int cpu) +{ + struct nft_pipapo_scratch *s; + void *mem; + + s = *per_cpu_ptr(m->scratch, cpu); + if (!s) + return; + + mem = s; + kfree(mem); +} + /** * pipapo_realloc_scratch() - Reallocate scratch maps for partial match results * @clone: Copy of matching data with pending insertions and deletions @@ -1140,7 +1153,7 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, return -ENOMEM; } - kfree(*per_cpu_ptr(clone->scratch, i)); + pipapo_free_scratch(clone, i); *per_cpu_ptr(clone->scratch, i) = scratch; @@ -1359,7 +1372,7 @@ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old) } out_scratch_realloc: for_each_possible_cpu(i) - kfree(*per_cpu_ptr(new->scratch, i)); + pipapo_free_scratch(new, i); #ifdef NFT_PIPAPO_ALIGN free_percpu(new->scratch_aligned); #endif @@ -1642,7 +1655,7 @@ static void pipapo_free_match(struct nft_pipapo_match *m) int i; for_each_possible_cpu(i) - kfree(*per_cpu_ptr(m->scratch, i)); + pipapo_free_scratch(m, i); #ifdef NFT_PIPAPO_ALIGN free_percpu(m->scratch_aligned); @@ -2242,7 +2255,7 @@ static void nft_pipapo_destroy(const struct nft_ctx *ctx, free_percpu(m->scratch_aligned); #endif for_each_possible_cpu(cpu) - kfree(*per_cpu_ptr(m->scratch, cpu)); + pipapo_free_scratch(m, cpu); free_percpu(m->scratch); pipapo_free_fields(m); kfree(m); @@ -2259,7 +2272,7 @@ static void nft_pipapo_destroy(const struct nft_ctx *ctx, free_percpu(priv->clone->scratch_aligned); #endif for_each_possible_cpu(cpu) - kfree(*per_cpu_ptr(priv->clone->scratch, cpu)); + pipapo_free_scratch(priv->clone, cpu); free_percpu(priv->clone->scratch); pipapo_free_fields(priv->clone); -- 2.43.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area 2024-02-06 12:23 ` [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Florian Westphal @ 2024-02-07 17:29 ` Stefano Brivio 0 siblings, 0 replies; 10+ messages in thread From: Stefano Brivio @ 2024-02-07 17:29 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Tue, 6 Feb 2024 13:23:07 +0100 Florian Westphal <fw@strlen.de> wrote: > After next patch simple kfree() is not enough anymore, so add > a helper for it. > > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > net/netfilter/nft_set_pipapo.c | 23 ++++++++++++++++++----- > 1 file changed, 18 insertions(+), 5 deletions(-) > > diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c > index a8aa915f3f0b..3d308e31b048 100644 > --- a/net/netfilter/nft_set_pipapo.c > +++ b/net/netfilter/nft_set_pipapo.c > @@ -1108,6 +1108,19 @@ static void pipapo_map(struct nft_pipapo_match *m, > f->mt[map[i].to + j].e = e; > } > > +static void pipapo_free_scratch(const struct nft_pipapo_match *m, unsigned int cpu) Almost everything else here has kerneldoc-style comments, perhaps (already accounting for 3/3): /** * pipapo_free_scratch() - Free per-CPU map at original (not aligned) address * @m: Matching data * @cpu: CPU number */ Other than this, Reviewed-by: Stefano Brivio <sbrivio@redhat.com> -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer 2024-02-06 12:23 [PATCH nf 0/3] netfilter: nft_set_pipapo: map_index must be per set Florian Westphal 2024-02-06 12:23 ` [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps Florian Westphal 2024-02-06 12:23 ` [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Florian Westphal @ 2024-02-06 12:23 ` Florian Westphal 2024-02-07 17:29 ` Stefano Brivio 2 siblings, 1 reply; 10+ messages in thread From: Florian Westphal @ 2024-02-06 12:23 UTC (permalink / raw) To: netfilter-devel; +Cc: sbrivio, Florian Westphal use ->scratch for both avx2 and the generic implementation. After previous change the scratch->map member is always aligned properly for AVX2, so we can just use scratch->map in AVX2 too. The alignoff delta is stored in the scratchpad so we can reconstruct the correct address to free the area again. Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") Signed-off-by: Florian Westphal <fw@strlen.de> --- net/netfilter/nft_set_pipapo.c | 41 +++++------------------------ net/netfilter/nft_set_pipapo.h | 6 ++--- net/netfilter/nft_set_pipapo_avx2.c | 2 +- 3 files changed, 10 insertions(+), 39 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 3d308e31b048..7ec284fd5093 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1118,6 +1118,7 @@ static void pipapo_free_scratch(const struct nft_pipapo_match *m, unsigned int c return; mem = s; + mem -= s->align_off; kfree(mem); } @@ -1137,6 +1138,7 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, struct nft_pipapo_scratch *scratch; #ifdef NFT_PIPAPO_ALIGN void *scratch_aligned; + u32 align_off; #endif scratch = kzalloc_node(struct_size(scratch, map, bsize_max * 2) + @@ -1155,14 +1157,16 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, pipapo_free_scratch(clone, i); - *per_cpu_ptr(clone->scratch, i) = scratch; - #ifdef NFT_PIPAPO_ALIGN scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map); /* Need to align ->map start, not start of structure itself */ scratch_aligned -= offsetof(struct nft_pipapo_scratch, map); - *per_cpu_ptr(clone->scratch_aligned, i) = scratch_aligned; + align_off = scratch_aligned - (void *)scratch; + + scratch = scratch_aligned; + scratch->align_off = align_off; #endif + *per_cpu_ptr(clone->scratch, i) = scratch; } return 0; @@ -1316,11 +1320,6 @@ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old) if (!new->scratch) goto out_scratch; -#ifdef NFT_PIPAPO_ALIGN - new->scratch_aligned = alloc_percpu(*new->scratch_aligned); - if (!new->scratch_aligned) - goto out_scratch; -#endif for_each_possible_cpu(i) *per_cpu_ptr(new->scratch, i) = NULL; @@ -1373,9 +1372,6 @@ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old) out_scratch_realloc: for_each_possible_cpu(i) pipapo_free_scratch(new, i); -#ifdef NFT_PIPAPO_ALIGN - free_percpu(new->scratch_aligned); -#endif out_scratch: free_percpu(new->scratch); kfree(new); @@ -1657,11 +1653,7 @@ static void pipapo_free_match(struct nft_pipapo_match *m) for_each_possible_cpu(i) pipapo_free_scratch(m, i); -#ifdef NFT_PIPAPO_ALIGN - free_percpu(m->scratch_aligned); -#endif free_percpu(m->scratch); - pipapo_free_fields(m); kfree(m); @@ -2155,16 +2147,6 @@ static int nft_pipapo_init(const struct nft_set *set, for_each_possible_cpu(i) *per_cpu_ptr(m->scratch, i) = NULL; -#ifdef NFT_PIPAPO_ALIGN - m->scratch_aligned = alloc_percpu(struct nft_pipapo_scratch *); - if (!m->scratch_aligned) { - err = -ENOMEM; - goto out_free; - } - for_each_possible_cpu(i) - *per_cpu_ptr(m->scratch_aligned, i) = NULL; -#endif - rcu_head_init(&m->rcu); nft_pipapo_for_each_field(f, i, m) { @@ -2195,9 +2177,6 @@ static int nft_pipapo_init(const struct nft_set *set, return 0; out_free: -#ifdef NFT_PIPAPO_ALIGN - free_percpu(m->scratch_aligned); -#endif free_percpu(m->scratch); out_scratch: kfree(m); @@ -2251,9 +2230,6 @@ static void nft_pipapo_destroy(const struct nft_ctx *ctx, nft_set_pipapo_match_destroy(ctx, set, m); -#ifdef NFT_PIPAPO_ALIGN - free_percpu(m->scratch_aligned); -#endif for_each_possible_cpu(cpu) pipapo_free_scratch(m, cpu); free_percpu(m->scratch); @@ -2268,9 +2244,6 @@ static void nft_pipapo_destroy(const struct nft_ctx *ctx, if (priv->dirty) nft_set_pipapo_match_destroy(ctx, set, m); -#ifdef NFT_PIPAPO_ALIGN - free_percpu(priv->clone->scratch_aligned); -#endif for_each_possible_cpu(cpu) pipapo_free_scratch(priv->clone, cpu); free_percpu(priv->clone->scratch); diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h index 144b186c4caf..e5f67c5cf30a 100644 --- a/net/netfilter/nft_set_pipapo.h +++ b/net/netfilter/nft_set_pipapo.h @@ -133,10 +133,12 @@ struct nft_pipapo_field { /** * struct nft_pipapo_scratch - percpu data used for lookup and matching * @map_index Current working bitmap index, toggled between field matches + * @align_off Offset to get the originally allocated address * @map store partial matching results during lookup */ struct nft_pipapo_scratch { u8 map_index; + u32 align_off; unsigned long map[]; }; @@ -144,16 +146,12 @@ struct nft_pipapo_scratch { * struct nft_pipapo_match - Data used for lookup and matching * @field_count Amount of fields in set * @scratch: Preallocated per-CPU maps for partial matching results - * @scratch_aligned: Version of @scratch aligned to NFT_PIPAPO_ALIGN bytes * @bsize_max: Maximum lookup table bucket size of all fields, in longs * @rcu Matching data is swapped on commits * @f: Fields, with lookup and mapping tables */ struct nft_pipapo_match { int field_count; -#ifdef NFT_PIPAPO_ALIGN - struct nft_pipapo_scratch * __percpu *scratch_aligned; -#endif struct nft_pipapo_scratch * __percpu *scratch; size_t bsize_max; struct rcu_head rcu; diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c index 8cad7b2e759d..107cffa8200c 100644 --- a/net/netfilter/nft_set_pipapo_avx2.c +++ b/net/netfilter/nft_set_pipapo_avx2.c @@ -1139,7 +1139,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set, */ kernel_fpu_begin_mask(0); - scratch = *raw_cpu_ptr(m->scratch_aligned); + scratch = *raw_cpu_ptr(m->scratch); if (unlikely(!scratch)) { kernel_fpu_end(); return false; -- 2.43.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer 2024-02-06 12:23 ` [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer Florian Westphal @ 2024-02-07 17:29 ` Stefano Brivio 0 siblings, 0 replies; 10+ messages in thread From: Stefano Brivio @ 2024-02-07 17:29 UTC (permalink / raw) To: Florian Westphal; +Cc: netfilter-devel On Tue, 6 Feb 2024 13:23:08 +0100 Florian Westphal <fw@strlen.de> wrote: > use ->scratch for both avx2 and the generic implementation. > > After previous change the scratch->map member is always aligned properly > for AVX2, so we can just use scratch->map in AVX2 too. > > The alignoff delta is stored in the scratchpad so we can reconstruct > the correct address to free the area again. That's a nice simplification, thanks. > Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") > Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> -- Stefano ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-02-07 19:24 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-06 12:23 [PATCH nf 0/3] netfilter: nft_set_pipapo: map_index must be per set Florian Westphal 2024-02-06 12:23 ` [PATCH nf 1/3] netfilter: nft_set_pipapo: store index in scratch maps Florian Westphal 2024-02-07 14:15 ` Stefano Brivio 2024-02-07 15:23 ` Florian Westphal 2024-02-07 17:27 ` Stefano Brivio 2024-02-07 19:24 ` Florian Westphal 2024-02-06 12:23 ` [PATCH nf 2/3] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Florian Westphal 2024-02-07 17:29 ` Stefano Brivio 2024-02-06 12:23 ` [PATCH nf 3/3] netfilter: nft_set_pipapo: remove scratch_aligned pointer Florian Westphal 2024-02-07 17:29 ` Stefano Brivio
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).