netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: Florian Westphal <fw@strlen.de>
Cc: <netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH nf-next 4/5] netfilter: nft_set_pipapo: merge pipapo_get/lookup
Date: Tue, 8 Jul 2025 08:09:17 +0200	[thread overview]
Message-ID: <20250708080917.41f4b693@elisabeth> (raw)
In-Reply-To: <20250701185245.31370-5-fw@strlen.de>

On Tue,  1 Jul 2025 20:52:41 +0200
Florian Westphal <fw@strlen.de> wrote:

> The matching algorithm has implemented thrice:
> 1. data path lookup, generic version
> 2. data path lookup, avx2 version
> 3. control plane lookup
> 
> Merge 1 and 3 by refactoring pipapo_get as a common helper, then make
> nft_pipapo_lookup and nft_pipapo_get both call the common helper.
> 
> Aside from the code savings this has the benefit that we no longer allocate
> temporary scratch maps for each control plane get and insertion operation.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Some nits below, but other than those:

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

> ---
>  net/netfilter/nft_set_pipapo.c | 189 ++++++++++-----------------------
>  1 file changed, 59 insertions(+), 130 deletions(-)
> 
> diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
> index 36a4de11995b..ebd142d8d4d4 100644
> --- a/net/netfilter/nft_set_pipapo.c
> +++ b/net/netfilter/nft_set_pipapo.c
> @@ -397,35 +397,37 @@ int pipapo_refill(unsigned long *map, unsigned int len, unsigned int rules,
>  }
>  
>  /**
> - * nft_pipapo_lookup() - Lookup function
> - * @net:	Network namespace
> - * @set:	nftables API set representation
> - * @key:	nftables API element representation containing key data
> - * @ext:	nftables API extension pointer, filled with matching reference
> + * pipapo_get() - Get matching element reference given key data
> + * @m:		storage containing the set elements
> + * @data:	Key data to be matched against existing elements
> + * @genmask:	If set, check that element is active in given genmask
> + * @tstamp:	timestamp to check for expired elements
>   *
>   * For more details, see DOC: Theory of Operation.
>   *
> - * Return: true on match, false otherwise.
> + * This is the main lookup function.  It matches key data either against
> + * the working match set or the uncomitted copy, depending on what the
> + * caller passed to us.

Here and below: uncommitted.

I think clearer: "It matches key data against either ... or ...".

> + * nft_pipapo_get (lookup from userspace/control plane) and nft_pipapo_lookup
> + * (datapath lookup) pass the active copy.
> + * The insertion path will pass the uncomitted working copy.
> + *
> + * Return: pointer to &struct nft_pipapo_elem on match, NULL otherwise.
>   */
> -const struct nft_set_ext *
> -nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
> -		  const u32 *key)
> +static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> +					  const u8 *data, u8 genmask,
> +					  u64 tstamp)
>  {
> -	struct nft_pipapo *priv = nft_set_priv(set);
>  	struct nft_pipapo_scratch *scratch;
>  	unsigned long *res_map, *fill_map;
> -	u8 genmask = nft_genmask_cur(net);
> -	const struct nft_pipapo_match *m;
>  	const struct nft_pipapo_field *f;
> -	const u8 *rp = (const u8 *)key;
>  	bool map_index;
>  	int i;
>  
>  	local_bh_disable();
>  
> -	m = rcu_dereference(priv->match);
> -
> -	if (unlikely(!m || !*raw_cpu_ptr(m->scratch)))
> +	/* XXX: fix this, prealloc and remove this check */
> +	if (unlikely(!raw_cpu_ptr(m->scratch)))

The check should be cheap, but sure, why not. I'm just asking if you
accidentally left the XXX: here in this version or if you meant it as a
TODO: for the future.

>  		goto out;
>  
>  	scratch = *raw_cpu_ptr(m->scratch);
> @@ -445,12 +447,12 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
>  		 * packet bytes value, then AND bucket value
>  		 */
>  		if (likely(f->bb == 8))
> -			pipapo_and_field_buckets_8bit(f, res_map, rp);
> +			pipapo_and_field_buckets_8bit(f, res_map, data);
>  		else
> -			pipapo_and_field_buckets_4bit(f, res_map, rp);
> +			pipapo_and_field_buckets_4bit(f, res_map, data);
>  		NFT_PIPAPO_GROUP_BITS_ARE_8_OR_4;
>  
> -		rp += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f);
> +		data += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f);
>  
>  		/* Now populate the bitmap for the next field, unless this is
>  		 * the last field, in which case return the matched 'ext'
> @@ -470,11 +472,11 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
>  		}
>  
>  		if (last) {
> -			const struct nft_set_ext *ext;
> +			struct nft_pipapo_elem *e;
>  
> -			ext = &f->mt[b].e->ext;
> -			if (unlikely(nft_set_elem_expired(ext) ||
> -				     !nft_set_elem_active(ext, genmask)))
> +			e = f->mt[b].e;
> +			if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
> +				     !nft_set_elem_active(&e->ext, genmask)))
>  				goto next_match;
>  
>  			/* Last field: we're just returning the key without
> @@ -484,8 +486,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
>  			 */
>  			scratch->map_index = map_index;
>  			local_bh_enable();
> -
> -			return ext;
> +			return e;
>  		}
>  
>  		/* Swap bitmap indices: res_map is the initial bitmap for the
> @@ -495,7 +496,7 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
>  		map_index = !map_index;
>  		swap(res_map, fill_map);
>  
> -		rp += NFT_PIPAPO_GROUPS_PADDING(f);
> +		data += NFT_PIPAPO_GROUPS_PADDING(f);
>  	}
>  
>  out:
> @@ -504,99 +505,29 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
>  }
>  
>  /**
> - * pipapo_get() - Get matching element reference given key data
> - * @m:		storage containing active/existing elements
> - * @data:	Key data to be matched against existing elements
> - * @genmask:	If set, check that element is active in given genmask
> - * @tstamp:	timestamp to check for expired elements
> - * @gfp:	the type of memory to allocate (see kmalloc).
> + * nft_pipapo_lookup() - Dataplane fronted for main lookup function
> + * @net:	Network namespace
> + * @set:	nftables API set representation
> + * @key:	pointer to nft registers containing key data
>   *
> - * This is essentially the same as the lookup function, except that it matches
> - * key data against the uncommitted copy and doesn't use preallocated maps for
> - * bitmap results.
> + * This function is called from the data path.  It will search for
> + * an element matching the given key in the current active copy.
>   *
> - * Return: pointer to &struct nft_pipapo_elem on match, error pointer otherwise.
> + * Return: ntables API extension pointer or NULL if no match.
>   */
> -static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
> -					  const u8 *data, u8 genmask,
> -					  u64 tstamp, gfp_t gfp)
> +const struct nft_set_ext *
> +nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
> +		  const u32 *key)
>  {
> -	struct nft_pipapo_elem *ret = ERR_PTR(-ENOENT);
> -	unsigned long *res_map, *fill_map = NULL;
> -	const struct nft_pipapo_field *f;
> -	int i;
> -
> -	if (m->bsize_max == 0)
> -		return ret;
> -
> -	res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), gfp);
> -	if (!res_map) {
> -		ret = ERR_PTR(-ENOMEM);
> -		goto out;
> -	}
> -
> -	fill_map = kcalloc(m->bsize_max, sizeof(*res_map), gfp);
> -	if (!fill_map) {
> -		ret = ERR_PTR(-ENOMEM);
> -		goto out;
> -	}
> -
> -	pipapo_resmap_init(m, res_map);
> -
> -	nft_pipapo_for_each_field(f, i, m) {
> -		bool last = i == m->field_count - 1;
> -		int b;
> -
> -		/* For each bit group: select lookup table bucket depending on
> -		 * packet bytes value, then AND bucket value
> -		 */
> -		if (f->bb == 8)
> -			pipapo_and_field_buckets_8bit(f, res_map, data);
> -		else if (f->bb == 4)
> -			pipapo_and_field_buckets_4bit(f, res_map, data);
> -		else
> -			BUG();
> -
> -		data += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f);
> -
> -		/* Now populate the bitmap for the next field, unless this is
> -		 * the last field, in which case return the matched 'ext'
> -		 * pointer if any.
> -		 *
> -		 * Now res_map contains the matching bitmap, and fill_map is the
> -		 * bitmap for the next field.
> -		 */
> -next_match:
> -		b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
> -				  last);
> -		if (b < 0)
> -			goto out;
> -
> -		if (last) {
> -			if (__nft_set_elem_expired(&f->mt[b].e->ext, tstamp))
> -				goto next_match;
> -			if ((genmask &&
> -			     !nft_set_elem_active(&f->mt[b].e->ext, genmask)))
> -				goto next_match;
> -
> -			ret = f->mt[b].e;
> -			goto out;
> -		}
> -
> -		data += NFT_PIPAPO_GROUPS_PADDING(f);
> +	struct nft_pipapo *priv = nft_set_priv(set);
> +	u8 genmask = nft_genmask_cur(net);
> +	const struct nft_pipapo_match *m;
> +	const struct nft_pipapo_elem *e;
>  
> -		/* Swap bitmap indices: fill_map will be the initial bitmap for
> -		 * the next field (i.e. the new res_map), and res_map is
> -		 * guaranteed to be all-zeroes at this point, ready to be filled
> -		 * according to the next mapping table.
> -		 */
> -		swap(res_map, fill_map);
> -	}
> +	m = rcu_dereference(priv->match);
> +	e = pipapo_get(m, (const u8 *)key, genmask, get_jiffies_64());
>  
> -out:
> -	kfree(fill_map);
> -	kfree(res_map);
> -	return ret;
> +	return e ? &e->ext : NULL;
>  }
>  
>  /**
> @@ -605,6 +536,11 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
>   * @set:	nftables API set representation
>   * @elem:	nftables API element representation containing key data
>   * @flags:	Unused
> + *
> + * This function is called from the control plane path under
> + * RCU read lock.
> + *
> + * Return: set element private pointer; ERR_PTR if no match.

Conceptually, we rather return -ENOENT, I'd mention that instead.

>   */
>  static struct nft_elem_priv *
>  nft_pipapo_get(const struct net *net, const struct nft_set *set,
> @@ -615,10 +551,9 @@ nft_pipapo_get(const struct net *net, const struct nft_set *set,
>  	struct nft_pipapo_elem *e;
>  
>  	e = pipapo_get(m, (const u8 *)elem->key.val.data,
> -		       nft_genmask_cur(net), get_jiffies_64(),
> -		       GFP_ATOMIC);
> -	if (IS_ERR(e))
> -		return ERR_CAST(e);
> +		       nft_genmask_cur(net), get_jiffies_64());
> +	if (!e)
> +		return ERR_PTR(-ENOENT);
>  
>  	return &e->priv;
>  }
> @@ -1344,8 +1279,8 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set,
>  	else
>  		end = start;
>  
> -	dup = pipapo_get(m, start, genmask, tstamp, GFP_KERNEL);
> -	if (!IS_ERR(dup)) {
> +	dup = pipapo_get(m, start, genmask, tstamp);
> +	if (dup) {
>  		/* Check if we already have the same exact entry */
>  		const struct nft_data *dup_key, *dup_end;
>  
> @@ -1364,15 +1299,9 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set,
>  		return -ENOTEMPTY;
>  	}
>  
> -	if (PTR_ERR(dup) == -ENOENT) {
> -		/* Look for partially overlapping entries */
> -		dup = pipapo_get(m, end, nft_genmask_next(net), tstamp,
> -				 GFP_KERNEL);
> -	}
> -
> -	if (PTR_ERR(dup) != -ENOENT) {
> -		if (IS_ERR(dup))
> -			return PTR_ERR(dup);
> +	/* Look for partially overlapping entries */
> +	dup = pipapo_get(m, end, nft_genmask_next(net), tstamp);
> +	if (dup) {
>  		*elem_priv = &dup->priv;
>  		return -ENOTEMPTY;
>  	}
> @@ -1914,8 +1843,8 @@ nft_pipapo_deactivate(const struct net *net, const struct nft_set *set,
>  		return NULL;
>  
>  	e = pipapo_get(m, (const u8 *)elem->key.val.data,
> -		       nft_genmask_next(net), nft_net_tstamp(net), GFP_KERNEL);
> -	if (IS_ERR(e))
> +		       nft_genmask_next(net), nft_net_tstamp(net));
> +	if (!e)
>  		return NULL;
>  
>  	nft_set_elem_change_active(net, set, &e->ext);

-- 
Stefano


  reply	other threads:[~2025-07-08  6:09 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-01 18:52 [PATCH nf-next 0/5] netfilter: nft_set updates Florian Westphal
2025-07-01 18:52 ` [PATCH nf-next 1/5] netfilter: nft_set_pipapo: remove unused arguments Florian Westphal
2025-07-08  6:09   ` Stefano Brivio
2025-07-01 18:52 ` [PATCH nf-next 2/5] netfilter: nft_set: remove one argument from lookup and update functions Florian Westphal
2025-07-03 20:10   ` kernel test robot
2025-07-01 18:52 ` [PATCH nf-next 3/5] netfilter: nft_set: remove indirection from update API call Florian Westphal
2025-07-01 18:52 ` [PATCH nf-next 4/5] netfilter: nft_set_pipapo: merge pipapo_get/lookup Florian Westphal
2025-07-08  6:09   ` Stefano Brivio [this message]
2025-07-08 19:02     ` Florian Westphal
2025-07-01 18:52 ` [PATCH nf-next 5/5] netfilter: nft_set_pipapo: prefer kvmalloc for scratch maps Florian Westphal
2025-07-08  6:09   ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250708080917.41f4b693@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).