All of lore.kernel.org
 help / color / mirror / Atom feed
* pipapo with element shadowing, wildcard support
@ 2025-12-02 15:21 Florian Westphal
  2025-12-03 14:08 ` Stefano Brivio
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-12-02 15:21 UTC (permalink / raw)
  To: netfilter-devel; +Cc: sbrivio

[-- Attachment #1: Type: text/plain, Size: 2601 bytes --]

Hi,

I am currently looking into pipapo and wildcard matching.
Before going into a bit more details consider the following
example set:

 map s {
   typeof iifname . ip saddr . oifname : verdict
   flags interval
 }

This works fine, but pipapo comes with caveats.
1. Currently pipapo allows to insert a 'narrow' key like e.g.:
  add element t s {  "lo" . 127.0.0.1 . "lo"  : accept }

  ... and then add a 'broader' one afterwards, e.g.:
  add element t s {  "lo" . 127.0.0.1/8 . "lo"  : drop }

but this doesn't work when placing these add calls in the
reverse order (you get -EEXIST).

Question would be if it makes sense to relax the -EEXIST checks so that in
step 1 the wider key could be inserted first and then allow it to be
(partially) shadowed later.

Things get worse when adding wildcard support:

 map s {
   typeof iifname . ip saddr . oifname : verdict
   flags interval
   elements = { "*" . 127.0.0.5 . "" : jump test1,
                "*" . 127.0.0.1 . "" : jump test2,
                "lo" . 127.0.0.1 . "" : jump test3,
                "lo" . 127.0.0.5 . "" : jump test4 }
 }

(requires attached nft hack to permit both "*" and "" identifiers)

Example is same as before:
 add element inet t s '{  "lo" . 127.0.0.1 . "" counter : jump testlo1 }'
 add element inet t s '{  "lo" . 127.0.0.0/8 . "" counter : jump testW1 }'

 works but not vice versa.  (fails with -EEXIST).

If the existing behavior is ok as-is, i.e. userspace has to pre-order the
add calls, then no changes are needed.

But I dislike this, I don't think users should be expected to first
autoremove existing elements, then add the new element, then add back the
old, wider entry.

I also worry that we could end up with subtle bugs, e.g. rule(set) dumps
that can't be restored because of element ordering tripping over -EEXIST
test.

Do you think that:
   add element inet t s '{  "lo" . 127.0.0.0/8 . ...'
   add element inet t s '{  "lo" . 127.0.0.1 . ...'

should work, i.e. second element add should work (and override the
pre-existing entry in case new entry is more specific)?

AFAICS the 'override' part isn't as easy as relaxing the -EEXIST checks;
we would have to re-order the elements internally, else even 'lo 127.0.0.1'
would match the /8 entry first, so we would require to order the elements
so the most narrow element is placed first in the lookup table.

That in turn won't be nice to handle from kernel, so I wonder if nftables
needs to do more work here, similar to how we 'auto-merge' for rbtree/interval
set types?

Before I spend more time on it, has anyone looked into this before?

Thanks!

[-- Attachment #2: nft-wildcard-support-hack.diff --]
[-- Type: text/plain, Size: 1515 bytes --]

diff --git a/src/evaluate.c b/src/evaluate.c
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -378,11 +378,10 @@ static int expr_evaluate_string(struct eval_ctx *ctx, struct expr **exprp)
 	memset(data + len, 0, data_len - len);
 	mpz_export_data(data, expr->value, BYTEORDER_HOST_ENDIAN, len);
 
-	if (strlen(data) == 0)
-		return expr_error(ctx->msgs, expr,
-				  "Empty string is not allowed");
+	datalen = strlen(data);
+	if (datalen)
+		datalen--;
 
-	datalen = strlen(data) - 1;
 	if (data[datalen] != '*') {
 		/* We need to reallocate the constant expression with the right
 		 * expression length to avoid problems on big endian.
@@ -395,11 +394,7 @@ static int expr_evaluate_string(struct eval_ctx *ctx, struct expr **exprp)
 		return 0;
 	}
 
-	if (datalen == 0)
-		return expr_error(ctx->msgs, expr,
-				  "All-wildcard strings are not supported");
-
-	if (data[datalen - 1] == '\\') {
+	if (datalen > 0 && data[datalen - 1] == '\\') {
 		char unescaped_str[data_len];
 
 		memset(unescaped_str, 0, sizeof(unescaped_str));
diff --git a/src/segtree.c b/src/segtree.c
--- a/src/segtree.c
+++ b/src/segtree.c
@@ -412,7 +412,8 @@ void concat_range_aggregate(struct expr *set)
 				unsigned int str_len = prefix_len / BITS_PER_BYTE;
 				char data[str_len + 2];
 
-				mpz_export_data(data, r1->value, BYTEORDER_HOST_ENDIAN, str_len);
+				if (str_len)
+					mpz_export_data(data, r1->value, BYTEORDER_HOST_ENDIAN, str_len);
 				data[str_len] = '*';
 
 				tmp = constant_expr_alloc(&r1->location, r1->dtype,

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pipapo with element shadowing, wildcard support
  2025-12-02 15:21 pipapo with element shadowing, wildcard support Florian Westphal
@ 2025-12-03 14:08 ` Stefano Brivio
  2025-12-03 17:25   ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Stefano Brivio @ 2025-12-03 14:08 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, Phil Sutter

Hi Florian,

On Tue, 2 Dec 2025 16:21:10 +0100
Florian Westphal <fw@strlen.de> wrote:

> Hi,
> 
> I am currently looking into pipapo and wildcard matching.
> Before going into a bit more details consider the following
> example set:
> 
>  map s {
>    typeof iifname . ip saddr . oifname : verdict
>    flags interval
>  }
> 
> This works fine, but pipapo comes with caveats.
> 1. Currently pipapo allows to insert a 'narrow' key like e.g.:
>   add element t s {  "lo" . 127.0.0.1 . "lo"  : accept }
> 
>   ... and then add a 'broader' one afterwards, e.g.:
>   add element t s {  "lo" . 127.0.0.1/8 . "lo"  : drop }
> 
> but this doesn't work when placing these add calls in the
> reverse order (you get -EEXIST).
> 
> Question would be if it makes sense to relax the -EEXIST checks so that in
> step 1 the wider key could be inserted first and then allow it to be
> (partially) shadowed later.

The reason why I implemented it this way was to avoid possible
ambiguity because entries inserted first are anyway matched first.
Details with example at:

  https://lore.kernel.org/all/20200226115924.461f2029@redhat.com/

It might make sense to change that, but that's not entirely trivial as
you would need to renumber / reorder entries in the buckets on
insertion, so that more specific entries are always added first.

I guess you assumed this was already implemented, which is a reasonable
assumption, but unfortunately I didn't add that, it looked good enough
as it was, back then.

> Things get worse when adding wildcard support:
> 
>  map s {
>    typeof iifname . ip saddr . oifname : verdict
>    flags interval
>    elements = { "*" . 127.0.0.5 . "" : jump test1,
>                 "*" . 127.0.0.1 . "" : jump test2,
>                 "lo" . 127.0.0.1 . "" : jump test3,
>                 "lo" . 127.0.0.5 . "" : jump test4 }
>  }
> 
> (requires attached nft hack to permit both "*" and "" identifiers)
> 
> Example is same as before:
>  add element inet t s '{  "lo" . 127.0.0.1 . "" counter : jump testlo1 }'
>  add element inet t s '{  "lo" . 127.0.0.0/8 . "" counter : jump testW1 }'
> 
>  works but not vice versa.  (fails with -EEXIST).
> 
> If the existing behavior is ok as-is, i.e. userspace has to pre-order the
> add calls, then no changes are needed.
> 
> But I dislike this, I don't think users should be expected to first
> autoremove existing elements, then add the new element, then add back the
> old, wider entry.
> 
> I also worry that we could end up with subtle bugs, e.g. rule(set) dumps
> that can't be restored because of element ordering tripping over -EEXIST
> test.
> 
> Do you think that:
>    add element inet t s '{  "lo" . 127.0.0.0/8 . ...'
>    add element inet t s '{  "lo" . 127.0.0.1 . ...'
> 
> should work, i.e. second element add should work (and override the
> pre-existing entry in case new entry is more specific)?

It looks like a nice feature to me, now that you mention it and its
usage with wildcards.

> AFAICS the 'override' part isn't as easy as relaxing the -EEXIST checks;
> we would have to re-order the elements internally, else even 'lo 127.0.0.1'
> would match the /8 entry first, so we would require to order the elements
> so the most narrow element is placed first in the lookup table.
> 
> That in turn won't be nice to handle from kernel, so I wonder if nftables
> needs to do more work here, similar to how we 'auto-merge' for rbtree/interval
> set types?
> 
> Before I spend more time on it, has anyone looked into this before?

See that same thread for some discussion about it, in particular around:

  https://lore.kernel.org/all/20200225184857.GC9532@orbyte.nwl.cc/

other than that I'm not aware of previous discussions.

-- 
Stefano


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pipapo with element shadowing, wildcard support
  2025-12-03 14:08 ` Stefano Brivio
@ 2025-12-03 17:25   ` Florian Westphal
  2025-12-23 12:23     ` Stefano Brivio
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-12-03 17:25 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: netfilter-devel, Phil Sutter

Stefano Brivio <sbrivio@redhat.com> wrote:
> > Question would be if it makes sense to relax the -EEXIST checks so that in
> > step 1 the wider key could be inserted first and then allow it to be
> > (partially) shadowed later.
> 
> The reason why I implemented it this way was to avoid possible
> ambiguity because entries inserted first are anyway matched first.
> Details with example at:
> 
>   https://lore.kernel.org/all/20200226115924.461f2029@redhat.com/

Thanks.

> It might make sense to change that, but that's not entirely trivial as
> you would need to renumber / reorder entries in the buckets on
> insertion, so that more specific entries are always added first.

Right, it needs a reorder step.

> I guess you assumed this was already implemented, which is a reasonable
> assumption, but unfortunately I didn't add that, it looked good enough
> as it was, back then.

No, I realized this was missing, the -EEXIST tests would not have
made sense otherwise.

What I did not know if it was omitted due to 'not feasible/too hard' or
'good enough for now'.

> See that same thread for some discussion about it, in particular around:
> 
>   https://lore.kernel.org/all/20200225184857.GC9532@orbyte.nwl.cc/
> 
> other than that I'm not aware of previous discussions.

Thanks for the pointer!

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: pipapo with element shadowing, wildcard support
  2025-12-03 17:25   ` Florian Westphal
@ 2025-12-23 12:23     ` Stefano Brivio
  0 siblings, 0 replies; 4+ messages in thread
From: Stefano Brivio @ 2025-12-23 12:23 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, Phil Sutter

On Wed, 3 Dec 2025 18:25:11 +0100
Florian Westphal <fw@strlen.de> wrote:

> Stefano Brivio <sbrivio@redhat.com> wrote:
>
> > It might make sense to change that, but that's not entirely trivial as
> > you would need to renumber / reorder entries in the buckets on
> > insertion, so that more specific entries are always added first.  
> 
> Right, it needs a reorder step.

Perhaps too late to be useful: if you're trying to implement something
like that, the stand-alone prototype at https://pipapo.lameexcu.se/
might be useful to conceptually try out things. This could probably be
implemented as a variation in unmap() from set.c.

-- 
Stefano


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-12-23 12:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-02 15:21 pipapo with element shadowing, wildcard support Florian Westphal
2025-12-03 14:08 ` Stefano Brivio
2025-12-03 17:25   ` Florian Westphal
2025-12-23 12:23     ` Stefano Brivio

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.