From: Stefano Brivio <sbrivio@redhat.com>
To: Florian Westphal <fw@strlen.de>
Cc: netfilter-devel@vger.kernel.org, Phil Sutter <phil@nwl.cc>
Subject: Re: pipapo with element shadowing, wildcard support
Date: Wed, 3 Dec 2025 15:08:49 +0100 [thread overview]
Message-ID: <20251203150849.0ea16d5f@elisabeth> (raw)
In-Reply-To: <aS8D5pxjnGg6WH-2@strlen.de>
Hi Florian,
On Tue, 2 Dec 2025 16:21:10 +0100
Florian Westphal <fw@strlen.de> wrote:
> Hi,
>
> I am currently looking into pipapo and wildcard matching.
> Before going into a bit more details consider the following
> example set:
>
> map s {
> typeof iifname . ip saddr . oifname : verdict
> flags interval
> }
>
> This works fine, but pipapo comes with caveats.
> 1. Currently pipapo allows to insert a 'narrow' key like e.g.:
> add element t s { "lo" . 127.0.0.1 . "lo" : accept }
>
> ... and then add a 'broader' one afterwards, e.g.:
> add element t s { "lo" . 127.0.0.1/8 . "lo" : drop }
>
> but this doesn't work when placing these add calls in the
> reverse order (you get -EEXIST).
>
> Question would be if it makes sense to relax the -EEXIST checks so that in
> step 1 the wider key could be inserted first and then allow it to be
> (partially) shadowed later.
The reason why I implemented it this way was to avoid possible
ambiguity because entries inserted first are anyway matched first.
Details with example at:
https://lore.kernel.org/all/20200226115924.461f2029@redhat.com/
It might make sense to change that, but that's not entirely trivial as
you would need to renumber / reorder entries in the buckets on
insertion, so that more specific entries are always added first.
I guess you assumed this was already implemented, which is a reasonable
assumption, but unfortunately I didn't add that, it looked good enough
as it was, back then.
> Things get worse when adding wildcard support:
>
> map s {
> typeof iifname . ip saddr . oifname : verdict
> flags interval
> elements = { "*" . 127.0.0.5 . "" : jump test1,
> "*" . 127.0.0.1 . "" : jump test2,
> "lo" . 127.0.0.1 . "" : jump test3,
> "lo" . 127.0.0.5 . "" : jump test4 }
> }
>
> (requires attached nft hack to permit both "*" and "" identifiers)
>
> Example is same as before:
> add element inet t s '{ "lo" . 127.0.0.1 . "" counter : jump testlo1 }'
> add element inet t s '{ "lo" . 127.0.0.0/8 . "" counter : jump testW1 }'
>
> works but not vice versa. (fails with -EEXIST).
>
> If the existing behavior is ok as-is, i.e. userspace has to pre-order the
> add calls, then no changes are needed.
>
> But I dislike this, I don't think users should be expected to first
> autoremove existing elements, then add the new element, then add back the
> old, wider entry.
>
> I also worry that we could end up with subtle bugs, e.g. rule(set) dumps
> that can't be restored because of element ordering tripping over -EEXIST
> test.
>
> Do you think that:
> add element inet t s '{ "lo" . 127.0.0.0/8 . ...'
> add element inet t s '{ "lo" . 127.0.0.1 . ...'
>
> should work, i.e. second element add should work (and override the
> pre-existing entry in case new entry is more specific)?
It looks like a nice feature to me, now that you mention it and its
usage with wildcards.
> AFAICS the 'override' part isn't as easy as relaxing the -EEXIST checks;
> we would have to re-order the elements internally, else even 'lo 127.0.0.1'
> would match the /8 entry first, so we would require to order the elements
> so the most narrow element is placed first in the lookup table.
>
> That in turn won't be nice to handle from kernel, so I wonder if nftables
> needs to do more work here, similar to how we 'auto-merge' for rbtree/interval
> set types?
>
> Before I spend more time on it, has anyone looked into this before?
See that same thread for some discussion about it, in particular around:
https://lore.kernel.org/all/20200225184857.GC9532@orbyte.nwl.cc/
other than that I'm not aware of previous discussions.
--
Stefano
next prev parent reply other threads:[~2025-12-03 14:08 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-02 15:21 pipapo with element shadowing, wildcard support Florian Westphal
2025-12-03 14:08 ` Stefano Brivio [this message]
2025-12-03 17:25 ` Florian Westphal
2025-12-23 12:23 ` Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251203150849.0ea16d5f@elisabeth \
--to=sbrivio@redhat.com \
--cc=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=phil@nwl.cc \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.