Re: Feedback on variable sized set elements

From: Florian Westphal <fw@strlen.de>
To: Shaun Brady <brady.1345@gmail.com>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: Feedback on variable sized set elements
Date: Fri, 11 Jul 2025 12:27:45 +0200
Message-ID: <aHDnIS1iaBKtxove@strlen.de>
In-Reply-To: <aHCFaArfREnXjy5Y@fedora>

Shaun Brady <brady.1345@gmail.com> wrote:
> I'm sure I bit off more than I could chew, but I attempted to write a proof of
> concept patch to add a new set type, inet_addr, which would allow elements of
> both ipv4_addr and ipv6_addr types.

Why?  This is hard, the kernel has no notion of data types.

> Something to the tune of:
> 
> nft add set inet filter set_inet {type inet_addr\;}
> nft add element inet filter set_inet { 10.0.1.195, 10.0.1.200, 10.0.1.201, 2001:db8::8a2e:370:7334 }

How would this work with ranges or concatenations?

> Figuring most of this would be implemented in the nft userland, I started
> there, and was able to successfully get a new set type that allowed v4
> addresses OR v6 addresses, depending on how I defined the datasize of
> inet_addr (4 bytes or 16 bytes).
> 
> When leaving inet_addr size at the required (for both v4 and v6) 16 bytes
> netlink would return EINVAL when adding v4 addresses to the set. We found in
> nft_value_init:
> 
>                 if (len != desc->len)
>                         return -EINVAL;
> 
> with len being the nlattr (the v4 address) and desc being the nft_set_desc. 4 != 16.
> My questions:
> 
> 1) Is this feature interesting enough to pursue (given what would have to be
> done to make it work (see next question))?  The set type only makes sense in
> inet tables (I think...) and even then, would roughly be syntactic sugar for
> what could be done (more efficiently) with two sets of the base protocols. But
> hey, nice things make nice tools?

I don't see how it's doable.  The lookup key fed to the set lookup
function via nft_lookup.c has a fixed size.

From the kernel's point of view, it's an array of u32 of a given size
dictated by the set's key length.
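
To make the fixed-size constraint concrete, here is a minimal userspace
sketch. It is not kernel code: struct toy_set and the toy_set_* helpers
are hypothetical names, and the linear scan stands in for whatever a real
set backend does. It only shows why a 4-byte element cannot go into a set
whose key length is 16, and why lookup has no per-element length to
consult.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ELEMS 8
#define TOY_MAX_WORDS 4	/* room for keys of up to 16 bytes */

/* Hypothetical toy set; NOT the kernel's struct nft_set. */
struct toy_set {
	unsigned int klen;	/* key length in bytes, fixed at creation */
	unsigned int count;
	uint32_t elems[TOY_MAX_ELEMS][TOY_MAX_WORDS];
};

int toy_set_init(struct toy_set *s, unsigned int klen)
{
	if (klen == 0 || klen > sizeof(s->elems[0]))
		return -EINVAL;
	memset(s, 0, sizeof(*s));
	s->klen = klen;
	return 0;
}

/* Same idea as the quoted len != desc->len check: wrong size is rejected. */
int toy_set_add(struct toy_set *s, const void *key, unsigned int len)
{
	if (len != s->klen)
		return -EINVAL;
	if (s->count == TOY_MAX_ELEMS)
		return -ENOSPC;
	memcpy(s->elems[s->count++], key, len);
	return 0;
}

/* Lookup always compares exactly klen bytes of the caller's key. */
int toy_set_lookup(const struct toy_set *s, const void *key)
{
	for (unsigned int i = 0; i < s->count; i++)
		if (memcmp(s->elems[i], key, s->klen) == 0)
			return 1;
	return 0;
}
```

With a set created via toy_set_init(&s, 16), adding a 4-byte value returns
-EINVAL, which is the 4 != 16 situation described in the quoted text.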

> 2) (assuming #1) I believe we would have to put a condition to check the set
> type versus the nlattr type, and allow a size difference on
> set(inet_addr)/set_elem(ipv4_addr) (I don't know if that has any
> ramifications).

The kernel doesn't know what an ipv4 or ipv6 address is.
It only knows the total key size.  In the case of nft_set_pipapo.c it's
also told the sizes of the individual subkeys (e.g. ipv4_addr .
inet_service -> 4 . 4, ipv6_addr . ip_protocol -> 16 . 4).
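
As a rough illustration of "total key size plus subkey sizes", the sketch
below describes a concatenation purely by byte counts. The struct and
function names are hypothetical and do not correspond to the kernel's
descriptor layout; the sizes used in the note after the code are the ones
quoted above.

```c
#define TOY_MAX_FIELDS 4

/* Hypothetical descriptor: only sizes, no notion of address or port. */
struct toy_concat_desc {
	unsigned int field_count;
	unsigned int field_len[TOY_MAX_FIELDS];	/* subkey sizes in bytes */
};

/* Total key size is the sum of the subkey sizes. */
unsigned int toy_concat_klen(const struct toy_concat_desc *d)
{
	unsigned int klen = 0;

	for (unsigned int i = 0; i < d->field_count; i++)
		klen += d->field_len[i];
	return klen;
}

/* Byte offset of subkey idx inside the concatenated key. */
unsigned int toy_concat_offset(const struct toy_concat_desc *d,
			       unsigned int idx)
{
	unsigned int off = 0;

	for (unsigned int i = 0; i < idx && i < d->field_count; i++)
		off += d->field_len[i];
	return off;
}
```

For the 4 . 4 case this gives a total of 8 bytes with the second subkey at
offset 4; for 16 . 4 it gives 20 bytes with the second subkey at offset 16.
Nothing in the descriptor says which field is an address.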

Maybe it would be possible to xlate ipv4 addresses to ipv6 mapped
addresses, but that would still require expansion in userland, because

ip saddr @foo
ip6 saddr @foo

cannot work.  We'd need to rewrite it to something like
meta nfproto ipv4 {
	reg32_1 = 0xffffffff
	reg32_2 = 0xffffffff
	reg32_3 = 0xffffffff
	reg32_4 = ip saddr
	lookup @foo sreg32_1
}
meta nfproto ipv6 ip6 saddr @foo

I don't think it's worth the pain, also because ipv4 then becomes
indistinguishable from on-wire mapped addresses.
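
The ambiguity can be shown with a small userspace sketch. The assumption
here is that the expansion uses the standard IPv4-mapped layout from
RFC 4291 (::ffff:a.b.c.d, i.e. 80 zero bits, 16 one bits, then the
address); the register constants in the pseudo-rule above are read as
illustrative, not as that exact layout. The toy_* function names are
hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Expand a 4-byte IPv4 address (network byte order) into a 16-byte key. */
void toy_map_v4_key(const uint8_t v4[4], uint8_t key[16])
{
	memset(key, 0, 10);		/* 80 zero bits */
	key[10] = 0xff;			/* 16 one bits */
	key[11] = 0xff;
	memcpy(key + 12, v4, 4);	/* the IPv4 address itself */
}

/*
 * Returns 1 if the mapped IPv4 key is byte-identical to the IPv6 key,
 * 0 if the keys differ, -1 if either text address fails to parse.
 */
int toy_keys_collide(const char *v4_text, const char *v6_text)
{
	uint8_t v4[4], mapped[16], v6[16];

	if (inet_pton(AF_INET, v4_text, v4) != 1 ||
	    inet_pton(AF_INET6, v6_text, v6) != 1)
		return -1;

	toy_map_v4_key(v4, mapped);
	return memcmp(mapped, v6, sizeof(v6)) == 0;
}
```

toy_keys_collide("10.0.1.195", "::ffff:10.0.1.195") returns 1: once both
are 16-byte keys, a set element cannot tell an expanded ipv4 source from
a packet that really carried the mapped ipv6 address.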

> Another possible approach would be to create an API to transmit valid size
> types for a set type from userland. We would still need to ID the set type,
> and that has the above problems of set.ktype.

There are different sets, yes, but none of these sets support a
particular data type.  They don't know what that is; the datatype is a
userspace thing and it's only stored in the kernel so that 'nft list
ruleset' and friends know how to pretty-print the octet soups stored in
the set.  It's not related to matching.
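
A small sketch of the "pretty-print only" role of the datatype: the same
four key bytes are rendered differently depending on a type tag, while
any comparison would only ever see the bytes. The enum and function are
hypothetical and are not how nft stores or names its datatypes.

```c
#include <arpa/inet.h>
#include <inttypes.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical userspace-side type tags. */
enum toy_dtype { TOY_IPV4_ADDR, TOY_INTEGER };

/*
 * Render a 4-byte key according to the type tag.
 * Returns 0 on success, -1 if the buffer is too small or the tag is unknown.
 */
int toy_print_key(enum toy_dtype t, const uint8_t key[4],
		  char *buf, size_t len)
{
	switch (t) {
	case TOY_IPV4_ADDR:
		return inet_ntop(AF_INET, key, buf, (socklen_t)len) ? 0 : -1;
	case TOY_INTEGER: {
		/* Interpret the bytes as a big-endian 32-bit integer. */
		uint32_t v = (uint32_t)key[0] << 24 | (uint32_t)key[1] << 16 |
			     (uint32_t)key[2] << 8 | (uint32_t)key[3];
		int n = snprintf(buf, len, "%" PRIu32, v);

		return (n < 0 || (size_t)n >= len) ? -1 : 0;
	}
	}
	return -1;
}
```

The bytes 0a 00 01 c3 come out as 10.0.1.195 under one tag and as
167772611 under the other; the stored key is the same either way.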
