All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Phil Sutter <phil@nwl.cc>, Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org
Subject: Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
Date: Thu, 3 Jul 2025 23:32:26 +0200	[thread overview]
Message-ID: <aGbu5ugsBY8Bu3Ad@calendula> (raw)
In-Reply-To: <aGaUzVUf_-xbowvO@orbyte.nwl.cc>

On Thu, Jul 03, 2025 at 04:33:49PM +0200, Phil Sutter wrote:
[...]
> > > Pablo, WDYT? Feasible alternative to the feature flag?

That is more simple, it puts a bit more pressure on netlink_dump().
Worst case fully duplicates the information, I think we discussed this
already.

If my code reading is correct, maximum size of the area to add netlink
messages in the dump path is SKB_WITH_OVERHEAD(32768) according to
netlink_recvmsg(), there is a fallback to NLMSG_GOODSIZE when 3-order
allocation fails.

In the event path, memory budget is already used up since
NLMSG_GOODSIZE (4096) even before this proposal because

256 devices * IFNAMSIZ = 4096

This is beyond the limit.

This can be fixed by splitting the report in two events of NEWCHAIN,
but if there is another large attribute reaching the worst case, the
splitting will get more complicated.

With your new attribute, worst case scenario means duplicating _DEVS
with the same devices.

There is a memory budget for the netlink message, and the ADDCHAIN
netlink message with the netdev family is already pushing it to the
limit when *I rised* the maximum to 256 devices per request of a user.
I can post a patch to reduce it to 128 devices now that your device
wildcard feature is available.

I regret _DEVS attribute only includes DEV_NAME, it should have been
possible to add a flag and provide a more efficient representation
than your proposal.

A possible fugly hack would be to stuffed this information after \0 in
DEV_NAME, but that would feel like the iptables revision
infrastructure, better not to go that way.

Maybe all this is not worth to worry and we can assume in 2025 that
when 3-order allocation fails netlink dump will simply fail? Probably
this already is right for other existing netlink subsystems.

And this effort is to provide a way for the user to know that the
device that has been specified has actually registered a hook so the
chain will see traffic.

So far we only have events via NEWDEV to report new hooks. Maybe
GETDEV to consult the hooks that are attached to what chains is
needed? That would solve this usability issue.

But that is _a lot more work_, stuffing more information into the
ADDCHAIN netlink message is easier. GETDEV means more netlink boiler
plate code to avoid this simple extra attribute you propose.

GETDEV would be paired with NEWDEV events to determine which device
the base chain is hooked to.

Maybe it is not for users to know that that dummy* is matching
_something_ but also they want to know what device is matching such
pattern for debugging purpose.

It boils down to "should we care to provide facility to allow for more
instrospection in this box"?

If the answer is "no, what we have is sufficient" then let's not worry
about mistypes. GETDEV facility would be rarely used, then skip.

If you want to complete this picture, then add GETDEV, because NEWDEV
events without GETDEV command looks incomplete.

> > If my understanding is correct, I think this approach will break new
> > nft binary with old kernel.
> 
> If not present (old kernel), new nft would assume all device specs
> matched an interface. I would implement this as comment, something like:
> "# not bound: ethX, wlan*" which would be missing with old kernels.

If after all this, you decide to go for this approach with the new
attribute into ADDCHAIN, maybe a more compact representation,
add ? notation:

table ip x {
        chain y {
                type filter hook ingress devices = { "eth*"?, "vlan0"?, "wan1" } priority 0;
        }
}

This notation can be removed if -s/--stateless is used and ignored if
ruleset is loaded and it is more compact.

It feels like we are trying to stuff too much information in the
existing output, and my ? notation is just trying to find a "smart"
way to make thing not look bloated. Then, GETDEV comes again and this
silly notation is really not needed if a more complete view is
provided.

BTW, note there is one inconsistency that need to be addressed in the
listing, currently 'devices' does not enclose the names in quotes
while 'device' does it.

I mean, with device:

table netdev x {
        chain y {
                type filter hook ingress device "dummy0" priority filter; policy accept;
        }
}

with devices:

table netdev x {
        chain y {
                type filter hook ingress devices = { dummy0, dummy1 } priority filter; policy accept;
        }
}

It would be great to fix this.

P.S: This still leaves room to discuss if comments are the best way to
go to display handle and set count, but we can start a new thread to
discuss this.

  reply	other threads:[~2025-07-03 21:32 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-02 17:47 [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration Phil Sutter
2025-07-02 22:39 ` Florian Westphal
2025-07-03 10:21   ` Phil Sutter
2025-07-03 11:35     ` Pablo Neira Ayuso
2025-07-03 12:09       ` Florian Westphal
2025-07-03 12:37         ` Phil Sutter
2025-07-03 12:25       ` Phil Sutter
2025-07-03 12:39         ` Florian Westphal
2025-07-03 12:47           ` Phil Sutter
2025-07-03 12:54             ` Florian Westphal
2025-07-03 13:17               ` Phil Sutter
2025-07-03 14:19                 ` Pablo Neira Ayuso
2025-07-03 14:33                   ` Phil Sutter
2025-07-03 21:32                     ` Pablo Neira Ayuso [this message]
2025-07-04 12:41                       ` Phil Sutter
2025-07-04 14:04                         ` Florian Westphal
2025-07-04 15:33                           ` Phil Sutter
2025-07-07 19:25                           ` Pablo Neira Ayuso
2025-07-08 14:38                             ` Phil Sutter
2025-07-09 22:43                               ` Pablo Neira Ayuso
2025-07-10 13:55                                 ` Phil Sutter
2025-07-11 12:19                                 ` Phil Sutter
2025-07-11 13:16                                   ` Florian Westphal
2025-07-11 13:43                                     ` Phil Sutter
2025-07-11 13:48                                       ` Florian Westphal
2025-07-11 14:52                                   ` Pablo Neira Ayuso
2025-07-11 16:39                                     ` Phil Sutter
2025-07-14 14:02                                       ` Pablo Neira Ayuso
2025-07-03 11:55     ` Florian Westphal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aGbu5ugsBY8Bu3Ad@calendula \
    --to=pablo@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=phil@nwl.cc \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.