netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
@ 2025-07-02 17:47 Phil Sutter
  2025-07-02 22:39 ` Florian Westphal
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-02 17:47 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Require user space to set a flag upon flowtable or netdev-family chain
creation explicitly relaxing the hook registration when it comes to
non-existent interfaces. For the sake of simplicity, just restore error
condition if a given hook does not find an interface to bind to, leave
everyting else in place. Therefore:

- A wildcard interface spec is accepted as long as at least a single
  interface matches.
- Dynamic unregistering and re-registering of vanishing/re-appearing
  interfaces is still happening.

Note that this flag is persistent, i.e. included in ruleset dumps. This
effectively makes it "updatable": User space may create a "name-based"
flowtable for a non-existent interface, then update the flowtable to
drop the flag. What should happen then? Right now this is simply
accepted, even though the flowtable still does not bind to an interface.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 include/uapi/linux/netfilter/nf_tables.h |  9 +++++++--
 net/netfilter/nf_tables_api.c            | 15 ++++++++++++---
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 792836149eca..d08c7fbde1d1 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -219,10 +219,12 @@ enum nft_chain_flags {
 	NFT_CHAIN_BASE		= (1 << 0),
 	NFT_CHAIN_HW_OFFLOAD	= (1 << 1),
 	NFT_CHAIN_BINDING	= (1 << 2),
+	NFT_CHAIN_NAME_BASED	= (1 << 3),
 };
 #define NFT_CHAIN_FLAGS		(NFT_CHAIN_BASE		| \
 				 NFT_CHAIN_HW_OFFLOAD	| \
-				 NFT_CHAIN_BINDING)
+				 NFT_CHAIN_BINDING	| \
+				 NFT_CHAIN_NAME_BASED)
 
 /**
  * enum nft_chain_attributes - nf_tables chain netlink attributes
@@ -1699,12 +1701,15 @@ enum nft_object_attributes {
  *
  * @NFT_FLOWTABLE_HW_OFFLOAD: flowtable hardware offload is enabled
  * @NFT_FLOWTABLE_COUNTER: enable flow counters
+ * @NFT_FLOWTABLE_NAME_BASED: relax interface hooks
  */
 enum nft_flowtable_flags {
 	NFT_FLOWTABLE_HW_OFFLOAD	= 0x1,
 	NFT_FLOWTABLE_COUNTER		= 0x2,
+	NFT_FLOWTABLE_NAME_BASED	= 0x4,
 	NFT_FLOWTABLE_MASK		= (NFT_FLOWTABLE_HW_OFFLOAD |
-					   NFT_FLOWTABLE_COUNTER)
+					   NFT_FLOWTABLE_COUNTER |
+					   NFT_FLOWTABLE_NAME_BASED)
 };
 
 /**
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 8fcc6393be38..5ae736715eec 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2374,7 +2374,8 @@ static struct nft_hook *nft_hook_list_find(struct list_head *hook_list,
 static int nf_tables_parse_netdev_hooks(struct net *net,
 					const struct nlattr *attr,
 					struct list_head *hook_list,
-					struct netlink_ext_ack *extack)
+					struct netlink_ext_ack *extack,
+					bool relaxed)
 {
 	struct nft_hook *hook, *next;
 	const struct nlattr *tmp;
@@ -2392,6 +2393,12 @@ static int nf_tables_parse_netdev_hooks(struct net *net,
 			err = PTR_ERR(hook);
 			goto err_hook;
 		}
+		if (!relaxed && list_empty(&hook->ops_list)) {
+			NL_SET_BAD_ATTR(extack, tmp);
+			nft_netdev_hook_free(hook);
+			err = -ENOENT;
+			goto err_hook;
+		}
 		if (nft_hook_list_find(hook_list, hook)) {
 			NL_SET_BAD_ATTR(extack, tmp);
 			nft_netdev_hook_free(hook);
@@ -2441,7 +2448,8 @@ static int nft_chain_parse_netdev(struct net *net, struct nlattr *tb[],
 		list_add_tail(&hook->list, hook_list);
 	} else if (tb[NFTA_HOOK_DEVS]) {
 		err = nf_tables_parse_netdev_hooks(net, tb[NFTA_HOOK_DEVS],
-						   hook_list, extack);
+						   hook_list, extack,
+						   flags & NFT_CHAIN_NAME_BASED);
 		if (err < 0)
 			return err;
 
@@ -8847,6 +8855,7 @@ static int nft_flowtable_parse_hook(const struct nft_ctx *ctx,
 				    struct nft_flowtable *flowtable,
 				    struct netlink_ext_ack *extack, bool add)
 {
+	bool relaxed = flowtable->data.flags & NFT_FLOWTABLE_NAME_BASED;
 	struct nlattr *tb[NFTA_FLOWTABLE_HOOK_MAX + 1];
 	struct nf_hook_ops *ops;
 	struct nft_hook *hook;
@@ -8897,7 +8906,7 @@ static int nft_flowtable_parse_hook(const struct nft_ctx *ctx,
 		err = nf_tables_parse_netdev_hooks(ctx->net,
 						   tb[NFTA_FLOWTABLE_HOOK_DEVS],
 						   &flowtable_hook->list,
-						   extack);
+						   extack, relaxed);
 		if (err < 0)
 			return err;
 	}
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-02 17:47 [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration Phil Sutter
@ 2025-07-02 22:39 ` Florian Westphal
  2025-07-03 10:21   ` Phil Sutter
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Westphal @ 2025-07-02 22:39 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> Require user space to set a flag upon flowtable or netdev-family chain
> creation explicitly relaxing the hook registration when it comes to
> non-existent interfaces. For the sake of simplicity, just restore error
> condition if a given hook does not find an interface to bind to, leave
> everyting else in place.

OK, but then this needs to go in via nf.git and:

Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")

tag.  We shouldn't introduce a "error" -> "no error" -> "error" semantic
change sequence in kernel releases, i.e. this change is urgent; its now
(before 6.16 release) or never.

> - A wildcard interface spec is accepted as long as at least a single
>   interface matches.

Is there a reason for this? Why are they handled differently?

> - Dynamic unregistering and re-registering of vanishing/re-appearing
>   interfaces is still happening.

You mean, without the flag? AFAIU old behaviour is:
For netdev chains:
- auto-removal AND free of device basechain -> no reappearance
- -ENOENT error on chain add if device name doesn't exist
For flowtable:
- device is removed from the list (and list can become empty), flowtable
  stays 100%, just the device name disappears from the devices list.
  Doesn't reappear (auto re-added) either.
- -ENOENT error on flowtable add if even one device doesn't exist

Neither netdev nor flowtable support "foo*" wildcards.

nf.git:
- netdev basechain kept alive, no freeing, auto-reregister (becomes
  active again if device with same name reappears).
  No error if device name doesn't exists -> delayed auto-register
  instead, including multi-reg for "foo*" case.
- flowtable: same as old BUT device is auto-(re)added if same name
  (re)appears.
- No -ENOENT error on flowtable add, even if no single device existed

Full "foo*" support.

Now (this patch, without new flag):
- netdev basechain: same as above.
  But you do get an error if the device name did not exist.
  Unless it was for "foo*", thats accepted even if no match is found.
  AFAICS its a userspace/nft change, ie. the new flag is actually
  provided silently in the "foo*" case?
- flowtable: same as old BUT device is auto-(re)added if same name
  (re)appears.
- -ENOENT error on flowtable add if even one device doesn't exist
  Except "foo*" case, then its ok even if no match found.

Maybe add a table that explains the old/new/wanted (this patch) behaviours?
And an explanation/rationale for the new flag?

Is there a concern that users depend on old behaviour?
If so, why are we only concerned about the "add" behaviour but not the
auto-reregistering?

Is it to protect users from typos going unnoticed?
I could imagine "wlp0s20f1" getting misspelled occasionally...

> Note that this flag is persistent, i.e. included in ruleset dumps. This
> effectively makes it "updatable": User space may create a "name-based"
> flowtable for a non-existent interface, then update the flowtable to
> drop the flag. What should happen then? Right now this is simply
> accepted, even though the flowtable still does not bind to an interface.

AFAIU:
If we accept off -> On, the flowtable should bind.
If we accept on -> off, then it looks we should continue to drop devices
from the list but just stop auto-readding?

If in doubt the flag should not be updateable (hard error), in
that case we can refine/relax later.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-02 22:39 ` Florian Westphal
@ 2025-07-03 10:21   ` Phil Sutter
  2025-07-03 11:35     ` Pablo Neira Ayuso
  2025-07-03 11:55     ` Florian Westphal
  0 siblings, 2 replies; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 10:21 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

Hi Florian,

On Thu, Jul 03, 2025 at 12:39:32AM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > Require user space to set a flag upon flowtable or netdev-family chain
> > creation explicitly relaxing the hook registration when it comes to
> > non-existent interfaces. For the sake of simplicity, just restore error
> > condition if a given hook does not find an interface to bind to, leave
> > everyting else in place.
> 
> OK, but then this needs to go in via nf.git and:
> 
> Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
> 
> tag.  We shouldn't introduce a "error" -> "no error" -> "error" semantic
> change sequence in kernel releases, i.e. this change is urgent; its now
> (before 6.16 release) or never.

Oh, right. So a decision whether this is feasible and if, how it should
behave in detail, is urgent.

> > - A wildcard interface spec is accepted as long as at least a single
> >   interface matches.
> 
> Is there a reason for this? Why are they handled differently?

I wasn't sure if it's "required" to prevent it as well or not. This
patch was motivated by Pablo reporting users would not notice mis-typed
interface names anymore and asking for whether introducing a feature
flag for it was possible. So I went ahead to have something for a
discussion.

Actually, wildcards are not handled differently: If user specifies
"eth123", kernel errors if no "eth123" exists and accepts otherwise. If
user specifies "eth*", kernel errors if no interface with that prefix
exists and accepts otherwise.

I don't know where to go with this. If the flag should turn interface
specs name-based, its absence should fully restore the old behaviour (as
you kindly summarized below). If it's just about the typo, this patch
might be fine.

> > - Dynamic unregistering and re-registering of vanishing/re-appearing
> >   interfaces is still happening.
> 
> You mean, without the flag? AFAIU old behaviour is:
> For netdev chains:
> - auto-removal AND free of device basechain -> no reappearance
> - -ENOENT error on chain add if device name doesn't exist
> For flowtable:
> - device is removed from the list (and list can become empty), flowtable
>   stays 100%, just the device name disappears from the devices list.
>   Doesn't reappear (auto re-added) either.
> - -ENOENT error on flowtable add if even one device doesn't exist
> 
> Neither netdev nor flowtable support "foo*" wildcards.
> 
> nf.git:
> - netdev basechain kept alive, no freeing, auto-reregister (becomes
>   active again if device with same name reappears).
>   No error if device name doesn't exists -> delayed auto-register
>   instead, including multi-reg for "foo*" case.
> - flowtable: same as old BUT device is auto-(re)added if same name
>   (re)appears.
> - No -ENOENT error on flowtable add, even if no single device existed
> 
> Full "foo*" support.
> 
> Now (this patch, without new flag):
> - netdev basechain: same as above.
>   But you do get an error if the device name did not exist.
>   Unless it was for "foo*", thats accepted even if no match is found.

No, this patch has the kernel error also if it doesn't find a match for
the wildcard. It merely asserts that the hook's ops_list is non-empty
after nft_netdev_hook_alloc() (which did the search for matching
interfaces) returns.

>   AFAICS its a userspace/nft change, ie. the new flag is actually
>   provided silently in the "foo*" case?
> - flowtable: same as old BUT device is auto-(re)added if same name
>   (re)appears.
> - -ENOENT error on flowtable add if even one device doesn't exist
>   Except "foo*" case, then its ok even if no match found.
> 
> Maybe add a table that explains the old/new/wanted (this patch) behaviours?
> And an explanation/rationale for the new flag?
> 
> Is there a concern that users depend on old behaviour?
> If so, why are we only concerned about the "add" behaviour but not the
> auto-reregistering?
> 
> Is it to protect users from typos going unnoticed?
> I could imagine "wlp0s20f1" getting misspelled occasionally...

Yes, that was the premise upon which I wrote the patch. I didn't intend
to make the flag toggle between the old interface hooks and the new
interface name hooks.

> > Note that this flag is persistent, i.e. included in ruleset dumps. This
> > effectively makes it "updatable": User space may create a "name-based"
> > flowtable for a non-existent interface, then update the flowtable to
> > drop the flag. What should happen then? Right now this is simply
> > accepted, even though the flowtable still does not bind to an interface.
> 
> AFAIU:
> If we accept off -> On, the flowtable should bind.
> If we accept on -> off, then it looks we should continue to drop devices
> from the list but just stop auto-readding?
> 
> If in doubt the flag should not be updateable (hard error), in
> that case we can refine/relax later.

My statement above was probably a bit confusing: With non-persistent, I
meant for the flag to be recognized upon chain/flowtable creation but
not added to chain->flags or flowtable->data.flags.

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 10:21   ` Phil Sutter
@ 2025-07-03 11:35     ` Pablo Neira Ayuso
  2025-07-03 12:09       ` Florian Westphal
  2025-07-03 12:25       ` Phil Sutter
  2025-07-03 11:55     ` Florian Westphal
  1 sibling, 2 replies; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-03 11:35 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

On Thu, Jul 03, 2025 at 12:21:17PM +0200, Phil Sutter wrote:
> Hi Florian,
> 
> On Thu, Jul 03, 2025 at 12:39:32AM +0200, Florian Westphal wrote:
> > Phil Sutter <phil@nwl.cc> wrote:
> > > Require user space to set a flag upon flowtable or netdev-family chain
> > > creation explicitly relaxing the hook registration when it comes to
> > > non-existent interfaces. For the sake of simplicity, just restore error
> > > condition if a given hook does not find an interface to bind to, leave
> > > everyting else in place.
> > 
> > OK, but then this needs to go in via nf.git and:
> > 
> > Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
> > 
> > tag.  We shouldn't introduce a "error" -> "no error" -> "error" semantic
> > change sequence in kernel releases, i.e. this change is urgent; its now
> > (before 6.16 release) or never.
> 
> Oh, right. So a decision whether this is feasible and if, how it should
> behave in detail, is urgent.

Downside is that this flag adds more complexity, since there will be
two paths to test (flag on/off).

> > > - A wildcard interface spec is accepted as long as at least a single
> > >   interface matches.
> > 
> > Is there a reason for this? Why are they handled differently?
> 
> I wasn't sure if it's "required" to prevent it as well or not. This
> patch was motivated by Pablo reporting users would not notice mis-typed
> interface names anymore and asking for whether introducing a feature
> flag for it was possible. So I went ahead to have something for a
> discussion.
>
> Actually, wildcards are not handled differently: If user specifies
> "eth123", kernel errors if no "eth123" exists and accepts otherwise. If
> user specifies "eth*", kernel errors if no interface with that prefix
> exists and accepts otherwise.
> 
> I don't know where to go with this. If the flag should turn interface
> specs name-based, its absence should fully restore the old behaviour (as
> you kindly summarized below). If it's just about the typo, this patch
> might be fine.

Another concern is having a lot of devices, this is now interating
linearly performing strcmp() to find matches from the control plane
(ie. maybe this slow down time to load ruleset?), IIRC you mentioned
this should not be an issue.

> > > - Dynamic unregistering and re-registering of vanishing/re-appearing
> > >   interfaces is still happening.
> > 
> > You mean, without the flag? AFAIU old behaviour is:
> > For netdev chains:
> > - auto-removal AND free of device basechain -> no reappearance
> > - -ENOENT error on chain add if device name doesn't exist
> > For flowtable:
> > - device is removed from the list (and list can become empty), flowtable
> >   stays 100%, just the device name disappears from the devices list.
> >   Doesn't reappear (auto re-added) either.
> > - -ENOENT error on flowtable add if even one device doesn't exist
> > 
> > Neither netdev nor flowtable support "foo*" wildcards.
> > 
> > nf.git:
> > - netdev basechain kept alive, no freeing, auto-reregister (becomes
> >   active again if device with same name reappears).
> >   No error if device name doesn't exists -> delayed auto-register
> >   instead, including multi-reg for "foo*" case.
> > - flowtable: same as old BUT device is auto-(re)added if same name
> >   (re)appears.
> > - No -ENOENT error on flowtable add, even if no single device existed
> > 
> > Full "foo*" support.
> > 
> > Now (this patch, without new flag):
> > - netdev basechain: same as above.
> >   But you do get an error if the device name did not exist.
> >   Unless it was for "foo*", thats accepted even if no match is found.
> 
> No, this patch has the kernel error also if it doesn't find a match for
> the wildcard. It merely asserts that the hook's ops_list is non-empty
> after nft_netdev_hook_alloc() (which did the search for matching
> interfaces) returns.
>
> >   AFAICS its a userspace/nft change, ie. the new flag is actually
> >   provided silently in the "foo*" case?
> > - flowtable: same as old BUT device is auto-(re)added if same name
> >   (re)appears.
> > - -ENOENT error on flowtable add if even one device doesn't exist
> >   Except "foo*" case, then its ok even if no match found.
> > 
> > Maybe add a table that explains the old/new/wanted (this patch) behaviours?
> > And an explanation/rationale for the new flag?
> > 
> > Is there a concern that users depend on old behaviour?
> > If so, why are we only concerned about the "add" behaviour but not the
> > auto-reregistering?
> > 
> > Is it to protect users from typos going unnoticed?
> > I could imagine "wlp0s20f1" getting misspelled occasionally...
> 
> Yes, that was the premise upon which I wrote the patch. I didn't intend
> to make the flag toggle between the old interface hooks and the new
> interface name hooks.

Mistyped name is another scenario this flag could help.

> > > Note that this flag is persistent, i.e. included in ruleset dumps. This
> > > effectively makes it "updatable": User space may create a "name-based"
> > > flowtable for a non-existent interface, then update the flowtable to
> > > drop the flag. What should happen then? Right now this is simply
> > > accepted, even though the flowtable still does not bind to an interface.
> > 
> > AFAIU:
> > If we accept off -> On, the flowtable should bind.
> > If we accept on -> off, then it looks we should continue to drop devices
> > from the list but just stop auto-readding?
> > 
> > If in doubt the flag should not be updateable (hard error), in
> > that case we can refine/relax later.
> 
> My statement above was probably a bit confusing: With non-persistent, I
> meant for the flag to be recognized upon chain/flowtable creation but
> not added to chain->flags or flowtable->data.flags.

If this flag is added, I won't allow for updates until such
possibility is carefully review, having all possible tricky scenarios
in mind.

I think it boils down to the extra complexity that this flag adds is
worth having or documenting the new behaviour is sufficient, assuming
the two issues that have been mentioned are not problematic.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 10:21   ` Phil Sutter
  2025-07-03 11:35     ` Pablo Neira Ayuso
@ 2025-07-03 11:55     ` Florian Westphal
  1 sibling, 0 replies; 29+ messages in thread
From: Florian Westphal @ 2025-07-03 11:55 UTC (permalink / raw)
  To: Phil Sutter, Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> On Thu, Jul 03, 2025 at 12:39:32AM +0200, Florian Westphal wrote:
> > > - A wildcard interface spec is accepted as long as at least a single
> > >   interface matches.
> > 
> > Is there a reason for this? Why are they handled differently?
> 
> I wasn't sure if it's "required" to prevent it as well or not. This
> patch was motivated by Pablo reporting users would not notice mis-typed
> interface names anymore and asking for whether introducing a feature
> flag for it was possible. So I went ahead to have something for a
> discussion.

Ah, thanks, that makes sense.

> Actually, wildcards are not handled differently: If user specifies
> "eth123", kernel errors if no "eth123" exists and accepts otherwise. If
> user specifies "eth*", kernel errors if no interface with that prefix
> exists and accepts otherwise.

Indeed, thanks for clarifying.

> I don't know where to go with this. If the flag should turn interface
> specs name-based, its absence should fully restore the old behaviour (as
> you kindly summarized below). If it's just about the typo, this patch
> might be fine.

Yes.

> > Now (this patch, without new flag):
> > - netdev basechain: same as above.
> >   But you do get an error if the device name did not exist.
> >   Unless it was for "foo*", thats accepted even if no match is found.
> 
> No, this patch has the kernel error also if it doesn't find a match for
> the wildcard. It merely asserts that the hook's ops_list is non-empty
> after nft_netdev_hook_alloc() (which did the search for matching
> interfaces) returns.

Aaah, ok, I see now. Then its waaaay less confusing than I thought.

> > If in doubt the flag should not be updateable (hard error), in
> > that case we can refine/relax later.
> 
> My statement above was probably a bit confusing: With non-persistent, I
> meant for the flag to be recognized upon chain/flowtable creation but
> not added to chain->flags or flowtable->data.flags.

I see, thats a good question, I feel exposing it (add to ->flags member)
is better.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 11:35     ` Pablo Neira Ayuso
@ 2025-07-03 12:09       ` Florian Westphal
  2025-07-03 12:37         ` Phil Sutter
  2025-07-03 12:25       ` Phil Sutter
  1 sibling, 1 reply; 29+ messages in thread
From: Florian Westphal @ 2025-07-03 12:09 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Phil Sutter, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Oh, right. So a decision whether this is feasible and if, how it should
> > behave in detail, is urgent.
> 
> Downside is that this flag adds more complexity, since there will be
> two paths to test (flag on/off).

Right.

> Another concern is having a lot of devices, this is now interating
> linearly performing strcmp() to find matches from the control plane
> (ie. maybe this slow down time to load ruleset?), IIRC you mentioned
> this should not be an issue.

Can you suggest an alternative?

I see the following:

- revert to old behaviour (no search,
  lookup-by-name), and introduce a new netlink attribute for the 'wildcard
  name / full-search'.
  Since thats a big change this requires a revert in nf.git and then a
  followup change in nf-next to amend this.

- Only search if we have a wildcard.
  It should be enough to check, from nft_netdev_hook_alloc, if hook->ifname
  is null-terminated or not. If it is, lookup-by-name, else
  for_each_netdev search.

  Thats assuming that the netlink attributes are encoded as
  'eth\0' (4 bytes, no wildcard), vs 'eth' (3 bytes, wildcard).

> > Yes, that was the premise upon which I wrote the patch. I didn't intend
> > to make the flag toggle between the old interface hooks and the new
> > interface name hooks.
> 
> Mistyped name is another scenario this flag could help.

Regardless of this flag patch one could update nftables userspace to
display hints like we do for sets with the new '# count 42' comment
annotation.

Something that tells that the hook is subscribed for eth42 but currently
not active.

Same with flowtables, something that tells which devices are configured
(subscribed) and which devices are used (should likely still display
 ppp* and not list 4000k ppp1234 :-) ).

Phil, whats your take here?

From a quick glance there is currently no way for a user to tell if the
hook is active or not (except by adding a dummy counter rule).

> If this flag is added, I won't allow for updates until such
> possibility is carefully review, having all possible tricky scenarios
> in mind.

Makes sense to me.

> I think it boils down to the extra complexity that this flag adds is
> worth having or documenting the new behaviour is sufficient, assuming
> the two issues that have been mentioned are not problematic.

Yes.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 11:35     ` Pablo Neira Ayuso
  2025-07-03 12:09       ` Florian Westphal
@ 2025-07-03 12:25       ` Phil Sutter
  2025-07-03 12:39         ` Florian Westphal
  1 sibling, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 12:25 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Hi Pablo,

On Thu, Jul 03, 2025 at 01:35:43PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Jul 03, 2025 at 12:21:17PM +0200, Phil Sutter wrote:
> > On Thu, Jul 03, 2025 at 12:39:32AM +0200, Florian Westphal wrote:
> > > Phil Sutter <phil@nwl.cc> wrote:
> > > > Require user space to set a flag upon flowtable or netdev-family chain
> > > > creation explicitly relaxing the hook registration when it comes to
> > > > non-existent interfaces. For the sake of simplicity, just restore error
> > > > condition if a given hook does not find an interface to bind to, leave
> > > > everyting else in place.
> > > 
> > > OK, but then this needs to go in via nf.git and:
> > > 
> > > Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
> > > 
> > > tag.  We shouldn't introduce a "error" -> "no error" -> "error" semantic
> > > change sequence in kernel releases, i.e. this change is urgent; its now
> > > (before 6.16 release) or never.
> > 
> > Oh, right. So a decision whether this is feasible and if, how it should
> > behave in detail, is urgent.
> 
> Downside is that this flag adds more complexity, since there will be
> two paths to test (flag on/off).

Indeed, but depending on the flag's scope we may get by with just a
shell test case to cover the "accept with no matching interface"
behaviour.

> > > > - A wildcard interface spec is accepted as long as at least a single
> > > >   interface matches.
> > > 
> > > Is there a reason for this? Why are they handled differently?
> > 
> > I wasn't sure if it's "required" to prevent it as well or not. This
> > patch was motivated by Pablo reporting users would not notice mis-typed
> > interface names anymore and asking for whether introducing a feature
> > flag for it was possible. So I went ahead to have something for a
> > discussion.
> >
> > Actually, wildcards are not handled differently: If user specifies
> > "eth123", kernel errors if no "eth123" exists and accepts otherwise. If
> > user specifies "eth*", kernel errors if no interface with that prefix
> > exists and accepts otherwise.
> > 
> > I don't know where to go with this. If the flag should turn interface
> > specs name-based, its absence should fully restore the old behaviour (as
> > you kindly summarized below). If it's just about the typo, this patch
> > might be fine.
> 
> Another concern is having a lot of devices, this is now interating
> linearly performing strcmp() to find matches from the control plane
> (ie. maybe this slow down time to load ruleset?), IIRC you mentioned
> this should not be an issue.

I did not measure a significant slowdown even with a wildcard interface
spec matching 10k interfaces, neither for flowtable or chain creation or
chain deletion. The only impacted action is flowtable deletion, I
suspect nf_flowtable_by_dev_remove() to cause the slowdown.

I don't think we need a feature flag for this though, users may simply
not use wildcard interface specs to avoid it.

> > > > - Dynamic unregistering and re-registering of vanishing/re-appearing
> > > >   interfaces is still happening.
> > > 
> > > You mean, without the flag? AFAIU old behaviour is:
> > > For netdev chains:
> > > - auto-removal AND free of device basechain -> no reappearance
> > > - -ENOENT error on chain add if device name doesn't exist
> > > For flowtable:
> > > - device is removed from the list (and list can become empty), flowtable
> > >   stays 100%, just the device name disappears from the devices list.
> > >   Doesn't reappear (auto re-added) either.
> > > - -ENOENT error on flowtable add if even one device doesn't exist
> > > 
> > > Neither netdev nor flowtable support "foo*" wildcards.
> > > 
> > > nf.git:
> > > - netdev basechain kept alive, no freeing, auto-reregister (becomes
> > >   active again if device with same name reappears).
> > >   No error if device name doesn't exists -> delayed auto-register
> > >   instead, including multi-reg for "foo*" case.
> > > - flowtable: same as old BUT device is auto-(re)added if same name
> > >   (re)appears.
> > > - No -ENOENT error on flowtable add, even if no single device existed
> > > 
> > > Full "foo*" support.
> > > 
> > > Now (this patch, without new flag):
> > > - netdev basechain: same as above.
> > >   But you do get an error if the device name did not exist.
> > >   Unless it was for "foo*", thats accepted even if no match is found.
> > 
> > No, this patch has the kernel error also if it doesn't find a match for
> > the wildcard. It merely asserts that the hook's ops_list is non-empty
> > after nft_netdev_hook_alloc() (which did the search for matching
> > interfaces) returns.
> >
> > >   AFAICS its a userspace/nft change, ie. the new flag is actually
> > >   provided silently in the "foo*" case?
> > > - flowtable: same as old BUT device is auto-(re)added if same name
> > >   (re)appears.
> > > - -ENOENT error on flowtable add if even one device doesn't exist
> > >   Except "foo*" case, then its ok even if no match found.
> > > 
> > > Maybe add a table that explains the old/new/wanted (this patch) behaviours?
> > > And an explanation/rationale for the new flag?
> > > 
> > > Is there a concern that users depend on old behaviour?
> > > If so, why are we only concerned about the "add" behaviour but not the
> > > auto-reregistering?
> > > 
> > > Is it to protect users from typos going unnoticed?
> > > I could imagine "wlp0s20f1" getting misspelled occasionally...
> > 
> > Yes, that was the premise upon which I wrote the patch. I didn't intend
> > to make the flag toggle between the old interface hooks and the new
> > interface name hooks.
> 
> Mistyped name is another scenario this flag could help.

It's the only one I had in mind.

> > > > Note that this flag is persistent, i.e. included in ruleset dumps. This
> > > > effectively makes it "updatable": User space may create a "name-based"
> > > > flowtable for a non-existent interface, then update the flowtable to
> > > > drop the flag. What should happen then? Right now this is simply
> > > > accepted, even though the flowtable still does not bind to an interface.
> > > 
> > > AFAIU:
> > > If we accept off -> On, the flowtable should bind.
> > > If we accept on -> off, then it looks we should continue to drop devices
> > > from the list but just stop auto-readding?
> > > 
> > > If in doubt the flag should not be updateable (hard error), in
> > > that case we can refine/relax later.
> > 
> > My statement above was probably a bit confusing: With non-persistent, I
> > meant for the flag to be recognized upon chain/flowtable creation but
> > not added to chain->flags or flowtable->data.flags.
> 
> If this flag is added, I won't allow for updates until such
> possibility is carefully review, having all possible tricky scenarios
> in mind.
> 
> I think it boils down to the extra complexity that this flag adds is
> worth having or documenting the new behaviour is sufficient, assuming
> the two issues that have been mentioned are not problematic.

The two issues being typos and performance issues? If so, performance
should not be an issue (merely flowtable deletion slows down, but that
may be something we might optimize away). Typos are something I
personally wouldn't care about as I find it similar to mis-typing an IP
address or RHS to an iifname match. If transparency of behaviour is a
concern, I'd rather implement GETDEV message type and enable user space
to print the list of currently bound interfaces (though it's partially
redundant, 'nft list hooks' helps there although it does not show which
flowtable/chain "owns" the hook).

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 12:09       ` Florian Westphal
@ 2025-07-03 12:37         ` Phil Sutter
  0 siblings, 0 replies; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 12:37 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

On Thu, Jul 03, 2025 at 02:09:36PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > Oh, right. So a decision whether this is feasible and if, how it should
> > > behave in detail, is urgent.
> > 
> > Downside is that this flag adds more complexity, since there will be
> > two paths to test (flag on/off).
> 
> Right.
> 
> > Another concern is having a lot of devices, this is now interating
> > linearly performing strcmp() to find matches from the control plane
> > (ie. maybe this slow down time to load ruleset?), IIRC you mentioned
> > this should not be an issue.
> 
> Can you suggest an alternative?
> 
> I see the following:
> 
> - revert to old behaviour (no search,
>   lookup-by-name), and introduce a new netlink attribute for the 'wildcard
>   name / full-search'.
>   Since thats a big change this requires a revert in nf.git and then a
>   followup change in nf-next to amend this.
> 
> - Only search if we have a wildcard.
>   It should be enough to check, from nft_netdev_hook_alloc, if hook->ifname
>   is null-terminated or not. If it is, lookup-by-name, else
>   for_each_netdev search.
> 
>   Thats assuming that the netlink attributes are encoded as
>   'eth\0' (4 bytes, no wildcard), vs 'eth' (3 bytes, wildcard).

Yes, that's how "wildcards" are implemented. (Hence why a simple
strncmp() is sufficient.)  When Pablo asked about it, I also realized I
could keep the lookup-by-name unless manual search was needed. I even
implemented it but surprisingly couldn't measure a difference. Quoting
myself from another mail here:

| > > Quick follow-up here: To test the above, I created many dummy NICs with
| > > same prefix and timed creation of a chain with matching wildcard hook
| > > spec. Failing to measure a significant delay, I increased the number of
| > > those NICs. Currently I have 10k matching and 10k non-matching NICs and
| > > still can't measure a slowdown creating that wildcard chain, even with
| > > 'nft monitor' running in another terminal which reports the added
| > > interfaces (that's code I haven't submitted yet).
| >
| > Are you sure you see no spike in perf record for nft_hook_alloc()?
| 
| I don't, flowtable creation is really fast (0.2s with 1k matching NICs),
| but deleting the same flowtable then takes about 60s. The perf report
| shows nf_flow_offload_xdp_setup() up high, I suspect the nested loops in
| nf_flowtable_by_dev_remove() to be the culprit.
| 
| Testing a netdev chain instead, both creation and deletion are almost
| instant (0.16s and 0.26s).

> > > Yes, that was the premise upon which I wrote the patch. I didn't intend
> > > to make the flag toggle between the old interface hooks and the new
> > > interface name hooks.
> > 
> > Mistyped name is another scenario this flag could help.
> 
> Regardless of this flag patch one could update nftables userspace to
> display hints like we do for sets with the new '# count 42' comment
> annotation.
> 
> Something that tells that the hook is subscribed for eth42 but currently
> not active.
> 
> Same with flowtables, something that tells which devices are configured
> (subscribed) and which devices are used (should likely still display
>  ppp* and not list 4000k ppp1234 :-) ).
> 
> Phil, whats your take here?

As suggested in my other reply, I could implement GETDEV request so
nftables may print either:

| flowtable f {
| 	hook ingress priority filter
| 	devices = { eth* (active: eth0, eth1), test1 (inactive) }
| }

or:

| flowtable f {
| 	hook ingress priority filter
| 	devices = { eth*, test1 } # active: eth0, eth1
| }

The NEWDEV/DELDEV notifications the kernel currently emits will be
printed by 'nft monitor'. This is still useful despite the above to
notify user space if a device is bound/unbound at run-time.

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 12:25       ` Phil Sutter
@ 2025-07-03 12:39         ` Florian Westphal
  2025-07-03 12:47           ` Phil Sutter
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Westphal @ 2025-07-03 12:39 UTC (permalink / raw)
  To: Phil Sutter, Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> personally wouldn't care about as I find it similar to mis-typing an IP
> address or RHS to an iifname match.

Good point.  I think if performance isn't an issue then we can go ahead
without this flag.

> If transparency of behaviour is a
> concern, I'd rather implement GETDEV message type and enable user space
> to print the list of currently bound interfaces (though it's partially
> redundant, 'nft list hooks' helps there although it does not show which
> flowtable/chain "owns" the hook).

Do we need new query types for this?
nftables could just query via rtnetlink if the device exists or not
and then print a hint if its absent.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 12:39         ` Florian Westphal
@ 2025-07-03 12:47           ` Phil Sutter
  2025-07-03 12:54             ` Florian Westphal
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 12:47 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

On Thu, Jul 03, 2025 at 02:39:47PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > personally wouldn't care about as I find it similar to mis-typing an IP
> > address or RHS to an iifname match.
> 
> Good point.  I think if performance isn't an issue then we can go ahead
> without this flag.
> 
> > If transparency of behaviour is a
> > concern, I'd rather implement GETDEV message type and enable user space
> > to print the list of currently bound interfaces (though it's partially
> > redundant, 'nft list hooks' helps there although it does not show which
> > flowtable/chain "owns" the hook).
> 
> Do we need new query types for this?
> nftables could just query via rtnetlink if the device exists or not
> and then print a hint if its absent.

Hey, that's a hack! :P
Under normal circumstances, this should indeed suffice. The ruleset is
per-netns, so the kernel's view matches nft's. The only downside I see
is that we would not detect kernel bugs this way, e.g. if a new device
slipped through and was not bound. Debatable if the GETDEV extra effort
is justified for this "should not happen" situation, though.

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 12:47           ` Phil Sutter
@ 2025-07-03 12:54             ` Florian Westphal
  2025-07-03 13:17               ` Phil Sutter
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Westphal @ 2025-07-03 12:54 UTC (permalink / raw)
  To: Phil Sutter, Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> > Do we need new query types for this?
> > nftables could just query via rtnetlink if the device exists or not
> > and then print a hint if its absent.
> 
> Hey, that's a hack! :P
> Under normal circumstances, this should indeed suffice. The ruleset is
> per-netns, so the kernel's view matches nft's. The only downside I see
> is that we would not detect kernel bugs this way, e.g. if a new device
> slipped through and was not bound. Debatable if the GETDEV extra effort
> is justified for this "should not happen" situation, though.

Could the info be included in the dump? For this we'd only need a
'is_empty()' result.  For things like eth*, nft list hooks might be
good enough to spot bugs (e.g., you have 'eth*' subscription, but
eth0 is registed but eth1 isn't but it should be.

In any case I think that can be added later.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 12:54             ` Florian Westphal
@ 2025-07-03 13:17               ` Phil Sutter
  2025-07-03 14:19                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 13:17 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

On Thu, Jul 03, 2025 at 02:54:36PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > > Do we need new query types for this?
> > > nftables could just query via rtnetlink if the device exists or not
> > > and then print a hint if its absent.
> > 
> > Hey, that's a hack! :P
> > Under normal circumstances, this should indeed suffice. The ruleset is
> > per-netns, so the kernel's view matches nft's. The only downside I see
> > is that we would not detect kernel bugs this way, e.g. if a new device
> > slipped through and was not bound. Debatable if the GETDEV extra effort
> > is justified for this "should not happen" situation, though.
> 
> Could the info be included in the dump? For this we'd only need a
> 'is_empty()' result.  For things like eth*, nft list hooks might be
> good enough to spot bugs (e.g., you have 'eth*' subscription, but
> eth0 is registed but eth1 isn't but it should be.

That may indeed be a simple solution avoiding to bloat
NEWFLOWTABLE/NEWCHAIN messages.

> In any case I think that can be added later.

Right now, NFTA_FLOWTABLE_HOOK_DEVS is just an array of NFTA_DEVICE_NAME
attributes. Guess the easiest way would be to introduce
NFTA_FLOWTABLE_HOOKLESS_DEVS array of NFTA_DEVICE_NAME attributes, old
user space would just ignore that second array.

Pablo, WDYT? Feasible alternative to the feature flag?

Thanks, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 13:17               ` Phil Sutter
@ 2025-07-03 14:19                 ` Pablo Neira Ayuso
  2025-07-03 14:33                   ` Phil Sutter
  0 siblings, 1 reply; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-03 14:19 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

On Thu, Jul 03, 2025 at 03:17:06PM +0200, Phil Sutter wrote:
> On Thu, Jul 03, 2025 at 02:54:36PM +0200, Florian Westphal wrote:
> > Phil Sutter <phil@nwl.cc> wrote:
> > > > Do we need new query types for this?
> > > > nftables could just query via rtnetlink if the device exists or not
> > > > and then print a hint if its absent.
> > > 
> > > Hey, that's a hack! :P
> > > Under normal circumstances, this should indeed suffice. The ruleset is
> > > per-netns, so the kernel's view matches nft's. The only downside I see
> > > is that we would not detect kernel bugs this way, e.g. if a new device
> > > slipped through and was not bound. Debatable if the GETDEV extra effort
> > > is justified for this "should not happen" situation, though.
> > 
> > Could the info be included in the dump? For this we'd only need a
> > 'is_empty()' result.  For things like eth*, nft list hooks might be
> > good enough to spot bugs (e.g., you have 'eth*' subscription, but
> > eth0 is registed but eth1 isn't but it should be.
> 
> That may indeed be a simple solution avoiding to bloat
> NEWFLOWTABLE/NEWCHAIN messages.
> 
> > In any case I think that can be added later.
> 
> Right now, NFTA_FLOWTABLE_HOOK_DEVS is just an array of NFTA_DEVICE_NAME
> attributes. Guess the easiest way would be to introduce
> NFTA_FLOWTABLE_HOOKLESS_DEVS array of NFTA_DEVICE_NAME attributes, old
> user space would just ignore that second array.

That is, new nftables binaries use NFTA_FLOWTABLE_HOOKLESS_DEVS.

> Pablo, WDYT? Feasible alternative to the feature flag?

If my understanding is correct, I think this approach will break new
nft binary with old kernel.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 14:19                 ` Pablo Neira Ayuso
@ 2025-07-03 14:33                   ` Phil Sutter
  2025-07-03 21:32                     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-03 14:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

On Thu, Jul 03, 2025 at 04:19:20PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Jul 03, 2025 at 03:17:06PM +0200, Phil Sutter wrote:
> > On Thu, Jul 03, 2025 at 02:54:36PM +0200, Florian Westphal wrote:
> > > Phil Sutter <phil@nwl.cc> wrote:
> > > > > Do we need new query types for this?
> > > > > nftables could just query via rtnetlink if the device exists or not
> > > > > and then print a hint if its absent.
> > > > 
> > > > Hey, that's a hack! :P
> > > > Under normal circumstances, this should indeed suffice. The ruleset is
> > > > per-netns, so the kernel's view matches nft's. The only downside I see
> > > > is that we would not detect kernel bugs this way, e.g. if a new device
> > > > slipped through and was not bound. Debatable if the GETDEV extra effort
> > > > is justified for this "should not happen" situation, though.
> > > 
> > > Could the info be included in the dump? For this we'd only need a
> > > 'is_empty()' result.  For things like eth*, nft list hooks might be
> > > good enough to spot bugs (e.g., you have 'eth*' subscription, but
> > > eth0 is registed but eth1 isn't but it should be.
> > 
> > That may indeed be a simple solution avoiding to bloat
> > NEWFLOWTABLE/NEWCHAIN messages.
> > 
> > > In any case I think that can be added later.
> > 
> > Right now, NFTA_FLOWTABLE_HOOK_DEVS is just an array of NFTA_DEVICE_NAME
> > attributes. Guess the easiest way would be to introduce
> > NFTA_FLOWTABLE_HOOKLESS_DEVS array of NFTA_DEVICE_NAME attributes, old
> > user space would just ignore that second array.
> 
> That is, new nftables binaries use NFTA_FLOWTABLE_HOOKLESS_DEVS.

If present, they will use that attribute to indicate that a given device
spec does not match any existing devices.

> > Pablo, WDYT? Feasible alternative to the feature flag?
> 
> If my understanding is correct, I think this approach will break new
> nft binary with old kernel.

If not present (old kernel), new nft would assume all device specs
matched an interface. I would implement this as comment, something like:
"# not bound: ethX, wlan*" which would be missing with old kernels.

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 14:33                   ` Phil Sutter
@ 2025-07-03 21:32                     ` Pablo Neira Ayuso
  2025-07-04 12:41                       ` Phil Sutter
  0 siblings, 1 reply; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-03 21:32 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

On Thu, Jul 03, 2025 at 04:33:49PM +0200, Phil Sutter wrote:
[...]
> > > Pablo, WDYT? Feasible alternative to the feature flag?

That is more simple, it puts a bit more pressure on netlink_dump().
Worst case fully duplicates the information, I think we discussed this
already.

If my code reading is correct, maximum size of the area to add netlink
messages in the dump path is SKB_WITH_OVERHEAD(32768) according to
netlink_recvmsg(), there is a fallback to NLMSG_GOODSIZE when 3-order
allocation fails.

In the event path, memory budget is already used up since
NLMSG_GOODSIZE (4096) even before this proposal because

256 devices * IFNAMSIZ = 4096

This is beyond the limit.

This can be fixed by splitting the report in two events of NEWCHAIN,
but if there is another large attribute reaching the worst case, the
splitting will get more complicated.

With your new attribute, worst case scenario means duplicating _DEVS
with the same devices.

There is a memory budget for the netlink message, and the ADDCHAIN
netlink message with the netdev family is already pushing it to the
limit when *I rised* the maximum to 256 devices per request of a user.
I can post a patch to reduce it to 128 devices now that your device
wildcard feature is available.

I regret _DEVS attribute only includes DEV_NAME, it should have been
possible to add a flag and provide a more efficient representation
than your proposal.

A possible fugly hack would be to stuffed this information after \0 in
DEV_NAME, but that would feel like the iptables revision
infrastructure, better not to go that way.

Maybe all this is not worth to worry and we can assume in 2025 that
when 3-order allocation fails netlink dump will simply fail? Probably
this already is right for other existing netlink subsystems.

And this effort is to provide a way for the user to know that the
device that has been specified has actually registered a hook so the
chain will see traffic.

So far we only have events via NEWDEV to report new hooks. Maybe
GETDEV to consult the hooks that are attached to what chains is
needed? That would solve this usability issue.

But that is _a lot more work_, stuffing more information into the
ADDCHAIN netlink message is easier. GETDEV means more netlink boiler
plate code to avoid this simple extra attribute you propose.

GETDEV would be paired with NEWDEV events to determine which device
the base chain is hooked to.

Maybe it is not for users to know that that dummy* is matching
_something_ but also they want to know what device is matching such
pattern for debugging purpose.

It boils down to "should we care to provide facility to allow for more
instrospection in this box"?

If the answer is "no, what we have is sufficient" then let's not worry
about mistypes. GETDEV facility would be rarely used, then skip.

If you want to complete this picture, then add GETDEV, because NEWDEV
events without GETDEV command looks incomplete.

> > If my understanding is correct, I think this approach will break new
> > nft binary with old kernel.
> 
> If not present (old kernel), new nft would assume all device specs
> matched an interface. I would implement this as comment, something like:
> "# not bound: ethX, wlan*" which would be missing with old kernels.

If after all this, you decide to go for this approach with the new
attribute into ADDCHAIN, maybe a more compact representation,
add ? notation:

table ip x {
        chain y {
                type filter hook ingress devices = { "eth*"?, "vlan0"?, "wan1" } priority 0;
        }
}

This notation can be removed if -s/--stateless is used and ignored if
ruleset is loaded and it is more compact.

It feels like we are trying to stuff too much information in the
existing output, and my ? notation is just trying to find a "smart"
way to make thing not look bloated. Then, GETDEV comes again and this
silly notation is really not needed if a more complete view is
provided.

BTW, note there is one inconsistency that need to be addressed in the
listing, currently 'devices' does not enclose the names in quotes
while 'device' does it.

I mean, with device:

table netdev x {
        chain y {
                type filter hook ingress device "dummy0" priority filter; policy accept;
        }
}

with devices:

table netdev x {
        chain y {
                type filter hook ingress devices = { dummy0, dummy1 } priority filter; policy accept;
        }
}

It would be great to fix this.

P.S: This still leaves room to discuss if comments are the best way to
go to display handle and set count, but we can start a new thread to
discuss this.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-03 21:32                     ` Pablo Neira Ayuso
@ 2025-07-04 12:41                       ` Phil Sutter
  2025-07-04 14:04                         ` Florian Westphal
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-04 12:41 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Hi Pablo,

On Thu, Jul 03, 2025 at 11:32:26PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Jul 03, 2025 at 04:33:49PM +0200, Phil Sutter wrote:
> [...]
> > > > Pablo, WDYT? Feasible alternative to the feature flag?
> 
> That is more simple, it puts a bit more pressure on netlink_dump().
> Worst case fully duplicates the information, I think we discussed this
> already.
> 
> If my code reading is correct, maximum size of the area to add netlink
> messages in the dump path is SKB_WITH_OVERHEAD(32768) according to
> netlink_recvmsg(), there is a fallback to NLMSG_GOODSIZE when 3-order
> allocation fails.
> 
> In the event path, memory budget is already used up since
> NLMSG_GOODSIZE (4096) even before this proposal because
> 
> 256 devices * IFNAMSIZ = 4096
> 
> This is beyond the limit.
> 
> This can be fixed by splitting the report in two events of NEWCHAIN,
> but if there is another large attribute reaching the worst case, the
> splitting will get more complicated.
> 
> With your new attribute, worst case scenario means duplicating _DEVS
> with the same devices.
> 
> There is a memory budget for the netlink message, and the ADDCHAIN
> netlink message with the netdev family is already pushing it to the
> limit when *I rised* the maximum to 256 devices per request of a user.
> I can post a patch to reduce it to 128 devices now that your device
> wildcard feature is available.

While 128 should suffice for all practical cases (I hope), we could also
count hooks (empty ones twice) in _fill routines and omit the
HOOKLESS_DEVS attribute if the number exceeds 256.

> I regret _DEVS attribute only includes DEV_NAME, it should have been
> possible to add a flag and provide a more efficient representation
> than your proposal.

Yes, these kinds of short-cuts tend to kill the flexibility of the netlink
format. We could introduce _HOOK_DEVS_NEW but just like with _HOOK_DEV,
we can't get rid of _HOOK_DEVS in exchange.

> A possible fugly hack would be to stuffed this information after \0 in
> DEV_NAME, but that would feel like the iptables revision
> infrastructure, better not to go that way.

ACK.

> Maybe all this is not worth to worry and we can assume in 2025 that
> when 3-order allocation fails netlink dump will simply fail? Probably
> this already is right for other existing netlink subsystems.
> 
> And this effort is to provide a way for the user to know that the
> device that has been specified has actually registered a hook so the
> chain will see traffic.

Please keep in mind we already have 'nft list hooks' which provides
hints in that direction. It does not show which flowtable/chain actually
binds to a given device, though.

> So far we only have events via NEWDEV to report new hooks. Maybe
> GETDEV to consult the hooks that are attached to what chains is
> needed? That would solve this usability issue.
> 
> But that is _a lot more work_, stuffing more information into the
> ADDCHAIN netlink message is easier. GETDEV means more netlink boiler
> plate code to avoid this simple extra attribute you propose.
> 
> GETDEV would be paired with NEWDEV events to determine which device
> the base chain is hooked to.

Yes, it is definitely more work than the HOOKLESS_DEVS extra attribute,
but both user and kernel space would reuse code from NEWDEV event
support for the new request.

OTOH using it instead of HOOKLESS_DEVS means one more round-trip for
each flowtable and netdev chain being cached in user space.

> Maybe it is not for users to know that that dummy* is matching
> _something_ but also they want to know what device is matching such
> pattern for debugging purpose.

There is^Wwill be also 'nft monitor' helping with that.

> It boils down to "should we care to provide facility to allow for more
> instrospection in this box"?
> 
> If the answer is "no, what we have is sufficient" then let's not worry
> about mistypes. GETDEV facility would be rarely used, then skip.
> 
> If you want to complete this picture, then add GETDEV, because NEWDEV
> events without GETDEV command looks incomplete.

So you're suggesting to either implement GETDEV now or maybe later but
not the HOOKLESS_DEVS attribute for it being redundant wrt. GETDEV?

> > > If my understanding is correct, I think this approach will break new
> > > nft binary with old kernel.
> > 
> > If not present (old kernel), new nft would assume all device specs
> > matched an interface. I would implement this as comment, something like:
> > "# not bound: ethX, wlan*" which would be missing with old kernels.
> 
> If after all this, you decide to go for this approach with the new
> attribute into ADDCHAIN, maybe a more compact representation,
> add ? notation:
> 
> table ip x {
>         chain y {
>                 type filter hook ingress devices = { "eth*"?, "vlan0"?, "wan1" } priority 0;
>         }
> }
> 
> This notation can be removed if -s/--stateless is used and ignored if
> ruleset is loaded and it is more compact.
> 
> It feels like we are trying to stuff too much information in the
> existing output, and my ? notation is just trying to find a "smart"
> way to make thing not look bloated. Then, GETDEV comes again and this
> silly notation is really not needed if a more complete view is
> provided.

I prefer the comment format simply because it is easier on the parsers:
The one in nft is already quite convoluted, anyone trying to parse nft
output (yes, there's JSON for that but anyway) will likely expect
comments already.

On top of that, we don't break syntax for older binaries.

> BTW, note there is one inconsistency that need to be addressed in the
> listing, currently 'devices' does not enclose the names in quotes
> while 'device' does it.
> 
> I mean, with device:
> 
> table netdev x {
>         chain y {
>                 type filter hook ingress device "dummy0" priority filter; policy accept;
>         }
> }
> 
> with devices:
> 
> table netdev x {
>         chain y {
>                 type filter hook ingress devices = { dummy0, dummy1 } priority filter; policy accept;
>         }
> }
> 
> It would be great to fix this.

We could start by accepting quoted strings there. To not cause too much
fuss, quotes could be limited to wildcard interface names on output (at
least for now).

> P.S: This still leaves room to discuss if comments are the best way to
> go to display handle and set count, but we can start a new thread to
> discuss this.

My maxim would be to use comments for data which is output only, i.e.
useless on input.

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-04 12:41                       ` Phil Sutter
@ 2025-07-04 14:04                         ` Florian Westphal
  2025-07-04 15:33                           ` Phil Sutter
  2025-07-07 19:25                           ` Pablo Neira Ayuso
  0 siblings, 2 replies; 29+ messages in thread
From: Florian Westphal @ 2025-07-04 14:04 UTC (permalink / raw)
  To: Phil Sutter, Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> Please keep in mind we already have 'nft list hooks' which provides
> hints in that direction. It does not show which flowtable/chain actually
> binds to a given device, though.

Its possible to extend it:
- add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
- add

static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
                                   const struct nfnl_dump_hook_data *ctx,
                                   unsigned int seq,
                                   struct nf_flowtable *ft)

to nfnetlink_hook.c

it can use container_of to get to the nft_flowtable struct.
It might be possibe to share some code with nfnl_hook_put_nft_chain_info
and reuse some of the same netlink attributes.

- call it from nfnl_hook_dump_one.

I think it would use useful to have, independent of "eth*" support.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-04 14:04                         ` Florian Westphal
@ 2025-07-04 15:33                           ` Phil Sutter
  2025-07-07 19:25                           ` Pablo Neira Ayuso
  1 sibling, 0 replies; 29+ messages in thread
From: Phil Sutter @ 2025-07-04 15:33 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

Florian!

On Fri, Jul 04, 2025 at 04:04:39PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > Please keep in mind we already have 'nft list hooks' which provides
> > hints in that direction. It does not show which flowtable/chain actually
> > binds to a given device, though.
> 
> Its possible to extend it:
> - add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
> - add
> 
> static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
>                                    const struct nfnl_dump_hook_data *ctx,
>                                    unsigned int seq,
>                                    struct nf_flowtable *ft)
> 
> to nfnetlink_hook.c
> 
> it can use container_of to get to the nft_flowtable struct.
> It might be possibe to share some code with nfnl_hook_put_nft_chain_info
> and reuse some of the same netlink attributes.
> 
> - call it from nfnl_hook_dump_one.
> 
> I think it would use useful to have, independent of "eth*" support.

I entirely missed the fact that 'list hooks' output sucks with
flowtables only and is fine with chains! Thanks for the quick howto,
I'll implement this next week.

Thanks, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-04 14:04                         ` Florian Westphal
  2025-07-04 15:33                           ` Phil Sutter
@ 2025-07-07 19:25                           ` Pablo Neira Ayuso
  2025-07-08 14:38                             ` Phil Sutter
  1 sibling, 1 reply; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-07 19:25 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel

On Fri, Jul 04, 2025 at 04:04:39PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > Please keep in mind we already have 'nft list hooks' which provides
> > hints in that direction. It does not show which flowtable/chain actually
> > binds to a given device, though.
> 
> Its possible to extend it:
> - add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
> - add
> 
> static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
>                                    const struct nfnl_dump_hook_data *ctx,
>                                    unsigned int seq,
>                                    struct nf_flowtable *ft)
> 
> to nfnetlink_hook.c
> 
> it can use container_of to get to the nft_flowtable struct.
> It might be possibe to share some code with nfnl_hook_put_nft_chain_info
> and reuse some of the same netlink attributes.
> 
> - call it from nfnl_hook_dump_one.
> 
> I think it would use useful to have, independent of "eth*" support.

This is a good idea to place this in nfnetlink_hook, that
infrastructure is for debugging purpose indeed.

If this update is made, I also think it makes sense to remove the
netlink event notification code for devices, I don't have a use case
for that code in the new device group other than debugging.

If Phil's intention is to make code savings, then extending
nfnetlink_hook and removing the existing device notification group
make sense to me.

User can simply resort to check via dump if a matching hook is
registered for eth* in nfnetlink_hook.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-07 19:25                           ` Pablo Neira Ayuso
@ 2025-07-08 14:38                             ` Phil Sutter
  2025-07-09 22:43                               ` Pablo Neira Ayuso
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-08 14:38 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Hi Pablo,

On Mon, Jul 07, 2025 at 09:25:50PM +0200, Pablo Neira Ayuso wrote:
> On Fri, Jul 04, 2025 at 04:04:39PM +0200, Florian Westphal wrote:
> > Phil Sutter <phil@nwl.cc> wrote:
> > > Please keep in mind we already have 'nft list hooks' which provides
> > > hints in that direction. It does not show which flowtable/chain actually
> > > binds to a given device, though.
> > 
> > Its possible to extend it:
> > - add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
> > - add
> > 
> > static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
> >                                    const struct nfnl_dump_hook_data *ctx,
> >                                    unsigned int seq,
> >                                    struct nf_flowtable *ft)
> > 
> > to nfnetlink_hook.c
> > 
> > it can use container_of to get to the nft_flowtable struct.
> > It might be possibe to share some code with nfnl_hook_put_nft_chain_info
> > and reuse some of the same netlink attributes.
> > 
> > - call it from nfnl_hook_dump_one.
> > 
> > I think it would use useful to have, independent of "eth*" support.
> 
> This is a good idea to place this in nfnetlink_hook, that
> infrastructure is for debugging purpose indeed.
> 
> If this update is made, I also think it makes sense to remove the
> netlink event notification code for devices, I don't have a use case
> for that code in the new device group other than debugging.
> 
> If Phil's intention is to make code savings, then extending
> nfnetlink_hook and removing the existing device notification group
> make sense to me.
> 
> User can simply resort to check via dump if a matching hook is
> registered for eth* in nfnetlink_hook.

What is the downside of having it? Are you concerned about the need to
maintain it or something else (as well)?

I had extended the monitor testsuite to assert correct behaviour wrt.
adding/removing devices. Implementing this is in a shell test is
trivial, but still work to be done. :)

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-08 14:38                             ` Phil Sutter
@ 2025-07-09 22:43                               ` Pablo Neira Ayuso
  2025-07-10 13:55                                 ` Phil Sutter
  2025-07-11 12:19                                 ` Phil Sutter
  0 siblings, 2 replies; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-09 22:43 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

Hi Phil,

On Tue, Jul 08, 2025 at 04:38:44PM +0200, Phil Sutter wrote:
> Hi Pablo,
> 
> On Mon, Jul 07, 2025 at 09:25:50PM +0200, Pablo Neira Ayuso wrote:
> > On Fri, Jul 04, 2025 at 04:04:39PM +0200, Florian Westphal wrote:
> > > Phil Sutter <phil@nwl.cc> wrote:
> > > > Please keep in mind we already have 'nft list hooks' which provides
> > > > hints in that direction. It does not show which flowtable/chain actually
> > > > binds to a given device, though.
> > > 
> > > Its possible to extend it:
> > > - add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
> > > - add
> > > 
> > > static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
> > >                                    const struct nfnl_dump_hook_data *ctx,
> > >                                    unsigned int seq,
> > >                                    struct nf_flowtable *ft)
> > > 
> > > to nfnetlink_hook.c
> > > 
> > > it can use container_of to get to the nft_flowtable struct.
> > > It might be possibe to share some code with nfnl_hook_put_nft_chain_info
> > > and reuse some of the same netlink attributes.
> > > 
> > > - call it from nfnl_hook_dump_one.
> > > 
> > > I think it would use useful to have, independent of "eth*" support.
> > 
> > This is a good idea to place this in nfnetlink_hook, that
> > infrastructure is for debugging purpose indeed.
> > 
> > If this update is made, I also think it makes sense to remove the
> > netlink event notification code for devices, I don't have a use case
> > for that code in the new device group other than debugging.
> > 
> > If Phil's intention is to make code savings, then extending
> > nfnetlink_hook and removing the existing device notification group
> > make sense to me.
> > 
> > User can simply resort to check via dump if a matching hook is
> > registered for eth* in nfnetlink_hook.
> 
> What is the downside of having it? Are you concerned about the need to
> maintain it or something else (as well)?

I was considering that nfnetlink_hook is a better fit for this
purpose, these event notifications that report new devices could come
from net/netfilter/core.c instead. That is, nf_register_net_hook() and
nf_tables_unregister_hook().

You also mentioned you originally used this syntax:

        nft monitor hooks

which, after Florian's suggestion, made me think all this belongs to
nfnetlink_hook.

This would avoid an asymmetry in the API. At this moment, new device
hooks are reported via nftables, but listing will be retrieved via
nfnetlink_hook.

This would also provide a generic infrastructure to report hook
registration and unregistration, as a side effect.

If you accept this suggestion, it is a matter of:

#1 revert the patch in nf.git for the incomplete event notification
   (you have three more patches pending for nf-next to complete this
    for control plane notifications).
#2 add event notifications to net/netfilter/core.c and nfnetlink_hook.

Only -rc kernels have been release containing the incomplete device
event notification. It is a bit late to revert to be honest, but
better late than never. This infrastructure is triggering more debate
than expected.

And that would be more work on your pile to respin, which is always a
hard sell.

> I had extended the monitor testsuite to assert correct behaviour wrt.
> adding/removing devices. Implementing this is in a shell test is
> trivial, but still work to be done. :)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-09 22:43                               ` Pablo Neira Ayuso
@ 2025-07-10 13:55                                 ` Phil Sutter
  2025-07-11 12:19                                 ` Phil Sutter
  1 sibling, 0 replies; 29+ messages in thread
From: Phil Sutter @ 2025-07-10 13:55 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Hi Pablo,

On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> On Tue, Jul 08, 2025 at 04:38:44PM +0200, Phil Sutter wrote:
> > On Mon, Jul 07, 2025 at 09:25:50PM +0200, Pablo Neira Ayuso wrote:
> > > On Fri, Jul 04, 2025 at 04:04:39PM +0200, Florian Westphal wrote:
> > > > Phil Sutter <phil@nwl.cc> wrote:
> > > > > Please keep in mind we already have 'nft list hooks' which provides
> > > > > hints in that direction. It does not show which flowtable/chain actually
> > > > > binds to a given device, though.
> > > > 
> > > > Its possible to extend it:
> > > > - add NF_HOOK_OP_NFT_FT to enum nf_hook_ops_type
> > > > - add
> > > > 
> > > > static int nfnl_hook_put_nft_ft_info(struct sk_buff *nlskb,
> > > >                                    const struct nfnl_dump_hook_data *ctx,
> > > >                                    unsigned int seq,
> > > >                                    struct nf_flowtable *ft)
> > > > 
> > > > to nfnetlink_hook.c
> > > > 
> > > > it can use container_of to get to the nft_flowtable struct.
> > > > It might be possibe to share some code with nfnl_hook_put_nft_chain_info
> > > > and reuse some of the same netlink attributes.
> > > > 
> > > > - call it from nfnl_hook_dump_one.
> > > > 
> > > > I think it would use useful to have, independent of "eth*" support.
> > > 
> > > This is a good idea to place this in nfnetlink_hook, that
> > > infrastructure is for debugging purpose indeed.
> > > 
> > > If this update is made, I also think it makes sense to remove the
> > > netlink event notification code for devices, I don't have a use case
> > > for that code in the new device group other than debugging.
> > > 
> > > If Phil's intention is to make code savings, then extending
> > > nfnetlink_hook and removing the existing device notification group
> > > make sense to me.
> > > 
> > > User can simply resort to check via dump if a matching hook is
> > > registered for eth* in nfnetlink_hook.
> > 
> > What is the downside of having it? Are you concerned about the need to
> > maintain it or something else (as well)?
> 
> I was considering that nfnetlink_hook is a better fit for this
> purpose, these event notifications that report new devices could come
> from net/netfilter/core.c instead. That is, nf_register_net_hook() and
> nf_tables_unregister_hook().
> 
> You also mentioned you originally used this syntax:
> 
>         nft monitor hooks
> 
> which, after Florian's suggestion, made me think all this belongs to
> nfnetlink_hook.
> 
> This would avoid an asymmetry in the API. At this moment, new device
> hooks are reported via nftables, but listing will be retrieved via
> nfnetlink_hook.

Ah, I see.

> This would also provide a generic infrastructure to report hook
> registration and unregistration, as a side effect.

OK, fair with me!

> If you accept this suggestion, it is a matter of:
> 
> #1 revert the patch in nf.git for the incomplete event notification
>    (you have three more patches pending for nf-next to complete this
>     for control plane notifications).
> #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> 
> Only -rc kernels have been release containing the incomplete device
> event notification. It is a bit late to revert to be honest, but
> better late than never. This infrastructure is triggering more debate
> than expected.
> 
> And that would be more work on your pile to respin, which is always a
> hard sell.

No problem. I'll quickly submit a revert for nf.git and attempt an
implementation in nfnetlink or core code "later" - I assume the
flowtable support in 'nft list hooks' output is fine to satisfy the
traceability requirement for name-based hooks and so we're good to go
with the user space implementation?

> > I had extended the monitor testsuite to assert correct behaviour wrt.
> > adding/removing devices. Implementing this is in a shell test is
> > trivial, but still work to be done. :)

I'll then drop the 'nft monitor' addon for now and write a shell test
using 'nft list hooks' to validate correct behaviour instead.

Thanks, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-09 22:43                               ` Pablo Neira Ayuso
  2025-07-10 13:55                                 ` Phil Sutter
@ 2025-07-11 12:19                                 ` Phil Sutter
  2025-07-11 13:16                                   ` Florian Westphal
  2025-07-11 14:52                                   ` Pablo Neira Ayuso
  1 sibling, 2 replies; 29+ messages in thread
From: Phil Sutter @ 2025-07-11 12:19 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo,

On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
[...]
> If you accept this suggestion, it is a matter of:
> 
> #1 revert the patch in nf.git for the incomplete event notification
>    (you have three more patches pending for nf-next to complete this
>     for control plane notifications).
> #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.

Since Florian wondered whether I am wasting my time with a quick attempt
at #2, could you please confirm/deny whether this is a requirement for
the default to name-based interface hooks or does the 'list hooks'
extension satisfy the need for user space traceability?

Thanks, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 12:19                                 ` Phil Sutter
@ 2025-07-11 13:16                                   ` Florian Westphal
  2025-07-11 13:43                                     ` Phil Sutter
  2025-07-11 14:52                                   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 29+ messages in thread
From: Florian Westphal @ 2025-07-11 13:16 UTC (permalink / raw)
  To: Phil Sutter, Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> [...]
> > If you accept this suggestion, it is a matter of:
> > 
> > #1 revert the patch in nf.git for the incomplete event notification
> >    (you have three more patches pending for nf-next to complete this
> >     for control plane notifications).
> > #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> 
> Since Florian wondered whether I am wasting my time with a quick attempt
> at #2, could you please confirm/deny whether this is a requirement for
> the default to name-based interface hooks or does the 'list hooks'
> extension satisfy the need for user space traceability?

My main point is that rtnetlink has a notifier for new/removed links,
see 'ip monitor.'.

So even if we want a 'nft hooks monitor', I don't see why kernel changes
are needed for it: subscribe to rtnetlink, then for a new link dump
the interfaces hooks *should* work.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 13:16                                   ` Florian Westphal
@ 2025-07-11 13:43                                     ` Phil Sutter
  2025-07-11 13:48                                       ` Florian Westphal
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-11 13:43 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel

On Fri, Jul 11, 2025 at 03:16:39PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> > [...]
> > > If you accept this suggestion, it is a matter of:
> > > 
> > > #1 revert the patch in nf.git for the incomplete event notification
> > >    (you have three more patches pending for nf-next to complete this
> > >     for control plane notifications).
> > > #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> > 
> > Since Florian wondered whether I am wasting my time with a quick attempt
> > at #2, could you please confirm/deny whether this is a requirement for
> > the default to name-based interface hooks or does the 'list hooks'
> > extension satisfy the need for user space traceability?
> 
> My main point is that rtnetlink has a notifier for new/removed links,
> see 'ip monitor.'.
> 
> So even if we want a 'nft hooks monitor', I don't see why kernel changes
> are needed for it: subscribe to rtnetlink, then for a new link dump
> the interfaces hooks *should* work.

Oh, I didn't get that. Does it work with removed interfaces? 'nft
monitor' will notice, but fetching hooks for the removed interface won't
return anything then, right?

Cheers, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 13:43                                     ` Phil Sutter
@ 2025-07-11 13:48                                       ` Florian Westphal
  0 siblings, 0 replies; 29+ messages in thread
From: Florian Westphal @ 2025-07-11 13:48 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Pablo Neira Ayuso, netfilter-devel

Phil Sutter <phil@nwl.cc> wrote:
> Oh, I didn't get that. Does it work with removed interfaces? 'nft
> monitor' will notice, but fetching hooks for the removed interface won't
> return anything then, right?

Yes, it won't find the interface since its already gone.
One solution is to remember which interfaces had hooks and then just
inform the user that the interface is going away.

If a kernel implementation is small enough, then fine but I don't
really see why its needed.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 12:19                                 ` Phil Sutter
  2025-07-11 13:16                                   ` Florian Westphal
@ 2025-07-11 14:52                                   ` Pablo Neira Ayuso
  2025-07-11 16:39                                     ` Phil Sutter
  1 sibling, 1 reply; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-11 14:52 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

On Fri, Jul 11, 2025 at 02:19:04PM +0200, Phil Sutter wrote:
> Pablo,
> 
> On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> [...]
> > If you accept this suggestion, it is a matter of:
> > 
> > #1 revert the patch in nf.git for the incomplete event notification
> >    (you have three more patches pending for nf-next to complete this
> >     for control plane notifications).
> > #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> 
> Since Florian wondered whether I am wasting my time with a quick attempt
> at #2, could you please confirm/deny whether this is a requirement for
> the default to name-based interface hooks or does the 'list hooks'
> extension satisfy the need for user space traceability?

For me, listing is just fine for debugging.

If there is a need to track hook updates via events, then
nfnetlink_hook can be extended later.

So I am not asking for this, I thought you needed both listing and
events, that is why I suggest to add events to nfnetlink_hook.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 14:52                                   ` Pablo Neira Ayuso
@ 2025-07-11 16:39                                     ` Phil Sutter
  2025-07-14 14:02                                       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 29+ messages in thread
From: Phil Sutter @ 2025-07-11 16:39 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

On Fri, Jul 11, 2025 at 04:52:55PM +0200, Pablo Neira Ayuso wrote:
> On Fri, Jul 11, 2025 at 02:19:04PM +0200, Phil Sutter wrote:
> > Pablo,
> > 
> > On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> > [...]
> > > If you accept this suggestion, it is a matter of:
> > > 
> > > #1 revert the patch in nf.git for the incomplete event notification
> > >    (you have three more patches pending for nf-next to complete this
> > >     for control plane notifications).
> > > #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> > 
> > Since Florian wondered whether I am wasting my time with a quick attempt
> > at #2, could you please confirm/deny whether this is a requirement for
> > the default to name-based interface hooks or does the 'list hooks'
> > extension satisfy the need for user space traceability?
> 
> For me, listing is just fine for debugging.
> 
> If there is a need to track hook updates via events, then
> nfnetlink_hook can be extended later.

OK, cool!

> So I am not asking for this, I thought you needed both listing and
> events, that is why I suggest to add events to nfnetlink_hook.

Just to be sure I wrote shell test case asserting correct device
reg/dereg using 'nft list hooks' tool, works just fine. So let's skip
notifications for now.

Thanks, Phil

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration
  2025-07-11 16:39                                     ` Phil Sutter
@ 2025-07-14 14:02                                       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 29+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-14 14:02 UTC (permalink / raw)
  To: Phil Sutter, Florian Westphal, netfilter-devel

On Fri, Jul 11, 2025 at 06:39:50PM +0200, Phil Sutter wrote:
> On Fri, Jul 11, 2025 at 04:52:55PM +0200, Pablo Neira Ayuso wrote:
> > On Fri, Jul 11, 2025 at 02:19:04PM +0200, Phil Sutter wrote:
> > > Pablo,
> > > 
> > > On Thu, Jul 10, 2025 at 12:43:03AM +0200, Pablo Neira Ayuso wrote:
> > > [...]
> > > > If you accept this suggestion, it is a matter of:
> > > > 
> > > > #1 revert the patch in nf.git for the incomplete event notification
> > > >    (you have three more patches pending for nf-next to complete this
> > > >     for control plane notifications).
> > > > #2 add event notifications to net/netfilter/core.c and nfnetlink_hook.
> > > 
> > > Since Florian wondered whether I am wasting my time with a quick attempt
> > > at #2, could you please confirm/deny whether this is a requirement for
> > > the default to name-based interface hooks or does the 'list hooks'
> > > extension satisfy the need for user space traceability?
> > 
> > For me, listing is just fine for debugging.
> > 
> > If there is a need to track hook updates via events, then
> > nfnetlink_hook can be extended later.
> 
> OK, cool!
> 
> > So I am not asking for this, I thought you needed both listing and
> > events, that is why I suggest to add events to nfnetlink_hook.
> 
> Just to be sure I wrote shell test case asserting correct device
> reg/dereg using 'nft list hooks' tool, works just fine. So let's skip
> notifications for now.

OK.

Would you rebase userspace on top of git HEAD so next kernel release
comes with userspace code to start testing this new feature?

Your test will need to wait for next kernel to include your
nfnetlink_hook extension, you can post it and keep it around if you
like.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2025-07-14 14:02 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-02 17:47 [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration Phil Sutter
2025-07-02 22:39 ` Florian Westphal
2025-07-03 10:21   ` Phil Sutter
2025-07-03 11:35     ` Pablo Neira Ayuso
2025-07-03 12:09       ` Florian Westphal
2025-07-03 12:37         ` Phil Sutter
2025-07-03 12:25       ` Phil Sutter
2025-07-03 12:39         ` Florian Westphal
2025-07-03 12:47           ` Phil Sutter
2025-07-03 12:54             ` Florian Westphal
2025-07-03 13:17               ` Phil Sutter
2025-07-03 14:19                 ` Pablo Neira Ayuso
2025-07-03 14:33                   ` Phil Sutter
2025-07-03 21:32                     ` Pablo Neira Ayuso
2025-07-04 12:41                       ` Phil Sutter
2025-07-04 14:04                         ` Florian Westphal
2025-07-04 15:33                           ` Phil Sutter
2025-07-07 19:25                           ` Pablo Neira Ayuso
2025-07-08 14:38                             ` Phil Sutter
2025-07-09 22:43                               ` Pablo Neira Ayuso
2025-07-10 13:55                                 ` Phil Sutter
2025-07-11 12:19                                 ` Phil Sutter
2025-07-11 13:16                                   ` Florian Westphal
2025-07-11 13:43                                     ` Phil Sutter
2025-07-11 13:48                                       ` Florian Westphal
2025-07-11 14:52                                   ` Pablo Neira Ayuso
2025-07-11 16:39                                     ` Phil Sutter
2025-07-14 14:02                                       ` Pablo Neira Ayuso
2025-07-03 11:55     ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).