All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Phil Sutter <phil@nwl.cc>,
	netfilter-devel@vger.kernel.org, Florian Westphal <fw@strlen.de>,
	Thomas Haller <thaller@redhat.com>
Subject: Re: [PATCH v2 0/7] Dynamic hook interface binding
Date: Wed, 19 Jun 2024 16:48:35 +0200	[thread overview]
Message-ID: <ZnLvw0MbcL81GUrc@calendula> (raw)
In-Reply-To: <ZnLAiiJJldqUXl_s@orbyte.nwl.cc>

Hi Phil,

On Wed, Jun 19, 2024 at 01:27:06PM +0200, Phil Sutter wrote:
> On Tue, Jun 18, 2024 at 01:10:21AM +0200, Pablo Neira Ayuso wrote:
> > Hi Phil,
> > 
> > On Fri, May 17, 2024 at 03:06:08PM +0200, Phil Sutter wrote:
> > > Changes since v1:
> > > - New patch 6 adding notifications for updated hooks.
> > > - New patch 7 adding the requested torture test.
> > > 
> > > Currently, netdev-family chains and flowtables expect their interfaces
> > > to exist at creation time. In practice, this bites users of virtual
> > > interfaces if these happen to be created after the nftables service
> > > starts up and loads the stored ruleset.
> > > 
> > > Vice-versa, if an interface disappears at run-time (via module unloading
> > > or 'ip link del'), it also disappears from the ruleset, along with the
> > > chain and its rules which binds to it. This is at least problematic for
> > > setups which store the running ruleset during system shutdown.
> > 
> > I'd suggest that you place your patch 2/7 to modify the existing
> > behaviour in first place.
> 
> I did not to avoid "dead" entries in the hook list (with ops.dev gone,
> there's nothing left identifying the hook). But it's only temporary and
> maybe cleans up the diffs, will give it a go.

OK.

> > > This series attempts to solve these problems by effectively making
> > > netdev hooks name-based: If no matching interface is found at hook
> > > creation time, it will be inactive until a matching interface appears.
> > > If a bound interface is renamed, a matching inactive hook is searched
> > > for it.
> > > 
> > > Ruleset dumps will stabilize in that regard. To still provide
> > > information about which existing interfaces a chain/flowtable currently
> > > binds to, new netlink attributes *_ACT_DEVS are introduced which are
> > > filled from the active hooks only.
> > 
> > Currently, NFTA_HOOK_DEVS already represents the netdevice that are
> > active. If one of these devices goes aways, then it is removed from
> > the basechain and it does not show up in NFTA_HOOK_DEVS anymore.
> > 
> > There are netlink notifications that need to fit into NLMSG_GOODSIZE,
> > but this adds yet another netlink array attribute.
> 
> Hmm. I could introduce NFTA_HOOK_INACTIVE_DEVS which contains only those
> entries missing in NFTA_HOOK_DEVS. This shouldn't bloat the dumps too
> much (apart from the added overhead) and won't change old user space
> behaviour.

Not sure. What does NFTA_HOOK_INACTIVE_DEVS contains? Could you
provide an example? Again, if this array gets again too large, there
could be issues with NLMSG_GOODSIZE again for notifications.

I would just display these active devices (in the list of devices that
are attached to this basechain) via the new command _GETDEV that we
are discussing below? These netdevices that match the pattern come and
go, I guess user only wants to make sure they are actually registered
to this hook for diagnostics, showing an exact match, ie. tap0, or
inexact match, ie. tap* should be should when listing the ruleset IMO.

> > I think we cannot escape adding new commands such as:
> > 
> > NFT_MSG_NEWDEVICE
> > NFT_MSG_GETDEVICE
> > NFT_MSG_DELDEVICE
> > 
> > to populate the basechain/flowtable, those can be used only if the
> > "subscription" mecanism is used, so older kernels still rely in
> > NFTA_HOOK_DEVS (older-older kernels actually already deal with
> > NFTA_HOOK_DEV only...).
> 
> Why can't they co-exist? I.e., NFTA_HOOK_DEV{,S} continues to behave as
> before and NEWDEV/DELDEV modify the list of existing chains/flowtables.
> An explicit GETDEV to dump currently active interfaces seems reasonable,
> especially with wildcard specifiers in mind.

yes, NFTA_HOOK_DEVS and these new commands need to coexist, there is
no other way to retain backward compatibility.

> > NFT_MSG_NEWDEVICE can provide a flag to specify this is a wildcard
> > or exact matching.
> 
> I had the same mechanism we use for wildcard ifname matching in mind
> which is to specify a name length which either includes or excludes the
> terminating NUL-char. Using strncmp() is sufficient then.

That's fine.

> Or do you think more advanced (e.g. "enp*s0") wildcards are useful?

I prefer future does not bring "enp*s0", but let's agree on an API
that can be extended to accomodate new requirements, that's my only
comment in this regard.

> > There is also a need for a netlink attribute to specify if this is
> > adding a device to a chain or flowtable.
> 
> Since one has to specify the specific chain or flowtable to add to
> anyway, one could just add NFTA_DEV_CHAIN and NFTA_DEV_FLOWTABLE
> attributes, which are mutually exclusive and together with
> NFTA_DEV_TABLE identify the object to manipulate.

Right.

> > With these new commands, the NFT_NETDEVICE_MAX cap can also go away in
> > newer kernels with a command like this:
> > 
> >         nft add device flowtable ip x y { a, b, c }
> >         nft add device chain ip x y { a, b, c }
> > 
> > which expands to one one NFT_MSG_* message for each item.
> > 
> > Then, the nested notation will need to detect what to use depending on
> > the user input:
> > 
> >         table netdev x {
> >                 chain y {
> >                         type filter hook ingress devices = { tap* } priority 0
> >                 }
> >         }
> > 
> > this triggers the new commands path, same if the list of devices goes
> > over NFT_NETDEVICE_MAX (256).
> 
> This is for user space to remain compatible to older kernels, right?

Yes.

> > This can also help incrementally report on new devices that match on a
> > given "subscription pattern" via new NFT_MSG_NEWDEVICE command for
> > monitoring purpose.
> 
> Yes, adding NEWDEV/DELDEV notifications is a nice approach!

This will also help to remove the existing NFT_NETDEVICE_MAX limit.

> > Maybe a command like:
> > 
> >         nft list device chain ip x y
> > 
> > could be used to list the existing devices that are "active" as per
> > your definition, so:
> > 
> >         nft list ruleset
> > 
> > does not show the listing of "active" devices that expand to tap*,
> > because this is only informational (not really required to restore a
> > ruleset), I guess the user only wants this to inspect to make sure
> > what devices are registered to the given tap* pattern.
> 
> ACK.
> 
> > There is also another case that would need to be handled:
> > 
> >         - chain A with device tap0
> >         - chain B with wildcard device tap*
> > 
> > I would expect a "exclude" clause for the wildcard case will come
> > sooner or later to define a different policy for a specify chain.
> > The new specific command approach would be extensible in that sense.
> 
> As a first implementation, I would just forbid such combinations.
> Assuming "tap*" is just "tap" with length 3 and "tap0" is "tap0\0" with
> length 5, modifying the duplicate hook check in
> nf_tables_parse_netdev_hooks() to perform a strncmp(namea, nameb,
> min(lena, lenb)) should suffice.

That is fine to ensure a given basechain does not have both tap0 and tap*

> Another option avoiding maintenance of an exclusion list would be to
> just add a new interface to the first chain/flowtable with matching hook
> in the list. This should work since 'nft list ruleset' does not sort the
> ruleset, so 'add chain t b; add chain t a' is a meaningful difference to
> 'add chain t a; add chain t b'.
> 
> The exclusion list approach either means tracking a list of active
> hooks' matching names in each wildcard hook or some preference when
> searching a hook for a new interface. All this may become increasingly
> complicated by stuff like one hook for 'eth*' and another for 'et*'.
> 
> > > This series is also prep work for a simple wildcard interface binding
> > > similar to the wildcard interface matching in meta expression. It should
> > > suffice to turn struct nft_hook::ops into an array of all matching
> > > interfaces, but the respective code does not exist yet.
> > 
> > I think this series is all about this wildcard interface matching,
> > which is not coming explicitly.
> 
> My motivation for this series is an open ticket complaining about the
> inability to define a flowtable for an interface which is created by
> NetworkManager (which depends on nftables service for startup).
> 
> A related (but not complained so far) issue is in reverse direction: If
> the interface vanishes before nftables service saves the ruleset upon
> shutdown, the dump will be incomplete.
>
> I consider wildcard interface hooks merely a (nice) side-effect from
> fixing the above. Which should not require too much additional work.

OK.

Thanks.

  reply	other threads:[~2024-06-19 14:48 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-17 13:06 [PATCH v2 0/7] Dynamic hook interface binding Phil Sutter
2024-05-17 13:06 ` [PATCH v2 1/7] netfilter: nf_tables: Store user-defined hook ifname Phil Sutter
2024-05-17 13:06 ` [PATCH v2 2/7] netfilter: nf_tables: Relax hook interface binding Phil Sutter
2024-05-17 13:06 ` [PATCH v2 3/7] netfilter: nf_tables: Report active interfaces to user space Phil Sutter
2024-05-17 13:06 ` [PATCH v2 4/7] netfilter: nf_tables: Dynamic hook interface binding Phil Sutter
2024-05-17 13:06 ` [PATCH v2 5/7] netfilter: nf_tables: Correctly handle NETDEV_RENAME events Phil Sutter
2024-05-17 13:06 ` [PATCH v2 6/7] netfilter: nf_tables: Add notications for hook changes Phil Sutter
2024-05-17 13:06 ` [PATCH v2 7/7] selftests: netfilter: Torture nftables netdev hooks Phil Sutter
2024-06-17 23:10 ` [PATCH v2 0/7] Dynamic hook interface binding Pablo Neira Ayuso
2024-06-19 11:27   ` Phil Sutter
2024-06-19 14:48     ` Pablo Neira Ayuso [this message]
2024-06-19 15:59       ` Phil Sutter
2024-06-20  9:30         ` Pablo Neira Ayuso
2024-06-23 22:12         ` Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZnLvw0MbcL81GUrc@calendula \
    --to=pablo@netfilter.org \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=phil@nwl.cc \
    --cc=thaller@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.