From: Sabrina Dubroca <sd@queasysnail.net>
To: Ido Schimmel <idosch@nvidia.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>,
netdev@vger.kernel.org, Jay Vosburgh <jv@jvosburgh.net>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Jiri Pirko <jiri@resnulli.us>, Simon Horman <horms@kernel.org>,
Nikolay Aleksandrov <razor@blackwall.org>,
Shuah Khan <shuah@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Kuniyuki Iwashima <kuniyu@google.com>,
Ahmed Zaki <ahmed.zaki@intel.com>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
bridge@lists.linux.dev, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH net-next 1/5] net: add a common function to compute features from lowers devices
Date: Wed, 10 Sep 2025 19:41:47 +0200
Message-ID: <aMG4W9xUGxjLAVys@krikkit>
In-Reply-To: <aMGwcyKTvmz5StN1@shredder>

2025-09-10, 20:08:03 +0300, Ido Schimmel wrote:
> On Wed, Sep 10, 2025 at 04:29:35PM +0200, Sabrina Dubroca wrote:
> > 2025-08-31, 18:35:49 +0300, Ido Schimmel wrote:
> > > On Fri, Aug 29, 2025 at 09:54:26AM +0000, Hangbin Liu wrote:
> > > > Some high level virtual drivers need to compute features from lower
> > > > devices. But each has their own implementations and may lost some
> > > > feature compute. Let's use one common function to compute features
> > > > for kinds of these devices.
> > > >
> > > > The new helper uses the current bond implementation as the reference
> > > > one, as the latter already handles all the relevant aspects: netdev
> > > > features, TSO limits and dst retention.
> > > >
> > > > Suggested-by: Paolo Abeni <pabeni@redhat.com>
> > > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> > > > ---
> > > > include/linux/netdevice.h | 19 ++++++++++
> > > > net/core/dev.c | 79 +++++++++++++++++++++++++++++++++++++++
> > > > 2 files changed, 98 insertions(+)
> > > >
> > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > > index f3a3b761abfb..42742a47f2c6 100644
> > > > --- a/include/linux/netdevice.h
> > > > +++ b/include/linux/netdevice.h
> > > > @@ -5279,6 +5279,25 @@ int __netdev_update_features(struct net_device *dev);
> > > > void netdev_update_features(struct net_device *dev);
> > > > void netdev_change_features(struct net_device *dev);
> > > >
> > > > +/* netdevice features */
> > > > +#define VIRTUAL_DEV_VLAN_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
> > > > + NETIF_F_FRAGLIST | NETIF_F_GSO_SOFTWARE | \
> > > > + NETIF_F_GSO_ENCAP_ALL | \
> > > > + NETIF_F_HIGHDMA | NETIF_F_LRO)
> > > > +
> > > > +#define VIRTUAL_DEV_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
> > > > + NETIF_F_RXCSUM | NETIF_F_GSO_SOFTWARE | \
> > > > + NETIF_F_GSO_PARTIAL)
> > > > +
> > > > +#define VIRTUAL_DEV_MPLS_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
> > > > + NETIF_F_GSO_SOFTWARE)
> > > > +
> > > > +#define VIRTUAL_DEV_XFRM_FEATURES (NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
> > > > + NETIF_F_GSO_ESP)
> > > > +
> > > > +#define VIRTUAL_DEV_GSO_PARTIAL_FEATURES (NETIF_F_GSO_ESP)
> > > > +void netdev_compute_features_from_lowers(struct net_device *dev);
> > > > +
> > > > void netif_stacked_transfer_operstate(const struct net_device *rootdev,
> > > > struct net_device *dev);
> > > >
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index 1d1650d9ecff..fcad2a9f6b65 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -12577,6 +12577,85 @@ netdev_features_t netdev_increment_features(netdev_features_t all,
> > > > }
> > > > EXPORT_SYMBOL(netdev_increment_features);
> > > >
> > > > +/**
> > > > + * netdev_compute_features_from_lowers - compute feature from lowers
> > > > + * @dev: the upper device
> > > > + *
> > > > + * Recompute the upper device's feature based on all lower devices.
> > > > + */
> > > > +void netdev_compute_features_from_lowers(struct net_device *dev)
> > > > +{
> > > > + unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM;
> > > > + netdev_features_t gso_partial_features = VIRTUAL_DEV_GSO_PARTIAL_FEATURES;
> > > > +#ifdef CONFIG_XFRM_OFFLOAD
> > > > + netdev_features_t xfrm_features = VIRTUAL_DEV_XFRM_FEATURES;
> > > ^ double space (in other places as well)
> > >
> > > > +#endif
> > > > + netdev_features_t mpls_features = VIRTUAL_DEV_MPLS_FEATURES;
> > > > + netdev_features_t vlan_features = VIRTUAL_DEV_VLAN_FEATURES;
> > > > + netdev_features_t enc_features = VIRTUAL_DEV_ENC_FEATURES;
> > > > + unsigned short max_hard_header_len = ETH_HLEN;
> >
> > Going back to this discussion about hard_header_len:
> >
> > > hard_header_len is not really a feature, so does not sound like it
> > > belongs here. I'm pretty sure it's not needed at all.
> > >
> > > It was added to the bond driver in 2006 by commit 54ef31371407 ("[PATCH]
> > > bonding: Handle large hard_header_len") citing panics with gianfar on
> > > xmit. In 2009 commit 93c1285c5d92 ("gianfar: reallocate skb when
> > > headroom is not enough for fcb") fixed the gianfar driver to stop
> > > assuming that it has enough room to push its custom header. Further,
> > > commit bee9e58c9e98 ("gianfar:don't add FCB length to hard_header_len")
> > > from 2012 fixed this driver to use needed_headroom instead of
> > > hard_header_len.
> > >
> > > The team driver is also adjusting hard_header_len according to the lower
> > > devices, but it most likely copied it from the bond driver. On the other
> > > hand, the bridge driver does not mess with hard_header_len and no
> > > problems were reported there (that I know of).
> > >
> > > Might be a good idea to remove this hard_header_len logic from bond and
> > > team and instead set their needed_headroom according to the lower device
> > > with the highest needed_headroom. Paolo added similar logic in bridge
> > > and ovs but the use case is a bit different there.
> >
> > I'm not convinced removing adapting hard_header_len on bond/team is
> > correct, even with old and broken drivers getting fixed years
> > ago. hard_header_len will be used on the TX path (for some devices
> > like bridge/macvlan via dev_forward_skb() and similar helpers, for IP
> > tunnels setting their MTU, and via LL_RESERVED_SPACE).
> >
> > So I think we should keep setting hard_header_len to the largest of
> > all lowers.
>
> It is not clear to me why we are setting hard_header_len to the largest
> of all lowers and not needed_headroom. While bond/team allow
> non-Ethernet lowers (unlike bridge, which is also adjusted to use this
> helper), they do verify that all the lower devices are of the same type.
> Shouldn't devices of the same type have the same hardware header length?

At least not with VLANs. Both basic Ethernet and VLAN devices are
ARPHRD_ETHER, but the hard_header_len of the VLAN device will be
larger if we're not offloading:

    dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN;

> On the other hand, needed_headroom can and does vary between devices of
> the same type.

I'm not saying anything about needed_headroom. It sounds like it
should be updated as well.
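
To make the VLAN case concrete, here is a minimal userspace sketch, not
kernel code. It assumes a hypothetical struct lower_dev standing in for the
two struct net_device fields under discussion, and a hypothetical helper
upper_header_space() that takes the maximum of each field across all lowers.
Starting the hard_header_len maximum at ETH_HLEN mirrors the initializer in
the quoted patch; how needed_headroom should really be propagated is still
open in this thread, so treat that half as an assumption.

```c
#include <stddef.h>

#define ETH_HLEN  14	/* Ethernet header length, same value as the kernel */
#define VLAN_HLEN 4	/* 802.1Q tag length, same value as the kernel */

/* Hypothetical stand-in for the relevant struct net_device fields. */
struct lower_dev {
	unsigned short hard_header_len;
	unsigned short needed_headroom;
};

/*
 * Illustrative only: report the largest hard_header_len and the largest
 * needed_headroom found among the lower devices.
 */
void upper_header_space(const struct lower_dev *lowers, size_t n,
			unsigned short *hard_header_len,
			unsigned short *needed_headroom)
{
	unsigned short max_hlen = ETH_HLEN;
	unsigned short max_headroom = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (lowers[i].hard_header_len > max_hlen)
			max_hlen = lowers[i].hard_header_len;
		if (lowers[i].needed_headroom > max_headroom)
			max_headroom = lowers[i].needed_headroom;
	}

	*hard_header_len = max_hlen;
	*needed_headroom = max_headroom;
}
```

With one plain Ethernet lower (hard_header_len of ETH_HLEN) and one
non-offloaded VLAN lower (ETH_HLEN + VLAN_HLEN), the helper reports 18 for
hard_header_len even though both lowers are ARPHRD_ETHER.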
--
Sabrina