From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Richard Gobert <richardbgobert@gmail.com>, netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, corbet@lwn.net,
shenjian15@huawei.com, salil.mehta@huawei.com,
shaojijie@huawei.com, andrew+netdev@lunn.ch,
saeedm@nvidia.com, tariqt@nvidia.com, mbloch@nvidia.com,
leon@kernel.org, ecree.xilinx@gmail.com, dsahern@kernel.org,
ncardwell@google.com, kuniyu@google.com, shuah@kernel.org,
sdf@fomichev.me, ahmed.zaki@intel.com,
aleksander.lobakin@intel.com, linux-kernel@vger.kernel.org,
linux-net-drivers@amd.com,
Richard Gobert <richardbgobert@gmail.com>
Subject: Re: [PATCH net-next 3/5] net: gso: restore ids of outer ip headers correctly
Date: Fri, 15 Aug 2025 06:23:37 -0400 [thread overview]
Message-ID: <willemdebruijn.kernel.26e28ce8a7fe@gmail.com> (raw)
In-Reply-To: <20250814114030.7683-4-richardbgobert@gmail.com>
Richard Gobert wrote:
> Currently, NETIF_F_TSO_MANGLEID indicates that the inner-most ID can
> be mangled. Outer IDs can always be mangled.
>
> Make GSO preserve outer IDs by default, with NETIF_F_TSO_MANGLEID allowing
> both inner and outer IDs to be mangled. In the future, we could add
> NETIF_F_TSO_MANGLEID_{INNER,OUTER} to provide more granular control to
> drivers.
>
> This commit also modifies a few drivers that use SKB_GSO_FIXEDID directly.
>
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
> Documentation/networking/segmentation-offloads.rst | 4 ++--
> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 8 ++++++--
> drivers/net/ethernet/sfc/ef100_tx.c | 14 ++++++++------
> include/linux/netdevice.h | 9 +++++++--
> include/linux/skbuff.h | 6 +++++-
> net/core/dev.c | 7 +++----
> net/ipv4/af_inet.c | 13 ++++++-------
> net/ipv4/tcp_offload.c | 4 +---
> 9 files changed, 39 insertions(+), 28 deletions(-)
>
> diff --git a/Documentation/networking/segmentation-offloads.rst b/Documentation/networking/segmentation-offloads.rst
> index 085e8fab03fd..21c759b81f4e 100644
> --- a/Documentation/networking/segmentation-offloads.rst
> +++ b/Documentation/networking/segmentation-offloads.rst
> @@ -42,8 +42,8 @@ also point to the TCP header of the packet.
>
> For IPv4 segmentation we support one of two types in terms of the IP ID.
> The default behavior is to increment the IP ID with every segment. If the
> -GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
> -ID and all segments will use the same IP ID. If a device has
> +GSO type SKB_GSO_TCP_FIXEDID_{OUTER,INNER} is specified then we will not
> +increment the IP ID and all segments will use the same IP ID. If a device has
> NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
> and we will either increment the IP ID for all frames, or leave it at a
> static value based on driver preference.
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> index bfa5568baa92..b28f890b0af5 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
> @@ -3868,7 +3868,7 @@ static int hns3_gro_complete(struct sk_buff *skb, u32 l234info)
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
>
> if (l234info & BIT(HNS3_RXD_GRO_FIXID_B))
> - skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
> + skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID_OUTER;
>
> skb->csum_start = (unsigned char *)th - skb->head;
> skb->csum_offset = offsetof(struct tcphdr, check);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index b8c609d91d11..78df60c62225 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1289,8 +1289,12 @@ static void mlx5e_shampo_update_ipv4_tcp_hdr(struct mlx5e_rq *rq, struct iphdr *
> tcp->check = ~tcp_v4_check(skb->len - tcp_off, ipv4->saddr,
> ipv4->daddr, 0);
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
> - if (ntohs(ipv4->id) == rq->hw_gro_data->second_ip_id)
> - skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
> + if (ntohs(ipv4->id) == rq->hw_gro_data->second_ip_id) {
> + bool encap = rq->hw_gro_data->fk.control.flags & FLOW_DIS_ENCAPSULATION;
> +
> + skb_shinfo(skb)->gso_type |= encap ?
> + SKB_GSO_TCP_FIXEDID_INNER : SKB_GSO_TCP_FIXEDID_OUTER;
> + }
>
> skb->csum_start = (unsigned char *)tcp - skb->head;
> skb->csum_offset = offsetof(struct tcphdr, check);
> diff --git a/drivers/net/ethernet/sfc/ef100_tx.c b/drivers/net/ethernet/sfc/ef100_tx.c
> index e6b6be549581..aab2425e62bb 100644
> --- a/drivers/net/ethernet/sfc/ef100_tx.c
> +++ b/drivers/net/ethernet/sfc/ef100_tx.c
> @@ -189,7 +189,8 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
> {
> bool gso_partial = skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL;
> unsigned int len, ip_offset, tcp_offset, payload_segs;
> - u32 mangleid = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
> + u32 mangleid_outer = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
> + u32 mangleid_inner = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
> unsigned int outer_ip_offset, outer_l4_offset;
> u16 vlan_tci = skb_vlan_tag_get(skb);
> u32 mss = skb_shinfo(skb)->gso_size;
> @@ -200,8 +201,10 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
> bool outer_csum;
> u32 paylen;
>
> - if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID)
> - mangleid = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
> + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID_OUTER)
> + mangleid_outer = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
> + if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID_INNER)
> + mangleid_inner = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
> if (efx->net_dev->features & NETIF_F_HW_VLAN_CTAG_TX)
> vlan_enable = skb_vlan_tag_present(skb);
>
> @@ -239,14 +242,13 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
> ESF_GZ_TX_TSO_CSO_INNER_L4, 1,
> ESF_GZ_TX_TSO_INNER_L3_OFF_W, ip_offset >> 1,
> ESF_GZ_TX_TSO_INNER_L4_OFF_W, tcp_offset >> 1,
> - ESF_GZ_TX_TSO_ED_INNER_IP4_ID, mangleid,
> + ESF_GZ_TX_TSO_ED_INNER_IP4_ID, mangleid_inner,
> ESF_GZ_TX_TSO_ED_INNER_IP_LEN, 1,
> ESF_GZ_TX_TSO_OUTER_L3_OFF_W, outer_ip_offset >> 1,
> ESF_GZ_TX_TSO_OUTER_L4_OFF_W, outer_l4_offset >> 1,
> ESF_GZ_TX_TSO_ED_OUTER_UDP_LEN, udp_encap && !gso_partial,
> ESF_GZ_TX_TSO_ED_OUTER_IP_LEN, encap && !gso_partial,
> - ESF_GZ_TX_TSO_ED_OUTER_IP4_ID, encap ? mangleid :
> - ESE_GZ_TX_DESC_IP4_ID_NO_OP,
> + ESF_GZ_TX_TSO_ED_OUTER_IP4_ID, mangleid_outer,
> ESF_GZ_TX_TSO_VLAN_INSERT_EN, vlan_enable,
> ESF_GZ_TX_TSO_VLAN_INSERT_TCI, vlan_tci
> );
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 5e5de4b0a433..e55ba6918b0a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -5287,13 +5287,18 @@ void skb_warn_bad_offload(const struct sk_buff *skb);
>
> static inline bool net_gso_ok(netdev_features_t features, int gso_type)
> {
> - netdev_features_t feature = (netdev_features_t)gso_type << NETIF_F_GSO_SHIFT;
> + netdev_features_t feature;
> +
> + if (gso_type & (SKB_GSO_TCP_FIXEDID_OUTER | SKB_GSO_TCP_FIXEDID_INNER))
> + gso_type |= __SKB_GSO_TCP_FIXEDID;
This is quite peculiar.
Is there a real use case for specifying FIXEDID separately for outer
and inner? Can the existing single bit govern both together instead?
That would be a lot simpler.
> +
> + feature = ((netdev_features_t)gso_type << NETIF_F_GSO_SHIFT) & NETIF_F_GSO_MASK;
>
> /* check flags correspondence */
> BUILD_BUG_ON(SKB_GSO_TCPV4 != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_DODGY != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
> - BUILD_BUG_ON(SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
> + BUILD_BUG_ON(__SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_TCPV6 != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_FCOE != (NETIF_F_FSO >> NETIF_F_GSO_SHIFT));
> BUILD_BUG_ON(SKB_GSO_GRE != (NETIF_F_GSO_GRE >> NETIF_F_GSO_SHIFT));
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 14b923ddb6df..5cfbf6e8c7ea 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -674,7 +674,7 @@ enum {
> /* This indicates the tcp segment has CWR set. */
> SKB_GSO_TCP_ECN = 1 << 2,
>
> - SKB_GSO_TCP_FIXEDID = 1 << 3,
> + __SKB_GSO_TCP_FIXEDID = 1 << 3,
>
> SKB_GSO_TCPV6 = 1 << 4,
>
> @@ -707,6 +707,10 @@ enum {
> SKB_GSO_FRAGLIST = 1 << 18,
>
> SKB_GSO_TCP_ACCECN = 1 << 19,
> +
> + /* These don't correspond with netdev features. */
> + SKB_GSO_TCP_FIXEDID_OUTER = 1 << 30,
> + SKB_GSO_TCP_FIXEDID_INNER = 1 << 31,
> };
>
> #if BITS_PER_LONG > 32
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 68dc47d7e700..9941c39b5970 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3772,10 +3772,9 @@ static netdev_features_t gso_features_check(const struct sk_buff *skb,
> * IPv4 header has the potential to be fragmented.
> */
> if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
> - struct iphdr *iph = skb->encapsulation ?
> - inner_ip_hdr(skb) : ip_hdr(skb);
> -
> - if (!(iph->frag_off & htons(IP_DF)))
> + if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) ||
> + (skb->encapsulation &&
> + !(inner_ip_hdr(skb)->frag_off & htons(IP_DF))))
> features &= ~NETIF_F_TSO_MANGLEID;
> }
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 76e38092cd8a..7f29b485009d 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1393,14 +1393,13 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
>
> segs = ERR_PTR(-EPROTONOSUPPORT);
>
> - if (!skb->encapsulation || encap) {
> - udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
> - fixedid = !!(skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID);
> + /* fixed ID is invalid if DF bit is not set */
> + fixedid = !!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID_OUTER << encap));
> + if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
> + goto out;
>
> - /* fixed ID is invalid if DF bit is not set */
> - if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
> - goto out;
> - }
> + if (!skb->encapsulation || encap)
> + udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
>
> ops = rcu_dereference(inet_offloads[proto]);
> if (likely(ops && ops->callbacks.gso_segment)) {
> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
> index 74f46663eeae..83fa6b2aecf4 100644
> --- a/net/ipv4/tcp_offload.c
> +++ b/net/ipv4/tcp_offload.c
> @@ -485,10 +485,8 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
> th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
> iph->daddr, 0);
>
> - bool is_fixedid = (NAPI_GRO_CB(skb)->ip_fixedid >> skb->encapsulation) & 1;
> -
> skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
> - (is_fixedid * SKB_GSO_TCP_FIXEDID);
> + (NAPI_GRO_CB(skb)->ip_fixedid * SKB_GSO_TCP_FIXEDID_OUTER);
>
> tcp_gro_complete(skb);
> return 0;
> --
> 2.36.1
>
next prev parent reply other threads:[~2025-08-15 10:23 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-14 11:40 [PATCH net-next 0/5] net: gso: restore outer ip ids correctly Richard Gobert
2025-08-14 11:40 ` [PATCH net-next 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
2025-08-14 20:45 ` Florian Fainelli
2025-08-15 10:17 ` Willem de Bruijn
2025-08-18 11:33 ` Richard Gobert
2025-08-14 11:40 ` [PATCH net-next 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
2025-08-14 11:40 ` [PATCH net-next 3/5] net: gso: restore ids of outer ip headers correctly Richard Gobert
2025-08-15 10:23 ` Willem de Bruijn [this message]
2025-08-15 13:00 ` Willem de Bruijn
2025-08-18 11:46 ` Richard Gobert
2025-08-18 11:58 ` Willem de Bruijn
2025-08-18 14:07 ` Richard Gobert
2025-08-14 11:40 ` [PATCH net-next 4/5] net: gro: remove unnecessary df checks Richard Gobert
2025-08-14 11:40 ` [PATCH net-next 5/5] selftests/net: test ipip packets in gro.sh Richard Gobert
2025-08-14 19:42 ` Jakub Kicinski
2025-08-18 12:06 ` Richard Gobert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=willemdebruijn.kernel.26e28ce8a7fe@gmail.com \
--to=willemdebruijn.kernel@gmail.com \
--cc=ahmed.zaki@intel.com \
--cc=aleksander.lobakin@intel.com \
--cc=andrew+netdev@lunn.ch \
--cc=corbet@lwn.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=ecree.xilinx@gmail.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=kuniyu@google.com \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-net-drivers@amd.com \
--cc=mbloch@nvidia.com \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=richardbgobert@gmail.com \
--cc=saeedm@nvidia.com \
--cc=salil.mehta@huawei.com \
--cc=sdf@fomichev.me \
--cc=shaojijie@huawei.com \
--cc=shenjian15@huawei.com \
--cc=shuah@kernel.org \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.