linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly
@ 2025-08-19  6:32 Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

GRO currently ignores outer IPv4 header IDs for encapsulated packets
that have their don't-fragment flag set. GSO, however, always assumes
that outer IP IDs are incrementing. This results in GSO mangling the
outer IDs when they aren't incrementing. For example, GSO mangles the
outer IDs of IPv6 packets that were converted to IPv4, which must
have an ID of 0 according to RFC 6145, sect. 5.1.

GRO+GSO is supposed to be entirely transparent by default. GSO already
correctly restores inner IDs and IDs of non-encapsulated packets. The
tx-tcp-mangleid-segmentation feature can be enabled to allow the
mangling of such IDs so that TSO can be used.

This series fixes outer ID restoration for encapsulated packets when
tx-tcp-mangleid-segmentation is disabled. It also allows GRO to merge
packets with fixed IDs that don't have their don't-fragment flag set.

v1 -> v2:
 - Add fou_gro_ops helper
 - Clarify why sk_family check works
 - Fix ipip packet generation in selftest

Links:
 - v1: https://lore.kernel.org/netdev/20250814114030.7683-1-richardbgobert@gmail.com/

Richard Gobert (5):
  net: gro: remove is_ipv6 from napi_gro_cb
  net: gro: only merge packets with incrementing or fixed outer ids
  net: gso: restore ids of outer ip headers correctly
  net: gro: remove unnecessary df checks
  selftests/net: test ipip packets in gro.sh

 .../networking/segmentation-offloads.rst      |  4 +-
 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  8 ++-
 drivers/net/ethernet/sfc/ef100_tx.c           | 14 +++--
 include/linux/netdevice.h                     |  9 ++-
 include/linux/skbuff.h                        |  6 +-
 include/net/gro.h                             | 32 ++++------
 net/core/dev.c                                |  7 +--
 net/ipv4/af_inet.c                            | 10 +---
 net/ipv4/fou_core.c                           | 31 +++++-----
 net/ipv4/tcp_offload.c                        |  2 +-
 net/ipv4/udp_offload.c                        |  2 -
 net/ipv6/udp_offload.c                        |  2 -
 tools/testing/selftests/net/gro.c             | 58 ++++++++++++++-----
 tools/testing/selftests/net/gro.sh            |  5 +-
 15 files changed, 107 insertions(+), 85 deletions(-)

-- 
2.36.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb
  2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
@ 2025-08-19  6:32 ` Richard Gobert
  2025-08-19  8:47   ` Eric Dumazet
  2025-08-19  6:32 ` [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Remove is_ipv6 from napi_gro_cb and use sk->sk_family instead.
This frees up space for another ip_fixedid bit that will be added
in the next commit.

Using sk->sk_family is reliable since udp_sock_create always
creates either a AF_INET or a AF_INET6 socket, and IPv6-FOU
doesn't support receiving IPv4 packets.

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 include/net/gro.h      |  3 ---
 net/ipv4/fou_core.c    | 31 +++++++++++++------------------
 net/ipv4/udp_offload.c |  2 --
 net/ipv6/udp_offload.c |  2 --
 4 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index a0fca7ac6e7e..87c68007f949 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -71,9 +71,6 @@ struct napi_gro_cb {
 		/* Free the skb? */
 		u8	free:2;
 
-		/* Used in foo-over-udp, set in udp[46]_gro_receive */
-		u8	is_ipv6:1;
-
 		/* Used in GRE, set in fou/gue_gro_receive */
 		u8	is_fou:1;
 
diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
index 3e30745e2c09..261ea2cf460f 100644
--- a/net/ipv4/fou_core.c
+++ b/net/ipv4/fou_core.c
@@ -228,21 +228,26 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
 	return 0;
 }
 
+static inline const struct net_offload *fou_gro_ops(struct sock *sk, int proto)
+{
+	const struct net_offload __rcu **offloads;
+
+	/* FOU doesn't allow IPv4 on IPv6 sockets. */
+	offloads = sk->sk_family == AF_INET6 ? inet6_offloads : inet_offloads;
+	return rcu_dereference(offloads[proto]);
+}
+
 static struct sk_buff *fou_gro_receive(struct sock *sk,
 				       struct list_head *head,
 				       struct sk_buff *skb)
 {
-	const struct net_offload __rcu **offloads;
 	struct fou *fou = fou_from_sock(sk);
 	const struct net_offload *ops;
 	struct sk_buff *pp = NULL;
-	u8 proto;
 
 	if (!fou)
 		goto out;
 
-	proto = fou->protocol;
-
 	/* We can clear the encap_mark for FOU as we are essentially doing
 	 * one of two possible things.  We are either adding an L4 tunnel
 	 * header to the outer L3 tunnel header, or we are simply
@@ -254,8 +259,7 @@ static struct sk_buff *fou_gro_receive(struct sock *sk,
 	/* Flag this frame as already having an outer encap header */
 	NAPI_GRO_CB(skb)->is_fou = 1;
 
-	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
-	ops = rcu_dereference(offloads[proto]);
+	ops = fou_gro_ops(sk, fou->protocol);
 	if (!ops || !ops->callbacks.gro_receive)
 		goto out;
 
@@ -268,10 +272,8 @@ static struct sk_buff *fou_gro_receive(struct sock *sk,
 static int fou_gro_complete(struct sock *sk, struct sk_buff *skb,
 			    int nhoff)
 {
-	const struct net_offload __rcu **offloads;
 	struct fou *fou = fou_from_sock(sk);
 	const struct net_offload *ops;
-	u8 proto;
 	int err;
 
 	if (!fou) {
@@ -279,10 +281,7 @@ static int fou_gro_complete(struct sock *sk, struct sk_buff *skb,
 		goto out;
 	}
 
-	proto = fou->protocol;
-
-	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
-	ops = rcu_dereference(offloads[proto]);
+	ops = fou_gro_ops(sk, fou->protocol);
 	if (WARN_ON(!ops || !ops->callbacks.gro_complete)) {
 		err = -ENOSYS;
 		goto out;
@@ -323,7 +322,6 @@ static struct sk_buff *gue_gro_receive(struct sock *sk,
 				       struct list_head *head,
 				       struct sk_buff *skb)
 {
-	const struct net_offload __rcu **offloads;
 	const struct net_offload *ops;
 	struct sk_buff *pp = NULL;
 	struct sk_buff *p;
@@ -450,8 +448,7 @@ static struct sk_buff *gue_gro_receive(struct sock *sk,
 	/* Flag this frame as already having an outer encap header */
 	NAPI_GRO_CB(skb)->is_fou = 1;
 
-	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
-	ops = rcu_dereference(offloads[proto]);
+	ops = fou_gro_ops(sk, proto);
 	if (!ops || !ops->callbacks.gro_receive)
 		goto out;
 
@@ -467,7 +464,6 @@ static struct sk_buff *gue_gro_receive(struct sock *sk,
 static int gue_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
 {
 	struct guehdr *guehdr = (struct guehdr *)(skb->data + nhoff);
-	const struct net_offload __rcu **offloads;
 	const struct net_offload *ops;
 	unsigned int guehlen = 0;
 	u8 proto;
@@ -494,8 +490,7 @@ static int gue_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
 		return err;
 	}
 
-	offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads;
-	ops = rcu_dereference(offloads[proto]);
+	ops = fou_gro_ops(sk, proto);
 	if (WARN_ON(!ops || !ops->callbacks.gro_complete))
 		goto out;
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 5128e2a5b00a..683689cf4293 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -891,8 +891,6 @@ struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 		skb_gro_checksum_try_convert(skb, IPPROTO_UDP,
 					     inet_gro_compute_pseudo);
 skip:
-	NAPI_GRO_CB(skb)->is_ipv6 = 0;
-
 	if (static_branch_unlikely(&udp_encap_needed_key))
 		sk = udp4_gro_lookup_skb(skb, uh->source, uh->dest);
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index d8445ac1b2e4..046f13b1d77a 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -154,8 +154,6 @@ struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 					     ip6_gro_compute_pseudo);
 
 skip:
-	NAPI_GRO_CB(skb)->is_ipv6 = 1;
-
 	if (static_branch_unlikely(&udpv6_encap_needed_key))
 		sk = udp6_gro_lookup_skb(skb, uh->source, uh->dest);
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
@ 2025-08-19  6:32 ` Richard Gobert
  2025-08-19 14:46   ` Willem de Bruijn
  2025-08-20  0:30   ` Jakub Kicinski
  2025-08-19  6:32 ` [PATCH net-next v2 3/5] net: gso: restore ids of outer ip headers correctly Richard Gobert
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Only merge encapsulated packets if their outer IDs are either
incrementing or fixed, just like for inner IDs and IDs of non-encapsulated
packets.

Add another ip_fixedid bit for a total of two bits: one for outer IDs and
one for inner IDs.

This commit preserves the current behavior of GSO where only the IDs of the
inner-most headers are restored correctly.

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 include/net/gro.h      | 26 +++++++++++---------------
 net/ipv4/tcp_offload.c |  4 +++-
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 87c68007f949..e7997a9fb30b 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -75,7 +75,7 @@ struct napi_gro_cb {
 		u8	is_fou:1;
 
 		/* Used to determine if ipid_offset can be ignored */
-		u8	ip_fixedid:1;
+		u8	ip_fixedid:2;
 
 		/* Number of gro_receive callbacks this packet already went through */
 		u8 recursion_counter:4;
@@ -442,29 +442,26 @@ static inline __wsum ip6_gro_compute_pseudo(const struct sk_buff *skb,
 }
 
 static inline int inet_gro_flush(const struct iphdr *iph, const struct iphdr *iph2,
-				 struct sk_buff *p, bool outer)
+				 struct sk_buff *p, bool inner)
 {
 	const u32 id = ntohl(*(__be32 *)&iph->id);
 	const u32 id2 = ntohl(*(__be32 *)&iph2->id);
 	const u16 ipid_offset = (id >> 16) - (id2 >> 16);
 	const u16 count = NAPI_GRO_CB(p)->count;
 	const u32 df = id & IP_DF;
-	int flush;
 
 	/* All fields must match except length and checksum. */
-	flush = (iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF));
-
-	if (flush | (outer && df))
-		return flush;
+	if ((iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF)))
+		return true;
 
 	/* When we receive our second frame we can make a decision on if we
 	 * continue this flow as an atomic flow with a fixed ID or if we use
 	 * an incrementing ID.
 	 */
 	if (count == 1 && df && !ipid_offset)
-		NAPI_GRO_CB(p)->ip_fixedid = true;
+		NAPI_GRO_CB(p)->ip_fixedid |= 1 << inner;
 
-	return ipid_offset ^ (count * !NAPI_GRO_CB(p)->ip_fixedid);
+	return ipid_offset ^ (count * !(NAPI_GRO_CB(p)->ip_fixedid & (1 << inner)));
 }
 
 static inline int ipv6_gro_flush(const struct ipv6hdr *iph, const struct ipv6hdr *iph2)
@@ -479,7 +476,7 @@ static inline int ipv6_gro_flush(const struct ipv6hdr *iph, const struct ipv6hdr
 
 static inline int __gro_receive_network_flush(const void *th, const void *th2,
 					      struct sk_buff *p, const u16 diff,
-					      bool outer)
+					      bool inner)
 {
 	const void *nh = th - diff;
 	const void *nh2 = th2 - diff;
@@ -487,19 +484,18 @@ static inline int __gro_receive_network_flush(const void *th, const void *th2,
 	if (((struct iphdr *)nh)->version == 6)
 		return ipv6_gro_flush(nh, nh2);
 	else
-		return inet_gro_flush(nh, nh2, p, outer);
+		return inet_gro_flush(nh, nh2, p, inner);
 }
 
 static inline int gro_receive_network_flush(const void *th, const void *th2,
 					    struct sk_buff *p)
 {
-	const bool encap_mark = NAPI_GRO_CB(p)->encap_mark;
 	int off = skb_transport_offset(p);
 	int flush;
 
-	flush = __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->network_offset, encap_mark);
-	if (encap_mark)
-		flush |= __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->inner_network_offset, false);
+	flush = __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->network_offset, false);
+	if (NAPI_GRO_CB(p)->encap_mark)
+		flush |= __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->inner_network_offset, true);
 
 	return flush;
 }
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index be5c2294610e..74f46663eeae 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -485,8 +485,10 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
 				  iph->daddr, 0);
 
+	bool is_fixedid = (NAPI_GRO_CB(skb)->ip_fixedid >> skb->encapsulation) & 1;
+
 	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
-			(NAPI_GRO_CB(skb)->ip_fixedid * SKB_GSO_TCP_FIXEDID);
+			(is_fixedid * SKB_GSO_TCP_FIXEDID);
 
 	tcp_gro_complete(skb);
 	return 0;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 3/5] net: gso: restore ids of outer ip headers correctly
  2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
@ 2025-08-19  6:32 ` Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks Richard Gobert
  2025-08-19  6:32 ` [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh Richard Gobert
  4 siblings, 0 replies; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Currently, NETIF_F_TSO_MANGLEID indicates that the inner-most ID can
be mangled. Outer IDs can always be mangled.

Make GSO preserve outer IDs by default, with NETIF_F_TSO_MANGLEID allowing
both inner and outer IDs to be mangled. In the future, we could add
NETIF_F_TSO_MANGLEID_{INNER,OUTER} to provide more granular control to
drivers.

This commit also modifies a few drivers that use SKB_GSO_FIXEDID directly.

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 Documentation/networking/segmentation-offloads.rst |  4 ++--
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  8 ++++++--
 drivers/net/ethernet/sfc/ef100_tx.c                | 14 ++++++++------
 include/linux/netdevice.h                          |  9 +++++++--
 include/linux/skbuff.h                             |  6 +++++-
 net/core/dev.c                                     |  7 +++----
 net/ipv4/af_inet.c                                 | 13 ++++++-------
 net/ipv4/tcp_offload.c                             |  4 +---
 9 files changed, 39 insertions(+), 28 deletions(-)

diff --git a/Documentation/networking/segmentation-offloads.rst b/Documentation/networking/segmentation-offloads.rst
index 085e8fab03fd..21c759b81f4e 100644
--- a/Documentation/networking/segmentation-offloads.rst
+++ b/Documentation/networking/segmentation-offloads.rst
@@ -42,8 +42,8 @@ also point to the TCP header of the packet.
 
 For IPv4 segmentation we support one of two types in terms of the IP ID.
 The default behavior is to increment the IP ID with every segment.  If the
-GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
-ID and all segments will use the same IP ID.  If a device has
+GSO type SKB_GSO_TCP_FIXEDID_{OUTER,INNER} is specified then we will not
+increment the IP ID and all segments will use the same IP ID.  If a device has
 NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
 and we will either increment the IP ID for all frames, or leave it at a
 static value based on driver preference.
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index bfa5568baa92..b28f890b0af5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3868,7 +3868,7 @@ static int hns3_gro_complete(struct sk_buff *skb, u32 l234info)
 		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
 
 	if (l234info & BIT(HNS3_RXD_GRO_FIXID_B))
-		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
+		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID_OUTER;
 
 	skb->csum_start = (unsigned char *)th - skb->head;
 	skb->csum_offset = offsetof(struct tcphdr, check);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index b8c609d91d11..78df60c62225 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1289,8 +1289,12 @@ static void mlx5e_shampo_update_ipv4_tcp_hdr(struct mlx5e_rq *rq, struct iphdr *
 	tcp->check = ~tcp_v4_check(skb->len - tcp_off, ipv4->saddr,
 				   ipv4->daddr, 0);
 	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
-	if (ntohs(ipv4->id) == rq->hw_gro_data->second_ip_id)
-		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_FIXEDID;
+	if (ntohs(ipv4->id) == rq->hw_gro_data->second_ip_id) {
+		bool encap = rq->hw_gro_data->fk.control.flags & FLOW_DIS_ENCAPSULATION;
+
+		skb_shinfo(skb)->gso_type |= encap ?
+					     SKB_GSO_TCP_FIXEDID_INNER : SKB_GSO_TCP_FIXEDID_OUTER;
+	}
 
 	skb->csum_start = (unsigned char *)tcp - skb->head;
 	skb->csum_offset = offsetof(struct tcphdr, check);
diff --git a/drivers/net/ethernet/sfc/ef100_tx.c b/drivers/net/ethernet/sfc/ef100_tx.c
index e6b6be549581..aab2425e62bb 100644
--- a/drivers/net/ethernet/sfc/ef100_tx.c
+++ b/drivers/net/ethernet/sfc/ef100_tx.c
@@ -189,7 +189,8 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
 {
 	bool gso_partial = skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL;
 	unsigned int len, ip_offset, tcp_offset, payload_segs;
-	u32 mangleid = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
+	u32 mangleid_outer = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
+	u32 mangleid_inner = ESE_GZ_TX_DESC_IP4_ID_INC_MOD16;
 	unsigned int outer_ip_offset, outer_l4_offset;
 	u16 vlan_tci = skb_vlan_tag_get(skb);
 	u32 mss = skb_shinfo(skb)->gso_size;
@@ -200,8 +201,10 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
 	bool outer_csum;
 	u32 paylen;
 
-	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID)
-		mangleid = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID_OUTER)
+		mangleid_outer = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID_INNER)
+		mangleid_inner = ESE_GZ_TX_DESC_IP4_ID_NO_OP;
 	if (efx->net_dev->features & NETIF_F_HW_VLAN_CTAG_TX)
 		vlan_enable = skb_vlan_tag_present(skb);
 
@@ -239,14 +242,13 @@ static void ef100_make_tso_desc(struct efx_nic *efx,
 			      ESF_GZ_TX_TSO_CSO_INNER_L4, 1,
 			      ESF_GZ_TX_TSO_INNER_L3_OFF_W, ip_offset >> 1,
 			      ESF_GZ_TX_TSO_INNER_L4_OFF_W, tcp_offset >> 1,
-			      ESF_GZ_TX_TSO_ED_INNER_IP4_ID, mangleid,
+			      ESF_GZ_TX_TSO_ED_INNER_IP4_ID, mangleid_inner,
 			      ESF_GZ_TX_TSO_ED_INNER_IP_LEN, 1,
 			      ESF_GZ_TX_TSO_OUTER_L3_OFF_W, outer_ip_offset >> 1,
 			      ESF_GZ_TX_TSO_OUTER_L4_OFF_W, outer_l4_offset >> 1,
 			      ESF_GZ_TX_TSO_ED_OUTER_UDP_LEN, udp_encap && !gso_partial,
 			      ESF_GZ_TX_TSO_ED_OUTER_IP_LEN, encap && !gso_partial,
-			      ESF_GZ_TX_TSO_ED_OUTER_IP4_ID, encap ? mangleid :
-								     ESE_GZ_TX_DESC_IP4_ID_NO_OP,
+			      ESF_GZ_TX_TSO_ED_OUTER_IP4_ID, mangleid_outer,
 			      ESF_GZ_TX_TSO_VLAN_INSERT_EN, vlan_enable,
 			      ESF_GZ_TX_TSO_VLAN_INSERT_TCI, vlan_tci
 		);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5e5de4b0a433..e55ba6918b0a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -5287,13 +5287,18 @@ void skb_warn_bad_offload(const struct sk_buff *skb);
 
 static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 {
-	netdev_features_t feature = (netdev_features_t)gso_type << NETIF_F_GSO_SHIFT;
+	netdev_features_t feature;
+
+	if (gso_type & (SKB_GSO_TCP_FIXEDID_OUTER | SKB_GSO_TCP_FIXEDID_INNER))
+		gso_type |= __SKB_GSO_TCP_FIXEDID;
+
+	feature = ((netdev_features_t)gso_type << NETIF_F_GSO_SHIFT) & NETIF_F_GSO_MASK;
 
 	/* check flags correspondence */
 	BUILD_BUG_ON(SKB_GSO_TCPV4   != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_DODGY   != (NETIF_F_GSO_ROBUST >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCP_ECN != (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
-	BUILD_BUG_ON(SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
+	BUILD_BUG_ON(__SKB_GSO_TCP_FIXEDID != (NETIF_F_TSO_MANGLEID >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_TCPV6   != (NETIF_F_TSO6 >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_FCOE    != (NETIF_F_FSO >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_GRE     != (NETIF_F_GSO_GRE >> NETIF_F_GSO_SHIFT));
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 14b923ddb6df..5cfbf6e8c7ea 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -674,7 +674,7 @@ enum {
 	/* This indicates the tcp segment has CWR set. */
 	SKB_GSO_TCP_ECN = 1 << 2,
 
-	SKB_GSO_TCP_FIXEDID = 1 << 3,
+	__SKB_GSO_TCP_FIXEDID = 1 << 3,
 
 	SKB_GSO_TCPV6 = 1 << 4,
 
@@ -707,6 +707,10 @@ enum {
 	SKB_GSO_FRAGLIST = 1 << 18,
 
 	SKB_GSO_TCP_ACCECN = 1 << 19,
+
+	/* These don't correspond with netdev features. */
+	SKB_GSO_TCP_FIXEDID_OUTER = 1 << 30,
+	SKB_GSO_TCP_FIXEDID_INNER = 1 << 31,
 };
 
 #if BITS_PER_LONG > 32
diff --git a/net/core/dev.c b/net/core/dev.c
index 68dc47d7e700..9941c39b5970 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3772,10 +3772,9 @@ static netdev_features_t gso_features_check(const struct sk_buff *skb,
 	 * IPv4 header has the potential to be fragmented.
 	 */
 	if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4) {
-		struct iphdr *iph = skb->encapsulation ?
-				    inner_ip_hdr(skb) : ip_hdr(skb);
-
-		if (!(iph->frag_off & htons(IP_DF)))
+		if (!(ip_hdr(skb)->frag_off & htons(IP_DF)) ||
+		    (skb->encapsulation &&
+		     !(inner_ip_hdr(skb)->frag_off & htons(IP_DF))))
 			features &= ~NETIF_F_TSO_MANGLEID;
 	}
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 76e38092cd8a..7f29b485009d 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1393,14 +1393,13 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 
 	segs = ERR_PTR(-EPROTONOSUPPORT);
 
-	if (!skb->encapsulation || encap) {
-		udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
-		fixedid = !!(skb_shinfo(skb)->gso_type & SKB_GSO_TCP_FIXEDID);
+	/* fixed ID is invalid if DF bit is not set */
+	fixedid = !!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID_OUTER << encap));
+	if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
+		goto out;
 
-		/* fixed ID is invalid if DF bit is not set */
-		if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
-			goto out;
-	}
+	if (!skb->encapsulation || encap)
+		udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
 
 	ops = rcu_dereference(inet_offloads[proto]);
 	if (likely(ops && ops->callbacks.gso_segment)) {
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 74f46663eeae..83fa6b2aecf4 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -485,10 +485,8 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
 	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
 				  iph->daddr, 0);
 
-	bool is_fixedid = (NAPI_GRO_CB(skb)->ip_fixedid >> skb->encapsulation) & 1;
-
 	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
-			(is_fixedid * SKB_GSO_TCP_FIXEDID);
+			(NAPI_GRO_CB(skb)->ip_fixedid * SKB_GSO_TCP_FIXEDID_OUTER);
 
 	tcp_gro_complete(skb);
 	return 0;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks
  2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
                   ` (2 preceding siblings ...)
  2025-08-19  6:32 ` [PATCH net-next v2 3/5] net: gso: restore ids of outer ip headers correctly Richard Gobert
@ 2025-08-19  6:32 ` Richard Gobert
  2025-08-20 11:24   ` Willem de Bruijn
  2025-08-19  6:32 ` [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh Richard Gobert
  4 siblings, 1 reply; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Currently, packets with fixed IDs will be merged only if their
don't-fragment bit is set. Merged packets are re-split into segments
before being fragmented, so the result is the same as if the packets
weren't merged to begin with.

Remove unnecessary don't-fragment checks.

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 include/net/gro.h                 | 5 ++---
 net/ipv4/af_inet.c                | 3 ---
 tools/testing/selftests/net/gro.c | 9 ++++-----
 3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index e7997a9fb30b..e3affb2e2ca8 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -448,17 +448,16 @@ static inline int inet_gro_flush(const struct iphdr *iph, const struct iphdr *ip
 	const u32 id2 = ntohl(*(__be32 *)&iph2->id);
 	const u16 ipid_offset = (id >> 16) - (id2 >> 16);
 	const u16 count = NAPI_GRO_CB(p)->count;
-	const u32 df = id & IP_DF;
 
 	/* All fields must match except length and checksum. */
-	if ((iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF)))
+	if ((iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | ((id ^ id2) & IP_DF))
 		return true;
 
 	/* When we receive our second frame we can make a decision on if we
 	 * continue this flow as an atomic flow with a fixed ID or if we use
 	 * an incrementing ID.
 	 */
-	if (count == 1 && df && !ipid_offset)
+	if (count == 1 && !ipid_offset)
 		NAPI_GRO_CB(p)->ip_fixedid |= 1 << inner;
 
 	return ipid_offset ^ (count * !(NAPI_GRO_CB(p)->ip_fixedid & (1 << inner)));
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 7f29b485009d..b4052cabdfc4 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1393,10 +1393,7 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 
 	segs = ERR_PTR(-EPROTONOSUPPORT);
 
-	/* fixed ID is invalid if DF bit is not set */
 	fixedid = !!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCP_FIXEDID_OUTER << encap));
-	if (fixedid && !(ip_hdr(skb)->frag_off & htons(IP_DF)))
-		goto out;
 
 	if (!skb->encapsulation || encap)
 		udpfrag = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP);
diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index d5824eadea10..3d4a82a2607c 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -670,7 +670,7 @@ static void send_flush_id_case(int fd, struct sockaddr_ll *daddr, int tcase)
 		iph2->id = htons(9);
 		break;
 
-	case 3: /* DF=0, Fixed - should not coalesce */
+	case 3: /* DF=0, Fixed - should coalesce */
 		iph1->frag_off &= ~htons(IP_DF);
 		iph1->id = htons(8);
 
@@ -1188,10 +1188,9 @@ static void gro_receiver(void)
 			correct_payload[0] = PAYLOAD_LEN * 2;
 			check_recv_pkts(rxfd, correct_payload, 1);
 
-			printf("DF=0, Fixed - should not coalesce: ");
-			correct_payload[0] = PAYLOAD_LEN;
-			correct_payload[1] = PAYLOAD_LEN;
-			check_recv_pkts(rxfd, correct_payload, 2);
+			printf("DF=0, Fixed - should coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			check_recv_pkts(rxfd, correct_payload, 1);
 
 			printf("DF=1, 2 Incrementing and one fixed - should coalesce only first 2 packets: ");
 			correct_payload[0] = PAYLOAD_LEN * 2;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh
  2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
                   ` (3 preceding siblings ...)
  2025-08-19  6:32 ` [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks Richard Gobert
@ 2025-08-19  6:32 ` Richard Gobert
  2025-08-20 11:21   ` Willem de Bruijn
  4 siblings, 1 reply; 16+ messages in thread
From: Richard Gobert @ 2025-08-19  6:32 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Add IPIP test-cases to the GRO selftest.

This selftest already contains IP ID test-cases. They are now
also tested for encapsulated packets.

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
 tools/testing/selftests/net/gro.c  | 49 ++++++++++++++++++++++++------
 tools/testing/selftests/net/gro.sh |  5 +--
 2 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index 3d4a82a2607c..aeb4973418a4 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -93,6 +93,7 @@ static bool tx_socket = true;
 static int tcp_offset = -1;
 static int total_hdr_len = -1;
 static int ethhdr_proto = -1;
+static bool ipip;
 static const int num_flush_id_cases = 6;
 
 static void vlog(const char *fmt, ...)
@@ -114,7 +115,9 @@ static void setup_sock_filter(int fd)
 	int ipproto_off, opt_ipproto_off;
 	int next_off;
 
-	if (proto == PF_INET)
+	if (ipip)
+		next_off = sizeof(struct iphdr) + offsetof(struct iphdr, protocol);
+	else if (proto == PF_INET)
 		next_off = offsetof(struct iphdr, protocol);
 	else
 		next_off = offsetof(struct ipv6hdr, nexthdr);
@@ -244,7 +247,7 @@ static void fill_datalinklayer(void *buf)
 	eth->h_proto = ethhdr_proto;
 }
 
-static void fill_networklayer(void *buf, int payload_len)
+static void fill_networklayer(void *buf, int payload_len, int protocol)
 {
 	struct ipv6hdr *ip6h = buf;
 	struct iphdr *iph = buf;
@@ -254,7 +257,7 @@ static void fill_networklayer(void *buf, int payload_len)
 
 		ip6h->version = 6;
 		ip6h->payload_len = htons(sizeof(struct tcphdr) + payload_len);
-		ip6h->nexthdr = IPPROTO_TCP;
+		ip6h->nexthdr = protocol;
 		ip6h->hop_limit = 8;
 		if (inet_pton(AF_INET6, addr6_src, &ip6h->saddr) != 1)
 			error(1, errno, "inet_pton source ip6");
@@ -266,7 +269,7 @@ static void fill_networklayer(void *buf, int payload_len)
 		iph->version = 4;
 		iph->ihl = 5;
 		iph->ttl = 8;
-		iph->protocol	= IPPROTO_TCP;
+		iph->protocol	= protocol;
 		iph->tot_len = htons(sizeof(struct tcphdr) +
 				payload_len + sizeof(struct iphdr));
 		iph->frag_off = htons(0x4000); /* DF = 1, MF = 0 */
@@ -313,9 +316,19 @@ static void create_packet(void *buf, int seq_offset, int ack_offset,
 {
 	memset(buf, 0, total_hdr_len);
 	memset(buf + total_hdr_len, 'a', payload_len);
+
 	fill_transportlayer(buf + tcp_offset, seq_offset, ack_offset,
 			    payload_len, fin);
-	fill_networklayer(buf + ETH_HLEN, payload_len);
+
+	if (ipip) {
+		fill_networklayer(buf + ETH_HLEN + sizeof(struct iphdr),
+				  payload_len, IPPROTO_TCP);
+		fill_networklayer(buf + ETH_HLEN, payload_len + sizeof(struct iphdr),
+				  IPPROTO_IPIP);
+	} else {
+		fill_networklayer(buf + ETH_HLEN, payload_len, IPPROTO_TCP);
+	}
+
 	fill_datalinklayer(buf);
 }
 
@@ -416,6 +429,13 @@ static void recompute_packet(char *buf, char *no_ext, int extlen)
 		iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
 		iph->check = 0;
 		iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
+
+		if (ipip) {
+			iph += 1;
+			iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
+			iph->check = 0;
+			iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
+		}
 	} else {
 		ip6h->payload_len = htons(ntohs(ip6h->payload_len) + extlen);
 	}
@@ -777,7 +797,7 @@ static void send_fragment4(int fd, struct sockaddr_ll *daddr)
 	 */
 	memset(buf + total_hdr_len, 'a', PAYLOAD_LEN * 2);
 	fill_transportlayer(buf + tcp_offset, PAYLOAD_LEN, 0, PAYLOAD_LEN * 2, 0);
-	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN);
+	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN, IPPROTO_TCP);
 	fill_datalinklayer(buf);
 
 	iph->frag_off = htons(0x6000); // DF = 1, MF = 1
@@ -1071,7 +1091,7 @@ static void gro_sender(void)
 		 * and min ipv6hdr size. Like MAX_HDR_SIZE,
 		 * MAX_PAYLOAD is defined with the larger header of the two.
 		 */
-		int offset = proto == PF_INET ? 20 : 0;
+		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
 		int remainder = (MAX_PAYLOAD + offset) % MSS;
 
 		send_large(txfd, &daddr, remainder);
@@ -1221,7 +1241,7 @@ static void gro_receiver(void)
 			check_recv_pkts(rxfd, correct_payload, 2);
 		}
 	} else if (strcmp(testname, "large") == 0) {
-		int offset = proto == PF_INET ? 20 : 0;
+		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
 		int remainder = (MAX_PAYLOAD + offset) % MSS;
 
 		correct_payload[0] = (MAX_PAYLOAD + offset);
@@ -1250,6 +1270,7 @@ static void parse_args(int argc, char **argv)
 		{ "iface", required_argument, NULL, 'i' },
 		{ "ipv4", no_argument, NULL, '4' },
 		{ "ipv6", no_argument, NULL, '6' },
+		{ "ipip", no_argument, NULL, 'e' },
 		{ "rx", no_argument, NULL, 'r' },
 		{ "saddr", required_argument, NULL, 's' },
 		{ "smac", required_argument, NULL, 'S' },
@@ -1259,7 +1280,7 @@ static void parse_args(int argc, char **argv)
 	};
 	int c;
 
-	while ((c = getopt_long(argc, argv, "46d:D:i:rs:S:t:v", opts, NULL)) != -1) {
+	while ((c = getopt_long(argc, argv, "46ed:D:i:rs:S:t:v", opts, NULL)) != -1) {
 		switch (c) {
 		case '4':
 			proto = PF_INET;
@@ -1269,6 +1290,11 @@ static void parse_args(int argc, char **argv)
 			proto = PF_INET6;
 			ethhdr_proto = htons(ETH_P_IPV6);
 			break;
+		case 'e':
+			ipip = true;
+			proto = PF_INET;
+			ethhdr_proto = htons(ETH_P_IP);
+			break;
 		case 'd':
 			addr4_dst = addr6_dst = optarg;
 			break;
@@ -1304,7 +1330,10 @@ int main(int argc, char **argv)
 {
 	parse_args(argc, argv);
 
-	if (proto == PF_INET) {
+	if (ipip) {
+		tcp_offset = ETH_HLEN + sizeof(struct iphdr) * 2;
+		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
+	} else if (proto == PF_INET) {
 		tcp_offset = ETH_HLEN + sizeof(struct iphdr);
 		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
 	} else if (proto == PF_INET6) {
diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
index 9e3f186bc2a1..d16ec365b3cf 100755
--- a/tools/testing/selftests/net/gro.sh
+++ b/tools/testing/selftests/net/gro.sh
@@ -4,7 +4,7 @@
 readonly SERVER_MAC="aa:00:00:00:00:02"
 readonly CLIENT_MAC="aa:00:00:00:00:01"
 readonly TESTS=("data" "ack" "flags" "tcp" "ip" "large")
-readonly PROTOS=("ipv4" "ipv6")
+readonly PROTOS=("ipv4" "ipv6" "ipip")
 dev=""
 test="all"
 proto="ipv4"
@@ -31,7 +31,8 @@ run_test() {
       1>>log.txt
     wait "${server_pid}"
     exit_code=$?
-    if [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" && \
+    if [[ ( ${test} == "large" || ${protocol} == "ipip" ) && \
+          -n "${KSFT_MACHINE_SLOW}" && \
           ${exit_code} -ne 0 ]]; then
         echo "Ignoring errors due to slow environment" 1>&2
         exit_code=0
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb
  2025-08-19  6:32 ` [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
@ 2025-08-19  8:47   ` Eric Dumazet
  2025-08-19 12:26     ` Richard Gobert
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2025-08-19  8:47 UTC (permalink / raw)
  To: Richard Gobert
  Cc: netdev, davem, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers

On Mon, Aug 18, 2025 at 11:32 PM Richard Gobert
<richardbgobert@gmail.com> wrote:
>
> Remove is_ipv6 from napi_gro_cb and use sk->sk_family instead.
> This frees up space for another ip_fixedid bit that will be added
> in the next commit.
>
> Using sk->sk_family is reliable since udp_sock_create always
> creates either a AF_INET or a AF_INET6 socket, and IPv6-FOU
> doesn't support receiving IPv4 packets.
>
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
>  include/net/gro.h      |  3 ---
>  net/ipv4/fou_core.c    | 31 +++++++++++++------------------
>  net/ipv4/udp_offload.c |  2 --
>  net/ipv6/udp_offload.c |  2 --
>  4 files changed, 13 insertions(+), 25 deletions(-)
>
> diff --git a/include/net/gro.h b/include/net/gro.h
> index a0fca7ac6e7e..87c68007f949 100644
> --- a/include/net/gro.h
> +++ b/include/net/gro.h
> @@ -71,9 +71,6 @@ struct napi_gro_cb {
>                 /* Free the skb? */
>                 u8      free:2;
>
> -               /* Used in foo-over-udp, set in udp[46]_gro_receive */
> -               u8      is_ipv6:1;
> -
>                 /* Used in GRE, set in fou/gue_gro_receive */
>                 u8      is_fou:1;
>
> diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
> index 3e30745e2c09..261ea2cf460f 100644
> --- a/net/ipv4/fou_core.c
> +++ b/net/ipv4/fou_core.c
> @@ -228,21 +228,26 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
>         return 0;
>  }
>
> +static inline const struct net_offload *fou_gro_ops(struct sock *sk, int proto)

const struct sock *sk

> +{
> +       const struct net_offload __rcu **offloads;
> +
> +       /* FOU doesn't allow IPv4 on IPv6 sockets. */
> +       offloads = sk->sk_family == AF_INET6 ? inet6_offloads : inet_offloads;


Do we have a guarantee sk->sk_family can not change under us ?

IPV6_ADDRFORM is available to UDP sockets after all.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb
  2025-08-19  8:47   ` Eric Dumazet
@ 2025-08-19 12:26     ` Richard Gobert
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Gobert @ 2025-08-19 12:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers

Eric Dumazet wrote:
> On Mon, Aug 18, 2025 at 11:32 PM Richard Gobert
> <richardbgobert@gmail.com> wrote:
>>
>> Remove is_ipv6 from napi_gro_cb and use sk->sk_family instead.
>> This frees up space for another ip_fixedid bit that will be added
>> in the next commit.
>>
>> Using sk->sk_family is reliable since udp_sock_create always
>> creates either a AF_INET or a AF_INET6 socket, and IPv6-FOU
>> doesn't support receiving IPv4 packets.
>>
>> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
>> ---
>>  include/net/gro.h      |  3 ---
>>  net/ipv4/fou_core.c    | 31 +++++++++++++------------------
>>  net/ipv4/udp_offload.c |  2 --
>>  net/ipv6/udp_offload.c |  2 --
>>  4 files changed, 13 insertions(+), 25 deletions(-)
>>
>> diff --git a/include/net/gro.h b/include/net/gro.h
>> index a0fca7ac6e7e..87c68007f949 100644
>> --- a/include/net/gro.h
>> +++ b/include/net/gro.h
>> @@ -71,9 +71,6 @@ struct napi_gro_cb {
>>                 /* Free the skb? */
>>                 u8      free:2;
>>
>> -               /* Used in foo-over-udp, set in udp[46]_gro_receive */
>> -               u8      is_ipv6:1;
>> -
>>                 /* Used in GRE, set in fou/gue_gro_receive */
>>                 u8      is_fou:1;
>>
>> diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
>> index 3e30745e2c09..261ea2cf460f 100644
>> --- a/net/ipv4/fou_core.c
>> +++ b/net/ipv4/fou_core.c
>> @@ -228,21 +228,26 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
>>         return 0;
>>  }
>>
>> +static inline const struct net_offload *fou_gro_ops(struct sock *sk, int proto)
> 
> const struct sock *sk
> 
>> +{
>> +       const struct net_offload __rcu **offloads;
>> +
>> +       /* FOU doesn't allow IPv4 on IPv6 sockets. */
>> +       offloads = sk->sk_family == AF_INET6 ? inet6_offloads : inet_offloads;
> 
> 
> Do we have a guarantee sk->sk_family can not change under us ?
> 
> IPV6_ADDRFORM is available to UDP sockets after all.

This is a kernel socket, and FOU doesn't use IPV6_ADDRFORM.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-19  6:32 ` [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
@ 2025-08-19 14:46   ` Willem de Bruijn
  2025-08-20  0:30     ` Jakub Kicinski
  2025-08-20  0:30   ` Jakub Kicinski
  1 sibling, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2025-08-19 14:46 UTC (permalink / raw)
  To: Richard Gobert, netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Richard Gobert wrote:
> Only merge encapsulated packets if their outer IDs are either
> incrementing or fixed, just like for inner IDs and IDs of non-encapsulated
> packets.
> 
> Add another ip_fixedid bit for a total of two bits: one for outer IDs and
> one for inner IDs.
> 
> This commit preserves the current behavior of GSO where only the IDs of the
> inner-most headers are restored correctly.
> 
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
>  include/net/gro.h      | 26 +++++++++++---------------
>  net/ipv4/tcp_offload.c |  4 +++-
>  2 files changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/include/net/gro.h b/include/net/gro.h
> index 87c68007f949..e7997a9fb30b 100644
> --- a/include/net/gro.h
> +++ b/include/net/gro.h
> @@ -75,7 +75,7 @@ struct napi_gro_cb {
>  		u8	is_fou:1;
>  
>  		/* Used to determine if ipid_offset can be ignored */
> -		u8	ip_fixedid:1;
> +		u8	ip_fixedid:2;
>  
>  		/* Number of gro_receive callbacks this packet already went through */
>  		u8 recursion_counter:4;
> @@ -442,29 +442,26 @@ static inline __wsum ip6_gro_compute_pseudo(const struct sk_buff *skb,
>  }
>  
>  static inline int inet_gro_flush(const struct iphdr *iph, const struct iphdr *iph2,
> -				 struct sk_buff *p, bool outer)
> +				 struct sk_buff *p, bool inner)
>  {
>  	const u32 id = ntohl(*(__be32 *)&iph->id);
>  	const u32 id2 = ntohl(*(__be32 *)&iph2->id);
>  	const u16 ipid_offset = (id >> 16) - (id2 >> 16);
>  	const u16 count = NAPI_GRO_CB(p)->count;
>  	const u32 df = id & IP_DF;
> -	int flush;
>  
>  	/* All fields must match except length and checksum. */
> -	flush = (iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF));
> -
> -	if (flush | (outer && df))
> -		return flush;
> +	if ((iph->ttl ^ iph2->ttl) | (iph->tos ^ iph2->tos) | (df ^ (id2 & IP_DF)))
> +		return true;
>  
>  	/* When we receive our second frame we can make a decision on if we
>  	 * continue this flow as an atomic flow with a fixed ID or if we use
>  	 * an incrementing ID.
>  	 */
>  	if (count == 1 && df && !ipid_offset)
> -		NAPI_GRO_CB(p)->ip_fixedid = true;
> +		NAPI_GRO_CB(p)->ip_fixedid |= 1 << inner;
>  
> -	return ipid_offset ^ (count * !NAPI_GRO_CB(p)->ip_fixedid);
> +	return ipid_offset ^ (count * !(NAPI_GRO_CB(p)->ip_fixedid & (1 << inner)));
>  }
>  
>  static inline int ipv6_gro_flush(const struct ipv6hdr *iph, const struct ipv6hdr *iph2)
> @@ -479,7 +476,7 @@ static inline int ipv6_gro_flush(const struct ipv6hdr *iph, const struct ipv6hdr
>  
>  static inline int __gro_receive_network_flush(const void *th, const void *th2,
>  					      struct sk_buff *p, const u16 diff,
> -					      bool outer)
> +					      bool inner)
>  {
>  	const void *nh = th - diff;
>  	const void *nh2 = th2 - diff;
> @@ -487,19 +484,18 @@ static inline int __gro_receive_network_flush(const void *th, const void *th2,
>  	if (((struct iphdr *)nh)->version == 6)
>  		return ipv6_gro_flush(nh, nh2);
>  	else
> -		return inet_gro_flush(nh, nh2, p, outer);
> +		return inet_gro_flush(nh, nh2, p, inner);
>  }
>  
>  static inline int gro_receive_network_flush(const void *th, const void *th2,
>  					    struct sk_buff *p)
>  {
> -	const bool encap_mark = NAPI_GRO_CB(p)->encap_mark;
>  	int off = skb_transport_offset(p);
>  	int flush;
>  
> -	flush = __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->network_offset, encap_mark);
> -	if (encap_mark)
> -		flush |= __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->inner_network_offset, false);
> +	flush = __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->network_offset, false);
> +	if (NAPI_GRO_CB(p)->encap_mark)
> +		flush |= __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->inner_network_offset, true);

It's a bit unclear what the meaning of inner and outer are in the
unencapsulated (i.e., normal) case. In my intuition outer only exists
if encapsulated, but it seems you reason the other way around: inner
is absent unless encapsulated. I guess they're equivalent, but please
explicitly comment this choice somewhere.

>  
>  	return flush;
>  }
> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
> index be5c2294610e..74f46663eeae 100644
> --- a/net/ipv4/tcp_offload.c
> +++ b/net/ipv4/tcp_offload.c
> @@ -485,8 +485,10 @@ INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff)
>  	th->check = ~tcp_v4_check(skb->len - thoff, iph->saddr,
>  				  iph->daddr, 0);
>  
> +	bool is_fixedid = (NAPI_GRO_CB(skb)->ip_fixedid >> skb->encapsulation) & 1;
> +

Variable definition at top of function (or basic block)

>  	skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4 |
> -			(NAPI_GRO_CB(skb)->ip_fixedid * SKB_GSO_TCP_FIXEDID);
> +			(is_fixedid * SKB_GSO_TCP_FIXEDID);
>  
>  	tcp_gro_complete(skb);
>  	return 0;
> -- 
> 2.36.1
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-19 14:46   ` Willem de Bruijn
@ 2025-08-20  0:30     ` Jakub Kicinski
  2025-08-20 12:27       ` Richard Gobert
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2025-08-20  0:30 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Richard Gobert, netdev, davem, edumazet, pabeni, horms, corbet,
	shenjian15, salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt,
	mbloch, leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah,
	sdf, ahmed.zaki, aleksander.lobakin, florian.fainelli,
	linux-kernel, linux-net-drivers

On Tue, 19 Aug 2025 10:46:01 -0400 Willem de Bruijn wrote:
> It's a bit unclear what the meaning of inner and outer are in the
> unencapsulated (i.e., normal) case. In my intuition outer only exists
> if encapsulated, but it seems you reason the other way around: inner
> is absent unless encapsulated. 

+1, whether the header in unencapsulted packet is inner or outer
is always a source of unnecessary confusion. I would have also
preferred your suggestion on v1 to use _ENCAP in the name.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-19  6:32 ` [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
  2025-08-19 14:46   ` Willem de Bruijn
@ 2025-08-20  0:30   ` Jakub Kicinski
  1 sibling, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2025-08-20  0:30 UTC (permalink / raw)
  To: Richard Gobert
  Cc: netdev, davem, edumazet, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers

On Tue, 19 Aug 2025 08:32:20 +0200 Richard Gobert wrote:
> +	flush = __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->network_offset, false);
> +	if (NAPI_GRO_CB(p)->encap_mark)
> +		flush |= __gro_receive_network_flush(th, th2, p, off - NAPI_GRO_CB(p)->inner_network_offset, true);

Please wrap lines at 80 columns
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh
  2025-08-19  6:32 ` [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh Richard Gobert
@ 2025-08-20 11:21   ` Willem de Bruijn
  2025-08-20 12:21     ` Richard Gobert
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2025-08-20 11:21 UTC (permalink / raw)
  To: Richard Gobert, netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Richard Gobert wrote:
> Add IPIP test-cases to the GRO selftest.
> 
> This selftest already contains IP ID test-cases. They are now
> also tested for encapsulated packets.
> 
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
>  tools/testing/selftests/net/gro.c  | 49 ++++++++++++++++++++++++------
>  tools/testing/selftests/net/gro.sh |  5 +--
>  2 files changed, 42 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
> index 3d4a82a2607c..aeb4973418a4 100644
> --- a/tools/testing/selftests/net/gro.c
> +++ b/tools/testing/selftests/net/gro.c
> @@ -93,6 +93,7 @@ static bool tx_socket = true;
>  static int tcp_offset = -1;
>  static int total_hdr_len = -1;
>  static int ethhdr_proto = -1;
> +static bool ipip;
>  static const int num_flush_id_cases = 6;
>  
>  static void vlog(const char *fmt, ...)
> @@ -114,7 +115,9 @@ static void setup_sock_filter(int fd)
>  	int ipproto_off, opt_ipproto_off;
>  	int next_off;
>  
> -	if (proto == PF_INET)
> +	if (ipip)
> +		next_off = sizeof(struct iphdr) + offsetof(struct iphdr, protocol);
> +	else if (proto == PF_INET)
>  		next_off = offsetof(struct iphdr, protocol);
>  	else
>  		next_off = offsetof(struct ipv6hdr, nexthdr);
> @@ -244,7 +247,7 @@ static void fill_datalinklayer(void *buf)
>  	eth->h_proto = ethhdr_proto;
>  }
>  
> -static void fill_networklayer(void *buf, int payload_len)
> +static void fill_networklayer(void *buf, int payload_len, int protocol)
>  {
>  	struct ipv6hdr *ip6h = buf;
>  	struct iphdr *iph = buf;
> @@ -254,7 +257,7 @@ static void fill_networklayer(void *buf, int payload_len)
>  
>  		ip6h->version = 6;
>  		ip6h->payload_len = htons(sizeof(struct tcphdr) + payload_len);
> -		ip6h->nexthdr = IPPROTO_TCP;
> +		ip6h->nexthdr = protocol;
>  		ip6h->hop_limit = 8;
>  		if (inet_pton(AF_INET6, addr6_src, &ip6h->saddr) != 1)
>  			error(1, errno, "inet_pton source ip6");
> @@ -266,7 +269,7 @@ static void fill_networklayer(void *buf, int payload_len)
>  		iph->version = 4;
>  		iph->ihl = 5;
>  		iph->ttl = 8;
> -		iph->protocol	= IPPROTO_TCP;
> +		iph->protocol	= protocol;
>  		iph->tot_len = htons(sizeof(struct tcphdr) +
>  				payload_len + sizeof(struct iphdr));
>  		iph->frag_off = htons(0x4000); /* DF = 1, MF = 0 */
> @@ -313,9 +316,19 @@ static void create_packet(void *buf, int seq_offset, int ack_offset,
>  {
>  	memset(buf, 0, total_hdr_len);
>  	memset(buf + total_hdr_len, 'a', payload_len);
> +
>  	fill_transportlayer(buf + tcp_offset, seq_offset, ack_offset,
>  			    payload_len, fin);
> -	fill_networklayer(buf + ETH_HLEN, payload_len);
> +
> +	if (ipip) {
> +		fill_networklayer(buf + ETH_HLEN + sizeof(struct iphdr),
> +				  payload_len, IPPROTO_TCP);
> +		fill_networklayer(buf + ETH_HLEN, payload_len + sizeof(struct iphdr),
> +				  IPPROTO_IPIP);
> +	} else {
> +		fill_networklayer(buf + ETH_HLEN, payload_len, IPPROTO_TCP);
> +	}
> +
>  	fill_datalinklayer(buf);
>  }
>  
> @@ -416,6 +429,13 @@ static void recompute_packet(char *buf, char *no_ext, int extlen)
>  		iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
>  		iph->check = 0;
>  		iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
> +
> +		if (ipip) {
> +			iph += 1;
> +			iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
> +			iph->check = 0;
> +			iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
> +		}
>  	} else {
>  		ip6h->payload_len = htons(ntohs(ip6h->payload_len) + extlen);
>  	}
> @@ -777,7 +797,7 @@ static void send_fragment4(int fd, struct sockaddr_ll *daddr)
>  	 */
>  	memset(buf + total_hdr_len, 'a', PAYLOAD_LEN * 2);
>  	fill_transportlayer(buf + tcp_offset, PAYLOAD_LEN, 0, PAYLOAD_LEN * 2, 0);
> -	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN);
> +	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN, IPPROTO_TCP);
>  	fill_datalinklayer(buf);
>  
>  	iph->frag_off = htons(0x6000); // DF = 1, MF = 1
> @@ -1071,7 +1091,7 @@ static void gro_sender(void)
>  		 * and min ipv6hdr size. Like MAX_HDR_SIZE,
>  		 * MAX_PAYLOAD is defined with the larger header of the two.
>  		 */
> -		int offset = proto == PF_INET ? 20 : 0;
> +		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
>  		int remainder = (MAX_PAYLOAD + offset) % MSS;
>  
>  		send_large(txfd, &daddr, remainder);
> @@ -1221,7 +1241,7 @@ static void gro_receiver(void)
>  			check_recv_pkts(rxfd, correct_payload, 2);
>  		}
>  	} else if (strcmp(testname, "large") == 0) {
> -		int offset = proto == PF_INET ? 20 : 0;
> +		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
>  		int remainder = (MAX_PAYLOAD + offset) % MSS;
>  
>  		correct_payload[0] = (MAX_PAYLOAD + offset);
> @@ -1250,6 +1270,7 @@ static void parse_args(int argc, char **argv)
>  		{ "iface", required_argument, NULL, 'i' },
>  		{ "ipv4", no_argument, NULL, '4' },
>  		{ "ipv6", no_argument, NULL, '6' },
> +		{ "ipip", no_argument, NULL, 'e' },
>  		{ "rx", no_argument, NULL, 'r' },
>  		{ "saddr", required_argument, NULL, 's' },
>  		{ "smac", required_argument, NULL, 'S' },
> @@ -1259,7 +1280,7 @@ static void parse_args(int argc, char **argv)
>  	};
>  	int c;
>  
> -	while ((c = getopt_long(argc, argv, "46d:D:i:rs:S:t:v", opts, NULL)) != -1) {
> +	while ((c = getopt_long(argc, argv, "46ed:D:i:rs:S:t:v", opts, NULL)) != -1) {

Please keep alphabetical order.

>  		switch (c) {
>  		case '4':
>  			proto = PF_INET;
> @@ -1269,6 +1290,11 @@ static void parse_args(int argc, char **argv)
>  			proto = PF_INET6;
>  			ethhdr_proto = htons(ETH_P_IPV6);
>  			break;
> +		case 'e':
> +			ipip = true;
> +			proto = PF_INET;
> +			ethhdr_proto = htons(ETH_P_IP);
> +			break;
>  		case 'd':
>  			addr4_dst = addr6_dst = optarg;
>  			break;
> @@ -1304,7 +1330,10 @@ int main(int argc, char **argv)
>  {
>  	parse_args(argc, argv);
>  
> -	if (proto == PF_INET) {
> +	if (ipip) {
> +		tcp_offset = ETH_HLEN + sizeof(struct iphdr) * 2;
> +		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
> +	} else if (proto == PF_INET) {
>  		tcp_offset = ETH_HLEN + sizeof(struct iphdr);
>  		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
>  	} else if (proto == PF_INET6) {
> diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
> index 9e3f186bc2a1..d16ec365b3cf 100755
> --- a/tools/testing/selftests/net/gro.sh
> +++ b/tools/testing/selftests/net/gro.sh
> @@ -4,7 +4,7 @@
>  readonly SERVER_MAC="aa:00:00:00:00:02"
>  readonly CLIENT_MAC="aa:00:00:00:00:01"
>  readonly TESTS=("data" "ack" "flags" "tcp" "ip" "large")
> -readonly PROTOS=("ipv4" "ipv6")
> +readonly PROTOS=("ipv4" "ipv6" "ipip")
>  dev=""
>  test="all"
>  proto="ipv4"
> @@ -31,7 +31,8 @@ run_test() {
>        1>>log.txt
>      wait "${server_pid}"
>      exit_code=$?
> -    if [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" && \
> +    if [[ ( ${test} == "large" || ${protocol} == "ipip" ) && \
> +          -n "${KSFT_MACHINE_SLOW}" && \

Why does the ipip test need this debug xfail path?

>            ${exit_code} -ne 0 ]]; then
>          echo "Ignoring errors due to slow environment" 1>&2
>          exit_code=0
> -- 
> 2.36.1
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks
  2025-08-19  6:32 ` [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks Richard Gobert
@ 2025-08-20 11:24   ` Willem de Bruijn
  0 siblings, 0 replies; 16+ messages in thread
From: Willem de Bruijn @ 2025-08-20 11:24 UTC (permalink / raw)
  To: Richard Gobert, netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli,
	willemdebruijn.kernel, linux-kernel, linux-net-drivers,
	Richard Gobert

Richard Gobert wrote:
> Currently, packets with fixed IDs will be merged only if their
> don't-fragment bit is set. Merged packets are re-split into segments
> before being fragmented, so the result is the same as if the packets
> weren't merged to begin with.
> 
> Remove unnecessary don't-fragment checks.
> 
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh
  2025-08-20 11:21   ` Willem de Bruijn
@ 2025-08-20 12:21     ` Richard Gobert
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Gobert @ 2025-08-20 12:21 UTC (permalink / raw)
  To: Willem de Bruijn, netdev
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli, linux-kernel,
	linux-net-drivers

Willem de Bruijn wrote:
> Richard Gobert wrote:
>> Add IPIP test-cases to the GRO selftest.
>>
>> This selftest already contains IP ID test-cases. They are now
>> also tested for encapsulated packets.
>>
>> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
>> ---
>>  tools/testing/selftests/net/gro.c  | 49 ++++++++++++++++++++++++------
>>  tools/testing/selftests/net/gro.sh |  5 +--
>>  2 files changed, 42 insertions(+), 12 deletions(-)
>>
>> diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
>> index 3d4a82a2607c..aeb4973418a4 100644
>> --- a/tools/testing/selftests/net/gro.c
>> +++ b/tools/testing/selftests/net/gro.c
>> @@ -93,6 +93,7 @@ static bool tx_socket = true;
>>  static int tcp_offset = -1;
>>  static int total_hdr_len = -1;
>>  static int ethhdr_proto = -1;
>> +static bool ipip;
>>  static const int num_flush_id_cases = 6;
>>  
>>  static void vlog(const char *fmt, ...)
>> @@ -114,7 +115,9 @@ static void setup_sock_filter(int fd)
>>  	int ipproto_off, opt_ipproto_off;
>>  	int next_off;
>>  
>> -	if (proto == PF_INET)
>> +	if (ipip)
>> +		next_off = sizeof(struct iphdr) + offsetof(struct iphdr, protocol);
>> +	else if (proto == PF_INET)
>>  		next_off = offsetof(struct iphdr, protocol);
>>  	else
>>  		next_off = offsetof(struct ipv6hdr, nexthdr);
>> @@ -244,7 +247,7 @@ static void fill_datalinklayer(void *buf)
>>  	eth->h_proto = ethhdr_proto;
>>  }
>>  
>> -static void fill_networklayer(void *buf, int payload_len)
>> +static void fill_networklayer(void *buf, int payload_len, int protocol)
>>  {
>>  	struct ipv6hdr *ip6h = buf;
>>  	struct iphdr *iph = buf;
>> @@ -254,7 +257,7 @@ static void fill_networklayer(void *buf, int payload_len)
>>  
>>  		ip6h->version = 6;
>>  		ip6h->payload_len = htons(sizeof(struct tcphdr) + payload_len);
>> -		ip6h->nexthdr = IPPROTO_TCP;
>> +		ip6h->nexthdr = protocol;
>>  		ip6h->hop_limit = 8;
>>  		if (inet_pton(AF_INET6, addr6_src, &ip6h->saddr) != 1)
>>  			error(1, errno, "inet_pton source ip6");
>> @@ -266,7 +269,7 @@ static void fill_networklayer(void *buf, int payload_len)
>>  		iph->version = 4;
>>  		iph->ihl = 5;
>>  		iph->ttl = 8;
>> -		iph->protocol	= IPPROTO_TCP;
>> +		iph->protocol	= protocol;
>>  		iph->tot_len = htons(sizeof(struct tcphdr) +
>>  				payload_len + sizeof(struct iphdr));
>>  		iph->frag_off = htons(0x4000); /* DF = 1, MF = 0 */
>> @@ -313,9 +316,19 @@ static void create_packet(void *buf, int seq_offset, int ack_offset,
>>  {
>>  	memset(buf, 0, total_hdr_len);
>>  	memset(buf + total_hdr_len, 'a', payload_len);
>> +
>>  	fill_transportlayer(buf + tcp_offset, seq_offset, ack_offset,
>>  			    payload_len, fin);
>> -	fill_networklayer(buf + ETH_HLEN, payload_len);
>> +
>> +	if (ipip) {
>> +		fill_networklayer(buf + ETH_HLEN + sizeof(struct iphdr),
>> +				  payload_len, IPPROTO_TCP);
>> +		fill_networklayer(buf + ETH_HLEN, payload_len + sizeof(struct iphdr),
>> +				  IPPROTO_IPIP);
>> +	} else {
>> +		fill_networklayer(buf + ETH_HLEN, payload_len, IPPROTO_TCP);
>> +	}
>> +
>>  	fill_datalinklayer(buf);
>>  }
>>  
>> @@ -416,6 +429,13 @@ static void recompute_packet(char *buf, char *no_ext, int extlen)
>>  		iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
>>  		iph->check = 0;
>>  		iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
>> +
>> +		if (ipip) {
>> +			iph += 1;
>> +			iph->tot_len = htons(ntohs(iph->tot_len) + extlen);
>> +			iph->check = 0;
>> +			iph->check = checksum_fold(iph, sizeof(struct iphdr), 0);
>> +		}
>>  	} else {
>>  		ip6h->payload_len = htons(ntohs(ip6h->payload_len) + extlen);
>>  	}
>> @@ -777,7 +797,7 @@ static void send_fragment4(int fd, struct sockaddr_ll *daddr)
>>  	 */
>>  	memset(buf + total_hdr_len, 'a', PAYLOAD_LEN * 2);
>>  	fill_transportlayer(buf + tcp_offset, PAYLOAD_LEN, 0, PAYLOAD_LEN * 2, 0);
>> -	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN);
>> +	fill_networklayer(buf + ETH_HLEN, PAYLOAD_LEN, IPPROTO_TCP);
>>  	fill_datalinklayer(buf);
>>  
>>  	iph->frag_off = htons(0x6000); // DF = 1, MF = 1
>> @@ -1071,7 +1091,7 @@ static void gro_sender(void)
>>  		 * and min ipv6hdr size. Like MAX_HDR_SIZE,
>>  		 * MAX_PAYLOAD is defined with the larger header of the two.
>>  		 */
>> -		int offset = proto == PF_INET ? 20 : 0;
>> +		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
>>  		int remainder = (MAX_PAYLOAD + offset) % MSS;
>>  
>>  		send_large(txfd, &daddr, remainder);
>> @@ -1221,7 +1241,7 @@ static void gro_receiver(void)
>>  			check_recv_pkts(rxfd, correct_payload, 2);
>>  		}
>>  	} else if (strcmp(testname, "large") == 0) {
>> -		int offset = proto == PF_INET ? 20 : 0;
>> +		int offset = (proto == PF_INET && !ipip) ? 20 : 0;
>>  		int remainder = (MAX_PAYLOAD + offset) % MSS;
>>  
>>  		correct_payload[0] = (MAX_PAYLOAD + offset);
>> @@ -1250,6 +1270,7 @@ static void parse_args(int argc, char **argv)
>>  		{ "iface", required_argument, NULL, 'i' },
>>  		{ "ipv4", no_argument, NULL, '4' },
>>  		{ "ipv6", no_argument, NULL, '6' },
>> +		{ "ipip", no_argument, NULL, 'e' },
>>  		{ "rx", no_argument, NULL, 'r' },
>>  		{ "saddr", required_argument, NULL, 's' },
>>  		{ "smac", required_argument, NULL, 'S' },
>> @@ -1259,7 +1280,7 @@ static void parse_args(int argc, char **argv)
>>  	};
>>  	int c;
>>  
>> -	while ((c = getopt_long(argc, argv, "46d:D:i:rs:S:t:v", opts, NULL)) != -1) {
>> +	while ((c = getopt_long(argc, argv, "46ed:D:i:rs:S:t:v", opts, NULL)) != -1) {
> 
> Please keep alphabetical order.
> 
>>  		switch (c) {
>>  		case '4':
>>  			proto = PF_INET;
>> @@ -1269,6 +1290,11 @@ static void parse_args(int argc, char **argv)
>>  			proto = PF_INET6;
>>  			ethhdr_proto = htons(ETH_P_IPV6);
>>  			break;
>> +		case 'e':
>> +			ipip = true;
>> +			proto = PF_INET;
>> +			ethhdr_proto = htons(ETH_P_IP);
>> +			break;
>>  		case 'd':
>>  			addr4_dst = addr6_dst = optarg;
>>  			break;
>> @@ -1304,7 +1330,10 @@ int main(int argc, char **argv)
>>  {
>>  	parse_args(argc, argv);
>>  
>> -	if (proto == PF_INET) {
>> +	if (ipip) {
>> +		tcp_offset = ETH_HLEN + sizeof(struct iphdr) * 2;
>> +		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
>> +	} else if (proto == PF_INET) {
>>  		tcp_offset = ETH_HLEN + sizeof(struct iphdr);
>>  		total_hdr_len = tcp_offset + sizeof(struct tcphdr);
>>  	} else if (proto == PF_INET6) {
>> diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
>> index 9e3f186bc2a1..d16ec365b3cf 100755
>> --- a/tools/testing/selftests/net/gro.sh
>> +++ b/tools/testing/selftests/net/gro.sh
>> @@ -4,7 +4,7 @@
>>  readonly SERVER_MAC="aa:00:00:00:00:02"
>>  readonly CLIENT_MAC="aa:00:00:00:00:01"
>>  readonly TESTS=("data" "ack" "flags" "tcp" "ip" "large")
>> -readonly PROTOS=("ipv4" "ipv6")
>> +readonly PROTOS=("ipv4" "ipv6" "ipip")
>>  dev=""
>>  test="all"
>>  proto="ipv4"
>> @@ -31,7 +31,8 @@ run_test() {
>>        1>>log.txt
>>      wait "${server_pid}"
>>      exit_code=$?
>> -    if [[ ${test} == "large" && -n "${KSFT_MACHINE_SLOW}" && \
>> +    if [[ ( ${test} == "large" || ${protocol} == "ipip" ) && \
>> +          -n "${KSFT_MACHINE_SLOW}" && \
> 
> Why does the ipip test need this debug xfail path?

When I tested on a VM, ipip packets sometimes didn't get merged because they
weren't received quickly enough (optimizing the selftest helped somewhat but
not entirely), but I can't seem to reproduce this anymore.

> 
>>            ${exit_code} -ne 0 ]]; then
>>          echo "Ignoring errors due to slow environment" 1>&2
>>          exit_code=0
>> -- 
>> 2.36.1
>>
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-20  0:30     ` Jakub Kicinski
@ 2025-08-20 12:27       ` Richard Gobert
  2025-08-20 15:04         ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Gobert @ 2025-08-20 12:27 UTC (permalink / raw)
  To: Jakub Kicinski, Willem de Bruijn
  Cc: netdev, davem, edumazet, pabeni, horms, corbet, shenjian15,
	salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt, mbloch,
	leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah, sdf,
	ahmed.zaki, aleksander.lobakin, florian.fainelli, linux-kernel,
	linux-net-drivers

Jakub Kicinski wrote:
> On Tue, 19 Aug 2025 10:46:01 -0400 Willem de Bruijn wrote:
>> It's a bit unclear what the meaning of inner and outer are in the
>> unencapsulated (i.e., normal) case. In my intuition outer only exists
>> if encapsulated, but it seems you reason the other way around: inner
>> is absent unless encapsulated. 
> 
> +1, whether the header in unencapsulted packet is inner or outer
> is always a source of unnecessary confusion. I would have also
> preferred your suggestion on v1 to use _ENCAP in the name.

Yeah, I guess that was the source of confusion. IMO, it makes more sense that
INNER is absent unless encapsulated since that seems to be the convention in
the rest of the network stack. (e.g. inner_network_header for both skb and
napi_gro_cb is only relevant for encapsulation)

I could rename the OUTER variant to simply SKB_GSO_TCP_FIXEDID so that it's
clearer that it's the default (resembling network_header). WDYT?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids
  2025-08-20 12:27       ` Richard Gobert
@ 2025-08-20 15:04         ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2025-08-20 15:04 UTC (permalink / raw)
  To: Richard Gobert
  Cc: Willem de Bruijn, netdev, davem, edumazet, pabeni, horms, corbet,
	shenjian15, salil.mehta, shaojijie, andrew+netdev, saeedm, tariqt,
	mbloch, leon, ecree.xilinx, dsahern, ncardwell, kuniyu, shuah,
	sdf, ahmed.zaki, aleksander.lobakin, florian.fainelli,
	linux-kernel, linux-net-drivers

On Wed, 20 Aug 2025 14:27:12 +0200 Richard Gobert wrote:
> Jakub Kicinski wrote:
> > On Tue, 19 Aug 2025 10:46:01 -0400 Willem de Bruijn wrote:  
> >> It's a bit unclear what the meaning of inner and outer are in the
> >> unencapsulated (i.e., normal) case. In my intuition outer only exists
> >> if encapsulated, but it seems you reason the other way around: inner
> >> is absent unless encapsulated.   
> > 
> > +1, whether the header in unencapsulted packet is inner or outer
> > is always a source of unnecessary confusion. I would have also
> > preferred your suggestion on v1 to use _ENCAP in the name.  
> 
> Yeah, I guess that was the source of confusion. IMO, it makes more sense that
> INNER is absent unless encapsulated since that seems to be the convention in
> the rest of the network stack. (e.g. inner_network_header for both skb and
> napi_gro_cb is only relevant for encapsulation)
> 
> I could rename the OUTER variant to simply SKB_GSO_TCP_FIXEDID so that it's
> clearer that it's the default (resembling network_header). WDYT?

Yup! That'd match the skb fields so SGTM!

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-08-20 15:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-19  6:32 [PATCH net-next v2 0/5] net: gso: restore outer ip ids correctly Richard Gobert
2025-08-19  6:32 ` [PATCH net-next v2 1/5] net: gro: remove is_ipv6 from napi_gro_cb Richard Gobert
2025-08-19  8:47   ` Eric Dumazet
2025-08-19 12:26     ` Richard Gobert
2025-08-19  6:32 ` [PATCH net-next v2 2/5] net: gro: only merge packets with incrementing or fixed outer ids Richard Gobert
2025-08-19 14:46   ` Willem de Bruijn
2025-08-20  0:30     ` Jakub Kicinski
2025-08-20 12:27       ` Richard Gobert
2025-08-20 15:04         ` Jakub Kicinski
2025-08-20  0:30   ` Jakub Kicinski
2025-08-19  6:32 ` [PATCH net-next v2 3/5] net: gso: restore ids of outer ip headers correctly Richard Gobert
2025-08-19  6:32 ` [PATCH net-next v2 4/5] net: gro: remove unnecessary df checks Richard Gobert
2025-08-20 11:24   ` Willem de Bruijn
2025-08-19  6:32 ` [PATCH net-next v2 5/5] selftests/net: test ipip packets in gro.sh Richard Gobert
2025-08-20 11:21   ` Willem de Bruijn
2025-08-20 12:21     ` Richard Gobert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).