netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v3 0/4] gro: various fixes related to UDP tunnels
@ 2024-03-22 11:46 Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Antoine Tenart @ 2024-03-22 11:46 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

Hello,

We found issues when a UDP tunnel endpoint is in a different netns than
where UDP GRO happens. This kind of setup is actually quite diverse,
from having one leg of the tunnel on a remove host, to having a tunnel
between netns (eg. being bridged in another one or on the host). In our
case that UDP tunnel was geneve.

UDP tunnel packets should not be GROed at the UDP level. The fundamental
issue here is such packet can't be detected in a foolproof way: we can't
know by looking at a packet alone and the current logic of looking up
UDP sockets is fragile (socket could be in another netns, packet could
be modified in between, etc). Because there is no way to make the GRO
code to correctly handle those packets in all cases, this series aims at
two things: making the net stack to correctly behave (as in, no crash
and no invalid packet) when such thing happens, and in some cases to
prevent this "early GRO" from happening.

First three patches fix issues when an "UDP tunneled" packet is being
GROed too early by rx-udp-gro-forwarding or rx-gro-list.

Last patch is preventing locally generated UDP tunnel packets from being
GROed. This turns out to be more complex than this patch alone as it
relies on skb->encapsulation which is currently untrusty in some cases
(see iptunnel_handle_offloads); but that should fix things in practice
and is acceptable for a fix. Future work is required to improve things
(prevent all locally generated UDP tunnel packets from being GROed),
such as fixing the misuse of skb->encapsulation in drivers; but that
would be net-next material.

Thanks!
Antoine

Since v2:
  - Fixed a build issue with IPv6=m in patch 1 (Jakub Kicinski
    feedback).
  - Fixed typo in patch 1 (Nikolay Aleksandrov feedback).
  - Added Reviewed-by tag on patch 2 (Willem de Bruijn feeback).
  - Added back conversion to CHECKSUM_UNNECESSARY but only from non
    CHECKSUM_PARTIAL in patch 3 (Paolo Abeni & Willem de Bruijn
    feeback).
  - Reworded patch 3 commit msg.

Since v1:
  - Fixed a build issue with IPv6 disabled in patch 1.
  - Reworked commit log in patch 2 (Willem de Bruijn feedback).
  - Added Reviewed-by tags on patches 1 & 4 (Willem de Bruijn feeback).

Antoine Tenart (4):
  udp: do not accept non-tunnel GSO skbs landing in a tunnel
  gro: fix ownership transfer
  udp: do not transition UDP GRO fraglist partial checksums to
    unnecessary
  udp: prevent local UDP tunnel packets from being GROed

 include/linux/udp.h    | 28 ++++++++++++++++++++++++++++
 net/core/gro.c         |  3 ++-
 net/ipv4/udp.c         |  7 +++++++
 net/ipv4/udp_offload.c | 23 +++++++++++++----------
 net/ipv6/udp.c         |  2 +-
 net/ipv6/udp_offload.c |  8 +-------
 6 files changed, 52 insertions(+), 19 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v3 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
@ 2024-03-22 11:46 ` Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 2/4] gro: fix ownership transfer Antoine Tenart
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Antoine Tenart @ 2024-03-22 11:46 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev,
	Willem de Bruijn

When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
being forwarded. If such packets might land in a tunnel this can cause
various issues and udp_gro_receive makes sure this isn't the case by
looking for a matching socket. This is performed in
udp4/6_gro_lookup_skb but only in the current netns. This is an issue
with tunneled packets when the endpoint is in another netns. In such
cases the packets will be GROed at the UDP level, which leads to various
issues later on. The same thing can happen with rx-gro-list.

We saw this with geneve packets being GROed at the UDP level. In such
case gso_size is set; later the packet goes through the geneve rx path,
the geneve header is pulled, the offset are adjusted and frag_list skbs
are not adjusted with regard to geneve. When those skbs hit
skb_fragment, it will misbehave. Different outcomes are possible
depending on what the GROed skbs look like; from corrupted packets to
kernel crashes.

One example is a BUG_ON[1] triggered in skb_segment while processing the
frag_list. Because gso_size is wrong (geneve header was pulled)
skb_segment thinks there is "geneve header size" of data in frag_list,
although it's in fact the next packet. The BUG_ON itself has nothing to
do with the issue. This is only one of the potential issues.

Looking up for a matching socket in udp_gro_receive is fragile: the
lookup could be extended to all netns (not speaking about performances)
but nothing prevents those packets from being modified in between and we
could still not find a matching socket. It's OK to keep the current
logic there as it should cover most cases but we also need to make sure
we handle tunnel packets being GROed too early.

This is done by extending the checks in udp_unexpected_gso: GSO packets
lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must
be segmented.

[1] kernel BUG at net/core/skbuff.c:4408!
    RIP: 0010:skb_segment+0xd2a/0xf70
    __udp_gso_segment+0xaa/0x560

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
---
 include/linux/udp.h    | 28 ++++++++++++++++++++++++++++
 net/ipv4/udp.c         |  7 +++++++
 net/ipv4/udp_offload.c |  6 ++++--
 net/ipv6/udp.c         |  2 +-
 4 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/include/linux/udp.h b/include/linux/udp.h
index 3748e82b627b..17539d089666 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -150,6 +150,24 @@ static inline void udp_cmsg_recv(struct msghdr *msg, struct sock *sk,
 	}
 }
 
+DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key);
+#if IS_ENABLED(CONFIG_IPV6)
+DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
+#endif
+
+static inline bool udp_encap_needed(void)
+{
+	if (static_branch_unlikely(&udp_encap_needed_key))
+		return true;
+
+#if IS_ENABLED(CONFIG_IPV6)
+	if (static_branch_unlikely(&udpv6_encap_needed_key))
+		return true;
+#endif
+
+	return false;
+}
+
 static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 {
 	if (!skb_is_gso(skb))
@@ -163,6 +181,16 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 	    !udp_test_bit(ACCEPT_FRAGLIST, sk))
 		return true;
 
+	/* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits might still
+	 * land in a tunnel as the socket check in udp_gro_receive cannot be
+	 * foolproof.
+	 */
+	if (udp_encap_needed() &&
+	    READ_ONCE(udp_sk(sk)->encap_rcv) &&
+	    !(skb_shinfo(skb)->gso_type &
+	      (SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM)))
+		return true;
+
 	return false;
 }
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 661d0e0d273f..c02bf011d4a6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -582,6 +582,13 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
 }
 
 DEFINE_STATIC_KEY_FALSE(udp_encap_needed_key);
+EXPORT_SYMBOL(udp_encap_needed_key);
+
+#if IS_ENABLED(CONFIG_IPV6)
+DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
+EXPORT_SYMBOL(udpv6_encap_needed_key);
+#endif
+
 void udp_encap_enable(void)
 {
 	static_branch_inc(&udp_encap_needed_key);
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index b9880743765c..e9719afe91cf 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -551,8 +551,10 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	unsigned int off = skb_gro_offset(skb);
 	int flush = 1;
 
-	/* we can do L4 aggregation only if the packet can't land in a tunnel
-	 * otherwise we could corrupt the inner stream
+	/* We can do L4 aggregation only if the packet can't land in a tunnel
+	 * otherwise we could corrupt the inner stream. Detecting such packets
+	 * cannot be foolproof and the aggregation might still happen in some
+	 * cases. Such packets should be caught in udp_unexpected_gso later.
 	 */
 	NAPI_GRO_CB(skb)->is_flist = 0;
 	if (!sk || !udp_sk(sk)->gro_receive) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7c1e6469d091..8b1dd7f51249 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -447,7 +447,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	goto try_again;
 }
 
-DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
+DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
 void udpv6_encap_enable(void)
 {
 	static_branch_inc(&udpv6_encap_needed_key);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v3 2/4] gro: fix ownership transfer
  2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
@ 2024-03-22 11:46 ` Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary Antoine Tenart
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Antoine Tenart @ 2024-03-22 11:46 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev,
	Willem de Bruijn

If packets are GROed with fraglist they might be segmented later on and
continue their journey in the stack. In skb_segment_list those skbs can
be reused as-is. This is an issue as their destructor was removed in
skb_gro_receive_list but not the reference to their socket, and then
they can't be orphaned. Fix this by also removing the reference to the
socket.

For example this could be observed,

  kernel BUG at include/linux/skbuff.h:3131!  (skb_orphan)
  RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
  Call Trace:
   ipv6_list_rcv+0x250/0x3f0
   __netif_receive_skb_list_core+0x49d/0x8f0
   netif_receive_skb_list_internal+0x634/0xd40
   napi_complete_done+0x1d2/0x7d0
   gro_cell_poll+0x118/0x1f0

A similar construction is found in skb_gro_receive, apply the same
change there.

Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
---
 net/core/gro.c         | 3 ++-
 net/ipv4/udp_offload.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/gro.c b/net/core/gro.c
index ee30d4f0c038..83f35d99a682 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -192,8 +192,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 	}
 
 merge:
-	/* sk owenrship - if any - completely transferred to the aggregated packet */
+	/* sk ownership - if any - completely transferred to the aggregated packet */
 	skb->destructor = NULL;
+	skb->sk = NULL;
 	delta_truesize = skb->truesize;
 	if (offset > headlen) {
 		unsigned int eat = offset - headlen;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index e9719afe91cf..3bb69464930b 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -449,8 +449,9 @@ static int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
 	NAPI_GRO_CB(p)->count++;
 	p->data_len += skb->len;
 
-	/* sk owenrship - if any - completely transferred to the aggregated packet */
+	/* sk ownership - if any - completely transferred to the aggregated packet */
 	skb->destructor = NULL;
+	skb->sk = NULL;
 	p->truesize += skb->truesize;
 	p->len += skb->len;
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
  2024-03-22 11:46 ` [PATCH net v3 2/4] gro: fix ownership transfer Antoine Tenart
@ 2024-03-22 11:46 ` Antoine Tenart
  2024-03-22 17:29   ` Willem de Bruijn
  2024-03-22 11:46 ` [PATCH net v3 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart
  2024-03-22 22:55 ` [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Jakub Kicinski
  4 siblings, 1 reply; 12+ messages in thread
From: Antoine Tenart @ 2024-03-22 11:46 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
an egress path and then their partial checksums are not fixed.

Different issues can be observed, from invalid checksum on packets to
traces like:

  gen01: hw csum failure
  skb len=3008 headroom=160 headlen=1376 tailroom=0
  mac=(106,14) net=(120,40) trans=160
  shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
  csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
  hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
  ...

Fix this by only converting CHECKSUM_NONE packets to
CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary.

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/ipv4/udp_offload.c | 8 +-------
 net/ipv6/udp_offload.c | 8 +-------
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3bb69464930b..548476d78237 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -722,13 +722,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
+		__skb_incr_checksum_unnecessary(skb);
 
 		return 0;
 	}
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 312bcaeea96f..bbd347de00b4 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -174,13 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
+		__skb_incr_checksum_unnecessary(skb);
 
 		return 0;
 	}
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v3 4/4] udp: prevent local UDP tunnel packets from being GROed
  2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
                   ` (2 preceding siblings ...)
  2024-03-22 11:46 ` [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary Antoine Tenart
@ 2024-03-22 11:46 ` Antoine Tenart
  2024-03-22 22:55 ` [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Jakub Kicinski
  4 siblings, 0 replies; 12+ messages in thread
From: Antoine Tenart @ 2024-03-22 11:46 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev,
	Willem de Bruijn

GRO has a fundamental issue with UDP tunnel packets as it can't detect
those in a foolproof way and GRO could happen before they reach the
tunnel endpoint. Previous commits have fixed issues when UDP tunnel
packets come from a remote host, but if those packets are issued locally
they could run into checksum issues.

If the inner packet has a partial checksum the information will be lost
in the GRO logic, either in udp4/6_gro_complete or in
udp_gro_complete_segment and packets will have an invalid checksum when
leaving the host.

Prevent local UDP tunnel packets from ever being GROed at the outer UDP
level.

Due to skb->encapsulation being wrongly used in some drivers this is
actually only preventing UDP tunnel packets with a partial checksum to
be GROed (see iptunnel_handle_offloads) but those were also the packets
triggering issues so in practice this should be sufficient.

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
---
 net/ipv4/udp_offload.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 548476d78237..3498dd1d0694 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -559,6 +559,12 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	 */
 	NAPI_GRO_CB(skb)->is_flist = 0;
 	if (!sk || !udp_sk(sk)->gro_receive) {
+		/* If the packet was locally encapsulated in a UDP tunnel that
+		 * wasn't detected above, do not GRO.
+		 */
+		if (skb->encapsulation)
+			goto out;
+
 		if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
 			NAPI_GRO_CB(skb)->is_flist = sk ? !udp_test_bit(GRO_ENABLED, sk) : 1;
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-22 11:46 ` [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary Antoine Tenart
@ 2024-03-22 17:29   ` Willem de Bruijn
  2024-03-25 10:37     ` Antoine Tenart
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2024-03-22 17:29 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
> are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
> this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
> an egress path and then their partial checksums are not fixed.
> 
> Different issues can be observed, from invalid checksum on packets to
> traces like:
> 
>   gen01: hw csum failure
>   skb len=3008 headroom=160 headlen=1376 tailroom=0
>   mac=(106,14) net=(120,40) trans=160
>   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
>   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
>   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
>   ...
> 
> Fix this by only converting CHECKSUM_NONE packets to
> CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary.
> 
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>

Sorry to have yet more questions, but

Should fraglist UDP GRO and non-fraglist (udp_gro_complete_segment)
have the same checksumming behavior?

Second, this leaves CHECKSUM_COMPLETE as is. Is that intentional? I
don't immediately see where GSO skb->csum would be updated.

I can take a closer look too, but did not want to delay feedback.

> ---
>  net/ipv4/udp_offload.c | 8 +-------
>  net/ipv6/udp_offload.c | 8 +-------
>  2 files changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 3bb69464930b..548476d78237 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -722,13 +722,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
>  		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
>  		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
>  
> -		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
> -			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
> -				skb->csum_level++;
> -		} else {
> -			skb->ip_summed = CHECKSUM_UNNECESSARY;
> -			skb->csum_level = 0;
> -		}
> +		__skb_incr_checksum_unnecessary(skb);
>  
>  		return 0;
>  	}
> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 312bcaeea96f..bbd347de00b4 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -174,13 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>  		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
>  		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
>  
> -		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
> -			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
> -				skb->csum_level++;
> -		} else {
> -			skb->ip_summed = CHECKSUM_UNNECESSARY;
> -			skb->csum_level = 0;
> -		}
> +		__skb_incr_checksum_unnecessary(skb);
>  
>  		return 0;
>  	}
> -- 
> 2.44.0
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 0/4] gro: various fixes related to UDP tunnels
  2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
                   ` (3 preceding siblings ...)
  2024-03-22 11:46 ` [PATCH net v3 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart
@ 2024-03-22 22:55 ` Jakub Kicinski
  2024-03-25 10:16   ` Antoine Tenart
  4 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-03-22 22:55 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: davem, pabeni, edumazet, steffen.klassert, willemdebruijn.kernel,
	netdev

On Fri, 22 Mar 2024 12:46:19 +0100 Antoine Tenart wrote:
> We found issues when a UDP tunnel endpoint is in a different netns than
> where UDP GRO happens. This kind of setup is actually quite diverse,
> from having one leg of the tunnel on a remove host, to having a tunnel
> between netns (eg. being bridged in another one or on the host). In our
> case that UDP tunnel was geneve.

I think this series makes net/udpgro_fwd.sh selftest fail.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 0/4] gro: various fixes related to UDP tunnels
  2024-03-22 22:55 ` [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Jakub Kicinski
@ 2024-03-25 10:16   ` Antoine Tenart
  0 siblings, 0 replies; 12+ messages in thread
From: Antoine Tenart @ 2024-03-25 10:16 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, edumazet, steffen.klassert, willemdebruijn.kernel,
	netdev

Quoting Jakub Kicinski (2024-03-22 23:55:50)
> On Fri, 22 Mar 2024 12:46:19 +0100 Antoine Tenart wrote:
> > We found issues when a UDP tunnel endpoint is in a different netns than
> > where UDP GRO happens. This kind of setup is actually quite diverse,
> > from having one leg of the tunnel on a remove host, to having a tunnel
> > between netns (eg. being bridged in another one or on the host). In our
> > case that UDP tunnel was geneve.
> 
> I think this series makes net/udpgro_fwd.sh selftest fail.

Thanks! Sorry for not checking this earlier...

What happens is the vxlan tunnel tests expect GRO to happen at the veth
level which is exactly the issues this series is fixing.

The below diff should fix the tests. I think it's good to keep them to
ensure GRO is actually not happening while packets are in an UDP tunnel;
I also removed the "UDP tunnel fwd perf" test as it's not providing any
useful data now IMO.

diff --git a/tools/testing/selftests/net/udpgro_fwd.sh b/tools/testing/selftests/net/udpgro_fwd.sh
index 380cb15e942e..83ed987cff34 100755
--- a/tools/testing/selftests/net/udpgro_fwd.sh
+++ b/tools/testing/selftests/net/udpgro_fwd.sh
@@ -244,7 +244,7 @@ for family in 4 6; do
        create_vxlan_pair
        ip netns exec $NS_DST ethtool -K veth$DST generic-receive-offload on
        ip netns exec $NS_DST ethtool -K veth$DST rx-gro-list on
-       run_test "GRO frag list over UDP tunnel" $OL_NET$DST 1 1
+       run_test "GRO frag list over UDP tunnel" $OL_NET$DST 10 10
        cleanup
 
        # use NAT to circumvent GRO FWD check
@@ -258,13 +258,7 @@ for family in 4 6; do
        # load arp cache before running the test to reduce the amount of
        # stray traffic on top of the UDP tunnel
        ip netns exec $NS_SRC $PING -q -c 1 $OL_NET$DST_NAT >/dev/null
-       run_test "GRO fwd over UDP tunnel" $OL_NET$DST_NAT 1 1 $OL_NET$DST
-       cleanup
-
-       create_vxlan_pair
-       run_bench "UDP tunnel fwd perf" $OL_NET$DST
-       ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
-       run_bench "UDP tunnel GRO fwd perf" $OL_NET$DST
+       run_test "GRO fwd over UDP tunnel" $OL_NET$DST_NAT 10 10 $OL_NET$DST
        cleanup
 done

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-22 17:29   ` Willem de Bruijn
@ 2024-03-25 10:37     ` Antoine Tenart
  2024-03-25 18:39       ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Antoine Tenart @ 2024-03-25 10:37 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-22 18:29:40)
> Antoine Tenart wrote:
> > UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
> > are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
> > this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
> > an egress path and then their partial checksums are not fixed.
> > 
> > Different issues can be observed, from invalid checksum on packets to
> > traces like:
> > 
> >   gen01: hw csum failure
> >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> >   mac=(106,14) net=(120,40) trans=160
> >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> >   ...
> > 
> > Fix this by only converting CHECKSUM_NONE packets to
> > CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary.
> > 
> > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> 
> Should fraglist UDP GRO and non-fraglist (udp_gro_complete_segment)
> have the same checksumming behavior?

They can't as non-fraglist GRO packets can be aggregated, csum can't
just be converted there. It seems non-fraglist handles csum as it should
already, except for tunneled packets but that's why patch 4 prevents
those packets from being GROed.

> Second, this leaves CHECKSUM_COMPLETE as is. Is that intentional? I
> don't immediately see where GSO skb->csum would be updated.

That is intentional, fraglist GSO packets aren't modified and csums
don't need to be updated. The issues are with converting the checksum
type: partial checksum information can be lost.

Thanks,
Antoine

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-25 10:37     ` Antoine Tenart
@ 2024-03-25 18:39       ` Willem de Bruijn
  2024-03-26  9:44         ` Antoine Tenart
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2024-03-25 18:39 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-22 18:29:40)
> > Antoine Tenart wrote:
> > > UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
> > > are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
> > > this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
> > > an egress path and then their partial checksums are not fixed.
> > > 
> > > Different issues can be observed, from invalid checksum on packets to
> > > traces like:
> > > 
> > >   gen01: hw csum failure
> > >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> > >   mac=(106,14) net=(120,40) trans=160
> > >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> > >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> > >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> > >   ...
> > > 
> > > Fix this by only converting CHECKSUM_NONE packets to
> > > CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary.
> > > 
> > > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> > 
> > Should fraglist UDP GRO and non-fraglist (udp_gro_complete_segment)
> > have the same checksumming behavior?
> 
> They can't as non-fraglist GRO packets can be aggregated, csum can't
> just be converted there.

I suppose this could be done. But it is just simpler to convert to
CHECKSUM_UNNECESSARY.

> It seems non-fraglist handles csum as it should
> already, except for tunneled packets but that's why patch 4 prevents
> those packets from being GROed.

You mean that on segmentation, the segments are restored and thus
skb->csum of each segment is again correct, right?

I suppose this could be converted to CHECKSUM_UNNECESSARY if just
for equivalence between the two UDP_GRO methods and simplicity.

But also fine to leave as is.

Can you at least summarize this in the commit message? Currently
CHECKSUM_COMPLETE is not mentioned, but the behavior is not trivial.
It may be helpful next time we again stumble on this code and do a
git blame.

> 
> > Second, this leaves CHECKSUM_COMPLETE as is. Is that intentional? I
> > don't immediately see where GSO skb->csum would be updated.
> 
> That is intentional, fraglist GSO packets aren't modified and csums
> don't need to be updated. The issues are with converting the checksum
> type: partial checksum information can be lost.
> 
> Thanks,
> Antoine



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-25 18:39       ` Willem de Bruijn
@ 2024-03-26  9:44         ` Antoine Tenart
  2024-03-26 13:37           ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Antoine Tenart @ 2024-03-26  9:44 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-25 19:39:46)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-22 18:29:40)
> > > 
> > > Should fraglist UDP GRO and non-fraglist (udp_gro_complete_segment)
> > > have the same checksumming behavior?
> > 
> > They can't as non-fraglist GRO packets can be aggregated, csum can't
> > just be converted there.
> 
> I suppose this could be done. But it is just simpler to convert to
> CHECKSUM_UNNECESSARY.

Oh, do you mean using the non-fraglist behavior in fraglist?
udp_gro_complete_segment converts all packets to CHECKSUM_PARTIAL (as
packets could have been aggregated) but that's not required in fraglist.

To say it another way: my understanding is packets in the non-fraglist
case have to be converted to CHECKSUM_PARTIAL, while the fraglist case
can keep the checksum info as-is (and have the conversion to unnecessary
as an optimization when applicable).

> You mean that on segmentation, the segments are restored and thus
> skb->csum of each segment is again correct, right?

In the fraglist case, yes.

> I suppose this could be converted to CHECKSUM_UNNECESSARY if just
> for equivalence between the two UDP_GRO methods and simplicity.
> 
> But also fine to leave as is.

I'm not sure I got your suggestion as I don't see how non-fraglist
packets could be converted to CHECKSUM_UNNECESSARY when being aggregated
(uh->len can change there).

This series is aiming at fixing known issues and unifying the behavior
would be net-next material IMO. I'll send a v4 for those fixes, but then
I'm happy to discuss the above suggestion and investigate; so let's
continue the discussion here in parallel.

> Can you at least summarize this in the commit message? Currently
> CHECKSUM_COMPLETE is not mentioned, but the behavior is not trivial.
> It may be helpful next time we again stumble on this code and do a
> git blame.

Sure, I'll try to improve the commit log.

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary
  2024-03-26  9:44         ` Antoine Tenart
@ 2024-03-26 13:37           ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2024-03-26 13:37 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-25 19:39:46)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-22 18:29:40)
> > > > 
> > > > Should fraglist UDP GRO and non-fraglist (udp_gro_complete_segment)
> > > > have the same checksumming behavior?
> > > 
> > > They can't as non-fraglist GRO packets can be aggregated, csum can't
> > > just be converted there.
> > 
> > I suppose this could be done. But it is just simpler to convert to
> > CHECKSUM_UNNECESSARY.
> 
> Oh, do you mean using the non-fraglist behavior in fraglist?
> udp_gro_complete_segment converts all packets to CHECKSUM_PARTIAL (as
> packets could have been aggregated) but that's not required in fraglist.
> 
> To say it another way: my understanding is packets in the non-fraglist
> case have to be converted to CHECKSUM_PARTIAL, while the fraglist case
> can keep the checksum info as-is (and have the conversion to unnecessary
> as an optimization when applicable).

That makes sense. Thanks.
 
> > You mean that on segmentation, the segments are restored and thus
> > skb->csum of each segment is again correct, right?
> 
> In the fraglist case, yes.
> 
> > I suppose this could be converted to CHECKSUM_UNNECESSARY if just
> > for equivalence between the two UDP_GRO methods and simplicity.
> > 
> > But also fine to leave as is.
> 
> I'm not sure I got your suggestion as I don't see how non-fraglist
> packets could be converted to CHECKSUM_UNNECESSARY when being aggregated
> (uh->len can change there).
> 
> This series is aiming at fixing known issues and unifying the behavior
> would be net-next material IMO. I'll send a v4 for those fixes, but then
> I'm happy to discuss the above suggestion and investigate; so let's
> continue the discussion here in parallel.

The above explains why the solutions are distinct. No need to try to
unify more in a follow-up, then.

> > Can you at least summarize this in the commit message? Currently
> > CHECKSUM_COMPLETE is not mentioned, but the behavior is not trivial.
> > It may be helpful next time we again stumble on this code and do a
> > git blame.
> 
> Sure, I'll try to improve the commit log.
> 
> Thanks!
> Antoine



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-03-26 13:37 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-22 11:46 [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
2024-03-22 11:46 ` [PATCH net v3 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
2024-03-22 11:46 ` [PATCH net v3 2/4] gro: fix ownership transfer Antoine Tenart
2024-03-22 11:46 ` [PATCH net v3 3/4] udp: do not transition UDP GRO fraglist partial checksums to unnecessary Antoine Tenart
2024-03-22 17:29   ` Willem de Bruijn
2024-03-25 10:37     ` Antoine Tenart
2024-03-25 18:39       ` Willem de Bruijn
2024-03-26  9:44         ` Antoine Tenart
2024-03-26 13:37           ` Willem de Bruijn
2024-03-22 11:46 ` [PATCH net v3 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart
2024-03-22 22:55 ` [PATCH net v3 0/4] gro: various fixes related to UDP tunnels Jakub Kicinski
2024-03-25 10:16   ` Antoine Tenart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).