netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2 0/4] gro: various fixes related to UDP tunnels
@ 2024-03-19  9:31 Antoine Tenart
  2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19  9:31 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

Hello,

We found issues when a UDP tunnel endpoint is in a different netns than
where UDP GRO happens. This kind of setup is actually quite diverse,
from having one leg of the tunnel on a remove host, to having a tunnel
between netns (eg. being bridged in another one or on the host). In our
case that UDP tunnel was geneve.

UDP tunnel packets should not be GROed at the UDP level. The fundamental
issue here is such packet can't be detected in a foolproof way: we can't
know by looking at a packet alone and the current logic of looking up
UDP sockets is fragile (socket could be in another netns, packet could
be modified in between, etc). Because there is no way to make the GRO
code to correctly handle those packets in all cases, this series aims at
two things: making the net stack to correctly behave (as in, no crash
and no invalid packet) when such thing happens, and in some cases to
prevent this "early GRO" from happening.

First three patches fix issues when an "UDP tunneled" packet is being
GROed too early by rx-udp-gro-forwarding or rx-gro-list.

Last patch is preventing locally generated UDP tunnel packets from being
GROed. This turns out to be more complex than this patch alone as it
relies on skb->encapsulation which is currently untrusty in some cases
(see iptunnel_handle_offloads); but that should fix things in practice
and is acceptable for a fix. Future work is required to improve things
(prevent all locally generated UDP tunnel packets from being GROed),
such as fixing the misuse of skb->encapsulation in drivers; but that
would be net-next material.

Thanks!
Antoine

Since v1:
  - Fixed a build issue with IPv6 disabled in patch 1.
  - Reworked commit log in patch 2 (Willem de Bruijn feedback).
  - Added Reviewed-by tags on patches 1 & 4 (Willem de Bruijn feeback).

Antoine Tenart (4):
  udp: do not accept non-tunnel GSO skbs landing in a tunnel
  gro: fix ownership transfer
  udp: do not transition UDP fraglist to unnecessary checksum
  udp: prevent local UDP tunnel packets from being GROed

 include/linux/udp.h    | 28 ++++++++++++++++++++++++++++
 net/core/gro.c         |  3 ++-
 net/ipv4/udp_offload.c | 23 ++++++++++++-----------
 net/ipv6/udp_offload.c |  8 --------
 4 files changed, 42 insertions(+), 20 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-19  9:31 [PATCH net v2 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
@ 2024-03-19  9:31 ` Antoine Tenart
  2024-03-20  2:41   ` Jakub Kicinski
  2024-03-20 10:34   ` Nikolay Aleksandrov
  2024-03-19  9:31 ` [PATCH net v2 2/4] gro: fix ownership transfer Antoine Tenart
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19  9:31 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev,
	Willem de Bruijn

When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
being forwarded. If such packets might land in a tunnel this can cause
various issues and udp_gro_receive makes sure this isn't the case by
looking for a matching socket. This is performed in
udp4/6_gro_lookup_skb but only in the current netns. This is an issue
with tunneled packets when the endpoint is in another netns. In such
cases the packets will be GROed at the UDP level, which leads to various
issues later on. The same thing can happen with rx-gro-list.

We saw this with geneve packets being GROed at the UDP level. In such
case gso_size is set; later the packet goes through the geneve rx path,
the geneve header is pulled, the offset are adjusted and frag_list skbs
are not adjusted with regard to geneve. When those skbs hit
skb_fragment, it will misbehave. Different outcomes are possible
depending on what the GROed skbs look like; from corrupted packets to
kernel crashes.

One example is a BUG_ON[1] triggered in skb_segment while processing the
frag_list. Because gso_size is wrong (geneve header was pulled)
skb_segment thinks there is "geneve header size" of data in frag_list,
although it's in fact the next packet. The BUG_ON itself has nothing to
do with the issue. This is only one of the potential issues.

Looking up for a matching socket in udp_gro_receive is fragile: the
lookup could be extended to all netns (not speaking about performances)
but nothing prevents those packets from being modified in between and we
could still not find a matching socket. It's OK to keep the current
logic there as it should cover most cases but we also need to make sure
we handle tunnel packets being GROed too early.

This is done by extending the checks in udp_unexpected_gso: GSO packets
lacking the SKB_GSO_UDP_TUNNEL/_CUSM bits and landing in a tunnel must
be segmented.

[1] kernel BUG at net/core/skbuff.c:4408!
    RIP: 0010:skb_segment+0xd2a/0xf70
    __udp_gso_segment+0xaa/0x560

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
---

Since v1:
  - Fixed a build issue with IPv6 disabled.
  - Added Reviewed-by tag from v1 as the logic is the same.

 include/linux/udp.h    | 28 ++++++++++++++++++++++++++++
 net/ipv4/udp_offload.c |  6 ++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/include/linux/udp.h b/include/linux/udp.h
index 3748e82b627b..05231fff8703 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -150,6 +150,24 @@ static inline void udp_cmsg_recv(struct msghdr *msg, struct sock *sk,
 	}
 }
 
+DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key);
+#if IS_ENABLED(CONFIG_IPV6)
+DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
+#endif
+
+static inline bool udp_encap_needed(void)
+{
+	if (static_branch_unlikely(&udp_encap_needed_key))
+		return true;
+
+#if IS_ENABLED(CONFIG_IPV6)
+	if (static_branch_unlikely(&udpv6_encap_needed_key))
+		return true;
+#endif
+
+	return false;
+}
+
 static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 {
 	if (!skb_is_gso(skb))
@@ -163,6 +181,16 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 	    !udp_test_bit(ACCEPT_FRAGLIST, sk))
 		return true;
 
+	/* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CUSM bits might still
+	 * land in a tunnel as the socket check in udp_gro_receive cannot be
+	 * foolproof.
+	 */
+	if (udp_encap_needed() &&
+	    READ_ONCE(udp_sk(sk)->encap_rcv) &&
+	    !(skb_shinfo(skb)->gso_type &
+	      (SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM)))
+		return true;
+
 	return false;
 }
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index b9880743765c..e9719afe91cf 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -551,8 +551,10 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	unsigned int off = skb_gro_offset(skb);
 	int flush = 1;
 
-	/* we can do L4 aggregation only if the packet can't land in a tunnel
-	 * otherwise we could corrupt the inner stream
+	/* We can do L4 aggregation only if the packet can't land in a tunnel
+	 * otherwise we could corrupt the inner stream. Detecting such packets
+	 * cannot be foolproof and the aggregation might still happen in some
+	 * cases. Such packets should be caught in udp_unexpected_gso later.
 	 */
 	NAPI_GRO_CB(skb)->is_flist = 0;
 	if (!sk || !udp_sk(sk)->gro_receive) {
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net v2 2/4] gro: fix ownership transfer
  2024-03-19  9:31 [PATCH net v2 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
  2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
@ 2024-03-19  9:31 ` Antoine Tenart
  2024-03-19 13:13   ` Willem de Bruijn
  2024-03-19  9:31 ` [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum Antoine Tenart
  2024-03-19  9:31 ` [PATCH net v2 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart
  3 siblings, 1 reply; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19  9:31 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

If packets are GROed with fraglist they might be segmented later on and
continue their journey in the stack. In skb_segment_list those skbs can
be reused as-is. This is an issue as their destructor was removed in
skb_gro_receive_list but not the reference to their socket, and then
they can't be orphaned. Fix this by also removing the reference to the
socket.

For example this could be observed,

  kernel BUG at include/linux/skbuff.h:3131!  (skb_orphan)
  RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
  Call Trace:
   ipv6_list_rcv+0x250/0x3f0
   __netif_receive_skb_list_core+0x49d/0x8f0
   netif_receive_skb_list_internal+0x634/0xd40
   napi_complete_done+0x1d2/0x7d0
   gro_cell_poll+0x118/0x1f0

A similar construction is found in skb_gro_receive, apply the same
change there.

Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---

Since v1:
  - Reworded beginning of the commit msg.

 net/core/gro.c         | 3 ++-
 net/ipv4/udp_offload.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/gro.c b/net/core/gro.c
index ee30d4f0c038..83f35d99a682 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -192,8 +192,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 	}
 
 merge:
-	/* sk owenrship - if any - completely transferred to the aggregated packet */
+	/* sk ownership - if any - completely transferred to the aggregated packet */
 	skb->destructor = NULL;
+	skb->sk = NULL;
 	delta_truesize = skb->truesize;
 	if (offset > headlen) {
 		unsigned int eat = offset - headlen;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index e9719afe91cf..3bb69464930b 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -449,8 +449,9 @@ static int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
 	NAPI_GRO_CB(p)->count++;
 	p->data_len += skb->len;
 
-	/* sk owenrship - if any - completely transferred to the aggregated packet */
+	/* sk ownership - if any - completely transferred to the aggregated packet */
 	skb->destructor = NULL;
+	skb->sk = NULL;
 	p->truesize += skb->truesize;
 	p->len += skb->len;
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-19  9:31 [PATCH net v2 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
  2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
  2024-03-19  9:31 ` [PATCH net v2 2/4] gro: fix ownership transfer Antoine Tenart
@ 2024-03-19  9:31 ` Antoine Tenart
  2024-03-19 13:38   ` Willem de Bruijn
  2024-03-19  9:31 ` [PATCH net v2 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart
  3 siblings, 1 reply; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19  9:31 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
and sets their checksum level based on if the packet is recognized to be
a tunneled one. However there is no safe way to detect a packet is a
tunneled one and in case such packet is GROed at the UDP level, setting
a wrong checksum level will lead to later errors. For example if those
packets are forwarded to the Tx path they could produce the following
dump:

  gen01: hw csum failure
  skb len=3008 headroom=160 headlen=1376 tailroom=0
  mac=(106,14) net=(120,40) trans=160
  shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
  csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
  hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
  ...

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
---
 net/ipv4/udp_offload.c | 8 --------
 net/ipv6/udp_offload.c | 8 --------
 2 files changed, 16 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3bb69464930b..3263ebcaa3f4 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -722,14 +722,6 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
-
 		return 0;
 	}
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 312bcaeea96f..9289384cb7d0 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -174,14 +174,6 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
 		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
 
-		if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
-			if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
-				skb->csum_level++;
-		} else {
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			skb->csum_level = 0;
-		}
-
 		return 0;
 	}
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH net v2 4/4] udp: prevent local UDP tunnel packets from being GROed
  2024-03-19  9:31 [PATCH net v2 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
                   ` (2 preceding siblings ...)
  2024-03-19  9:31 ` [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum Antoine Tenart
@ 2024-03-19  9:31 ` Antoine Tenart
  3 siblings, 0 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19  9:31 UTC (permalink / raw)
  To: davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev,
	Willem de Bruijn

GRO has a fundamental issue with UDP tunnel packets as it can't detect
those in a foolproof way and GRO could happen before they reach the
tunnel endpoint. Previous commits have fixed issues when UDP tunnel
packets come from a remote host, but if those packets are issued locally
they could run into checksum issues.

If the inner packet has a partial checksum the information will be lost
in the GRO logic, either in udp4/6_gro_complete or in
udp_gro_complete_segment and packets will have an invalid checksum when
leaving the host.

Prevent local UDP tunnel packets from ever being GROed at the outer UDP
level.

Due to skb->encapsulation being wrongly used in some drivers this is
actually only preventing UDP tunnel packets with a partial checksum to
be GROed (see iptunnel_handle_offloads) but those were also the packets
triggering issues so in practice this should be sufficient.

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
---

Since v1:
  - Added Reviewed-by tag.

 net/ipv4/udp_offload.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 3263ebcaa3f4..4ea72bd4f6d7 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -559,6 +559,12 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	 */
 	NAPI_GRO_CB(skb)->is_flist = 0;
 	if (!sk || !udp_sk(sk)->gro_receive) {
+		/* If the packet was locally encapsulated in a UDP tunnel that
+		 * wasn't detected above, do not GRO.
+		 */
+		if (skb->encapsulation)
+			goto out;
+
 		if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
 			NAPI_GRO_CB(skb)->is_flist = sk ? !udp_test_bit(GRO_ENABLED, sk) : 1;
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 2/4] gro: fix ownership transfer
  2024-03-19  9:31 ` [PATCH net v2 2/4] gro: fix ownership transfer Antoine Tenart
@ 2024-03-19 13:13   ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-19 13:13 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> If packets are GROed with fraglist they might be segmented later on and
> continue their journey in the stack. In skb_segment_list those skbs can
> be reused as-is. This is an issue as their destructor was removed in
> skb_gro_receive_list but not the reference to their socket, and then
> they can't be orphaned. Fix this by also removing the reference to the
> socket.
> 
> For example this could be observed,
> 
>   kernel BUG at include/linux/skbuff.h:3131!  (skb_orphan)
>   RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
>   Call Trace:
>    ipv6_list_rcv+0x250/0x3f0
>    __netif_receive_skb_list_core+0x49d/0x8f0
>    netif_receive_skb_list_internal+0x634/0xd40
>    napi_complete_done+0x1d2/0x7d0
>    gro_cell_poll+0x118/0x1f0
> 
> A similar construction is found in skb_gro_receive, apply the same
> change there.
> 
> Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>

Reviewed-by: Willem de Bruijn <willemb@google.com>

The BUG_ON in skb_orphan makes the invariant clear that the two fields
must be cleared together:

        if (skb->destructor) {
                skb->destructor(skb);
                skb->destructor = NULL;
                skb->sk         = NULL;
        } else {
                BUG_ON(skb->sk);
        }

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-19  9:31 ` [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum Antoine Tenart
@ 2024-03-19 13:38   ` Willem de Bruijn
  2024-03-19 16:01     ` Antoine Tenart
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-19 13:38 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba, pabeni, edumazet
  Cc: Antoine Tenart, steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> and sets their checksum level based on if the packet is recognized to be
> a tunneled one. However there is no safe way to detect a packet is a
> tunneled one and in case such packet is GROed at the UDP level, setting
> a wrong checksum level will lead to later errors. For example if those
> packets are forwarded to the Tx path they could produce the following
> dump:
> 
>   gen01: hw csum failure
>   skb len=3008 headroom=160 headlen=1376 tailroom=0
>   mac=(106,14) net=(120,40) trans=160
>   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
>   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
>   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
>   ...
> 
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Signed-off-by: Antoine Tenart <atenart@kernel.org>

The original patch converted to CHECKSUM_UNNECESSARY for a reason.
The skb->csum of the main gso_skb is not valid?

Should instead only the csum_level be adjusted, to always keep
csum_level == 0?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-19 13:38   ` Willem de Bruijn
@ 2024-03-19 16:01     ` Antoine Tenart
  2024-03-20 13:00       ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Antoine Tenart @ 2024-03-19 16:01 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-19 14:38:20)
> Antoine Tenart wrote:
> > udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> > and sets their checksum level based on if the packet is recognized to be
> > a tunneled one. However there is no safe way to detect a packet is a
> > tunneled one and in case such packet is GROed at the UDP level, setting
> > a wrong checksum level will lead to later errors. For example if those
> > packets are forwarded to the Tx path they could produce the following
> > dump:
> > 
> >   gen01: hw csum failure
> >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> >   mac=(106,14) net=(120,40) trans=160
> >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> >   ...
> > 
> > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> 
> The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> The skb->csum of the main gso_skb is not valid?
> 
> Should instead only the csum_level be adjusted, to always keep
> csum_level == 0?

The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
level, thus we have:
  UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
csum_level would need to be 1 here; but we can't know that.

There is another issue (no kernel trace): if a packet has partial csum
and is being GROed that information is lost and the packet ends up with
an invalid csum.

Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
impression is this checksum conversion is at best setting the same info
and otherwise is overriding valuable csum information.

Or would packets with CSUM_NONE being GROed would benefit from the
CHECKSUM_UNNECESSARY conversion?

For reference, original commit says:
"""
After validating the csum,  we mark ip_summed as
CHECKSUM_UNNECESSARY for fraglist GRO packets to
make sure that the csum is not touched.
"""

But I'm failing to see where that would happen and how the none to
unnecessary conversion would help. WDYT?

Thanks,
Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
@ 2024-03-20  2:41   ` Jakub Kicinski
  2024-03-20 11:11     ` Antoine Tenart
  2024-03-20 10:34   ` Nikolay Aleksandrov
  1 sibling, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2024-03-20  2:41 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: davem, pabeni, edumazet, steffen.klassert, willemdebruijn.kernel,
	netdev, Willem de Bruijn

On Tue, 19 Mar 2024 10:31:36 +0100 Antoine Tenart wrote:
> +DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key);

nit: our build bot says you need to export this as well for v6=m.
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
  2024-03-20  2:41   ` Jakub Kicinski
@ 2024-03-20 10:34   ` Nikolay Aleksandrov
  2024-03-20 11:13     ` Antoine Tenart
  1 sibling, 1 reply; 21+ messages in thread
From: Nikolay Aleksandrov @ 2024-03-20 10:34 UTC (permalink / raw)
  To: Antoine Tenart, davem, kuba, pabeni, edumazet
  Cc: steffen.klassert, willemdebruijn.kernel, netdev, Willem de Bruijn

On 3/19/24 11:31, Antoine Tenart wrote:
[snip]
> @@ -163,6 +181,16 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
>   	    !udp_test_bit(ACCEPT_FRAGLIST, sk))
>   		return true;
>   
> +	/* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CUSM bits might still

s/CUSM/CSUM/

> +	 * land in a tunnel as the socket check in udp_gro_receive cannot be
> +	 * foolproof.
> +	 */
> +	if (udp_encap_needed() &&
> +	    READ_ONCE(udp_sk(sk)->encap_rcv) &&
> +	    !(skb_shinfo(skb)->gso_type &
> +	      (SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM)))
> +		return true;
> +
>   	return false;
>   }
>   
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index b9880743765c..e9719afe91cf 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -551,8 +551,10 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
>   	unsigned int off = skb_gro_offset(skb);
>   	int flush = 1;
>   
> -	/* we can do L4 aggregation only if the packet can't land in a tunnel
> -	 * otherwise we could corrupt the inner stream
> +	/* We can do L4 aggregation only if the packet can't land in a tunnel
> +	 * otherwise we could corrupt the inner stream. Detecting such packets
> +	 * cannot be foolproof and the aggregation might still happen in some
> +	 * cases. Such packets should be caught in udp_unexpected_gso later.
>   	 */
>   	NAPI_GRO_CB(skb)->is_flist = 0;
>   	if (!sk || !udp_sk(sk)->gro_receive) {


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-20  2:41   ` Jakub Kicinski
@ 2024-03-20 11:11     ` Antoine Tenart
  0 siblings, 0 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-20 11:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, pabeni, edumazet, steffen.klassert, willemdebruijn.kernel,
	netdev, Willem de Bruijn

Quoting Jakub Kicinski (2024-03-20 03:41:24)
> On Tue, 19 Mar 2024 10:31:36 +0100 Antoine Tenart wrote:
> > +DECLARE_STATIC_KEY_FALSE(udp_encap_needed_key);
> 
> nit: our build bot says you need to export this as well for v6=m.

Thanks for the heads up, missed that. And udpv6_encap_needed_key needs
to be defined outside ipv6.ko and exported as well. The following should
fix the remaining build issues,

  diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
  index 661d0e0d273f..c02bf011d4a6 100644
  --- a/net/ipv4/udp.c
  +++ b/net/ipv4/udp.c
  @@ -582,6 +582,13 @@ static inline bool __udp_is_mcast_sock(struct net *net, const struct sock *sk,
   }

   DEFINE_STATIC_KEY_FALSE(udp_encap_needed_key);
  +EXPORT_SYMBOL(udp_encap_needed_key);
  +
  +#if IS_ENABLED(CONFIG_IPV6)
  +DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
  +EXPORT_SYMBOL(udpv6_encap_needed_key);
  +#endif
  +
   void udp_encap_enable(void)
   {
          static_branch_inc(&udp_encap_needed_key);
  diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
  index 7c1e6469d091..8b1dd7f51249 100644
  --- a/net/ipv6/udp.c
  +++ b/net/ipv6/udp.c
  @@ -447,7 +447,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
          goto try_again;
   }

  -DEFINE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
  +DECLARE_STATIC_KEY_FALSE(udpv6_encap_needed_key);
   void udpv6_encap_enable(void)
   {
          static_branch_inc(&udpv6_encap_needed_key);

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel
  2024-03-20 10:34   ` Nikolay Aleksandrov
@ 2024-03-20 11:13     ` Antoine Tenart
  0 siblings, 0 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-20 11:13 UTC (permalink / raw)
  To: Nikolay Aleksandrov, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev, Willem de Bruijn

Quoting Nikolay Aleksandrov (2024-03-20 11:34:01)
> On 3/19/24 11:31, Antoine Tenart wrote:
> [snip]
> > @@ -163,6 +181,16 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
> >           !udp_test_bit(ACCEPT_FRAGLIST, sk))
> >               return true;
> >   
> > +     /* GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CUSM bits might still
> 
> s/CUSM/CSUM/

In the commit msg as well, thanks, will fix.

Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-19 16:01     ` Antoine Tenart
@ 2024-03-20 13:00       ` Willem de Bruijn
  2024-03-20 15:08         ` Antoine Tenart
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-20 13:00 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > Antoine Tenart wrote:
> > > udp4/6_gro_complete transition fraglist packets to CHECKSUM_UNNECESSARY
> > > and sets their checksum level based on if the packet is recognized to be
> > > a tunneled one. However there is no safe way to detect a packet is a
> > > tunneled one and in case such packet is GROed at the UDP level, setting
> > > a wrong checksum level will lead to later errors. For example if those
> > > packets are forwarded to the Tx path they could produce the following
> > > dump:
> > > 
> > >   gen01: hw csum failure
> > >   skb len=3008 headroom=160 headlen=1376 tailroom=0
> > >   mac=(106,14) net=(120,40) trans=160
> > >   shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> > >   csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
> > >   hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
> > >   ...
> > > 
> > > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > > Signed-off-by: Antoine Tenart <atenart@kernel.org>
> > 
> > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > The skb->csum of the main gso_skb is not valid?
> > 
> > Should instead only the csum_level be adjusted, to always keep
> > csum_level == 0?
> 
> The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> level, thus we have:
>   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> csum_level would need to be 1 here; but we can't know that.

Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
Looped packets can trivially be converted to CHECKSUM_UNNECESSARY.
It just has to be level 0 if only the outer checksum is known.

> There is another issue (no kernel trace): if a packet has partial csum
> and is being GROed that information is lost and the packet ends up with
> an invalid csum.

CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
CHECKSUM_UNNECESSARY has neither expectations.
 
> Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> impression is this checksum conversion is at best setting the same info
> and otherwise is overriding valuable csum information.
> 
> Or would packets with CSUM_NONE being GROed would benefit from the
> CHECKSUM_UNNECESSARY conversion?

Definitely. If the packet has CHECKSUM_NONE and GRO checks its
validity in software, converting it to CHECKSUM_UNNECESSARY avoids
potential additional checks at later stages in the packet path.

> 
> For reference, original commit says:
> """
> After validating the csum,  we mark ip_summed as
> CHECKSUM_UNNECESSARY for fraglist GRO packets to
> make sure that the csum is not touched.
> """
> 
> But I'm failing to see where that would happen and how the none to
> unnecessary conversion would help. WDYT?

I would appreciate the GRO and fraglist GRO authors to also review
this series and chime in.

> 
> Thanks,
> Antoine



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-20 13:00       ` Willem de Bruijn
@ 2024-03-20 15:08         ` Antoine Tenart
  2024-03-20 20:43           ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Antoine Tenart @ 2024-03-20 15:08 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-20 14:00:48)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > 
> > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > The skb->csum of the main gso_skb is not valid?
> > > 
> > > Should instead only the csum_level be adjusted, to always keep
> > > csum_level == 0?
> > 
> > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > level, thus we have:
> >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > csum_level would need to be 1 here; but we can't know that.
> 
> Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.

I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
packet can be looped internally or going to a remote host.

> > There is another issue (no kernel trace): if a packet has partial csum
> > and is being GROed that information is lost and the packet ends up with
> > an invalid csum.
> 
> CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> CHECKSUM_UNNECESSARY has neither expectations.

But not if the packet is sent to a remote host. Otherwise an inner
partial csum is never fixed by the stack/NIC before going out.

> > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > impression is this checksum conversion is at best setting the same info
> > and otherwise is overriding valuable csum information.
> > 
> > Or would packets with CSUM_NONE being GROed would benefit from the
> > CHECKSUM_UNNECESSARY conversion?
> 
> Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> potential additional checks at later stages in the packet path.

Makes sense. The current code really looks like
__skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
convert those packets.

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-20 15:08         ` Antoine Tenart
@ 2024-03-20 20:43           ` Willem de Bruijn
  2024-03-21  8:48             ` Antoine Tenart
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-20 20:43 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > 
> > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > The skb->csum of the main gso_skb is not valid?
> > > > 
> > > > Should instead only the csum_level be adjusted, to always keep
> > > > csum_level == 0?
> > > 
> > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > level, thus we have:
> > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > csum_level would need to be 1 here; but we can't know that.
> > 
> > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> 
> I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> packet can be looped internally or going to a remote host.

That is on transmit. To come into contact with UDP_GRO while having
CHECKSUM_PARTIAL the packet will have to loop into the receive path,
in some way that triggers GRO. Perhaps through gro_cells, as other
GRO paths are hardware NIC drivers.
 
> > > There is another issue (no kernel trace): if a packet has partial csum
> > > and is being GROed that information is lost and the packet ends up with
> > > an invalid csum.
> > 
> > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > CHECKSUM_UNNECESSARY has neither expectations.
> 
> But not if the packet is sent to a remote host. Otherwise an inner
> partial csum is never fixed by the stack/NIC before going out.

The stack will only offload a single checksum. With local checksum
offload, this can be the inner checksum and the outer can be cheaply
computed in software. udp_set_csum() handles this. It indeed sets lco
if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
to CHECKSUM_PARTIAL, now pointing to the outer UDP header.

You're right. Regardless of whether it points to the inner or outer
checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
will break checksum offload in the forwarding case.

> > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > impression is this checksum conversion is at best setting the same info
> > > and otherwise is overriding valuable csum information.
> > > 
> > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > CHECKSUM_UNNECESSARY conversion?
> > 
> > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > potential additional checks at later stages in the packet path.
> 
> Makes sense. The current code really looks like
> __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> convert those packets.
> 
> Thanks!
> Antoine



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-20 20:43           ` Willem de Bruijn
@ 2024-03-21  8:48             ` Antoine Tenart
  2024-03-21 12:42               ` Paolo Abeni
  2024-03-21 14:58               ` Willem de Bruijn
  0 siblings, 2 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-21  8:48 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-20 21:43:55)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > Antoine Tenart wrote:
> > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > 
> > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > The skb->csum of the main gso_skb is not valid?
> > > > > 
> > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > csum_level == 0?
> > > > 
> > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > level, thus we have:
> > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > csum_level would need to be 1 here; but we can't know that.
> > > 
> > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > 
> > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > packet can be looped internally or going to a remote host.
> 
> That is on transmit. To come into contact with UDP_GRO while having
> CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> in some way that triggers GRO. Perhaps through gro_cells, as other
> GRO paths are hardware NIC drivers.

I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
path. One easy way is through veth pairs, eg. packet get tunneled in a
netns, connected to another one via a veth pair.

> > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > and is being GROed that information is lost and the packet ends up with
> > > > an invalid csum.
> > > 
> > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > CHECKSUM_UNNECESSARY has neither expectations.
> > 
> > But not if the packet is sent to a remote host. Otherwise an inner
> > partial csum is never fixed by the stack/NIC before going out.
> 
> The stack will only offload a single checksum. With local checksum
> offload, this can be the inner checksum and the outer can be cheaply
> computed in software. udp_set_csum() handles this. It indeed sets lco
> if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> 
> You're right. Regardless of whether it points to the inner or outer
> checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> will break checksum offload in the forwarding case.
> 
> > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > impression is this checksum conversion is at best setting the same info
> > > > and otherwise is overriding valuable csum information.
> > > > 
> > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > CHECKSUM_UNNECESSARY conversion?
> > > 
> > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > potential additional checks at later stages in the packet path.
> > 
> > Makes sense. The current code really looks like
> > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > convert those packets.

If I sum up our discussion CHECKSUM_NONE conversion is wanted,
CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
conversion breaks things. What about we just convert CHECKSUM_NONE to
CHECKSUM_UNNECESSARY?

diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 50a8a65fad23..44779d4c538b 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
                if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
                        if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
                                skb->csum_level++;
-               } else {
+               } else if (skb->ip_summed == CHECKSUM_NONE) {
                        skb->ip_summed = CHECKSUM_UNNECESSARY;
                        skb->csum_level = 0;
                }

Or directly call __skb_incr_checksum_unnecessary.

Thanks,
Antoine

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-21  8:48             ` Antoine Tenart
@ 2024-03-21 12:42               ` Paolo Abeni
  2024-03-21 14:58               ` Willem de Bruijn
  1 sibling, 0 replies; 21+ messages in thread
From: Paolo Abeni @ 2024-03-21 12:42 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba
  Cc: steffen.klassert, netdev

On Thu, 2024-03-21 at 09:48 +0100, Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 21:43:55)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > > Antoine Tenart wrote:
> > > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > > 
> > > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > > The skb->csum of the main gso_skb is not valid?
> > > > > > 
> > > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > > csum_level == 0?
> > > > > 
> > > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > > level, thus we have:
> > > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > > csum_level would need to be 1 here; but we can't know that.
> > > > 
> > > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > > 
> > > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > > packet can be looped internally or going to a remote host.
> > 
> > That is on transmit. To come into contact with UDP_GRO while having
> > CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> > in some way that triggers GRO. Perhaps through gro_cells, as other
> > GRO paths are hardware NIC drivers.
> 
> I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
> path. One easy way is through veth pairs, eg. packet get tunneled in a
> netns, connected to another one via a veth pair.
> 
> > > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > > and is being GROed that information is lost and the packet ends up with
> > > > > an invalid csum.
> > > > 
> > > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > > CHECKSUM_UNNECESSARY has neither expectations.
> > > 
> > > But not if the packet is sent to a remote host. Otherwise an inner
> > > partial csum is never fixed by the stack/NIC before going out.
> > 
> > The stack will only offload a single checksum. With local checksum
> > offload, this can be the inner checksum and the outer can be cheaply
> > computed in software. udp_set_csum() handles this. It indeed sets lco
> > if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> > to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> > 
> > You're right. Regardless of whether it points to the inner or outer
> > checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> > will break checksum offload in the forwarding case.
> > 
> > > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > > impression is this checksum conversion is at best setting the same info
> > > > > and otherwise is overriding valuable csum information.
> > > > > 
> > > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > > CHECKSUM_UNNECESSARY conversion?
> > > > 
> > > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > > potential additional checks at later stages in the packet path.
> > > 
> > > Makes sense. The current code really looks like
> > > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > > convert those packets.
> 
> If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> conversion breaks things. What about we just convert CHECKSUM_NONE to
> CHECKSUM_UNNECESSARY?
> 
> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 50a8a65fad23..44779d4c538b 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>                 if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
>                         if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
>                                 skb->csum_level++;
> -               } else {
> +               } else if (skb->ip_summed == CHECKSUM_NONE) {
>                         skb->ip_summed = CHECKSUM_UNNECESSARY;
>                         skb->csum_level = 0;
>                 }
> 
> Or directly call __skb_incr_checksum_unnecessary

I think calling __skb_incr_checksum_unnecessary() would be the better
option.

@Willem: FTR, Antoine and me discussed this series internally for a
bit, and the above variant was also discussed. I was unable to find a
good reason for the CHECKSUM_NONE -> CHECKSUM_UNNECESSARY conversion
back then, so it seemed more clean to drop the whole chunk.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-21  8:48             ` Antoine Tenart
  2024-03-21 12:42               ` Paolo Abeni
@ 2024-03-21 14:58               ` Willem de Bruijn
  2024-03-21 17:22                 ` Antoine Tenart
  1 sibling, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-21 14:58 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-20 21:43:55)
> > Antoine Tenart wrote:
> > > Quoting Willem de Bruijn (2024-03-20 14:00:48)
> > > > Antoine Tenart wrote:
> > > > > Quoting Willem de Bruijn (2024-03-19 14:38:20)
> > > > > > 
> > > > > > The original patch converted to CHECKSUM_UNNECESSARY for a reason.
> > > > > > The skb->csum of the main gso_skb is not valid?
> > > > > > 
> > > > > > Should instead only the csum_level be adjusted, to always keep
> > > > > > csum_level == 0?
> > > > > 
> > > > > The above trace is an ICMPv6 packet being tunneled and GROed at the UDP
> > > > > level, thus we have:
> > > > >   UDP(CHECKSUM_PARTIAL)/Geneve/ICMPv6(was CHECKSUM_NONE)
> > > > > csum_level would need to be 1 here; but we can't know that.
> > > > 
> > > > Is this a packet looped internally? Else it is not CHECKSUM_PARTIAL.
> > > 
> > > I'm not sure to follow, CHECKSUM_NONE packets going in a tunnel will be
> > > encapsulated and the outer UDP header will be CHECKSUM_PARTIAL. The
> > > packet can be looped internally or going to a remote host.
> > 
> > That is on transmit. To come into contact with UDP_GRO while having
> > CHECKSUM_PARTIAL the packet will have to loop into the receive path,
> > in some way that triggers GRO. Perhaps through gro_cells, as other
> > GRO paths are hardware NIC drivers.
> 
> I get what you meant now, thanks. Yes, those Tx packets loop into the Rx
> path. One easy way is through veth pairs, eg. packet get tunneled in a
> netns, connected to another one via a veth pair.
> 
> > > > > There is another issue (no kernel trace): if a packet has partial csum
> > > > > and is being GROed that information is lost and the packet ends up with
> > > > > an invalid csum.
> > > > 
> > > > CHECKSUM_PARTIAL should be converted to CHECKSUM_UNNECESSARY for this
> > > > reason. CHECKSUM_PARTIAL implies the header is prepared with pseudo
> > > > header checksum. Similarly CHECKSUM_COMPLETE implies skb csum is valid.
> > > > CHECKSUM_UNNECESSARY has neither expectations.
> > > 
> > > But not if the packet is sent to a remote host. Otherwise an inner
> > > partial csum is never fixed by the stack/NIC before going out.
> > 
> > The stack will only offload a single checksum. With local checksum
> > offload, this can be the inner checksum and the outer can be cheaply
> > computed in software. udp_set_csum() handles this. It indeed sets lco
> > if the inner packet has CHECKSUM_PARTIAL. Otherwise it sets ip_summed
> > to CHECKSUM_PARTIAL, now pointing to the outer UDP header.
> > 
> > You're right. Regardless of whether it points to the inner or outer
> > checksum, a conversion of CHECKSUM_PARTIAL to CHECKSUM_UNNECESSARY
> > will break checksum offload in the forwarding case.
> > 
> > > > > Packets with CHECKSUM_UNNECESSARY should end up with the same info. My
> > > > > impression is this checksum conversion is at best setting the same info
> > > > > and otherwise is overriding valuable csum information.
> > > > > 
> > > > > Or would packets with CSUM_NONE being GROed would benefit from the
> > > > > CHECKSUM_UNNECESSARY conversion?
> > > > 
> > > > Definitely. If the packet has CHECKSUM_NONE and GRO checks its
> > > > validity in software, converting it to CHECKSUM_UNNECESSARY avoids
> > > > potential additional checks at later stages in the packet path.
> > > 
> > > Makes sense. The current code really looks like
> > > __skb_incr_checksum_unnecessary, w/o the CHECKSUM_NONE check to only
> > > convert those packets.
> 
> If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> conversion breaks things. What about we just convert CHECKSUM_NONE to
> CHECKSUM_UNNECESSARY?

CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
receive path. Unless it is known to have been locally generated,
this means that the packet has not been verified yet.

> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 50a8a65fad23..44779d4c538b 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -174,7 +174,7 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>                 if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
>                         if (skb->csum_level < SKB_MAX_CSUM_LEVEL)
>                                 skb->csum_level++;
> -               } else {
> +               } else if (skb->ip_summed == CHECKSUM_NONE) {
>                         skb->ip_summed = CHECKSUM_UNNECESSARY;
>                         skb->csum_level = 0;
>                 }
> 
> Or directly call __skb_incr_checksum_unnecessary.
> 
> Thanks,
> Antoine



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-21 14:58               ` Willem de Bruijn
@ 2024-03-21 17:22                 ` Antoine Tenart
  2024-03-21 18:13                   ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Antoine Tenart @ 2024-03-21 17:22 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-21 15:58:17)
> Antoine Tenart wrote:
> > 
> > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > CHECKSUM_UNNECESSARY?
> 
> CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> receive path. Unless it is known to have been locally generated,
> this means that the packet has not been verified yet.

I'm not sure to follow, non-partial checksums are being verified by
skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
up in udp4/6_gro_complete. That's also probably what the original commit
msg refers to: "After validating the csum, we mark ip_summed as
CHECKSUM_UNNECESSARY for fraglist GRO packets".

With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.
Except for CHECKSUM_PARTIAL, as we discussed.

Does that make sense? Anything we can do to help moving this forward?

Thanks!
Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-21 17:22                 ` Antoine Tenart
@ 2024-03-21 18:13                   ` Willem de Bruijn
  2024-03-22 10:48                     ` Antoine Tenart
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2024-03-21 18:13 UTC (permalink / raw)
  To: Antoine Tenart, Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Antoine Tenart wrote:
> Quoting Willem de Bruijn (2024-03-21 15:58:17)
> > Antoine Tenart wrote:
> > > 
> > > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > > CHECKSUM_UNNECESSARY?
> > 
> > CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> > receive path. Unless it is known to have been locally generated,
> > this means that the packet has not been verified yet.
> 
> I'm not sure to follow, non-partial checksums are being verified by
> skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
> up in udp4/6_gro_complete. That's also probably what the original commit
> msg refers to: "After validating the csum, we mark ip_summed as
> CHECKSUM_UNNECESSARY for fraglist GRO packets".
> 
> With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.

Oh yes, of course.

> Except for CHECKSUM_PARTIAL, as we discussed.

Because that is treated as equivalent to CHECKSUM_UNNECESSARY in the
ingress path, and if forwarded to an egress path, csum_start and
csum_off are set correctly. Also for all segs after segmentation.
Okay, that sounds fine then.

There are two cases here: csum_start points to the outer header or it
points to the inner header. I suppose that does not matter for
correctness post segmentation.

> Does that make sense? Anything we can do to help moving this forward?
> 
> Thanks!
> Antoine



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum
  2024-03-21 18:13                   ` Willem de Bruijn
@ 2024-03-22 10:48                     ` Antoine Tenart
  0 siblings, 0 replies; 21+ messages in thread
From: Antoine Tenart @ 2024-03-22 10:48 UTC (permalink / raw)
  To: Willem de Bruijn, davem, edumazet, kuba, pabeni
  Cc: steffen.klassert, willemdebruijn.kernel, netdev

Quoting Willem de Bruijn (2024-03-21 19:13:35)
> Antoine Tenart wrote:
> > Quoting Willem de Bruijn (2024-03-21 15:58:17)
> > > Antoine Tenart wrote:
> > > > 
> > > > If I sum up our discussion CHECKSUM_NONE conversion is wanted,
> > > > CHECKSUM_UNNECESSARY conversion is a no-op and CHECKSUM_PARTIAL
> > > > conversion breaks things. What about we just convert CHECKSUM_NONE to
> > > > CHECKSUM_UNNECESSARY?
> > > 
> > > CHECKSUM_NONE cannot be converted to CHECKSUM_UNNECESSARY in the
> > > receive path. Unless it is known to have been locally generated,
> > > this means that the packet has not been verified yet.
> > 
> > I'm not sure to follow, non-partial checksums are being verified by
> > skb_gro_checksum_validate_zero_check in udp4/6_gro_receive before ending
> > up in udp4/6_gro_complete. That's also probably what the original commit
> > msg refers to: "After validating the csum, we mark ip_summed as
> > CHECKSUM_UNNECESSARY for fraglist GRO packets".
> > 
> > With fraglist, the csum can then be converted to CHECKSUM_UNNECESSARY.
> 
> Oh yes, of course.
> 
> > Except for CHECKSUM_PARTIAL, as we discussed.
> 
> Because that is treated as equivalent to CHECKSUM_UNNECESSARY in the
> ingress path, and if forwarded to an egress path, csum_start and
> csum_off are set correctly. Also for all segs after segmentation.
> Okay, that sounds fine then.
> 
> There are two cases here: csum_start points to the outer header or it
> points to the inner header. I suppose that does not matter for
> correctness post segmentation.

Right. I'll send a v3 to keep the none -> unnecessary conversion and
fix build issue in patch 1.

Thanks for the review!
Antoine

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-03-22 10:48 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-19  9:31 [PATCH net v2 0/4] gro: various fixes related to UDP tunnels Antoine Tenart
2024-03-19  9:31 ` [PATCH net v2 1/4] udp: do not accept non-tunnel GSO skbs landing in a tunnel Antoine Tenart
2024-03-20  2:41   ` Jakub Kicinski
2024-03-20 11:11     ` Antoine Tenart
2024-03-20 10:34   ` Nikolay Aleksandrov
2024-03-20 11:13     ` Antoine Tenart
2024-03-19  9:31 ` [PATCH net v2 2/4] gro: fix ownership transfer Antoine Tenart
2024-03-19 13:13   ` Willem de Bruijn
2024-03-19  9:31 ` [PATCH net v2 3/4] udp: do not transition UDP fraglist to unnecessary checksum Antoine Tenart
2024-03-19 13:38   ` Willem de Bruijn
2024-03-19 16:01     ` Antoine Tenart
2024-03-20 13:00       ` Willem de Bruijn
2024-03-20 15:08         ` Antoine Tenart
2024-03-20 20:43           ` Willem de Bruijn
2024-03-21  8:48             ` Antoine Tenart
2024-03-21 12:42               ` Paolo Abeni
2024-03-21 14:58               ` Willem de Bruijn
2024-03-21 17:22                 ` Antoine Tenart
2024-03-21 18:13                   ` Willem de Bruijn
2024-03-22 10:48                     ` Antoine Tenart
2024-03-19  9:31 ` [PATCH net v2 4/4] udp: prevent local UDP tunnel packets from being GROed Antoine Tenart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).