netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers
@ 2024-08-01 13:52 Jakub Sitnicki
  2024-08-01 13:52 ` [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO Jakub Sitnicki
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-01 13:52 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, kernel-team, syzbot+e15b7e15b8a751a91d9a

This series addresses a recent regression report from syzbot [1].
Please see patch 1 description for details.

[1] https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
Changes in v2:
- Contain the fix inside the GSO stack after discussing with Willem
- Rework tests after realizing the regression has nothing to do with tunnels
- Link to v1: https://lore.kernel.org/r/20240725-udp-gso-egress-from-tunnel-v1-0-5e5530ead524@cloudflare.com

---
Jakub Sitnicki (2):
      gso: Skip bad offload detection when device supports requested GSO
      selftests/net: Add coverage for UDP GSO with IPv6 extension headers

 net/core/gso.c                       |  6 ++++--
 tools/testing/selftests/net/udpgso.c | 25 ++++++++++++++++++++++++-
 2 files changed, 28 insertions(+), 3 deletions(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-01 13:52 [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers Jakub Sitnicki
@ 2024-08-01 13:52 ` Jakub Sitnicki
  2024-08-01 19:13   ` Willem de Bruijn
  2024-08-01 13:52 ` [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers Jakub Sitnicki
  2024-08-02  1:36 ` [PATCH net v2 0/2] Silence bad offload warning when sending " Jakub Kicinski
  2 siblings, 1 reply; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-01 13:52 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, kernel-team, syzbot+e15b7e15b8a751a91d9a

In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
checksum offload") we have intentionally allowed UDP GSO packets marked
CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
checksummed by a software fallback when the egress device lacks these
features.

What was not taken into consideration is that a CHECKSUM_NONE skb can be
handed over to the GSO stack also when the egress device advertises the
tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.

This can happen in two situations, which we detect in __ip_append_data()
and __ip6_append_data():

1) when there are IPv6 extension headers present, or
2) when the tunnel device does not advertise checksum offload.

Note that in the latter case we have a nonsensical device configuration.
Device support for UDP segmentation offload requires checksum offload in
hardware as well.

Syzbot has discovered the first case, producing a warning as below:

  ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
  WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
  Modules linked in:
  CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
  RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
  [...]
  Call Trace:
   <TASK>
   __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
   skb_gso_segment include/net/gso.h:83 [inline]
   validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
   __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
   neigh_output include/net/neighbour.h:542 [inline]
   ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
   ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
   ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
   udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
   udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
   sock_sendmsg_nosec net/socket.c:730 [inline]
   __sock_sendmsg+0xef/0x270 net/socket.c:745
   ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
   ___sys_sendmsg net/socket.c:2639 [inline]
   __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
   __do_sys_sendmmsg net/socket.c:2754 [inline]
   __se_sys_sendmmsg net/socket.c:2751 [inline]
   __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
   [...]
   </TASK>

We are hitting the bad offload warning because when an egress device is
capable of handling segmentation offload requested by
skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
any segment skbs and return NULL. See the skb_gso_ok() branch in
{__udp,tcp,sctp}_gso_segment helpers.

To fix it, skip bad offload detection when gso_segment has returned
nothing. We know that in such case the egress device supports the desired
GSO offload, which implies that it can fill in L4 checksums. Hence we don't
need to check the skb->ip_summed value, which reflects the egress device
checksum capabilities.

Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
Reported-by: syzbot+e15b7e15b8a751a91d9a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/gso.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/gso.c b/net/core/gso.c
index bcd156372f4d..88d5ca7d4fe7 100644
--- a/net/core/gso.c
+++ b/net/core/gso.c
@@ -122,10 +122,12 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 	skb_reset_mac_len(skb);
 
 	segs = skb_mac_gso_segment(skb, features);
+	if (IS_ERR_OR_NULL(segs))
+		goto out;
 
-	if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs)))
+	if (segs != skb && unlikely(skb_needs_check(skb, tx_path)))
 		skb_warn_bad_offload(skb);
-
+out:
 	return segs;
 }
 EXPORT_SYMBOL(__skb_gso_segment);

-- 
2.40.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers
  2024-08-01 13:52 [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers Jakub Sitnicki
  2024-08-01 13:52 ` [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO Jakub Sitnicki
@ 2024-08-01 13:52 ` Jakub Sitnicki
  2024-08-01 19:16   ` Willem de Bruijn
  2024-08-08  2:33   ` Willem de Bruijn
  2024-08-02  1:36 ` [PATCH net v2 0/2] Silence bad offload warning when sending " Jakub Kicinski
  2 siblings, 2 replies; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-01 13:52 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, kernel-team

After enabling UDP GSO for devices not offering checksum offload, we have
hit a regression where a bad offload warning can be triggered when sending
a datagram with IPv6 extension headers.

Extend the UDP GSO IPv6 tests to cover this scenario.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 tools/testing/selftests/net/udpgso.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/udpgso.c b/tools/testing/selftests/net/udpgso.c
index 3e74cfa1a2bf..3f2fca02fec5 100644
--- a/tools/testing/selftests/net/udpgso.c
+++ b/tools/testing/selftests/net/udpgso.c
@@ -67,6 +67,7 @@ struct testcase {
 	int gso_len;		/* mss after applying gso */
 	int r_num_mss;		/* recv(): number of calls of full mss */
 	int r_len_last;		/* recv(): size of last non-mss dgram, if any */
+	bool v6_ext_hdr;	/* send() dgrams with IPv6 extension headers */
 };
 
 const struct in6_addr addr6 = {
@@ -77,6 +78,8 @@ const struct in_addr addr4 = {
 	__constant_htonl(0x0a000001), /* 10.0.0.1 */
 };
 
+static const char ipv6_hopopts_pad1[8] = { 0 };
+
 struct testcase testcases_v4[] = {
 	{
 		/* no GSO: send a single byte */
@@ -255,6 +258,13 @@ struct testcase testcases_v6[] = {
 		.gso_len = 1,
 		.r_num_mss = 2,
 	},
+	{
+		/* send 2 1B segments with extension headers */
+		.tlen = 2,
+		.gso_len = 1,
+		.r_num_mss = 2,
+		.v6_ext_hdr = true,
+	},
 	{
 		/* send 2B + 2B + 1B segments */
 		.tlen = 5,
@@ -396,11 +406,18 @@ static void run_one(struct testcase *test, int fdt, int fdr,
 	int i, ret, val, mss;
 	bool sent;
 
-	fprintf(stderr, "ipv%d tx:%d gso:%d %s\n",
+	fprintf(stderr, "ipv%d tx:%d gso:%d %s%s\n",
 			addr->sa_family == AF_INET ? 4 : 6,
 			test->tlen, test->gso_len,
+			test->v6_ext_hdr ? "ext-hdr " : "",
 			test->tfail ? "(fail)" : "");
 
+	if (test->v6_ext_hdr) {
+		if (setsockopt(fdt, IPPROTO_IPV6, IPV6_HOPOPTS,
+			       ipv6_hopopts_pad1, sizeof(ipv6_hopopts_pad1)))
+			error(1, errno, "setsockopt ipv6 hopopts");
+	}
+
 	val = test->gso_len;
 	if (cfg_do_setsockopt) {
 		if (setsockopt(fdt, SOL_UDP, UDP_SEGMENT, &val, sizeof(val)))
@@ -412,6 +429,12 @@ static void run_one(struct testcase *test, int fdt, int fdr,
 		error(1, 0, "send succeeded while expecting failure");
 	if (!sent && !test->tfail)
 		error(1, 0, "send failed while expecting success");
+
+	if (test->v6_ext_hdr) {
+		if (setsockopt(fdt, IPPROTO_IPV6, IPV6_HOPOPTS, NULL, 0))
+			error(1, errno, "setsockopt ipv6 hopopts clear");
+	}
+
 	if (!sent)
 		return;
 

-- 
2.40.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-01 13:52 ` [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO Jakub Sitnicki
@ 2024-08-01 19:13   ` Willem de Bruijn
  2024-08-05 10:04     ` Jakub Sitnicki
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2024-08-01 19:13 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team,
	syzbot+e15b7e15b8a751a91d9a

On Thu, Aug 1, 2024 at 10:09 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
> checksum offload") we have intentionally allowed UDP GSO packets marked
> CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
> checksummed by a software fallback when the egress device lacks these
> features.
>
> What was not taken into consideration is that a CHECKSUM_NONE skb can be
> handed over to the GSO stack also when the egress device advertises the
> tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
>
> This can happen in two situations, which we detect in __ip_append_data()
> and __ip6_append_data():
>
> 1) when there are IPv6 extension headers present, or
> 2) when the tunnel device does not advertise checksum offload.
>
> Note that in the latter case we have a nonsensical device configuration.
> Device support for UDP segmentation offload requires checksum offload in
> hardware as well.
>
> Syzbot has discovered the first case, producing a warning as below:
>
>   ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
>   WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>   Modules linked in:
>   CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
>   RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>   [...]
>   Call Trace:
>    <TASK>
>    __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
>    skb_gso_segment include/net/gso.h:83 [inline]
>    validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
>    __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
>    neigh_output include/net/neighbour.h:542 [inline]
>    ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
>    ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
>    ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
>    udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
>    udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
>    sock_sendmsg_nosec net/socket.c:730 [inline]
>    __sock_sendmsg+0xef/0x270 net/socket.c:745
>    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
>    ___sys_sendmsg net/socket.c:2639 [inline]
>    __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
>    __do_sys_sendmmsg net/socket.c:2754 [inline]
>    __se_sys_sendmmsg net/socket.c:2751 [inline]
>    __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
>    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>    [...]
>    </TASK>
>
> We are hitting the bad offload warning because when an egress device is
> capable of handling segmentation offload requested by
> skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
> any segment skbs and return NULL. See the skb_gso_ok() branch in
> {__udp,tcp,sctp}_gso_segment helpers.
>
> To fix it, skip bad offload detection when gso_segment has returned
> nothing. We know that in such case the egress device supports the desired
> GSO offload, which implies that it can fill in L4 checksums. Hence we don't
> need to check the skb->ip_summed value, which reflects the egress device
> checksum capabilities.
>
> Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
> Reported-by: syzbot+e15b7e15b8a751a91d9a@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
being ignored and devices are trusted to always be able to checksum
offload when they can segment offload -- even when the device does not
advertise checksum offload.

I think we should have a follow-on that makes advertising
NETIF_F_GSO_UDP_L4 dependent on having at least one of the
NETIF_F_*_CSUM bits set (handwaving over what happens when only
advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers
  2024-08-01 13:52 ` [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers Jakub Sitnicki
@ 2024-08-01 19:16   ` Willem de Bruijn
  2024-08-08  2:33   ` Willem de Bruijn
  1 sibling, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2024-08-01 19:16 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team

On Thu, Aug 1, 2024 at 9:54 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> After enabling UDP GSO for devices not offering checksum offload, we have
> hit a regression where a bad offload warning can be triggered when sending
> a datagram with IPv6 extension headers.
>
> Extend the UDP GSO IPv6 tests to cover this scenario.
>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

Thanks for adding a regression test case!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers
  2024-08-01 13:52 [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers Jakub Sitnicki
  2024-08-01 13:52 ` [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO Jakub Sitnicki
  2024-08-01 13:52 ` [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers Jakub Sitnicki
@ 2024-08-02  1:36 ` Jakub Kicinski
  2024-08-02 11:20   ` Jakub Sitnicki
  2 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-08-02  1:36 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Willem de Bruijn, kernel-team, syzbot+e15b7e15b8a751a91d9a

On Thu, 01 Aug 2024 15:52:52 +0200 Jakub Sitnicki wrote:
> This series addresses a recent regression report from syzbot [1].
> Please see patch 1 description for details.

The test doesn't seem super happy in netdev CI:

https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/709100/79-udpgso-sh/stdout
https://netdev-3.bots.linux.dev/vmksft-net/results/709101/78-udpgso-sh/stdout
https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/708921/75-udpgso-sh/stdout
https://netdev-3.bots.linux.dev/vmksft-net/results/708921/78-udpgso-sh/stdout

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers
  2024-08-02  1:36 ` [PATCH net v2 0/2] Silence bad offload warning when sending " Jakub Kicinski
@ 2024-08-02 11:20   ` Jakub Sitnicki
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-02 11:20 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
	Willem de Bruijn, kernel-team, syzbot+e15b7e15b8a751a91d9a

On Thu, Aug 01, 2024 at 06:36 PM -07, Jakub Kicinski wrote:
> On Thu, 01 Aug 2024 15:52:52 +0200 Jakub Sitnicki wrote:
>> This series addresses a recent regression report from syzbot [1].
>> Please see patch 1 description for details.
>
> The test doesn't seem super happy in netdev CI:
>
> https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/709100/79-udpgso-sh/stdout
> https://netdev-3.bots.linux.dev/vmksft-net/results/709101/78-udpgso-sh/stdout
> https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/708921/75-udpgso-sh/stdout
> https://netdev-3.bots.linux.dev/vmksft-net/results/708921/78-udpgso-sh/stdout

Embarassing. I must have not recompiled the tests after tweaking it.
Sorry for the oversight. I will deal with it.

Reproduces for me locally:

[pid   507] setsockopt(4, SOL_IPV6, IPV6_HOPOPTS, NULL, 0) = 0
[pid   507] recvfrom(3, 0x55fcf63813a0, 65535, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-01 19:13   ` Willem de Bruijn
@ 2024-08-05 10:04     ` Jakub Sitnicki
  2024-08-05 14:25       ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-05 10:04 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team,
	syzbot+e15b7e15b8a751a91d9a

On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
> On Thu, Aug 1, 2024 at 10:09 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
>> checksum offload") we have intentionally allowed UDP GSO packets marked
>> CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
>> checksummed by a software fallback when the egress device lacks these
>> features.
>>
>> What was not taken into consideration is that a CHECKSUM_NONE skb can be
>> handed over to the GSO stack also when the egress device advertises the
>> tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
>>
>> This can happen in two situations, which we detect in __ip_append_data()
>> and __ip6_append_data():
>>
>> 1) when there are IPv6 extension headers present, or
>> 2) when the tunnel device does not advertise checksum offload.
>>
>> Note that in the latter case we have a nonsensical device configuration.
>> Device support for UDP segmentation offload requires checksum offload in
>> hardware as well.
>>
>> Syzbot has discovered the first case, producing a warning as below:
>>
>>   ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
>>   WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>>   Modules linked in:
>>   CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
>>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
>>   RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
>>   [...]
>>   Call Trace:
>>    <TASK>
>>    __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
>>    skb_gso_segment include/net/gso.h:83 [inline]
>>    validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
>>    __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
>>    neigh_output include/net/neighbour.h:542 [inline]
>>    ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
>>    ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
>>    ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
>>    udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
>>    udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
>>    sock_sendmsg_nosec net/socket.c:730 [inline]
>>    __sock_sendmsg+0xef/0x270 net/socket.c:745
>>    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
>>    ___sys_sendmsg net/socket.c:2639 [inline]
>>    __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
>>    __do_sys_sendmmsg net/socket.c:2754 [inline]
>>    __se_sys_sendmmsg net/socket.c:2751 [inline]
>>    __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
>>    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>    [...]
>>    </TASK>
>>
>> We are hitting the bad offload warning because when an egress device is
>> capable of handling segmentation offload requested by
>> skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
>> any segment skbs and return NULL. See the skb_gso_ok() branch in
>> {__udp,tcp,sctp}_gso_segment helpers.
>>
>> To fix it, skip bad offload detection when gso_segment has returned
>> nothing. We know that in such case the egress device supports the desired
>> GSO offload, which implies that it can fill in L4 checksums. Hence we don't
>> need to check the skb->ip_summed value, which reflects the egress device
>> checksum capabilities.
>>
>> Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
>> Reported-by: syzbot+e15b7e15b8a751a91d9a@syzkaller.appspotmail.com
>> Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
>
> It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
> being ignored and devices are trusted to always be able to checksum
> offload when they can segment offload -- even when the device does not
> advertise checksum offload.
>
> I think we should have a follow-on that makes advertising
> NETIF_F_GSO_UDP_L4 dependent on having at least one of the
> NETIF_F_*_CSUM bits set (handwaving over what happens when only
> advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).

I agree. I've also gained some clarity as to how the fix should
look. Let's circle back to it, if we still think it's relevant once we
hash out the fix.

After spending some quality time debugging the addded regression test
[1], I've realized this fix is wrong.

You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
with no checksum offload"), I've opened up the UDP_SEGMENT API to two
uses, which I think should not be allowed:

1. Hardware USO for IPv6 dgrams with extension headers

Previously that led to -EIO, because __ip6_append_data won't annotate
such packets as CHECKSUM_PARTIAL.

I'm guessing that we do this because some drivers that advertise csum
offload can't actually handle checksumming when extension headers are
present.

Extension headers are not part of IPv6 pseudo header, but who knows what
limitations NIC firmwares have.

Either way, changing it just like that sounds risky, so I think we need
to fall back to software USO with software checksum in this case.

Alternatively, we could catch it in the udp layer and error out with EIO
as ealier. But that shifts some burden onto the user space (detect and
segment before sendmsg()).

2. Hardware USO when hardware csum is unsupported / disabled

That sounds like a pathological device configuration case, but since it
is possible today with some drivers to disable csum offload for one IP
version and not the other, it seems safest to just handle that
gracefully.

I don't know why one might want to do that. Perhaps as a workaround for
some firmware bug while waiting for a fix?

In this scenario I think we also need to fall back to software USO and
checksum.

Code-wise that could look like below. WDYT?

[1] I feel silly for submitting a broken patch. I've been running a
stale test prog in a VM. It seems that virtme-ng with a read+write
overlay for rootfs played a trick on me. Changes to the host files are
not (always?) visible to the guest.

---8<---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 83f8cd8aa2d1..819173807c81 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4806,7 +4806,13 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 			goto perform_csum_check;
 
 		if (!sg) {
-			if (!csum) {
+			/* Fall back to software checksum for segments if:
+			 * 1) device can't checksum this network protocol, OR
+			 * 2) we consider the packet to be not checksummable in
+			 *    hardware, for example IPv6 extension headers are
+			 *    present.
+			 */
+			if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
 				if (!nskb->remcsum_offload)
 					nskb->ip_summed = CHECKSUM_NONE;
 				SKB_GSO_CB(nskb)->csum =
@@ -4896,7 +4902,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 		nskb->truesize += nskb->data_len;
 
 perform_csum_check:
-		if (!csum) {
+		if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
 			if (skb_has_shared_frag(nskb) &&
 			    __skb_linearize(nskb))
 				goto err;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index bc8a9da750fe..0a34b418b83c 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -282,7 +282,15 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 		     skb_transport_header(gso_skb)))
 		return ERR_PTR(-EINVAL);
 
-	if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
+	/* Pass-through to device for segmentation only if:
+	 * 1) we consider the packet checksummable in hardware, that is no
+	 *    IPv6 extension headers present, AND
+	 * 2) device supports checksum offload for this network protocol
+	 *    (NETIF_F_{IP,IPV6}_CSUM or NETIF_F_HW_CSUM), AND
+	 * 3) device supports the requested GSO kind.
+	 */
+	if (gso_skb->ip_summed == CHECKSUM_PARTIAL &&
+	    skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
 		/* Packet is from an untrusted source, reset gso_segs. */
 		skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh),
 							     mss);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-05 10:04     ` Jakub Sitnicki
@ 2024-08-05 14:25       ` Willem de Bruijn
  2024-08-05 15:59         ` Jakub Sitnicki
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2024-08-05 14:25 UTC (permalink / raw)
  To: Jakub Sitnicki, Willem de Bruijn
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team,
	syzbot+e15b7e15b8a751a91d9a

Jakub Sitnicki wrote:
> On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
> > On Thu, Aug 1, 2024 at 10:09 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>
> >> In commit 10154dbded6d ("udp: Allow GSO transmit from devices with no
> >> checksum offload") we have intentionally allowed UDP GSO packets marked
> >> CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
> >> checksummed by a software fallback when the egress device lacks these
> >> features.
> >>
> >> What was not taken into consideration is that a CHECKSUM_NONE skb can be
> >> handed over to the GSO stack also when the egress device advertises the
> >> tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
> >>
> >> This can happen in two situations, which we detect in __ip_append_data()
> >> and __ip6_append_data():
> >>
> >> 1) when there are IPv6 extension headers present, or
> >> 2) when the tunnel device does not advertise checksum offload.
> >>
> >> Note that in the latter case we have a nonsensical device configuration.
> >> Device support for UDP segmentation offload requires checksum offload in
> >> hardware as well.
> >>
> >> Syzbot has discovered the first case, producing a warning as below:
> >>
> >>   ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
> >>   WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
> >>   Modules linked in:
> >>   CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445da62 #0
> >>   Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
> >>   RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
> >>   [...]
> >>   Call Trace:
> >>    <TASK>
> >>    __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
> >>    skb_gso_segment include/net/gso.h:83 [inline]
> >>    validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
> >>    __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
> >>    neigh_output include/net/neighbour.h:542 [inline]
> >>    ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
> >>    ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
> >>    ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
> >>    udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
> >>    udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
> >>    sock_sendmsg_nosec net/socket.c:730 [inline]
> >>    __sock_sendmsg+0xef/0x270 net/socket.c:745
> >>    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
> >>    ___sys_sendmsg net/socket.c:2639 [inline]
> >>    __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
> >>    __do_sys_sendmmsg net/socket.c:2754 [inline]
> >>    __se_sys_sendmmsg net/socket.c:2751 [inline]
> >>    __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
> >>    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >>    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> >>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >>    [...]
> >>    </TASK>
> >>
> >> We are hitting the bad offload warning because when an egress device is
> >> capable of handling segmentation offload requested by
> >> skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
> >> any segment skbs and return NULL. See the skb_gso_ok() branch in
> >> {__udp,tcp,sctp}_gso_segment helpers.
> >>
> >> To fix it, skip bad offload detection when gso_segment has returned
> >> nothing. We know that in such case the egress device supports the desired
> >> GSO offload, which implies that it can fill in L4 checksums. Hence we don't
> >> need to check the skb->ip_summed value, which reflects the egress device
> >> checksum capabilities.
> >>
> >> Fixes: 10154dbded6d ("udp: Allow GSO transmit from devices with no checksum offload")
> >> Reported-by: syzbot+e15b7e15b8a751a91d9a@syzkaller.appspotmail.com
> >> Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
> >> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> >
> > Reviewed-by: Willem de Bruijn <willemb@google.com>
> >
> > It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
> > being ignored and devices are trusted to always be able to checksum
> > offload when they can segment offload -- even when the device does not
> > advertise checksum offload.
> >
> > I think we should have a follow-on that makes advertising
> > NETIF_F_GSO_UDP_L4 dependent on having at least one of the
> > NETIF_F_*_CSUM bits set (handwaving over what happens when only
> > advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).
> 
> I agree. I've also gained some clarity as to how the fix should
> look. Let's circle back to it, if we still think it's relevant once we
> hash out the fix.
> 
> After spending some quality time debugging the addded regression test
> [1], I've realized this fix is wrong.
> 
> You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
> with no checksum offload"), I've opened up the UDP_SEGMENT API to two
> uses, which I think should not be allowed:
> 
> 1. Hardware USO for IPv6 dgrams with extension headers
> 
> Previously that led to -EIO, because __ip6_append_data won't annotate
> such packets as CHECKSUM_PARTIAL.
> 
> I'm guessing that we do this because some drivers that advertise csum
> offload can't actually handle checksumming when extension headers are
> present.
> 
> Extension headers are not part of IPv6 pseudo header, but who knows what
> limitations NIC firmwares have.
> 
> Either way, changing it just like that sounds risky, so I think we need
> to fall back to software USO with software checksum in this case.
> 
> Alternatively, we could catch it in the udp layer and error out with EIO
> as ealier. But that shifts some burden onto the user space (detect and
> segment before sendmsg()).
> 
> 2. Hardware USO when hardware csum is unsupported / disabled
> 
> That sounds like a pathological device configuration case, but since it
> is possible today with some drivers to disable csum offload for one IP
> version and not the other, it seems safest to just handle that
> gracefully.
> 
> I don't know why one might want to do that. Perhaps as a workaround for
> some firmware bug while waiting for a fix?

I doubt that this is actually used. But today it can be configured.

Which is why I suggested making NETIF_F_GSO_UDP_L4 dependent on csum
offload (in netdev_fix_features). I doubt that that will break any
real user.
 
> In this scenario I think we also need to fall back to software USO and
> checksum.
> 
> Code-wise that could look like below. WDYT?

Since this only affects USO, can we fix this is in __udp_gso_segment.
Basically, not taking the NETIF_F_GSO_ROBUST path.

skb_segment is so complicated already. Whatever we can do to avoid
adding to that.

> [1] I feel silly for submitting a broken patch. I've been running a
> stale test prog in a VM. It seems that virtme-ng with a read+write
> overlay for rootfs played a trick on me. Changes to the host files are
> not (always?) visible to the guest.
> 
> ---8<---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 83f8cd8aa2d1..819173807c81 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4806,7 +4806,13 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  			goto perform_csum_check;
>  
>  		if (!sg) {
> -			if (!csum) {
> +			/* Fall back to software checksum for segments if:
> +			 * 1) device can't checksum this network protocol, OR
> +			 * 2) we consider the packet to be not checksummable in
> +			 *    hardware, for example IPv6 extension headers are
> +			 *    present.
> +			 */
> +			if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
>  				if (!nskb->remcsum_offload)
>  					nskb->ip_summed = CHECKSUM_NONE;
>  				SKB_GSO_CB(nskb)->csum =
> @@ -4896,7 +4902,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>  		nskb->truesize += nskb->data_len;
>  
>  perform_csum_check:
> -		if (!csum) {
> +		if (!csum || head_skb->ip_summed != CHECKSUM_PARTIAL) {
>  			if (skb_has_shared_frag(nskb) &&
>  			    __skb_linearize(nskb))
>  				goto err;
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index bc8a9da750fe..0a34b418b83c 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -282,7 +282,15 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
>  		     skb_transport_header(gso_skb)))
>  		return ERR_PTR(-EINVAL);
>  
> -	if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
> +	/* Pass-through to device for segmentation only if:
> +	 * 1) we consider the packet checksummable in hardware, that is no
> +	 *    IPv6 extension headers present, AND
> +	 * 2) device supports checksum offload for this network protocol
> +	 *    (NETIF_F_{IP,IPV6}_CSUM or NETIF_F_HW_CSUM), AND
> +	 * 3) device supports the requested GSO kind.
> +	 */
> +	if (gso_skb->ip_summed == CHECKSUM_PARTIAL &&
> +	    skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
>  		/* Packet is from an untrusted source, reset gso_segs. */
>  		skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh),
>  							     mss);



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-05 14:25       ` Willem de Bruijn
@ 2024-08-05 15:59         ` Jakub Sitnicki
  2024-08-05 16:11           ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Sitnicki @ 2024-08-05 15:59 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team,
	syzbot+e15b7e15b8a751a91d9a

On Mon, Aug 05, 2024 at 10:25 AM -04, Willem de Bruijn wrote:
> Jakub Sitnicki wrote:
>> On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:

[...]

>> > It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
>> > being ignored and devices are trusted to always be able to checksum
>> > offload when they can segment offload -- even when the device does not
>> > advertise checksum offload.
>> >
>> > I think we should have a follow-on that makes advertising
>> > NETIF_F_GSO_UDP_L4 dependent on having at least one of the
>> > NETIF_F_*_CSUM bits set (handwaving over what happens when only
>> > advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).
>> 
>> I agree. I've also gained some clarity as to how the fix should
>> look. Let's circle back to it, if we still think it's relevant once we
>> hash out the fix.
>> 
>> After spending some quality time debugging the addded regression test
>> [1], I've realized this fix is wrong.
>> 
>> You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
>> with no checksum offload"), I've opened up the UDP_SEGMENT API to two
>> uses, which I think should not be allowed:
>> 
>> 1. Hardware USO for IPv6 dgrams with extension headers
>> 
>> Previously that led to -EIO, because __ip6_append_data won't annotate
>> such packets as CHECKSUM_PARTIAL.
>> 
>> I'm guessing that we do this because some drivers that advertise csum
>> offload can't actually handle checksumming when extension headers are
>> present.
>> 
>> Extension headers are not part of IPv6 pseudo header, but who knows what
>> limitations NIC firmwares have.
>> 
>> Either way, changing it just like that sounds risky, so I think we need
>> to fall back to software USO with software checksum in this case.
>> 
>> Alternatively, we could catch it in the udp layer and error out with EIO
>> as ealier. But that shifts some burden onto the user space (detect and
>> segment before sendmsg()).
>> 
>> 2. Hardware USO when hardware csum is unsupported / disabled
>> 
>> That sounds like a pathological device configuration case, but since it
>> is possible today with some drivers to disable csum offload for one IP
>> version and not the other, it seems safest to just handle that
>> gracefully.
>> 
>> I don't know why one might want to do that. Perhaps as a workaround for
>> some firmware bug while waiting for a fix?
>
> I doubt that this is actually used. But today it can be configured.
>
> Which is why I suggested making NETIF_F_GSO_UDP_L4 dependent on csum
> offload (in netdev_fix_features). I doubt that that will break any
> real user.

Sounds like a plan. If we're talking about simply disabling GSO_UDP_L4
whenever either NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM is "off", then that
is straightforward. And the NETIF_F_HW_CSUM dependency is clear.

I could even piggy back it on this series, at the risk of additional
iterations.

>  
>> In this scenario I think we also need to fall back to software USO and
>> checksum.
>> 
>> Code-wise that could look like below. WDYT?
>
> Since this only affects USO, can we fix this is in __udp_gso_segment.
> Basically, not taking the NETIF_F_GSO_ROBUST path.
>
> skb_segment is so complicated already. Whatever we can do to avoid
> adding to that.

skb_segment is a complex beast. No disagreement there.

Keeping the changes down seems doable. We can drive skb_segment to
compute the checksum, when we know that's needed (because IPv6 extension
headers are present -> ip_summed is CHECKSUM_NONE) by masking off csum
flags. Thanks for the suggestion.

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO
  2024-08-05 15:59         ` Jakub Sitnicki
@ 2024-08-05 16:11           ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2024-08-05 16:11 UTC (permalink / raw)
  To: Jakub Sitnicki, Willem de Bruijn
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Willem de Bruijn, kernel-team,
	syzbot+e15b7e15b8a751a91d9a

Jakub Sitnicki wrote:
> On Mon, Aug 05, 2024 at 10:25 AM -04, Willem de Bruijn wrote:
> > Jakub Sitnicki wrote:
> >> On Thu, Aug 01, 2024 at 03:13 PM -04, Willem de Bruijn wrote:
> 
> [...]
> 
> >> > It's a bit odd, in that the ip_summed == CHECKSUM_NONE ends up just
> >> > being ignored and devices are trusted to always be able to checksum
> >> > offload when they can segment offload -- even when the device does not
> >> > advertise checksum offload.
> >> >
> >> > I think we should have a follow-on that makes advertising
> >> > NETIF_F_GSO_UDP_L4 dependent on having at least one of the
> >> > NETIF_F_*_CSUM bits set (handwaving over what happens when only
> >> > advertising NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM).
> >> 
> >> I agree. I've also gained some clarity as to how the fix should
> >> look. Let's circle back to it, if we still think it's relevant once we
> >> hash out the fix.
> >> 
> >> After spending some quality time debugging the addded regression test
> >> [1], I've realized this fix is wrong.
> >> 
> >> You see, with commit 10154dbded6d ("udp: Allow GSO transmit from devices
> >> with no checksum offload"), I've opened up the UDP_SEGMENT API to two
> >> uses, which I think should not be allowed:
> >> 
> >> 1. Hardware USO for IPv6 dgrams with extension headers
> >> 
> >> Previously that led to -EIO, because __ip6_append_data won't annotate
> >> such packets as CHECKSUM_PARTIAL.
> >> 
> >> I'm guessing that we do this because some drivers that advertise csum
> >> offload can't actually handle checksumming when extension headers are
> >> present.
> >> 
> >> Extension headers are not part of IPv6 pseudo header, but who knows what
> >> limitations NIC firmwares have.
> >> 
> >> Either way, changing it just like that sounds risky, so I think we need
> >> to fall back to software USO with software checksum in this case.
> >> 
> >> Alternatively, we could catch it in the udp layer and error out with EIO
> >> as ealier. But that shifts some burden onto the user space (detect and
> >> segment before sendmsg()).
> >> 
> >> 2. Hardware USO when hardware csum is unsupported / disabled
> >> 
> >> That sounds like a pathological device configuration case, but since it
> >> is possible today with some drivers to disable csum offload for one IP
> >> version and not the other, it seems safest to just handle that
> >> gracefully.
> >> 
> >> I don't know why one might want to do that. Perhaps as a workaround for
> >> some firmware bug while waiting for a fix?
> >
> > I doubt that this is actually used. But today it can be configured.
> >
> > Which is why I suggested making NETIF_F_GSO_UDP_L4 dependent on csum
> > offload (in netdev_fix_features). I doubt that that will break any
> > real user.
> 
> Sounds like a plan. If we're talking about simply disabling GSO_UDP_L4
> whenever either NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM is "off", then that
> is straightforward. And the NETIF_F_HW_CSUM dependency is clear.
> 
> I could even piggy back it on this series, at the risk of additional
> iterations.

A two patch series SGTM.

> >  
> >> In this scenario I think we also need to fall back to software USO and
> >> checksum.
> >> 
> >> Code-wise that could look like below. WDYT?
> >
> > Since this only affects USO, can we fix this is in __udp_gso_segment.
> > Basically, not taking the NETIF_F_GSO_ROBUST path.
> >
> > skb_segment is so complicated already. Whatever we can do to avoid
> > adding to that.
> 
> skb_segment is a complex beast. No disagreement there.
> 
> Keeping the changes down seems doable. We can drive skb_segment to
> compute the checksum, when we know that's needed (because IPv6 extension
> headers are present -> ip_summed is CHECKSUM_NONE) by masking off csum
> flags. Thanks for the suggestion.

Perfect.

The effort to add a selftest already proved its worth btw :)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers
  2024-08-01 13:52 ` [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers Jakub Sitnicki
  2024-08-01 19:16   ` Willem de Bruijn
@ 2024-08-08  2:33   ` Willem de Bruijn
  1 sibling, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2024-08-08  2:33 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, kernel-team

Jakub Sitnicki wrote:
> After enabling UDP GSO for devices not offering checksum offload, we have
> hit a regression where a bad offload warning can be triggered when sending
> a datagram with IPv6 extension headers.
> 
> Extend the UDP GSO IPv6 tests to cover this scenario.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-08-08  2:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01 13:52 [PATCH net v2 0/2] Silence bad offload warning when sending UDP GSO with IPv6 extension headers Jakub Sitnicki
2024-08-01 13:52 ` [PATCH net v2 1/2] gso: Skip bad offload detection when device supports requested GSO Jakub Sitnicki
2024-08-01 19:13   ` Willem de Bruijn
2024-08-05 10:04     ` Jakub Sitnicki
2024-08-05 14:25       ` Willem de Bruijn
2024-08-05 15:59         ` Jakub Sitnicki
2024-08-05 16:11           ` Willem de Bruijn
2024-08-01 13:52 ` [PATCH net v2 2/2] selftests/net: Add coverage for UDP GSO with IPv6 extension headers Jakub Sitnicki
2024-08-01 19:16   ` Willem de Bruijn
2024-08-08  2:33   ` Willem de Bruijn
2024-08-02  1:36 ` [PATCH net v2 0/2] Silence bad offload warning when sending " Jakub Kicinski
2024-08-02 11:20   ` Jakub Sitnicki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).