netdev.vger.kernel.org archive mirror
* PROBLEM: invalid udp checksum with ip6gre-in-udp
@ 2024-09-05 15:22 Benoît Monin
  2024-09-12 14:04 ` Benoît Monin
  0 siblings, 1 reply; 3+ messages in thread
From: Benoît Monin @ 2024-09-05 15:22 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, linux-kernel

Hi all,

I am having an issue with GRE-in-UDP (GRE with fou encapsulation) over
IPv6: the outer UDP checksum is only valid if an inner checksum is
present and valid. The problem occurs only over IPv6, not IPv4, and is
reproducible with different kernel versions and CPU architectures.

Here is the test setup I used:

    +----------+                                    +------------+
    |          | fd00::2/64              fd00::1/64 |            |
    |  tester  | 10.0.0.33/24          10.0.0.11/24 |    DUT     |
    |          |------------------------------------|            |
    +----------+                                    +------------+

Two machines are connected with an Ethernet cable, and each side is
set up with an IPv4 and an IPv6 address.

On the device under test (DUT), two GRE-in-UDP tunnels are set up, one
over IPv6 and one over IPv4, aimed at the tester. Each tunnel is
configured with an IPv4 address:

    modprobe -a fou fou6

    ip link add gre6 type ip6gre local fd00::1 remote fd00::2 encap fou encap-sport 1234 encap-dport 4567
    ip link set up gre6
    ip address add 172.20.6.1/24 dev gre6

    ip link add gre4 type gre local 10.0.0.11 remote 10.0.0.33 encap fou encap-sport 3456 encap-dport 6789 encap-csum
    ip link set up gre4
    ip address add 172.20.4.1/24 dev gre4

On the tester, no setup is done; it only runs tcpdump to capture the
traffic emitted by the DUT.

The following commands are used on the device under test to send
packets through the tunnels:

    ping -c1 -W0.1 172.20.6.2
    ./send_udp 172.20.6.2 5555 "ip6gre-in-udp with inner udp csum"
    SO_NO_CHECK=1 ./send_udp 172.20.6.2 5555 "ip6gre-in-udp without inner udp csum"
    ping -c1 -W0.1 172.20.4.2
    ./send_udp 172.20.4.2 5555 "gre-in-udp with inner udp csum"
    SO_NO_CHECK=1 ./send_udp 172.20.4.2 5555 "gre-in-udp without inner udp csum"

Three packets are sent in each GRE tunnel:
* One ICMP echo request
* One UDP packet with a valid checksum
* One UDP packet with the checksum set to 0

Here is a link to the pcap containing the six packets generated by the 
previous commands:
https://onaip.mooo.com/pub/tmp/bug_ip6gre-in-udp.pcap

Some details about the captured packets:
IP6 fd00::1 > fd00::2: 1234 > 4567: [bad udp cksum 0xfa75 -> 0xe680!]
IP6 fd00::1 > fd00::2: 1234 > 4567: [udp sum ok]
IP6 fd00::1 > fd00::2: 1234 > 4567: [bad udp cksum 0xfa62 -> 0xb57c!]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]

For the tunnel over IPv6, only the UDP packet sent with a valid
checksum gets encapsulated with a valid outer checksum. For the ping
and the UDP packet with a zero checksum, the outer UDP checksum matches
the partial checksum of the pseudo-header.
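
As a cross-check of the "partial checksum" reading, here is a minimal
Python sketch (not part of the original test setup) that sums the IPv6
pseudo-header for the addresses used above. The UDP lengths are
assumptions: 96 bytes for the ping (8 UDP + 4 GRE + 20 IPv4 + 8 ICMP +
56 bytes of default ping data) and 77 bytes inferred for the
zero-checksum UDP packet. The function names are made up for
illustration.

```python
import ipaddress
import struct

IPPROTO_UDP = 17


def sum16(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry (not inverted)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp6_pseudo_header_sum(src: str, dst: str, udp_len: int) -> int:
    """Folded sum of the IPv6 pseudo-header: addresses, 32-bit upper-layer
    length, three zero bytes, then the next-header value."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", udp_len, IPPROTO_UDP)
    )
    return sum16(pseudo)


# Assumed outer UDP lengths, see the paragraph above.
print(hex(udp6_pseudo_header_sum("fd00::1", "fd00::2", 96)))
print(hex(udp6_pseudo_header_sum("fd00::1", "fd00::2", 77)))
```

Under those assumptions the two sums are 0xfa75 and 0xfa62, the same
values tcpdump reports as bad checksums, which is consistent with the
checksum never being completed after the pseudo-header seed was stored.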

For IPv4, the outer UDP checksums are all valid.

The device under test used for the capture is an x86-64 machine with a
Realtek Ethernet adapter (r8169 driver) running a 6.10.7 kernel.

The problem was also seen on an arm64 board (Freescale LS1046) with the
dpaa Ethernet driver running a 4.14 kernel. On this hardware, the
packets that would have an invalid checksum are not emitted and the tx
error counter of the Ethernet interface increases.

In all cases, disabling hardware checksumming for IPv6 with ethtool can
be used as a workaround:

    ethtool -K eth0 tx-checksum-ipv6 off

This and the partial checksum value seem to point to an error in the
handling of hardware checksumming in the particular case of fou6
encapsulation, but I have not been able to figure out what could be
causing it.

Did I miss a configuration parameter for ip6gre-in-udp? Any advice on
how to debug this would be appreciated.

-- 
Benoît




* Re: PROBLEM: invalid udp checksum with ip6gre-in-udp
  2024-09-05 15:22 PROBLEM: invalid udp checksum with ip6gre-in-udp Benoît Monin
@ 2024-09-12 14:04 ` Benoît Monin
  2024-09-20 12:10   ` Benoît Monin
  0 siblings, 1 reply; 3+ messages in thread
From: Benoît Monin @ 2024-09-12 14:04 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, linux-kernel

Hi again,

I did some more digging with 6.11-rc7, and it is not a problem in the
common code or the UDP encapsulation; I just got "lucky" with my tests...

First, in fou6.c, fou6_build_udp constructs the UDP header and calls
udp6_set_csum. If the inner packet has a valid checksum, the outer UDP
checksum is computed by reusing it; otherwise only the partial checksum
is set.

Next, the outer IPv6 header is built in ip6_tunnel.c:ip6_tnl_xmit. With
the default value of encaplimit (4), a destination options header is
inserted to carry that value. This means that ipv6_hdr(skb)->nexthdr is
set to 60 (NEXTHDR_DEST), not 17 (IPPROTO_UDP).

So when sending a packet without a valid checksum through an
ip6gre-in-udp tunnel, we pass an skb with a partial UDP checksum and an
IPv6 header followed by an extension header to ndo_start_xmit.

Finally, what are the odds of testing two different pieces of hardware
and finding two similar bugs?

For the LS1046 platform, the Tx checksum is done in
dpaa_enable_tx_csum. For IPv6, the layer 4 protocol is extracted from
the skb with l4_proto = ipv6h->nexthdr. Since the value matches neither
IPPROTO_UDP nor IPPROTO_TCP, the function errors out and the packet is
discarded.

For the PC with a Realtek 8169 card, the code is similar in
rtl8169_tso_csum_v2, with ip_protocol = ipv6_hdr(skb)->nexthdr. The
error case then triggers a WARN_ON_ONCE(1) and the packet is still sent
on the wire with a partial UDP checksum.

There seem to be quite a few places in drivers/net/ethernet that use
ipv6_hdr(skb)->nexthdr as the IP protocol, with only some of them
calling ipv6_skip_exthdr afterward to take care of the extension
headers. So my proposal is to add a small helper (ipv6_protocol?) and
fix the drivers where needed.
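
To illustrate why reading nexthdr directly gives the wrong answer here,
the following Python sketch walks the extension header chain of a
hand-built packet. It is a userspace illustration only, not the
proposed kernel helper; the packet layout (an 8-byte destination
options header carrying a tunnel encapsulation limit of 4, followed by
a bare UDP header) and the function names are assumptions.

```python
import ipaddress
import struct

IPV6_HDR_LEN = 40
NEXTHDR_FRAGMENT = 44
NEXTHDR_AUTH = 51
# Hop-by-hop (0), routing (43) and destination options (60) share the
# "length in 8-byte units, not counting the first 8 bytes" encoding.
GENERIC_EXT_HEADERS = {0, 43, 60}


def l4_protocol(packet: bytes) -> tuple[int, int]:
    """Return (protocol, offset) of the first non-extension header."""
    nexthdr = packet[6]  # Next Header field of the fixed IPv6 header
    offset = IPV6_HDR_LEN
    while True:
        if nexthdr in GENERIC_EXT_HEADERS:
            hdr_len = (packet[offset + 1] + 1) * 8
        elif nexthdr == NEXTHDR_FRAGMENT:
            hdr_len = 8
        elif nexthdr == NEXTHDR_AUTH:
            hdr_len = (packet[offset + 1] + 2) * 4
        else:
            return nexthdr, offset
        nexthdr = packet[offset]  # first byte of every extension header
        offset += hdr_len


def build_sample_packet() -> bytes:
    """Hypothetical outer packet: IPv6 + destination options + UDP header."""
    # next header 17, length 0, encap-limit option (type 4, len 1, value 4),
    # then a PadN option (type 1, len 1, one zero byte).
    dest_opts = bytes([17, 0, 4, 1, 4, 1, 1, 0])
    udp = struct.pack("!HHHH", 1234, 4567, 8, 0)
    payload = dest_opts + udp
    header = struct.pack(
        "!IHBB16s16s",
        6 << 28,          # version 6, traffic class and flow label 0
        len(payload),
        60,               # NEXTHDR_DEST
        64,               # hop limit
        ipaddress.IPv6Address("fd00::1").packed,
        ipaddress.IPv6Address("fd00::2").packed,
    )
    return header + payload


packet = build_sample_packet()
print(packet[6], l4_protocol(packet))
```

For the sample packet, the fixed header reports 60 while the walk ends
at protocol 17, 48 bytes into the packet.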

I'll try to come up with a patch set, unless someone has a better idea.

-- 
Benoît




* Re: PROBLEM: invalid udp checksum with ip6gre-in-udp
  2024-09-12 14:04 ` Benoît Monin
@ 2024-09-20 12:10   ` Benoît Monin
  0 siblings, 0 replies; 3+ messages in thread
From: Benoît Monin @ 2024-09-20 12:10 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, linux-kernel

Hi,

Looking more into this, I don't think the problem is in the Ethernet
drivers. Both dpaa and r8169 declare NETIF_F_IPV6_CSUM in their
hw_features. The documentation for this flag says "IPv6 extension
headers are not supported with this feature". Yet in the call path
described in my previous email, the driver is handed an skb with both
an IPv6 extension header and a UDP header with a partial checksum.

Below is the ftrace of a ping sent through an ip6gre-in-udp tunnel, 
when reaching the validate_xmit_skb call of the ethernet driver:

 3)               |                                                validate_xmit_skb_list() {
 3)               |                                                  validate_xmit_skb() {
 3)               |                                                    netif_skb_features() {
 3)               |                                                      rtl8169_features_check [r8169]() {
 3)   0.576 us    |                                                        rtl_quirk_packet_padto [r8169]();
 3)   1.812 us    |                                                      }
 3)   0.492 us    |                                                      skb_network_protocol();
 3)   4.152 us    |                                                    }
 3)   0.504 us    |                                                    skb_csum_hwoffload_help();
 3)   0.480 us    |                                                    validate_xmit_xfrm();
 3)   7.128 us    |                                                  }
 3)   8.076 us    |                                                }

The call to skb_csum_hwoffload_help does not trigger a call to
skb_checksum_help. I guess the bug can be reproduced with any Ethernet
card that has tx-checksum-ipv6 set to on in its offload features.
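
For reference, the software fallback that is skipped here would
conceptually do the following: with a partial checksum, the checksum
field already holds the folded pseudo-header sum, and completing it
means summing the UDP header and payload (seed included) and storing
the complement. Below is a minimal Python sketch of that arithmetic,
with made-up function names and an arbitrary payload; it is not the
kernel code itself.

```python
import ipaddress
import struct

IPPROTO_UDP = 17


def sum16(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry (not inverted)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, udp_len: int) -> bytes:
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", udp_len, IPPROTO_UDP)
    )


def build_partial_segment(src, dst, sport, dport, payload: bytes) -> bytes:
    """UDP header + payload whose checksum field holds only the seed."""
    udp_len = 8 + len(payload)
    seed = sum16(pseudo_header(src, dst, udp_len))
    return struct.pack("!HHHH", sport, dport, udp_len, seed) + payload


def complete_checksum(segment: bytes) -> bytes:
    """Finish a partial checksum in software over the whole segment."""
    csum = 0xFFFF - sum16(segment)
    if csum == 0:
        csum = 0xFFFF  # UDP sends a computed zero as all ones
    return segment[:6] + struct.pack("!H", csum) + segment[8:]


def is_valid(src: str, dst: str, segment: bytes) -> bool:
    """Receiver-side check: pseudo-header plus segment must sum to 0xffff."""
    return sum16(pseudo_header(src, dst, len(segment)) + segment) == 0xFFFF


partial = build_partial_segment("fd00::1", "fd00::2", 1234, 4567,
                                b"example payload")
print(is_valid("fd00::1", "fd00::2", complete_checksum(partial)))
```

A segment that leaves the host without this completion step, or the
equivalent done by the NIC, carries only the seed in its checksum
field.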

It looks like skb_csum_hwoffload_help does not catch that particular 
case. Again, any help would be appreciated.

-- 
Benoît


