* PROBLEM: invalid udp checksum with ip6gre-in-udp
From: Benoît Monin @ 2024-09-05 15:22 UTC
To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, linux-kernel
Hi all,
I am having an issue with GRE-in-UDP (GRE with fou encapsulation) over
IPv6: the outer UDP checksum is only valid if an inner checksum is
present and valid. This problem is only present over IPv6, not IPv4,
and is reproducible with different kernel versions and CPU architectures.
Here is the test setup I used:
+----------+                                    +------------+
|          | fd00::2/64              fd00::1/64 |            |
|  tester  | 10.0.0.33/24          10.0.0.11/24 |    DUT     |
|          |------------------------------------|            |
+----------+                                    +------------+
Two machines are connected with an ethernet cable, and each side is
set up with an ipv4 and an ipv6 address.
On the device under test, two GRE-in-UDP tunnels are set up, one over
ipv6 and one over ipv4, aimed at the tester. Each tunnel is configured
with an ipv4 address:
modprobe -a fou fou6
ip link add gre6 type ip6gre local fd00::1 remote fd00::2 encap fou encap-sport 1234 encap-dport 4567
ip link set up gre6
ip address add 172.20.6.1/24 dev gre6
ip link add gre4 type gre local 10.0.0.11 remote 10.0.0.33 encap fou encap-sport 3456 encap-dport 6789 encap-csum
ip link set up gre4
ip address add 172.20.4.1/24 dev gre4
On the tester, no setup is done; it only runs tcpdump to capture
the traffic emitted by the DUT.
The following commands are used on the device under test to send
packets via the tunnels:
ping -c1 -W0.1 172.20.6.2
./send_udp 172.20.6.2 5555 "ip6gre-in-udp with inner udp csum"
SO_NO_CHECK=1 ./send_udp 172.20.6.2 5555 "ip6gre-in-udp without inner udp csum"
ping -c1 -W0.1 172.20.4.2
./send_udp 172.20.4.2 5555 "gre-in-udp with inner udp csum"
SO_NO_CHECK=1 ./send_udp 172.20.4.2 5555 "gre-in-udp without inner udp csum"
Three packets are sent in each GRE tunnel:
* One ICMP echo request
* One UDP packet with a valid checksum
* One UDP packet with the checksum set to 0
Here is a link to the pcap containing the six packets generated by the
previous commands:
https://onaip.mooo.com/pub/tmp/bug_ip6gre-in-udp.pcap
Some details about the captured packets:
IP6 fd00::1 > fd00::2: 1234 > 4567: [bad udp cksum 0xfa75 -> 0xe680!]
IP6 fd00::1 > fd00::2: 1234 > 4567: [udp sum ok]
IP6 fd00::1 > fd00::2: 1234 > 4567: [bad udp cksum 0xfa62 -> 0xb57c!]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]
IP 10.0.0.11.3456 > 10.0.0.33.6789: [udp sum ok]
For the tunnel over ipv6, only the UDP packet sent with a valid
checksum gets encapsulated with a valid checksum. For the ping and the
UDP packet with a zero checksum, the outer UDP checksum matches the
partial checksum of the pseudo-header.
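
The two bad values can be checked by hand. Below is a minimal Python sketch, not kernel code, that sums the IPv6 UDP pseudo-header (source, destination, upper-layer length, next header 17) and folds it to 16 bits without complementing it, which is what a checksum field left in the partial state would contain. The function names are made up for illustration, and the outer UDP lengths are assumptions rather than values read from the pcap: 96 bytes for the ping (an 84-byte IPv4 echo request with the default ping payload, a 4-byte GRE header, an 8-byte UDP header) and 77 bytes for the UDP packet sent without an inner checksum.

```python
import ipaddress
import struct


def fold16(total):
    """Fold a one's-complement sum down to 16 bits (end-around carry)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp6_pseudo_header_sum(src, dst, udp_len):
    """Return the folded, non-complemented sum of the IPv6 UDP pseudo-header."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I", udp_len)      # 32-bit upper-layer packet length
        + b"\x00\x00\x00\x11"             # 3 zero bytes, then next header 17 (UDP)
    )
    words = struct.unpack("!%dH" % (len(pseudo) // 2), pseudo)
    return fold16(sum(words))


# Assumed outer UDP lengths (UDP header + GRE header + inner packet).
print(hex(udp6_pseudo_header_sum("fd00::1", "fd00::2", 96)))
print(hex(udp6_pseudo_header_sum("fd00::1", "fd00::2", 77)))
```

With those lengths the sketch prints 0xfa75 and 0xfa62, the same values tcpdump reports as bad checksums in the capture summary above.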
For ipv4, the UDP checksums of the encapsulation are all valid.
The device under test used for the capture is an x86-64 machine with a
realtek ethernet adapter (r8169 driver) running a 6.10.7 kernel.
The problem was also seen on an arm64 board (freescale ls1046) with
the dpaa ethernet driver running a 4.14 kernel. On this hardware, the
packets that would have an invalid checksum are not emitted and the tx
error counter of the ethernet interface increases.
In all cases, disabling hardware checksumming for ipv6 with ethtool can
be used as a work-around:
ethtool -K eth0 tx-checksum-ipv6 off
This and the partial checksum value seem to point to an error in the
handling of hardware checksumming in the particular case of fou6
encapsulation, but I have not been able to figure out what could be
causing it.
Did I miss a configuration parameter for ip6gre-in-udp? Any advice on
how to debug that would be appreciated.
--
Benoît
* Re: PROBLEM: invalid udp checksum with ip6gre-in-udp
From: Benoît Monin @ 2024-09-12 14:04 UTC
To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, linux-kernel
Hi again,
I did some more digging with 6.11-rc7, and it is not a problem in the
common code or udp encapsulation, I just got "lucky" with my tests...
First in fou6.c, fou6_build_udp constructs the udp header and calls
udp6_set_csum. If the inner packet has a valid checksum, the outer udp
checksum gets computed by reusing it, otherwise the partial checksum is set.
Next the outer ipv6 header is built in ip6_tunnel.c:ip6_tnl_xmit. With the
default value of encaplimit (4), a destination options header is
inserted to pass that value. This means that ipv6_hdr(skb)->nexthdr is
set to 60 (NEXTHDR_DEST), not 17 (IPPROTO_UDP).
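
To make the header layout concrete, here is a small Python sketch (an illustration only, not kernel code) that builds an outer IPv6 header followed by an 8-byte destination options header carrying a tunnel encapsulation limit of 4, then a bare UDP header. The names build_outer_packet and find_l4_protocol are hypothetical, and the walk is a simplified stand-in for what ipv6_skip_exthdr does: it only handles the length-prefixed hop-by-hop, routing and destination options headers and ignores fragment and authentication headers.

```python
import ipaddress
import struct

NEXTHDR_HOP = 0
NEXTHDR_ROUTING = 43
NEXTHDR_DEST = 60
IPPROTO_UDP = 17


def build_outer_packet():
    """IPv6 header + destination options (encap limit 4) + empty UDP header."""
    udp = struct.pack("!HHHH", 1234, 4567, 8, 0)
    # next header, hdr ext len (0 means 8 bytes), encap-limit option
    # (type 4, len 1, value 4), PadN option (type 1, len 1, one zero byte)
    dest_opts = bytes([IPPROTO_UDP, 0, 4, 1, 4, 1, 1, 0])
    payload = dest_opts + udp
    ipv6 = struct.pack(
        "!IHBB16s16s",
        0x60000000,               # version 6, no traffic class or flow label
        len(payload),
        NEXTHDR_DEST,             # first next header is the extension header
        64,                       # hop limit
        ipaddress.IPv6Address("fd00::1").packed,
        ipaddress.IPv6Address("fd00::2").packed,
    )
    return ipv6 + payload


def find_l4_protocol(packet):
    """Skip length-prefixed extension headers; return (protocol, offset)."""
    nexthdr = packet[6]
    offset = 40
    while nexthdr in (NEXTHDR_HOP, NEXTHDR_ROUTING, NEXTHDR_DEST):
        hdr_len = (packet[offset + 1] + 1) * 8
        nexthdr = packet[offset]
        offset += hdr_len
    return nexthdr, offset


packet = build_outer_packet()
print(packet[6])                  # the byte ipv6_hdr(skb)->nexthdr refers to
print(find_l4_protocol(packet))
```

The first print shows 60, the value seen when reading the fixed header alone; the second shows (17, 48), that is UDP starting 8 bytes later than it would without the extension header.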
So when sending a packet without a valid checksum through an
ip6gre-in-udp tunnel, we pass an skb with a partial udp checksum and an
ipv6 header with an extension header to ndo_start_xmit.
Finally, what are the odds of testing two different pieces of hardware
and finding two similar bugs?
For the LS1046 platform, the Tx checksum is done in
dpaa_enable_tx_csum. For ipv6, the layer 4 protocol is extracted from
the skb with l4_proto = ipv6h->nexthdr. Since the value matches neither
IPPROTO_UDP nor IPPROTO_TCP, the function errors out and the packet is
discarded.
For the PC with a realtek 8169 card, the code is similar in
rtl8169_tso_csum_v2, with ip_protocol = ipv6_hdr(skb)->nexthdr. The
error case then triggers a WARN_ON_ONCE(1) and the packet is still sent
on the wire with a partial udp checksum.
There seem to be quite a few places in drivers/net/ethernet that use
ipv6_hdr(skb)->nexthdr as the IP protocol, with only some of them
calling ipv6_skip_exthdr afterward to take care of the extension
headers. So my proposal is to add a small helper (ipv6_protocol?) and
fix the drivers where needed.
I'll try to come up with a patch set, unless someone has a better idea.
--
Benoît
* Re: PROBLEM: invalid udp checksum with ip6gre-in-udp
From: Benoît Monin @ 2024-09-20 12:10 UTC
To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, linux-kernel
Hi,
Looking more into this, I don't think the problem is in the ethernet
drivers. Both dpaa and r8169 declare NETIF_F_IPV6_CSUM in their
hw_features. The documentation for this flag says "IPv6 extension
headers are not supported with this feature". Yet in the call-path
described in my previous email, the driver gets handed an skb with both
an IPv6 extension header and a UDP header with a partial checksum.
Below is the ftrace of a ping sent through an ip6gre-in-udp tunnel,
when reaching the validate_xmit_skb call of the ethernet driver:
 3)               |  validate_xmit_skb_list() {
 3)               |    validate_xmit_skb() {
 3)               |      netif_skb_features() {
 3)               |        rtl8169_features_check [r8169]() {
 3)   0.576 us    |          rtl_quirk_packet_padto [r8169]();
 3)   1.812 us    |        }
 3)   0.492 us    |        skb_network_protocol();
 3)   4.152 us    |      }
 3)   0.504 us    |      skb_csum_hwoffload_help();
 3)   0.480 us    |      validate_xmit_xfrm();
 3)   7.128 us    |    }
 3)   8.076 us    |  }
The call to skb_csum_hwoffload_help does not trigger a call to
skb_checksum_help. I guess the bug can be reproduced with any ethernet
card that has tx-checksum-ipv6 set to on in its offload features.
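
For reference, the software fallback that one would expect here amounts to summing the UDP header and payload while the checksum field still holds the pseudo-header sum, then storing the complement. The following Python sketch shows only that arithmetic; it is not the kernel's skb_checksum_help, the helper names are hypothetical, and the payload bytes are placeholders rather than a captured packet.

```python
import ipaddress
import struct


def ones_sum(data):
    """16-bit one's-complement sum of a byte string, folded."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src, dst, udp_len):
    """IPv6 pseudo-header for UDP (next header 17)."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!IBBBB", udp_len, 0, 0, 0, 17)
    )


def complete_partial_csum(segment, csum_offset=6):
    """Finish a checksum whose field already holds the pseudo-header sum."""
    csum = ~ones_sum(segment) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF             # UDP sends a computed zero as all ones
    return segment[:csum_offset] + struct.pack("!H", csum) + segment[csum_offset + 2:]


payload = b"\x00\x00\x08\x00" + b"inner packet"   # GRE header + placeholder bytes
udp_len = 8 + len(payload)
partial = ones_sum(pseudo_header("fd00::1", "fd00::2", udp_len))
segment = struct.pack("!HHHH", 1234, 4567, udp_len, partial) + payload

fixed = complete_partial_csum(segment)
# A receiver sums pseudo-header + segment; 0xffff means the checksum is valid.
print(hex(ones_sum(pseudo_header("fd00::1", "fd00::2", udp_len) + fixed)))
```

The verification sum comes out as 0xffff once the field is completed, whereas a packet that leaves the host with the field still in its partial state fails that check on the receiver.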
It looks like skb_csum_hwoffload_help does not catch that particular
case. Again, any help would be appreciated.
--
Benoît