All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Abeni <pabeni@redhat.com>
To: Paul Chaignon <paul.chaignon@gmail.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	David Ahern <dsahern@kernel.org>,
	Tom Herbert <tom@herbertland.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>
Subject: Re: [PATCH net v2 1/2] net: Fix checksum update for ILA adj-transport
Date: Thu, 29 May 2025 09:38:18 +0200	[thread overview]
Message-ID: <ea2e4ba3-4dd6-4bee-ad26-2ed541f4aeaf@redhat.com> (raw)
In-Reply-To: <3735f3bd86717bb22507a05f40b1432ec362138c.1748337614.git.paul.chaignon@gmail.com>

On 5/27/25 11:48 AM, Paul Chaignon wrote:
> During ILA address translations, the L4 checksums can be handled in
> different ways. One of them, adj-transport, consist in parsing the
> transport layer and updating any found checksum. This logic relies on
> inet_proto_csum_replace_by_diff and produces an incorrect skb->csum when
> in state CHECKSUM_COMPLETE.
> 
> This bug can be reproduced with a simple ILA to SIR mapping, assuming
> packets are received with CHECKSUM_COMPLETE:
> 
>   $ ip a show dev eth0
>   14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
>       link/ether 62:ae:35:9e:0f:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0
>       inet6 3333:0:0:1::c078/64 scope global
>          valid_lft forever preferred_lft forever
>       inet6 fd00:10:244:1::c078/128 scope global nodad
>          valid_lft forever preferred_lft forever
>       inet6 fe80::60ae:35ff:fe9e:f8d/64 scope link proto kernel_ll
>          valid_lft forever preferred_lft forever
>   $ ip ila add loc_match fd00:10:244:1 loc 3333:0:0:1 \
>       csum-mode adj-transport ident-type luid dev eth0
> 
> Then I hit [fd00:10:244:1::c078]:8000 with a server listening only on
> [3333:0:0:1::c078]:8000. With the bug, the SYN packet is dropped with
> SKB_DROP_REASON_TCP_CSUM after inet_proto_csum_replace_by_diff changed
> skb->csum. The translation and drop are visible on pwru [1] traces:
> 
>   IFACE   TUPLE                                                        FUNC
>   eth0:9  [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp)  ipv6_rcv
>   eth0:9  [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp)  ip6_rcv_core
>   eth0:9  [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp)  nf_hook_slow
>   eth0:9  [fd00:10:244:3::3d8]:51420->[fd00:10:244:1::c078]:8000(tcp)  inet_proto_csum_replace_by_diff
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     tcp_v6_early_demux
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     ip6_route_input
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     ip6_input
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     ip6_input_finish
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     ip6_protocol_deliver_rcu
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     raw6_local_deliver
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     ipv6_raw_deliver
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     tcp_v6_rcv
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     __skb_checksum_complete
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     kfree_skb_reason(SKB_DROP_REASON_TCP_CSUM)
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     skb_release_head_state
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     skb_release_data
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     skb_free_head
>   eth0:9  [fd00:10:244:3::3d8]:51420->[3333:0:0:1::c078]:8000(tcp)     kfree_skbmem
> 
> This is happening because inet_proto_csum_replace_by_diff is updating
> skb->csum when it shouldn't. The L4 checksum is updated such that it
> "cancels" the IPv6 address change in terms of checksum computation, so
> the impact on skb->csum is null.
> 
> Note this would be different for an IPv4 packet since three fields
> would be updated: the IPv4 address, the IP checksum, and the L4
> checksum. Two would cancel each other and skb->csum would still need
> to be updated to take the L4 checksum change into account.
> 
> This patch fixes it by passing an ipv6 flag to
> inet_proto_csum_replace_by_diff, to skip the skb->csum update if we're
> in the IPv6 case. Note the behavior of the only other user of
> inet_proto_csum_replace_by_diff, the BPF subsystem, is left as is in
> this patch and fixed in the subsequent patch.
> 
> With the fix, using the reproduction from above, I can confirm
> skb->csum is not touched by inet_proto_csum_replace_by_diff and the TCP
> SYN proceeds to the application after the ILA translation.
> 
> 1 - https://github.com/cilium/pwru
> Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

Patch 2 does not apply cleanly anymore, please rebase. While at it,
please also replace:

1 - https://github.com/cilium/pwru

with a more customary tag:

Link: https://github.com/cilium/pwru [1]

Thanks,

Paolo


  parent reply	other threads:[~2025-05-29  7:38 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-27  9:47 [PATCH net v2 0/2] net: Fix inet_proto_csum_replace_by_diff for IPv6 Paul Chaignon
2025-05-27  9:48 ` [PATCH net v2 1/2] net: Fix checksum update for ILA adj-transport Paul Chaignon
2025-05-27 15:13   ` Daniel Borkmann
2025-05-29  7:38   ` Paolo Abeni [this message]
2025-05-27  9:48 ` [PATCH net v2 2/2] bpf: Fix L4 csum update on IPv6 in CHECKSUM_COMPLETE Paul Chaignon
2025-05-27 15:14   ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea2e4ba3-4dd6-4bee-ad26-2ed541f4aeaf@redhat.com \
    --to=pabeni@redhat.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=paul.chaignon@gmail.com \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.