From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Miller <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Network Development <netdev@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>,
John Fastabend <john.fastabend@gmail.com>,
Jesse Brandeburg <jesse.brandeburg@intel.com>,
Tom Herbert <tom@herbertland.com>
Subject: Re: [PATCH net] net: Fix gro aggregation for udp encaps with zero csum
Date: Sat, 27 Feb 2021 11:00:55 -0500
Message-ID: <CA+FuTSdn3zbynYOvuhLxZ02mmcDoRWQ5vUmBCbAgxeTa2X33YQ@mail.gmail.com>
In-Reply-To: <20210226212248.8300-1-daniel@iogearbox.net>
On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the
> csum for the UDP header itself is 0. In that case, GRO aggregation does
> not take place on the phys dev, but instead is deferred to the vxlan/geneve
> driver (see trace below).
>
> The reason is essentially that GRO aggregation bails out in udp_gro_receive()
> for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e,
> others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing
> checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE
> and napi context has csum_valid set. This is however not the case for zero
> UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false).
>
> At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches
> on !uh->check ^ !uh2->check as part of determining candidates for aggregation,
> so it certainly is expected to handle zero csums in udp_gro_receive(). The
> purpose of the check added via 662880f44203 ("net: Allow GRO to use and set
> levels of checksum unnecessary") seems to catch bad csum and stop aggregation
> right away.
>
> One way to fix aggregation in the zero case is to only perform the !csum_valid
> check in udp_gro_receive() if uh->check is in fact non-zero.
>
> Before:
>
> [...]
> swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1)
> swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500
> swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500
> swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2)
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500
> swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500
> swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500
> swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774
> swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1)
> swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2)
> [...]
>
> # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
> Recv Send Send
> Socket Socket Message Elapsed
> Size Size Size Time Throughput
> bytes bytes bytes secs. 10^6bits/sec
>
> 87380 16384 16384 20.01 13129.24
>
> After:
>
> [...]
> swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1)
> swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1)
> swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2)
> swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3)
> swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2)
> swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3)
> [...]
>
> # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
> Recv Send Send
> Socket Socket Message Elapsed
> Size Size Size Time Throughput
> bytes bytes bytes secs. 10^6bits/sec
>
> 87380 16384 16384 20.01 24576.53
>
> Fixes: 57c67ff4bd92 ("udp: additional GRO support")
> Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Tom Herbert <tom@herbertland.com>

Makes sense to me.

We cannot do checksum conversion with zero field, but that does not
have to limit coalescing.

CHECKSUM_COMPLETE with a checksum validated by
skb_gro_checksum_validate_zero_check implies csum_valid.

So the test

>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid) ||

Basically matches

- CHECKSUM_NONE
- CHECKSUM_UNNECESSARY which has already used up its valid state on a
  prior header
- CHECKSUM_COMPLETE with bad checksum.

This change just refines the test so that it no longer bails out in the
first two cases when the checksum field is zero.
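
To make the cases concrete, here is a minimal user-space sketch of the
test and of the refinement. It is not the kernel code: the struct, the
enum and both function names are made up for illustration, and only the
three conditions themselves come from the quoted hunk. The refinement is
modeled the way the commit message words it, i.e. the test is only
performed when uh->check is non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the skb->ip_summed states (illustrative only). */
enum ip_summed_model {
	MODEL_CHECKSUM_NONE,
	MODEL_CHECKSUM_UNNECESSARY,
	MODEL_CHECKSUM_COMPLETE,
	MODEL_CHECKSUM_PARTIAL,
};

/* Hypothetical flattened view of the fields the quoted test reads. */
struct gro_csum_model {
	enum ip_summed_model ip_summed;
	unsigned int csum_cnt;	/* remaining CHECKSUM_UNNECESSARY levels */
	bool csum_valid;	/* a checksum has been validated */
	uint16_t uh_check;	/* UDP checksum field, 0 means "no checksum" */
};

/* The quoted three-line test: true means "do not aggregate here". */
static bool csum_test_bails(const struct gro_csum_model *m)
{
	return m->ip_summed != MODEL_CHECKSUM_PARTIAL &&
	       m->csum_cnt == 0 &&
	       !m->csum_valid;
}

/* Assumed shape of the refinement: skip the test for a zero checksum field. */
static bool csum_test_bails_refined(const struct gro_csum_model *m)
{
	return m->uh_check != 0 && csum_test_bails(m);
}
```

For a CHECKSUM_UNNECESSARY skb whose csum_cnt is already 0 and whose
checksum field is zero, csum_test_bails() returns true while
csum_test_bails_refined() returns false. With a non-zero field both
return true, so the bad-checksum case is still caught.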

Making this explicit in case anyone sees holes in the logic. Else,

Acked-by: Willem de Bruijn <willemb@google.com>