netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Tom Herbert <tom@herbertland.com>
Subject: Re: [PATCH net] net: Fix gro aggregation for udp encaps with zero csum
Date: Sat, 27 Feb 2021 11:00:55 -0500	[thread overview]
Message-ID: <CA+FuTSdn3zbynYOvuhLxZ02mmcDoRWQ5vUmBCbAgxeTa2X33YQ@mail.gmail.com> (raw)
In-Reply-To: <20210226212248.8300-1-daniel@iogearbox.net>

On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the
> csum for the UDP header itself is 0. In that case, GRO aggregation does
> not take place on the phys dev, but instead is deferred to the vxlan/geneve
> driver (see trace below).
>
> The reason is essentially that GRO aggregation bails out in udp_gro_receive()
> for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e,
> others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing
> checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE
> and napi context has csum_valid set. This is however not the case for zero
> UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false).
>
> At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches
> on !uh->check ^ !uh2->check as part to determine candidates for aggregation,
> so it certainly is expected to handle zero csums in udp_gro_receive(). The
> purpose of the check added via 662880f44203 ("net: Allow GRO to use and set
> levels of checksum unnecessary") seems to catch bad csum and stop aggregation
> right away.
>
> One way to fix aggregation in the zero case is to only perform the !csum_valid
> check in udp_gro_receive() if uh->check is infact non-zero.
>
> Before:
>
>   [...]
>   swapper     0 [008]   731.946506: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100400 len=1500   (1)
>   swapper     0 [008]   731.946507: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100200 len=1500
>   swapper     0 [008]   731.946507: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101100 len=1500
>   swapper     0 [008]   731.946508: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101700 len=1500
>   swapper     0 [008]   731.946508: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101b00 len=1500
>   swapper     0 [008]   731.946508: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100600 len=1500
>   swapper     0 [008]   731.946508: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100f00 len=1500
>   swapper     0 [008]   731.946509: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100a00 len=1500
>   swapper     0 [008]   731.946516: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100500 len=1500
>   swapper     0 [008]   731.946516: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100700 len=1500
>   swapper     0 [008]   731.946516: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101d00 len=1500   (2)
>   swapper     0 [008]   731.946517: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101000 len=1500
>   swapper     0 [008]   731.946517: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101c00 len=1500
>   swapper     0 [008]   731.946517: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101400 len=1500
>   swapper     0 [008]   731.946518: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100e00 len=1500
>   swapper     0 [008]   731.946518: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497101600 len=1500
>   swapper     0 [008]   731.946521: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff966497100800 len=774
>   swapper     0 [008]   731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1)
>   swapper     0 [008]   731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112  (2)
>   [...]
>
>   # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
>   MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
>   Recv   Send    Send
>   Socket Socket  Message  Elapsed
>   Size   Size    Size     Time     Throughput
>   bytes  bytes   bytes    secs.    10^6bits/sec
>
>    87380  16384  16384    20.01    13129.24
>
> After:
>
>   [...]
>   swapper     0 [026]   521.862641: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff93ab0d479000 len=11286 (1)
>   swapper     0 [026]   521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1)
>   swapper     0 [026]   521.862650: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff93ab0d478500 len=2898  (2)
>   swapper     0 [026]   521.862650: net:netif_receive_skb: dev=enp10s0f0  skbaddr=0xffff93ab0d479f00 len=8490  (3)
>   swapper     0 [026]   521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848  (2)
>   swapper     0 [026]   521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440  (3)
>   [...]
>
>   # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
>   MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
>   Recv   Send    Send
>   Socket Socket  Message  Elapsed
>   Size   Size    Size     Time     Throughput
>   bytes  bytes   bytes    secs.    10^6bits/sec
>
>    87380  16384  16384    20.01    24576.53
>
> Fixes: 57c67ff4bd92 ("udp: additional GRO support")
> Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Tom Herbert <tom@herbertland.com>

Makes sense to me.

We cannot do checksum conversion with zero field, but that does not
have to limit coalescing.

CHECKSUM_COMPLETE with a checksum validated by
skb_gro_checksum_validate_zero_check implies csum_valid.

So the test

>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>              !NAPI_GRO_CB(skb)->csum_valid) ||

Basically matches

- CHECKSUM_NONE
- CHECKSUM_UNNECESSARY which has already used up its valid state on a
prior header
- CHECKSUM_COMPLETE with bad checksum.

This change just refines to not drop for in the first two cases on a
zero checksum field.

Making this explicit in case anyone sees holes in the logic. Else,

Acked-by: Willem de Bruijn <willemb@google.com>

  reply	other threads:[~2021-02-27 16:03 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 21:22 [PATCH net] net: Fix gro aggregation for udp encaps with zero csum Daniel Borkmann
2021-02-27 16:00 ` Willem de Bruijn [this message]
2021-02-27 17:16   ` John Fastabend
2021-02-28 20:02     ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+FuTSdn3zbynYOvuhLxZ02mmcDoRWQ5vUmBCbAgxeTa2X33YQ@mail.gmail.com \
    --to=willemdebruijn.kernel@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jesse.brandeburg@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).