From: Or Gerlitz <gerlitz.or@gmail.com>
To: Alex Duyck <aduyck@mirantis.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
Tom Herbert <tom@herbertland.com>,
"talal@mellanox.com" <talal@mellanox.com>,
Linux Netdev List <netdev@vger.kernel.org>,
Michael Chan <michael.chan@broadcom.com>,
David Miller <davem@davemloft.net>,
Gal Pressman <galp@mellanox.com>,
Eran Ben Elisha <eranbe@mellanox.com>
Subject: Re: [net-next PATCH v2 5/9] mlx4: Add support for UDP tunnel segmentation with outer checksum offload
Date: Fri, 6 May 2016 00:39:32 +0300
Message-ID: <CAJ3xEMj8z617kmoc2bLMBXf7B7KSBLzqr5tLTuuXYSOA--MYMw@mail.gmail.com>
In-Reply-To: <CAMt9YRqCH8Ab5vefMoQj39gJ8b-gsg2w5nJanW1srqt13bUkSQ@mail.gmail.com>

On Wed, May 4, 2016 at 7:06 PM, Alex Duyck <aduyck@mirantis.com> wrote:
> On Wed, May 4, 2016 at 8:50 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> On 5/3/2016 6:29 PM, Alexander Duyck wrote:
>>>
>>> We split the one that would be a different size off via GSO. So we
>>> end up sending up 2 frames to the device if there is going to be one
>>> piece that doesn't quite match. We split that one piece off via GSO.
>>> That is one of the reasons why I referred to it as partial GSO as all
>>> we are using the software segmentation code for is to make sure the
>>> GSO block consists of segments that are all the same size.
>>
>>
>> I see, so if it happens a lot that the TCP stack sends down
>> something which, once segmented, ends up with the last segment being a
>> different size from the other ones, we would have to call the NIC xmit
>> function twice (BTW can we use xmit_more here?) -- which could be affecting
>> performance, I guess.
>>
>> GSO_UDP_TUNNEL_CSUM (commit 0f4f4ffa7 "net: Add GSO support for UDP tunnels
>> with checksum") was introduced to mark "that a device is capable of computing
>> the UDP checksum in the encapsulating header of a UDP tunnel" -- and the way
>> we use it here is that we do advertise that bit towards the stack for devices
>> whose HW can **not** do that, and things work b/c of LCO (this is my
>> understanding).
>>
>> I am missing something in the bigger picture here: what does this buy us,
>> e.g. vs. just letting this (say) vxlan tunnel use zero checksum on the outer
>> UDP packet? Does it have something to do with RCO?
>
> I think the piece you are missing is GSO_PARTIAL. Basically
> GSO_PARTIAL indicates that we can perform GSO as long as all segments
> are the same size and also allows for ignoring one level of headers.
> So in the case of ixgbe for instance we can support tunnel offloads as
> long as we allow for the inner IPv4 ID to be a fixed value which is
> identified by enabling TSO_MANGLEID. In the case of i40e, mlx4, and
> mlx5 the key bit is that we just have to have the frames the same size
> for all segments and then we can support tunnels with outer checksum
> because the checksum has been computed once and can be applied to all
> of the segmented frames.

Yep, I think I basically follow the PARTIAL idea: once i40e, mlx4 and
mlx5 advertise it, they can support UDP tunnels (and GRE, in the i40e
case) with outer checksum.
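
A rough sketch of why equal-sized segments can share one outer checksum, assuming this works through LCO as discussed in the quoted text. It is an illustration, not kernel code: `csum_add` and `fill_inner_checksum` are made-up helpers, the 20-byte header is a simplified stand-in for a TCP header with its checksum at offset 16, and the addresses and payloads are arbitrary. Once the inner checksum is filled in, the ones-complement sum over the inner segment depends only on the inner pseudo-header, not on the payload:

```python
def csum_add(data, start=0):
    """Folded 16-bit ones-complement sum of data, continuing from start."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = start
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


CSUM_OFF = 16  # assumed checksum field offset in the simplified inner header


def fill_inner_checksum(pseudo, segment):
    """Return segment with its checksum computed over pseudo + segment."""
    seg = bytearray(segment)
    seg[CSUM_OFF:CSUM_OFF + 2] = b"\x00\x00"
    csum = ~csum_add(seg, csum_add(pseudo)) & 0xFFFF
    seg[CSUM_OFF:CSUM_OFF + 2] = csum.to_bytes(2, "big")
    return bytes(seg)


# Inner pseudo-header: src IP, dst IP, zero + protocol 6, segment length 28.
pseudo = bytes([10, 0, 0, 1, 10, 0, 0, 2, 0, 6]) + (28).to_bytes(2, "big")

# Two segments of the same length with different header bytes and payloads.
seg_a = fill_inner_checksum(pseudo, bytes(range(20)) + b"AAAAAAAA")
seg_b = fill_inner_checksum(pseudo, bytes(range(20, 40)) + b"BBBBBBBB")

# Both sums equal the complement of the pseudo-header sum, so an outer
# checksum covering either segment sees the same contribution from it.
print(hex(csum_add(seg_a)), hex(csum_add(seg_b)))
```

If the segments differed in length, the pseudo-header (which carries the length) would differ too, and the shared value would no longer apply. That matches the requirement that all segments in the block be the same size.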

My question was what this buys us in the UDP case vs. using a zero
checksum for the tunnel (outer packet). I was trying to figure out
whether it has something to do with the remote side, e.g. RCO or the like.

Basically, under PARTIAL, in the worst case we could end up with 2x
the packets handed to the NIC, e.g. if each TCP message that is to be
encapsulated by the stack and later segmented by the NIC HW is broken
in two because otherwise the last segment would not be the same size
as all the preceding ones.
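
That worst case can be sketched as follows, assuming the split works the way Alexander describes it: one block of whole-MSS segments for the NIC plus one odd-sized tail split off in software. `partial_gso_split` is a hypothetical helper for illustration, not a kernel function, and the sizes are arbitrary:

```python
def partial_gso_split(payload_len, mss):
    """Return the payload sizes handed to the NIC for one GSO super-packet.

    The first entry is the block made of whole-MSS segments; the optional
    second entry is the odd-sized tail that software GSO splits off.
    """
    if payload_len <= 0 or mss <= 0:
        raise ValueError("payload_len and mss must be positive")
    full_segments, tail = divmod(payload_len, mss)
    frames = []
    if full_segments:
        frames.append(full_segments * mss)  # equal-sized segments, one xmit
    if tail:
        frames.append(tail)  # leftover piece, a second xmit
    return frames


print(partial_gso_split(64000, 1400))  # 45 full segments plus a 1000-byte tail
print(partial_gso_split(2800, 1400))   # exact multiple of the MSS
```

A 64000-byte payload at an MSS of 1400 goes down as two frames, while an exact multiple of the MSS goes down as one, so the 2x case only arises when every message leaves a remainder.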

Or, being a bit more positive... is there an expected performance gain
when you use MANGLEID and/or PARTIAL to advertise UDP tunnel
segmentation with checksum offload towards the stack? What is the
reason for that gain?

As for GRE tunnel segmentation with checksum offload, I saw in your
i40e patch that it took your testbed from 12 Gb/s to 20 Gb/s. Is this
because the stack cannot actually let the HW do the segmentation
without checksum offload? If not, can you help me understand the
source of the gain?

> Hope that helps.

Yes, your notes are very helpful, thanks for taking the time.

Or.

Thread overview: 26+ messages
2016-04-29 22:43 [net-next PATCH v2 0/9] Fix Tunnel features and enable GSO partial for several drivers Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 1/9] net: Disable segmentation if checksumming is not supported Alexander Duyck
2016-05-01 20:30 ` Or Gerlitz
2016-05-02 2:16 ` Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 2/9] gso: Only allow GSO_PARTIAL if we can checksum the inner protocol Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 3/9] net: Fix netdev_fix_features so that TSO_MANGLEID is only available with TSO Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 4/9] vxlan: Add checksum check to the features check function Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 5/9] mlx4: Add support for UDP tunnel segmentation with outer checksum offload Alexander Duyck
2016-05-01 20:28 ` Saeed Mahameed
2016-05-01 20:35 ` Or Gerlitz
2016-05-02 2:25 ` Alexander Duyck
2016-05-02 7:19 ` Or Gerlitz
2016-05-02 15:41 ` Alexander Duyck
2016-05-03 12:41 ` Or Gerlitz
2016-05-03 15:29 ` Alexander Duyck
2016-05-04 15:50 ` Or Gerlitz
2016-05-04 16:06 ` Alex Duyck
2016-05-05 21:39 ` Or Gerlitz [this message]
2016-05-05 22:00 ` Alexander Duyck
2016-04-29 22:43 ` [net-next PATCH v2 6/9] mlx4: Add support for inner IPv6 checksum offloads and TSO Alexander Duyck
2016-05-01 20:21 ` Saeed Mahameed
2016-04-29 22:43 ` [net-next PATCH v2 7/9] mlx5e: Add support for UDP tunnel segmentation with outer checksum offload Alexander Duyck
2016-05-01 20:08 ` Saeed Mahameed
2016-04-29 22:43 ` [net-next PATCH v2 8/9] mlx5e: Fix IPv6 tunnel " Alexander Duyck
2016-05-01 20:09 ` Saeed Mahameed
2016-04-29 22:43 ` [net-next PATCH v2 9/9] bnxt: Add support for segmentation of tunnels with outer checksums Alexander Duyck