From mboxrd@z Thu Jan  1 00:00:00 1970
From: Balraj Singh <balrajsingh@ieee.org>
Subject: Re: Question about TCP checksum offload in Xen
Date: Thu, 5 Dec 2013 13:43:07 +0000
Message-ID: <CANeYhgGZJb2HBfprA4ORNr-2c3JWHe83HWMrh-Vj7tSkCs6CPw@mail.gmail.com>
References: <CANeYhgE44vTfP8mGQ5nvd8gyBbV_uLiTOgpdUmAYzeW4_KHpMw@mail.gmail.com>
	<20131205112952.GF14792@dark.recoil.org>
	<1386243546.20047.26.camel@kazak.uk.xensource.com>
	<52A073A1.4000102@oracle.com>
	<CAEeTej+o4ZmLKPdSR0G4wjMTcWHVyFu8zWFca7pZQWwW0gQtAg@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============7326973945865001368=="
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <balraj885@gmail.com>) id 1VoZCv-0000Eb-OK
	for xen-devel@lists.xenproject.org; Thu, 05 Dec 2013 13:43:10 +0000
Received: by mail-we0-f172.google.com with SMTP id w62so11009948wes.31
	for <xen-devel@lists.xenproject.org>;
	Thu, 05 Dec 2013 05:43:07 -0800 (PST)
In-Reply-To: <CAEeTej+o4ZmLKPdSR0G4wjMTcWHVyFu8zWFca7pZQWwW0gQtAg@mail.gmail.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jon Crowcroft <jon.crowcroft@cl.cam.ac.uk>
Cc: John Haxby <john.haxby@oracle.com>, "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>, Ian Campbell <Ian.Campbell@citrix.com>, Mirage List <cl-mirage@lists.cam.ac.uk>, Anil Madhavapeddy <anil@recoil.org>
List-Id: xen-devel@lists.xenproject.org

--===============7326973945865001368==
Content-Type: multipart/alternative; boundary=001a11c35300a84df404ecc9b585

--001a11c35300a84df404ecc9b585
Content-Type: text/plain; charset=ISO-8859-1

That could also work as a way to signal in the packet itself that the
checksum is OK.  But I don't think the received packets had 0 in the
checksum field.  It was just some random value, though I can check again.
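
To make the per-packet decision concrete, here is a rough sketch in Python
(not Mirage code).  It assumes the backend can pass some per-packet hint
that the data has already been validated.  The data_validated argument and
all the function names are made up for illustration, and the only real
piece is the standard Internet checksum over the TCP pseudo-header and
segment:

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit big-endian words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """Verify a TCP segment (header + payload) carried over IPv4.

    Summing the pseudo-header and the segment, checksum field included,
    yields 0 after complementing when the checksum is valid.
    """
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment) == 0


def should_accept(src_ip: bytes, dst_ip: bytes, segment: bytes,
                  data_validated: bool) -> bool:
    """Hypothetical receive-path check.

    data_validated stands in for whatever per-packet hint the backend
    provides; it is an assumption, not an existing Mirage interface.
    """
    if data_validated:
        return True  # e.g. domain-to-domain traffic, checksum never filled in
    return tcp_checksum_ok(src_ip, dst_ip, segment)
```

With a hint like that, verification could stay on by default and be
skipped only for packets that say they do not need it, so toggling
offload on the VIF would not affect existing connections.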

Thanks,

Balraj


On Thu, Dec 5, 2013 at 1:15 PM, Jon Crowcroft <jon.crowcroft@cl.cam.ac.uk> wrote:

> i thought a value of 0 in the tcp checksum field indicated "no checksum"
> and could be used in the cases you identify and ought not to trigger
> problems in correct code?
>
>
> On Thu, Dec 5, 2013 at 12:37 PM, John Haxby <john.haxby@oracle.com> wrote:
>
>> On 05/12/13 11:39, Ian Campbell wrote:
>> > On Thu, 2013-12-05 at 11:29 +0000, Anil Madhavapeddy wrote:
>> >> > On Tue, Dec 03, 2013 at 01:00:23PM +0000, Balraj Singh wrote:
>> >>> > > Hi,
>> >>> > >
>> >>> > > I'm working on verifying TCP checksums on incoming packets in
>> >>> > > Mirage, but I've run into a bit of a problem.
>> >>> > >
>> >>> > > If TCP checksum offload is turned on on a virtual interface
>> >>> > > (this is the default), and if the TCP connection is local to
>> >>> > > the machine, it looks like Xen does not calculate the checksum
>> >>> > > at all.  This may be valid because Xen may be providing a
>> >>> > > stronger guarantee, but it means that incoming packets don't
>> >>> > > have a valid checksum in the header.  This then means that in
>> >>> > > Mirage we can't just have checksum verification turned on all
>> >>> > > the time.  This would have been the safe fall back option and
>> >>> > > detecting that checksum offload is on, and then not
>> >>> > > duplicating the verification in Mirage would have been an
>> >>> > > optimisation.  But it looks like this is not an option.  Now I
>> >>> > > need to know for every incoming packet whether checksum
>> >>> > > verification should be done or not.  It should ideally be for
>> >>> > > every packet since chksum offload can be turned off and on on
>> >>> > > the VIF and existing tcp connections should continue.  If not
>> >>> > > every packet, I need to get a notification or efficiently
>> >>> > > detect right away that the setting is changed on the VIF.
>> >> >
>> >> > This is a question that seems to keep coming up even for Linux
>> >> > and Windows, as the combination of local<->local VMs vs
>> >> > local<->off-host and the checksum offload is quite confusing.
>> >> >
>> >> > CCing xen-devel: is the appropriate behaviour for a guest VM that
>> >> > wants to use checksum offloading in all situations documented
>> >> > anywhere?
>> > I don't understand the question/concern. If you have enabled checksum
>> > offload then of course you don't recalculate the checksum, that's the
>> > whole point of offloading it.
>>
>> I get this a lot.
>>
>> There are a few different cases:
>>
>>   * domain to domain traffic
>>   * domain to external traffic with egress from a NIC that does offload
>>   * domain to external through a non-offloading NIC
>>
>> With xen checksum offloading, domain to domain traffic appears to be
>> received with a bad checksum.  This is OK, there is no point in
>> calculating a checksum if the packets are only going through memory.  If
>> your memory is going to randomly corrupt packets you have bigger
>> problems to worry about.   However, this does upset at least Solaris: if
>> you're using a Solaris guest for NAT then the NAT module on Solaris gets
>> all upset if the checksum is wrong and drops the packets.  (This is
>> Solaris's NAT module being overly picky, it may need to recalculate or
>> at least invalidate the existing checksum, but it doesn't need to check
>> it as well.)
>>
>> The second two cases are of interest from the domain perspective.  A
>> domain has no way of knowing how any given packet is going to leave the
>> host (or even if it is) so it can't know ahead of time whether to
>> calculate any checksums: the skb's are just marked with "checksum
>> needed" as usual and either the egress NIC will do the job or dom0 will
>> do it.
>>
>> There is absolutely nothing wrong in any of this (Solaris
>> notwithstanding).   The difficulty is getting people to realise that
>> checksums are only calculated when a packet hits the cat-5.  It doesn't
>> need documenting, it just needs a little thought.   I got tired of
>> hammering the point home :)
>>
>> jch
>>
>>
>


--001a11c35300a84df404ecc9b585--


--===============7326973945865001368==--