Re: veth regression with "don’t modify ip_summed; doing so treats packets with bad checksums as good."

netdev.vger.kernel.org archive mirror

From: Ben Greear <greearb@candelatech.com>
To: Vijay Pandurangan <vijayp@vijayp.ca>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
	netdev <netdev@vger.kernel.org>, Evan Jones <ej@evanjones.ca>,
	Cong Wang <cwang@twopensource.com>
Subject: Re: veth regression with "don’t modify ip_summed; doing so treats packets with bad checksums as good."
Date: Fri, 25 Mar 2016 16:46:44 -0700
Message-ID: <56F5CDE4.7010306@candelatech.com>
In-Reply-To: <CAKUBDd8FGh3mhDmoyH70WvNGyofPVbjdROPKzenm+6DLfALyqg@mail.gmail.com>

On 03/25/2016 04:03 PM, Vijay Pandurangan wrote:
> On Fri, Mar 25, 2016 at 6:23 PM, Ben Greear <greearb@candelatech.com> wrote:
>> On 03/25/2016 02:59 PM, Vijay Pandurangan wrote:
>>>
>>> consider two scenarios, where process a sends raw ethernet frames
>>> containing UDP packets to b
>>>
>>> I) process a --> veth --> process b
>>>
>>> II) process a -> eth -> wire -> eth -> process b
>>>
>>> I believe (I) is the simplest setup we can create that will replicate this
>>> bug.
>>>
>>> If process a sends frames that contain UDP packets to process b, what
>>> is the behaviour we want if the UDP packet *has an incorrect
>>> checksum*?
>>>
>>> It seems to me that I and II should have identical behaviour, and I
>>> would think that (II) would not deliver the packets to the
>>> application.
>>>
>>> In (I) with Cong's patch would we be delivering corrupt UDP packets to
>>> process b despite an incorrect checksum in (I)?
>>>
>>> If so, I would argue that this patch isn't right.
>>
>>
>> Checksums are normally used to deal with flaky transport mechanisms,
>> and once a machine receives the frame, we do not keep re-calculating
>> checksums as we move it through various drivers and subsystems.
>>
>> In particular, checksums are NOT a security mechanism and can be easily
>> faked.
>>
>> Since packets sent on one veth never actually hit any unreliable transport
>> before they are received on the peer veth, then there should be no need to
>> checksum packets whose origin is known to be on the local machine.
>
> That's a good argument.  I'm trying to figure out how to reconcile
> your thoughts with the argument that virtual ethernet devices are an
> abstraction that should behave identically to perfectly-functional
> physical ethernet devices when connected with a wire.
>
> In my view, the invariant must be identical functionality, and if I
> were writing a regression test for this system, that's what I would
> test. I think optimizations for eliding checksums should be
> implemented only if they don't alter this functionality.
>
> There must be a way to structure / write this code so that we can
> optimize veths without causing different behaviour ...

A real NIC can either do hardware checksums, or it cannot.  If it
cannot, then the host must do it on the CPU for both transmit and
receive.

Veth is not a real NIC, and it cannot do hardware checksum offloading.

So, we either lie and pretend it does, or we eat massive amounts
of CPU usage to calculate and check checksums when sending across
a veth pair.

>> Any frame sent from a socket can be considered to be a local packet in my
>> opinion.
>
> I'm not sure that's totally right. Your bridge is adding a delay to
> your packets; it could just as easily be simulating corruption by
> corrupting 5% of packets going through it. If this change allows
> corrupt packets to be delivered to an application when they could not
> be delivered if the packets were routed via physical eths, I think
> that is a bug.

I actually do support corrupting the frame, but what I normally do is
corrupt the contents of the packet, and then recalculate the IP checksum
(and TCP if it applies) and send it on its way.  The receiving NIC and
stack will pass the frame up to the application since the checksums match,
and it would be up to the application to deal with it.  So, I can easily
cause an application to receive corrupted frames over physical eths.

I can also corrupt without updating the checksums in case you want to
test another system's NIC and/or stack.
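
To make the two corruption modes concrete, here is a minimal Python sketch.
It is an illustration only, not the tooling described above: it uses UDP
rather than TCP for brevity, and the helper names (internet_checksum,
udp_checksum, build_udp, checksum_ok, corrupt) and the addresses are
hypothetical. It assumes the standard RFC 1071 one's-complement checksum
over the IPv4 pseudo-header plus the UDP segment.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: bytes, dst_ip: bytes, length: int) -> bytes:
    """IPv4 pseudo-header for UDP (protocol number 17)."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)


def udp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum for a segment whose checksum field is currently zero."""
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return csum or 0xFFFF  # a computed 0 is transmitted as 0xFFFF


def build_udp(src_ip, dst_ip, sport, dport, payload: bytes) -> bytes:
    """Build a UDP header plus payload with a correct checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = udp_checksum(src_ip, dst_ip, header + payload)
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def checksum_ok(src_ip, dst_ip, segment: bytes) -> bool:
    """Receiver-side check: the sum over everything must verify to zero."""
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0


def corrupt(segment: bytes, offset: int, fix_checksum: bool, src_ip, dst_ip) -> bytes:
    """Flip one byte; optionally recompute the checksum to hide the damage."""
    buf = bytearray(segment)
    buf[offset] ^= 0xFF
    if fix_checksum:
        buf[6:8] = b"\x00\x00"  # zero the checksum field before recomputing
        buf[6:8] = struct.pack("!H", udp_checksum(src_ip, dst_ip, bytes(buf)))
    return bytes(buf)


src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
good = build_udp(src, dst, 1234, 5678, b"hello veth")
hidden = corrupt(good, 8, True, src, dst)     # damaged payload, valid checksum
exposed = corrupt(good, 8, False, src, dst)   # damaged payload, stale checksum
```

Under these assumptions, `hidden` passes `checksum_ok` even though its
payload differs from the original, so a verifying receiver would still hand
it to the application. `exposed` fails the check and would be dropped by any
stack that actually verifies the checksum.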

But, if I am purposely corrupting a frame destined for veth, then the only reason
I would want the stack to check the checksums is if I were testing my own
stack's checksum logic, and that seems to be a pretty limited use.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

