netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Ben Greear <greearb@candelatech.com>
Cc: Vijay Pandurangan <vijayp@vijayp.ca>,
	Tom Herbert <tom@herbertland.com>,
	Ben Hutchings <ben@decadent.org.uk>,
	Sabrina Dubroca <sd@queasysnail.net>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	LKML <linux-kernel@vger.kernel.org>,
	stable@vger.kernel.org, akpm@linux-foundation.org,
	"David S. Miller" <davem@davemloft.net>,
	Cong Wang <cwang@twopensource.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Evan Jones <ej@evanjones.ca>,
	Nicolas Dichtel <nicolas.dichtel@6wind.com>,
	Phil Sutter <phil@nwl.cc>,
	Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
	Cong Wang <xiyou.wangcong@gmail.com>
Subject: Re: [PATCH 3.2 085/115] veth: don???t modify ip_summed; doing so treats packets with bad checksums as good.
Date: Sun, 1 May 2016 07:30:59 +0200	[thread overview]
Message-ID: <20160501053059.GA26097@1wt.eu> (raw)
In-Reply-To: <57253527.7010009@candelatech.com>

On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote:
> On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:
> > Consider:
> > 
> > - App A  sends out corrupt packets 50% of the time and discards inbound data.
(...)
> How can you make a generic app C know how to do this?  The path could be,
> for instance:
> 
> eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC
> 
> There are no sockets on vethB, but it does need to have special behaviour to elide
> csums.  Even if appC is hacked to know how to twiddle some thing on it's veth port,
> mucking with vethD will have no effect on vethB.
> 
> With regard to your example above, why would A corrupt packets?  My guess:
> 
> 1)  It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums,
>     so just enabling checksumming adds no useful protection.)

I agree with Ben here, what he needs is the ability for userspace to be
trusted when *forwarding* a packet. Ideally you'd only want to receive
the csum status per packet on the packet socket and pass the same value
on the vethA interface, with this status being kept when the packet
reaches vethB.

If A purposely corrupts packet, it's A's problem. It's similar to designing
a NIC which intentionally corrupts packets and reports "checksum good".

The real issue is that in order to do things right, the userspace bridge
(here, "A") would really need to pass this status. In Ben's case as he
says, bad checksum packets are dropped before reaching A, so that
simplifies the process quite a bit and that might be what causes some
confusion, but ideally we'd rather have recvmsg() and sendmsg() with
these flags.

I faced the exact same issue 3 years ago when playing with netmap, it was
slow as hell because it would lose all checksum information when packets
were passing through userland, resulting in GRO/GSO etc being disabled,
and had to modify it to let userland preserve it. That's especially
important when you have to deal with possibly corrupted packets not yet
detected in the chain because the NIC did not validate their checksums.

Willy

  reply	other threads:[~2016-05-01  5:31 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <lsq.1461711744.351546278@decadent.org.uk>
2016-04-26 23:02 ` [PATCH 3.2 085/115] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good Ben Hutchings
2016-04-27 15:59   ` Ben Greear
2016-04-27 18:07     ` Ben Hutchings
2016-04-28  0:00       ` Hannes Frederic Sowa
2016-04-28  0:14         ` Ben Greear
2016-04-28 10:29           ` Sabrina Dubroca
2016-04-28 13:45             ` Ben Greear
2016-04-30 19:18               ` Ben Hutchings
2016-04-30 18:33             ` Ben Hutchings
2016-04-30 19:40               ` Ben Greear
2016-04-30 19:54                 ` Tom Herbert
2016-04-30 20:59                   ` Ben Greear
2016-04-30 21:13                     ` Vijay Pandurangan
2016-04-30 21:29                       ` Ben Greear
2016-04-30 21:36                         ` Vijay Pandurangan
2016-04-30 21:52                           ` Ben Greear
2016-04-30 22:01                             ` Vijay Pandurangan
2016-04-30 22:43                               ` Ben Greear
2016-05-01  5:30                                 ` Willy Tarreau [this message]
2016-05-13 16:57                                   ` [PATCH 3.2 085/115] veth: don???t " Ben Greear
2016-05-13 18:21                                     ` David Miller
2016-05-13 18:23                                       ` Ben Greear
2016-04-30 22:42                     ` [PATCH 3.2 085/115] veth: don’t " Tom Herbert
2016-04-30 20:15             ` Vijay Pandurangan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160501053059.GA26097@1wt.eu \
    --to=w@1wt.eu \
    --cc=akpm@linux-foundation.org \
    --cc=ben@decadent.org.uk \
    --cc=cwang@twopensource.com \
    --cc=davem@davemloft.net \
    --cc=ej@evanjones.ca \
    --cc=greearb@candelatech.com \
    --cc=hannes@stressinduktion.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=makita.toshiaki@lab.ntt.co.jp \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.dichtel@6wind.com \
    --cc=phil@nwl.cc \
    --cc=sd@queasysnail.net \
    --cc=stable@vger.kernel.org \
    --cc=tom@herbertland.com \
    --cc=vijayp@vijayp.ca \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).