From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1753412AbcEMQ5X (ORCPT ); Fri, 13 May 2016 12:57:23 -0400
Received: from mail2.candelatech.com ([208.74.158.173]:58133 "EHLO
 mail2.candelatech.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753204AbcEMQ5V (ORCPT ); Fri, 13 May 2016 12:57:21 -0400
Subject: Re: [PATCH 3.2 085/115] veth: don't modify ip_summed; doing so
 treats packets with bad checksums as good.
References: <1462041181.17662.3.camel@decadent.org.uk>
 <57250A17.5090804@candelatech.com> <57251CB3.1040504@candelatech.com>
 <572523C4.4080307@candelatech.com> <57252918.7070302@candelatech.com>
 <57253527.7010009@candelatech.com> <20160501053059.GA26097@1wt.eu>
Cc: Willy Tarreau , Vijay Pandurangan , Tom Herbert , Ben Hutchings ,
 Sabrina Dubroca , Hannes Frederic Sowa , LKML , stable@vger.kernel.org,
 akpm@linux-foundation.org, Cong Wang , Linux Kernel Network Developers ,
 Evan Jones , Nicolas Dichtel , Phil Sutter , Toshiaki Makita , Cong Wang
To: "David S. Miller"
From: Ben Greear
Organization: Candela Technologies
Message-ID: <5736076F.10003@candelatech.com>
Date: Fri, 13 May 2016 09:57:19 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160501053059.GA26097@1wt.eu>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Mr Miller:

How do you feel about a new socket-option to allow a socket to request
the old veth behaviour?

Thanks,
Ben

On 04/30/2016 10:30 PM, Willy Tarreau wrote:
> On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote:
>> On 04/30/2016 03:01 PM, Vijay Pandurangan wrote:
>>> Consider:
>>>
>>> - App A sends out corrupt packets 50% of the time and discards inbound data.
> (...)
>> How can you make a generic app C know how to do this?
>> The path could be, for instance:
>>
>> eth0 <-> user-space-A <-> vethA <-> vethB <-> { kernel routing logic } <-> vethC <-> vethD <-> appC
>>
>> There are no sockets on vethB, but it does need to have special behaviour to elide
>> csums. Even if appC is hacked to know how to twiddle something on its veth port,
>> mucking with vethD will have no effect on vethB.
>>
>> With regard to your example above, why would A corrupt packets? My guess:
>>
>> 1) It has bugs (so, fix the bugs, it could equally create incorrect data with proper checksums,
>>    so just enabling checksumming adds no useful protection.)
>
> I agree with Ben here, what he needs is the ability for userspace to be
> trusted when *forwarding* a packet. Ideally you'd only want to receive
> the csum status per packet on the packet socket and pass the same value
> on the vethA interface, with this status being kept when the packet
> reaches vethB.
>
> If A purposely corrupts packets, it's A's problem. It's similar to designing
> a NIC which intentionally corrupts packets and reports "checksum good".
>
> The real issue is that in order to do things right, the userspace bridge
> (here, "A") would really need to pass this status. In Ben's case as he
> says, bad checksum packets are dropped before reaching A, so that
> simplifies the process quite a bit and that might be what causes some
> confusion, but ideally we'd rather have recvmsg() and sendmsg() with
> these flags.
>
> I faced the exact same issue 3 years ago when playing with netmap, it was
> slow as hell because it would lose all checksum information when packets
> were passing through userland, resulting in GRO/GSO etc being disabled,
> and had to modify it to let userland preserve it. That's especially
> important when you have to deal with possibly corrupted packets not yet
> detected in the chain because the NIC did not validate their checksums.
>
> Willy
>

-- 
Ben Greear
Candela Technologies Inc  http://www.candelatech.com