From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: LRO/GSO interaction when packets are forwarded
Date: Fri, 7 Mar 2008 08:25:38 -0800
Message-ID: <20080307082538.1a674ae1@extreme>
References: <1204898997.4220.41.camel@moonstone.uk.level5networks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org
To: Kieran Mansley
Return-path:
Received: from mail.vyatta.com ([216.93.170.194]:51028 "EHLO mail.vyatta.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752001AbYCGQZo (ORCPT ); Fri, 7 Mar 2008 11:25:44 -0500
In-Reply-To: <1204898997.4220.41.camel@moonstone.uk.level5networks.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Fri, 07 Mar 2008 14:09:57 +0000
Kieran Mansley wrote:

> We've seen a couple of problems when using a bridge or IP forwarding
> combined with LRO packets generated by a network device driver. As you
> know, LRO packets can either be page based (and passed up with
> lro_receive_page()) or use the skb frag_list (and passed up with
> lro_receive_skb()). In both cases it is likely that the device driver
> will have set CHECKSUM_UNNECESSARY to indicate that the packet has been
> checksummed by the device, and gso_size to mark it as an LRO packet and
> indicate the actual received MSS.

First off, no hardware should ever do LRO on non-local packets. If the
hardware isn't smart enough to make that distinction, I guess the bridge
code needs to have an API to turn it off. IP should also turn it off if
ip_forwarding is enabled on that device.

> If this skb goes directly to the network stack everything is fine. The
> problem comes when this packet instead goes into a bridge and is then
> retransmitted on another device. The skb seems to pass through the
> bridge relatively unmodified and because it has gso_size set the
> transmit path will attempt to segment it. If page-based allocation has
> been used, this is fine, but if the skb frag_list has been used the
> transmit path BUGs in skb_gso_segment():

You can't do LRO with bridging. It is that simple: it is a protocol
layering violation.

> http://lxr.linux.no/linux+v2.6.24.3/net/core/dev.c#L1410
>
> Secondly, the same function hopes that a GSO packet will have
> CHECKSUM_PARTIAL set - if this packet had originated from a stack rather
> than from an LRO device this would be the case - but instead it will
> most likely have CHECKSUM_UNNECESSARY.
>
> Both of these problems are essentially being caused by gso_size and the
> ip_summed field having slightly different meanings on the receive and
> transmit paths, and the bridge/IP forwarding stuff not translating from
> one to the other. To be fair to the bridge, it would not be obvious to
> it that it will be passing the packet to a real device (that will invoke
> the transmit path) or to a stack.
>
> This leads me to my questions:
>
> - any idea why other drivers aren't hitting this problem? One
> possibility is that they're using lro_receive_page rather than
> lro_receive_skb, but I'd still expect to see the CHECKSUM_PARTIAL
> warning. I'm wondering if having LRO and forwarding between devices is
> a relatively rare thing, and so it just hasn't been tested.
> - any suggestion as to the best place to try and fix this up? My
> preference is making the transmit path cope with a packet that has the
> frag_list in use. Making it cope with CHECKSUM_UNNECESSARY should also
> be possible but to be honest I'm finding skb_gso_segment's handling of
> CHECKSUM_PARTIAL a bit hard to follow. The alternative would be I
> suppose to get the bridge and IP forwarding code to fix the socket
> buffer up before transmitting it, or for the driver to somehow know that
> this packet will be forwarded and so it shouldn't use LRO.

br_add_if should have a way to tell the device to turn LRO off.
dev_change_flags?
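
To make the suggestion concrete, here is a rough userspace sketch of the
policy, not kernel code. Every name in it (struct sketch_netdev,
SKETCH_F_LRO, sketch_disable_lro() and so on) is made up for
illustration and is an assumption, not an existing kernel API; the real
hook might well go through dev_change_flags or something else entirely.
It only shows the rule: a device loses LRO when it becomes a bridge port
or when IP forwarding is enabled on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bit standing in for "LRO enabled" on a device. */
#define SKETCH_F_LRO (1u << 0)

/* Minimal stand-in for a network device; not the kernel's struct net_device. */
struct sketch_netdev {
    const char *name;
    unsigned int features;   /* bitmask of enabled offloads */
    bool ip_forwarding;      /* per-device forwarding switch */
};

/* Hypothetical helper: turn LRO off. Returns true if the state changed. */
bool sketch_disable_lro(struct sketch_netdev *dev)
{
    assert(dev != NULL);
    if (!(dev->features & SKETCH_F_LRO))
        return false;                 /* already off, nothing to do */
    dev->features &= ~SKETCH_F_LRO;
    return true;
}

/* Model of adding a port to a bridge: bridged packets are not local,
 * so LRO is switched off before the port starts forwarding. */
void sketch_br_add_if(struct sketch_netdev *dev)
{
    sketch_disable_lro(dev);
    /* ... the rest of the port setup would follow here ... */
}

/* Model of toggling IP forwarding on a device: enabling it also
 * disables LRO, for the same reason as in the bridge case. */
void sketch_set_ip_forwarding(struct sketch_netdev *dev, bool on)
{
    assert(dev != NULL);
    dev->ip_forwarding = on;
    if (on)
        sketch_disable_lro(dev);
}
```

Note that the sketch never turns LRO back on when the device leaves the
bridge or forwarding is disabled; whether the real code should restore
it is an open question that this model does not try to answer.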