From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?B?Tmljb2xhcyBkZSBQZXNsb8O8YW4=?= Subject: Re: [patch net-next-2.6 v2] net: vlan: make non-hw-accel rx path similar to hw-accel Date: Sun, 22 May 2011 04:59:49 +0200 Message-ID: <4DD87C25.4030701@gmail.com> References: <1302241713-3637-1-git-send-email-jpirko@redhat.com> <20110412.141645.112604563.davem@davemloft.net> <20110521072925.GA2588@jirka.orion> <4DD7BB61.9050200@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Changli Gao , Jiri Pirko , David Miller , netdev@vger.kernel.org, shemminger@linux-foundation.org, kaber@trash.net, fubar@us.ibm.com, eric.dumazet@gmail.com, andy@greyhouse.net, ebiederm@xmission.com To: Jesse Gross Return-path: Received: from mail-ww0-f44.google.com ([74.125.82.44]:57176 "EHLO mail-ww0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753430Ab1EVC7y (ORCPT ); Sat, 21 May 2011 22:59:54 -0400 Received: by wwa36 with SMTP id 36so5141279wwa.1 for ; Sat, 21 May 2011 19:59:53 -0700 (PDT) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: Le 21/05/2011 19:54, Jesse Gross a =C3=A9crit : > On Sat, May 21, 2011 at 6:17 AM, Nicolas de Peslo=C3=BCan > wrote: >> Le 21/05/2011 12:43, Changli Gao a =C3=A9crit : >>> >>> On Sat, May 21, 2011 at 3:29 PM, Jiri Pirko w= rote: >>>> >>>> I do not see a reason why to not emulate that. To make paths as mu= ch >>>> similar as they can be, that is the point of this patch. >>>> >>>> I think it would be better to fix an issue you are pointing at >>>> rather that revert this. >>>> >>> >>> In my opinion, the hardware accelerated VLAN RX is just a special c= ase >>> of the non hardware accelerated VLAN RX with header reordering. For >>> promiscuous NICs and bridges, hw-accel-vlan-rx is just disabled. >> >> I strongly agree with that. >> >> The fact that a skb holds a VLAN tag is not a good enough reason to = always >> remove this tag before giving the skb to protocol handlers. >> >> If the user ask for VLAN tag removal, we should remove the tag, poss= ibly >> using hw-accel untagging if available else software untagging. And i= f the >> user doesn't ask for tag removal, we should not untag. >> >> In other words, if the user doesn't setup any vlan interface on top = of >> another interface, there is no reason to untag the skb : both hw-acc= el >> untagging and software untagging should be disabled. > > The problem is that for most hardware vlan stripping is actually the > common case, not the exception. When you try to disable it frequentl= y > there are hidden restrictions that cause problems. A few examples: > * Some NICs can't disable stripping at all. > * Some NICs can only do tag insertion if stripping is configured on r= eceive. > * Some NICs can only do hardware offloads (checksum, TSO) if tag > insertion is used on transmit. > > So if you are using vlans then acceleration is pretty much a fact of > life and the best possible way we can deal with it is to make the > accelerated and non-accelerated cases behave as similarly as possible= =2E > > Before we were trying to dynamically enable/disable vlan acceleration > based on whether a vlan group was configured and that worked fine for > vlan devices because acceleration was enabled for it. However, it > caused an endless series of problems for other devices (such as > bridging while trunking vlans) due to lost tags, driver bugs, and the > restrictions above. Some of these can be fixed with driver changes > but the fact is that dynamically changing behavior just leads to > problems for the less common cases that are supposedly being fixed. > It's much better to do the same thing all the time. Thanks for clarifying. So, because many limited/buggy hardware exist, we must mimic the behavi= or in software. 'Sounds good=20 to me. And because some setups may still require the skb not to be untagged, m= ay be we need the ability to=20 re-tag the skb in some situations... When a protocol handler or rx_hand= ler is explicitly registered=20 on a net_device which expect to receive tagged skb, we should deliver t= agged skb to it... Arguably,=20 this may sound incredible for the general case, but may be required for= not-so-special cases like=20 bridge or protocol analyzer. Of course, I don't say we should always re-tag: if no protocol handler = nor rx_handler were=20 registered on the parent interface, we don't need the extra work of re-= tagging. What I say is that it shouldn't be the job of protocol handlers or rx_h= andlers that expect the skb=20 to be tagged to fix the improper untagging. A generic feature should do= it when necessary. And all this being said, it doesn't mean that we should pollute __netif= _receive_skb with special=20 code for vlan handling. May be, as suggested by Eric W. Biederman in the V1 thread for this pat= ch, software untagging for=20 the first level of header should happen before __netif_receive_skb if w= e only try to mimic hardware=20 behavior. And possible later untagging (due to vlan nesting) should be done gener= ically inside=20 __netif_receive_skb, using rx_handler when appropriate. This would clea= nup the general case where no=20 vlan is involved at all. Nicolas.