From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Graf
Subject: Re: [patch net-next v2 8/9] switchdev: introduce Netlink API
Date: Mon, 22 Sep 2014 09:13:41 +0100
Message-ID: <20140922081341.GA20905@casper.infradead.org>
References: <1411134590-4586-1-git-send-email-jiri@resnulli.us>
 <1411134590-4586-9-git-send-email-jiri@resnulli.us>
 <541C4AFC.8060500@mojatatu.com>
 <20140919154946.GH1980@nanopsycho.orion>
 <541C6E6D.9000109@mojatatu.com>
 <541CAA3C.5080105@intel.com>
 <20140920081426.GE1821@nanopsycho.orion>
 <20140920105354.GA29419@casper.infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jiri Pirko, John Fastabend, Jamal Hadi Salim,
 "netdev@vger.kernel.org", "David S. Miller", Neil Horman,
 Andy Gospodarek, Daniel Borkmann, Or Gerlitz, Jesse Gross,
 Pravin Shelar, Andy Zhou, ben@decadent.org.uk, Stephen Hemminger,
 jeffrey.t.kirsher@intel.com, Vladislav Yasevich, Cong Wang,
 Eric Dumazet, Scott Feldman, Florian Fainelli, Roopa Prabhu,
 John Linville, "dev@openvswitch.org"
Return-path:
Received: from casper.infradead.org ([85.118.1.10]:52821 "EHLO
 casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753301AbaIVINu (ORCPT );
 Mon, 22 Sep 2014 04:13:50 -0400
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On 09/20/14 at 03:50pm, Alexei Starovoitov wrote:
> I think HW should not be limited by SW abstractions whether
> these abstractions are called flows, n-tuples, bridge or else.
> Really looking forward to see "device reporting the headers as
> header fields (len, offset) and the associated parse graph"
> as the first step.
>
> Another topic that this discussion didn't cover yet is how this
> all connects to tunnels and what is 'tunnel offloading'.
> imo flow offloading by itself serves only academic interest.

We haven't touched encryption yet either ;-)

Certainly true for the host case.
The Linux on TOR case is less dependent on this, and L2/L3 offload w/o encap already has value. I'm with you though, all of this has little value on the host in the DC if stateful encap offload is not incorporated.

I expect the HW to provide filters on the outer header plus metadata in the encap. Actually, this was a follow-up question I had for John, as this is not easily describable with offset/len filters. How would we represent such capabilities?

The TX side of this was one of the reasons why I initially thought it would be beneficial to implement a cache-like offload, as we could serve an initial encap in SW, do the FIB lookup and offload it transparently to avoid replicating the FIB in user space.

What seems most feasible to me right now is to separate the offload of the encap action from the IP -> dev mapping decision. The eSwitch would send the first encap for an unknown dest IP to the CPU due to a miss in the IP mapping table; the CPU would then do the FIB lookup, update the table and send the packet back.

What do you have in mind?
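
To make the miss-driven IP -> dev mapping above a bit more concrete, here is a rough Python sketch of the control flow only. It is not kernel code and not a proposed API: the names Fib, ESwitch, ip_to_dev and encap_tx are hypothetical, the FIB is a toy longest-prefix-match list, and the "punt to CPU" step is modelled as a plain function call.

```python
import ipaddress


class Fib:
    """Toy longest-prefix-match FIB living on the CPU (hypothetical)."""

    def __init__(self):
        self.routes = []  # list of (ip_network, dev name)

    def add_route(self, prefix, dev):
        self.routes.append((ipaddress.ip_network(prefix), dev))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [(net, dev) for net, dev in self.routes if addr in net]
        if not matches:
            return None
        # The most specific (longest) prefix wins.
        return max(matches, key=lambda m: m[0].prefixlen)[1]


class ESwitch:
    """Model of an eSwitch with an offloaded IP -> dev mapping table."""

    def __init__(self, fib):
        self.fib = fib
        self.ip_to_dev = {}  # stands in for the HW mapping table
        self.misses = 0      # number of packets punted to the CPU

    def encap_tx(self, outer_dst):
        """Return the egress dev for an encapsulated packet, or None."""
        dev = self.ip_to_dev.get(outer_dst)
        if dev is None:
            # Miss: punt to the CPU, which does the FIB lookup,
            # updates the table and sends the packet back.
            self.misses += 1
            dev = self.fib.lookup(outer_dst)
            if dev is None:
                return None  # no route, nothing to install
            self.ip_to_dev[outer_dst] = dev
        return dev
```

Only the first packet to a given outer destination takes the slow path; later ones hit the table. The sketch leaves out invalidation entirely: a real design would presumably have to flush or update entries when the FIB changes.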