From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Fastabend
Subject: Re: [PATCH v1 1/6] net: Generalize udp based tunnel offload
Date: Wed, 02 Dec 2015 15:35:53 -0800
Message-ID: <565F8059.3010101@gmail.com>
References: <1448312579-159544-1-git-send-email-anjali.singhai@intel.com>
 <1448312579-159544-2-git-send-email-anjali.singhai@intel.com>
 <20151129.222138.1582847465760563254.davem@davemloft.net>
 <20151201154445.GF29497@tuxdriver.com>
 <1448984968.3382143.454794705.68D88B7D@webmail.messagingengine.com>
 <1449074114.3806253.455834737.16948E5F@webmail.messagingengine.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: "John W. Linville", Jesse Gross, David Miller, Anjali Singhai Jain,
 Linux Kernel Network Developers, Kiran Patil
To: Tom Herbert, Hannes Frederic Sowa
Return-path:
Received: from mail-pf0-f174.google.com ([209.85.192.174]:34022 "EHLO
 mail-pf0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1757378AbbLBXmW (ORCPT ); Wed, 2 Dec 2015 18:42:22 -0500
Received: by pfbg73 with SMTP id g73so2597477pfb.1 for ;
 Wed, 02 Dec 2015 15:42:22 -0800 (PST)
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

[...]

>>
>> I wonder why we need protocol generic offloads? I know there are
>> currently a lot of overlay encapsulation protocols. Are there many more
>> coming?
>>
> Yes, and assume that there are more coming with an unbounded limit
> (for instance I just noticed today that there is a netdev1.1 talk on
> supporting GTP in the kernel). Besides, this problem space not just
> limited to offload of encapsulation protocols, but how to generalize
> offload of any transport, IPv[46], application protocols, protocol
> implemented in user space, security protocols, etc.
>
>> Besides, this offload is about TSO and RSS and they do need to parse the
>> packet to get the information where the inner header starts. It is not
>> only about checksum offloading.
>>
> RSS does not require the device to parse the inner header. All the UDP
> encapsulations protocols being defined set the source port to entropy
> flow value and most devices already support RSS+UDP (just needs to be
> enabled) so this works just fine with dumb NICs. In fact, this is one
> of the main motivations of encapsulating UDP in the first place, to
> leverage existing RSS and ECMP mechanisms. The more general solution
> is to use IPv6 flow label (RFC6438). We need HW support to include the
> flow label into the hash for ECMP and RSS, but once we have that much
> of the motivation for using UDP goes away and we can get back to just
> doing GRE/IP, IPIP, MPLS/IP, etc. (hence eliminate overhead and
> complexity of UDP encap).
>
>> Please provide a sketch up for a protocol generic api that can tell
>> hardware where a inner protocol header starts that supports vxlan,
>> vxlan-gpe, geneve and ipv6 extension headers and knows which protocol is
>> starting at that point.
>>
> BPF. Implementing protocol generic offloads are not just a HW concern
> either, adding kernel GRO code for every possible protocol that comes
> along doesn't scale well. This becomes especially obvious when we
> consider how to provide offloads for applications protocols. If the
> kernel provides a programmable framework for the offloads then
> application protocols, such as QUIC, could use use that without
> needing to hack the kernel to support the specific protocol (which no
> one wants!). Application protocol parsing in KCM and some other use
> cases of BPF have already foreshadowed this, and we are working on a
> prototype for a BPF programmable engine in the kernel. Presumably,
> this same model could eventually be applied as the HW API to
> programmable offload.

Just keying off the last statement there...

I think BPF programs are going to be hard to translate into hardware
for most devices. The problem is the BPF programs in general lack
structure.
A parse graph would be much more friendly for hardware, or at minimum
the BPF program would need to be some sort of well-structured program
so a driver could turn it into a parse graph.

.John
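
To make the parse-graph idea concrete, here is a minimal sketch of what
such a structure might look like. It is only an illustration, not an API
proposed anywhere in this thread: the names Node, PARSE_GRAPH, walk and
example_packet are hypothetical, and the graph assumes fixed-length
headers (IPv4 without options, no VLAN tags) and the IANA-assigned VXLAN
UDP port 4789. Each node is a header type; each edge is selected by the
value of one fixed-offset field. A driver could plausibly walk a table of
this shape and program a hardware parser from it, which is much harder to
do from an arbitrary instruction stream.

```python
import struct
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One vertex of the parse graph: a header type and its outgoing edges."""
    hdr_len: int                    # fixed header length in bytes (assumption)
    key_off: int = 0                # offset of the next-protocol field
    key_fmt: str = ""               # struct format of that field ("" = no field)
    edges: Dict[int, str] = field(default_factory=dict)  # field value -> next node
    default: Optional[str] = None   # edge taken when no field value matches


# Hypothetical graph: Ethernet / IPv4 / UDP / VXLAN / inner Ethernet.
PARSE_GRAPH = {
    "eth":       Node(14, 12, "!H", {0x0800: "ipv4"}),   # EtherType
    "ipv4":      Node(20, 9, "!B", {17: "udp"}),         # IP protocol number
    "udp":       Node(8, 2, "!H", {4789: "vxlan"}),      # UDP destination port
    "vxlan":     Node(8, default="inner_eth"),           # always inner Ethernet
    "inner_eth": Node(14),                               # leaf in this sketch
}


def walk(pkt: bytes, start: str = "eth") -> List[Tuple[str, int]]:
    """Return [(header name, byte offset), ...] for the headers found in pkt."""
    path = []
    off = 0
    name: Optional[str] = start
    while name is not None:
        node = PARSE_GRAPH[name]
        if off + node.hdr_len > len(pkt):
            break  # truncated packet: stop before reading past the end
        path.append((name, off))
        nxt = node.default
        if node.key_fmt:
            (key,) = struct.unpack_from(node.key_fmt, pkt, off + node.key_off)
            nxt = node.edges.get(key, node.default)
        off += node.hdr_len
        name = nxt
    return path


def example_packet(udp_dport: int = 4789) -> bytes:
    """Build a zero-filled test frame with only the selector fields set."""
    eth = bytes(12) + struct.pack("!H", 0x0800)
    ipv4 = bytes([0x45]) + bytes(8) + bytes([17]) + bytes(10)
    udp = struct.pack("!HHHH", 0xBEEF, udp_dport, 0, 0)
    vxlan = bytes(8)
    inner_eth = bytes(14)
    return eth + ipv4 + udp + vxlan + inner_eth
```

Calling walk(example_packet()) yields the header names with their byte
offsets, ending at the inner Ethernet header at offset 50, which is the
"where does the inner header start" answer asked for above. With any
other UDP destination port the walk stops at the UDP node.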