From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ramu Ramamurthy
Subject: Re: [PATCH] - vxlan: gro not effective for intel 82599
Date: Fri, 26 Jun 2015 14:44:29 -0700
Message-ID: <0fbeed20a897eeebdeececa0bb68fa72@imap.linux.ibm.com>
References: <5981772fe36e64f8fec5997a4c7aa08f@imap.linux.ibm.com>
 <0b2eff60824ac7b7d3a672da9be9bf99@imap.linux.ibm.com>
 <3df94e04daebca29c94b6d32fb372177@imap.linux.ibm.com>
 <3036c2aaa52dc2817e674a77b5eac24d@imap.linux.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "David S. Miller", Jiri Benc, James Morris,
 Linux Kernel Network Developers, pradeeps@linux.vnet.ibm.com, J Kidambi
To: Tom Herbert
Return-path:
Received: from e35.co.us.ibm.com ([32.97.110.153]:40179 "EHLO e35.co.us.ibm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751931AbbFZVoe
 (ORCPT ); Fri, 26 Jun 2015 17:44:34 -0400
Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway:
 Authorized Use Only! Violators will be prosecuted for from ;
 Fri, 26 Jun 2015 15:44:34 -0600
Received: from b03cxnp07029.gho.boulder.ibm.com
 (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp02.boulder.ibm.com
 (Postfix) with ESMTP id B7EE13E40030 for ; Fri, 26 Jun 2015 15:44:30 -0600 (MDT)
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167])
 by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 t5QLiUJl49807430 for ; Fri, 26 Jun 2015 14:44:30 -0700
Received: from d03av01.boulder.ibm.com (localhost [127.0.0.1]) by
 d03av01.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id
 t5QLiTOr007631 for ; Fri, 26 Jun 2015 15:44:30 -0600
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On 2015-06-26 12:59, Tom Herbert wrote:
> On Fri, Jun 26, 2015 at 12:31 PM, Ramu Ramamurthy wrote:
>> On 2015-06-26 11:04, Tom Herbert wrote:
>>>>
>>>> I am testing the simplest configuration which has 1 TCP flow
>>>> generated by iperf from a VM connected to a linux bridge with a
>>>> vxlan tunnel interface. The 10G nic (82599 ES) has multiple receive
>>>> queues, but in this simple test, it is likely immaterial (because,
>>>> the tuple on which it hashes would be fixed). The real difference in
>>>> performance appears to be whether or not vxlan gro is performed by
>>>> software.
>>>>
>>>
>>> Please do "ethtool -k vxlan0" of whatever interface is for vxlan.
>>> Ensure GRO is "on", if not enable it on the interface by "ethtool _k
>>> vxlan0 gro on". Run iperf and to tcpdump on the vxlan interface to
>>> verify GRO is being done. If we are seeing performance degradation
>>> when GRO is being done at tunnel versus device that would be a
>>> different problem than no GRO being done at all.
>>
>>
>> Heres more details on the test.
>>
>> gro is "on" on the device and the tunnel. tcpdump on the vxlan
>> interface show un-aggregated packets
>>
>> [root@ramu1 tracing]# tcpdump -i vxlan0
>>
>> ptions [nop,nop,TS val 1972850548 ecr 193703], length 1398
>> 14:14:38.911955 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.], seq
>> 224921449:224922847, ack 1, win 221, options [nop,nop,TS val
>> 1972850548 ecr
>
> Looks like GRO was never implemented for vxlan tunnels. The driver is
> simply calling netif_rx instead of using the GRO cells infrastructure.
> geneve is doing the same thing. For other tunnels which are used in
> foo-over-udp (GRE, IPIP, SIT) ip_tunnel_rcv is called which in turn
> calls gro_cells_receive.

Can we remove (or relax) the checksum checks in udp_gro_receive() that
immediately prevent the vxlan_gro callbacks from being called? The
vxlan driver registers these offload callbacks, and I can see them work
when I relax the following checksum checks.

	if (NAPI_GRO_CB(skb)->udp_mark ||
	    (skb->ip_summed != CHECKSUM_PARTIAL &&  <<<< remove or relax these checks
	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&     <<<< which are directly
	     !NAPI_GRO_CB(skb)->csum_valid))        <<<< dependent on nic capability
		goto out;

Alternatively, can we move these checks into the respective drivers'
gro_receive() functions?

The other changes you suggest (gro_cells) are beyond my understanding.
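
To make the effect of the quoted condition easier to see, here is a minimal user-space sketch of the same predicate. It is only a model, not kernel code: struct gro_state, enum ip_summed_model and udp_gro_bails_out() are hypothetical stand-ins for the skb and NAPI_GRO_CB(skb) fields named above, and the enum values are not claimed to match the kernel's numeric constants. It shows only that, with udp_mark clear, the tunnel callbacks are skipped exactly when the packet is not CHECKSUM_PARTIAL, csum_cnt is 0 and csum_valid is unset.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the skb->ip_summed states (model only). */
enum ip_summed_model {
	CHECKSUM_NONE,
	CHECKSUM_UNNECESSARY,
	CHECKSUM_COMPLETE,
	CHECKSUM_PARTIAL
};

/* Hypothetical model of the fields tested in the quoted condition. */
struct gro_state {
	bool udp_mark;                  /* models NAPI_GRO_CB(skb)->udp_mark   */
	enum ip_summed_model ip_summed; /* models skb->ip_summed               */
	unsigned int csum_cnt;          /* models NAPI_GRO_CB(skb)->csum_cnt   */
	bool csum_valid;                /* models NAPI_GRO_CB(skb)->csum_valid */
};

/*
 * Returns true when the quoted check would take the "goto out" path,
 * i.e. when the per-tunnel gro_receive callback would not be reached.
 */
static bool udp_gro_bails_out(const struct gro_state *s)
{
	return s->udp_mark ||
	       (s->ip_summed != CHECKSUM_PARTIAL &&
		s->csum_cnt == 0 &&
		!s->csum_valid);
}
```

In this model, a state such as { false, CHECKSUM_NONE, 0, false } makes udp_gro_bails_out() return true, while changing any one of the three checksum-related fields (CHECKSUM_PARTIAL, a non-zero csum_cnt, or csum_valid set) makes it return false. Which of these states a given NIC and driver actually produce for vxlan traffic is outside what the sketch covers.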