From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Subject: Re: [net-next PATCH] net: ipv4: fix listify ip_rcv_finish in case of forwarding
Date: Fri, 13 Jul 2018 18:04:13 +0200
Message-ID: <20180713180413.4f8616ff@redhat.com>
References: <153132125549.13161.16380200872856218805.stgit@firesoul> <7c5605ed2fe9505b982fde312d8416bd7fbbe6af.camel@mellanox.com> <20180711220649.266b071a@redhat.com> <3d08d6ae-a4cc-f9ad-f752-ba66ca13240b@solarflare.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Cc: Or Gerlitz , Saeed Mahameed , "netdev@vger.kernel.org" , brouer@redhat.com
To: Edward Cree
Return-path:
Received: from mx3-rdu2.redhat.com ([66.187.233.73]:34974 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729681AbeGMQTc (ORCPT ); Fri, 13 Jul 2018 12:19:32 -0400
In-Reply-To: <3d08d6ae-a4cc-f9ad-f752-ba66ca13240b@solarflare.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Fri, 13 Jul 2018 15:19:40 +0100
Edward Cree wrote:

> On 12/07/18 21:10, Or Gerlitz wrote:
> > On Wed, Jul 11, 2018 at 11:06 PM, Jesper Dangaard Brouer
> > wrote:
> >> One reason I didn't "just" send a patch, is that Edward so-fare only
> >> implemented netif_receive_skb_list() and not napi_gro_receive_list().
> > sfc does't support gro?! doesn't make sense.. Edward?
> sfc has a flag EFX_RX_PKT_TCP set according to bits in the RX event, we
>  call napi_{get,gro}_frags() (via efx_rx_packet_gro()) for TCP packets and
>  netif_receive_skb() (or now the list handling) (via efx_rx_deliver()) for
>  non-TCP packets.  So we avoid the GRO overhead for non-TCP workloads.
>
> > Same TCP performance
> >
> > with GRO and no rx-batching
> >
> > or
> >
> > without GRO and yes rx-batching
> >
> > is by far not intuitive result
>
> I'm also surprised by this.  If I can find the time I'll try to do similar
>  experiments on sfc.
> Jesper, are the CPU utilisations similar in both cases?

The CPU util is very different.
 With GRO enabled, the netperf CPU is only 60.89% loaded in %sys.
 With napi_gro_receive_list it is almost 100% loaded.
 Just disabling GRO gives the same CPU load.

> You're sure your stream isn't TX-limited?

It might be the case, as the netperf sender HW is not as new as the
device under test.  And the 60% load and idle cycles in the GRO case do
indicate that this is the case.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer