From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Subject: Re: Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage)
Date: Mon, 25 Jan 2016 23:10:16 +0100
Message-ID: <20160125231016.4f0d2cd5@redhat.com>
References: <1453330945.1223.329.camel@edumazet-glaptop2.roam.corp.google.com>
 <20160121122730.6330a84b@redhat.com>
 <20160121.105401.1793719917762270884.davem@davemloft.net>
 <20160124152814.2ea5e99b@redhat.com>
 <20160124163846-mutt-send-email-mst@redhat.com>
 <56A509C4.3030706@gmail.com>
 <20160125141516.795f3eb7@redhat.com>
 <56A66058.1090308@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Tom Herbert, "Michael S. Tsirkin", David Miller, Eric Dumazet,
 Or Gerlitz, Eric Dumazet, Linux Kernel Network Developers,
 Alexander Duyck, Alexei Starovoitov, Daniel Borkmann, Marek Majkowski,
 Hannes Frederic Sowa, Florian Westphal, Paolo Abeni, John Fastabend,
 Amir Vadai, Daniel Borkmann, Vladislav Yasevich, brouer@redhat.com
To: John Fastabend
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:36507 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751275AbcAYWKZ (ORCPT ); Mon, 25 Jan 2016 17:10:25 -0500
In-Reply-To: <56A66058.1090308@gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Mon, 25 Jan 2016 09:50:16 -0800 John Fastabend wrote:

> On 16-01-25 09:09 AM, Tom Herbert wrote:
> > On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
> > wrote:
> >> [...]
> >>
> >> There are two ideas, getting mixed up here. (1) bundling from the
> >> RX-ring, (2) allowing to pick up the "packet-page" directly.
> >>
> >> Bundling (1) is something that seems natural, and which help us
> >> amortize the cost between layers (and utilizes icache better). Lets
> >> keep that in another thread.
> >>
> >> This (2) direct forward of "packet-pages" is a fairly extreme idea,
> >> BUT it have the potential of being an new integration point for
> >> "selective" bypass-solutions and bringing RAW/af_packet (RX) up-to
> >> speed with bypass-solutions.
> [...]
>
> Jesper, at least for you (2) case what are we missing with the
> bifurcated/queue splitting work? Are you really after systems
> without SR-IOV support or are you trying to get this on the order
> of queues instead of VFs.

I'm not saying something is missing from the bifurcated/queue splitting
work. I'm not trying to work around SR-IOV.

This is an extreme idea, which I got while looking at the lowest RX layer.

Before working any further on this idea/path, I need/want to evaluate
whether it makes sense from a performance point of view. I need to
evaluate whether "pulling" out these "packet-pages" is fast enough to
compete with DPDK/netmap. Otherwise it makes no sense to work on this
path.

As a first step to evaluate this lowest RX layer, I'm simply hacking
the drivers (ixgbe and mlx5) to drop/discard packets within the driver.
For now, that means simply replacing napi_gro_receive() with
dev_kfree_skb(), and measuring the "RX-drop" performance.

The next step was to avoid the skb alloc+free calls, but doing so is
more complicated than I first anticipated, as the SKB is tied in fairly
heavily. Thus, right now I'm instead hooking in my bulk alloc+free
API, as that will remove/mitigate most of the overhead of the
kmem_cache/slab-allocators.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
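
The "RX-drop" experiment and the bulk alloc+free idea described above
can be illustrated with a small userspace sketch. This is not the
driver code and does not use any kernel API: struct toy_cache, the
toy_* functions, rx_drop_single(), rx_drop_bulk() and RX_BATCH are all
hypothetical names invented for illustration, and the sync_ops counter
is only an assumed proxy for the per-call allocator overhead. It shows
why allocating and freeing a batch at a time might amortize that cost
when every packet is dropped right away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a slab cache. "sync_ops" counts
 * how often the (imaginary) synchronized allocator path is entered. */
struct toy_cache {
	size_t obj_size;
	unsigned long sync_ops;
};

/* Per-object path: one synchronized operation per object. */
static void *toy_alloc(struct toy_cache *c)
{
	c->sync_ops++;
	return malloc(c->obj_size);
}

static void toy_free(struct toy_cache *c, void *obj)
{
	c->sync_ops++;
	free(obj);
}

/* Bulk path: one synchronized operation per batch.
 * Returns n on success, 0 on failure (nothing is left allocated). */
static size_t toy_alloc_bulk(struct toy_cache *c, size_t n, void **objs)
{
	size_t i;

	c->sync_ops++;
	for (i = 0; i < n; i++) {
		objs[i] = malloc(c->obj_size);
		if (!objs[i]) {
			while (i--)
				free(objs[i]);
			return 0;
		}
	}
	return n;
}

static void toy_free_bulk(struct toy_cache *c, size_t n, void **objs)
{
	size_t i;

	c->sync_ops++;
	for (i = 0; i < n; i++)
		free(objs[i]);
}

#define RX_BATCH 64	/* assumed batch size, chosen arbitrarily */

/* "RX-drop" with per-packet alloc+free: the free sits where a driver
 * would otherwise hand the packet up the stack. */
static unsigned long rx_drop_single(struct toy_cache *c, size_t npkts)
{
	unsigned long dropped = 0;
	size_t i;

	for (i = 0; i < npkts; i++) {
		void *pkt = toy_alloc(c);

		if (!pkt)
			break;
		toy_free(c, pkt);
		dropped++;
	}
	return dropped;
}

/* Same drop loop, but allocating and freeing up to RX_BATCH at a time. */
static unsigned long rx_drop_bulk(struct toy_cache *c, size_t npkts)
{
	void *pkts[RX_BATCH];
	unsigned long dropped = 0;

	while (npkts > 0) {
		size_t n = npkts < RX_BATCH ? npkts : RX_BATCH;

		if (toy_alloc_bulk(c, n, pkts) != n)
			break;
		toy_free_bulk(c, n, pkts);
		dropped += n;
		npkts -= n;
	}
	return dropped;
}
```

In this toy model, dropping 1000 packets costs two counted operations
per packet on the per-object path, but only two per batch of up to 64
on the bulk path. Real slab behaviour depends on per-CPU caching and
many other details that the sketch deliberately ignores.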