From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jesper Dangaard Brouer Subject: Re: [net-next, RFC, 4/8] net: core: add recycle capabilities on skbs via page_pool API Date: Sat, 8 Dec 2018 21:11:53 +0100 Message-ID: <20181208211153.68f1424e@redhat.com> References: <154413874729.21735.10644578158550468689.stgit@firesoul> <20181208095758.GA32028@strlen.de> <72f33f12-9222-cbe7-6ff2-e4b4f86fb17c@gmail.com> <20181208145728.GA10660@apalos> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Ilias Apalodimas , Florian Westphal , netdev@vger.kernel.org, "David S. Miller" , Toke =?UTF-8?B?SMO4aWxhbmQtSsO4cmdlbnNlbg==?= , ard.biesheuvel@linaro.org, Jason Wang , =?UTF-8?B?QmrDtnJuVMO2cGVs?= , w@1wt.eu, Saeed Mahameed , mykyta.iziumtsev@gmail.com, Daniel Borkmann , Alexei Starovoitov , Tariq Toukan , brouer@redhat.com To: Eric Dumazet Return-path: Received: from mx1.redhat.com ([209.132.183.28]:48536 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726181AbeLHUME (ORCPT ); Sat, 8 Dec 2018 15:12:04 -0500 In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Sat, 8 Dec 2018 11:26:56 -0800 Eric Dumazet wrote: > On 12/08/2018 06:57 AM, Ilias Apalodimas wrote: > > Hi Eric, > >>>> This patch is changing struct sk_buff, and is thus per-definition > >>>> controversial. > >>>> > >>>> Place a new member 'mem_info' of type struct xdp_mem_info, just after > >>>> members (flags) head_frag and pfmemalloc, And not in between > >>>> headers_start/end to ensure skb_copy() and pskb_copy() work as-is. > >>>> Copying mem_info during skb_clone() is required. This makes sure that > >>>> pages are correctly freed or recycled during the altered > >>>> skb_free_head() invocation. > >>> > >>> I read this to mean that this 'info' isn't accessed/needed until skb > >>> is freed. Any reason its not added at the end? > >>> > >>> This would avoid moving other fields that are probably accessed > >>> more frequently during processing. > >>> > >> > >> But I do not get why the patch is needed. > >> > >> Adding extra cost for each skb destruction is costly. > >> > >> I though XDP was all about _not_ having skbs. > > > > You hit the only point i don't personally like in the code, xdp info in the > > skb. Considering the benefits i am convinced this is ok and here's why: > > > >> Please let's do not slow down the non XDP stack only to make XDP more appealing. > > > > We are not trying to do that, on the contrary. The patchset has nothing towards > > speeding XDP and slowing down anything else. The patchset speeds up the > > mvneta driver on the default network stack. The only change that was needed was > > to adapt the driver to using the page_pool API. The speed improvements we are > > seeing on specific workloads (i.e 256b < packet < 400b) are almost 3x. > > > > Lots of high speed drivers are doing similar recycling tricks themselves (and > > there's no common code, everyone is doing something similar though). All we are > > trying to do is provide a unified API to make that easier for the rest. Another > > advantage is that if the some drivers switch to the API, adding XDP > > functionality on them is pretty trivial. > > > > Moreover our tests are only performed on systems without or with SMMU disabled. > > On a platform i have access to, enabling and disabling the SMMU has some > > performance impact. By keeping the buffers mapped we believe this impact > > will be substantially less (i'll come back with results once i have them on > > this). > > > > I do understand your point, but the potential advantages on my head > > overwight that by a lot. > > > > I got other concerns on the patchset though. Like how much memory is it 'ok' to > > keep mapped keeping in mind we are using the streaming DMA API. Are we going to > > affect anyone else negatively by doing so ? > > > > I want to make sure you guys thought about splice() stuff, and > skb_try_coalesce(), and GRO, and skb cloning, and ... Thanks for the pointers. To Ilias, we need to double check skb_try_coalesce() code path, as it does look like we don't handle this correctly. > I do not see how an essential property of page fragments would be > attached to sk_buff, without adding extra code all over the places. > > Page recycling in drivers is fine, but should operate on pages that > the driver owns. > I think we have addressed this. We are only recycling pages that the driver owns. In case of fragments, then we release the DMA-mapping and don't recycle the page, instead the normal code path is taken (which is missing in case of skb_try_coalesce). -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer