From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Subject: Re: [net-next, PATCH 1/2, v3] net: socionext: different approach on DMA
Date: Mon, 1 Oct 2018 13:03:13 +0200
Message-ID: <20181001130313.318065fd@redhat.com>
References: <1538220482-16129-1-git-send-email-ilias.apalodimas@linaro.org> <1538220482-16129-2-git-send-email-ilias.apalodimas@linaro.org> <20181001112631.4a1fbb62@redhat.com> <20181001094450.GA24329@apalos> <20181001095657.GA24568@apalos>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, jaswinder.singh@linaro.org, ard.biesheuvel@linaro.org, masami.hiramatsu@linaro.org, arnd@arndb.de, bjorn.topel@intel.com, magnus.karlsson@intel.com, daniel@iogearbox.net, ast@kernel.org, jesus.sanchez-palencia@intel.com, vinicius.gomes@intel.com, makita.toshiaki@lab.ntt.co.jp, Tariq Toukan , Tariq Toukan , brouer@redhat.com
To: Ilias Apalodimas
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:56300 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728921AbeJARkh (ORCPT ); Mon, 1 Oct 2018 13:40:37 -0400
In-Reply-To: <20181001095657.GA24568@apalos>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Mon, 1 Oct 2018 12:56:58 +0300
Ilias Apalodimas wrote:

> > > #2: You have allocations on the XDP fast-path.
> > >
> > > The REAL secret behind the XDP performance is to avoid allocations on
> > > the fast-path. While I just told you to use the page-allocator and
> > > order-0 pages, this will actually kill performance. Thus, to make this
> > > fast, you need a driver local recycle scheme that avoids going through
> > > the page allocator, which makes XDP_DROP and XDP_TX extremely fast.
> > > For the XDP_REDIRECT action (which you seem to be interested in, as
> > > this is needed for AF_XDP), there is an xdp_return_frame() API that can
> > > make this fast.
> > >
> > I had an initial implementation that did exactly that (that's why the
> > dma_sync_single_for_cpu() -> dma_unmap_single_attrs() is there). In the case
> > of AF_XDP isn't that introducing a 'bottleneck' though? I mean you'll feed fresh
> > buffers back to the hardware only when your packets have been processed from
> > your userspace application
>
> Just a clarification here. This is the case if ZC is implemented. In my case
> the buffers will be 'ok' to be passed back to the hardware once the
> userspace payload has been copied by xdp_do_redirect()

Thanks for clarifying. But no, this is not introducing a 'bottleneck'
for AF_XDP.

For (1), copy-mode AF_XDP, the frame (as you noticed) is "freed" or
"returned" very quickly after it is copied. The code is a bit hard to
follow, but __xsk_rcv() calls xdp_return_buff() after the memcpy.
Thus, the frame can be kept DMA mapped and reused in the RX-ring quickly.

For (2), zero-copy AF_XDP, you need to implement a new allocator of
type MEM_TYPE_ZERO_COPY. The performance trick here is that all
DMA-map/unmap and allocations go away, given everything is preallocated
by userspace. The 4 rings (SPSC) are used for recycling the ZC-umem
frames (read Documentation/networking/af_xdp.rst).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
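
A minimal userspace sketch of the recycle idea discussed above, for
illustration only. It is not driver code and uses no kernel API:
malloc() stands in for the page allocator, memset()/memcpy() stand in
for the DMA write and the copy-mode copy to userspace, and all names
(recycle_cache, cache_get, cache_put, rx_simulate, FRAME_SIZE,
CACHE_SLOTS) are hypothetical. The point it tries to show is that when
a frame is returned right after the copy, the next receive can reuse it
and the allocator stays off the fast path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FRAME_SIZE  2048  /* hypothetical RX buffer size */
#define CACHE_SLOTS 64    /* hypothetical recycle cache depth */

/* Driver-local recycle cache: a small LIFO of ready-to-use buffers. */
struct recycle_cache {
	void  *slot[CACHE_SLOTS];
	size_t count;        /* buffers currently parked in the cache */
	size_t slow_allocs;  /* times we had to fall back to the allocator */
};

/* Fast path: pop a recycled buffer. Slow path: ask the allocator. */
static void *cache_get(struct recycle_cache *c)
{
	if (c->count > 0)
		return c->slot[--c->count];
	c->slow_allocs++;
	return malloc(FRAME_SIZE);  /* stand-in for the page allocator */
}

/* Return a buffer for reuse; only free it when the cache is full. */
static void cache_put(struct recycle_cache *c, void *buf)
{
	if (c->count < CACHE_SLOTS)
		c->slot[c->count++] = buf;
	else
		free(buf);
}

/* Release whatever is still parked in the cache. */
static void cache_destroy(struct recycle_cache *c)
{
	while (c->count > 0)
		free(c->slot[--c->count]);
}

/*
 * Simulate a copy-mode receive loop: fill a frame, copy the payload
 * out, then return the frame immediately so the next packet reuses it.
 * Returns how many times the slow-path allocator was hit.
 */
static size_t rx_simulate(size_t packets)
{
	struct recycle_cache cache = { .count = 0, .slow_allocs = 0 };
	unsigned char user_copy[FRAME_SIZE];
	size_t allocs;
	size_t i;

	for (i = 0; i < packets; i++) {
		unsigned char *frame = cache_get(&cache);

		if (!frame)
			break;  /* allocation failed, stop receiving */
		memset(frame, (int)(i & 0xff), FRAME_SIZE); /* "DMA write" */
		memcpy(user_copy, frame, FRAME_SIZE);       /* copy to "userspace" */
		assert(user_copy[0] == (unsigned char)(i & 0xff));
		cache_put(&cache, frame);  /* frame is reusable right away */
	}

	allocs = cache.slow_allocs;
	cache_destroy(&cache);
	return allocs;
}
```

With this sketch, rx_simulate(1000) hits the allocator only for the
first frame; every later packet is served from the cache. A real driver
would additionally keep the buffer DMA mapped across reuse, which this
userspace model cannot show.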