From: Jesper Dangaard Brouer <brouer@redhat.com>
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support
Date: Thu, 1 Feb 2018 17:42:40 +0100
Message-ID: <20180201174240.6368bc66@redhat.com>
To: Björn Töpel <bjorn.topel@gmail.com>
Cc: magnus.karlsson@intel.com, alexander.h.duyck@intel.com,
    alexander.duyck@gmail.com, john.fastabend@gmail.com, ast@fb.com,
    willemdebruijn.kernel@gmail.com, daniel@iogearbox.net,
    netdev@vger.kernel.org, Björn Töpel, michael.lundkvist@ericsson.com,
    jesse.brandeburg@intel.com, anjali.singhai@intel.com,
    jeffrey.b.shaw@intel.com, ferruh.yigit@intel.com, qi.z.zhang@intel.com,
    brouer@redhat.com, Saeed Mahameed
In-Reply-To: <20180131135356.19134-1-bjorn.topel@gmail.com>

On Wed, 31 Jan 2018 14:53:32 +0100 Björn Töpel wrote:

> * In this RFC, do not use an XDP_REDIRECT action other than
>   bpf_xdpsk_redirect for XDP_DRV_ZC. This is because a zero-copy
>   allocated buffer will then be sent to a cpu id / queue_pair through
>   ndo_xdp_xmit that does not know this has been ZC allocated. It will
>   then do a page_free on it and you will get a crash. How to extend
>   ndo_xdp_xmit with some free/completion function that could be called
>   instead of page_free? Hopefully, the same solution can be used here
>   as in the first problem item in this section.

I'm prototyping an extension of ndo_xdp_xmit with a free/completion
function call, which looks at the xdp_rxq_info to determine which
allocator type the RX NIC used (the info is kept per RX queue), and
invokes the appropriate callback.

I dusted off my old page_pool implementation (modifying it to run
outside the page allocator), implemented XDP_REDIRECT for mlx5,
extended xdp_rxq_info, and stored the needed info in ixgbe for DMA
TX-completion. I disabled the mlx5 page cache and used the page_pool
instead. It worked surprisingly well...

The test is: pktgen into a mlx5 100Gbit/s NIC, and XDP_REDIRECT with
the xdp_redirect_map sample out a 10G ixgbe NIC. Performance is
surprisingly good... This exercises the DMA TX-completion on ixgbe,
which calls xdp_return_frame, which is mapped to
page_pool_put_page(pool, page). Here DMA TX-completion runs on CPU#3
and mlx5 RX runs on CPU#0. (Internally page_pool uses a ptr_ring,
which is what gives the good cross-CPU performance.)
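To make the dispatch idea concrete, below is a rough sketch of how the
free/completion callback could look. The names xdp_mem_type and
xdp_mem_info are illustrative prototype names, not a settled API; the
page_pool_put_page(pool, page) call is the mapping mentioned above.

/* Kernel-style sketch; depends on the page_pool prototype
 * (<net/page_pool.h>). The RX driver records in its xdp_rxq_info which
 * allocator type the queue uses; the TX DMA-completion path then
 * returns the frame via one common helper instead of hardcoding
 * put_page()/page_frag_free().
 */
enum xdp_mem_type {
	MEM_TYPE_PAGE_ORDER0,	/* regular page allocator page */
	MEM_TYPE_PAGE_POOL,	/* page_pool recycled page */
};

struct xdp_mem_info {
	enum xdp_mem_type type;	/* set by the RX driver, per RX queue */
	struct page_pool *pool;	/* valid when type == MEM_TYPE_PAGE_POOL */
};

/* Called from e.g. ixgbe DMA TX-completion; may run on a different CPU
 * than RX, which page_pool handles cheaply via its internal ptr_ring.
 */
static void xdp_return_frame(struct page *page, struct xdp_mem_info *mem)
{
	switch (mem->type) {
	case MEM_TYPE_PAGE_POOL:
		page_pool_put_page(mem->pool, page);
		break;
	case MEM_TYPE_PAGE_ORDER0:
	default:
		put_page(page);	/* fall back to the page allocator */
	}
}

An AF_XDP zero-copy allocator then simply becomes another enum value
plus callback in this switch (see the last paragraph below).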
Show adapter(s) (ixgbe2 mlx5p2) statistics (ONLY that changed!)
Ethtool(ixgbe2  ) stat:    810562253 (    810,562,253) <= tx_bytes /sec
Ethtool(ixgbe2  ) stat:    864600261 (    864,600,261) <= tx_bytes_nic /sec
Ethtool(ixgbe2  ) stat:     13509371 (     13,509,371) <= tx_packets /sec
Ethtool(ixgbe2  ) stat:     13509380 (     13,509,380) <= tx_pkts_nic /sec
Ethtool(mlx5p2  ) stat:     36827369 (     36,827,369) <= rx_64_bytes_phy /sec
Ethtool(mlx5p2  ) stat:   2356953271 (  2,356,953,271) <= rx_bytes_phy /sec
Ethtool(mlx5p2  ) stat:     23313782 (     23,313,782) <= rx_discards_phy /sec
Ethtool(mlx5p2  ) stat:         3019 (          3,019) <= rx_out_of_buffer /sec
Ethtool(mlx5p2  ) stat:     36827395 (     36,827,395) <= rx_packets_phy /sec
Ethtool(mlx5p2  ) stat:   2356924099 (  2,356,924,099) <= rx_prio0_bytes /sec
Ethtool(mlx5p2  ) stat:     13513560 (     13,513,560) <= rx_prio0_packets /sec
Ethtool(mlx5p2  ) stat:    810820253 (    810,820,253) <= rx_vport_unicast_bytes /sec
Ethtool(mlx5p2  ) stat:     13513672 (     13,513,672) <= rx_vport_unicast_packets /sec

If I only disabled the mlx5 page cache (no page_pool), then single-flow
performance was 6 Mpps, and when I started two flows the collective
performance dropped to 4 Mpps, because we hit the page allocator lock
(further negative scaling occurs). If I keep the mlx5 cache, I see
between 7 and 11 Mpps... which varies depending on the ixgbe TX-ring
size and DMA-completion interrupt levels.

For AF_XDP, we just register another free/completion callback function.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer