From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zoltan Kiss
Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive function
Date: Mon, 7 Sep 2015 15:15:25 +0100
Message-ID: <55ED9BFD.7040009@linaro.org>
References: <1441135036-7491-1-git-send-email-zoltan.kiss@linaro.org> <55ED8252.1020900@linaro.org> <59AF69C657FD0841A61C55336867B5B0359227DF@IRSMSX103.ger.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
To: "Richardson, Bruce" , "dev@dpdk.org"
Return-path:
Received: from mail-wi0-f169.google.com (mail-wi0-f169.google.com [209.85.212.169]) by dpdk.org (Postfix) with ESMTP id 083E05963 for ; Mon, 7 Sep 2015 16:15:27 +0200 (CEST)
Received: by wicge5 with SMTP id ge5so85642539wic.0 for ; Mon, 07 Sep 2015 07:15:26 -0700 (PDT)
In-Reply-To: <59AF69C657FD0841A61C55336867B5B0359227DF@IRSMSX103.ger.corp.intel.com>
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

On 07/09/15 13:57, Richardson, Bruce wrote:
>
>
>> -----Original Message-----
>> From: Zoltan Kiss [mailto:zoltan.kiss@linaro.org]
>> Sent: Monday, September 7, 2015 1:26 PM
>> To: dev@dpdk.org
>> Cc: Ananyev, Konstantin; Richardson, Bruce
>> Subject: Re: [PATCH] ixgbe: prefetch packet headers in vector PMD receive
>> function
>>
>> Hi,
>>
>> I just realized I've missed the "[PATCH]" tag from the subject. Did anyone
>> had time to review this?
>>
>
> Hi Zoltan,
>
> the big thing that concerns me with this is the addition of new instructions for
> each packet in the fast path. Ideally, this prefetching would be better handled
> in the application itself, as for some apps, e.g. those using pipelining, the
> core doing the RX from the NIC may not touch the packet data at all, and the
> prefetches will instead cause a performance slowdown.
>
> Is it possible to get the same performance increase - or something close to it -
> by making changes in OVS?

OVS already prefetches the next packet's header while it's processing the
previous packet, but apparently that's not early enough, at least in my test
scenario, where I'm forwarding UDP packets with the least possible overhead.
I guess in tests where OVS does more complex processing it should be fine.
I'll try to move the prefetch earlier in the OVS codebase, but I'm not sure
it'll help.

Also, I've checked the PMD receive functions, and it's quite mixed whether
they prefetch the header or not. For example, all 3 other ixgbe receive
functions do, as well as the following drivers:

bnx2x
e1000
fm10k (scattered)
i40e
igb
virtio

While these drivers don't:

cxgbe
enic
fm10k (non-scattered)
mlx4

I think it would be better to add rte_packet_prefetch() everywhere, because
then applications can turn it off with CONFIG_RTE_PMD_PACKET_PREFETCH.

>
> Regards,
> /Bruce
>
>> Regards,
>>
>> Zoltan
>>
>> On 01/09/15 20:17, Zoltan Kiss wrote:
>>> The lack of this prefetch causes a significant performance drop in
>>> OVS-DPDK: 13.3 Mpps instead of 14 when forwarding 64 byte packets.
>>> Even though OVS prefetches the next packet's header before it starts
>>> processing the current one, it doesn't get there fast enough. This
>>> aligns with the behaviour of other receive functions.
>>>
>>> Signed-off-by: Zoltan Kiss
>>> ---
>>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> index cf25a53..51299fa 100644
>>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
>>> @@ -502,6 +502,15 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>>>  		_mm_storeu_si128((void *)&rx_pkts[pos]->rx_descriptor_fields1,
>>>  				pkt_mb1);
>>>
>>> +		rte_packet_prefetch((char*)(rx_pkts[pos]->buf_addr) +
>>> +				    RTE_PKTMBUF_HEADROOM);
>>> +		rte_packet_prefetch((char*)(rx_pkts[pos + 1]->buf_addr) +
>>> +				    RTE_PKTMBUF_HEADROOM);
>>> +		rte_packet_prefetch((char*)(rx_pkts[pos + 2]->buf_addr) +
>>> +				    RTE_PKTMBUF_HEADROOM);
>>> +		rte_packet_prefetch((char*)(rx_pkts[pos + 3]->buf_addr) +
>>> +				    RTE_PKTMBUF_HEADROOM);
>>> +
>>>  		/* C.4 calc avaialbe number of desc */
>>>  		var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
>>>  		nb_pkts_recd += var;
>>>
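
As a rough illustration of the application-side alternative discussed above,
here is a minimal C sketch that prefetches packet headers a few packets ahead
inside the processing loop, with a compile-time switch in the spirit of
CONFIG_RTE_PMD_PACKET_PREFETCH. It deliberately doesn't use DPDK: struct pkt,
HEADROOM, APP_PACKET_PREFETCH, PREFETCH_AHEAD, app_packet_prefetch() and
process_burst() are hypothetical names made up for this example, and
__builtin_prefetch() is the GCC/Clang builtin standing in for
rte_packet_prefetch(). It's not what OVS or the ixgbe PMD actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: only the field this sketch needs. */
struct pkt {
	uint8_t *buf_addr;
};

/* Assumed values; HEADROOM plays the role of RTE_PKTMBUF_HEADROOM. */
#define HEADROOM 128
#define BUF_SIZE 2048

/* Hypothetical compile-time switch, modelled on the idea of
 * CONFIG_RTE_PMD_PACKET_PREFETCH: build with -DAPP_PACKET_PREFETCH=0
 * to compile the prefetches out entirely. */
#ifndef APP_PACKET_PREFETCH
#define APP_PACKET_PREFETCH 1
#endif

#if APP_PACKET_PREFETCH
/* Read prefetch with high temporal locality (GCC/Clang builtin). */
#define app_packet_prefetch(p) __builtin_prefetch((p), 0, 3)
#else
#define app_packet_prefetch(p) do { (void)(p); } while (0)
#endif

/* How many packets ahead of the current one to prefetch (assumed value). */
#define PREFETCH_AHEAD 2

/* Start of packet data, i.e. where the header lives. */
static inline uint8_t *pkt_data(const struct pkt *p)
{
	return p->buf_addr + HEADROOM;
}

/* Toy "processing": sum the first header byte of every packet in the burst,
 * prefetching the header of the packet PREFETCH_AHEAD slots further on. */
static unsigned int process_burst(struct pkt **pkts, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	assert(pkts != NULL);

	/* Warm-up: prefetch the first few headers before touching any data. */
	for (i = 0; i < PREFETCH_AHEAD && i < n; i++)
		app_packet_prefetch(pkt_data(pkts[i]));

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			app_packet_prefetch(pkt_data(pkts[i + PREFETCH_AHEAD]));
		sum += pkt_data(pkts[i])[0];
	}
	return sum;
}
```

The lookahead distance is the knob in question here: with a distance of 1 the
prefetch may be issued too late when the per-packet work is very cheap, which
is the situation described above, while a larger distance issues it earlier at
the cost of more outstanding prefetches. Whether that closes the gap would
need measuring.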