From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Monjalon
Subject: Re: [PATCH v8 1/3] eal/x86: run-time dispatch over memcpy
Date: Tue, 17 Oct 2017 23:24:12 +0200
Message-ID: <4482530.zMd2RtzCvC@xps>
References: <1507206794-79941-1-git-send-email-xiaoyun.li@intel.com>
 <1507885309-165144-1-git-send-email-xiaoyun.li@intel.com>
 <1507885309-165144-2-git-send-email-xiaoyun.li@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Cc: dev@dpdk.org, wenzhuo.lu@intel.com, helin.zhang@intel.com, ophirmu@mellanox.com
To: Xiaoyun Li, konstantin.ananyev@intel.com, bruce.richardson@intel.com
Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28])
 by dpdk.org (Postfix) with ESMTP id 9034B1B5D9
 for ; Tue, 17 Oct 2017 23:24:14 +0200 (CEST)
In-Reply-To: <1507885309-165144-2-git-send-email-xiaoyun.li@intel.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi,

13/10/2017 11:01, Xiaoyun Li:
> This patch dynamically selects functions of memcpy at run-time based
> on CPU flags that current machine supports. This patch uses function
> pointers which are bind to the relative functions at constrctor time.
> In addition, AVX512 instructions set would be compiled only if users
> config it enabled and the compiler supports it.
>
> Signed-off-by: Xiaoyun Li
> ---

Keeping only the major changes of the patch for later discussions:

[...]
>  static inline void *
>  rte_memcpy(void *dst, const void *src, size_t n)
>  {
> -	if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
> -		return rte_memcpy_aligned(dst, src, n);
> +	if (n <= RTE_X86_MEMCPY_THRESH)
> +		return rte_memcpy_internal(dst, src, n);
>  	else
> -		return rte_memcpy_generic(dst, src, n);
> +		return (*rte_memcpy_ptr)(dst, src, n);
>  }
[...]
> +static inline void *
> +rte_memcpy_internal(void *dst, const void *src, size_t n)
> +{
> +	if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
> +		return rte_memcpy_aligned(dst, src, n);
> +	else
> +		return rte_memcpy_generic(dst, src, n);
> +}

The significant change of this patch is to call a function pointer
for packet size > 128 (RTE_X86_MEMCPY_THRESH).

Please could you provide some benchmark numbers?

From a test done at Mellanox, there might be a performance degradation
of about 15% in testpmd txonly with AVX2.

Is there someone else seeing a performance degradation?
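
For anyone following the thread without the full patch at hand, the dispatch scheme under discussion can be sketched roughly as below. This is only an illustration of the technique, not the code from the patch: every `sketch_*` name and `SKETCH_MEMCPY_THRESH` is hypothetical, both "implementations" simply defer to `memcpy`, and the CPU check assumes the GCC/Clang builtins `__builtin_cpu_init()` and `__builtin_cpu_supports()` on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold, mirroring the 128-byte RTE_X86_MEMCPY_THRESH. */
#define SKETCH_MEMCPY_THRESH 128

typedef void *(*sketch_memcpy_fn)(void *dst, const void *src, size_t n);

/* Stand-ins for ISA-specific copies; both simply defer to memcpy here. */
static void *
sketch_memcpy_baseline(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *
sketch_memcpy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Starts at the baseline so a call made before the constructor is safe. */
static sketch_memcpy_fn sketch_memcpy_ptr = sketch_memcpy_baseline;

/* Runs before main(): bind the pointer according to the CPU flags. */
__attribute__((constructor))
static void
sketch_memcpy_init(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		sketch_memcpy_ptr = sketch_memcpy_avx2;
#endif
	assert(sketch_memcpy_ptr != NULL);
}

static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	/* Small copies take the direct call, which the compiler may inline. */
	if (n <= SKETCH_MEMCPY_THRESH)
		return sketch_memcpy_baseline(dst, src, n);
	/* Larger copies go through the pointer bound at constructor time. */
	return (*sketch_memcpy_ptr)(dst, src, n);
}
```

The point relevant to the performance question is the last function: copies above the threshold go through an indirect call, which the compiler cannot inline, whereas the code being replaced called static inline functions directly for every size.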