From: Thomas Monjalon
Subject: Re: [PATCH] eal: fix rte_memcpy perf in hsw/bdw
Date: Wed, 15 Jun 2016 16:21:08 +0200
Message-ID: <9020918.HyCU59pmu8@xps13>
To: Zhihong Wang
Cc: dev@dpdk.org
In-Reply-To: <1464139383-132732-1-git-send-email-zhihong.wang@intel.com>
References: <1464139383-132732-1-git-send-email-zhihong.wang@intel.com>
List-Id: patches and discussions about DPDK

2016-05-24 21:23, Zhihong Wang:
> This patch fixes rte_memcpy performance on Haswell and Broadwell for
> vhost when the copy size is larger than 256 bytes.
>
> It is observed that for large copies, such as 1024/1518-byte ones,
> rte_memcpy suffers a high rate of store-buffer-full events, which
> stall the pipeline in scenarios like vhost enqueue. This can be
> alleviated by adjusting the instruction layout. Note that this issue
> may not be visible in micro benchmarks.
>
> How to reproduce?
>
> PHY-VM-PHY using vhost/virtio, or vhost/virtio loopback, with large
> packets such as 1024/1518-byte ones. If PHY-VM-PHY is used, make sure
> the packet generation rate is not the bottleneck.
>
> Signed-off-by: Zhihong Wang

Test report: http://dpdk.org/ml/archives/dev/2016-May/039716.html

Applied, thanks