From: Tiwei Bie
Subject: Re: [PATCH] vhost: adaptively batch small guest memory copies
Date: Fri, 8 Sep 2017 08:48:50 +0800
Message-ID: <20170908004849.GA18498@debian-ZGViaWFuCg>
References: <20170824021939.21306-1-tiwei.bie@intel.com>
To: Maxime Coquelin
Cc: dev@dpdk.org, yliu@fridaylinux.org, Zhihong Wang, Zhiyong Yang

Hi Maxime,

On Thu, Sep 07, 2017 at 07:47:57PM +0200, Maxime Coquelin wrote:
> Hi Tiwei,
> 
> On 08/24/2017 04:19 AM, Tiwei Bie wrote:
> > This patch adaptively batches the small guest memory copies.
> > By batching the small copies, the efficiency of executing the
> > memory LOAD instructions can be improved greatly, because the
> > memory LOAD latency can be effectively hidden by the pipeline.
> > We saw great performance boosts in small-packet PVP tests.
> > 
> > This patch improves the performance for small packets, and it
> > distinguishes packets by size. So although the performance for
> > big packets doesn't change, it also becomes relatively easy to
> > do some special optimizations for big packets later.
> > 
> > Signed-off-by: Tiwei Bie
> > Signed-off-by: Zhihong Wang
> > Signed-off-by: Zhiyong Yang
> > ---
> > This optimization depends on the CPU's internal pipeline design.
> > So further tests (e.g. on ARM) from the community are appreciated.
> > 
> >  lib/librte_vhost/vhost.c      |   2 +-
> >  lib/librte_vhost/vhost.h      |  13 +++
> >  lib/librte_vhost/vhost_user.c |  12 +++
> >  lib/librte_vhost/virtio_net.c | 240 ++++++++++++++++++++++++++++++++----------
> >  4 files changed, 209 insertions(+), 58 deletions(-)
> 
> I did some PVP benchmarking with your patch.
> First I tried my standard PVP setup, with io forwarding on the host and
> macswap on the guest in bidirectional mode.
> 
> With this, I noticed no improvement (18.8Mpps), but I think that is
> because the guest is the bottleneck here.
> So I changed my setup to do csum forwarding on the host side, so that
> the host's PMD threads are more loaded.
> 
> In this case, I noticed a great improvement: I get 18.8Mpps with your
> patch instead of 14.8Mpps without! Great work!
> 
> Reviewed-by: Maxime Coquelin
> 

Thank you very much for taking the time to review and test this patch! :-)

Best regards,
Tiwei Bie
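
[Editor's note: the sketch below is only a minimal illustration of the adaptive
small-copy batching idea described in the patch cover letter above. All names,
the size threshold, and the batch depth are hypothetical and are not taken from
the actual DPDK vhost code, which uses its own per-virtqueue batch structures.]

    /*
     * Illustrative sketch: defer small copies into a batch and flush them
     * back-to-back, so consecutive memory LOADs can overlap in the pipeline.
     * Names and constants are hypothetical, not the real vhost implementation.
     */
    #include <stdint.h>
    #include <string.h>

    #define SMALL_COPY_THRESHOLD 256   /* copies below this size get batched */
    #define MAX_BATCHED_COPIES   64    /* flush once the batch is full */

    struct batched_copy {
    	void       *dst;
    	const void *src;
    	size_t      len;
    };

    struct copy_batch {
    	struct batched_copy elems[MAX_BATCHED_COPIES];
    	unsigned int        nr;
    };

    /* Execute all deferred copies in one tight loop. */
    static inline void
    copy_batch_flush(struct copy_batch *b)
    {
    	for (unsigned int i = 0; i < b->nr; i++)
    		memcpy(b->elems[i].dst, b->elems[i].src, b->elems[i].len);
    	b->nr = 0;
    }

    /* Large copies go out immediately; small ones are deferred into the batch. */
    static inline void
    copy_adaptive(struct copy_batch *b, void *dst, const void *src, size_t len)
    {
    	if (len >= SMALL_COPY_THRESHOLD) {
    		memcpy(dst, src, len);
    		return;
    	}
    	if (b->nr == MAX_BATCHED_COPIES)
    		copy_batch_flush(b);
    	b->elems[b->nr++] = (struct batched_copy){ dst, src, len };
    }

The caller would invoke copy_adaptive() per descriptor while filling a burst and
call copy_batch_flush() once at the end of the burst, which is the point where
the batched LOADs are issued close together.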