From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH] vhost: batch used descriptors chains
 write-back with packed ring
Date: Wed, 19 Dec 2018 11:10:44 -0500
Message-ID: <20181219110010-mutt-send-email-mst@kernel.org>
References: <CGME20181212082505epcas4p36ea3086cfd223aae70a925e946fced48@epcas4p3.samsung.com>
 <20181212082403.12002-1-maxime.coquelin@redhat.com>
 <e5d2fc9b-5d3e-95fc-3365-75a210770323@samsung.com>
 <e91c5b6f-de71-08fd-73a0-c1bde731b26e@redhat.com>
 <20181212134459-mutt-send-email-mst@kernel.org>
 <003a5fbb-386f-2cf6-e83d-abd7b9fc2938@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Ilya Maximets <i.maximets@samsung.com>, dev@dpdk.org,
 tiwei.bie@intel.com, zhihong.wang@intel.com, jfreimann@redhat.com,
 Jason Wang <jasowang@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Return-path: <dev-bounces@dpdk.org>
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id BAD1B1B93B
 for <dev@dpdk.org>; Wed, 19 Dec 2018 17:10:51 +0100 (CET)
Content-Disposition: inline
In-Reply-To: <003a5fbb-386f-2cf6-e83d-abd7b9fc2938@redhat.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On Wed, Dec 19, 2018 at 10:16:24AM +0100, Maxime Coquelin wrote:
> 
> 
> On 12/12/18 7:53 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 12, 2018 at 05:34:31PM +0100, Maxime Coquelin wrote:
> > > Hi Ilya,
> > > 
> > > On 12/12/18 4:23 PM, Ilya Maximets wrote:
> > > > On 12.12.2018 11:24, Maxime Coquelin wrote:
> > > > > Instead of writing back descriptors chains in order, let's
> > > > > write the first chain flags last in order to improve batching.
> > > > > 
> > > > > With Kernel's pktgen benchmark, ~3% performance gain is measured.
> > > > > 
> > > > > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > > > > ---
> > > > >    lib/librte_vhost/virtio_net.c | 39 +++++++++++++++++++++--------------
> > > > >    1 file changed, 24 insertions(+), 15 deletions(-)
> > > > > 
> > > > 
> > > > Hi.
> > > > I made some rough testing on my ARMv8 system with this patch and v1 of it.
> > > > Here is the performance difference with current master:
> > > >       v1: +1.1 %
> > > >       v2: -3.6 %
> > > > 
> > > > So, write barriers are quiet heavy in practice.
> > > 
> > > Thanks for testing it on ARM. Indeed, SMP WMB is heavier on ARM.
> > 
> > Besides your ideas for improving packed rings, maybe we should switch to
> > load_acquite/store_release?
> > 
> > See
> > 	virtio: use smp_load_acquire/smp_store_release
> > 
> > which worked fine but as I only tested on x86 did not result in any gains.
> > 
> 
> Thanks for the pointer.
> We'll look into it for v19.05, as -rc1 for v19.02 is planned for end of
> week, so it will be too late to introduce such changes.
> 
> Regards,
> Maxime

That's not the only option BTW.  For loads, another option it to work
the value into an indirect dependency which does not need
a barrier.

For example:

#define OPTIMIZER_HIDE_VAR(var)                                         \
        __asm__ ("" : "=r" (var) : "0" (var))

unsigned empty = last_used == idx->used;
if (!empty) {
	OPTIMIZER_HIDE_VAR(empty);
	desc = used->ring[last_used + empty];
}


See linux for definitions of OPTIMIZER_HIDE_VAR.

One side effect of this is that this also blocks code speculation.
which can be a good or a bad thing for performance,
but can be a good thing for security.


-- 
MST