From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39850)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mst@redhat.com>) id 1XbqCi-0001ZQ-Rd
	for qemu-devel@nongnu.org; Wed, 08 Oct 2014 08:18:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mst@redhat.com>) id 1XbqCc-0006wQ-Dq
	for qemu-devel@nongnu.org; Wed, 08 Oct 2014 08:18:52 -0400
Received: from mx1.redhat.com ([209.132.183.28]:18233)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mst@redhat.com>) id 1XbqCc-0006w4-5k
	for qemu-devel@nongnu.org; Wed, 08 Oct 2014 08:18:46 -0400
Date: Wed, 8 Oct 2014 15:22:13 +0300
From: "Michael S. Tsirkin" <mst@redhat.com>
Message-ID: <20141008122213.GA4679@redhat.com>
References: <542A6B70.7090607@huawei.com> <20140930093356.GA3673@redhat.com>
	<5434EB0B.8010800@cloudius-systems.com>
	<20141008091547.GB3872@redhat.com>
	<54350919.8050401@cloudius-systems.com>
	<20141008101443.GA4291@redhat.com>
	<543513E5.8010507@cloudius-systems.com>
	<20141008105515.GA4429@redhat.com>
	<54351901.6050706@cloudius-systems.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <54351901.6050706@cloudius-systems.com>
Subject: Re: [Qemu-devel] [QA-virtio]:Why vring size is limited to 1024?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@cloudius-systems.com>
Cc: Jason Wang <jasowang@redhat.com>, qemu-devel@nongnu.org, liuyongan@huawei.com, qinchuanyu@huawei.com, "Zhangjie (HZ)" <zhangjie14@huawei.com>, akong@redhat.com

On Wed, Oct 08, 2014 at 01:59:13PM +0300, Avi Kivity wrote:
> 
> On 10/08/2014 01:55 PM, Michael S. Tsirkin wrote:
> >>>>Even more useful is getting rid of the desc array and instead passing descs
> >>>>inline in avail and used.
> >>>You expect this to improve performance?
> >>>Quite possibly but this will have to be demonstrated.
> >>>
> >>The top vhost function in small packet workloads is vhost_get_vq_desc, and
> >>the top instruction within that (50%) is the one that reads the first 8
> >>bytes of desc.  It's a guaranteed cache line miss (and again on the guest
> >>side when it's time to reuse).
> >OK so basically what you are pointing out is that we get 5 accesses:
> >read of available head, read of available ring, read of descriptor,
> >write of used ring, write of used ring head.
> 
> Right.  And only read of descriptor is not amortized.
> 
> >If processing is in-order, we could build a much simpler design, with a
> >valid bit in the descriptor, cleared by host as descriptors are
> >consumed.
> >
> >Basically get rid of both used and available ring.
> 
> That only works if you don't allow reordering, which is never the case for
> block, and not the case for zero-copy net.  It also has writers on both side
> of the ring.
> 
> The right design is to keep avail and used, but instead of making them rings
> of pointers to descs, make them rings of descs.
> 
> The host reads descs from avail, processes them, then writes them back on
> used (possibly out-of-order).  The guest writes descs to avail and reads
> them back from used.
> 
> You'll probably have to add a 64-bit cookie to desc so you can complete
> without an additional lookup.

My old presentation from 2012 or so suggested something like this.
I don't think we need a 64-bit cookie, though: a small 16-bit one
should be enough.
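
A rough sketch of what "rings of descs" with a small cookie might look
like, to make the proposal above concrete. This is only an illustration
under stated assumptions: struct inline_desc, struct desc_ring, RING_SIZE,
ring_init(), ring_push() and ring_pop() are hypothetical names, not taken
from qemu, vhost or the virtio spec, and memory barriers and notifications
are left out. The descriptor is laid out as 16 bytes so that four fit in a
64-byte cache line (the cache line size is an assumption too).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical inline descriptor: 8 + 4 + 2 + 2 = 16 bytes. */
struct inline_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* per-descriptor flags (meaning left undefined here) */
    uint16_t cookie; /* chosen by the guest, echoed back on completion */
};

/* One ring of descriptors; the same layout serves for avail and used. */
struct desc_ring {
    struct inline_desc slots[RING_SIZE];
    uint16_t head; /* free-running producer index */
    uint16_t tail; /* free-running consumer index */
};

static void ring_init(struct desc_ring *r)
{
    /* Masking with a free-running 16-bit index needs a power-of-two size. */
    assert((RING_SIZE & (RING_SIZE - 1u)) == 0);
    r->head = 0;
    r->tail = 0;
}

/* Producer side: copy the whole descriptor into the ring. */
static bool ring_push(struct desc_ring *r, const struct inline_desc *d)
{
    if ((uint16_t)(r->head - r->tail) == RING_SIZE) {
        return false; /* ring is full */
    }
    r->slots[r->head % RING_SIZE] = *d;
    /* A real implementation would need a write barrier before this. */
    r->head++;
    return true;
}

/* Consumer side: read the next descriptor directly, with no separate
 * descriptor table to look up. */
static bool ring_pop(struct desc_ring *r, struct inline_desc *out)
{
    if (r->head == r->tail) {
        return false; /* ring is empty */
    }
    *out = r->slots[r->tail % RING_SIZE];
    r->tail++;
    return true;
}
```

In this sketch the guest would ring_push() onto avail and ring_pop() from
used, and the host does the reverse. Because each used entry carries the
cookie, the host can complete out of order and the guest can still match
the completion to its request (for example by indexing its own request
table with the cookie) without an additional lookup.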

> >
> >Sounds good in theory.
> >
> >>Inline descriptors will amortize the cache miss over 4 descriptors, and will
> >>allow the hardware to prefetch, since the descriptors are linear in memory.
> >If descriptors are used in order (as they are with current qemu)
> >then aren't they amortized already?
> >