From: Avi Kivity <avi@cloudius-systems.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
qemu-devel@nongnu.org, liuyongan@huawei.com,
qinchuanyu@huawei.com, "Zhangjie (HZ)" <zhangjie14@huawei.com>,
akong@redhat.com
Subject: Re: [Qemu-devel] [QA-virtio]:Why vring size is limited to 1024?
Date: Wed, 08 Oct 2014 13:59:13 +0300
Message-ID: <54351901.6050706@cloudius-systems.com>
In-Reply-To: <20141008105515.GA4429@redhat.com>
On 10/08/2014 01:55 PM, Michael S. Tsirkin wrote:
>>>> Even more useful is getting rid of the desc array and instead passing descs
>>>> inline in avail and used.
>>> You expect this to improve performance?
>>> Quite possibly but this will have to be demonstrated.
>>>
>> The top vhost function in small packet workloads is vhost_get_vq_desc, and
>> the top instruction within that (50%) is the one that reads the first 8
>> bytes of desc. It's a guaranteed cache line miss (and again on the guest
>> side when it's time to reuse).
> OK so basically what you are pointing out is that we get 5 accesses:
> read of available head, read of available ring, read of descriptor,
> write of used ring, write of used ring head.
Right. And only the descriptor read is not amortized.
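
For reference, a rough sketch of the split-ring layout we're talking about,
with the five accesses marked. Struct names follow the virtio spec; this is
an illustration, not the exact header.

#include <stdint.h>

struct vring_desc {             /* 16 bytes; four fit in a 64-byte cache line */
        uint64_t addr;          /* the "first 8 bytes" read by vhost_get_vq_desc */
        uint32_t len;
        uint16_t flags;
        uint16_t next;
};

struct vring_avail {
        uint16_t flags;
        uint16_t idx;           /* (1) read of available head */
        uint16_t ring[];        /* (2) read of available ring -> index into desc[] */
};

struct vring_used_elem {
        uint32_t id;
        uint32_t len;
};

struct vring_used {
        uint16_t flags;
        uint16_t idx;           /* (5) write of used ring head */
        struct vring_used_elem ring[];  /* (4) write of used ring */
};

/* (3) is the read of desc[avail->ring[i]]: consecutive avail entries may
 * point at scattered desc slots, so that read is the one that is not
 * amortized across neighbouring entries. */
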
> If processing is in-order, we could build a much simpler design, with a
> valid bit in the descriptor, cleared by host as descriptors are
> consumed.
>
> Basically get rid of both used and available ring.
That only works if you don't allow reordering, which is never the case
for block, and not the case for zero-copy net. It also has writers on
both sides of the ring.
The right design is to keep avail and used, but instead of making them
rings of pointers to descs, make them rings of descs.
The host reads descs from avail, processes them, then writes them back
on used (possibly out-of-order). The guest writes descs to avail and
reads them back from used.
You'll probably have to add a 64-bit cookie to desc so you can complete
without an additional lookup.
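
Roughly something like this (a hypothetical layout, purely to illustrate
the idea; the field names are made up):

#include <stdint.h>

/* Hypothetical entry for the proposed inline-descriptor rings: avail and
 * used become rings of these instead of rings of 16-bit indices, so one
 * cache line holds several complete descriptors and prefetches well. */
struct inline_desc {
        uint64_t addr;          /* guest-physical buffer address */
        uint32_t len;
        uint16_t flags;
        uint16_t pad;
        uint64_t cookie;        /* opaque driver token; the host copies it
                                 * back into used (possibly out of order) so
                                 * the guest can complete without a lookup */
};
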
>
> Sounds good in theory.
>
>> Inline descriptors will amortize the cache miss over 4 descriptors, and will
>> allow the hardware to prefetch, since the descriptors are linear in memory.
> If descriptors are used in order (as they are with current qemu)
> then aren't they amortized already?
>
Thread overview: 18+ messages
2014-09-30 8:36 [Qemu-devel] [QA-virtio]:Why vring size is limited to 1024? Zhangjie (HZ)
2014-09-30 9:33 ` Michael S. Tsirkin
2014-10-08 7:17 ` Zhangjie (HZ)
2014-10-08 7:37 ` Michael S. Tsirkin
2014-10-08 8:07 ` Zhangjie (HZ)
2014-10-08 9:13 ` Michael S. Tsirkin
2014-10-08 7:43 ` Avi Kivity
2014-10-08 8:26 ` Zhangjie (HZ)
2014-10-08 9:15 ` Michael S. Tsirkin
2014-10-08 9:51 ` Avi Kivity
2014-10-08 10:14 ` Michael S. Tsirkin
2014-10-08 10:37 ` Avi Kivity
2014-10-08 10:55 ` Michael S. Tsirkin
2014-10-08 10:59 ` Avi Kivity [this message]
2014-10-08 12:22 ` Michael S. Tsirkin
2014-10-08 12:28 ` Avi Kivity
2014-10-08 12:36 ` Avi Kivity
2014-10-08 11:00 ` Avi Kivity