From: Rusty Russell <rusty@rustcorp.com.au>
To: Avi Kivity <avi@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>
Cc: markmc@redhat.com, virtualization@lists.linux-foundation.org,
Sasha Levin <levinsasha928@gmail.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] virtio-ring: Use threshold for switching to indirect descriptors
Date: Mon, 05 Dec 2011 10:40:01 +1030
Message-ID: <87bornri92.fsf@rustcorp.com.au>
In-Reply-To: <4EDB8EEB.4070309@redhat.com>
On Sun, 04 Dec 2011 17:16:59 +0200, Avi Kivity <avi@redhat.com> wrote:
> On 12/04/2011 05:11 PM, Michael S. Tsirkin wrote:
> > > There's also the used ring, but that's a
> > > mistake if you have out of order completion. We should have used copying.
> >
> > Seems unrelated... unless you want used to be written into
> > descriptor ring itself?
>
> The avail/used rings are in addition to the regular ring, no? If you
> copy descriptors, then it goes away.
There were two ideas which drove the current design:
1) The Van Jacobson-style "no two writers to same cacheline makes rings
fast" idea. Empirically, this doesn't show any winnage.
2) Allowing a generic inter-guest copy mechanism, so we could have
genuinely untrusted driver domains. Yet no one ever did this so it's
hardly a killer feature :(
So if we're going to revisit and drop those requirements, I'd say:
1) Shared device/driver rings like Xen. Xen uses device-specific ring
contents, I'd be tempted to stick to our pre-headers, and a 'u64
addr; u64 len_and_flags; u64 cookie;' generic style. Then use
the same ring for responses. That's a slight space-win, since
we're 24 bytes vs 26 bytes now.
2) Stick with physically-contiguous rings, but use them of size (2^n)-1.
Makes the indexing harder, but that -1 lets us stash the indices in
the first entry and makes the ring a nice 2^n size.
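The two points above can be made concrete with a small C sketch. Only the three u64 fields come from the text; the header layout, the prod/cons names, RING_ORDER and the helper functions are assumptions made for illustration, not an agreed format. For reference, the 26 bytes of the current layout are a 16-byte descriptor plus a 2-byte avail entry plus an 8-byte used entry.

```c
#include <assert.h>
#include <stdint.h>

/* Generic 24-byte entry, using the three fields named in the mail. */
struct ring_entry {
	uint64_t addr;
	uint64_t len_and_flags;
	uint64_t cookie;
};
static_assert(sizeof(struct ring_entry) == 24, "entry must be 24 bytes");

/*
 * Assumed layout for the indices stashed in slot 0, padded so that the
 * header occupies exactly one entry-sized slot. Field names are hypothetical.
 */
struct ring_header {
	uint32_t prod;		/* next entry the producer will fill */
	uint32_t cons;		/* next entry the consumer will read */
	uint8_t pad[16];
};
static_assert(sizeof(struct ring_header) == sizeof(struct ring_entry),
	      "header must fit exactly one slot");

#define RING_ORDER	4u			/* hypothetical: n = 4 */
#define RING_SLOTS	(1u << RING_ORDER)	/* 2^n slots in memory */
#define RING_ENTRIES	(RING_SLOTS - 1u)	/* (2^n)-1 usable entries */

union ring_slot {
	struct ring_header hdr;
	struct ring_entry entry;
};

struct ring {
	union ring_slot slot[RING_SLOTS];	/* slot[0] is the header */
};

/*
 * Advance an entry index in [0, RING_ENTRIES). Because the entry count
 * is not a power of two, a simple mask does not work; this explicit
 * wrap is the "harder indexing" the text mentions.
 */
static inline uint32_t ring_next(uint32_t idx)
{
	return (idx + 1u == RING_ENTRIES) ? 0u : idx + 1u;
}

/* Entry index i lives in slot i + 1, since slot 0 holds the indices. */
static inline struct ring_entry *ring_entry_at(struct ring *r, uint32_t idx)
{
	return &r->slot[idx + 1u].entry;
}
```

With n = 4 this gives 15 usable entries in a 16-slot allocation. Whether free-running counters or wrapped indices are stored in the header is left open here; the sketch uses wrapped indices only to keep the non-power-of-two arithmetic visible.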
> > > 16kB worth of descriptors is 1024 entries. With 4kB buffers, that's 4MB
> > > worth of data, or 4 ms at 10GbE line speed. With 1500 byte buffers it's
> > > just 1.5 ms. In any case I think it's sufficient.
> >
> > Right. So I think that without indirect, we waste about 3 entries
> > per packet for virtio header and transport etc headers.
>
> That does suck. Are there issues in increasing the ring size? Or
> making it discontiguous?
Because the qemu implementation is broken. We can often put the virtio
header at the head of the packet. In practice, the qemu implementation
insists the header be a single descriptor.
(At least, it used to, perhaps it has now been fixed. We need a
VIRTIO_NET_F_I_NOW_CONFORM_TO_THE_DAMN_SPEC_SORRY_I_SUCK bit).
We currently use small rings: the guest can't negotiate so qemu has to
offer a lowest-common-denominator value. The new virtio-pci layout
fixes this, and lets the guest set the ring size.
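For a sense of scale on the ring-size question quoted above, here is a rough C sketch of the arithmetic. The function names are made up for illustration, and the one-data-entry-per-packet case in the comment is an assumption, not something stated in the thread.

```c
#include <stdint.h>

/*
 * Microseconds needed to transmit `entries` buffers of `buf_bytes` each
 * at `gbit_per_s`. 1 Gbit/s is 1e3 bits per microsecond.
 */
static inline double ring_drain_us(uint32_t entries, uint32_t buf_bytes,
				   double gbit_per_s)
{
	double bits = (double)entries * (double)buf_bytes * 8.0;

	return bits / (gbit_per_s * 1e3);
}

/*
 * Packets that fit in a ring when each packet costs `overhead_entries`
 * for headers plus `data_entries` for payload (no indirect descriptors).
 */
static inline uint32_t ring_packets(uint32_t entries,
				    uint32_t overhead_entries,
				    uint32_t data_entries)
{
	return entries / (overhead_entries + data_entries);
}
```

Under these assumptions, 1024 entries of 4096 bytes take a little under 3.4 ms at 10 Gbit/s and 1024 entries of 1500 bytes about 1.2 ms, in line with the rounded figures quoted above. If roughly 3 entries per packet go to headers and each packet needs one more entry for data, ring_packets(1024, 3, 1) leaves room for only 256 packets.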
> Can you take a peek at how Xen manages its rings? They have the same
> problems we do.
Yes, I made some mistakes, but I did steal from them in the first
place...
Cheers,
Rusty.