* [PATCH] kvm tools: Support virtio indirect buffers
From: Sasha Levin @ 2011-11-28 17:54 UTC
To: penberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Sasha Levin
Indirect buffers are ring descriptors which point to a table of further
descriptors.

This can be used to increase the effective ring capacity, which helps the
guest batch large requests - very useful for blk devices.

This patch also enables indirect buffers for virtio-net and virtio-blk.

The patch is based on lguest's code, which does the same.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
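Some context for reviewers: the descriptor layout itself is the standard one
from <linux/virtio_ring.h>; the patch only changes how we walk it. The
annotated example below is illustrative, not part of the patch:

struct vring_desc {
	__u64 addr;	/* guest-physical address of the buffer */
	__u32 len;	/* buffer length in bytes */
	__u16 flags;	/* NEXT / WRITE / INDIRECT */
	__u16 next;	/* index of the next descriptor in the chain */
};

/*
 * With VIRTIO_RING_F_INDIRECT_DESC negotiated, a single slot in the
 * main ring can describe a whole request:
 *
 *	ring[i].addr  = guest-physical address of a vring_desc table
 *	ring[i].len   = n * sizeof(struct vring_desc)
 *	ring[i].flags = VRING_DESC_F_INDIRECT
 *
 * which is exactly what virt_queue__get_head_iov() below unpacks:
 * max becomes n and desc is re-pointed at the indirect table.
 */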
tools/kvm/virtio/blk.c | 3 ++-
tools/kvm/virtio/core.c | 47 ++++++++++++++++++++++++++++++++++++++---------
tools/kvm/virtio/net.c | 3 ++-
3 files changed, 42 insertions(+), 11 deletions(-)
diff --git a/tools/kvm/virtio/blk.c b/tools/kvm/virtio/blk.c
index 8c6f90b..d1a0197 100644
--- a/tools/kvm/virtio/blk.c
+++ b/tools/kvm/virtio/blk.c
@@ -148,7 +148,8 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
{
return 1UL << VIRTIO_BLK_F_SEG_MAX
| 1UL << VIRTIO_BLK_F_FLUSH
- | 1UL << VIRTIO_RING_F_EVENT_IDX;
+ | 1UL << VIRTIO_RING_F_EVENT_IDX
+ | 1UL << VIRTIO_RING_F_INDIRECT_DESC;
}
static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
diff --git a/tools/kvm/virtio/core.c b/tools/kvm/virtio/core.c
index a6f180e..fe9d588 100644
--- a/tools/kvm/virtio/core.c
+++ b/tools/kvm/virtio/core.c
@@ -33,27 +33,56 @@ struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32
return used_elem;
}
+/*
+ * Each buffer in the virtqueues is actually a chain of descriptors. This
+ * function returns the next descriptor in the chain, or vq->vring.num if we're
+ * at the end.
+ */
+static unsigned next_desc(struct vring_desc *desc,
+ unsigned int i, unsigned int max)
+{
+ unsigned int next;
+
+ /* If this descriptor says it doesn't chain, we're done. */
+ if (!(desc[i].flags & VRING_DESC_F_NEXT))
+ return max;
+
+ /* Check they're not leading us off end of descriptors. */
+ next = desc[i].next;
+ /* Make sure compiler knows to grab that: we don't want it changing! */
+ wmb();
+
+ return next;
+}
+
u16 virt_queue__get_head_iov(struct virt_queue *vq, struct iovec iov[], u16 *out, u16 *in, u16 head, struct kvm *kvm)
{
struct vring_desc *desc;
u16 idx;
+ u16 max;
idx = head;
*out = *in = 0;
+ max = vq->vring.num;
+ desc = vq->vring.desc;
+
+ if (desc[idx].flags & VRING_DESC_F_INDIRECT) {
+
+ max = desc[idx].len / sizeof(struct vring_desc);
+ desc = guest_flat_to_host(kvm, desc[idx].addr);
+ idx = 0;
+ }
do {
- desc = virt_queue__get_desc(vq, idx);
- iov[*out + *in].iov_base = guest_flat_to_host(kvm, desc->addr);
- iov[*out + *in].iov_len = desc->len;
- if (desc->flags & VRING_DESC_F_WRITE)
+ /* Grab the first descriptor, and check it's OK. */
+ iov[*out + *in].iov_len = desc[idx].len;
+ iov[*out + *in].iov_base = guest_flat_to_host(kvm, desc[idx].addr);
+ /* If this is an input descriptor, increment that count. */
+ if (desc[idx].flags & VRING_DESC_F_WRITE)
(*in)++;
else
(*out)++;
- if (desc->flags & VRING_DESC_F_NEXT)
- idx = desc->next;
- else
- break;
- } while (1);
+ } while ((idx = next_desc(desc, idx, max)) != max);
return head;
}
diff --git a/tools/kvm/virtio/net.c b/tools/kvm/virtio/net.c
index ce34555..3fb5054 100644
--- a/tools/kvm/virtio/net.c
+++ b/tools/kvm/virtio/net.c
@@ -318,7 +318,8 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
| 1UL << VIRTIO_NET_F_GUEST_UFO
| 1UL << VIRTIO_NET_F_GUEST_TSO4
| 1UL << VIRTIO_NET_F_GUEST_TSO6
- | 1UL << VIRTIO_RING_F_EVENT_IDX;
+ | 1UL << VIRTIO_RING_F_EVENT_IDX
+ | 1UL << VIRTIO_RING_F_INDIRECT_DESC;
}
static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
--
1.7.8.rc3
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Pekka Enberg @ 2011-11-28 18:49 UTC
To: Sasha Levin; +Cc: kvm, mingo, asias.hejun, gorcunov
On Mon, Nov 28, 2011 at 7:54 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Indirect buffers are ring descriptors which point to a table of further
> descriptors.
>
> This can be used to increase the effective ring capacity, which helps the
> guest batch large requests - very useful for blk devices.
>
> This patch also enables indirect buffers for virtio-net and virtio-blk.
>
> The patch is based on lguest's code, which does the same.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
In what exact way is it useful? Improved throughput? Will this have a
negative impact on virtio block or virtio net latency?
Pekka
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Sasha Levin @ 2011-11-28 20:17 UTC
To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov
On Mon, 2011-11-28 at 20:49 +0200, Pekka Enberg wrote:
> On Mon, Nov 28, 2011 at 7:54 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > Indirect buffers are ring descriptors which point to a table of further
> > descriptors.
> >
> > This can be used to increase the effective ring capacity, which helps the
> > guest batch large requests - very useful for blk devices.
> >
> > This patch also enables indirect buffers for virtio-net and virtio-blk.
> >
> > The patch is based on lguest's code, which does the same.
> >
> > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>
> In what exact way is it useful? Improved throughput? Will this have a
> negative impact on virtio block or virtio net latency?
The total size of requests is limited by the size of the virtio ring.
This makes it hard to squeeze large requests (like the ones you usually
see with blk devices) into the ring. This patch simply makes each entry
in the virtio ring point to another descriptor list, which allows
squeezing many more requests into a single virtio ring.
It shouldn't hurt latency.
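To make the capacity math concrete, here's roughly what the guest does when
it queues an n-segment request through an indirect table (a sketch of what
the kernel's vring_add_indirect() boils down to - the seg_*() helpers are
made up for illustration):

struct vring_desc *table;
unsigned int i;

/* One allocation holds the descriptors for all n segments. */
table = kmalloc(n * sizeof(*table), GFP_ATOMIC);

for (i = 0; i < n; i++) {
	table[i].addr  = seg_addr(i);	/* hypothetical helper */
	table[i].len   = seg_len(i);	/* hypothetical helper */
	table[i].flags = VRING_DESC_F_NEXT |
			 (seg_is_in(i) ? VRING_DESC_F_WRITE : 0);
	table[i].next  = i + 1;
}
/* Terminate the chain inside the table. */
table[n - 1].flags &= ~VRING_DESC_F_NEXT;

/* The main ring spends exactly one slot, whatever n is. */
ring[head].addr  = virt_to_phys(table);
ring[head].len   = n * sizeof(struct vring_desc);
ring[head].flags = VRING_DESC_F_INDIRECT;

So a 128-entry ring that could previously hold 128 descriptors in total can
now hold up to 128 complete requests.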
I tried getting benchmarks with it, but the results I get from fio are
all over the place and I can't seem to get a steady result (bad sign for
my poor spindle disk).
--
Sasha.
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Sasha Levin @ 2011-11-28 21:05 UTC
To: Pekka Enberg; +Cc: kvm, mingo, asias.hejun, gorcunov, Rusty Russell
On Mon, 2011-11-28 at 22:17 +0200, Sasha Levin wrote:
> On Mon, 2011-11-28 at 20:49 +0200, Pekka Enberg wrote:
> > On Mon, Nov 28, 2011 at 7:54 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > > Indirect buffers are ring descriptors which point to a table of further
> > > descriptors.
> > >
> > > This can be used to increase the effective ring capacity, which helps the
> > > guest batch large requests - very useful for blk devices.
> > >
> > > This patch also enables indirect buffers for virtio-net and virtio-blk.
> > >
> > > The patch is based on lguest's code, which does the same.
> > >
> > > Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> >
> > In what exact way is it useful? Improved throughput? Will this have a
> > negative impact on virtio block or virtio net latency?
>
> The total size of requests is limited by the size of the virtio ring.
> This makes it hard to squeeze large requests (like the ones you usually
> see with blk devices) into the ring. This patch simply makes each entry
> in the virtio ring point to another descriptor list, which allows
> squeezing many more requests into a single virtio ring.
>
> It shouldn't hurt latency.
>
> I tried getting benchmarks with it, but the results I get from fio are
> all over the place and I can't seem to get a steady result (bad sign for
> my poor spindle disk).
btw, on an unrelated subject, I think that with this patch we've fully
covered the virtio spec, and as far as I know it's the first userspace
implementation which covers the entire spec :)
--
Sasha.
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Cyrill Gorcunov @ 2011-11-29 6:31 UTC
To: Sasha Levin; +Cc: penberg, kvm, mingo, asias.hejun
On Mon, Nov 28, 2011 at 07:54:27PM +0200, Sasha Levin wrote:
>
> +/*
> + * Each buffer in the virtqueues is actually a chain of descriptors. This
> + * function returns the next descriptor in the chain, or vq->vring.num if we're
> + * at the end.
> + */
> +static unsigned next_desc(struct vring_desc *desc,
> + unsigned int i, unsigned int max)
> +{
> + unsigned int next;
> +
> + /* If this descriptor says it doesn't chain, we're done. */
> + if (!(desc[i].flags & VRING_DESC_F_NEXT))
> + return max;
> +
> + /* Check they're not leading us off end of descriptors. */
> + next = desc[i].next;
> + /* Make sure compiler knows to grab that: we don't want it changing! */
> + wmb();
> +
> + return next;
> +}
> +
Hi Sasha, where's the rmb() then? Or maybe you wanted plain barrier() here?
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Sasha Levin @ 2011-11-29 7:45 UTC
To: Cyrill Gorcunov; +Cc: penberg, kvm, mingo, asias.hejun
On Tue, 2011-11-29 at 10:31 +0400, Cyrill Gorcunov wrote:
> On Mon, Nov 28, 2011 at 07:54:27PM +0200, Sasha Levin wrote:
> >
> > +/*
> > + * Each buffer in the virtqueues is actually a chain of descriptors. This
> > + * function returns the next descriptor in the chain, or vq->vring.num if we're
> > + * at the end.
> > + */
> > +static unsigned next_desc(struct vring_desc *desc,
> > + unsigned int i, unsigned int max)
> > +{
> > + unsigned int next;
> > +
> > + /* If this descriptor says it doesn't chain, we're done. */
> > + if (!(desc[i].flags & VRING_DESC_F_NEXT))
> > + return max;
> > +
> > + /* Check they're not leading us off end of descriptors. */
> > + next = desc[i].next;
> > + /* Make sure compiler knows to grab that: we don't want it changing! */
> > + wmb();
> > +
> > + return next;
> > +}
> > +
>
> Hi Sasha, where's the rmb() then? Or maybe you wanted plain barrier() here?
On the kernel side.
There's an mb() there which happens during the kick.
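For reference, the guest side of the kick looks roughly like this
(paraphrasing drivers/virtio/virtio_ring.c from memory, not a verbatim
quote):

/* Descriptors and avail ring entries must be visible to the host
 * before the avail index update exposes them. */
virtio_wmb();
vq->vring.avail->idx += vq->num_added;

/* The index update must be visible before we read the flags to see
 * whether the host wants to be notified at all. */
virtio_mb();
if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
	vq->notify(&vq->vq);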
--
Sasha.
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Pekka Enberg @ 2011-11-29 13:01 UTC
To: Sasha Levin; +Cc: Cyrill Gorcunov, kvm, mingo, asias.hejun
On Tue, 29 Nov 2011, Sasha Levin wrote:
>>> +/*
>>> + * Each buffer in the virtqueues is actually a chain of descriptors. This
>>> + * function returns the next descriptor in the chain, or vq->vring.num if we're
>>> + * at the end.
>>> + */
>>> +static unsigned next_desc(struct vring_desc *desc,
>>> + unsigned int i, unsigned int max)
>>> +{
>>> + unsigned int next;
>>> +
>>> + /* If this descriptor says it doesn't chain, we're done. */
>>> + if (!(desc[i].flags & VRING_DESC_F_NEXT))
>>> + return max;
>>> +
>>> + /* Check they're not leading us off end of descriptors. */
>>> + next = desc[i].next;
>>> + /* Make sure compiler knows to grab that: we don't want it changing! */
>>> + wmb();
>>> +
>>> + return next;
>>> +}
>>> +
>>
>> Hi Sasha, where's the rmb() then? Or maybe you wanted plain barrier() here?
>
> On the kernel side.
> There's an mb() there which happens during the kick.
I guess we need to improve the comment in next_desc()?
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Cyrill Gorcunov @ 2011-11-29 13:06 UTC
To: Pekka Enberg; +Cc: Sasha Levin, kvm, mingo, asias.hejun
On Tue, Nov 29, 2011 at 03:01:59PM +0200, Pekka Enberg wrote:
> >>
> >>Hi Sasha, where's the rmb() then? Or maybe you wanted plain barrier() here?
> >
> >On the kernel side.
> >There's an mb() there which happens during the kick.
>
> I guess we need to improve the comment in next_desc()?
>
The kernel's code has pretty good aliases for virtio barriers, I think:
#ifdef CONFIG_SMP
/* Where possible, use SMP barriers which are more lightweight than mandatory
* barriers, because mandatory barriers control MMIO effects on accesses
* through relaxed memory I/O windows (which virtio does not use). */
#define virtio_mb() smp_mb()
#define virtio_rmb() smp_rmb()
#define virtio_wmb() smp_wmb()
#else
/* We must force memory ordering even if guest is UP since host could be
* running on another CPU, but SMP barriers are defined to barrier() in that
* configuration. So fall back to mandatory barriers instead. */
#define virtio_mb() mb()
#define virtio_rmb() rmb()
#define virtio_wmb() wmb()
#endif
Maybe we could use something similar?
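Something along these lines could work for us (just a sketch - the header
name is made up, and it assumes x86 only, where load-load and store-store
ordering comes for free and only the full barrier needs a real fence):

/* include/kvm/virtio-barrier.h -- hypothetical */
#ifndef KVM__VIRTIO_BARRIER_H
#define KVM__VIRTIO_BARRIER_H

/* Guest and host share one cache-coherent machine, so SMP-strength
 * ordering is enough; no mandatory MMIO barriers needed. */
#define virtio_mb()	__sync_synchronize()		/* mfence on x86 */
#define virtio_rmb()	asm volatile("" ::: "memory")	/* compiler barrier */
#define virtio_wmb()	asm volatile("" ::: "memory")	/* compiler barrier */

#endif /* KVM__VIRTIO_BARRIER_H */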
* Re: [PATCH] kvm tools: Support virtio indirect buffers
From: Rusty Russell @ 2011-11-30 5:20 UTC
To: Sasha Levin, Pekka Enberg; +Cc: kvm
On Mon, 28 Nov 2011 23:05:21 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> On Mon, 2011-11-28 at 22:17 +0200, Sasha Levin wrote:
> btw, on an unrelated subject, I think that with this patch we've fully
> covered the virtio spec, and as far as I know it's the first userspace
> implementation which covers the entire spec :)
BTW, why did you bother implementing virtio-mmio? It seems like
gratuitous bloat. I hope you're not going the same way as qemu :(
Thanks,
Rusty.