* TX from KVM guest virtio_net to vhost issues
@ 2011-03-09 21:46 Shirley Ma
2011-03-11 6:19 ` Rusty Russell
0 siblings, 1 reply; 10+ messages in thread
From: Shirley Ma @ 2011-03-09 21:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Tom Lendacky, Rusty Russell, Krishna Kumar2, David Miller, kvm,
netdev, steved
We have had many performance discussions about virtio_net and vhost
communication. I think it's better to build a common understanding of
the code first; then we can find the right directions to improve it. We
also need to collect more statistics on both virtio and vhost.

Let's look at TX first: from virtio_net (guest) to vhost (host), the
send vq is shared between guest virtio_net and host vhost; it uses
memory barriers to sync changes.

At the start:

The guest virtio_net TX send completion interrupt (for freeing used
skbs) is disabled. It is enabled only when the send vq overruns and the
guest needs to wait for vhost to consume more available skbs.

Host vhost notification is enabled in the beginning (for consuming
available skbs); it is disabled whenever the send vq is not empty. Once
the send vq is empty, vhost re-enables the notification.

In the guest's start_xmit(), it first frees used skbs, then posts
available skbs to vhost; ideally the guest never enables the TX send
completion interrupt to free used skbs, as long as vhost keeps posting
used skbs into the send vq.

vhost's handle_tx() is woken up by the guest whenever the send vq has
an skb to send; as long as the send vq is not empty, vhost exits
handle_tx() without enabling notification. Ideally, if the guest keeps
transmitting skbs into the send vq, the notification is never enabled.
I don't see issues with this implementation.

However, in our TCP_STREAM small-message-size test, we found that
somehow the guest couldn't see more used skbs to free, which caused
frequent TX send queue overruns.

In our TCP_RR small-message-size multiple-streams test, we found that
vhost couldn't see more xmit skbs in the send vq, so it enabled
notification too often.

What's the possible cause here in xmit? How are the guest and vhost
being scheduled? Is it possible for guest virtio_net to cooperate with
vhost for ideal performance, so that both keep pace with the send vq
without many notifications and exits?
Thanks
Shirley
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: TX from KVM guest virtio_net to vhost issues
2011-03-09 21:46 TX from KVM guest virtio_net to vhost issues Shirley Ma
@ 2011-03-11 6:19 ` Rusty Russell
2011-03-14 20:29 ` [PATCH 0/2] publish last used index (was Re: TX from KVM guest virtio_net to vhost issues) Michael S. Tsirkin
2011-03-17 0:20 ` TX from KVM guest virtio_net to vhost issues Shirley Ma
0 siblings, 2 replies; 10+ messages in thread
From: Rusty Russell @ 2011-03-11 6:19 UTC (permalink / raw)
To: Shirley Ma, Michael S. Tsirkin
Cc: Tom Lendacky, Krishna Kumar2, David Miller, kvm, netdev, steved
On Wed, 09 Mar 2011 13:46:36 -0800, Shirley Ma <mashirle@us.ibm.com> wrote:
> We have had many performance discussions about virtio_net and vhost
> communication. I think it's better to build a common understanding of
> the code first; then we can find the right directions to improve it. We
> also need to collect more statistics on both virtio and vhost.
>
> Let's look at TX first: from virtio_net (guest) to vhost (host), the
> send vq is shared between guest virtio_net and host vhost; it uses
> memory barriers to sync changes.
>
> At the start:
>
> The guest virtio_net TX send completion interrupt (for freeing used
> skbs) is disabled. It is enabled only when the send vq overruns and
> the guest needs to wait for vhost to consume more available skbs.
>
> Host vhost notification is enabled in the beginning (for consuming
> available skbs); it is disabled whenever the send vq is not empty.
> Once the send vq is empty, vhost re-enables the notification.
>
> In the guest's start_xmit(), it first frees used skbs, then posts
> available skbs to vhost; ideally the guest never enables the TX send
> completion interrupt to free used skbs, as long as vhost keeps posting
> used skbs into the send vq.
>
> vhost's handle_tx() is woken up by the guest whenever the send vq has
> an skb to send; as long as the send vq is not empty, vhost exits
> handle_tx() without enabling notification. Ideally, if the guest keeps
> transmitting skbs into the send vq, the notification is never enabled.
>
> I don't see issues with this implementation.
>
> However, in our TCP_STREAM small-message-size test, we found that
> somehow the guest couldn't see more used skbs to free, which caused
> frequent TX send queue overruns.
So it seems like the guest is outrunning the host?
> In our TCP_RR small-message-size multiple-streams test, we found that
> vhost couldn't see more xmit skbs in the send vq, so it enabled
> notification too often.
And here the host is outrunning the guest?
> What's the possible cause here in xmit? How are the guest and vhost
> being scheduled? Is it possible for guest virtio_net to cooperate with
> vhost for ideal performance, so that both keep pace with the send vq
> without many notifications and exits?
We tend to blame the scheduler for all kinds of things, until we find
the real cause :) Nailing things to different CPUs might help us
determine whether it really is the scheduler...
But if one side outruns the other, it does a lot of unnecessary work
notifying/interrupting it over and over again before the host/guest gets
a chance to shut notifications/interrupts off. Hence the last_used
publishing patch (Xen does this right, I screwed it up).
Long weekend here, and I'm otherwise committed. But if no one has
cleaned up that patch by early next week, I'll try to do so and see if
we can make a real difference.
Cheers,
Rusty.
* [PATCH 0/2] publish last used index (was Re: TX from KVM guest virtio_net to vhost issues)
2011-03-11 6:19 ` Rusty Russell
@ 2011-03-14 20:29 ` Michael S. Tsirkin
2011-03-14 20:30 ` [PATCH 1/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
2011-03-14 20:30 ` [PATCH 2/2] vhost-net: utilize PUBLISH_USED_IDX feature Michael S. Tsirkin
2011-03-17 0:20 ` TX from KVM guest virtio_net to vhost issues Shirley Ma
1 sibling, 2 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-03-14 20:29 UTC (permalink / raw)
To: Rusty Russell
Cc: Shirley Ma, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved, jasowang
On Fri, Mar 11, 2011 at 04:49:26PM +1030, Rusty Russell wrote:
> But if one side outruns the other, it does a lot of unnecessary work
> notifying/interrupting it over and over again before the host/guest gets
> a chance to shut notifications/interrupts off. Hence the last_used
> publishing patch (Xen does this right, I screwed it up).
>
> Long weekend here, and I'm otherwise committed. But if no one has
> cleaned up that patch by early next week, I'll try to do so and see if
> we can make a real difference.
>
> Cheers,
> Rusty.
I actually had the last_used publishing patch working. Here it is
again: it's the part that publishes the used index from the guest and
the part that uses it in the host to avoid spurious interrupts (this is
Tom's case, not Shirley's case), updated to the latest bits, and with
the code clarified a bit more.

Note: there's some tricky logic in the vhost part. I think I got it
right, but please review.

Warning: compile-tested only, completely untested at runtime. I intend
to add the bits that use the last available index in the guest to
reduce exits, and then test it all together.

Would appreciate review and/or testing.
Michael S. Tsirkin (2):
virtio: put last seen used index into ring itself
vhost-net: utilize PUBLISH_USED_IDX feature
drivers/vhost/vhost.c | 58 +++++++++++++++++++++++++++++++++++++----
drivers/vhost/vhost.h | 9 ++++++
drivers/virtio/virtio_ring.c | 25 +++++++++++------
include/linux/virtio_ring.h | 11 ++++++++
4 files changed, 88 insertions(+), 15 deletions(-)
--
1.7.3.2.91.g446ac
* [PATCH 1/2] virtio: put last seen used index into ring itself
2011-03-14 20:29 ` [PATCH 0/2] publish last used index (was Re: TX from KVM guest virtio_net to vhost issues) Michael S. Tsirkin
@ 2011-03-14 20:30 ` Michael S. Tsirkin
2011-03-15 0:21 ` Shirley Ma
2011-03-14 20:30 ` [PATCH 2/2] vhost-net: utilize PUBLISH_USED_IDX feature Michael S. Tsirkin
1 sibling, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-03-14 20:30 UTC (permalink / raw)
To: Rusty Russell
Cc: Shirley Ma, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved, jasowang
Generally, the Host end of the virtio ring doesn't need to see where
the Guest is up to in consuming the ring. However, to completely
understand what's going on from the outside, this information must be
exposed. For example, the Host can reduce the number of interrupts by
detecting that the Guest is currently handling previous buffers.
Fortunately, we have room to expand: the ring is always a whole number
of pages and there are hundreds of bytes of padding after the avail
ring and the used ring, whatever the number of descriptors (which must
be a power of 2).
We add a feature bit so the guest can tell the host that it's writing
out the current value there, if it wants to use that.
This is based on a patch by Rusty Russell, with the main difference
being that we dedicate a feature bit for the Guest to tell the Host it
is writing the used index. This way we don't need to force the Host to
publish the last available index until we have a use for it.

Memory barrier use has been rationalized (hopefully correctly).
To avoid the cost of an extra memory barrier on each update, we only
commit to the index being up to date when interrupts are enabled.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
drivers/virtio/virtio_ring.c | 25 ++++++++++++++++---------
include/linux/virtio_ring.h | 11 +++++++++++
2 files changed, 27 insertions(+), 9 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cc2f73e..a6fc537 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -89,9 +89,6 @@ struct vring_virtqueue
/* Number we've added since last sync. */
unsigned int num_added;
- /* Last used index we've seen. */
- u16 last_used_idx;
-
/* How to notify other side. FIXME: commonalize hcalls! */
void (*notify)(struct virtqueue *vq);
@@ -283,12 +280,13 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
static inline bool more_used(const struct vring_virtqueue *vq)
{
- return vq->last_used_idx != vq->vring.used->idx;
+ return vring_last_used(vq) != vq->vring.used->idx;
}
void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
{
struct vring_virtqueue *vq = to_vvq(_vq);
+ struct vring_used_elem *u;
void *ret;
unsigned int i;
@@ -308,9 +306,9 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* Only get used array entries after they have been exposed by host. */
virtio_rmb();
- i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id;
- *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len;
-
+ u = &vq->vring.used->ring[vring_last_used(vq) % vq->vring.num];
+ i = u->id;
+ *len = u->len;
if (unlikely(i >= vq->vring.num)) {
BAD_RING(vq, "id %u out of range\n", i);
return NULL;
@@ -323,7 +321,12 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* detach_buf clears data, so grab it now. */
ret = vq->data[i];
detach_buf(vq, i);
- vq->last_used_idx++;
+ (vring_last_used(vq))++;
+ /* If we expect an interrupt for the next entry, flush out
+ * last used index write. */
+ if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
+ virtio_mb();
+
END_USE(vq);
return ret;
}
@@ -346,6 +349,8 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
/* We optimistically turn back on interrupts, then check if there was
* more to do. */
vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
+ /* Besides flags write, this barrier also flushes out
+ * last available index write. */
virtio_mb();
if (unlikely(more_used(vq))) {
END_USE(vq);
@@ -429,7 +434,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
vq->vq.name = name;
vq->notify = notify;
vq->broken = false;
- vq->last_used_idx = 0;
+ vring_last_used(vq) = 0;
vq->num_added = 0;
list_add_tail(&vq->vq.list, &vdev->vqs);
#ifdef DEBUG
@@ -471,6 +476,8 @@ void vring_transport_features(struct virtio_device *vdev)
switch (i) {
case VIRTIO_RING_F_INDIRECT_DESC:
break;
+ case VIRTIO_RING_F_PUBLISH_USED:
+ break;
default:
/* We don't understand this bit. */
clear_bit(i, vdev->features);
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index e4d144b..4587ea2 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -29,6 +29,9 @@
/* We support indirect buffer descriptors */
#define VIRTIO_RING_F_INDIRECT_DESC 28
+/* The Guest publishes last-seen used index at the end of the avail ring. */
+#define VIRTIO_RING_F_PUBLISH_USED 29
+
/* Virtio ring descriptors: 16 bytes. These can chain together via "next". */
struct vring_desc {
/* Address (guest-physical). */
@@ -83,6 +86,7 @@ struct vring {
* __u16 avail_flags;
* __u16 avail_idx;
* __u16 available[num];
+ * __u16 last_used_idx;
*
* // Padding to the next align boundary.
* char pad[];
@@ -93,6 +97,10 @@ struct vring {
* struct vring_used_elem used[num];
* };
*/
+
+/* We publish the last-seen used index at the end of the available ring.
+ * It is at the end for backwards compatibility. */
+#define vring_last_used(vr) ((vr)->avail->ring[num])
static inline void vring_init(struct vring *vr, unsigned int num, void *p,
unsigned long align)
{
@@ -101,6 +109,9 @@ static inline void vring_init(struct vring *vr, unsigned int num, void *p,
vr->avail = p + num*sizeof(struct vring_desc);
vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + align-1)
& ~(align - 1));
+ /* Verify that last used index does not spill over the used ring. */
+ BUG_ON((void *)&vring_last_used(vr) +
+ sizeof vring_last_used(vr) > (void *)vr->used);
}
static inline unsigned vring_size(unsigned int num, unsigned long align)
--
1.7.3.2.91.g446ac
* [PATCH 2/2] vhost-net: utilize PUBLISH_USED_IDX feature
2011-03-14 20:29 ` [PATCH 0/2] publish last used index (was Re: TX from KVM guest virtio_net to vhost issues) Michael S. Tsirkin
2011-03-14 20:30 ` [PATCH 1/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
@ 2011-03-14 20:30 ` Michael S. Tsirkin
1 sibling, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-03-14 20:30 UTC (permalink / raw)
To: Rusty Russell
Cc: Shirley Ma, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved, jasowang
With PUBLISH_USED_IDX, the guest tells us which used entries it has
consumed. This can be used to reduce the number of interrupts: after we
write a used entry, if the guest has not yet consumed the previous
entry, or if the guest has already consumed the new entry, we do not
need to interrupt.

This improves bandwidth by 30% under some workloads.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
drivers/vhost/vhost.c | 58 +++++++++++++++++++++++++++++++++++++++++++-----
drivers/vhost/vhost.h | 9 +++++++
2 files changed, 61 insertions(+), 6 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ab2912..8d5d56a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -161,6 +161,8 @@ static void vhost_vq_reset(struct vhost_dev *dev,
vq->last_avail_idx = 0;
vq->avail_idx = 0;
vq->last_used_idx = 0;
+ vq->signal_used = 0;
+ vq->signal_next = true;
vq->used_flags = 0;
vq->log_used = false;
vq->log_addr = -1ull;
@@ -489,14 +491,15 @@ static int memory_access_ok(struct vhost_dev *d, struct vhost_memory *mem,
return 1;
}
-static int vq_access_ok(unsigned int num,
+static int vq_access_ok(struct vhost_dev *d, unsigned int num,
struct vring_desc __user *desc,
struct vring_avail __user *avail,
struct vring_used __user *used)
{
+ size_t s = vhost_has_feature(d, VIRTIO_RING_F_PUBLISH_USED) ? 2 : 0;
return access_ok(VERIFY_READ, desc, num * sizeof *desc) &&
access_ok(VERIFY_READ, avail,
- sizeof *avail + num * sizeof *avail->ring) &&
+ sizeof *avail + num * sizeof *avail->ring + s) &&
access_ok(VERIFY_WRITE, used,
sizeof *used + num * sizeof *used->ring);
}
@@ -531,7 +534,7 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq, void __user *log_base)
/* Caller should have vq mutex and device mutex */
int vhost_vq_access_ok(struct vhost_virtqueue *vq)
{
- return vq_access_ok(vq->num, vq->desc, vq->avail, vq->used) &&
+ return vq_access_ok(vq->dev, vq->num, vq->desc, vq->avail, vq->used) &&
vq_log_access_ok(vq, vq->log_base);
}
@@ -577,6 +580,7 @@ static int init_used(struct vhost_virtqueue *vq,
if (r)
return r;
+ vq->signal_next = true;
return get_user(vq->last_used_idx, &used->idx);
}
@@ -674,7 +678,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
* If it is not, we don't as size might not have been setup.
* We will verify when backend is configured. */
if (vq->private_data) {
- if (!vq_access_ok(vq->num,
+ if (!vq_access_ok(d, vq->num,
(void __user *)(unsigned long)a.desc_user_addr,
(void __user *)(unsigned long)a.avail_user_addr,
(void __user *)(unsigned long)a.used_user_addr)) {
@@ -699,6 +703,7 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
vq->log_used = !!(a.flags & (0x1 << VHOST_VRING_F_LOG));
vq->desc = (void __user *)(unsigned long)a.desc_user_addr;
vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
+ vq->last_used = (u16 __user *)&vq->avail->ring[vq->num];
vq->log_addr = a.log_guest_addr;
vq->used = (void __user *)(unsigned long)a.used_user_addr;
break;
@@ -1230,7 +1235,8 @@ void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
/* After we've used one of their buffers, we tell them about it. We'll then
* want to notify the guest, using eventfd. */
-int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
+int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head,
+ int len)
{
struct vring_used_elem __user *used;
@@ -1267,6 +1273,8 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
eventfd_signal(vq->log_ctx, 1);
}
vq->last_used_idx++;
+ if (unlikely(vq->last_used_idx == vq->signal_used))
+ vq->signal_next = true;
return 0;
}
@@ -1275,6 +1283,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
unsigned count)
{
struct vring_used_elem __user *used;
+ u16 last_used_idx;
int start;
start = vq->last_used_idx % vq->num;
@@ -1292,7 +1301,14 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
((void __user *)used - (void __user *)vq->used),
count * sizeof *used);
}
+ last_used_idx = vq->last_used_idx;
vq->last_used_idx += count;
+ /* make sure we signal the next entry in an unlikely event of a
+ * wrap-around without signalling once. */
+ /* Note: this should never happen with current vhost-net code. */
+ if (unlikely(last_used_idx < vq->signal_used &&
+ vq->last_used_idx >= vq->signal_used))
+ vq->signal_next = true;
return 0;
}
@@ -1335,7 +1351,8 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
{
__u16 flags;
-
+ __u16 n;
+ bool s;
/* Flush out used index updates. This is paired
* with the barrier that the Guest executes when enabling
* interrupts. */
@@ -1346,12 +1363,41 @@ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
return;
}
+ n = vq->signal_used;
+ s = vq->signal_next;
+ vq->signal_used = vq->last_used_idx;
+ vq->signal_next = false;
+
/* If they don't want an interrupt, don't signal, unless empty. */
if ((flags & VRING_AVAIL_F_NO_INTERRUPT) &&
(vq->avail_idx != vq->last_avail_idx ||
!vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
return;
+ if (vhost_has_feature(dev, VIRTIO_RING_F_PUBLISH_USED)) {
+ __u16 used;
+ if (get_user(used, vq->last_used)) {
+ vq_err(vq, "Failed to get last used idx");
+ return;
+ }
+
+ /* Do not notify if guest did not yet see the last update. */
+ /* We compare used from guest to signal_used, which is
+ * the old value of last_used_idx where we called signal previously.
+ * Let's assume w.l.o.g. that signal_used is 0 (we will later
+ * subtract this old value to get the generic case);
+ * now if used > last_used_idx this means the guest is processing
+ * old entries and if used == last_used_idx it has processed
+ * all the new ones.
+ * Thus our logic *for when signal_used is 0* is:
+ * if (used >= last_used_idx) return;
+ * and in the generic case, subtract signal_used,
+ * modulo 0x10000.
+ */
+ if (s || ((u16)(used - n) >= (u16)(vq->last_used_idx - n)))
+ return;
+ }
+
/* Signal the Guest tell them we used something up. */
if (vq->call_ctx)
eventfd_signal(vq->call_ctx, 1);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index b3363ae..b14b994 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -59,6 +59,7 @@ struct vhost_virtqueue {
unsigned int num;
struct vring_desc __user *desc;
struct vring_avail __user *avail;
+ u16 __user *last_used;
struct vring_used __user *used;
struct file *kick;
struct file *call;
@@ -84,6 +85,13 @@ struct vhost_virtqueue {
/* Used flags */
u16 used_flags;
+ /* last_used_idx value when we signalled. */
+ u16 signal_used;
+
+ /* Flag to ignore signal_used above
+ * and signal on the next entry. */
+ bool signal_next;
+
/* Log writes to used structure. */
bool log_used;
u64 log_addr;
@@ -164,6 +172,7 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
enum {
VHOST_FEATURES = (1 << VIRTIO_F_NOTIFY_ON_EMPTY) |
(1 << VIRTIO_RING_F_INDIRECT_DESC) |
+ (1 << VIRTIO_RING_F_PUBLISH_USED) |
(1 << VHOST_F_LOG_ALL) |
(1 << VHOST_NET_F_VIRTIO_NET_HDR) |
(1 << VIRTIO_NET_F_MRG_RXBUF),
--
1.7.3.2.91.g446ac
* Re: [PATCH 1/2] virtio: put last seen used index into ring itself
2011-03-14 20:30 ` [PATCH 1/2] virtio: put last seen used index into ring itself Michael S. Tsirkin
@ 2011-03-15 0:21 ` Shirley Ma
2011-03-17 12:20 ` Michael S. Tsirkin
0 siblings, 1 reply; 10+ messages in thread
From: Shirley Ma @ 2011-03-15 0:21 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Rusty Russell, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved, jasowang
On Mon, 2011-03-14 at 22:30 +0200, Michael S. Tsirkin wrote:
> Generally, the Host end of the virtio ring doesn't need to see where
> the Guest is up to in consuming the ring. However, to completely
> understand what's going on from the outside, this information must be
> exposed. For example, the Host can reduce the number of interrupts by
> detecting that the Guest is currently handling previous buffers.
>
> Fortunately, we have room to expand: the ring is always a whole number
> of pages and there are hundreds of bytes of padding after the avail
> ring and the used ring, whatever the number of descriptors (which must
> be a power of 2).
>
> We add a feature bit so the guest can tell the host that it's writing
> out the current value there, if it wants to use that.
>
> This is based on a patch by Rusty Russell, with the main difference
> being that we dedicate a feature bit for the Guest to tell the Host it
> is writing the used index. This way we don't need to force the Host to
> publish the last available index until we have a use for it.
>
> Memory barrier use has been rationalized (hopefully correctly).
> To avoid the cost of an extra memory barrier on each update, we only
> commit to the index being up to date when interrupts are enabled.
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> drivers/virtio/virtio_ring.c | 25 ++++++++++++++++---------
> include/linux/virtio_ring.h | 11 +++++++++++
> 2 files changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cc2f73e..a6fc537 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -89,9 +89,6 @@ struct vring_virtqueue
> /* Number we've added since last sync. */
> unsigned int num_added;
>
> - /* Last used index we've seen. */
> - u16 last_used_idx;
> -
> /* How to notify other side. FIXME: commonalize hcalls! */
> void (*notify)(struct virtqueue *vq);
>
> @@ -283,12 +280,13 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
>
> static inline bool more_used(const struct vring_virtqueue *vq)
> {
> - return vq->last_used_idx != vq->vring.used->idx;
> + return vring_last_used(vq) != vq->vring.used->idx;
> }
This should be vring_last_used(&vq->vring).
> void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> + struct vring_used_elem *u;
> void *ret;
> unsigned int i;
>
> @@ -308,9 +306,9 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
> /* Only get used array entries after they have been exposed by host. */
> virtio_rmb();
>
> - i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id;
> - *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len;
> -
> + u = &vq->vring.used->ring[vring_last_used(vq) % vq->vring.num];
Same here: vring_last_used(&vq->vring).
> + i = u->id;
> + *len = u->len;
> if (unlikely(i >= vq->vring.num)) {
> BAD_RING(vq, "id %u out of range\n", i);
> return NULL;
> @@ -323,7 +321,12 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
> /* detach_buf clears data, so grab it now. */
> ret = vq->data[i];
> detach_buf(vq, i);
> - vq->last_used_idx++;
> + (vring_last_used(vq))++;
> + /* If we expect an interrupt for the next entry, flush out
> + * last used index write. */
> + if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> + virtio_mb();
> +
> END_USE(vq);
> return ret;
> }
> @@ -346,6 +349,8 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
> /* We optimistically turn back on interrupts, then check if there was
> * more to do. */
> vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
> + /* Besides flags write, this barrier also flushes out
> + * last available index write. */
> virtio_mb();
> if (unlikely(more_used(vq))) {
> END_USE(vq);
> @@ -429,7 +434,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
> vq->vq.name = name;
> vq->notify = notify;
> vq->broken = false;
> - vq->last_used_idx = 0;
> + vring_last_used(vq) = 0;
Same here: vring_last_used(&vq->vring).
> vq->num_added = 0;
> list_add_tail(&vq->vq.list, &vdev->vqs);
> #ifdef DEBUG
> @@ -471,6 +476,8 @@ void vring_transport_features(struct virtio_device *vdev)
> switch (i) {
> case VIRTIO_RING_F_INDIRECT_DESC:
> break;
> + case VIRTIO_RING_F_PUBLISH_USED:
> + break;
> default:
> /* We don't understand this bit. */
> clear_bit(i, vdev->features);
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index e4d144b..4587ea2 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -29,6 +29,9 @@
> /* We support indirect buffer descriptors */
> #define VIRTIO_RING_F_INDIRECT_DESC 28
>
> +/* The Guest publishes last-seen used index at the end of the avail ring. */
> +#define VIRTIO_RING_F_PUBLISH_USED 29
> +
> /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */
> struct vring_desc {
> /* Address (guest-physical). */
> @@ -83,6 +86,7 @@ struct vring {
> * __u16 avail_flags;
> * __u16 avail_idx;
> * __u16 available[num];
> + * __u16 last_used_idx;
> *
> * // Padding to the next align boundary.
> * char pad[];
> @@ -93,6 +97,10 @@ struct vring {
> * struct vring_used_elem used[num];
> * };
> */
> +
> +/* We publish the last-seen used index at the end of the available ring.
> + * It is at the end for backwards compatibility. */
> +#define vring_last_used(vr) ((vr)->avail->ring[num])
> static inline void vring_init(struct vring *vr, unsigned int num, void *p,
> unsigned long align)
> {
> @@ -101,6 +109,9 @@ static inline void vring_init(struct vring *vr, unsigned int num, void *p,
> vr->avail = p + num*sizeof(struct vring_desc);
> vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + align-1)
> & ~(align - 1));
> + /* Verify that last used index does not spill over the used ring. */
> + BUG_ON((void *)&vring_last_used(vr) +
> + sizeof vring_last_used(vr) > (void *)vr->used);
> }
>
> static inline unsigned vring_size(unsigned int num, unsigned long align)
> --
> 1.7.3.2.91.g446ac
>
>
* Re: TX from KVM guest virtio_net to vhost issues
2011-03-11 6:19 ` Rusty Russell
2011-03-14 20:29 ` [PATCH 0/2] publish last used index (was Re: TX from KVM guest virtio_net to vhost issues) Michael S. Tsirkin
@ 2011-03-17 0:20 ` Shirley Ma
2011-03-17 6:00 ` Michael S. Tsirkin
1 sibling, 1 reply; 10+ messages in thread
From: Shirley Ma @ 2011-03-17 0:20 UTC (permalink / raw)
To: Rusty Russell
Cc: Michael S. Tsirkin, Tom Lendacky, Krishna Kumar2, David Miller,
kvm, netdev, steved
On Fri, 2011-03-11 at 16:49 +1030, Rusty Russell wrote:
> But if one side outruns the other, it does a lot of unnecessary work
> notifying/interrupting it over and over again before the host/guest
> gets
> a chance to shut notifications/interrupts off. Hence the last_used
> publishing patch (Xen does this right, I screwed it up).
>
> Long weekend here, and I'm otherwise committed. But if no one has
> cleaned up that patch by early next week, I'll try to do so and see if
> we can make a real difference.
This patch doesn't seem to help the guest TX queue overrun. I couldn't
find any other reason why the TX queue overruns; even when I increased
the guest send queue ring size to 1K, it didn't help at all. I sent out
a packet-dropping approach for the guest to avoid this, for you to
review. Small-message-size performance has increased dramatically, with
only a few packet drops. Most workloads don't trigger the TX queue
overrun, so I don't see any regressions.

With this patch, the vhost side doesn't need to signal the guest on the
TX path; however, for older guests, we still need vhost to signal the
guest.

Please let me know if you have any other ideas for debugging what
causes the guest TX overrun.
Thanks
Shirley
* Re: TX from KVM guest virtio_net to vhost issues
2011-03-17 0:20 ` TX from KVM guest virtio_net to vhost issues Shirley Ma
@ 2011-03-17 6:00 ` Michael S. Tsirkin
2011-03-17 15:07 ` Shirley Ma
0 siblings, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-03-17 6:00 UTC (permalink / raw)
To: Shirley Ma
Cc: Rusty Russell, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved
On Wed, Mar 16, 2011 at 05:20:14PM -0700, Shirley Ma wrote:
> This patch doesn't seem to help the guest TX queue overrun.
Just making sure: what exactly was tested?
You did combine this with the vhost-net and qemu patches, right?
--
MST
* Re: [PATCH 1/2] virtio: put last seen used index into ring itself
2011-03-15 0:21 ` Shirley Ma
@ 2011-03-17 12:20 ` Michael S. Tsirkin
0 siblings, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2011-03-17 12:20 UTC (permalink / raw)
To: Shirley Ma
Cc: Rusty Russell, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved, jasowang
The following is needed on top. It is still compile-tested only; I'm
posting it here in case Tom is willing to test it in the meantime.
-->
virtio: fix vring_last_used
Reported-by: Shirley Ma <mashirle@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index a6fc537..a84b056 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -280,7 +280,7 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
static inline bool more_used(const struct vring_virtqueue *vq)
{
- return vring_last_used(vq) != vq->vring.used->idx;
+ return vring_last_used(&vq->vring) != vq->vring.used->idx;
}
void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
@@ -306,7 +306,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* Only get used array entries after they have been exposed by host. */
virtio_rmb();
- u = &vq->vring.used->ring[vring_last_used(vq) % vq->vring.num];
+ u = &vq->vring.used->ring[vring_last_used(&vq->vring) % vq->vring.num];
i = u->id;
*len = u->len;
if (unlikely(i >= vq->vring.num)) {
@@ -321,7 +321,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* detach_buf clears data, so grab it now. */
ret = vq->data[i];
detach_buf(vq, i);
- (vring_last_used(vq))++;
+ (vring_last_used(&vq->vring))++;
/* If we expect an interrupt for the next entry, flush out
* last used index write. */
if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
@@ -434,7 +434,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
vq->vq.name = name;
vq->notify = notify;
vq->broken = false;
- vring_last_used(vq) = 0;
+ vring_last_used(&vq->vring) = 0;
vq->num_added = 0;
list_add_tail(&vq->vq.list, &vdev->vqs);
#ifdef DEBUG
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index 4587ea2..462756d 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -100,7 +100,7 @@ struct vring {
/* We publish the last-seen used index at the end of the available ring.
* It is at the end for backwards compatibility. */
-#define vring_last_used(vr) ((vr)->avail->ring[num])
+#define vring_last_used(vr) ((vr)->avail->ring[(vr)->num])
static inline void vring_init(struct vring *vr, unsigned int num, void *p,
unsigned long align)
{
* Re: TX from KVM guest virtio_net to vhost issues
2011-03-17 6:00 ` Michael S. Tsirkin
@ 2011-03-17 15:07 ` Shirley Ma
0 siblings, 0 replies; 10+ messages in thread
From: Shirley Ma @ 2011-03-17 15:07 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Rusty Russell, Tom Lendacky, Krishna Kumar2, David Miller, kvm,
netdev, steved
On Thu, 2011-03-17 at 08:00 +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 16, 2011 at 05:20:14PM -0700, Shirley Ma wrote:
> > This patch doesn't seem to help the guest TX queue overrun.
>
> Just making sure: what exactly was tested?
> You did combine this with the vhost-net and qemu patches, right?
>
I have the virtio_net patch, but I missed the qemu patch. I will retest.
Thanks
Shirley