* [PATCH 1 of 5] virtio: document functions better [not found] <patchbomb.1320306168@localhost6.localdomain6> @ 2011-11-03 7:42 ` Rusty Russell 2011-11-03 7:49 ` Christoph Hellwig 2011-11-03 7:42 ` [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf Rusty Russell ` (3 subsequent siblings) 4 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw) To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization The old documentation is left over from when we used a structure with strategy pointers. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- include/linux/virtio.h | 130 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 87 insertions(+), 43 deletions(-) diff --git a/include/linux/virtio.h b/include/linux/virtio.h --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -26,52 +26,22 @@ struct virtqueue { }; /** - * operations for virtqueue - * virtqueue_add_buf: expose buffer to other end - * vq: the struct virtqueue we're talking about. - * sg: the description of the buffer(s). - * out_num: the number of sg readable by other side - * in_num: the number of sg which are writable (after readable ones) - * data: the token identifying the buffer. - * gfp: how to do memory allocations (if necessary). - * Returns remaining capacity of queue (sg segments) or a negative error. - * virtqueue_kick: update after add_buf - * vq: the struct virtqueue - * After one or more add_buf calls, invoke this to kick the other side. - * virtqueue_get_buf: get the next used buffer - * vq: the struct virtqueue we're talking about. - * len: the length written into the buffer - * Returns NULL or the "data" token handed to add_buf. - * virtqueue_disable_cb: disable callbacks - * vq: the struct virtqueue we're talking about. - * Note that this is not necessarily synchronous, hence unreliable and only - * useful as an optimization. - * virtqueue_enable_cb: restart callbacks after disable_cb. - * vq: the struct virtqueue we're talking about. - * This re-enables callbacks; it returns "false" if there are pending - * buffers in the queue, to detect a possible race between the driver - * checking for more work, and enabling callbacks. - * virtqueue_enable_cb_delayed: restart callbacks after disable_cb. - * vq: the struct virtqueue we're talking about. - * This re-enables callbacks but hints to the other side to delay - * interrupts until most of the available buffers have been processed; - * it returns "false" if there are many pending buffers in the queue, - * to detect a possible race between the driver checking for more work, - * and enabling callbacks. - * virtqueue_detach_unused_buf: detach first unused buffer - * vq: the struct virtqueue we're talking about. - * Returns NULL or the "data" token handed to add_buf - * virtqueue_get_vring_size: return the size of the virtqueue's vring - * vq: the struct virtqueue containing the vring of interest. - * Returns the size of the vring. + * virtqueue_add_buf_gfp - expose buffer to other end + * @vq: the struct virtqueue we're talking about. + * @sg: the description of the buffer(s). + * @out_num: the number of sg readable by other side + * @in_num: the number of sg which are writable (after readable ones) + * @data: the token identifying the buffer. + * @gfp: how to do memory allocations (if necessary). * - * Locking rules are straightforward: the driver is responsible for - * locking. No two operations may be invoked simultaneously, with the exception - * of virtqueue_disable_cb. + * Caller must ensure we don't call this with other virtqueue operations + * at the same time (except where noted). * - * All operations can be called in any context. + * Returns remaining capacity of queue or a negative error + * (ie. ENOSPC). Note that it only really makes sense to treat all + * positive return values as "available": indirect buffers mean that + * we can put an entire sg[] array inside a single queue entry. */ - int virtqueue_add_buf_gfp(struct virtqueue *vq, struct scatterlist sg[], unsigned int out_num, @@ -88,18 +58,92 @@ static inline int virtqueue_add_buf(stru return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC); } +/** + * virtqueue_kick - update after add_buf + * @vq: the struct virtqueue + * + * After one or more virtqueue_add_buf_gfp calls, invoke this to kick + * the other side. + * + * Caller must ensure we don't call this with other virtqueue + * operations at the same time (except where noted). + */ void virtqueue_kick(struct virtqueue *vq); +/** + * virtqueue_get_buf - get the next used buffer + * @vq: the struct virtqueue we're talking about. + * @len: the length written into the buffer + * + * If the driver wrote data into the buffer, @len will be set to the + * amount written. This means you don't need to clear the buffer + * beforehand to ensure there's no data leakage in the case of short + * writes. + * + * Caller must ensure we don't call this with other virtqueue + * operations at the same time (except where noted). + * + * Returns NULL if there are no used buffers, or the "data" token + * handed to virtqueue_add_buf_gfp(). + */ void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len); +/** + * virtqueue_disable_cb - disable callbacks + * @vq: the struct virtqueue we're talking about. + * + * Note that this is not necessarily synchronous, hence unreliable and only + * useful as an optimization. + * + * Unlike other operations, this need not be serialized. + */ void virtqueue_disable_cb(struct virtqueue *vq); +/** + * virtqueue_enable_cb - restart callbacks after disable_cb. + * @vq: the struct virtqueue we're talking about. + * + * This re-enables callbacks; it returns "false" if there are pending + * buffers in the queue, to detect a possible race between the driver + * checking for more work, and enabling callbacks. + * + * Caller must ensure we don't call this with other virtqueue + * operations at the same time (except where noted). + */ bool virtqueue_enable_cb(struct virtqueue *vq); +/** + * virtqueue_enable_cb_delayed - restart callbacks after disable_cb. + * @vq: the struct virtqueue we're talking about. + * + * This re-enables callbacks but hints to the other side to delay + * interrupts until most of the available buffers have been processed; + * it returns "false" if there are many pending buffers in the queue, + * to detect a possible race between the driver checking for more work, + * and enabling callbacks. + * + * Caller must ensure we don't call this with other virtqueue + * operations at the same time (except where noted). + */ bool virtqueue_enable_cb_delayed(struct virtqueue *vq); +/** + * virtqueue_detach_unused_buf - detach first unused buffer + * @vq: the struct virtqueue we're talking about. + * + * Returns NULL or the "data" token handed to virtqueue_add_buf_gfp(). + * This is not valid on an active queue; it is useful only for device + * shutdown. + */ void *virtqueue_detach_unused_buf(struct virtqueue *vq); +/** + * virtqueue_get_vring_size - return the size of the virtqueue's vring + * @vq: the struct virtqueue containing the vring of interest. + * + * Returns the size of the vring. This is mainly used for boasting to + * userspace. Unlike other operations, this need not be serialized. + */ unsigned int virtqueue_get_vring_size(struct virtqueue *vq); /** ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 1 of 5] virtio: document functions better 2011-11-03 7:42 ` [PATCH 1 of 5] virtio: document functions better Rusty Russell @ 2011-11-03 7:49 ` Christoph Hellwig 0 siblings, 0 replies; 34+ messages in thread From: Christoph Hellwig @ 2011-11-03 7:49 UTC (permalink / raw) To: Rusty Russell Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm, virtualization On Thu, Nov 03, 2011 at 06:12:49PM +1030, Rusty Russell wrote: > The old documentation is left over from when we used a structure with > strategy pointers. Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf [not found] <patchbomb.1320306168@localhost6.localdomain6> 2011-11-03 7:42 ` [PATCH 1 of 5] virtio: document functions better Rusty Russell @ 2011-11-03 7:42 ` Rusty Russell 2011-11-03 7:50 ` Christoph Hellwig 2011-11-03 7:42 ` [PATCH 3 of 5] virtio: support unlocked queue kick Rusty Russell ` (2 subsequent siblings) 4 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw) To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization Remove wrapper functions. This makes the allocation type explicit in all callers; I used GPF_KERNEL where it seemed obvious, left it at GFP_ATOMIC otherwise. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -171,7 +171,7 @@ static bool do_req(struct request_queue } } - if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr) < 0) { + if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) { mempool_free(vbr, vblk->pool); return false; } diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c --- a/drivers/char/hw_random/virtio-rng.c +++ b/drivers/char/hw_random/virtio-rng.c @@ -46,7 +46,7 @@ static void register_buffer(u8 *buf, siz sg_init_one(&sg, buf, size); /* There should always be room for one buffer. */ - if (virtqueue_add_buf(vq, &sg, 0, 1, buf) < 0) + if (virtqueue_add_buf(vq, &sg, 0, 1, buf, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -391,7 +391,7 @@ static int add_inbuf(struct virtqueue *v sg_init_one(sg, buf->buf, buf->size); - ret = virtqueue_add_buf(vq, sg, 0, 1, buf); + ret = virtqueue_add_buf(vq, sg, 0, 1, buf, GFP_ATOMIC); virtqueue_kick(vq); return ret; } @@ -456,7 +456,7 @@ static ssize_t __send_control_msg(struct vq = portdev->c_ovq; sg_init_one(sg, &cpkt, sizeof(cpkt)); - if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt) >= 0) { + if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt, GFP_ATOMIC) >= 0) { virtqueue_kick(vq); while (!virtqueue_get_buf(vq, &len)) cpu_relax(); @@ -505,7 +505,7 @@ static ssize_t send_buf(struct port *por reclaim_consumed_buffers(port); sg_init_one(sg, in_buf, in_count); - ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf); + ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf, GFP_ATOMIC); /* Tell Host to go! */ virtqueue_kick(out_vq); diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -368,7 +368,7 @@ static int add_recvbuf_small(struct virt skb_to_sgvec(skb, vi->rx_sg + 1, 0, skb->len); - err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, 2, skb, gfp); + err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 2, skb, gfp); if (err < 0) dev_kfree_skb(skb); @@ -413,8 +413,8 @@ static int add_recvbuf_big(struct virtne /* chain first in list head */ first->private = (unsigned long)list; - err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2, - first, gfp); + err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2, + first, gfp); if (err < 0) give_pages(vi, first); @@ -432,7 +432,7 @@ static int add_recvbuf_mergeable(struct sg_init_one(vi->rx_sg, page_address(page), PAGE_SIZE); - err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, 1, page, gfp); + err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 1, page, gfp); if (err < 0) give_pages(vi, page); @@ -601,7 +601,7 @@ static int xmit_skb(struct virtnet_info hdr->num_sg = skb_to_sgvec(skb, vi->tx_sg + 1, 0, skb->len) + 1; return virtqueue_add_buf(vi->svq, vi->tx_sg, hdr->num_sg, - 0, skb); + 0, skb, GFP_ATOMIC); } static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) @@ -754,7 +754,7 @@ static bool virtnet_send_command(struct sg_set_buf(&sg[i + 1], sg_virt(s), s->length); sg_set_buf(&sg[out + in - 1], &status, sizeof(status)); - BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi) < 0); + BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi, GFP_ATOMIC) < 0); virtqueue_kick(vi->cvq); diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -86,7 +86,7 @@ static void tell_host(struct virtio_ball init_completion(&vb->acked); /* We should always be able to add one buffer to an empty queue. */ - if (virtqueue_add_buf(vq, &sg, 1, 0, vb) < 0) + if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); @@ -219,7 +219,7 @@ static void stats_handle_request(struct vq = vb->stats_vq; sg_init_one(&sg, vb->stats, sizeof(vb->stats)); - if (virtqueue_add_buf(vq, &sg, 1, 0, vb) < 0) + if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); } @@ -312,7 +312,8 @@ static int virtballoon_probe(struct virt * use it to signal us later. */ sg_init_one(&sg, vb->stats, sizeof vb->stats); - if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb) < 0) + if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb, GFP_KERNEL) + < 0) BUG(); virtqueue_kick(vb->stats_vq); } diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -159,12 +159,12 @@ static int vring_add_indirect(struct vri return head; } -int virtqueue_add_buf_gfp(struct virtqueue *_vq, - struct scatterlist sg[], - unsigned int out, - unsigned int in, - void *data, - gfp_t gfp) +int virtqueue_add_buf(struct virtqueue *_vq, + struct scatterlist sg[], + unsigned int out, + unsigned int in, + void *data, + gfp_t gfp) { struct vring_virtqueue *vq = to_vvq(_vq); unsigned int i, avail, uninitialized_var(prev); @@ -235,7 +235,7 @@ add_head: return vq->num_free; } -EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp); +EXPORT_SYMBOL_GPL(virtqueue_add_buf); void virtqueue_kick(struct virtqueue *_vq) { diff --git a/include/linux/virtio.h b/include/linux/virtio.h --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -26,7 +26,7 @@ struct virtqueue { }; /** - * virtqueue_add_buf_gfp - expose buffer to other end + * virtqueue_add_buf - expose buffer to other end * @vq: the struct virtqueue we're talking about. * @sg: the description of the buffer(s). * @out_num: the number of sg readable by other side @@ -42,27 +42,18 @@ struct virtqueue { * positive return values as "available": indirect buffers mean that * we can put an entire sg[] array inside a single queue entry. */ -int virtqueue_add_buf_gfp(struct virtqueue *vq, - struct scatterlist sg[], - unsigned int out_num, - unsigned int in_num, - void *data, - gfp_t gfp); - -static inline int virtqueue_add_buf(struct virtqueue *vq, - struct scatterlist sg[], - unsigned int out_num, - unsigned int in_num, - void *data) -{ - return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC); -} +int virtqueue_add_buf(struct virtqueue *vq, + struct scatterlist sg[], + unsigned int out_num, + unsigned int in_num, + void *data, + gfp_t gfp); /** * virtqueue_kick - update after add_buf * @vq: the struct virtqueue * - * After one or more virtqueue_add_buf_gfp calls, invoke this to kick + * After one or more virtqueue_add_buf calls, invoke this to kick * the other side. * * Caller must ensure we don't call this with other virtqueue @@ -84,7 +75,7 @@ void virtqueue_kick(struct virtqueue *vq * operations at the same time (except where noted). * * Returns NULL if there are no used buffers, or the "data" token - * handed to virtqueue_add_buf_gfp(). + * handed to virtqueue_add_buf(). */ void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len); @@ -131,7 +122,7 @@ bool virtqueue_enable_cb_delayed(struct * virtqueue_detach_unused_buf - detach first unused buffer * @vq: the struct virtqueue we're talking about. * - * Returns NULL or the "data" token handed to virtqueue_add_buf_gfp(). + * Returns NULL or the "data" token handed to virtqueue_add_buf(). * This is not valid on an active queue; it is useful only for device * shutdown. */ diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c --- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -270,7 +270,8 @@ req_retry: in = pack_sg_list(chan->sg, out, VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity); - err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc); + err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc, + GFP_ATOMIC); if (err < 0) { if (err == -ENOSPC) { chan->ring_bufs_avail = 0; @@ -413,7 +414,8 @@ req_retry_pinned: in += pack_sg_list_p(chan->sg, out + in, VIRTQUEUE_NUM, in_pages, in_nr_pages, uidata, inlen); - err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc); + err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc, + GFP_ATOMIC); if (err < 0) { if (err == -ENOSPC) { chan->ring_bufs_avail = 0; diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h --- a/tools/virtio/linux/virtio.h +++ b/tools/virtio/linux/virtio.h @@ -186,21 +186,12 @@ struct virtqueue { #endif /* Interfaces exported by virtio_ring. */ -int virtqueue_add_buf_gfp(struct virtqueue *vq, - struct scatterlist sg[], - unsigned int out_num, - unsigned int in_num, - void *data, - gfp_t gfp); - -static inline int virtqueue_add_buf(struct virtqueue *vq, - struct scatterlist sg[], - unsigned int out_num, - unsigned int in_num, - void *data) -{ - return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC); -} +int virtqueue_add_buf(struct virtqueue *vq, + struct scatterlist sg[], + unsigned int out_num, + unsigned int in_num, + void *data, + gfp_t gfp); void virtqueue_kick(struct virtqueue *vq); diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c --- a/tools/virtio/virtio_test.c +++ b/tools/virtio/virtio_test.c @@ -160,7 +160,8 @@ static void run_test(struct vdev_info *d if (started < bufs) { sg_init_one(&sl, dev->buf, dev->buf_size); r = virtqueue_add_buf(vq->vq, &sl, 1, 0, - dev->buf + started); + dev->buf + started, + GFP_ATOMIC); if (likely(r >= 0)) { ++started; virtqueue_kick(vq->vq); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf 2011-11-03 7:42 ` [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf Rusty Russell @ 2011-11-03 7:50 ` Christoph Hellwig 0 siblings, 0 replies; 34+ messages in thread From: Christoph Hellwig @ 2011-11-03 7:50 UTC (permalink / raw) To: Rusty Russell Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm, virtualization On Thu, Nov 03, 2011 at 06:12:50PM +1030, Rusty Russell wrote: > Remove wrapper functions. This makes the allocation type explicit in > all callers; I used GPF_KERNEL where it seemed obvious, left it at > GFP_ATOMIC otherwise. Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 3 of 5] virtio: support unlocked queue kick [not found] <patchbomb.1320306168@localhost6.localdomain6> 2011-11-03 7:42 ` [PATCH 1 of 5] virtio: document functions better Rusty Russell 2011-11-03 7:42 ` [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf Rusty Russell @ 2011-11-03 7:42 ` Rusty Russell 2011-11-03 7:52 ` Christoph Hellwig [not found] ` <20111103075211.GD6993@infradead.org> 2011-11-03 7:42 ` [PATCH 4 of 5] virtio: avoid modulus operation Rusty Russell 2011-11-03 7:42 ` [PATCH 5 of 5] virtio: expose added descriptors immediately Rusty Russell 4 siblings, 2 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw) To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization Based on patch by Christoph for virtio_blk speedup: Split virtqueue_kick to be able to do the actual notification outside the lock protecting the virtqueue. This patch was originally done by Stefan Hajnoczi, but I can't find the original one anymore and had to recreated it from memory. Pointers to the original or corrections for the commit message are welcome. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -237,10 +237,12 @@ add_head: } EXPORT_SYMBOL_GPL(virtqueue_add_buf); -void virtqueue_kick(struct virtqueue *_vq) +bool virtqueue_kick_prepare(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); u16 new, old; + bool needs_kick; + START_USE(vq); /* Descriptors and available array need to be set before we expose the * new available array entries. */ @@ -253,13 +255,30 @@ void virtqueue_kick(struct virtqueue *_v /* Need to update avail index before checking if we should notify */ virtio_mb(); - if (vq->event ? - vring_need_event(vring_avail_event(&vq->vring), new, old) : - !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY)) - /* Prod other side to tell it about changes. */ - vq->notify(&vq->vq); + if (vq->event) { + needs_kick = vring_need_event(vring_avail_event(&vq->vring), + new, old); + } else { + needs_kick = (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY)); + } + END_USE(vq); + return needs_kick; +} +EXPORT_SYMBOL_GPL(virtqueue_kick_prepare); - END_USE(vq); +void virtqueue_notify(struct virtqueue *_vq) +{ + struct vring_virtqueue *vq = to_vvq(_vq); + + /* Prod other side to tell it about changes. */ + vq->notify(_vq); +} +EXPORT_SYMBOL_GPL(virtqueue_notify); + +void virtqueue_kick(struct virtqueue *vq) +{ + if (virtqueue_kick_prepare(vq)) + virtqueue_notify(vq); } EXPORT_SYMBOL_GPL(virtqueue_kick); diff --git a/include/linux/virtio.h b/include/linux/virtio.h --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -62,6 +62,27 @@ int virtqueue_add_buf(struct virtqueue * void virtqueue_kick(struct virtqueue *vq); /** + * virtqueue_kick_prepare - first half of split virtqueue_kick call. + * @vq: the struct virtqueue + * + * Instead of virtqueue_kick(), you can do: + * if (virtqueue_kick_prepare(vq)) + * virtqueue_notify(vq); + * + * This is sometimes useful because the virtqueue_kick_prepare() needs + * to be serialized, but the actual virtqueue_notify() call does not. + */ +bool virtqueue_kick_prepare(struct virtqueue *vq); + +/** + * virtqueue_notify - second half of split virtqueue_kick call. + * @vq: the struct virtqueue + * + * This does not need to be serialized. + */ +void virtqueue_notify(struct virtqueue *vq); + +/** * virtqueue_get_buf - get the next used buffer * @vq: the struct virtqueue we're talking about. * @len: the length written into the buffer ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 3 of 5] virtio: support unlocked queue kick 2011-11-03 7:42 ` [PATCH 3 of 5] virtio: support unlocked queue kick Rusty Russell @ 2011-11-03 7:52 ` Christoph Hellwig [not found] ` <20111103075211.GD6993@infradead.org> 1 sibling, 0 replies; 34+ messages in thread From: Christoph Hellwig @ 2011-11-03 7:52 UTC (permalink / raw) To: Rusty Russell Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm, virtualization On Thu, Nov 03, 2011 at 06:12:51PM +1030, Rusty Russell wrote: > Based on patch by Christoph for virtio_blk speedup: Please credit it to Stefan - he also sent a pointer to his original version in reply to the previous thread. Also shouldn't virtqueue_kick have kerneldoc comments? I also notices that you documented the functions bother here and in the first patch in the headers. At least historically the kerneldoc tools didn't parse comments at declarations, but only at the function defintions. Did you check these actually get picked up? ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <20111103075211.GD6993@infradead.org>]
* Re: [PATCH 3 of 5] virtio: support unlocked queue kick [not found] ` <20111103075211.GD6993@infradead.org> @ 2011-11-04 10:09 ` Stefan Hajnoczi 2011-11-04 10:36 ` Rusty Russell 1 sibling, 0 replies; 34+ messages in thread From: Stefan Hajnoczi @ 2011-11-04 10:09 UTC (permalink / raw) To: Rusty Russell Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm, virtualization On Thu, Nov 3, 2011 at 7:52 AM, Christoph Hellwig <hch@infradead.org> wrote: > On Thu, Nov 03, 2011 at 06:12:51PM +1030, Rusty Russell wrote: >> Based on patch by Christoph for virtio_blk speedup: > > Please credit it to Stefan - he also sent a pointer to his original > version in reply to the previous thread. Thank you Christoph. Here the pointer to the original mailing list thread: "Here is the patch: https://github.com/stefanha/linux/commit/a6d06644e3a58e57a774e77d7dc34c4a5a2e7496 Or as an email if you want to track it down in your inbox: http://www.spinics.net/lists/linux-virtualization/msg14616.html" Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 3 of 5] virtio: support unlocked queue kick [not found] ` <20111103075211.GD6993@infradead.org> 2011-11-04 10:09 ` Stefan Hajnoczi @ 2011-11-04 10:36 ` Rusty Russell 1 sibling, 0 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-04 10:36 UTC (permalink / raw) Cc: Christoph Hellwig, virtualization, linux-kernel, kvm, mst On Thu, 3 Nov 2011 03:52:11 -0400, Christoph Hellwig <hch@infradead.org> wrote: > On Thu, Nov 03, 2011 at 06:12:51PM +1030, Rusty Russell wrote: > > Based on patch by Christoph for virtio_blk speedup: > > Please credit it to Stefan - he also sent a pointer to his original > version in reply to the previous thread. > > Also shouldn't virtqueue_kick have kerneldoc comments? Yep, and I've credited it properly. > I also notices that you documented the functions bother here and in > the first patch in the headers. At least historically the kerneldoc > tools didn't parse comments at declarations, but only at the function > defintions. Did you check these actually get picked up? Gah, I'd forgotten that the kernel tendency is to put the interface documentation next to the implementation. Personally, I think extracting it is insane, but I've moved them. Thanks, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 4 of 5] virtio: avoid modulus operation [not found] <patchbomb.1320306168@localhost6.localdomain6> ` (2 preceding siblings ...) 2011-11-03 7:42 ` [PATCH 3 of 5] virtio: support unlocked queue kick Rusty Russell @ 2011-11-03 7:42 ` Rusty Russell 2011-11-03 7:51 ` Pekka Enberg [not found] ` <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com> 2011-11-03 7:42 ` [PATCH 5 of 5] virtio: expose added descriptors immediately Rusty Russell 4 siblings, 2 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw) To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted in vring_new_virtqueue()). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/virtio/virtio_ring.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c --- a/arch/x86/lguest/boot.c +++ b/arch/x86/lguest/boot.c @@ -1420,7 +1420,7 @@ __init void lguest_init(void) add_preferred_console("hvc", 0, NULL); /* Register our very early console. */ - virtio_cons_early_init(early_put_chars); +// virtio_cons_early_init(early_put_chars); /* * Last of all, we set the power management poweroff hook to point to diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -226,8 +226,8 @@ add_head: vq->data[head] = data; /* Put entry in available array (but don't update avail->idx until they - * do sync). FIXME: avoid modulus here? */ - avail = (vq->vring.avail->idx + vq->num_added++) % vq->vring.num; + * do sync). */ + avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); vq->vring.avail->ring[avail] = head; pr_debug("Added buffer head %i to %p\n", head, vq); @@ -317,6 +317,7 @@ void *virtqueue_get_buf(struct virtqueue struct vring_virtqueue *vq = to_vvq(_vq); void *ret; unsigned int i; + u16 last_used; START_USE(vq); @@ -334,8 +335,9 @@ void *virtqueue_get_buf(struct virtqueue /* Only get used array entries after they have been exposed by host. */ virtio_rmb(); - i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id; - *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len; + last_used = (vq->last_used_idx & (vq->vring.num - 1)); + i = vq->vring.used->ring[last_used].id; + *len = vq->vring.used->ring[last_used].len; if (unlikely(i >= vq->vring.num)) { BAD_RING(vq, "id %u out of range\n", i); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 4 of 5] virtio: avoid modulus operation 2011-11-03 7:42 ` [PATCH 4 of 5] virtio: avoid modulus operation Rusty Russell @ 2011-11-03 7:51 ` Pekka Enberg [not found] ` <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com> 1 sibling, 0 replies; 34+ messages in thread From: Pekka Enberg @ 2011-11-03 7:51 UTC (permalink / raw) To: Rusty Russell Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm, virtualization On Thu, Nov 3, 2011 at 9:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote: > Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted > in vring_new_virtqueue()). > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > --- > drivers/virtio/virtio_ring.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c > --- a/arch/x86/lguest/boot.c > +++ b/arch/x86/lguest/boot.c > @@ -1420,7 +1420,7 @@ __init void lguest_init(void) > add_preferred_console("hvc", 0, NULL); > > /* Register our very early console. */ > - virtio_cons_early_init(early_put_chars); > +// virtio_cons_early_init(early_put_chars); What's this hunk here? ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com>]
* Re: [PATCH 4 of 5] virtio: avoid modulus operation [not found] ` <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com> @ 2011-11-03 10:18 ` Rusty Russell 0 siblings, 0 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-03 10:18 UTC (permalink / raw) To: Pekka Enberg; +Cc: Christoph Hellwig, virtualization, linux-kernel, kvm, mst On Thu, 3 Nov 2011 09:51:15 +0200, Pekka Enberg <penberg@kernel.org> wrote: > On Thu, Nov 3, 2011 at 9:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote: > > Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted > > in vring_new_virtqueue()). > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > --- > > drivers/virtio/virtio_ring.c | 10 ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c > > --- a/arch/x86/lguest/boot.c > > +++ b/arch/x86/lguest/boot.c > > @@ -1420,7 +1420,7 @@ __init void lguest_init(void) > > add_preferred_console("hvc", 0, NULL); > > > > /* Register our very early console. */ > > - virtio_cons_early_init(early_put_chars); > > +// virtio_cons_early_init(early_put_chars); > > What's this hunk here? Ugly workaround for console until hvc_console breakage gets reverted. Left in by mistake :) Thanks, Rusty. _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 5 of 5] virtio: expose added descriptors immediately [not found] <patchbomb.1320306168@localhost6.localdomain6> ` (3 preceding siblings ...) 2011-11-03 7:42 ` [PATCH 4 of 5] virtio: avoid modulus operation Rusty Russell @ 2011-11-03 7:42 ` Rusty Russell 2011-11-13 21:03 ` Michael S. Tsirkin 2012-07-01 9:20 ` RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) Michael S. Tsirkin 4 siblings, 2 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw) To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization A virtio driver does virtqueue_add_buf() multiple times before finally calling virtqueue_kick(); previously we only exposed the added buffers in the virtqueue_kick() call. This means we don't need a memory barrier in virtqueue_add_buf(), but it reduces concurrency as the device (ie. host) can't see the buffers until the kick. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/virtio/virtio_ring.c | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -227,9 +227,15 @@ add_head: /* Put entry in available array (but don't update avail->idx until they * do sync). */ - avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); + avail = (vq->vring.avail->idx & (vq->vring.num-1)); vq->vring.avail->ring[avail] = head; + /* Descriptors and available array need to be set before we expose the + * new available array entries. */ + virtio_wmb(); + vq->vring.avail->idx++; + vq->num_added++; + pr_debug("Added buffer head %i to %p\n", head, vq); END_USE(vq); @@ -248,13 +254,10 @@ bool virtqueue_kick_prepare(struct virtq * new available array entries. */ virtio_wmb(); - old = vq->vring.avail->idx; - new = vq->vring.avail->idx = old + vq->num_added; + old = vq->vring.avail->idx - vq->num_added; + new = vq->vring.avail->idx; vq->num_added = 0; - /* Need to update avail index before checking if we should notify */ - virtio_mb(); - if (vq->event) { needs_kick = vring_need_event(vring_avail_event(&vq->vring), new, old); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-03 7:42 ` [PATCH 5 of 5] virtio: expose added descriptors immediately Rusty Russell @ 2011-11-13 21:03 ` Michael S. Tsirkin 2011-11-14 0:43 ` Rusty Russell 2011-11-14 6:56 ` Michael S. Tsirkin 2012-07-01 9:20 ` RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) Michael S. Tsirkin 1 sibling, 2 replies; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-13 21:03 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, virtualization, linux-kernel, kvm, mst On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > A virtio driver does virtqueue_add_buf() multiple times before finally > calling virtqueue_kick(); previously we only exposed the added buffers > in the virtqueue_kick() call. This means we don't need a memory > barrier in virtqueue_add_buf(), but it reduces concurrency as the > device (ie. host) can't see the buffers until the kick. > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> In the past I played with a patch like this, but I didn't see a performance gain either way. Do you see any gain? I'm a bit concerned that with this patch, a buggy driver that adds more than 2^16 descriptors without a kick would seem to work sometimes. Let's add WARN_ON(vq->num_added > (1 << 16))? > --- > drivers/virtio/virtio_ring.c | 37 ++++++++++++++++++++++--------------- > 1 file changed, 22 insertions(+), 15 deletions(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -227,9 +227,15 @@ add_head: > > /* Put entry in available array (but don't update avail->idx until they > * do sync). */ > - avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); > + avail = (vq->vring.avail->idx & (vq->vring.num-1)); > vq->vring.avail->ring[avail] = head; > > + /* Descriptors and available array need to be set before we expose the > + * new available array entries. */ > + virtio_wmb(); > + vq->vring.avail->idx++; > + vq->num_added++; > + > pr_debug("Added buffer head %i to %p\n", head, vq); > END_USE(vq); > > @@ -248,13 +254,10 @@ bool virtqueue_kick_prepare(struct virtq > * new available array entries. */ > virtio_wmb(); > > - old = vq->vring.avail->idx; > - new = vq->vring.avail->idx = old + vq->num_added; > + old = vq->vring.avail->idx - vq->num_added; > + new = vq->vring.avail->idx; > vq->num_added = 0; > > - /* Need to update avail index before checking if we should notify */ > - virtio_mb(); > - > if (vq->event) { > needs_kick = vring_need_event(vring_avail_event(&vq->vring), > new, old); > > > _______________________________________________ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-13 21:03 ` Michael S. Tsirkin @ 2011-11-14 0:43 ` Rusty Russell 2011-11-14 6:56 ` Michael S. Tsirkin 1 sibling, 0 replies; 34+ messages in thread From: Rusty Russell @ 2011-11-14 0:43 UTC (permalink / raw) Cc: Christoph Hellwig, virtualization, linux-kernel, kvm, mst On Sun, 13 Nov 2011 23:03:14 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > A virtio driver does virtqueue_add_buf() multiple times before finally > > calling virtqueue_kick(); previously we only exposed the added buffers > > in the virtqueue_kick() call. This means we don't need a memory > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > device (ie. host) can't see the buffers until the kick. > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > In the past I played with a patch like this, but I didn't see a > performance gain either way. Do you see any gain? No, but I haven't run it on real hardware. lguest may see a win with this in theory, since the virtqueue processing is fully async, so I'll run some tests. > I'm a bit concerned that with this patch, a buggy driver that > adds more than 2^16 descriptors without a kick > would seem to work sometimes. Let's add WARN_ON(vq->num_added > (1 << 16))? Hmm, I guess it could wait for the add to fail before doing a kick, but noone does that at the moment, so I've added a slight variant: WARN_ON(vq->num_added > vq->vring.num); Thanks, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-13 21:03 ` Michael S. Tsirkin 2011-11-14 0:43 ` Rusty Russell @ 2011-11-14 6:56 ` Michael S. Tsirkin 2011-11-16 0:21 ` Rusty Russell 1 sibling, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-14 6:56 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Sun, Nov 13, 2011 at 11:03:13PM +0200, Michael S. Tsirkin wrote: > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > A virtio driver does virtqueue_add_buf() multiple times before finally > > calling virtqueue_kick(); previously we only exposed the added buffers > > in the virtqueue_kick() call. This means we don't need a memory > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > device (ie. host) can't see the buffers until the kick. > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > In the past I played with a patch like this, but I didn't see a > performance gain either way. Do you see any gain? > > I'm a bit concerned that with this patch, a buggy driver that > adds more than 2^16 descriptors without a kick > would seem to work sometimes. Let's add WARN_ON(vq->num_added > (1 << 16))? Thinking about it more - it might be tricky for drivers to ensure this. add used to fail when vq is full, but now driver might do get between add and notify: lock add_buf * N prep unlock lock get_buf * N unlock lock add_buf prep unlock notify and since add was followed by get, this doesn't fail. So the right thing to do I think is to either ignore indexes and assume a kick is needed, something like: if vq->num_added >= (1 << 15)) needs_kick = true (note: maybe it's 1<<16, and maybe >, but 1<<15 is plenty anyway) Or alternatively, fail add when num_added is too large. > > --- > > drivers/virtio/virtio_ring.c | 37 ++++++++++++++++++++++--------------- > > 1 file changed, 22 insertions(+), 15 deletions(-) > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -227,9 +227,15 @@ add_head: > > > > /* Put entry in available array (but don't update avail->idx until they > > * do sync). */ > > - avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); > > + avail = (vq->vring.avail->idx & (vq->vring.num-1)); > > vq->vring.avail->ring[avail] = head; > > > > + /* Descriptors and available array need to be set before we expose the > > + * new available array entries. */ > > + virtio_wmb(); > > + vq->vring.avail->idx++; > > + vq->num_added++; > > + > > pr_debug("Added buffer head %i to %p\n", head, vq); > > END_USE(vq); > > > > @@ -248,13 +254,10 @@ bool virtqueue_kick_prepare(struct virtq > > * new available array entries. */ > > virtio_wmb(); > > > > - old = vq->vring.avail->idx; > > - new = vq->vring.avail->idx = old + vq->num_added; > > + old = vq->vring.avail->idx - vq->num_added; > > + new = vq->vring.avail->idx; > > vq->num_added = 0; > > > > - /* Need to update avail index before checking if we should notify */ > > - virtio_mb(); > > - > > if (vq->event) { > > needs_kick = vring_need_event(vring_avail_event(&vq->vring), > > new, old); > > > > > > _______________________________________________ > > Virtualization mailing list > > Virtualization@lists.linux-foundation.org > > https://lists.linuxfoundation.org/mailman/listinfo/virtualization ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-14 6:56 ` Michael S. Tsirkin @ 2011-11-16 0:21 ` Rusty Russell 2011-11-16 7:18 ` Michael S. Tsirkin 0 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-16 0:21 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Mon, 14 Nov 2011 08:56:06 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Sun, Nov 13, 2011 at 11:03:13PM +0200, Michael S. Tsirkin wrote: > > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > > A virtio driver does virtqueue_add_buf() multiple times before finally > > > calling virtqueue_kick(); previously we only exposed the added buffers > > > in the virtqueue_kick() call. This means we don't need a memory > > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > > device (ie. host) can't see the buffers until the kick. > > > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > > > In the past I played with a patch like this, but I didn't see a > > performance gain either way. Do you see any gain? > > > > I'm a bit concerned that with this patch, a buggy driver that > > adds more than 2^16 descriptors without a kick > > would seem to work sometimes. Let's add WARN_ON(vq->num_added > (1 << 16))? > > Thinking about it more - it might be tricky for drivers > to ensure this. add used to fail when vq is full, but now > driver might do get between add and notify: > lock > add_buf * N > prep > unlock > lock > get_buf * N > unlock > lock > add_buf > prep > unlock > notify > > and since add was followed by get, this doesn't fail. Right, the driver could, in theory, do: add_buf() if (!get_buf()) notify() But we don't allow that at the moment in our API: we insist on a notify occasionally. Noone does this at the moment, so a WARN_ON is correct. If you're just add_buf() without the get_buf() then add_buf() will fail already. Here's my current variant: diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -245,9 +245,19 @@ add_head: /* Put entry in available array (but don't update avail->idx until they * do sync). */ - avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); + avail = (vq->vring.avail->idx & (vq->vring.num-1)); vq->vring.avail->ring[avail] = head; + /* Descriptors and available array need to be set before we expose the + * new available array entries. */ + virtio_wmb(); + vq->vring.avail->idx++; + vq->num_added++; + + /* If you haven't kicked in this long, you're probably doing something + * wrong. */ + WARN_ON(vq->num_added > vq->vring.num); + pr_debug("Added buffer head %i to %p\n", head, vq); END_USE(vq); It's hard to write a useful WARN_ON() for the "you should kick more regularly" case (we could take timestamps if DEBUG is defined, I guess), so let's leave this until someone actually trips it. Thanks, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-16 0:21 ` Rusty Russell @ 2011-11-16 7:18 ` Michael S. Tsirkin 2011-11-21 1:48 ` Rusty Russell 0 siblings, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-16 7:18 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Wed, Nov 16, 2011 at 10:51:26AM +1030, Rusty Russell wrote: > On Mon, 14 Nov 2011 08:56:06 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > On Sun, Nov 13, 2011 at 11:03:13PM +0200, Michael S. Tsirkin wrote: > > > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > > > A virtio driver does virtqueue_add_buf() multiple times before finally > > > > calling virtqueue_kick(); previously we only exposed the added buffers > > > > in the virtqueue_kick() call. This means we don't need a memory > > > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > > > device (ie. host) can't see the buffers until the kick. > > > > > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > > > > > In the past I played with a patch like this, but I didn't see a > > > performance gain either way. Do you see any gain? > > > > > > I'm a bit concerned that with this patch, a buggy driver that > > > adds more than 2^16 descriptors without a kick > > > would seem to work sometimes. Let's add WARN_ON(vq->num_added > (1 << 16))? > > > > Thinking about it more - it might be tricky for drivers > > to ensure this. add used to fail when vq is full, but now > > driver might do get between add and notify: > > lock > > add_buf * N > > prep > > unlock > > lock > > get_buf * N > > unlock > > lock > > add_buf > > prep > > unlock > > notify > > > > and since add was followed by get, this doesn't fail. > > Right, the driver could, in theory, do: > add_buf() > if (!get_buf()) > notify() > > But we don't allow that at the moment in our API: we insist on a notify > occasionally. Noone does this at the moment, so a WARN_ON is correct. > > If you're just add_buf() without the get_buf() then add_buf() will fail > already. > > Here's my current variant: > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -245,9 +245,19 @@ add_head: > > /* Put entry in available array (but don't update avail->idx until they > * do sync). */ > - avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1)); > + avail = (vq->vring.avail->idx & (vq->vring.num-1)); > vq->vring.avail->ring[avail] = head; > > + /* Descriptors and available array need to be set before we expose the > + * new available array entries. */ > + virtio_wmb(); > + vq->vring.avail->idx++; > + vq->num_added++; > + > + /* If you haven't kicked in this long, you're probably doing something > + * wrong. */ > + WARN_ON(vq->num_added > vq->vring.num); > + > pr_debug("Added buffer head %i to %p\n", head, vq); > END_USE(vq); > > It's hard to write a useful WARN_ON() for the "you should kick more > regularly" case (we could take timestamps if DEBUG is defined, I guess), > so let's leave this until someone actually trips it. > > Thanks, > Rusty. My unlocked kick patches will trip this warning: they make virtio-net do add + get without kick. I think block with unlocked kick can trip it too: add, lock is dropped and then an interrupt can get. We also don't need a kick each num - each 2^15 is enough. Why don't we do this at start of add_buf: if (vq->num_added >= 0x7fff) return -ENOSPC; -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-16 7:18 ` Michael S. Tsirkin @ 2011-11-21 1:48 ` Rusty Russell 2011-11-21 11:57 ` Michael S. Tsirkin 0 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-21 1:48 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Wed, 16 Nov 2011 09:18:38 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > My unlocked kick patches will trip this warning: they make > virtio-net do add + get without kick. Heh, it's a good sign if they do, since that means you're running really well :) > I think block with unlocked kick can trip it too: > add, lock is dropped and then an interrupt can get. > > We also don't need a kick each num - each 2^15 is enough. > Why don't we do this at start of add_buf: > if (vq->num_added >= 0x7fff) > return -ENOSPC; The warning was there in case a driver is never doing a kick, and getting away with it (mostly) because the device is polling. Let's not penalize good drivers to catch bad ones. How about we do this properly, like so: From: Rusty Russell <rusty@rustcorp.com.au> Subject: virtio: add debugging if driver doesn't kick. Under the existing #ifdef DEBUG, check that they don't have more than 1/10 of a second between an add_buf() and a virtqueue_notify()/virtqueue_kick_prepare() call. We could get false positives on a really busy system, but good for development. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> --- drivers/virtio/virtio_ring.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -22,6 +22,7 @@ #include <linux/device.h> #include <linux/slab.h> #include <linux/module.h> +#include <linux/hrtimer.h> /* virtio guest is communicating with a virtual "device" that actually runs on * a host processor. Memory barriers are used to control SMP effects. */ @@ -102,6 +103,10 @@ struct vring_virtqueue #ifdef DEBUG /* They're supposed to lock for us. */ unsigned int in_use; + + /* Figure out if their kicks are too delayed. */ + bool last_add_time_valid; + ktime_t last_add_time; #endif /* Tokens for callbacks. */ @@ -192,6 +197,19 @@ int virtqueue_add_buf(struct virtqueue * BUG_ON(data == NULL); +#ifdef DEBUG + { + ktime_t now = ktime_get(); + + /* No kick or get, with .1 second between? Warn. */ + if (vq->last_add_time_valid) + WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time)) + > 100); + vq->last_add_time = now; + vq->last_add_time_valid = true; + } +#endif + /* If the host supports indirect descriptor tables, and we have multiple * buffers, then go indirect. FIXME: tune this threshold */ if (vq->indirect && (out + in) > 1 && vq->num_free) { @@ -291,6 +309,14 @@ bool virtqueue_kick_prepare(struct virtq new = vq->vring.avail->idx; vq->num_added = 0; +#ifdef DEBUG + if (vq->last_add_time_valid) { + WARN_ON(ktime_to_ms(ktime_sub(ktime_get(), + vq->last_add_time)) > 100); + } + vq->last_add_time_valid = false; +#endif + if (vq->event) { needs_kick = vring_need_event(vring_avail_event(&vq->vring), new, old); @@ -428,6 +454,10 @@ void *virtqueue_get_buf(struct virtqueue virtio_mb(); } +#ifdef DEBUG + vq->last_add_time_valid = false; +#endif + END_USE(vq); return ret; } @@ -611,6 +641,7 @@ struct virtqueue *vring_new_virtqueue(un list_add_tail(&vq->vq.list, &vdev->vqs); #ifdef DEBUG vq->in_use = false; + vq->last_add_time_valid = false; #endif vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-21 1:48 ` Rusty Russell @ 2011-11-21 11:57 ` Michael S. Tsirkin 2011-11-22 0:33 ` Rusty Russell 0 siblings, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-21 11:57 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Mon, Nov 21, 2011 at 12:18:45PM +1030, Rusty Russell wrote: > On Wed, 16 Nov 2011 09:18:38 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > My unlocked kick patches will trip this warning: they make > > virtio-net do add + get without kick. > > Heh, it's a good sign if they do, since that means you're running really > well :) They don't in fact, in my testing :(. But I think they can with luck. > > I think block with unlocked kick can trip it too: > > add, lock is dropped and then an interrupt can get. > > > > We also don't need a kick each num - each 2^15 is enough. > > Why don't we do this at start of add_buf: > > if (vq->num_added >= 0x7fff) > > return -ENOSPC; > > The warning was there in case a driver is never doing a kick, and > getting away with it (mostly) because the device is polling. Let's not > penalize good drivers to catch bad ones. > > How about we do this properly, like so: Absolutely. But I think we also need to handle num_added overflow of a 15 bit counter, no? Otherwise the vring_need_event logic might give us false negatives .... I'm guessing we can just assume we need a kick in that case. > From: Rusty Russell <rusty@rustcorp.com.au> > Subject: virtio: add debugging if driver doesn't kick. > > Under the existing #ifdef DEBUG, check that they don't have more than > 1/10 of a second between an add_buf() and a > virtqueue_notify()/virtqueue_kick_prepare() call. > > We could get false positives on a really busy system, but good for > development. > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > --- > drivers/virtio/virtio_ring.c | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -22,6 +22,7 @@ > #include <linux/device.h> > #include <linux/slab.h> > #include <linux/module.h> > +#include <linux/hrtimer.h> > > /* virtio guest is communicating with a virtual "device" that actually runs on > * a host processor. Memory barriers are used to control SMP effects. */ > @@ -102,6 +103,10 @@ struct vring_virtqueue > #ifdef DEBUG > /* They're supposed to lock for us. */ > unsigned int in_use; > + > + /* Figure out if their kicks are too delayed. */ > + bool last_add_time_valid; > + ktime_t last_add_time; > #endif > > /* Tokens for callbacks. */ > @@ -192,6 +197,19 @@ int virtqueue_add_buf(struct virtqueue * > > BUG_ON(data == NULL); > > +#ifdef DEBUG > + { > + ktime_t now = ktime_get(); > + > + /* No kick or get, with .1 second between? Warn. */ > + if (vq->last_add_time_valid) > + WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time)) > + > 100); > + vq->last_add_time = now; > + vq->last_add_time_valid = true; > + } > +#endif > + > /* If the host supports indirect descriptor tables, and we have multiple > * buffers, then go indirect. FIXME: tune this threshold */ > if (vq->indirect && (out + in) > 1 && vq->num_free) { > @@ -291,6 +309,14 @@ bool virtqueue_kick_prepare(struct virtq > new = vq->vring.avail->idx; > vq->num_added = 0; > > +#ifdef DEBUG > + if (vq->last_add_time_valid) { > + WARN_ON(ktime_to_ms(ktime_sub(ktime_get(), > + vq->last_add_time)) > 100); > + } > + vq->last_add_time_valid = false; > +#endif > + > if (vq->event) { > needs_kick = vring_need_event(vring_avail_event(&vq->vring), > new, old); > @@ -428,6 +454,10 @@ void *virtqueue_get_buf(struct virtqueue > virtio_mb(); > } > > +#ifdef DEBUG > + vq->last_add_time_valid = false; > +#endif > + > END_USE(vq); > return ret; > } > @@ -611,6 +641,7 @@ struct virtqueue *vring_new_virtqueue(un > list_add_tail(&vq->vq.list, &vdev->vqs); > #ifdef DEBUG > vq->in_use = false; > + vq->last_add_time_valid = false; > #endif > > vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-21 11:57 ` Michael S. Tsirkin @ 2011-11-22 0:33 ` Rusty Russell 2011-11-22 6:29 ` Michael S. Tsirkin 0 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-22 0:33 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Mon, 21 Nov 2011 13:57:04 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Mon, Nov 21, 2011 at 12:18:45PM +1030, Rusty Russell wrote: > > On Wed, 16 Nov 2011 09:18:38 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > My unlocked kick patches will trip this warning: they make > > > virtio-net do add + get without kick. > > > > Heh, it's a good sign if they do, since that means you're running really > > well :) > > They don't in fact, in my testing :(. But I think they can with luck. > > > > I think block with unlocked kick can trip it too: > > > add, lock is dropped and then an interrupt can get. > > > > > > We also don't need a kick each num - each 2^15 is enough. > > > Why don't we do this at start of add_buf: > > > if (vq->num_added >= 0x7fff) > > > return -ENOSPC; > > > > The warning was there in case a driver is never doing a kick, and > > getting away with it (mostly) because the device is polling. Let's not > > penalize good drivers to catch bad ones. > > > > How about we do this properly, like so: > > Absolutely. But I think we also need to handle num_added > overflow of a 15 bit counter, no? Otherwise the > vring_need_event logic might give us false negatives .... > I'm guessing we can just assume we need a kick in that case. You're right. Thankyou. My immediate reaction of "make it an unsigned long" doesn't work. Here's the diff to what I posted before: diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -254,9 +254,10 @@ add_head: vq->vring.avail->idx++; vq->num_added++; - /* If you haven't kicked in this long, you're probably doing something - * wrong. */ - WARN_ON(vq->num_added > vq->vring.num); + /* This is very unlikely, but theoretically possible. Kick + * just in case. */ + if (unlikely(vq->num_added == 65535)) + virtqueue_kick(_vq); pr_debug("Added buffer head %i to %p\n", head, vq); END_USE(vq); ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-22 0:33 ` Rusty Russell @ 2011-11-22 6:29 ` Michael S. Tsirkin 2011-11-23 1:19 ` Rusty Russell 0 siblings, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-22 6:29 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Tue, Nov 22, 2011 at 11:03:04AM +1030, Rusty Russell wrote: > On Mon, 21 Nov 2011 13:57:04 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > On Mon, Nov 21, 2011 at 12:18:45PM +1030, Rusty Russell wrote: > > > On Wed, 16 Nov 2011 09:18:38 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > My unlocked kick patches will trip this warning: they make > > > > virtio-net do add + get without kick. > > > > > > Heh, it's a good sign if they do, since that means you're running really > > > well :) > > > > They don't in fact, in my testing :(. But I think they can with luck. > > > > > > I think block with unlocked kick can trip it too: > > > > add, lock is dropped and then an interrupt can get. > > > > > > > > We also don't need a kick each num - each 2^15 is enough. > > > > Why don't we do this at start of add_buf: > > > > if (vq->num_added >= 0x7fff) > > > > return -ENOSPC; > > > > > > The warning was there in case a driver is never doing a kick, and > > > getting away with it (mostly) because the device is polling. Let's not > > > penalize good drivers to catch bad ones. > > > > > > How about we do this properly, like so: > > > > Absolutely. But I think we also need to handle num_added > > overflow of a 15 bit counter, no? Otherwise the > > vring_need_event logic might give us false negatives .... > > I'm guessing we can just assume we need a kick in that case. > > You're right. Thankyou. My immediate reaction of "make it an unsigned > long" doesn't work. > > Here's the diff to what I posted before: > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -254,9 +254,10 @@ add_head: > vq->vring.avail->idx++; > vq->num_added++; > > - /* If you haven't kicked in this long, you're probably doing something > - * wrong. */ > - WARN_ON(vq->num_added > vq->vring.num); > + /* This is very unlikely, but theoretically possible. Kick > + * just in case. */ > + if (unlikely(vq->num_added == 65535)) This is 0xffff but why use the decimal notation? > + virtqueue_kick(_vq); > > pr_debug("Added buffer head %i to %p\n", head, vq); > END_USE(vq); We also still need to reset vq->num_added, right? -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-22 6:29 ` Michael S. Tsirkin @ 2011-11-23 1:19 ` Rusty Russell 2011-11-23 8:30 ` Michael S. Tsirkin 0 siblings, 1 reply; 34+ messages in thread From: Rusty Russell @ 2011-11-23 1:19 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Tue, 22 Nov 2011 08:29:08 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Tue, Nov 22, 2011 at 11:03:04AM +1030, Rusty Russell wrote: > > - /* If you haven't kicked in this long, you're probably doing something > > - * wrong. */ > > - WARN_ON(vq->num_added > vq->vring.num); > > + /* This is very unlikely, but theoretically possible. Kick > > + * just in case. */ > > + if (unlikely(vq->num_added == 65535)) > > This is 0xffff but why use the decimal notation? Interesting. Why use hex? Feels more like binary? But I've changed it to "(1 << 16) - 1" to be clear. > > + virtqueue_kick(_vq); > > > > pr_debug("Added buffer head %i to %p\n", head, vq); > > END_USE(vq); > > We also still need to reset vq->num_added, right? virtqueue_kick does that for us. Cheers, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 5 of 5] virtio: expose added descriptors immediately 2011-11-23 1:19 ` Rusty Russell @ 2011-11-23 8:30 ` Michael S. Tsirkin 0 siblings, 0 replies; 34+ messages in thread From: Michael S. Tsirkin @ 2011-11-23 8:30 UTC (permalink / raw) To: Rusty Russell; +Cc: Christoph Hellwig, linux-kernel, kvm, virtualization On Wed, Nov 23, 2011 at 11:49:01AM +1030, Rusty Russell wrote: > On Tue, 22 Nov 2011 08:29:08 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > On Tue, Nov 22, 2011 at 11:03:04AM +1030, Rusty Russell wrote: > > > - /* If you haven't kicked in this long, you're probably doing something > > > - * wrong. */ > > > - WARN_ON(vq->num_added > vq->vring.num); > > > + /* This is very unlikely, but theoretically possible. Kick > > > + * just in case. */ > > > + if (unlikely(vq->num_added == 65535)) > > > > This is 0xffff but why use the decimal notation? > > Interesting. Why use hex? Feels more like binary? Just easier to see it's the largest 16 bit number. > But I've changed it to "(1 << 16) - 1" to be clear. That's even better. > > > + virtqueue_kick(_vq); > > > > > > pr_debug("Added buffer head %i to %p\n", head, vq); > > > END_USE(vq); > > > > We also still need to reset vq->num_added, right? > > virtqueue_kick does that for us. > > Cheers, > Rusty. Right. ^ permalink raw reply [flat|nested] 34+ messages in thread
* RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2011-11-03 7:42 ` [PATCH 5 of 5] virtio: expose added descriptors immediately Rusty Russell 2011-11-13 21:03 ` Michael S. Tsirkin @ 2012-07-01 9:20 ` Michael S. Tsirkin 2012-07-02 1:05 ` Rusty Russell 1 sibling, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2012-07-01 9:20 UTC (permalink / raw) To: Rusty Russell; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > A virtio driver does virtqueue_add_buf() multiple times before finally > calling virtqueue_kick(); previously we only exposed the added buffers > in the virtqueue_kick() call. This means we don't need a memory > barrier in virtqueue_add_buf(), but it reduces concurrency as the > device (ie. host) can't see the buffers until the kick. > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Looking at recent mm compaction patches made me look at locking in balloon closely. And I noticed the referenced patch (commit ee7cd8981e15bcb365fc762afe3fc47b8242f630 upstream) interacts strangely with virtio balloon; balloon currently does: static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) { struct scatterlist sg; sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); init_completion(&vb->acked); /* We should always be able to add one buffer to an empty queue. */ if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); /* When host has read buffer, this completes via balloon_ack */ wait_for_completion(&vb->acked); } While vq callback does: static void balloon_ack(struct virtqueue *vq) { struct virtio_balloon *vb; unsigned int len; vb = virtqueue_get_buf(vq, &len); if (vb) complete(&vb->acked); } So virtqueue_get_buf might now run concurrently with virtqueue_kick. I audited both and this seems safe in practice but I think we need to either declare this legal at the API level or add locking in driver. Further, is there a guarantee that we never get spurious callbacks? We currently check ring not empty but esp for non shared MSI this might not be needed. If a spurious callback triggers, virtqueue_get_buf can run concurrently with virtqueue_add_buf which is known to be racy. Again I think this is currently safe as no spurious callbacks in practice but should we guarantee no spurious callbacks at the API level or add locking in driver? -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-01 9:20 ` RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) Michael S. Tsirkin @ 2012-07-02 1:05 ` Rusty Russell 2012-07-02 7:25 ` Michael S. Tsirkin 2012-07-02 7:33 ` [PATCH RFC] virtio-balloon: fix add/get API use Michael S. Tsirkin 0 siblings, 2 replies; 34+ messages in thread From: Rusty Russell @ 2012-07-02 1:05 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Sun, 1 Jul 2012 12:20:51 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > A virtio driver does virtqueue_add_buf() multiple times before finally > > calling virtqueue_kick(); previously we only exposed the added buffers > > in the virtqueue_kick() call. This means we don't need a memory > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > device (ie. host) can't see the buffers until the kick. > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > Looking at recent mm compaction patches made me look at locking > in balloon closely. And I noticed the referenced patch (commit > ee7cd8981e15bcb365fc762afe3fc47b8242f630 upstream) interacts strangely > with virtio balloon; balloon currently does: > > static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) > { > struct scatterlist sg; > > sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); > > init_completion(&vb->acked); > > /* We should always be able to add one buffer to an empty queue. */ > if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) > BUG(); > virtqueue_kick(vq); > > /* When host has read buffer, this completes via balloon_ack */ > wait_for_completion(&vb->acked); > } > > > While vq callback does: > > static void balloon_ack(struct virtqueue *vq) > { > struct virtio_balloon *vb; > unsigned int len; > > vb = virtqueue_get_buf(vq, &len); > if (vb) > complete(&vb->acked); > } > > > So virtqueue_get_buf might now run concurrently with virtqueue_kick. > I audited both and this seems safe in practice but I think Good spotting! Agreed. Because there's only add_buf, we get away with it: the add_buf must be almost finished by the time get_buf runs because the device has seen the buffer. > we need to either declare this legal at the API level > or add locking in driver. I wonder if we should just lock in the balloon driver, rather than document this corner case and set a bad example. Are there other drivers which take the same shortcut? > Further, is there a guarantee that we never get > spurious callbacks? We currently check ring not empty > but esp for non shared MSI this might not be needed. Yes, I think this saves us. A spurious interrupt won't trigger a spurious callback. > If a spurious callback triggers, virtqueue_get_buf can run > concurrently with virtqueue_add_buf which is known to be racy. > Again I think this is currently safe as no spurious callbacks in > practice but should we guarantee no spurious callbacks at the API level > or add locking in driver? I think we should guarantee it, but is there a hole in the current implementation? Thanks, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-02 1:05 ` Rusty Russell @ 2012-07-02 7:25 ` Michael S. Tsirkin 2012-07-02 16:08 ` Rafael Aquini 2012-07-02 7:33 ` [PATCH RFC] virtio-balloon: fix add/get API use Michael S. Tsirkin 1 sibling, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2012-07-02 7:25 UTC (permalink / raw) To: Rusty Russell; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Mon, Jul 02, 2012 at 10:35:47AM +0930, Rusty Russell wrote: > On Sun, 1 Jul 2012 12:20:51 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > > A virtio driver does virtqueue_add_buf() multiple times before finally > > > calling virtqueue_kick(); previously we only exposed the added buffers > > > in the virtqueue_kick() call. This means we don't need a memory > > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > > device (ie. host) can't see the buffers until the kick. > > > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > > > Looking at recent mm compaction patches made me look at locking > > in balloon closely. And I noticed the referenced patch (commit > > ee7cd8981e15bcb365fc762afe3fc47b8242f630 upstream) interacts strangely > > with virtio balloon; balloon currently does: > > > > static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) > > { > > struct scatterlist sg; > > > > sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); > > > > init_completion(&vb->acked); > > > > /* We should always be able to add one buffer to an empty queue. */ > > if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) > > BUG(); > > virtqueue_kick(vq); > > > > /* When host has read buffer, this completes via balloon_ack */ > > wait_for_completion(&vb->acked); > > } > > > > > > While vq callback does: > > > > static void balloon_ack(struct virtqueue *vq) > > { > > struct virtio_balloon *vb; > > unsigned int len; > > > > vb = virtqueue_get_buf(vq, &len); > > if (vb) > > complete(&vb->acked); > > } > > > > > > So virtqueue_get_buf might now run concurrently with virtqueue_kick. > > I audited both and this seems safe in practice but I think > > Good spotting! > > Agreed. Because there's only add_buf, we get away with it: the add_buf > must be almost finished by the time get_buf runs because the device has > seen the buffer. > > > we need to either declare this legal at the API level > > or add locking in driver. > > I wonder if we should just lock in the balloon driver, rather than > document this corner case and set a bad example. We'll need to replace &vb->acked with a waitqueue and do get_buf from the same thread. But I note that stats_request hash the same issue. Let's see if we can fix it. > Are there other > drivers which take the same shortcut? Not that I know. > > Further, is there a guarantee that we never get > > spurious callbacks? We currently check ring not empty > > but esp for non shared MSI this might not be needed. > > Yes, I think this saves us. A spurious interrupt won't trigger > a spurious callback. > > > If a spurious callback triggers, virtqueue_get_buf can run > > concurrently with virtqueue_add_buf which is known to be racy. > > Again I think this is currently safe as no spurious callbacks in > > practice but should we guarantee no spurious callbacks at the API level > > or add locking in driver? > > I think we should guarantee it, but is there a hole in the current > implementation? > > Thanks, > Rusty. Could be. The check for ring empty looks somewhat suspicious. It might be expensive to make it 100% robust - that check was intended as an optimization for shared interrupts. Whith per vq interrupts we IMO do not need the check. If we add locking in balloon I think there's no need to guarantee no spurious interrupts. -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-02 7:25 ` Michael S. Tsirkin @ 2012-07-02 16:08 ` Rafael Aquini 2012-07-03 0:47 ` Rusty Russell 2012-07-04 10:55 ` Michael S. Tsirkin 0 siblings, 2 replies; 34+ messages in thread From: Rafael Aquini @ 2012-07-02 16:08 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: linux-kernel, kvm, virtualization On Mon, Jul 02, 2012 at 10:25:58AM +0300, Michael S. Tsirkin wrote: > On Mon, Jul 02, 2012 at 10:35:47AM +0930, Rusty Russell wrote: > > On Sun, 1 Jul 2012 12:20:51 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > > > A virtio driver does virtqueue_add_buf() multiple times before finally > > > > calling virtqueue_kick(); previously we only exposed the added buffers > > > > in the virtqueue_kick() call. This means we don't need a memory > > > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > > > device (ie. host) can't see the buffers until the kick. > > > > > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > > > > > Looking at recent mm compaction patches made me look at locking > > > in balloon closely. And I noticed the referenced patch (commit > > > ee7cd8981e15bcb365fc762afe3fc47b8242f630 upstream) interacts strangely > > > with virtio balloon; balloon currently does: > > > > > > static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) > > > { > > > struct scatterlist sg; > > > > > > sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); > > > > > > init_completion(&vb->acked); > > > > > > /* We should always be able to add one buffer to an empty queue. */ > > > if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) > > > BUG(); > > > virtqueue_kick(vq); > > > > > > /* When host has read buffer, this completes via balloon_ack */ > > > wait_for_completion(&vb->acked); > > > } > > > > > > > > > While vq callback does: > > > > > > static void balloon_ack(struct virtqueue *vq) > > > { > > > struct virtio_balloon *vb; > > > unsigned int len; > > > > > > vb = virtqueue_get_buf(vq, &len); > > > if (vb) > > > complete(&vb->acked); > > > } > > > > > > > > > So virtqueue_get_buf might now run concurrently with virtqueue_kick. > > > I audited both and this seems safe in practice but I think > > > > Good spotting! > > > > Agreed. Because there's only add_buf, we get away with it: the add_buf > > must be almost finished by the time get_buf runs because the device has > > seen the buffer. > > > > > we need to either declare this legal at the API level > > > or add locking in driver. > > > > I wonder if we should just lock in the balloon driver, rather than > > document this corner case and set a bad example. > > We'll need to replace &vb->acked with a waitqueue > and do get_buf from the same thread. > But I note that stats_request hash the same issue. > Let's see if we can fix it. > > > Are there other > > drivers which take the same shortcut? > > Not that I know. > > > > Further, is there a guarantee that we never get > > > spurious callbacks? We currently check ring not empty > > > but esp for non shared MSI this might not be needed. > > > > Yes, I think this saves us. A spurious interrupt won't trigger > > a spurious callback. > > > > > If a spurious callback triggers, virtqueue_get_buf can run > > > concurrently with virtqueue_add_buf which is known to be racy. > > > Again I think this is currently safe as no spurious callbacks in > > > practice but should we guarantee no spurious callbacks at the API level > > > or add locking in driver? > > > > I think we should guarantee it, but is there a hole in the current > > implementation? > > > > Thanks, > > Rusty. > > Could be. The check for ring empty looks somewhat suspicious. > It might be expensive to make it 100% robust - that check was > intended as an optimization for shared interrupts. > Whith per vq interrupts we IMO do not need the check. > If we add locking in balloon I think there's no need > to guarantee no spurious interrupts. > As 'locking in balloon', may I assume the approach I took for the compaction case is OK and aligned to address these concerns of yours? If not, do not hesitate in giving me your thoughts, please. I'm respinning a V3 series to address a couple of extra nitpicks from the compaction standpoint, and I'd love to be able to address any extra concern you might have on the balloon side of that work. Thanks! Rafael. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-02 16:08 ` Rafael Aquini @ 2012-07-03 0:47 ` Rusty Russell 2012-07-03 16:26 ` Rafael Aquini 2012-07-04 10:55 ` Michael S. Tsirkin 2012-07-04 10:55 ` Michael S. Tsirkin 1 sibling, 2 replies; 34+ messages in thread From: Rusty Russell @ 2012-07-03 0:47 UTC (permalink / raw) To: Rafael Aquini, Michael S. Tsirkin; +Cc: linux-kernel, kvm, virtualization On Mon, 2 Jul 2012 13:08:19 -0300, Rafael Aquini <aquini@redhat.com> wrote: > As 'locking in balloon', may I assume the approach I took for the compaction case > is OK and aligned to address these concerns of yours? If not, do not hesitate in > giving me your thoughts, please. I'm respinning a V3 series to address a couple > of extra nitpicks from the compaction standpoint, and I'd love to be able to > address any extra concern you might have on the balloon side of that work. It's orthogonal, though looks like they clash textually :( I'll re-spin MST's patch on top of yours, and include both in my tree, otherwise linux-next will have to do the merge. But I'll await your push before pushing to Linus next merge window. Thanks, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-03 0:47 ` Rusty Russell @ 2012-07-03 16:26 ` Rafael Aquini 2012-07-04 10:55 ` Michael S. Tsirkin 1 sibling, 0 replies; 34+ messages in thread From: Rafael Aquini @ 2012-07-03 16:26 UTC (permalink / raw) To: Rusty Russell; +Cc: virtualization, linux-kernel, kvm, Michael S. Tsirkin On Tue, Jul 03, 2012 at 10:17:46AM +0930, Rusty Russell wrote: > On Mon, 2 Jul 2012 13:08:19 -0300, Rafael Aquini <aquini@redhat.com> wrote: > > As 'locking in balloon', may I assume the approach I took for the compaction case > > is OK and aligned to address these concerns of yours? If not, do not hesitate in > > giving me your thoughts, please. I'm respinning a V3 series to address a couple > > of extra nitpicks from the compaction standpoint, and I'd love to be able to > > address any extra concern you might have on the balloon side of that work. > > It's orthogonal, though looks like they clash textually :( > > I'll re-spin MST's patch on top of yours, and include both in my tree, > otherwise linux-next will have to do the merge. But I'll await your > push before pushing to Linus next merge window. > Thanks, Rusty. I'll post V3 series quite soon. Cheers! Rafael > Thanks, > Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-03 0:47 ` Rusty Russell 2012-07-03 16:26 ` Rafael Aquini @ 2012-07-04 10:55 ` Michael S. Tsirkin 2012-07-08 23:39 ` Rusty Russell 1 sibling, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2012-07-04 10:55 UTC (permalink / raw) To: Rusty Russell; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Tue, Jul 03, 2012 at 10:17:46AM +0930, Rusty Russell wrote: > On Mon, 2 Jul 2012 13:08:19 -0300, Rafael Aquini <aquini@redhat.com> wrote: > > As 'locking in balloon', may I assume the approach I took for the compaction case > > is OK and aligned to address these concerns of yours? If not, do not hesitate in > > giving me your thoughts, please. I'm respinning a V3 series to address a couple > > of extra nitpicks from the compaction standpoint, and I'd love to be able to > > address any extra concern you might have on the balloon side of that work. > > It's orthogonal, though looks like they clash textually :( > > I'll re-spin MST's patch on top of yours, and include both in my tree, > otherwise linux-next will have to do the merge. But I'll await your > push before pushing to Linus next merge window. > > Thanks, > Rusty. While theoretical mine is a bugfix so could be 3.5 material, no? -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-04 10:55 ` Michael S. Tsirkin @ 2012-07-08 23:39 ` Rusty Russell 0 siblings, 0 replies; 34+ messages in thread From: Rusty Russell @ 2012-07-08 23:39 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Wed, 4 Jul 2012 13:55:33 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Tue, Jul 03, 2012 at 10:17:46AM +0930, Rusty Russell wrote: > > On Mon, 2 Jul 2012 13:08:19 -0300, Rafael Aquini <aquini@redhat.com> wrote: > > > As 'locking in balloon', may I assume the approach I took for the compaction case > > > is OK and aligned to address these concerns of yours? If not, do not hesitate in > > > giving me your thoughts, please. I'm respinning a V3 series to address a couple > > > of extra nitpicks from the compaction standpoint, and I'd love to be able to > > > address any extra concern you might have on the balloon side of that work. > > > > It's orthogonal, though looks like they clash textually :( > > > > I'll re-spin MST's patch on top of yours, and include both in my tree, > > otherwise linux-next will have to do the merge. But I'll await your > > push before pushing to Linus next merge window. > > > > Thanks, > > Rusty. > > While theoretical mine is a bugfix so could be 3.5 material, no? It's a little thin, but it's a good idea. I'll try. Cheers, Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) 2012-07-02 16:08 ` Rafael Aquini 2012-07-03 0:47 ` Rusty Russell @ 2012-07-04 10:55 ` Michael S. Tsirkin 1 sibling, 0 replies; 34+ messages in thread From: Michael S. Tsirkin @ 2012-07-04 10:55 UTC (permalink / raw) To: Rafael Aquini; +Cc: linux-kernel, kvm, virtualization On Mon, Jul 02, 2012 at 01:08:19PM -0300, Rafael Aquini wrote: > On Mon, Jul 02, 2012 at 10:25:58AM +0300, Michael S. Tsirkin wrote: > > On Mon, Jul 02, 2012 at 10:35:47AM +0930, Rusty Russell wrote: > > > On Sun, 1 Jul 2012 12:20:51 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > On Thu, Nov 03, 2011 at 06:12:53PM +1030, Rusty Russell wrote: > > > > > A virtio driver does virtqueue_add_buf() multiple times before finally > > > > > calling virtqueue_kick(); previously we only exposed the added buffers > > > > > in the virtqueue_kick() call. This means we don't need a memory > > > > > barrier in virtqueue_add_buf(), but it reduces concurrency as the > > > > > device (ie. host) can't see the buffers until the kick. > > > > > > > > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> > > > > > > > > Looking at recent mm compaction patches made me look at locking > > > > in balloon closely. And I noticed the referenced patch (commit > > > > ee7cd8981e15bcb365fc762afe3fc47b8242f630 upstream) interacts strangely > > > > with virtio balloon; balloon currently does: > > > > > > > > static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) > > > > { > > > > struct scatterlist sg; > > > > > > > > sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); > > > > > > > > init_completion(&vb->acked); > > > > > > > > /* We should always be able to add one buffer to an empty queue. */ > > > > if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) > > > > BUG(); > > > > virtqueue_kick(vq); > > > > > > > > /* When host has read buffer, this completes via balloon_ack */ > > > > wait_for_completion(&vb->acked); > > > > } > > > > > > > > > > > > While vq callback does: > > > > > > > > static void balloon_ack(struct virtqueue *vq) > > > > { > > > > struct virtio_balloon *vb; > > > > unsigned int len; > > > > > > > > vb = virtqueue_get_buf(vq, &len); > > > > if (vb) > > > > complete(&vb->acked); > > > > } > > > > > > > > > > > > So virtqueue_get_buf might now run concurrently with virtqueue_kick. > > > > I audited both and this seems safe in practice but I think > > > > > > Good spotting! > > > > > > Agreed. Because there's only add_buf, we get away with it: the add_buf > > > must be almost finished by the time get_buf runs because the device has > > > seen the buffer. > > > > > > > we need to either declare this legal at the API level > > > > or add locking in driver. > > > > > > I wonder if we should just lock in the balloon driver, rather than > > > document this corner case and set a bad example. > > > > We'll need to replace &vb->acked with a waitqueue > > and do get_buf from the same thread. > > But I note that stats_request hash the same issue. > > Let's see if we can fix it. > > > > > Are there other > > > drivers which take the same shortcut? > > > > Not that I know. > > > > > > Further, is there a guarantee that we never get > > > > spurious callbacks? We currently check ring not empty > > > > but esp for non shared MSI this might not be needed. > > > > > > Yes, I think this saves us. A spurious interrupt won't trigger > > > a spurious callback. > > > > > > > If a spurious callback triggers, virtqueue_get_buf can run > > > > concurrently with virtqueue_add_buf which is known to be racy. > > > > Again I think this is currently safe as no spurious callbacks in > > > > practice but should we guarantee no spurious callbacks at the API level > > > > or add locking in driver? > > > > > > I think we should guarantee it, but is there a hole in the current > > > implementation? > > > > > > Thanks, > > > Rusty. > > > > Could be. The check for ring empty looks somewhat suspicious. > > It might be expensive to make it 100% robust - that check was > > intended as an optimization for shared interrupts. > > Whith per vq interrupts we IMO do not need the check. > > If we add locking in balloon I think there's no need > > to guarantee no spurious interrupts. > > > > As 'locking in balloon', may I assume the approach I took for the compaction case > is OK and aligned to address these concerns of yours? No, I mean the patch I posted. Not so much locking as moving get_buf to thread itself. > If not, do not hesitate in > giving me your thoughts, please. I'm respinning a V3 series to address a couple > of extra nitpicks from the compaction standpoint, and I'd love to be able to > address any extra concern you might have on the balloon side of that work. > > > Thanks! > Rafael. -- MST ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH RFC] virtio-balloon: fix add/get API use 2012-07-02 1:05 ` Rusty Russell 2012-07-02 7:25 ` Michael S. Tsirkin @ 2012-07-02 7:33 ` Michael S. Tsirkin 2012-07-04 3:27 ` Rusty Russell 1 sibling, 1 reply; 34+ messages in thread From: Michael S. Tsirkin @ 2012-07-02 7:33 UTC (permalink / raw) To: Rusty Russell; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel In virtio balloon virtqueue_get_buf might now run concurrently with virtqueue_kick. I audited both and this seems safe in practice but this is not guaranteed by the API. Additionally, a spurious interrupt might in theory make virtqueue_get_buf run in parallel with virtqueue_add_buf, which is racy. While we might try to protect against spurious callbacks it's easier to fix the driver: balloon seems to be the only one (mis)using the API like this, so let's just fix balloon. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> --- Warning: completely untested. diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index bfbc15c..a26eb4f 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -47,7 +47,7 @@ struct virtio_balloon struct task_struct *thread; /* Waiting for host to ack the pages we released. */ - struct completion acked; + wait_queue_head_t acked; /* Number of balloon pages we've told the Host we're not using. */ unsigned int num_pages; @@ -89,29 +89,26 @@ static struct page *balloon_pfn_to_page(u32 pfn) static void balloon_ack(struct virtqueue *vq) { - struct virtio_balloon *vb; - unsigned int len; + struct virtio_balloon *vb = vq->vdev->priv; - vb = virtqueue_get_buf(vq, &len); - if (vb) - complete(&vb->acked); + wake_up(&vb->acked); } static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) { struct scatterlist sg; + unsigned int len; + void *buf; sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns); - init_completion(&vb->acked); - /* We should always be able to add one buffer to an empty queue. */ if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) BUG(); virtqueue_kick(vq); /* When host has read buffer, this completes via balloon_ack */ - wait_for_completion(&vb->acked); + wait_event(vb->acked, virtqueue_get_buf(vq, &len)); } static void set_page_pfns(u32 pfns[], struct page *page) @@ -231,12 +228,8 @@ static void update_balloon_stats(struct virtio_balloon *vb) */ static void stats_request(struct virtqueue *vq) { - struct virtio_balloon *vb; - unsigned int len; + struct virtio_balloon *vb = vq->vdev->priv; - vb = virtqueue_get_buf(vq, &len); - if (!vb) - return; vb->need_stats_update = 1; wake_up(&vb->config_change); } @@ -245,11 +238,14 @@ static void stats_handle_request(struct virtio_balloon *vb) { struct virtqueue *vq; struct scatterlist sg; + unsigned int len; vb->need_stats_update = 0; update_balloon_stats(vb); vq = vb->stats_vq; + if (!virtqueue_get_buf(vq, &len)) + return; sg_init_one(&sg, vb->stats, sizeof(vb->stats)); if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0) BUG(); @@ -358,6 +354,7 @@ static int virtballoon_probe(struct virtio_device *vdev) INIT_LIST_HEAD(&vb->pages); vb->num_pages = 0; init_waitqueue_head(&vb->config_change); + init_waitqueue_head(&vb->acked); vb->vdev = vdev; vb->need_stats_update = 0; ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH RFC] virtio-balloon: fix add/get API use 2012-07-02 7:33 ` [PATCH RFC] virtio-balloon: fix add/get API use Michael S. Tsirkin @ 2012-07-04 3:27 ` Rusty Russell 0 siblings, 0 replies; 34+ messages in thread From: Rusty Russell @ 2012-07-04 3:27 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: virtualization, Rafael Aquini, kvm, linux-kernel On Mon, 2 Jul 2012 10:33:08 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote: > In virtio balloon virtqueue_get_buf might now run concurrently with > virtqueue_kick. I audited both and this seems safe in practice but > this is not guaranteed by the API. > Additionally, a spurious interrupt might in theory make > virtqueue_get_buf run in parallel with virtqueue_add_buf, which is racy. > > While we might try to protect against spurious callbacks it's > easier to fix the driver: balloon seems to be the only one > (mis)using the API like this, so let's just fix balloon. > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> I was thinking of a spinlock, but this is far more elegant. And I added an explicit reference to the 'virtio: expose added descriptors immediately.' commit in your commit msg. Kudos! Rusty. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2012-07-08 23:39 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <patchbomb.1320306168@localhost6.localdomain6>
2011-11-03  7:42 ` [PATCH 1 of 5] virtio: document functions better Rusty Russell
2011-11-03  7:49   ` Christoph Hellwig
2011-11-03  7:42 ` [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf Rusty Russell
2011-11-03  7:50   ` Christoph Hellwig
2011-11-03  7:42 ` [PATCH 3 of 5] virtio: support unlocked queue kick Rusty Russell
2011-11-03  7:52   ` Christoph Hellwig
     [not found]   ` <20111103075211.GD6993@infradead.org>
2011-11-04 10:09     ` Stefan Hajnoczi
2011-11-04 10:36     ` Rusty Russell
2011-11-03  7:42 ` [PATCH 4 of 5] virtio: avoid modulus operation Rusty Russell
2011-11-03  7:51   ` Pekka Enberg
     [not found]   ` <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com>
2011-11-03 10:18     ` Rusty Russell
2011-11-03  7:42 ` [PATCH 5 of 5] virtio: expose added descriptors immediately Rusty Russell
2011-11-13 21:03   ` Michael S. Tsirkin
2011-11-14  0:43     ` Rusty Russell
2011-11-14  6:56     ` Michael S. Tsirkin
2011-11-16  0:21       ` Rusty Russell
2011-11-16  7:18         ` Michael S. Tsirkin
2011-11-21  1:48           ` Rusty Russell
2011-11-21 11:57             ` Michael S. Tsirkin
2011-11-22  0:33               ` Rusty Russell
2011-11-22  6:29                 ` Michael S. Tsirkin
2011-11-23  1:19                   ` Rusty Russell
2011-11-23  8:30                     ` Michael S. Tsirkin
2012-07-01  9:20   ` RFD: virtio balloon API use (was Re: [PATCH 5 of 5] virtio: expose added descriptors immediately) Michael S. Tsirkin
2012-07-02  1:05     ` Rusty Russell
2012-07-02  7:25       ` Michael S. Tsirkin
2012-07-02 16:08         ` Rafael Aquini
2012-07-03  0:47           ` Rusty Russell
2012-07-03 16:26             ` Rafael Aquini
2012-07-04 10:55             ` Michael S. Tsirkin
2012-07-08 23:39               ` Rusty Russell
2012-07-04 10:55           ` Michael S. Tsirkin
2012-07-02  7:33       ` [PATCH RFC] virtio-balloon: fix add/get API use Michael S. Tsirkin
2012-07-04  3:27         ` Rusty Russell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).