[PATCH v1 0/2] block: virtio-blk: support multi vq per virtio-blk

linux-api.vger.kernel.org archive mirror

* [PATCH v1 0/2] block: virtio-blk: support multi vq per virtio-blk
@ 2014-06-20 15:29 Ming Lei
  2014-06-20 15:29 ` [PATCH v1 1/2] include/uapi/linux/virtio_blk.h: introduce feature of VIRTIO_BLK_F_MQ Ming Lei
  2014-06-20 15:29 ` [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device Ming Lei
  0 siblings, 2 replies; 6+ messages in thread
From: Ming Lei @ 2014-06-20 15:29 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Michael S. Tsirkin, linux-api, virtualization, Stefan Hajnoczi,
	Paolo Bonzini

Hi,

These patches add support for multiple virtual queues (multi-vq) in one
virtio-blk device, and map each virtual queue (vq) to a blk-mq
hardware queue.

With this approach, both the scalability and the performance of a
virtio-blk device can be improved.

To verify the improvement, I implemented virtio-blk multi-vq on top of
qemu's dataplane feature. Both the handling of host notifications
from each vq and the processing of host I/O are still kept in the
per-device iothread context. The change is based on the qemu v2.0.0
release and is available from the tree below:

        git://kernel.ubuntu.com/ming/qemu.git #v2.0.0-virtblk-mq.1

To enable the multi-vq feature, 'num_queues=N' needs to be added to
'-device virtio-blk-pci ...' on the qemu command line. I also suggest
passing 'vectors=N+1' to keep one MSI irq vector per vq. The feature
depends on x-data-plane.

Fio (libaio, randread, iodepth=64, bs=4K, jobs=N) is run inside the VM
to verify the improvement.

I just created a small quad-core VM and ran fio inside it with
num_queues of the virtio-blk device set to 2, and the improvement is
already obvious.

1) about scalability
- without multi-vq feature
        -- jobs=2, throughput: 145K iops
        -- jobs=4, throughput: 100K iops
- with multi-vq feature
        -- jobs=2, throughput: 186K iops
        -- jobs=4, throughput: 199K iops

2) about throughput
- without multi-vq feature
        -- top throughput: 145K iops
- with multi-vq feature
        -- top throughput: 199K iops

So in my test, even for a quad-core VM, increasing the number of
virtqueues from 1 to 2 improves both scalability and performance
a lot: at jobs=4, throughput goes from 100K to 199K iops.
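As a quick sanity check of the figures above, the relative gains can be worked out with a few lines of C. This is only an arithmetic sketch over the numbers reported in this mail; the function names are made up for illustration.

```c
#include <stdio.h>

/* Relative throughput gain, in percent, of `with_mq` over `without_mq`. */
double gain_percent(double without_mq, double with_mq)
{
	return (with_mq - without_mq) / without_mq * 100.0;
}

/* Print the gains for the comparisons reported above (values in K iops). */
void print_gains(void)
{
	printf("jobs=2: %+.1f%%\n", gain_percent(145.0, 186.0));
	printf("jobs=4: %+.1f%%\n", gain_percent(100.0, 199.0));
	printf("top:    %+.1f%%\n", gain_percent(145.0, 199.0));
	/* Scaling from jobs=2 to jobs=4 within each configuration. */
	printf("single vq, 2->4 jobs: %+.1f%%\n", gain_percent(145.0, 100.0));
	printf("multi vq,  2->4 jobs: %+.1f%%\n", gain_percent(186.0, 199.0));
}
```

With these inputs the gain is roughly 28% at jobs=2, 99% at jobs=4, and 37% for top throughput. The last two lines show the scalability point: with a single vq, throughput drops when going from 2 to 4 jobs, while with two vqs it still rises slightly.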

V1:
	- remove RFC since no one objects
	- add '__u8 unused' for padding as suggested by Rusty
	- use virtio_cread_feature() directly, suggested by Rusty

Thanks,
--
Ming Lei


* [PATCH v1 1/2] include/uapi/linux/virtio_blk.h: introduce feature of VIRTIO_BLK_F_MQ
  2014-06-20 15:29 [PATCH v1 0/2] block: virtio-blk: support multi vq per virtio-blk Ming Lei
@ 2014-06-20 15:29 ` Ming Lei
  2014-06-20 15:29 ` [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device Ming Lei
  1 sibling, 0 replies; 6+ messages in thread
From: Ming Lei @ 2014-06-20 15:29 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Michael S. Tsirkin, linux-api, Ming Lei, virtualization,
	Stefan Hajnoczi, Paolo Bonzini

The current virtio-blk spec supports only one virtual queue for
transferring data between the VM and the host, and inside the VM all
operations on the virtual queue need to hold one lock, which causes
the problems below:

	- bad scalability
	- bad throughput

This patch introduces the VIRTIO_BLK_F_MQ feature so that more than
one virtual queue can be used by a virtio-blk device, and the above
problems can be solved or eased.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 include/uapi/linux/virtio_blk.h |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/virtio_blk.h b/include/uapi/linux/virtio_blk.h
index 6d8e61c..9ad67b2 100644
--- a/include/uapi/linux/virtio_blk.h
+++ b/include/uapi/linux/virtio_blk.h
@@ -40,6 +40,7 @@
 #define VIRTIO_BLK_F_WCE	9	/* Writeback mode enabled after reset */
 #define VIRTIO_BLK_F_TOPOLOGY	10	/* Topology information is available */
 #define VIRTIO_BLK_F_CONFIG_WCE	11	/* Writeback mode available in config */
+#define VIRTIO_BLK_F_MQ		12	/* support more than one vq */
 
 #ifndef __KERNEL__
 /* Old (deprecated) name for VIRTIO_BLK_F_WCE. */
@@ -77,6 +78,10 @@ struct virtio_blk_config {
 
 	/* writeback mode (if VIRTIO_BLK_F_CONFIG_WCE) */
 	__u8 wce;
+	__u8 unused;
+
+	/* number of vqs, only available when VIRTIO_BLK_F_MQ is set */
+	__u16 num_queues;
 } __attribute__((packed));
 
 /*
-- 
1.7.9.5
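
The effect of the new '__u8 unused' field on the config layout can be checked from userspace. The sketch below is a mirror of the config structure, not the kernel header: the fields before 'wce' are not shown in this patch and are reproduced here as an assumption about the header of that era, and the names blk_config_mirror, MIRROR_BLK_F_MQ and effective_num_queues are hypothetical. Endianness of the config space is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the virtio-blk config layout (assumed, see above). */
struct blk_config_mirror {
	uint64_t capacity;
	uint32_t size_max;
	uint32_t seg_max;
	struct {
		uint16_t cylinders;
		uint8_t  heads;
		uint8_t  sectors;
	} geometry;
	uint32_t blk_size;
	uint8_t  physical_block_exp;
	uint8_t  alignment_offset;
	uint16_t min_io_size;
	uint32_t opt_io_size;
	uint8_t  wce;
	uint8_t  unused;	/* padding added by this patch */
	uint16_t num_queues;	/* meaningful only with VIRTIO_BLK_F_MQ */
} __attribute__((packed));

/* Without 'unused', num_queues would sit at the odd offset 33. */
static_assert(offsetof(struct blk_config_mirror, wce) == 32,
	      "wce expected at offset 32");
static_assert(offsetof(struct blk_config_mirror, num_queues) == 34,
	      "num_queues expected at a 2-byte aligned offset");

#define MIRROR_BLK_F_MQ 12	/* same bit number as VIRTIO_BLK_F_MQ */

/* Number of queues a driver would use: 1 unless the feature bit is set. */
unsigned int effective_num_queues(uint64_t features,
				  const struct blk_config_mirror *cfg)
{
	if (!(features & (1ULL << MIRROR_BLK_F_MQ)))
		return 1;
	return cfg->num_queues;
}
```

The point of the padding byte is visible in the second assertion: it keeps the 16-bit num_queues field naturally aligned even though the structure is packed.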


* [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device
  2014-06-20 15:29 [PATCH v1 0/2] block: virtio-blk: support multi vq per virtio-blk Ming Lei
  2014-06-20 15:29 ` [PATCH v1 1/2] include/uapi/linux/virtio_blk.h: introduce feature of VIRTIO_BLK_F_MQ Ming Lei
@ 2014-06-20 15:29 ` Ming Lei
  2014-06-22 10:24   ` Michael S. Tsirkin
  1 sibling, 1 reply; 6+ messages in thread
From: Ming Lei @ 2014-06-20 15:29 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel
  Cc: Michael S. Tsirkin, linux-api, Ming Lei, virtualization,
	Stefan Hajnoczi, Paolo Bonzini

First, this patch adds support for more than one virtual queue per
virtio-blk device.

Second, this patch maps each virtual queue to a blk-mq hardware queue.

With this approach, both scalability and performance can be improved.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/virtio_blk.c |   70 +++++++++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 20 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index f63d358..7c3d686 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -16,6 +16,8 @@
 
 #define PART_BITS 4
 
+#define MAX_NUM_VQ 16
+
 static int major;
 static DEFINE_IDA(vd_index_ida);
 
@@ -24,8 +26,8 @@ static struct workqueue_struct *virtblk_wq;
 struct virtio_blk
 {
 	struct virtio_device *vdev;
-	struct virtqueue *vq;
-	spinlock_t vq_lock;
+	struct virtqueue *vq[MAX_NUM_VQ];
+	spinlock_t vq_lock[MAX_NUM_VQ];
 
 	/* The disk structure for the kernel. */
 	struct gendisk *disk;
@@ -47,6 +49,9 @@ struct virtio_blk
 
 	/* Ida index - used to track minor number allocations. */
 	int index;
+
+	/* num of vqs */
+	int num_vqs;
 };
 
 struct virtblk_req
@@ -133,14 +138,15 @@ static void virtblk_done(struct virtqueue *vq)
 {
 	struct virtio_blk *vblk = vq->vdev->priv;
 	bool req_done = false;
+	int qid = vq->index;
 	struct virtblk_req *vbr;
 	unsigned long flags;
 	unsigned int len;
 
-	spin_lock_irqsave(&vblk->vq_lock, flags);
+	spin_lock_irqsave(&vblk->vq_lock[qid], flags);
 	do {
 		virtqueue_disable_cb(vq);
-		while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
+		while ((vbr = virtqueue_get_buf(vblk->vq[qid], &len)) != NULL) {
 			blk_mq_complete_request(vbr->req);
 			req_done = true;
 		}
@@ -151,7 +157,7 @@ static void virtblk_done(struct virtqueue *vq)
 	/* In case queue is stopped waiting for more buffers. */
 	if (req_done)
 		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
-	spin_unlock_irqrestore(&vblk->vq_lock, flags);
+	spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
 }
 
 static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
@@ -160,6 +166,7 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
 	struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
 	unsigned long flags;
 	unsigned int num;
+	int qid = hctx->queue_num;
 	const bool last = (req->cmd_flags & REQ_END) != 0;
 	int err;
 	bool notify = false;
@@ -202,12 +209,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
 			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
 	}
 
-	spin_lock_irqsave(&vblk->vq_lock, flags);
-	err = __virtblk_add_req(vblk->vq, vbr, vbr->sg, num);
+	spin_lock_irqsave(&vblk->vq_lock[qid], flags);
+	err = __virtblk_add_req(vblk->vq[qid], vbr, vbr->sg, num);
 	if (err) {
-		virtqueue_kick(vblk->vq);
+		virtqueue_kick(vblk->vq[qid]);
 		blk_mq_stop_hw_queue(hctx);
-		spin_unlock_irqrestore(&vblk->vq_lock, flags);
+		spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
 		/* Out of mem doesn't actually happen, since we fall back
 		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
@@ -215,12 +222,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
 		return BLK_MQ_RQ_QUEUE_ERROR;
 	}
 
-	if (last && virtqueue_kick_prepare(vblk->vq))
+	if (last && virtqueue_kick_prepare(vblk->vq[qid]))
 		notify = true;
-	spin_unlock_irqrestore(&vblk->vq_lock, flags);
+	spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
 
 	if (notify)
-		virtqueue_notify(vblk->vq);
+		virtqueue_notify(vblk->vq[qid]);
 	return BLK_MQ_RQ_QUEUE_OK;
 }
 
@@ -377,12 +384,35 @@ static void virtblk_config_changed(struct virtio_device *vdev)
 static int init_vq(struct virtio_blk *vblk)
 {
 	int err = 0;
+	int i;
+	vq_callback_t *callbacks[MAX_NUM_VQ];
+	const char *names[MAX_NUM_VQ];
+	unsigned short num_vqs;
+	struct virtio_device *vdev = vblk->vdev;
 
-	/* We expect one virtqueue, for output. */
-	vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
-	if (IS_ERR(vblk->vq))
-		err = PTR_ERR(vblk->vq);
+	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
+				   struct virtio_blk_config, num_queues,
+				   &num_vqs);
+	if (err)
+		num_vqs = 1;
+	if (num_vqs > MAX_NUM_VQ)
+		num_vqs = MAX_NUM_VQ;
 
+	for (i = 0; i < num_vqs; i++) {
+		callbacks[i] = virtblk_done;
+		names[i] = "requests";
+	}
+
+	/* Discover virtqueues and write information to configuration.  */
+	err = vdev->config->find_vqs(vdev, num_vqs, vblk->vq,
+			callbacks, names);
+	if (err)
+		goto out;
+
+	for (i = 0; i < num_vqs; i++)
+		spin_lock_init(&vblk->vq_lock[i]);
+	vblk->num_vqs = num_vqs;
+out:
 	return err;
 }
 
@@ -551,7 +581,6 @@ static int virtblk_probe(struct virtio_device *vdev)
 	err = init_vq(vblk);
 	if (err)
 		goto out_free_vblk;
-	spin_lock_init(&vblk->vq_lock);
 
 	/* FIXME: How many partitions?  How long is a piece of string? */
 	vblk->disk = alloc_disk(1 << PART_BITS);
@@ -562,7 +591,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 
 	/* Default queue sizing is to fill the ring. */
 	if (!virtblk_queue_depth) {
-		virtblk_queue_depth = vblk->vq->num_free;
+		virtblk_queue_depth = vblk->vq[0]->num_free;
 		/* ... but without indirect descs, we use 2 descs per req */
 		if (!virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC))
 			virtblk_queue_depth /= 2;
@@ -570,7 +599,6 @@ static int virtblk_probe(struct virtio_device *vdev)
 
 	memset(&vblk->tag_set, 0, sizeof(vblk->tag_set));
 	vblk->tag_set.ops = &virtio_mq_ops;
-	vblk->tag_set.nr_hw_queues = 1;
 	vblk->tag_set.queue_depth = virtblk_queue_depth;
 	vblk->tag_set.numa_node = NUMA_NO_NODE;
 	vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
@@ -578,6 +606,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 		sizeof(struct virtblk_req) +
 		sizeof(struct scatterlist) * sg_elems;
 	vblk->tag_set.driver_data = vblk;
+	vblk->tag_set.nr_hw_queues = vblk->num_vqs;
 
 	err = blk_mq_alloc_tag_set(&vblk->tag_set);
 	if (err)
@@ -777,7 +806,8 @@ static const struct virtio_device_id id_table[] = {
 static unsigned int features[] = {
 	VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
 	VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SCSI,
-	VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE
+	VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
+	VIRTIO_BLK_F_MQ,
 };
 
 static struct virtio_driver virtio_blk = {
-- 
1.7.9.5
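
The queue-count selection done in init_vq() above can be modelled in isolation. This is a userspace sketch of that logic only, assuming the feature check and the config value are passed in as plain arguments; pick_num_vqs is a hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>

#define MAX_NUM_VQ 16	/* same cap as in the patch */

/*
 * Model of the selection in init_vq(): fall back to a single queue when
 * VIRTIO_BLK_F_MQ is not available, otherwise take the device-reported
 * count, clamped to MAX_NUM_VQ.
 */
unsigned short pick_num_vqs(bool has_mq_feature, unsigned short cfg_num_queues)
{
	unsigned short num_vqs = has_mq_feature ? cfg_num_queues : 1;

	if (num_vqs > MAX_NUM_VQ)
		num_vqs = MAX_NUM_VQ;
	return num_vqs;
}
```

For example, a device without the feature yields 1, a device reporting 2 queues yields 2, and a device reporting 64 queues is clamped to 16. Like the patch, the sketch does not special-case a device that reports 0 queues.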


* Re: [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device
  2014-06-20 15:29 ` [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device Ming Lei
@ 2014-06-22 10:24   ` Michael S. Tsirkin
  2014-06-23  3:42     ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2014-06-22 10:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-api, linux-kernel, virtualization,
	Stefan Hajnoczi, Paolo Bonzini

On Fri, Jun 20, 2014 at 11:29:40PM +0800, Ming Lei wrote:
> Firstly this patch supports more than one virtual queues for virtio-blk
> device.
> 
> Secondly this patch maps the virtual queue to blk-mq's hardware queue.
> 
> With this approach, both scalability and performance can be improved.
> 
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
>  drivers/block/virtio_blk.c |   70 +++++++++++++++++++++++++++++++-------------
>  1 file changed, 50 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index f63d358..7c3d686 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -16,6 +16,8 @@
>  
>  #define PART_BITS 4
>  
> +#define MAX_NUM_VQ 16
> +
>  static int major;
>  static DEFINE_IDA(vd_index_ida);
>  

Does it work much worse if we just use as many queues as
hardware supports, allocating as much memory as necessary?


> @@ -24,8 +26,8 @@ static struct workqueue_struct *virtblk_wq;
>  struct virtio_blk
>  {
>  	struct virtio_device *vdev;
> -	struct virtqueue *vq;
> -	spinlock_t vq_lock;
> +	struct virtqueue *vq[MAX_NUM_VQ];
> +	spinlock_t vq_lock[MAX_NUM_VQ];

An array of
struct {
    struct virtqueue *vq;
    spinlock_t lock;
}
would use more memory but would get us better locality.
It might even make sense to add padding to avoid
cacheline sharing between two unrelated VQs.
Want to try?

>  
>  	/* The disk structure for the kernel. */
>  	struct gendisk *disk;
> @@ -47,6 +49,9 @@ struct virtio_blk
>  
>  	/* Ida index - used to track minor number allocations. */
>  	int index;
> +
> +	/* num of vqs */
> +	int num_vqs;
>  };
>  
>  struct virtblk_req
> @@ -133,14 +138,15 @@ static void virtblk_done(struct virtqueue *vq)
>  {
>  	struct virtio_blk *vblk = vq->vdev->priv;
>  	bool req_done = false;
> +	int qid = vq->index;
>  	struct virtblk_req *vbr;
>  	unsigned long flags;
>  	unsigned int len;
>  
> -	spin_lock_irqsave(&vblk->vq_lock, flags);
> +	spin_lock_irqsave(&vblk->vq_lock[qid], flags);
>  	do {
>  		virtqueue_disable_cb(vq);
> -		while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
> +		while ((vbr = virtqueue_get_buf(vblk->vq[qid], &len)) != NULL) {
>  			blk_mq_complete_request(vbr->req);
>  			req_done = true;
>  		}
> @@ -151,7 +157,7 @@ static void virtblk_done(struct virtqueue *vq)
>  	/* In case queue is stopped waiting for more buffers. */
>  	if (req_done)
>  		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
> -	spin_unlock_irqrestore(&vblk->vq_lock, flags);
> +	spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
>  }
>  
>  static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
> @@ -160,6 +166,7 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
>  	struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
>  	unsigned long flags;
>  	unsigned int num;
> +	int qid = hctx->queue_num;
>  	const bool last = (req->cmd_flags & REQ_END) != 0;
>  	int err;
>  	bool notify = false;
> @@ -202,12 +209,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
>  			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
>  	}
>  
> -	spin_lock_irqsave(&vblk->vq_lock, flags);
> -	err = __virtblk_add_req(vblk->vq, vbr, vbr->sg, num);
> +	spin_lock_irqsave(&vblk->vq_lock[qid], flags);
> +	err = __virtblk_add_req(vblk->vq[qid], vbr, vbr->sg, num);
>  	if (err) {
> -		virtqueue_kick(vblk->vq);
> +		virtqueue_kick(vblk->vq[qid]);
>  		blk_mq_stop_hw_queue(hctx);
> -		spin_unlock_irqrestore(&vblk->vq_lock, flags);
> +		spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
>  		/* Out of mem doesn't actually happen, since we fall back
>  		 * to direct descriptors */
>  		if (err == -ENOMEM || err == -ENOSPC)
> @@ -215,12 +222,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
>  		return BLK_MQ_RQ_QUEUE_ERROR;
>  	}
>  
> -	if (last && virtqueue_kick_prepare(vblk->vq))
> +	if (last && virtqueue_kick_prepare(vblk->vq[qid]))
>  		notify = true;
> -	spin_unlock_irqrestore(&vblk->vq_lock, flags);
> +	spin_unlock_irqrestore(&vblk->vq_lock[qid], flags);
>  
>  	if (notify)
> -		virtqueue_notify(vblk->vq);
> +		virtqueue_notify(vblk->vq[qid]);
>  	return BLK_MQ_RQ_QUEUE_OK;
>  }
>  
> @@ -377,12 +384,35 @@ static void virtblk_config_changed(struct virtio_device *vdev)
>  static int init_vq(struct virtio_blk *vblk)
>  {
>  	int err = 0;
> +	int i;
> +	vq_callback_t *callbacks[MAX_NUM_VQ];
> +	const char *names[MAX_NUM_VQ];
> +	unsigned short num_vqs;
> +	struct virtio_device *vdev = vblk->vdev;
>  
> -	/* We expect one virtqueue, for output. */
> -	vblk->vq = virtio_find_single_vq(vblk->vdev, virtblk_done, "requests");
> -	if (IS_ERR(vblk->vq))
> -		err = PTR_ERR(vblk->vq);
> +	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
> +				   struct virtio_blk_config, num_queues,
> +				   &num_vqs);
> +	if (err)
> +		num_vqs = 1;
> +	if (num_vqs > MAX_NUM_VQ)
> +		num_vqs = MAX_NUM_VQ;
>  
> +	for (i = 0; i < num_vqs; i++) {
> +		callbacks[i] = virtblk_done;
> +		names[i] = "requests";
> +	}
> +

This will name all VQs the same which makes debugging harder.
Better give each one a distinct name.
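
One possible way to do this is sketched below in userspace C. The "req.%u" pattern, VQ_NAME_LEN and fill_vq_names are assumptions made for illustration, not something this thread settles on. The names live in static storage here because, as far as I know, the virtqueue keeps the name pointer rather than copying the string; a driver would want per-device storage instead.

```c
#include <stdio.h>

#define MAX_NUM_VQ  16	/* same cap as in the patch */
#define VQ_NAME_LEN 16	/* hypothetical buffer size per name */

/* Backing storage for the names; must outlive the virtqueues. */
static char vq_names[MAX_NUM_VQ][VQ_NAME_LEN];

/* Fill names[0..num_vqs-1] with distinct strings "req.0", "req.1", ... */
void fill_vq_names(const char *names[], unsigned int num_vqs)
{
	unsigned int i;

	for (i = 0; i < num_vqs && i < MAX_NUM_VQ; i++) {
		snprintf(vq_names[i], VQ_NAME_LEN, "req.%u", i);
		names[i] = vq_names[i];
	}
}
```

The resulting array could then be passed where the patch currently passes an array filled with "requests".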

> +	/* Discover virtqueues and write information to configuration.  */
> +	err = vdev->config->find_vqs(vdev, num_vqs, vblk->vq,
> +			callbacks, names);
> +	if (err)
> +		goto out;
> +
> +	for (i = 0; i < num_vqs; i++)
> +		spin_lock_init(&vblk->vq_lock[i]);
> +	vblk->num_vqs = num_vqs;
> +out:
>  	return err;
>  }
>  
> @@ -551,7 +581,6 @@ static int virtblk_probe(struct virtio_device *vdev)
>  	err = init_vq(vblk);
>  	if (err)
>  		goto out_free_vblk;
> -	spin_lock_init(&vblk->vq_lock);
>  
>  	/* FIXME: How many partitions?  How long is a piece of string? */
>  	vblk->disk = alloc_disk(1 << PART_BITS);
> @@ -562,7 +591,7 @@ static int virtblk_probe(struct virtio_device *vdev)
>  
>  	/* Default queue sizing is to fill the ring. */
>  	if (!virtblk_queue_depth) {
> -		virtblk_queue_depth = vblk->vq->num_free;
> +		virtblk_queue_depth = vblk->vq[0]->num_free;
>  		/* ... but without indirect descs, we use 2 descs per req */
>  		if (!virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC))
>  			virtblk_queue_depth /= 2;
> @@ -570,7 +599,6 @@ static int virtblk_probe(struct virtio_device *vdev)
>  
>  	memset(&vblk->tag_set, 0, sizeof(vblk->tag_set));
>  	vblk->tag_set.ops = &virtio_mq_ops;
> -	vblk->tag_set.nr_hw_queues = 1;
>  	vblk->tag_set.queue_depth = virtblk_queue_depth;
>  	vblk->tag_set.numa_node = NUMA_NO_NODE;
>  	vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
> @@ -578,6 +606,7 @@ static int virtblk_probe(struct virtio_device *vdev)
>  		sizeof(struct virtblk_req) +
>  		sizeof(struct scatterlist) * sg_elems;
>  	vblk->tag_set.driver_data = vblk;
> +	vblk->tag_set.nr_hw_queues = vblk->num_vqs;
>  
>  	err = blk_mq_alloc_tag_set(&vblk->tag_set);
>  	if (err)
> @@ -777,7 +806,8 @@ static const struct virtio_device_id id_table[] = {
>  static unsigned int features[] = {
>  	VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
>  	VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SCSI,
> -	VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE
> +	VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
> +	VIRTIO_BLK_F_MQ,
>  };
>  
>  static struct virtio_driver virtio_blk = {
> -- 
> 1.7.9.5


* Re: [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device
  2014-06-22 10:24   ` Michael S. Tsirkin
@ 2014-06-23  3:42     ` Dave Chinner
  2014-06-23  6:47       ` Michael S. Tsirkin
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2014-06-23  3:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, linux-api, Ming Lei, linux-kernel, virtualization,
	Stefan Hajnoczi, Paolo Bonzini

On Sun, Jun 22, 2014 at 01:24:48PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 20, 2014 at 11:29:40PM +0800, Ming Lei wrote:
> > @@ -24,8 +26,8 @@ static struct workqueue_struct *virtblk_wq;
> >  struct virtio_blk
> >  {
> >  	struct virtio_device *vdev;
> > -	struct virtqueue *vq;
> > -	spinlock_t vq_lock;
> > +	struct virtqueue *vq[MAX_NUM_VQ];
> > +	spinlock_t vq_lock[MAX_NUM_VQ];
> 
> array of struct {
>     *vq;
>     spinlock_t lock;
> }
> would use more memory but would get us better locality.
> It might even make sense to add padding to avoid
> cacheline sharing between two unrelated VQs.
> Want to try?

It's still false sharing because the queue objects share cachelines.
To operate without contention they have to be physically separated
from each other like so:

struct vq {
	struct virtqueue	*q;
	spinlock_t		lock;
} ____cacheline_aligned_in_smp;

struct some_other_struct {
	....
	struct vq	vq[MAX_NUM_VQ];
	....
};

This keeps locality to objects within a queue, but separates each
queue onto its own cacheline....
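
The layout effect can be checked in userspace with C11 alignas, which stands in here for ____cacheline_aligned_in_smp. This is a sketch under assumptions: a 64-byte cacheline, a void pointer in place of struct virtqueue *, and an int in place of spinlock_t; all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumed cacheline size */
#define MAX_NUM_VQ      16

/* One slot per queue, aligned so that each slot starts a new cacheline. */
struct queue_slot {
	alignas(CACHELINE_BYTES) void *q;	/* stands in for struct virtqueue * */
	int lock;				/* stands in for spinlock_t */
};

struct blk_dev_model {
	int index;
	struct queue_slot vq[MAX_NUM_VQ];
};

/* Each slot is padded out to a full cacheline. */
static_assert(sizeof(struct queue_slot) == CACHELINE_BYTES,
	      "slot should occupy exactly one cacheline");
static_assert(alignof(struct queue_slot) == CACHELINE_BYTES,
	      "slot should be cacheline aligned");

/* Distance in bytes between two neighbouring slots in the array. */
size_t slot_stride(const struct blk_dev_model *d)
{
	return (size_t)((const char *)&d->vq[1] - (const char *)&d->vq[0]);
}
```

The pointer and the lock of one queue share a cacheline, while neighbouring queues are a full cacheline apart, at the cost of padding each slot out to 64 bytes.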

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v1 2/2] block: virtio-blk: support multi virt queues per virtio-blk device
  2014-06-23  3:42     ` Dave Chinner
@ 2014-06-23  6:47       ` Michael S. Tsirkin
  0 siblings, 0 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2014-06-23  6:47 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Ming Lei, Jens Axboe, linux-kernel, Rusty Russell, linux-api,
	virtualization, Stefan Hajnoczi, Paolo Bonzini

On Mon, Jun 23, 2014 at 01:42:51PM +1000, Dave Chinner wrote:
> On Sun, Jun 22, 2014 at 01:24:48PM +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 20, 2014 at 11:29:40PM +0800, Ming Lei wrote:
> > > @@ -24,8 +26,8 @@ static struct workqueue_struct *virtblk_wq;
> > >  struct virtio_blk
> > >  {
> > >  	struct virtio_device *vdev;
> > > -	struct virtqueue *vq;
> > > -	spinlock_t vq_lock;
> > > +	struct virtqueue *vq[MAX_NUM_VQ];
> > > +	spinlock_t vq_lock[MAX_NUM_VQ];
> > 
> > array of struct {
> >     *vq;
> >     spinlock_t lock;
> > }
> > would use more memory but would get us better locality.
> > It might even make sense to add padding to avoid
> > cacheline sharing between two unrelated VQs.
> > Want to try?
> 
> It's still false sharing because the queue objects share cachelines.
> To operate without contention they have to be physically separated
> from each other like so:
> 
> struct vq {
> 	struct virtqueue	*q;
> 	spinlock_t		lock;
> } ____cacheline_aligned_in_smp;

Exactly, that's what I meant by padding above.

> struct some_other_struct {
> 	....
> 	struct vq	vq[MAX_NUM_VQ];
> 	....
> };
> 
> This keeps locality to objects within a queue, but separates each
> queue onto it's own cacheline....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

To reduce the amount of memory wasted, we could add
the lock to the VQ itself.
This wastes 8 bytes of memory for devices which don't need it, but
we can save that elsewhere (e.g. by getting rid of the list and
the priv pointer).

How's this?  Your patch would go on top.
Care to benchmark it and tell us whether it makes sense?
If it does, please let me know and I'll send an official patchset.

-->

virtio-blk: move spinlock to vq itself

Signed-off-by: Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

--

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b46671e..0951b21 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -19,6 +19,7 @@
  * @priv: a pointer for the virtqueue implementation to use.
  * @index: the zero-based ordinal number for this queue.
  * @num_free: number of elements we expect to be able to fit.
+ * @lock: lock for optional use by devices. If used, devices must initialize it.
  *
  * A note on @num_free: with indirect buffers, each buffer needs one
  * element in the queue, otherwise a buffer will need one element per
@@ -31,6 +32,7 @@ struct virtqueue {
 	struct virtio_device *vdev;
 	unsigned int index;
 	unsigned int num_free;
+	spinlock_t lock;
 	void *priv;
 };
 
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index f63d358..a3cdc19 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -25,7 +25,6 @@ struct virtio_blk
 {
 	struct virtio_device *vdev;
 	struct virtqueue *vq;
-	spinlock_t vq_lock;
 
 	/* The disk structure for the kernel. */
 	struct gendisk *disk;
@@ -137,7 +136,7 @@ static void virtblk_done(struct virtqueue *vq)
 	unsigned long flags;
 	unsigned int len;
 
-	spin_lock_irqsave(&vblk->vq_lock, flags);
+	spin_lock_irqsave(&vblk->vq->lock, flags);
 	do {
 		virtqueue_disable_cb(vq);
 		while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
@@ -151,7 +150,7 @@ static void virtblk_done(struct virtqueue *vq)
 	/* In case queue is stopped waiting for more buffers. */
 	if (req_done)
 		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
-	spin_unlock_irqrestore(&vblk->vq_lock, flags);
+	spin_unlock_irqrestore(&vblk->vq->lock, flags);
 }
 
 static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
@@ -202,12 +201,12 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
 			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
 	}
 
-	spin_lock_irqsave(&vblk->vq_lock, flags);
+	spin_lock_irqsave(&vblk->vq->lock, flags);
 	err = __virtblk_add_req(vblk->vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vblk->vq);
 		blk_mq_stop_hw_queue(hctx);
-		spin_unlock_irqrestore(&vblk->vq_lock, flags);
+		spin_unlock_irqrestore(&vblk->vq->lock, flags);
 		/* Out of mem doesn't actually happen, since we fall back
 		 * to direct descriptors */
 		if (err == -ENOMEM || err == -ENOSPC)
@@ -217,7 +216,7 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
 
 	if (last && virtqueue_kick_prepare(vblk->vq))
 		notify = true;
-	spin_unlock_irqrestore(&vblk->vq_lock, flags);
+	spin_unlock_irqrestore(&vblk->vq->lock, flags);
 
 	if (notify)
 		virtqueue_notify(vblk->vq);
@@ -551,7 +550,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	err = init_vq(vblk);
 	if (err)
 		goto out_free_vblk;
-	spin_lock_init(&vblk->vq_lock);
+	spin_lock_init(&vblk->vq->lock);
 
 	/* FIXME: How many partitions?  How long is a piece of string? */
 	vblk->disk = alloc_disk(1 << PART_BITS);

