* [Qemu-devel] [PATCH v5 0/3] linux-aio: introduce submitting I/O as a batch
@ 2014-07-04  3:06 Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 1/3] block: introduce APIs for submitting I/O " Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Ming Lei @ 2014-07-04  3:06 UTC
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin

Hi,

Commit 580b6b2aa2 (dataplane: use the QEMU block layer for I/O)
introduced a ~40% throughput regression on virtio-blk dataplane, and
one of the causes is that submitting I/O as a batch was removed.

This patchset reintroduces that mechanism in the block layer; at least
linux-aio can benefit from it.

With these patches, throughput on virtio-blk dataplane is observed to
improve considerably; see the data in the commit log of patch 3/3.

It should be possible to apply the batch mechanism to other devices
(such as virtio-scsi) too.
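
For illustration, here is a minimal sketch of how a device model is
expected to use the new interfaces (the helper names here are
hypothetical; the real user is handle_notify() in patch 3/3):

	/* Sketch only, not code from this series. */
	static void handle_batch(BlockDriverState *bs)
	{
	    bdrv_io_plug(bs);               /* queue AIO requests instead of
	                                       submitting them one by one */
	    while (more_requests_pending()) {  /* hypothetical predicate */
	        process_one_request(bs);       /* hypothetical handler; its
	                                          bdrv_aio_readv/writev calls
	                                          are queued */
	    }
	    bdrv_io_unplug(bs);             /* one io_submit() for the batch */
	}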

TODO:
	- support queuing I/O to multiple files for SCSI devices, which
	  needs some changes to linux-aio

V5:
	- rebase on v2.1.0-rc0 of qemu.git/master
	- block/linux-aio.c code style fix
	- don't flush the io queue before a flush, pointed out by Paolo

V4:
	- support other non-raw formats, though their performance is not
	  yet optimized
	- use reference counter for plug & unplug
	- flush io queue before sending flush command

V3:
	- only support submitting I/O as a batch for the raw format,
	  pointed out by Kevin

V2:
	- define return value of bdrv_io_unplug as void, suggested by Paolo
	- avoid busy-wait for handling io_submit
V1:
	- move queuing io stuff into linux-aio.c as suggested by Paolo


Thanks,
--
Ming Lei

* [Qemu-devel] [PATCH v5 1/3] block: introduce APIs for submitting I/O as a batch
  2014-07-04  3:06 [Qemu-devel] [PATCH v5 0/3] linux-aio: introduce submitting I/O as a batch Ming Lei
@ 2014-07-04  3:06 ` Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 3/3] dataplane: submit I/O as a batch Ming Lei
  2 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-07-04  3:06 UTC
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

This patch introduces three APIs so that the following patches
can queue I/O requests and submit them as a batch, improving
I/O performance.
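
As a hedged sketch of the intended calling convention (the comments
describe the semantics; this snippet is not part of the patch):

	bdrv_io_plug(bs);          /* driver starts queuing AIO requests   */
	/* ... bdrv_aio_readv()/bdrv_aio_writev() requests are queued ... */
	bdrv_flush_io_queue(bs);   /* submit what is queued so far, while  */
	                           /* keeping the queue plugged            */
	/* ... more requests may be queued here ...                        */
	bdrv_io_unplug(bs);        /* submit the rest and stop queuing     */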

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block.c                   |   31 +++++++++++++++++++++++++++++++
 include/block/block.h     |    4 ++++
 include/block/block_int.h |    5 +++++
 3 files changed, 40 insertions(+)

diff --git a/block.c b/block.c
index f80e2b2..8800a6b 100644
--- a/block.c
+++ b/block.c
@@ -1905,6 +1905,7 @@ void bdrv_drain_all(void)
             bool bs_busy;
 
             aio_context_acquire(aio_context);
+            bdrv_flush_io_queue(bs);
             bdrv_start_throttled_reqs(bs);
             bs_busy = bdrv_requests_pending(bs);
             bs_busy |= aio_poll(aio_context, bs_busy);
@@ -5782,3 +5783,33 @@ BlockDriverState *check_to_replace_node(const char *node_name, Error **errp)
 
     return to_replace_bs;
 }
+
+void bdrv_io_plug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_plug) {
+        drv->bdrv_io_plug(bs);
+    } else if (bs->file) {
+        bdrv_io_plug(bs->file);
+    }
+}
+
+void bdrv_io_unplug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_unplug) {
+        drv->bdrv_io_unplug(bs);
+    } else if (bs->file) {
+        bdrv_io_unplug(bs->file);
+    }
+}
+
+void bdrv_flush_io_queue(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_flush_io_queue) {
+        drv->bdrv_flush_io_queue(bs);
+    } else if (bs->file) {
+        bdrv_flush_io_queue(bs->file);
+    }
+}
diff --git a/include/block/block.h b/include/block/block.h
index baecc26..32d3676 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -584,4 +584,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
  */
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
 
+void bdrv_io_plug(BlockDriverState *bs);
+void bdrv_io_unplug(BlockDriverState *bs);
+void bdrv_flush_io_queue(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8f8e65e..f6c3bef 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -261,6 +261,11 @@ struct BlockDriver {
     void (*bdrv_attach_aio_context)(BlockDriverState *bs,
                                     AioContext *new_context);
 
+    /* io queue for linux-aio */
+    void (*bdrv_io_plug)(BlockDriverState *bs);
+    void (*bdrv_io_unplug)(BlockDriverState *bs);
+    void (*bdrv_flush_io_queue)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.9.5

* [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue
  2014-07-04  3:06 [Qemu-devel] [PATCH v5 0/3] linux-aio: introduce submitting I/O as a batch Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 1/3] block: introduce APIs for submitting I/O " Ming Lei
@ 2014-07-04  3:06 ` Ming Lei
  2014-07-04  8:42   ` Stefan Hajnoczi
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 3/3] dataplane: submit I/O as a batch Ming Lei
  2 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2014-07-04  3:06 UTC
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

This patch implements the .bdrv_io_plug, .bdrv_io_unplug and
.bdrv_flush_io_queue callbacks for the linux-aio block driver,
so that submitting I/O as a batch is supported on linux-aio.
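
Plug and unplug nest via a reference count; a sketch of the semantics
implemented below (annotations only, not patch code):

	laio_io_plug(bs, s);          /* io_q.plugged: 0 -> 1, queuing on    */
	laio_io_plug(bs, s);          /* io_q.plugged: 1 -> 2                */
	laio_io_unplug(bs, s, true);  /* io_q.plugged: 2 -> 1, nothing sent  */
	laio_io_unplug(bs, s, true);  /* io_q.plugged: 1 -> 0, queue
	                                 submitted                           */
	laio_io_unplug(bs, s, false); /* flush path: submit pending iocbs    */
	                              /* without touching the plug count     */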

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/linux-aio.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 block/raw-aio.h   |    2 ++
 block/raw-posix.c |   45 ++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index f0a2c08..9f1883e 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -25,6 +25,8 @@
  */
 #define MAX_EVENTS 128
 
+#define MAX_QUEUED_IO  128
+
 struct qemu_laiocb {
     BlockDriverAIOCB common;
     struct qemu_laio_state *ctx;
@@ -36,9 +38,19 @@ struct qemu_laiocb {
     QLIST_ENTRY(qemu_laiocb) node;
 };
 
+struct laio_queue {
+    struct iocb *iocbs[MAX_QUEUED_IO];
+    int plugged;
+    unsigned int size;
+    unsigned int idx;
+};
+
 struct qemu_laio_state {
     io_context_t ctx;
     EventNotifier e;
+
+    /* io queue for batch submission */
+    struct laio_queue io_q;
 };
 
 static inline ssize_t io_event_ret(struct io_event *ev)
@@ -135,6 +147,77 @@ static const AIOCBInfo laio_aiocb_info = {
     .cancel             = laio_cancel,
 };
 
+static void ioq_init(struct laio_queue *io_q)
+{
+    io_q->size = MAX_QUEUED_IO;
+    io_q->idx = 0;
+    io_q->plugged = 0;
+}
+
+static int ioq_submit(struct qemu_laio_state *s)
+{
+    int ret, i = 0;
+    int len = s->io_q.idx;
+
+    do {
+        ret = io_submit(s->ctx, len, s->io_q.iocbs);
+    } while (i++ < 3 && ret == -EAGAIN);
+
+    /* empty io queue */
+    s->io_q.idx = 0;
+
+    if (ret >= 0) {
+        return 0;
+    }
+
+    for (i = 0; i < len; i++) {
+        struct qemu_laiocb *laiocb =
+            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+
+        laiocb->ret = ret;
+        qemu_laio_process_completion(s, laiocb);
+    }
+    return ret;
+}
+
+static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+{
+    unsigned int idx = s->io_q.idx;
+
+    s->io_q.iocbs[idx++] = iocb;
+    s->io_q.idx = idx;
+
+    /* submit immediately if queue is full */
+    if (idx == s->io_q.size) {
+        ioq_submit(s);
+    }
+}
+
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
+{
+    struct qemu_laio_state *s = aio_ctx;
+
+    s->io_q.plugged++;
+}
+
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
+{
+    struct qemu_laio_state *s = aio_ctx;
+    int ret = 0;
+
+    assert(s->io_q.plugged > 0 || !unplug);
+
+    if (unplug && --s->io_q.plugged > 0) {
+        return 0;
+    }
+
+    if (s->io_q.idx > 0) {
+        ret = ioq_submit(s);
+    }
+
+    return ret;
+}
+
 BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque, int type)
@@ -168,8 +251,12 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
     }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
 
-    if (io_submit(s->ctx, 1, &iocbs) < 0)
-        goto out_free_aiocb;
+    if (!s->io_q.plugged) {
+        if (io_submit(s->ctx, 1, &iocbs) < 0)
+            goto out_free_aiocb;
+    } else {
+        ioq_enqueue(s, iocbs);
+    }
     return &laiocb->common;
 
 out_free_aiocb:
@@ -204,6 +291,8 @@ void *laio_init(void)
         goto out_close_efd;
     }
 
+    ioq_init(&s->io_q);
+
     return s;
 
 out_close_efd:
diff --git a/block/raw-aio.h b/block/raw-aio.h
index 8cf084e..e18c975 100644
--- a/block/raw-aio.h
+++ b/block/raw-aio.h
@@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         BlockDriverCompletionFunc *cb, void *opaque, int type);
 void laio_detach_aio_context(void *s, AioContext *old_context);
 void laio_attach_aio_context(void *s, AioContext *new_context);
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
 #endif
 
 #ifdef _WIN32
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 825a0c8..808b2d8 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1057,6 +1057,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
                        cb, opaque, type);
 }
 
+static void raw_aio_plug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_plug(bs, s->aio_ctx);
+    }
+#endif
+}
+
+static void raw_aio_unplug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_unplug(bs, s->aio_ctx, true);
+    }
+#endif
+}
+
+static void raw_aio_flush_io_queue(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_unplug(bs, s->aio_ctx, false);
+    }
+#endif
+}
+
 static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque)
@@ -1528,6 +1558,9 @@ static BlockDriver bdrv_file = {
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = raw_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -1927,6 +1960,9 @@ static BlockDriver bdrv_host_device = {
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_aio_discard   = hdev_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength	= raw_getlength,
@@ -2072,6 +2108,9 @@ static BlockDriver bdrv_host_floppy = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
@@ -2200,6 +2239,9 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
@@ -2334,6 +2376,9 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
-- 
1.7.9.5

* [Qemu-devel] [PATCH v5 3/3] dataplane: submit I/O as a batch
  2014-07-04  3:06 [Qemu-devel] [PATCH v5 0/3] linux-aio: introduce submitting I/O as a batch Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 1/3] block: introduce APIs for submitting I/O " Ming Lei
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei
@ 2014-07-04  3:06 ` Ming Lei
  2 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-07-04  3:06 UTC
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

Before commit 580b6b2aa2 (dataplane: use the QEMU block
layer for I/O), dataplane for virtio-blk submitted block
I/O as a batch.

Commit 580b6b2aa2 replaced the custom Linux AIO
implementation (including batch submission of I/O) with
the QEMU block layer, but it caused a ~40% throughput
regression in virtio-blk performance; removing batch
submission of I/O is one of the causes.

This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to submit I/O as a batch through
the QEMU block layer; in my test, the change improves
throughput by ~30% with 'aio=native'.

The fio test script:

	[global]
	direct=1
	size=4G
	bsrange=4k-4k
	timeout=40
	numjobs=4
	ioengine=libaio
	iodepth=64
	filename=/dev/vdc
	group_reporting=1

	[f]
	rw=randread

Results on one of my small machines (host: x86_64, 2 cores / 4 threads; guest: 4 cores):
	- qemu master: 65K IOPS
	- qemu master with these patches: 92K IOPS
	- 2.0.0 release(dataplane using custom linux aio): 104K IOPS

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 hw/block/dataplane/virtio-blk.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6cd75f6..bed9f13 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -83,6 +83,7 @@ static void handle_notify(EventNotifier *e)
     };
 
     event_notifier_test_and_clear(&s->host_notifier);
+    bdrv_io_plug(s->blk->conf.bs);
     for (;;) {
         /* Disable guest->host notifies to avoid unnecessary vmexits */
         vring_disable_notification(s->vdev, &s->vring);
@@ -116,6 +117,7 @@ static void handle_notify(EventNotifier *e)
             break;
         }
     }
+    bdrv_io_unplug(s->blk->conf.bs);
 }
 
 /* Context: QEMU global mutex held */
-- 
1.7.9.5

* Re: [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue
  2014-07-04  3:06 ` [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei
@ 2014-07-04  8:42   ` Stefan Hajnoczi
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2014-07-04  8:42 UTC
  To: Ming Lei
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi, Paolo Bonzini

On Fri, Jul 04, 2014 at 11:06:41AM +0800, Ming Lei wrote:
> +static int ioq_submit(struct qemu_laio_state *s)
> +{
> +    int ret, i = 0;
> +    int len = s->io_q.idx;
> +
> +    do {
> +        ret = io_submit(s->ctx, len, s->io_q.iocbs);
> +    } while (i++ < 3 && ret == -EAGAIN);
> +
> +    /* empty io queue */
> +    s->io_q.idx = 0;
> +
> +    if (ret >= 0) {
> +        return 0;
> +    }
> +
> +    for (i = 0; i < len; i++) {
> +        struct qemu_laiocb *laiocb =
> +            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
> +
> +        laiocb->ret = ret;
> +        qemu_laio_process_completion(s, laiocb);
> +    }
> +    return ret;
> +}

Please see my review of the previous revision.  You didn't address my
comments.

Stefan
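
One subtlety in the quoted ioq_submit() is that io_submit(2) can
return a count smaller than len, meaning only a prefix of the queue
was submitted, while the code above treats any non-negative return as
complete. Whether this is among the review comments referenced above
is not stated in this thread. A hedged sketch of handling a short
submission (an assumption, not code from this series):

	static int ioq_submit_all(struct qemu_laio_state *s)
	{
	    int done = 0, retries = 0;
	    int len = s->io_q.idx;

	    while (done < len) {
	        int ret = io_submit(s->ctx, len - done, s->io_q.iocbs + done);
	        if (ret == -EAGAIN && retries++ < 3) {
	            continue;       /* transient; ideally wait for completions
	                               instead of spinning */
	        }
	        if (ret <= 0) {
	            break;          /* hard error: the caller must fail the
	                               remaining requests, e.g. via
	                               qemu_laio_process_completion() */
	        }
	        done += ret;
	        retries = 0;
	    }

	    s->io_q.idx = 0;
	    return done == len ? 0 : -EIO;
	}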
