* [Qemu-devel] [PATCH v5 0/3] linux-aio: introduce submit I/O as a batch

From: Ming Lei
Date: 2014-07-04 3:06 UTC
To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin

Hi,

Commit 580b6b2aa2 ("dataplane: use the QEMU block layer for I/O") introduced
a ~40% throughput regression on virtio-blk dataplane, and one of the causes
is that submitting I/O as a batch was removed.  This patchset tries to
reintroduce the mechanism at the block layer; at least linux-aio can benefit
from it.

With these patches, throughput on virtio-blk dataplane is observed to improve
considerably; see the data in the commit log of patch 3/3.

It should be possible to apply the batch mechanism to other devices
(such as virtio-scsi) too.

TODO:
 - support queuing I/O to multiple files for SCSI devices, which needs
   some changes to linux-aio

V5:
 - rebase on v2.1.0-rc0 of qemu.git/master
 - block/linux-aio.c code style fixes
 - don't flush the io queue before flush, pointed out by Paolo

V4:
 - support other non-raw formats with under-optimized performance
 - use a reference counter for plug & unplug
 - flush the io queue before sending a flush command

V3:
 - only support submitting I/O as a batch for the raw format, pointed out
   by Kevin

V2:
 - define the return value of bdrv_io_unplug as void, suggested by Paolo
 - avoid busy-waiting when handling io_submit

V1:
 - move the io queuing code into linux-aio.c, as suggested by Paolo

Thanks,
-- 
Ming Lei
* [Qemu-devel] [PATCH v5 1/3] block: block: introduce APIs for submitting IO as a batch

From: Ming Lei
Date: 2014-07-04 3:06 UTC
To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

This patch introduces three APIs so that following patches can support
queuing I/O requests and submitting them as a batch for improving I/O
performance.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block.c                   |   31 +++++++++++++++++++++++++++++++
 include/block/block.h     |    4 ++++
 include/block/block_int.h |    5 +++++
 3 files changed, 40 insertions(+)

diff --git a/block.c b/block.c
index f80e2b2..8800a6b 100644
--- a/block.c
+++ b/block.c
@@ -1905,6 +1905,7 @@ void bdrv_drain_all(void)
         bool bs_busy;
 
         aio_context_acquire(aio_context);
+        bdrv_flush_io_queue(bs);
         bdrv_start_throttled_reqs(bs);
         bs_busy = bdrv_requests_pending(bs);
         bs_busy |= aio_poll(aio_context, bs_busy);
@@ -5782,3 +5783,33 @@ BlockDriverState *check_to_replace_node(const char *node_name, Error **errp)
 
     return to_replace_bs;
 }
+
+void bdrv_io_plug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_plug) {
+        drv->bdrv_io_plug(bs);
+    } else if (bs->file) {
+        bdrv_io_plug(bs->file);
+    }
+}
+
+void bdrv_io_unplug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_unplug) {
+        drv->bdrv_io_unplug(bs);
+    } else if (bs->file) {
+        bdrv_io_unplug(bs->file);
+    }
+}
+
+void bdrv_flush_io_queue(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_flush_io_queue) {
+        drv->bdrv_flush_io_queue(bs);
+    } else if (bs->file) {
+        bdrv_flush_io_queue(bs->file);
+    }
+}
diff --git a/include/block/block.h b/include/block/block.h
index baecc26..32d3676 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -584,4 +584,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
  */
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
 
+void bdrv_io_plug(BlockDriverState *bs);
+void bdrv_io_unplug(BlockDriverState *bs);
+void bdrv_flush_io_queue(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8f8e65e..f6c3bef 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -261,6 +261,11 @@ struct BlockDriver {
     void (*bdrv_attach_aio_context)(BlockDriverState *bs,
                                     AioContext *new_context);
 
+    /* io queue for linux-aio */
+    void (*bdrv_io_plug)(BlockDriverState *bs);
+    void (*bdrv_io_unplug)(BlockDriverState *bs);
+    void (*bdrv_flush_io_queue)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.9.5
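A usage note on the new APIs: the intended calling convention is simply to
bracket a burst of request submissions with plug/unplug.  Below is a minimal
caller-side sketch of that pattern; the Request type, request_cb and
submit_request_batch() names are hypothetical and only illustrate what the
real user, the virtio-blk dataplane code in patch 3/3, does.

    #include "block/block.h"

    /* Sketch only: any bdrv_aio_readv()/bdrv_aio_writev() issued between
     * plug and unplug may be queued by the underlying driver and pushed to
     * the kernel in one go when the queue is unplugged. */
    static void submit_request_batch(BlockDriverState *bs,
                                     Request *reqs, int nreq)
    {
        int i;

        bdrv_io_plug(bs);
        for (i = 0; i < nreq; i++) {
            bdrv_aio_readv(bs, reqs[i].sector_num, &reqs[i].qiov,
                           reqs[i].nb_sectors, request_cb, &reqs[i]);
        }
        bdrv_io_unplug(bs);    /* submits whatever the driver queued */
    }

Drivers that do not implement the new callbacks simply forward the calls to
bs->file, so plug/unplug are harmless no-ops for formats without batching
support.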
* [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue

From: Ming Lei
Date: 2014-07-04 3:06 UTC
To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

This patch implements .bdrv_io_plug, .bdrv_io_unplug and .bdrv_flush_io_queue
callbacks for linux-aio Block Drivers, so that submitting I/O as a batch
can be supported on linux-aio.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/linux-aio.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 block/raw-aio.h   |    2 ++
 block/raw-posix.c |   45 ++++++++++++++++++++++++++
 3 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index f0a2c08..9f1883e 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -25,6 +25,8 @@
  */
 #define MAX_EVENTS 128
 
+#define MAX_QUEUED_IO 128
+
 struct qemu_laiocb {
     BlockDriverAIOCB common;
     struct qemu_laio_state *ctx;
@@ -36,9 +38,19 @@ struct qemu_laiocb {
     QLIST_ENTRY(qemu_laiocb) node;
 };
 
+struct laio_queue {
+    struct iocb *iocbs[MAX_QUEUED_IO];
+    int plugged;
+    unsigned int size;
+    unsigned int idx;
+};
+
 struct qemu_laio_state {
     io_context_t ctx;
     EventNotifier e;
+
+    /* io queue for submit at batch */
+    struct laio_queue io_q;
 };
 
 static inline ssize_t io_event_ret(struct io_event *ev)
@@ -135,6 +147,77 @@ static const AIOCBInfo laio_aiocb_info = {
     .cancel             = laio_cancel,
 };
 
+static void ioq_init(struct laio_queue *io_q)
+{
+    io_q->size = MAX_QUEUED_IO;
+    io_q->idx = 0;
+    io_q->plugged = 0;
+}
+
+static int ioq_submit(struct qemu_laio_state *s)
+{
+    int ret, i = 0;
+    int len = s->io_q.idx;
+
+    do {
+        ret = io_submit(s->ctx, len, s->io_q.iocbs);
+    } while (i++ < 3 && ret == -EAGAIN);
+
+    /* empty io queue */
+    s->io_q.idx = 0;
+
+    if (ret >= 0) {
+        return 0;
+    }
+
+    for (i = 0; i < len; i++) {
+        struct qemu_laiocb *laiocb =
+            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+
+        laiocb->ret = ret;
+        qemu_laio_process_completion(s, laiocb);
+    }
+    return ret;
+}
+
+static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+{
+    unsigned int idx = s->io_q.idx;
+
+    s->io_q.iocbs[idx++] = iocb;
+    s->io_q.idx = idx;
+
+    /* submit immediately if queue is full */
+    if (idx == s->io_q.size) {
+        ioq_submit(s);
+    }
+}
+
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
+{
+    struct qemu_laio_state *s = aio_ctx;
+
+    s->io_q.plugged++;
+}
+
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
+{
+    struct qemu_laio_state *s = aio_ctx;
+    int ret = 0;
+
+    assert(s->io_q.plugged > 0 || !unplug);
+
+    if (unplug && --s->io_q.plugged > 0) {
+        return 0;
+    }
+
+    if (s->io_q.idx > 0) {
+        ret = ioq_submit(s);
+    }
+
+    return ret;
+}
+
 BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque, int type)
@@ -168,8 +251,12 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
     }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
 
-    if (io_submit(s->ctx, 1, &iocbs) < 0)
-        goto out_free_aiocb;
+    if (!s->io_q.plugged) {
+        if (io_submit(s->ctx, 1, &iocbs) < 0)
+            goto out_free_aiocb;
+    } else {
+        ioq_enqueue(s, iocbs);
+    }
     return &laiocb->common;
 
 out_free_aiocb:
@@ -204,6 +291,8 @@ void *laio_init(void)
         goto out_close_efd;
     }
 
+    ioq_init(&s->io_q);
+
     return s;
 
 out_close_efd:
diff --git a/block/raw-aio.h b/block/raw-aio.h
index 8cf084e..e18c975 100644
--- a/block/raw-aio.h
+++ b/block/raw-aio.h
@@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         BlockDriverCompletionFunc *cb, void *opaque, int type);
 void laio_detach_aio_context(void *s, AioContext *old_context);
 void laio_attach_aio_context(void *s, AioContext *new_context);
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
 #endif
 
 #ifdef _WIN32
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 825a0c8..808b2d8 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1057,6 +1057,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
                        cb, opaque, type);
 }
 
+static void raw_aio_plug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_plug(bs, s->aio_ctx);
+    }
+#endif
+}
+
+static void raw_aio_unplug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_unplug(bs, s->aio_ctx, true);
+    }
+#endif
+}
+
+static void raw_aio_flush_io_queue(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_unplug(bs, s->aio_ctx, false);
+    }
+#endif
+}
+
 static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque)
@@ -1528,6 +1558,9 @@ static BlockDriver bdrv_file = {
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = raw_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -1927,6 +1960,9 @@ static BlockDriver bdrv_host_device = {
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = hdev_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -2072,6 +2108,9 @@ static BlockDriver bdrv_host_floppy = {
     .bdrv_aio_writev = raw_aio_writev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -2200,6 +2239,9 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev = raw_aio_writev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -2334,6 +2376,9 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev = raw_aio_writev,
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_flush_io_queue = raw_aio_flush_io_queue,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
-- 
1.7.9.5
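The batching that ioq_submit() relies on is native to the Linux AIO interface
itself: io_submit() accepts an array of iocbs, so a queue of N requests costs
a single system call instead of N.  The following standalone sketch shows
that idea outside QEMU; the file name, request count and buffered
(non-O_DIRECT) access are illustrative assumptions only.

    /* gcc -o laio_batch laio_batch.c -laio */
    #include <libaio.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NREQ 8
    #define BLK  4096

    int main(void)
    {
        io_context_t ctx = 0;
        struct iocb iocbs[NREQ], *list[NREQ];
        struct io_event events[NREQ];
        int fd = open("/tmp/testfile", O_RDONLY);
        int i;

        if (fd < 0 || io_setup(NREQ, &ctx) < 0) {
            return 1;
        }

        /* Queue NREQ reads, then submit them all with one io_submit() --
         * this mirrors what laio_io_unplug()/ioq_submit() do with the
         * iocbs collected while the queue is plugged. */
        for (i = 0; i < NREQ; i++) {
            void *buf;
            if (posix_memalign(&buf, BLK, BLK)) {
                return 1;
            }
            io_prep_pread(&iocbs[i], fd, buf, BLK, (long long)i * BLK);
            list[i] = &iocbs[i];
        }
        if (io_submit(ctx, NREQ, list) < 0) {
            return 1;
        }

        /* Reap all completions. */
        if (io_getevents(ctx, NREQ, NREQ, events, NULL) < 0) {
            return 1;
        }
        printf("submitted and reaped %d requests with one io_submit()\n", NREQ);
        io_destroy(ctx);
        return 0;
    }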
* Re: [Qemu-devel] [PATCH v5 2/3] linux-aio: implement io plug, unplug and flush io queue

From: Stefan Hajnoczi
Date: 2014-07-04 8:42 UTC
To: Ming Lei
Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel,
    Stefan Hajnoczi, Paolo Bonzini

On Fri, Jul 04, 2014 at 11:06:41AM +0800, Ming Lei wrote:
> +static int ioq_submit(struct qemu_laio_state *s)
> +{
> +    int ret, i = 0;
> +    int len = s->io_q.idx;
> +
> +    do {
> +        ret = io_submit(s->ctx, len, s->io_q.iocbs);
> +    } while (i++ < 3 && ret == -EAGAIN);
> +
> +    /* empty io queue */
> +    s->io_q.idx = 0;
> +
> +    if (ret >= 0) {
> +        return 0;
> +    }
> +
> +    for (i = 0; i < len; i++) {
> +        struct qemu_laiocb *laiocb =
> +            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
> +
> +        laiocb->ret = ret;
> +        qemu_laio_process_completion(s, laiocb);
> +    }
> +    return ret;
> +}

Please see my review of the previous revision.  You didn't address my
comments.

Stefan
* [Qemu-devel] [PATCH v5 3/3] dataplane: submit I/O as a batch

From: Ming Lei
Date: 2014-07-04 3:06 UTC
To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

Before commit 580b6b2aa2 ("dataplane: use the QEMU block layer for I/O"),
dataplane for virtio-blk submitted block I/O as a batch.  That commit
replaced the custom Linux AIO implementation (including batch submission)
with the QEMU block layer, but it caused a ~40% throughput regression on
virtio-blk, and removing batch submission of I/O is one of the causes.

This patch applies the newly introduced bdrv_io_plug() and bdrv_io_unplug()
interfaces so that the QEMU block layer submits I/O as a batch; in my test,
the change improves throughput by ~30% with 'aio=native'.

The fio test script is as follows:

[global]
direct=1
size=4G
bsrange=4k-4k
timeout=40
numjobs=4
ioengine=libaio
iodepth=64
filename=/dev/vdc
group_reporting=1

[f]
rw=randread

Result on one of my small machines (host: x86_64, 2 cores, 4 threads;
guest: 4 cores):

 - qemu master: 65K IOPS
 - qemu master with these patches: 92K IOPS
 - 2.0.0 release (dataplane using custom Linux AIO): 104K IOPS

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 hw/block/dataplane/virtio-blk.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 6cd75f6..bed9f13 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -83,6 +83,7 @@ static void handle_notify(EventNotifier *e)
     };
 
     event_notifier_test_and_clear(&s->host_notifier);
+    bdrv_io_plug(s->blk->conf.bs);
     for (;;) {
         /* Disable guest->host notifies to avoid unnecessary vmexits */
         vring_disable_notification(s->vdev, &s->vring);
@@ -116,6 +117,7 @@ static void handle_notify(EventNotifier *e)
             break;
         }
     }
+    bdrv_io_unplug(s->blk->conf.bs);
 }
 
 /* Context: QEMU global mutex held */
-- 
1.7.9.5
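For anyone trying to reproduce the numbers, the job file above is a complete
fio configuration; inside the guest it can be saved under any name (say
randread.fio, a hypothetical file name) and run directly against the
virtio-blk disk that appears as /dev/vdc:

    # guest side: 4 jobs x iodepth 64, 4k random reads for 40 seconds
    fio randread.fio

On the host side, the batching path is only exercised when the disk is
attached with the Linux AIO backend, i.e. a -drive option carrying
cache=none,aio=native, and with virtio-blk dataplane enabled; the exact
QEMU command line used for the measurements is not part of this thread.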