* [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch
@ 2014-07-02 12:18 Ming Lei
  2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as Ming Lei
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Ming Lei @ 2014-07-02 12:18 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin

Hi,

Commit 580b6b2aa2 (dataplane: use the QEMU block layer for I/O)
introduces a ~40% throughput regression on virtio-blk dataplane, and
one of the causes is that submitting I/O as a batch was removed.

This patchset tries to reintroduce that mechanism at the block layer;
at least linux-aio can benefit from it.

With these patches, throughput on virtio-blk dataplane is observed to
improve a lot; see the data in the commit log of patch 3/3.

It should be possible to apply the batch mechanism to other devices
(such as virtio-scsi) too.

TODO:
	- support queuing I/O to multiple files, which needs lots of changes
	  to linux-aio, for SCSI devices

V4:
	- support other non-raw formats with under-optimized performance
	- use a reference counter for plug & unplug
	- flush the I/O queue before sending the flush command

V3:
	- only support submitting I/O as a batch for the raw format, as
	  pointed out by Kevin

V2:
	- define the return value of bdrv_io_unplug as void, suggested by Paolo
	- avoid busy-waiting when handling io_submit

V1:
	- move the I/O queuing code into linux-aio.c, as suggested by Paolo

Thanks,
--
Ming Lei

^ permalink raw reply	[flat|nested] 26+ messages in thread
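To make the mechanism concrete before diving into the patches, here is a
minimal sketch of how a device model is expected to drive the new batching
interfaces from patch 1/3. The GuestRequest type and process_one_request()
helper are illustrative placeholders, not code from this series:

/* Sketch only: bracket a burst of request submissions with plug/unplug so
 * that linux-aio can coalesce them into a single io_submit() call.
 * process_one_request() stands in for whatever ends up calling
 * bdrv_aio_readv()/bdrv_aio_writev() for one guest request. */
static void handle_request_batch(BlockDriverState *bs,
                                 GuestRequest **reqs, int nreqs)
{
    int i;

    bdrv_io_plug(bs);              /* start queuing instead of submitting */
    for (i = 0; i < nreqs; i++) {
        process_one_request(bs, reqs[i]);
    }
    bdrv_io_unplug(bs);            /* submit everything queued in one batch */
}

Patch 3/3 applies exactly this pattern around the virtio-blk dataplane
notify handler.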
* [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as 2014-07-02 12:18 [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch Ming Lei @ 2014-07-02 12:18 ` Ming Lei 2014-07-02 12:46 ` Eric Blake 2014-07-03 9:40 ` Kevin Wolf 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei ` (3 subsequent siblings) 4 siblings, 2 replies; 26+ messages in thread From: Ming Lei @ 2014-07-02 12:18 UTC (permalink / raw) To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin a batch This patch introduces three APIs so that following patches can support queuing I/O requests and submitting them as a batch for improving I/O performance. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> --- block.c | 31 +++++++++++++++++++++++++++++++ include/block/block.h | 4 ++++ include/block/block_int.h | 5 +++++ 3 files changed, 40 insertions(+) diff --git a/block.c b/block.c index 217f523..4f64c38 100644 --- a/block.c +++ b/block.c @@ -1910,6 +1910,7 @@ void bdrv_drain_all(void) bool bs_busy; aio_context_acquire(aio_context); + bdrv_flush_io_queue(bs); bdrv_start_throttled_reqs(bs); bs_busy = bdrv_requests_pending(bs); bs_busy |= aio_poll(aio_context, bs_busy); @@ -5774,3 +5775,33 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate) return false; } + +void bdrv_io_plug(BlockDriverState *bs) +{ + BlockDriver *drv = bs->drv; + if (drv && drv->bdrv_io_plug) { + drv->bdrv_io_plug(bs); + } else if (bs->file) { + bdrv_io_plug(bs->file); + } +} + +void bdrv_io_unplug(BlockDriverState *bs) +{ + BlockDriver *drv = bs->drv; + if (drv && drv->bdrv_io_unplug) { + drv->bdrv_io_unplug(bs); + } else if (bs->file) { + bdrv_io_unplug(bs->file); + } +} + +void bdrv_flush_io_queue(BlockDriverState *bs) +{ + BlockDriver *drv = bs->drv; + if (drv && drv->bdrv_flush_io_queue) { + drv->bdrv_flush_io_queue(bs); + } else if (bs->file) { + bdrv_flush_io_queue(bs->file); + } +} diff --git a/include/block/block.h b/include/block/block.h index d0baf4f..7a8b990 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -578,4 +578,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs); */ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context); +void bdrv_io_plug(BlockDriverState *bs); +void bdrv_io_unplug(BlockDriverState *bs); +void bdrv_flush_io_queue(BlockDriverState *bs); + #endif diff --git a/include/block/block_int.h b/include/block/block_int.h index 715c761..c5c4adc 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -257,6 +257,11 @@ struct BlockDriver { void (*bdrv_attach_aio_context)(BlockDriverState *bs, AioContext *new_context); + /* io queue for linux-aio */ + void (*bdrv_io_plug)(BlockDriverState *bs); + void (*bdrv_io_unplug)(BlockDriverState *bs); + void (*bdrv_flush_io_queue)(BlockDriverState *bs); + QLIST_ENTRY(BlockDriver) list; }; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as Ming Lei @ 2014-07-02 12:46 ` Eric Blake 2014-07-03 9:40 ` Kevin Wolf 1 sibling, 0 replies; 26+ messages in thread From: Eric Blake @ 2014-07-02 12:46 UTC (permalink / raw) To: Ming Lei, Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin [-- Attachment #1: Type: text/plain, Size: 573 bytes --] On 07/02/2014 06:18 AM, Ming Lei wrote: > a batch subject line is screwed up; you want to drop the duplicate 'block:' and combine the 'a batch' with the rest of the line > > This patch introduces three APIs so that following > patches can support queuing I/O requests and submitting them > as a batch for improving I/O performance. > > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Ming Lei <ming.lei@canonical.com> > --- -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as Ming Lei 2014-07-02 12:46 ` Eric Blake @ 2014-07-03 9:40 ` Kevin Wolf 1 sibling, 0 replies; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 9:40 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > a batch > > This patch introduces three APIs so that following > patches can support queuing I/O requests and submitting them > as a batch for improving I/O performance. > > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> ^ permalink raw reply [flat|nested] 26+ messages in thread
* [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-02 12:18 [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch Ming Lei 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as Ming Lei @ 2014-07-02 12:18 ` Ming Lei 2014-07-03 9:40 ` Kevin Wolf 2014-07-03 12:24 ` Stefan Hajnoczi 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch Ming Lei ` (2 subsequent siblings) 4 siblings, 2 replies; 26+ messages in thread From: Ming Lei @ 2014-07-02 12:18 UTC (permalink / raw) To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin This patch implements .bdrv_io_plug, .bdrv_io_unplug and .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, so that submitting I/O as a batch can be supported on linux-aio. Signed-off-by: Ming Lei <ming.lei@canonical.com> --- block/linux-aio.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++-- block/raw-aio.h | 2 ++ block/raw-posix.c | 45 +++++++++++++++++++++++++++ 3 files changed, 133 insertions(+), 2 deletions(-) diff --git a/block/linux-aio.c b/block/linux-aio.c index f0a2c08..1cb4845 100644 --- a/block/linux-aio.c +++ b/block/linux-aio.c @@ -25,6 +25,8 @@ */ #define MAX_EVENTS 128 +#define MAX_QUEUED_IO 128 + struct qemu_laiocb { BlockDriverAIOCB common; struct qemu_laio_state *ctx; @@ -36,9 +38,19 @@ struct qemu_laiocb { QLIST_ENTRY(qemu_laiocb) node; }; +struct laio_queue { + struct iocb *iocbs[MAX_QUEUED_IO]; + int plugged; + unsigned int size; + unsigned int idx; +}; + struct qemu_laio_state { io_context_t ctx; EventNotifier e; + + /* io queue for submit at batch */ + struct laio_queue io_q; }; static inline ssize_t io_event_ret(struct io_event *ev) @@ -135,6 +147,72 @@ static const AIOCBInfo laio_aiocb_info = { .cancel = laio_cancel, }; +static void ioq_init(struct laio_queue *io_q) +{ + io_q->size = MAX_QUEUED_IO; + io_q->idx = 0; + io_q->plugged = 0; +} + +static int ioq_submit(struct qemu_laio_state *s) +{ + int ret, i = 0; + int len = s->io_q.idx; + + do { + ret = io_submit(s->ctx, len, s->io_q.iocbs); + } while (i++ < 3 && ret == -EAGAIN); + + /* empty io queue */ + s->io_q.idx = 0; + + if (ret >= 0) + return 0; + + for (i = 0; i < len; i++) { + struct qemu_laiocb *laiocb = + container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb); + + laiocb->ret = ret; + qemu_laio_process_completion(s, laiocb); + } + return ret; +} + +static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb) +{ + unsigned int idx = s->io_q.idx; + + s->io_q.iocbs[idx++] = iocb; + s->io_q.idx = idx; + + /* submit immediately if queue is full */ + if (idx == s->io_q.size) + ioq_submit(s); +} + +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) +{ + struct qemu_laio_state *s = aio_ctx; + + s->io_q.plugged++; +} + +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) +{ + struct qemu_laio_state *s = aio_ctx; + int ret = 0; + + if (unplug && --s->io_q.plugged > 0) + return 0; + + if (s->io_q.idx > 0) { + ret = ioq_submit(s); + } + + return ret; +} + BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque, int type) @@ -168,8 +246,12 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd, } io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e)); - if (io_submit(s->ctx, 1, &iocbs) < 0) - goto out_free_aiocb; + if 
(!s->io_q.plugged) { + if (io_submit(s->ctx, 1, &iocbs) < 0) + goto out_free_aiocb; + } else { + ioq_enqueue(s, iocbs); + } return &laiocb->common; out_free_aiocb: @@ -204,6 +286,8 @@ void *laio_init(void) goto out_close_efd; } + ioq_init(&s->io_q); + return s; out_close_efd: diff --git a/block/raw-aio.h b/block/raw-aio.h index 8cf084e..e18c975 100644 --- a/block/raw-aio.h +++ b/block/raw-aio.h @@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd, BlockDriverCompletionFunc *cb, void *opaque, int type); void laio_detach_aio_context(void *s, AioContext *old_context); void laio_attach_aio_context(void *s, AioContext *new_context); +void laio_io_plug(BlockDriverState *bs, void *aio_ctx); +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug); #endif #ifdef _WIN32 diff --git a/block/raw-posix.c b/block/raw-posix.c index dacf4fb..410299d 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1054,6 +1054,36 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs, cb, opaque, type); } +static void raw_aio_plug(BlockDriverState *bs) +{ +#ifdef CONFIG_LINUX_AIO + BDRVRawState *s = bs->opaque; + if (s->use_aio) { + laio_io_plug(bs, s->aio_ctx); + } +#endif +} + +static void raw_aio_unplug(BlockDriverState *bs) +{ +#ifdef CONFIG_LINUX_AIO + BDRVRawState *s = bs->opaque; + if (s->use_aio) { + laio_io_unplug(bs, s->aio_ctx, true); + } +#endif +} + +static void raw_aio_flush_io_queue(BlockDriverState *bs) +{ +#ifdef CONFIG_LINUX_AIO + BDRVRawState *s = bs->opaque; + if (s->use_aio) { + laio_io_unplug(bs, s->aio_ctx, false); + } +#endif +} + static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque) @@ -1503,6 +1533,9 @@ static BlockDriver bdrv_file = { .bdrv_aio_flush = raw_aio_flush, .bdrv_aio_discard = raw_aio_discard, .bdrv_refresh_limits = raw_refresh_limits, + .bdrv_io_plug = raw_aio_plug, + .bdrv_io_unplug = raw_aio_unplug, + .bdrv_flush_io_queue = raw_aio_flush_io_queue, .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, @@ -1902,6 +1935,9 @@ static BlockDriver bdrv_host_device = { .bdrv_aio_flush = raw_aio_flush, .bdrv_aio_discard = hdev_aio_discard, .bdrv_refresh_limits = raw_refresh_limits, + .bdrv_io_plug = raw_aio_plug, + .bdrv_io_unplug = raw_aio_unplug, + .bdrv_flush_io_queue = raw_aio_flush_io_queue, .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, @@ -2047,6 +2083,9 @@ static BlockDriver bdrv_host_floppy = { .bdrv_aio_writev = raw_aio_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_refresh_limits = raw_refresh_limits, + .bdrv_io_plug = raw_aio_plug, + .bdrv_io_unplug = raw_aio_unplug, + .bdrv_flush_io_queue = raw_aio_flush_io_queue, .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, @@ -2175,6 +2214,9 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_aio_writev = raw_aio_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_refresh_limits = raw_refresh_limits, + .bdrv_io_plug = raw_aio_plug, + .bdrv_io_unplug = raw_aio_unplug, + .bdrv_flush_io_queue = raw_aio_flush_io_queue, .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, @@ -2309,6 +2351,9 @@ static BlockDriver bdrv_host_cdrom = { .bdrv_aio_writev = raw_aio_writev, .bdrv_aio_flush = raw_aio_flush, .bdrv_refresh_limits = raw_refresh_limits, + .bdrv_io_plug = raw_aio_plug, + .bdrv_io_unplug = raw_aio_unplug, + .bdrv_flush_io_queue = raw_aio_flush_io_queue, .bdrv_truncate = raw_truncate, .bdrv_getlength = raw_getlength, -- 
1.7.9.5 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei @ 2014-07-03 9:40 ` Kevin Wolf 2014-07-03 9:51 ` Ming Lei 2014-07-03 12:24 ` Stefan Hajnoczi 1 sibling, 1 reply; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 9:40 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > This patch implements .bdrv_io_plug, .bdrv_io_unplug and > .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, > so that submitting I/O as a batch can be supported on linux-aio. > > Signed-off-by: Ming Lei <ming.lei@canonical.com> Just a couple of minor comments, see inline. > block/linux-aio.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++-- > block/raw-aio.h | 2 ++ > block/raw-posix.c | 45 +++++++++++++++++++++++++++ > 3 files changed, 133 insertions(+), 2 deletions(-) > > diff --git a/block/linux-aio.c b/block/linux-aio.c > index f0a2c08..1cb4845 100644 > --- a/block/linux-aio.c > +++ b/block/linux-aio.c > @@ -25,6 +25,8 @@ > */ > #define MAX_EVENTS 128 > > +#define MAX_QUEUED_IO 128 > + > struct qemu_laiocb { > BlockDriverAIOCB common; > struct qemu_laio_state *ctx; > @@ -36,9 +38,19 @@ struct qemu_laiocb { > QLIST_ENTRY(qemu_laiocb) node; > }; > > +struct laio_queue { > + struct iocb *iocbs[MAX_QUEUED_IO]; > + int plugged; Why signed? > + unsigned int size; > + unsigned int idx; > +}; > + > struct qemu_laio_state { > io_context_t ctx; > EventNotifier e; > + > + /* io queue for submit at batch */ > + struct laio_queue io_q; > }; > > static inline ssize_t io_event_ret(struct io_event *ev) > @@ -135,6 +147,72 @@ static const AIOCBInfo laio_aiocb_info = { > .cancel = laio_cancel, > }; > > +static void ioq_init(struct laio_queue *io_q) > +{ > + io_q->size = MAX_QUEUED_IO; > + io_q->idx = 0; > + io_q->plugged = 0; > +} > + > +static int ioq_submit(struct qemu_laio_state *s) > +{ > + int ret, i = 0; > + int len = s->io_q.idx; > + > + do { > + ret = io_submit(s->ctx, len, s->io_q.iocbs); > + } while (i++ < 3 && ret == -EAGAIN); > + > + /* empty io queue */ > + s->io_q.idx = 0; > + > + if (ret >= 0) > + return 0; Indentation is off and braces are missing. > + > + for (i = 0; i < len; i++) { > + struct qemu_laiocb *laiocb = > + container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb); > + > + laiocb->ret = ret; > + qemu_laio_process_completion(s, laiocb); > + } > + return ret; > +} > + > +static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb) > +{ > + unsigned int idx = s->io_q.idx; > + > + s->io_q.iocbs[idx++] = iocb; > + s->io_q.idx = idx; > + > + /* submit immediately if queue is full */ > + if (idx == s->io_q.size) > + ioq_submit(s); Missing braces. > +} > + > +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) > +{ > + struct qemu_laio_state *s = aio_ctx; > + > + s->io_q.plugged++; > +} > + > +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) > +{ > + struct qemu_laio_state *s = aio_ctx; > + int ret = 0; > + How about an assert(s->io_q.plugged > 0); here? > + if (unplug && --s->io_q.plugged > 0) > + return 0; Missing braces. 
> + > + if (s->io_q.idx > 0) { > + ret = ioq_submit(s); > + } > + > + return ret; > +} > + > BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd, > int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, > BlockDriverCompletionFunc *cb, void *opaque, int type) ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 9:40 ` Kevin Wolf @ 2014-07-03 9:51 ` Ming Lei 2014-07-03 10:04 ` Paolo Bonzini 2014-07-03 10:22 ` Kevin Wolf 0 siblings, 2 replies; 26+ messages in thread From: Ming Lei @ 2014-07-03 9:51 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Hi Kevin, On Thu, Jul 3, 2014 at 5:40 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: >> This patch implements .bdrv_io_plug, .bdrv_io_unplug and >> .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, >> so that submitting I/O as a batch can be supported on linux-aio. >> >> Signed-off-by: Ming Lei <ming.lei@canonical.com> > > Just a couple of minor comments, see inline. Thanks for your review. > >> block/linux-aio.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++-- >> block/raw-aio.h | 2 ++ >> block/raw-posix.c | 45 +++++++++++++++++++++++++++ >> 3 files changed, 133 insertions(+), 2 deletions(-) >> >> diff --git a/block/linux-aio.c b/block/linux-aio.c >> index f0a2c08..1cb4845 100644 >> --- a/block/linux-aio.c >> +++ b/block/linux-aio.c >> @@ -25,6 +25,8 @@ >> */ >> #define MAX_EVENTS 128 >> >> +#define MAX_QUEUED_IO 128 >> + >> struct qemu_laiocb { >> BlockDriverAIOCB common; >> struct qemu_laio_state *ctx; >> @@ -36,9 +38,19 @@ struct qemu_laiocb { >> QLIST_ENTRY(qemu_laiocb) node; >> }; >> >> +struct laio_queue { >> + struct iocb *iocbs[MAX_QUEUED_IO]; >> + int plugged; > > Why signed? > >> + unsigned int size; >> + unsigned int idx; >> +}; >> + >> struct qemu_laio_state { >> io_context_t ctx; >> EventNotifier e; >> + >> + /* io queue for submit at batch */ >> + struct laio_queue io_q; >> }; >> >> static inline ssize_t io_event_ret(struct io_event *ev) >> @@ -135,6 +147,72 @@ static const AIOCBInfo laio_aiocb_info = { >> .cancel = laio_cancel, >> }; >> >> +static void ioq_init(struct laio_queue *io_q) >> +{ >> + io_q->size = MAX_QUEUED_IO; >> + io_q->idx = 0; >> + io_q->plugged = 0; >> +} >> + >> +static int ioq_submit(struct qemu_laio_state *s) >> +{ >> + int ret, i = 0; >> + int len = s->io_q.idx; >> + >> + do { >> + ret = io_submit(s->ctx, len, s->io_q.iocbs); >> + } while (i++ < 3 && ret == -EAGAIN); >> + >> + /* empty io queue */ >> + s->io_q.idx = 0; >> + >> + if (ret >= 0) >> + return 0; > > Indentation is off and braces are missing. > >> + >> + for (i = 0; i < len; i++) { >> + struct qemu_laiocb *laiocb = >> + container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb); >> + >> + laiocb->ret = ret; >> + qemu_laio_process_completion(s, laiocb); >> + } >> + return ret; >> +} >> + >> +static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb) >> +{ >> + unsigned int idx = s->io_q.idx; >> + >> + s->io_q.iocbs[idx++] = iocb; >> + s->io_q.idx = idx; >> + >> + /* submit immediately if queue is full */ >> + if (idx == s->io_q.size) >> + ioq_submit(s); > > Missing braces. > >> +} >> + >> +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) >> +{ >> + struct qemu_laio_state *s = aio_ctx; >> + >> + s->io_q.plugged++; >> +} >> + >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) >> +{ >> + struct qemu_laio_state *s = aio_ctx; >> + int ret = 0; >> + > > How about an assert(s->io_q.plugged > 0); here? how about just adding a warning because flush io queue uses the function too? Also that is why 'plugged' is defined as signed. 
> >> + if (unplug && --s->io_q.plugged > 0) >> + return 0; > > Missing braces. > >> + >> + if (s->io_q.idx > 0) { >> + ret = ioq_submit(s); >> + } >> + >> + return ret; >> +} >> + >> BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd, >> int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, >> BlockDriverCompletionFunc *cb, void *opaque, int type) > Thanks, ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue
  2014-07-03  9:51       ` Ming Lei
@ 2014-07-03 10:04         ` Paolo Bonzini
  2014-07-03 10:45           ` Ming Lei
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2014-07-03 10:04 UTC (permalink / raw)
  To: Ming Lei, Kevin Wolf
  Cc: Peter Maydell, Fam Zheng, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin

Il 03/07/2014 11:51, Ming Lei ha scritto:
>>> >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
>>> >> +{
>>> >> +    struct qemu_laio_state *s = aio_ctx;
>>> >> +    int ret = 0;
>>> >> +
>> >
>> > How about an assert(s->io_q.plugged > 0); here?
> how about just adding a warning because flush io queue uses
> the function too?

This suggests that there is a smaller piece of code that both laio_io_unplug
and flush io queue should use.  However, note that flush io queue is not
necessary.  The only place where you're using it (before a call to
bdrv_aio_flush) does not need it.

Paolo

> Also that is why 'plugged' is defined as signed.
>
>> >
>>> >> +    if (unplug && --s->io_q.plugged > 0)
>>> >> +        return 0;
>> >
>> > Missing braces.
>> >
>>> >> +
>>> >> +    if (s->io_q.idx > 0) {
>>> >> +        ret = ioq_submit(s);
>>> >> +    }
>>> >> +
>>> >> +    return ret;

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue
  2014-07-03 10:04         ` Paolo Bonzini
@ 2014-07-03 10:45           ` Ming Lei
  2014-07-03 10:50             ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Ming Lei @ 2014-07-03 10:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi

On Thu, Jul 3, 2014 at 6:04 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 03/07/2014 11:51, Ming Lei ha scritto:
>
>>>> >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
>>>> >> +{
>>>> >> +    struct qemu_laio_state *s = aio_ctx;
>>>> >> +    int ret = 0;
>>>> >> +
>>>
>>> >
>>> > How about an assert(s->io_q.plugged > 0); here?
>>
>> how about just adding a warning because flush io queue uses
>> the function too?
>
> This suggests that there is a smaller piece of code that both laio_io_unplug
> and flush io queue should use.  However, note that flush io queue is not
> necessary.  The only place where you're using it (before a call to
> bdrv_aio_flush) does not need it.

I think it may be needed:

- the following requests come in inside handle_notify():
      req0, req1, req2-flush, req3
- both req0 and req1 are queued
- handling of req2-flush starts
- bdrv_co_flush() calls bdrv_co_flush(bs->file), which finally calls
  raw_aio_flush() to send the flush command, but the first two requests
  should have been submitted to the filesystem before the flush action

Thanks,

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 10:45 ` Ming Lei @ 2014-07-03 10:50 ` Paolo Bonzini 2014-07-03 11:31 ` Ming Lei 0 siblings, 1 reply; 26+ messages in thread From: Paolo Bonzini @ 2014-07-03 10:50 UTC (permalink / raw) To: Ming Lei Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi Il 03/07/2014 12:45, Ming Lei ha scritto: > I think it may be needed: > > - following requests coming inside handle_notify(): > req0, req1, req2-flush, req3 > - both req0 and req1 queued > - start to handle req2-flush > - bdrv_co_flush() calls bdrv_co_flush(bs->file), which finally > call raw_aio_flush() to send command, but the 1st two requests > should have been submitted to fs before the flush action No, bdrv_aio_flush is only guaranteed to flush requests that _completed_ before it was sent. Since we haven't yet told the guest that req0 and req1 are complete, there is no need to submit them before req2-flush. Paolo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue
  2014-07-03 10:50             ` Paolo Bonzini
@ 2014-07-03 11:31               ` Ming Lei
  0 siblings, 0 replies; 26+ messages in thread
From: Ming Lei @ 2014-07-03 11:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi

On Thu, Jul 3, 2014 at 6:50 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 03/07/2014 12:45, Ming Lei ha scritto:
>
>> I think it may be needed:
>>
>> - following requests coming inside handle_notify():
>>       req0, req1, req2-flush, req3
>> - both req0 and req1 queued
>> - start to handle req2-flush
>> - bdrv_co_flush() calls bdrv_co_flush(bs->file), which finally
>>   call raw_aio_flush() to send command, but the 1st two requests
>>   should have been submitted to fs before the flush action
>
> No, bdrv_aio_flush is only guaranteed to flush requests that _completed_
> before it was sent.
>
> Since we haven't yet told the guest that req0 and req1 are complete, there
> is no need to submit them before req2-flush.

You are right, and it is a bit confusing to reason across the device/host
boundary. :-)  Even with O_DIRECT, QEMU can be thought of as having no
cache.

Thanks,

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 9:51 ` Ming Lei 2014-07-03 10:04 ` Paolo Bonzini @ 2014-07-03 10:22 ` Kevin Wolf 2014-07-03 10:24 ` Ming Lei 1 sibling, 1 reply; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 10:22 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 03.07.2014 um 11:51 hat Ming Lei geschrieben: > Hi Kevin, > > On Thu, Jul 3, 2014 at 5:40 PM, Kevin Wolf <kwolf@redhat.com> wrote: > > Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > >> This patch implements .bdrv_io_plug, .bdrv_io_unplug and > >> .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, > >> so that submitting I/O as a batch can be supported on linux-aio. > >> > >> Signed-off-by: Ming Lei <ming.lei@canonical.com> > > > > Just a couple of minor comments, see inline. > > Thanks for your review. > > >> +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) > >> +{ > >> + struct qemu_laio_state *s = aio_ctx; > >> + > >> + s->io_q.plugged++; > >> +} > >> + > >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) > >> +{ > >> + struct qemu_laio_state *s = aio_ctx; > >> + int ret = 0; > >> + > > > > How about an assert(s->io_q.plugged > 0); here? > > how about just adding a warning because flush io queue uses > the function too? Good point, this is what the assertion should look like then: assert(s->io_q.plugged > 0 || !unplug); > Also that is why 'plugged' is defined as signed. I don't understand. The flush function leaves s->io_q.plugged alone (otherwise it would be buggy), so how can it ever become negative? And if you say that a negative value is valid, what would it even mean? > > > >> + if (unplug && --s->io_q.plugged > 0) > >> + return 0; > > > > Missing braces. > > > >> + > >> + if (s->io_q.idx > 0) { > >> + ret = ioq_submit(s); > >> + } > >> + > >> + return ret; > >> +} Kevin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 10:22 ` Kevin Wolf @ 2014-07-03 10:24 ` Ming Lei 2014-07-03 10:30 ` Kevin Wolf 0 siblings, 1 reply; 26+ messages in thread From: Ming Lei @ 2014-07-03 10:24 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini On Thu, Jul 3, 2014 at 6:22 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 03.07.2014 um 11:51 hat Ming Lei geschrieben: >> Hi Kevin, >> >> On Thu, Jul 3, 2014 at 5:40 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> > Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: >> >> This patch implements .bdrv_io_plug, .bdrv_io_unplug and >> >> .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, >> >> so that submitting I/O as a batch can be supported on linux-aio. >> >> >> >> Signed-off-by: Ming Lei <ming.lei@canonical.com> >> > >> > Just a couple of minor comments, see inline. >> >> Thanks for your review. >> >> >> +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) >> >> +{ >> >> + struct qemu_laio_state *s = aio_ctx; >> >> + >> >> + s->io_q.plugged++; >> >> +} >> >> + >> >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) >> >> +{ >> >> + struct qemu_laio_state *s = aio_ctx; >> >> + int ret = 0; >> >> + >> > >> > How about an assert(s->io_q.plugged > 0); here? >> >> how about just adding a warning because flush io queue uses >> the function too? > > Good point, this is what the assertion should look like then: > > assert(s->io_q.plugged > 0 || !unplug); OK, will do it. >> Also that is why 'plugged' is defined as signed. > > I don't understand. The flush function leaves s->io_q.plugged alone > (otherwise it would be buggy), so how can it ever become negative? And > if you say that a negative value is valid, what would it even mean? I mean it is easy to detect bug with negative value, :-) Thanks, -- Ming Lei ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 10:24 ` Ming Lei @ 2014-07-03 10:30 ` Kevin Wolf 0 siblings, 0 replies; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 10:30 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 03.07.2014 um 12:24 hat Ming Lei geschrieben: > On Thu, Jul 3, 2014 at 6:22 PM, Kevin Wolf <kwolf@redhat.com> wrote: > > Am 03.07.2014 um 11:51 hat Ming Lei geschrieben: > >> Hi Kevin, > >> > >> On Thu, Jul 3, 2014 at 5:40 PM, Kevin Wolf <kwolf@redhat.com> wrote: > >> > Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > >> >> This patch implements .bdrv_io_plug, .bdrv_io_unplug and > >> >> .bdrv_flush_io_queue callbacks for linux-aio Block Drivers, > >> >> so that submitting I/O as a batch can be supported on linux-aio. > >> >> > >> >> Signed-off-by: Ming Lei <ming.lei@canonical.com> > >> > > >> > Just a couple of minor comments, see inline. > >> > >> Thanks for your review. > >> > >> >> +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) > >> >> +{ > >> >> + struct qemu_laio_state *s = aio_ctx; > >> >> + > >> >> + s->io_q.plugged++; > >> >> +} > >> >> + > >> >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) > >> >> +{ > >> >> + struct qemu_laio_state *s = aio_ctx; > >> >> + int ret = 0; > >> >> + > >> > > >> > How about an assert(s->io_q.plugged > 0); here? > >> > >> how about just adding a warning because flush io queue uses > >> the function too? > > > > Good point, this is what the assertion should look like then: > > > > assert(s->io_q.plugged > 0 || !unplug); > > OK, will do it. > > >> Also that is why 'plugged' is defined as signed. > > > > I don't understand. The flush function leaves s->io_q.plugged alone > > (otherwise it would be buggy), so how can it ever become negative? And > > if you say that a negative value is valid, what would it even mean? > > I mean it is easy to detect bug with negative value, :-) Ah, I see. That makes some sense then. :-) Kevin ^ permalink raw reply [flat|nested] 26+ messages in thread
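Collecting the points agreed in this sub-thread (an assertion that tolerates
the flush path, braces per QEMU style), the unplug helper in a later revision
would presumably end up looking roughly like the sketch below; it is a
reconstruction for clarity, not a quote from v5/v6:

int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
    struct qemu_laio_state *s = aio_ctx;
    int ret = 0;

    /* A real unplug must be paired with a preceding plug; the flush-io-queue
     * path calls this with unplug == false and may see plugged == 0. */
    assert(s->io_q.plugged > 0 || !unplug);

    if (unplug && --s->io_q.plugged > 0) {
        return 0;
    }

    if (s->io_q.idx > 0) {
        ret = ioq_submit(s);
    }

    return ret;
}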
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei 2014-07-03 9:40 ` Kevin Wolf @ 2014-07-03 12:24 ` Stefan Hajnoczi 2014-07-04 9:18 ` Ming Lei 1 sibling, 1 reply; 26+ messages in thread From: Stefan Hajnoczi @ 2014-07-03 12:24 UTC (permalink / raw) To: Ming Lei Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini [-- Attachment #1: Type: text/plain, Size: 2045 bytes --] On Wed, Jul 02, 2014 at 08:18:47PM +0800, Ming Lei wrote: > @@ -36,9 +38,19 @@ struct qemu_laiocb { > QLIST_ENTRY(qemu_laiocb) node; > }; > > +struct laio_queue { This file doesn't follow QEMU coding style for structs but normally this should be: typedef struct { ... } LaioQueue; Up to you if you want to fix it. See ./HACKING and ./CODING_STYLE. Please always run patches throught scripts/checkpatch.pl before submitting them. > +static int ioq_submit(struct qemu_laio_state *s) > +{ > + int ret, i = 0; > + int len = s->io_q.idx; > + > + do { > + ret = io_submit(s->ctx, len, s->io_q.iocbs); > + } while (i++ < 3 && ret == -EAGAIN); > + > + /* empty io queue */ > + s->io_q.idx = 0; > + > + if (ret >= 0) > + return 0; This leaks requests when ret < len. I think the loop below should be used in that case to fail unsubmitted requests with -EIO. Also, QEMU always uses {} even when there is only one statement in the if body and 4-space indentation. > + > + for (i = 0; i < len; i++) { > + struct qemu_laiocb *laiocb = > + container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb); > + > + laiocb->ret = ret; > + qemu_laio_process_completion(s, laiocb); > + } > + return ret; > +} > + > +static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb) > +{ > + unsigned int idx = s->io_q.idx; > + > + s->io_q.iocbs[idx++] = iocb; > + s->io_q.idx = idx; > + > + /* submit immediately if queue is full */ > + if (idx == s->io_q.size) > + ioq_submit(s); Missing {} > +} > + > +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) > +{ > + struct qemu_laio_state *s = aio_ctx; > + > + s->io_q.plugged++; > +} > + > +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) > +{ > + struct qemu_laio_state *s = aio_ctx; > + int ret = 0; > + > + if (unplug && --s->io_q.plugged > 0) > + return 0; Missing {} [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
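For reference, the QEMU-style declaration Stefan is asking for would look
roughly like the following, with the field set taken from the posted patch:

typedef struct {
    struct iocb *iocbs[MAX_QUEUED_IO];
    int plugged;
    unsigned int size;
    unsigned int idx;
} LaioQueue;

ioq_init() and the io_q member of struct qemu_laio_state would then use
LaioQueue instead of struct laio_queue.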
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-03 12:24 ` Stefan Hajnoczi @ 2014-07-04 9:18 ` Ming Lei 2014-07-04 9:21 ` Ming Lei 0 siblings, 1 reply; 26+ messages in thread From: Ming Lei @ 2014-07-04 9:18 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Hi Stefan, Sorry for missing your comments. On Thu, Jul 3, 2014 at 8:24 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Wed, Jul 02, 2014 at 08:18:47PM +0800, Ming Lei wrote: >> @@ -36,9 +38,19 @@ struct qemu_laiocb { >> QLIST_ENTRY(qemu_laiocb) node; >> }; >> >> +struct laio_queue { > > This file doesn't follow QEMU coding style for structs but normally this > should be: > typedef struct { > ... > } LaioQueue; > > Up to you if you want to fix it. See ./HACKING and ./CODING_STYLE. OK. > Please always run patches throught scripts/checkpatch.pl before > submitting them. Looks no failure and warning. > >> +static int ioq_submit(struct qemu_laio_state *s) >> +{ >> + int ret, i = 0; >> + int len = s->io_q.idx; >> + >> + do { >> + ret = io_submit(s->ctx, len, s->io_q.iocbs); >> + } while (i++ < 3 && ret == -EAGAIN); >> + >> + /* empty io queue */ >> + s->io_q.idx = 0; >> + >> + if (ret >= 0) >> + return 0; > > This leaks requests when ret < len. I think the loop below should be > used in that case to fail unsubmitted requests with -EIO. I thought about the problem before, but looks it may not return 'ret' which is less than len, follows the man io_submit(): The function returns immediately after having enqueued all the requests. On suc‐ cess, io_submit returns the number of iocbs submitted successfully. Otherwise, -error is return, where error is one of the Exxx values defined in the Errors sec‐ tion. The above description looks a bit confusing, and I will check linux fs/aio.c to see if there is the case. > > Also, QEMU always uses {} even when there is only one statement in the > if body and 4-space indentation. > >> + >> + for (i = 0; i < len; i++) { >> + struct qemu_laiocb *laiocb = >> + container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb); >> + >> + laiocb->ret = ret; >> + qemu_laio_process_completion(s, laiocb); >> + } >> + return ret; >> +} >> + >> +static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb) >> +{ >> + unsigned int idx = s->io_q.idx; >> + >> + s->io_q.iocbs[idx++] = iocb; >> + s->io_q.idx = idx; >> + >> + /* submit immediately if queue is full */ >> + if (idx == s->io_q.size) >> + ioq_submit(s); > > Missing {} > >> +} >> + >> +void laio_io_plug(BlockDriverState *bs, void *aio_ctx) >> +{ >> + struct qemu_laio_state *s = aio_ctx; >> + >> + s->io_q.plugged++; >> +} >> + >> +int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug) >> +{ >> + struct qemu_laio_state *s = aio_ctx; >> + int ret = 0; >> + >> + if (unplug && --s->io_q.plugged > 0) >> + return 0; > > Missing {} All these coding style problems have been fixed. Thanks, ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue 2014-07-04 9:18 ` Ming Lei @ 2014-07-04 9:21 ` Ming Lei 0 siblings, 0 replies; 26+ messages in thread From: Ming Lei @ 2014-07-04 9:21 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini On Fri, Jul 4, 2014 at 5:18 PM, Ming Lei <ming.lei@canonical.com> wrote: >> >> This leaks requests when ret < len. I think the loop below should be >> used in that case to fail unsubmitted requests with -EIO. > > I thought about the problem before, but looks it may not return 'ret' > which is less than > len, follows the man io_submit(): > > The function returns immediately after having enqueued all > the requests. On suc‐ > cess, io_submit returns the number of iocbs submitted > successfully. Otherwise, > -error is return, where error is one of the Exxx values > defined in the Errors sec‐ > tion. > > The above description looks a bit confusing, and I will check linux > fs/aio.c to see > if there is the case. Ah, fs/aio.c do have the case, will fix in v6. Thanks, ^ permalink raw reply [flat|nested] 26+ messages in thread
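To spell out the case being discussed: io_submit() may accept only the first
N of the passed iocbs and return that count, so a fixed ioq_submit() has to
complete the leftover requests with an error rather than drop them. A rough
sketch of what the v6 change might look like (a guess, not the code that was
eventually merged):

static int ioq_submit(struct qemu_laio_state *s)
{
    int ret, i = 0;
    int len = s->io_q.idx;

    do {
        ret = io_submit(s->ctx, len, s->io_q.iocbs);
    } while (i++ < 3 && ret == -EAGAIN);

    /* empty the software queue in any case */
    s->io_q.idx = 0;

    if (ret == len) {
        return 0;
    }

    /* Partial success: the first 'ret' iocbs were submitted, so fail only
     * the remainder with -EIO; on outright failure, fail all of them. */
    for (i = ret < 0 ? 0 : ret; i < len; i++) {
        struct qemu_laiocb *laiocb =
            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);

        laiocb->ret = ret < 0 ? ret : -EIO;
        qemu_laio_process_completion(s, laiocb);
    }

    return ret < 0 ? ret : 0;
}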
* [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-02 12:18 [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch Ming Lei 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 1/3] block: block: introduce APIs for submitting IO as Ming Lei 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 2/3] linux-aio: implement io plug, unplug and flush io queue Ming Lei @ 2014-07-02 12:18 ` Ming Lei 2014-07-03 9:44 ` Kevin Wolf ` (2 more replies) 2014-07-03 9:25 ` [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce " Kevin Wolf 2014-07-03 12:36 ` Stefan Hajnoczi 4 siblings, 3 replies; 26+ messages in thread From: Ming Lei @ 2014-07-02 12:18 UTC (permalink / raw) To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin Before commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O), dataplane for virtio-blk submits block I/O as a batch. This commit 580b6b2aa2 replaces the custom linux AIO implementation(including submit I/O as a batch) with QEMU block layer, but this commit causes ~40% throughput regression on virtio-blk performance, and removing submitting I/O as a batch is one of the causes. This patch applies the newly introduced bdrv_io_plug() and bdrv_io_unplug() interfaces to support submitting I/O at batch for Qemu block layer, and in my test, the change can improve throughput by ~30% with 'aio=native'. Following my fio test script: [global] direct=1 size=4G bsrange=4k-4k timeout=40 numjobs=4 ioengine=libaio iodepth=64 filename=/dev/vdc group_reporting=1 [f] rw=randread Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores): - qemu master: 59K IOPS - qemu master with these patches: 81K IOPS - 2.0.0 release(dataplane using custom linux aio): 104K IOPS Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> --- hw/block/dataplane/virtio-blk.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index c10b7b7..82bb276 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -201,6 +201,9 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, req->elem = elem; req->inhdr = inhdr; + /* flush IOs queued first */ + bdrv_flush_io_queue(s->blk->conf.bs); + bdrv_aio_flush(s->blk->conf.bs, complete_flush, req); } @@ -289,6 +292,7 @@ static void handle_notify(EventNotifier *e) int ret; event_notifier_test_and_clear(&s->host_notifier); + bdrv_io_plug(s->blk->conf.bs); for (;;) { /* Disable guest->host notifies to avoid unnecessary vmexits */ vring_disable_notification(s->vdev, &s->vring); @@ -322,6 +326,7 @@ static void handle_notify(EventNotifier *e) break; } } + bdrv_io_unplug(s->blk->conf.bs); } /* Context: QEMU global mutex held */ -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch Ming Lei @ 2014-07-03 9:44 ` Kevin Wolf 2014-07-03 9:49 ` Paolo Bonzini 2014-07-03 12:35 ` Stefan Hajnoczi 2014-07-03 12:37 ` Stefan Hajnoczi 2 siblings, 1 reply; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 9:44 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > Before commit 580b6b2aa2(dataplane: use the QEMU block > layer for I/O), dataplane for virtio-blk submits block > I/O as a batch. > > This commit 580b6b2aa2 replaces the custom linux AIO > implementation(including submit I/O as a batch) with QEMU > block layer, but this commit causes ~40% throughput regression > on virtio-blk performance, and removing submitting I/O > as a batch is one of the causes. > > This patch applies the newly introduced bdrv_io_plug() and > bdrv_io_unplug() interfaces to support submitting I/O > at batch for Qemu block layer, and in my test, the change > can improve throughput by ~30% with 'aio=native'. > > Following my fio test script: > > [global] > direct=1 > size=4G > bsrange=4k-4k > timeout=40 > numjobs=4 > ioengine=libaio > iodepth=64 > filename=/dev/vdc > group_reporting=1 > > [f] > rw=randread > > Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores): > - qemu master: 59K IOPS > - qemu master with these patches: 81K IOPS > - 2.0.0 release(dataplane using custom linux aio): 104K IOPS > > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewing this one doesn't make sense because commit b002254d ('virtio-blk: Unify {non-,}dataplane's request handlings') removes the patched code. You need to patch common virtio-blk code now (which should improve non-dataplane virtio-blk as well). Kevin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-03 9:44 ` Kevin Wolf @ 2014-07-03 9:49 ` Paolo Bonzini 2014-07-03 10:18 ` Kevin Wolf 0 siblings, 1 reply; 26+ messages in thread From: Paolo Bonzini @ 2014-07-03 9:49 UTC (permalink / raw) To: Kevin Wolf, Ming Lei Cc: Peter Maydell, Fam Zheng, qemu-devel, Stefan Hajnoczi, Michael S. Tsirkin Il 03/07/2014 11:44, Kevin Wolf ha scritto: > Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: >> Before commit 580b6b2aa2(dataplane: use the QEMU block >> layer for I/O), dataplane for virtio-blk submits block >> I/O as a batch. >> >> This commit 580b6b2aa2 replaces the custom linux AIO >> implementation(including submit I/O as a batch) with QEMU >> block layer, but this commit causes ~40% throughput regression >> on virtio-blk performance, and removing submitting I/O >> as a batch is one of the causes. >> >> This patch applies the newly introduced bdrv_io_plug() and >> bdrv_io_unplug() interfaces to support submitting I/O >> at batch for Qemu block layer, and in my test, the change >> can improve throughput by ~30% with 'aio=native'. >> >> Following my fio test script: >> >> [global] >> direct=1 >> size=4G >> bsrange=4k-4k >> timeout=40 >> numjobs=4 >> ioengine=libaio >> iodepth=64 >> filename=/dev/vdc >> group_reporting=1 >> >> [f] >> rw=randread >> >> Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores): >> - qemu master: 59K IOPS >> - qemu master with these patches: 81K IOPS >> - 2.0.0 release(dataplane using custom linux aio): 104K IOPS >> >> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> >> Signed-off-by: Ming Lei <ming.lei@canonical.com> > > Reviewing this one doesn't make sense because commit b002254d > ('virtio-blk: Unify {non-,}dataplane's request handlings') removes the > patched code. You need to patch common virtio-blk code now (which should > improve non-dataplane virtio-blk as well). Actually no, the second and third hunk still apply. The patch you mentioned unifies request processing, but the loops that fetches requests from the virtqueue are still separate. This is because virtio-blk-dataplane still uses the vring API instead of the generic virtio API. address_space_map/unmap is not thread-safe yet, vring avoids it. Regarding the first hunk, it is not needed at all. Paolo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-03 9:49 ` Paolo Bonzini @ 2014-07-03 10:18 ` Kevin Wolf 0 siblings, 0 replies; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 10:18 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, Ming Lei, qemu-devel, Stefan Hajnoczi Am 03.07.2014 um 11:49 hat Paolo Bonzini geschrieben: > Il 03/07/2014 11:44, Kevin Wolf ha scritto: > >Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > >>Before commit 580b6b2aa2(dataplane: use the QEMU block > >>layer for I/O), dataplane for virtio-blk submits block > >>I/O as a batch. > >> > >>This commit 580b6b2aa2 replaces the custom linux AIO > >>implementation(including submit I/O as a batch) with QEMU > >>block layer, but this commit causes ~40% throughput regression > >>on virtio-blk performance, and removing submitting I/O > >>as a batch is one of the causes. > >> > >>This patch applies the newly introduced bdrv_io_plug() and > >>bdrv_io_unplug() interfaces to support submitting I/O > >>at batch for Qemu block layer, and in my test, the change > >>can improve throughput by ~30% with 'aio=native'. > >> > >>Following my fio test script: > >> > >> [global] > >> direct=1 > >> size=4G > >> bsrange=4k-4k > >> timeout=40 > >> numjobs=4 > >> ioengine=libaio > >> iodepth=64 > >> filename=/dev/vdc > >> group_reporting=1 > >> > >> [f] > >> rw=randread > >> > >>Result on one of my small machine(host: x86_64, 2cores, 4thread, guest: 4cores): > >> - qemu master: 59K IOPS > >> - qemu master with these patches: 81K IOPS > >> - 2.0.0 release(dataplane using custom linux aio): 104K IOPS > >> > >>Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > >>Signed-off-by: Ming Lei <ming.lei@canonical.com> > > > >Reviewing this one doesn't make sense because commit b002254d > >('virtio-blk: Unify {non-,}dataplane's request handlings') removes the > >patched code. You need to patch common virtio-blk code now (which should > >improve non-dataplane virtio-blk as well). > > Actually no, the second and third hunk still apply. The patch you > mentioned unifies request processing, but the loops that fetches > requests from the virtqueue are still separate. This is because > virtio-blk-dataplane still uses the vring API instead of the generic > virtio API. address_space_map/unmap is not thread-safe yet, vring > avoids it. > > Regarding the first hunk, it is not needed at all. Right, I only saw the merge conflict on the first hunk and gave up. I didn't even look at the unapplied patch to see that it's wrong and should be dropped indeed. Kevin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch Ming Lei 2014-07-03 9:44 ` Kevin Wolf @ 2014-07-03 12:35 ` Stefan Hajnoczi 2014-07-03 12:37 ` Stefan Hajnoczi 2 siblings, 0 replies; 26+ messages in thread From: Stefan Hajnoczi @ 2014-07-03 12:35 UTC (permalink / raw) To: Ming Lei Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini [-- Attachment #1: Type: text/plain, Size: 922 bytes --] On Wed, Jul 02, 2014 at 08:18:48PM +0800, Ming Lei wrote: > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index c10b7b7..82bb276 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -201,6 +201,9 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, > req->elem = elem; > req->inhdr = inhdr; > > + /* flush IOs queued first */ > + bdrv_flush_io_queue(s->blk->conf.bs); > + > bdrv_aio_flush(s->blk->conf.bs, complete_flush, req); > } > I looked back at previous email threads but I don't understand why this is necessary. bdrv_aio_flush() commits the disk write cache, that means _already completed_ writes will be on stable storage. However, it does not make any guarantees about in-flight writes. So this seems like a pointless call that can be dropped safely. Stefan [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch Ming Lei 2014-07-03 9:44 ` Kevin Wolf 2014-07-03 12:35 ` Stefan Hajnoczi @ 2014-07-03 12:37 ` Stefan Hajnoczi 2014-07-03 12:47 ` Ming Lei 2 siblings, 1 reply; 26+ messages in thread From: Stefan Hajnoczi @ 2014-07-03 12:37 UTC (permalink / raw) To: Ming Lei Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini [-- Attachment #1: Type: text/plain, Size: 1158 bytes --] On Wed, Jul 02, 2014 at 08:18:48PM +0800, Ming Lei wrote: > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index c10b7b7..82bb276 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -201,6 +201,9 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, > req->elem = elem; > req->inhdr = inhdr; > > + /* flush IOs queued first */ > + bdrv_flush_io_queue(s->blk->conf.bs); > + > bdrv_aio_flush(s->blk->conf.bs, complete_flush, req); > } > > @@ -289,6 +292,7 @@ static void handle_notify(EventNotifier *e) > int ret; > > event_notifier_test_and_clear(&s->host_notifier); > + bdrv_io_plug(s->blk->conf.bs); > for (;;) { > /* Disable guest->host notifies to avoid unnecessary vmexits */ > vring_disable_notification(s->vdev, &s->vring); > @@ -322,6 +326,7 @@ static void handle_notify(EventNotifier *e) > break; > } > } > + bdrv_io_unplug(s->blk->conf.bs); > } Might as well do the same for non-dataplane in hw/block/virtio-blk.c? Stefan [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch 2014-07-03 12:37 ` Stefan Hajnoczi @ 2014-07-03 12:47 ` Ming Lei 0 siblings, 0 replies; 26+ messages in thread From: Ming Lei @ 2014-07-03 12:47 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini On Thu, Jul 3, 2014 at 8:37 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Wed, Jul 02, 2014 at 08:18:48PM +0800, Ming Lei wrote: >> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c >> index c10b7b7..82bb276 100644 >> --- a/hw/block/dataplane/virtio-blk.c >> +++ b/hw/block/dataplane/virtio-blk.c >> @@ -201,6 +201,9 @@ static void do_flush_cmd(VirtIOBlockDataPlane *s, VirtQueueElement *elem, >> req->elem = elem; >> req->inhdr = inhdr; >> >> + /* flush IOs queued first */ >> + bdrv_flush_io_queue(s->blk->conf.bs); >> + >> bdrv_aio_flush(s->blk->conf.bs, complete_flush, req); >> } >> >> @@ -289,6 +292,7 @@ static void handle_notify(EventNotifier *e) >> int ret; >> >> event_notifier_test_and_clear(&s->host_notifier); >> + bdrv_io_plug(s->blk->conf.bs); >> for (;;) { >> /* Disable guest->host notifies to avoid unnecessary vmexits */ >> vring_disable_notification(s->vdev, &s->vring); >> @@ -322,6 +326,7 @@ static void handle_notify(EventNotifier *e) >> break; >> } >> } >> + bdrv_io_unplug(s->blk->conf.bs); >> } > > Might as well do the same for non-dataplane in hw/block/virtio-blk.c? It may need further optimization for non-dataplane, and I saw the message of 'main-loop: WARNING: I/O thread spun for 1000 iterations' when I enabled for non-dataplane. The queue depth should be decreased for non-dataplane. Thanks, ^ permalink raw reply [flat|nested] 26+ messages in thread
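For completeness, the non-dataplane change Stefan is suggesting would
presumably just bracket the request loop in hw/block/virtio-blk.c in the same
way. A rough sketch against the code of that era follows; the function and
field names (virtio_blk_get_request, virtio_submit_multiwrite, s->bs) are
from memory and approximate, and as noted above the batch size would likely
need tuning before this is a clear win:

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);
    VirtIOBlockReq *req;
    MultiReqBuffer mrb = {
        .num_writes = 0,
    };

    bdrv_io_plug(s->bs);                      /* queue instead of submitting */
    while ((req = virtio_blk_get_request(s))) {
        virtio_blk_handle_request(req, &mrb);
    }
    virtio_submit_multiwrite(s->bs, &mrb);
    bdrv_io_unplug(s->bs);                    /* one io_submit() for the batch */
}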
* Re: [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch 2014-07-02 12:18 [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch Ming Lei ` (2 preceding siblings ...) 2014-07-02 12:18 ` [Qemu-devel] [PATCH v4 3/3] dataplane: submit I/O as a batch Ming Lei @ 2014-07-03 9:25 ` Kevin Wolf 2014-07-03 12:36 ` Stefan Hajnoczi 4 siblings, 0 replies; 26+ messages in thread From: Kevin Wolf @ 2014-07-03 9:25 UTC (permalink / raw) To: Ming Lei Cc: Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini Am 02.07.2014 um 14:18 hat Ming Lei geschrieben: > Hi, > > The commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O) > introduces ~40% throughput regression on virtio-blk dataplane, and > one of causes is that submitting I/O as a batch is removed. > > This patchset trys to introduce this mechanism on block, at least, > linux-aio can benefit from that. > > With these patches, it is observed that thoughout on virtio-blk > dataplane can be improved a lot, see data in commit log of patch > 3/3. > > It should be possible to apply the batch mechanism to other devices > (such as virtio-scsi) too. This series doesn't seem to apply cleanly on current master. Can you rebase, please? Meanwhile, I'll try reviewing the patches as good as it's possibly when they don't apply. Kevin ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch 2014-07-02 12:18 [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce submit I/O as a batch Ming Lei ` (3 preceding siblings ...) 2014-07-03 9:25 ` [Qemu-devel] [PATCH v4 0/3] linux-aio: introduce " Kevin Wolf @ 2014-07-03 12:36 ` Stefan Hajnoczi 4 siblings, 0 replies; 26+ messages in thread From: Stefan Hajnoczi @ 2014-07-03 12:36 UTC (permalink / raw) To: Ming Lei Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin, qemu-devel, Stefan Hajnoczi, Paolo Bonzini [-- Attachment #1: Type: text/plain, Size: 1477 bytes --] On Wed, Jul 02, 2014 at 08:18:45PM +0800, Ming Lei wrote: > Hi, > > The commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O) > introduces ~40% throughput regression on virtio-blk dataplane, and > one of causes is that submitting I/O as a batch is removed. > > This patchset trys to introduce this mechanism on block, at least, > linux-aio can benefit from that. > > With these patches, it is observed that thoughout on virtio-blk > dataplane can be improved a lot, see data in commit log of patch > 3/3. > > It should be possible to apply the batch mechanism to other devices > (such as virtio-scsi) too. > > TODO: > - support queuing I/O to multi files which need lots of change > to linux-aio for scsi devies > > V4: > - support other non-raw formats with under-optimized performance > - use reference counter for plug & unplug > - flush io queue before sending flush command > > V3: > - only support submitting I/O as a batch for raw format, pointed by > Kevin > > V2: > - define return value of bdrv_io_unplug as void, suggested by Paolo > - avoid busy-wait for handling io_submit > V1: > - move queuing io stuff into linux-aio.c as suggested by Paolo > > > Thanks, > -- > Ming Lei Nice performance fix. Together with the other fixes we've been discussing I think we can fight a dataplane performance regression from QEMU 2.0 to 2.1. Looking forward to the next revision. Stefan [-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread