* [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission
@ 2014-12-01 9:04 Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf
The 1st patch fixes batch submission.
The 2nd one fixes -EAGAIN for non-batch case.
The 3rd one is a cleanup.
This patchset is splitted from previous patchset(dataplane: optimization
and multi virtqueue support), as suggested by Stefan.
V7:
- add protection for aborting in laio_attach_aio_context(), as suggested
by Stefan, 1/3
- patch style, return real aborting failure to caller, as suggested
by Kevin, 1/3
- track pending I/O and only handle -EAGAIN if there is pending I/O,
pointed by Kevin, 2/3
V6:
- don't pass ioq_submit() return value to ioq_enqueue(), as suggested
by Stefan
- fix one build failure introduced in V5, reported by Stefan
V5:
- in case of submission failure, return -EIO for new coming requests
until aborting is handled
- in patch2, follow Paolo's suggestion about ioq_enqueue() changes
V4:
- abort reuqests in BH to abvoid potential "Co-routine re-entered recursively"
- remove 'enqueue' parameter to ioq_submit() to simpify change
- beautify code as suggested by Paolo
V3:
- rebase on QEMU master
V2:
- code style fix and commit log fix as suggested by Benoît Canet
V1:
- rebase on latest QEMU master
block/linux-aio.c | 139 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 116 insertions(+), 23 deletions(-)
Thanks,
Ming Lei
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
In the submit path, we can't complete request directly,
otherwise "Co-routine re-entered recursively" may be caused,
so this patch fixes the issue with below ideas:
- for -EAGAIN or partial submission, retry the submision
in following completion cb which is run in BH context
- for part of submission, update the io queue too
- for case of io queue full, submit queued requests
immediatelly and return failure to caller
- for other failure, abort all queued requests in BH
context, and requests won't be allow to submit until
aborting is handled
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 116 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 97 insertions(+), 19 deletions(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index d92513b..53c5616 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -38,11 +38,21 @@ struct qemu_laiocb {
QLIST_ENTRY(qemu_laiocb) node;
};
+/*
+ * TODO: support to batch I/O from multiple bs in one same
+ * AIO context, one important use case is multi-lun scsi,
+ * so in future the IO queue should be per AIO context.
+ */
typedef struct {
struct iocb *iocbs[MAX_QUEUED_IO];
int plugged;
unsigned int size;
unsigned int idx;
+
+ /* abort queued requests in BH context */
+ QEMUBH *abort_bh;
+ bool aborting;
+ int abort_ret;
} LaioQueue;
struct qemu_laio_state {
@@ -59,6 +69,8 @@ struct qemu_laio_state {
int event_max;
};
+static int ioq_submit(struct qemu_laio_state *s);
+
static inline ssize_t io_event_ret(struct io_event *ev)
{
return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -135,6 +147,11 @@ static void qemu_laio_completion_bh(void *opaque)
qemu_laio_process_completion(s, laiocb);
}
+
+ /* Handle -EAGAIN or partial submission */
+ if (s->io_q.idx) {
+ ioq_submit(s);
+ }
}
static void qemu_laio_completion_cb(EventNotifier *e)
@@ -175,47 +192,100 @@ static void ioq_init(LaioQueue *io_q)
io_q->size = MAX_QUEUED_IO;
io_q->idx = 0;
io_q->plugged = 0;
+ io_q->aborting = false;
}
+/* Always return >= 0 and it means how many requests are submitted */
static int ioq_submit(struct qemu_laio_state *s)
{
- int ret, i = 0;
+ int ret;
int len = s->io_q.idx;
- do {
- ret = io_submit(s->ctx, len, s->io_q.iocbs);
- } while (i++ < 3 && ret == -EAGAIN);
-
- /* empty io queue */
- s->io_q.idx = 0;
+ if (!len) {
+ return 0;
+ }
+ ret = io_submit(s->ctx, len, s->io_q.iocbs);
if (ret < 0) {
- i = 0;
- } else {
- i = ret;
+ /* retry in following completion cb */
+ if (ret == -EAGAIN) {
+ return 0;
+ }
+
+ /*
+ * Abort in BH context for avoiding Co-routine re-entered,
+ * and update io queue at that time
+ */
+ s->io_q.aborting = true;
+ s->io_q.abort_ret = ret;
+ qemu_bh_schedule(s->io_q.abort_bh);
+ ret = 0;
}
- for (; i < len; i++) {
- struct qemu_laiocb *laiocb =
- container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+ /*
+ * update io queue, and retry will be started automatically
+ * in following completion cb for the remainder
+ */
+ if (ret > 0) {
+ if (ret < len) {
+ memmove(&s->io_q.iocbs[0], &s->io_q.iocbs[ret],
+ (len - ret) * sizeof(struct iocb *));
+ }
+ s->io_q.idx -= ret;
+ }
+
+ return ret;
+}
- laiocb->ret = (ret < 0) ? ret : -EIO;
+static void ioq_abort_bh(void *opaque)
+{
+ struct qemu_laio_state *s = opaque;
+ int i;
+
+ for (i = 0; i < s->io_q.idx; i++) {
+ struct qemu_laiocb *laiocb = container_of(s->io_q.iocbs[i],
+ struct qemu_laiocb,
+ iocb);
+ laiocb->ret = s->io_q.abort_ret;
qemu_laio_process_completion(s, laiocb);
}
- return ret;
+
+ s->io_q.idx = 0;
+ s->io_q.aborting = false;
}
-static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
{
unsigned int idx = s->io_q.idx;
+ /* Request can't be allowed to submit until aborting is handled */
+ if (unlikely(s->io_q.aborting)) {
+ return -EIO;
+ }
+
+ if (unlikely(idx == s->io_q.size)) {
+ ioq_submit(s);
+
+ if (unlikely(s->io_q.aborting)) {
+ return -EIO;
+ }
+ idx = s->io_q.idx;
+ }
+
+ /* It has to return now if queue is still full */
+ if (unlikely(idx == s->io_q.size)) {
+ return -EAGAIN;
+ }
+
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
- /* submit immediately if queue is full */
- if (idx == s->io_q.size) {
+ /* submit immediately if queue depth is above 2/3 */
+ if (idx > s->io_q.size * 2 / 3) {
ioq_submit(s);
}
+
+ return 0;
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
@@ -281,7 +351,9 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
goto out_free_aiocb;
}
} else {
- ioq_enqueue(s, iocbs);
+ if (ioq_enqueue(s, iocbs) < 0) {
+ goto out_free_aiocb;
+ }
}
return &laiocb->common;
@@ -296,14 +368,20 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
+ qemu_bh_delete(s->io_q.abort_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
+ s->io_q.abort_bh = aio_bh_new(new_context, ioq_abort_bh, s);
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
+
+ if (s->io_q.aborting) {
+ qemu_bh_schedule(s->io_q.abort_bh);
+ }
}
void *laio_init(void)
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
Previously -EAGAIN is simply ignored for !s->io_q.plugged case,
and sometimes it is easy to cause -EIO to VM, such as NVME device.
This patch handles -EAGAIN by io queue for !s->io_q.plugged case,
and it will be retried in following aio completion cb.
Most of times, -EAGAIN only happens if there is pending I/O, but
from linux kernel AIO implementation io_submit() might return it
when kmem_cache_alloc(GFP_KERNEL) returns NULL too. So 'pending'
in 'struct qemu_laio_state' is introduced for tracking active IO,
and -EAGAIN is handled when there is pending I/O.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 53c5616..9403b17 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -56,6 +56,7 @@ typedef struct {
} LaioQueue;
struct qemu_laio_state {
+ unsigned long pending;
io_context_t ctx;
EventNotifier e;
@@ -98,6 +99,7 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
}
}
}
+ s->pending--;
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
@@ -179,6 +181,7 @@ static void laio_cancel(BlockAIOCB *blockacb)
return;
}
+ laiocb->ctx->pending--;
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
}
@@ -280,8 +283,13 @@ static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
- /* submit immediately if queue depth is above 2/3 */
- if (idx > s->io_q.size * 2 / 3) {
+ /*
+ * This is reached in two cases: queue not plugged but io_submit
+ * returned -EAGAIN, or queue plugged. In the latter case, start
+ * submitting some I/O if the queue is getting too full. In the
+ * former case, instead, wait until an I/O operation is completed.
+ */
+ if (s->io_q.plugged && unlikely(idx > s->io_q.size * 2 / 3)) {
ioq_submit(s);
}
@@ -346,15 +354,23 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
- if (!s->io_q.plugged) {
- if (io_submit(s->ctx, 1, &iocbs) < 0) {
- goto out_free_aiocb;
- }
- } else {
- if (ioq_enqueue(s, iocbs) < 0) {
+ /* Switch to queue mode until -EAGAIN is handled */
+ if (!s->io_q.plugged && !s->io_q.idx) {
+ int ret = io_submit(s->ctx, 1, &iocbs);
+ if (ret >= 0) {
+ return &laiocb->common;
+ } else if (ret != -EAGAIN || (ret == -EAGAIN && !s->pending)) {
goto out_free_aiocb;
}
+ /*
+ * In case of -EAGAIN, only queue the req if there is pending
+ * I/O and it is resubmitted in completion of pending I/O
+ */
+ }
+ if (ioq_enqueue(s, iocbs) < 0) {
+ goto out_free_aiocb;
}
+ s->pending++;
return &laiocb->common;
out_free_aiocb:
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb'
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
No one uses the 'node' field any more, so remove it
from 'struct qemu_laiocb', and this can save 16byte
for the struct on 64bit arch.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 9403b17..cad3848 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -35,7 +35,6 @@ struct qemu_laiocb {
size_t nbytes;
QEMUIOVector *qiov;
bool is_read;
- QLIST_ENTRY(qemu_laiocb) node;
};
/*
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
` (2 preceding siblings ...)
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
@ 2014-12-10 13:11 ` Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2014-12-10 13:11 UTC (permalink / raw)
To: Ming Lei, qemu-devel, Stefan Hajnoczi, Kevin Wolf
On 01/12/2014 10:04, Ming Lei wrote:
> The 1st patch fixes batch submission.
>
> The 2nd one fixes -EAGAIN for non-batch case.
>
> The 3rd one is a cleanup.
>
> This patchset is splitted from previous patchset(dataplane: optimization
> and multi virtqueue support), as suggested by Stefan.
I think this series needs a rewrite. Instead of removing 'node' from
'struct qemu_laiocb', each LaioQueue should actually have a list (or
SIMPLEQ) of pending 'struct qemu_laiocb'.
Every time you do an io_submit, you copy from the laiocb queue to the
iocbs array. If it returns N, the first N elements are removed from the
list. There is no need to do a memmove, the iocbs will be populated
again on the next io_submit. If io_submit fails with anything but
EAGAIN, just abort(). It should never happen.
Every time you get a request, and the list was empty, you do an io_submit.
Every time you get a bdrv_unplug and the nesting level was 1, you do an
io_submit.
Every time you get a completion, if the pending queue is not empty, you
do an io_submit.
It should be _that_ simple.
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-12-10 13:12 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).