* [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission
@ 2014-12-01 9:04 Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf
The 1st patch fixes batch submission.
The 2nd one fixes -EAGAIN for non-batch case.
The 3rd one is a cleanup.
This patchset is splitted from previous patchset(dataplane: optimization
and multi virtqueue support), as suggested by Stefan.
V7:
- add protection for aborting in laio_attach_aio_context(), as suggested
by Stefan, 1/3
- patch style, return real aborting failure to caller, as suggested
by Kevin, 1/3
- track pending I/O and only handle -EAGAIN if there is pending I/O,
pointed by Kevin, 2/3
V6:
- don't pass ioq_submit() return value to ioq_enqueue(), as suggested
by Stefan
- fix one build failure introduced in V5, reported by Stefan
V5:
- in case of submission failure, return -EIO for new coming requests
until aborting is handled
- in patch2, follow Paolo's suggestion about ioq_enqueue() changes
V4:
- abort reuqests in BH to abvoid potential "Co-routine re-entered recursively"
- remove 'enqueue' parameter to ioq_submit() to simpify change
- beautify code as suggested by Paolo
V3:
- rebase on QEMU master
V2:
- code style fix and commit log fix as suggested by Benoît Canet
V1:
- rebase on latest QEMU master
block/linux-aio.c | 139 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 116 insertions(+), 23 deletions(-)
Thanks,
Ming Lei
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
In the submit path, we can't complete request directly,
otherwise "Co-routine re-entered recursively" may be caused,
so this patch fixes the issue with below ideas:
- for -EAGAIN or partial submission, retry the submision
in following completion cb which is run in BH context
- for part of submission, update the io queue too
- for case of io queue full, submit queued requests
immediatelly and return failure to caller
- for other failure, abort all queued requests in BH
context, and requests won't be allow to submit until
aborting is handled
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 116 ++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 97 insertions(+), 19 deletions(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index d92513b..53c5616 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -38,11 +38,21 @@ struct qemu_laiocb {
QLIST_ENTRY(qemu_laiocb) node;
};
+/*
+ * TODO: support to batch I/O from multiple bs in one same
+ * AIO context, one important use case is multi-lun scsi,
+ * so in future the IO queue should be per AIO context.
+ */
typedef struct {
struct iocb *iocbs[MAX_QUEUED_IO];
int plugged;
unsigned int size;
unsigned int idx;
+
+ /* abort queued requests in BH context */
+ QEMUBH *abort_bh;
+ bool aborting;
+ int abort_ret;
} LaioQueue;
struct qemu_laio_state {
@@ -59,6 +69,8 @@ struct qemu_laio_state {
int event_max;
};
+static int ioq_submit(struct qemu_laio_state *s);
+
static inline ssize_t io_event_ret(struct io_event *ev)
{
return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -135,6 +147,11 @@ static void qemu_laio_completion_bh(void *opaque)
qemu_laio_process_completion(s, laiocb);
}
+
+ /* Handle -EAGAIN or partial submission */
+ if (s->io_q.idx) {
+ ioq_submit(s);
+ }
}
static void qemu_laio_completion_cb(EventNotifier *e)
@@ -175,47 +192,100 @@ static void ioq_init(LaioQueue *io_q)
io_q->size = MAX_QUEUED_IO;
io_q->idx = 0;
io_q->plugged = 0;
+ io_q->aborting = false;
}
+/* Always return >= 0 and it means how many requests are submitted */
static int ioq_submit(struct qemu_laio_state *s)
{
- int ret, i = 0;
+ int ret;
int len = s->io_q.idx;
- do {
- ret = io_submit(s->ctx, len, s->io_q.iocbs);
- } while (i++ < 3 && ret == -EAGAIN);
-
- /* empty io queue */
- s->io_q.idx = 0;
+ if (!len) {
+ return 0;
+ }
+ ret = io_submit(s->ctx, len, s->io_q.iocbs);
if (ret < 0) {
- i = 0;
- } else {
- i = ret;
+ /* retry in following completion cb */
+ if (ret == -EAGAIN) {
+ return 0;
+ }
+
+ /*
+ * Abort in BH context for avoiding Co-routine re-entered,
+ * and update io queue at that time
+ */
+ s->io_q.aborting = true;
+ s->io_q.abort_ret = ret;
+ qemu_bh_schedule(s->io_q.abort_bh);
+ ret = 0;
}
- for (; i < len; i++) {
- struct qemu_laiocb *laiocb =
- container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+ /*
+ * update io queue, and retry will be started automatically
+ * in following completion cb for the remainder
+ */
+ if (ret > 0) {
+ if (ret < len) {
+ memmove(&s->io_q.iocbs[0], &s->io_q.iocbs[ret],
+ (len - ret) * sizeof(struct iocb *));
+ }
+ s->io_q.idx -= ret;
+ }
+
+ return ret;
+}
- laiocb->ret = (ret < 0) ? ret : -EIO;
+static void ioq_abort_bh(void *opaque)
+{
+ struct qemu_laio_state *s = opaque;
+ int i;
+
+ for (i = 0; i < s->io_q.idx; i++) {
+ struct qemu_laiocb *laiocb = container_of(s->io_q.iocbs[i],
+ struct qemu_laiocb,
+ iocb);
+ laiocb->ret = s->io_q.abort_ret;
qemu_laio_process_completion(s, laiocb);
}
- return ret;
+
+ s->io_q.idx = 0;
+ s->io_q.aborting = false;
}
-static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
{
unsigned int idx = s->io_q.idx;
+ /* Request can't be allowed to submit until aborting is handled */
+ if (unlikely(s->io_q.aborting)) {
+ return -EIO;
+ }
+
+ if (unlikely(idx == s->io_q.size)) {
+ ioq_submit(s);
+
+ if (unlikely(s->io_q.aborting)) {
+ return -EIO;
+ }
+ idx = s->io_q.idx;
+ }
+
+ /* It has to return now if queue is still full */
+ if (unlikely(idx == s->io_q.size)) {
+ return -EAGAIN;
+ }
+
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
- /* submit immediately if queue is full */
- if (idx == s->io_q.size) {
+ /* submit immediately if queue depth is above 2/3 */
+ if (idx > s->io_q.size * 2 / 3) {
ioq_submit(s);
}
+
+ return 0;
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
@@ -281,7 +351,9 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
goto out_free_aiocb;
}
} else {
- ioq_enqueue(s, iocbs);
+ if (ioq_enqueue(s, iocbs) < 0) {
+ goto out_free_aiocb;
+ }
}
return &laiocb->common;
@@ -296,14 +368,20 @@ void laio_detach_aio_context(void *s_, AioContext *old_context)
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
+ qemu_bh_delete(s->io_q.abort_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
+ s->io_q.abort_bh = aio_bh_new(new_context, ioq_abort_bh, s);
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
+
+ if (s->io_q.aborting) {
+ qemu_bh_schedule(s->io_q.abort_bh);
+ }
}
void *laio_init(void)
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
Previously -EAGAIN is simply ignored for !s->io_q.plugged case,
and sometimes it is easy to cause -EIO to VM, such as NVME device.
This patch handles -EAGAIN by io queue for !s->io_q.plugged case,
and it will be retried in following aio completion cb.
Most of times, -EAGAIN only happens if there is pending I/O, but
from linux kernel AIO implementation io_submit() might return it
when kmem_cache_alloc(GFP_KERNEL) returns NULL too. So 'pending'
in 'struct qemu_laio_state' is introduced for tracking active IO,
and -EAGAIN is handled when there is pending I/O.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 53c5616..9403b17 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -56,6 +56,7 @@ typedef struct {
} LaioQueue;
struct qemu_laio_state {
+ unsigned long pending;
io_context_t ctx;
EventNotifier e;
@@ -98,6 +99,7 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
}
}
}
+ s->pending--;
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
@@ -179,6 +181,7 @@ static void laio_cancel(BlockAIOCB *blockacb)
return;
}
+ laiocb->ctx->pending--;
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
}
@@ -280,8 +283,13 @@ static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
s->io_q.iocbs[idx++] = iocb;
s->io_q.idx = idx;
- /* submit immediately if queue depth is above 2/3 */
- if (idx > s->io_q.size * 2 / 3) {
+ /*
+ * This is reached in two cases: queue not plugged but io_submit
+ * returned -EAGAIN, or queue plugged. In the latter case, start
+ * submitting some I/O if the queue is getting too full. In the
+ * former case, instead, wait until an I/O operation is completed.
+ */
+ if (s->io_q.plugged && unlikely(idx > s->io_q.size * 2 / 3)) {
ioq_submit(s);
}
@@ -346,15 +354,23 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
- if (!s->io_q.plugged) {
- if (io_submit(s->ctx, 1, &iocbs) < 0) {
- goto out_free_aiocb;
- }
- } else {
- if (ioq_enqueue(s, iocbs) < 0) {
+ /* Switch to queue mode until -EAGAIN is handled */
+ if (!s->io_q.plugged && !s->io_q.idx) {
+ int ret = io_submit(s->ctx, 1, &iocbs);
+ if (ret >= 0) {
+ return &laiocb->common;
+ } else if (ret != -EAGAIN || (ret == -EAGAIN && !s->pending)) {
goto out_free_aiocb;
}
+ /*
+ * In case of -EAGAIN, only queue the req if there is pending
+ * I/O and it is resubmitted in completion of pending I/O
+ */
+ }
+ if (ioq_enqueue(s, iocbs) < 0) {
+ goto out_free_aiocb;
}
+ s->pending++;
return &laiocb->common;
out_free_aiocb:
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb'
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
@ 2014-12-01 9:04 ` Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-12-01 9:04 UTC (permalink / raw)
To: qemu-devel, Paolo Bonzini, Stefan Hajnoczi, Kevin Wolf; +Cc: Ming Lei
No one uses the 'node' field any more, so remove it
from 'struct qemu_laiocb', and this can save 16byte
for the struct on 64bit arch.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
block/linux-aio.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 9403b17..cad3848 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -35,7 +35,6 @@ struct qemu_laiocb {
size_t nbytes;
QEMUIOVector *qiov;
bool is_read;
- QLIST_ENTRY(qemu_laiocb) node;
};
/*
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
` (2 preceding siblings ...)
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
@ 2014-12-10 13:11 ` Paolo Bonzini
3 siblings, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2014-12-10 13:11 UTC (permalink / raw)
To: Ming Lei, qemu-devel, Stefan Hajnoczi, Kevin Wolf
On 01/12/2014 10:04, Ming Lei wrote:
> The 1st patch fixes batch submission.
>
> The 2nd one fixes -EAGAIN for non-batch case.
>
> The 3rd one is a cleanup.
>
> This patchset is splitted from previous patchset(dataplane: optimization
> and multi virtqueue support), as suggested by Stefan.
I think this series needs a rewrite. Instead of removing 'node' from
'struct qemu_laiocb', each LaioQueue should actually have a list (or
SIMPLEQ) of pending 'struct qemu_laiocb'.
Every time you do an io_submit, you copy from the laiocb queue to the
iocbs array. If it returns N, the first N elements are removed from the
list. There is no need to do a memmove, the iocbs will be populated
again on the next io_submit. If io_submit fails with anything but
EAGAIN, just abort(). It should never happen.
Every time you get a request, and the list was empty, you do an io_submit.
Every time you get a bdrv_unplug and the nesting level was 1, you do an
io_submit.
Every time you get a completion, if the pending queue is not empty, you
do an io_submit.
It should be _that_ simple.
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-12-10 13:12 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-01 9:04 [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 1/3] linux-aio: fix submit aio as a batch Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 2/3] linux-aio: handling -EAGAIN for !s->io_q.plugged case Ming Lei
2014-12-01 9:04 ` [Qemu-devel] [PATCH v7 3/3] linux-aio: remove 'node' from 'struct qemu_laiocb' Ming Lei
2014-12-10 13:11 ` [Qemu-devel] [PATCH v7 0/3] linux-aio: fix batch submission Paolo Bonzini
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.