* [Qemu-devel] [PATCH 0/7] AIO nested loop and bdrv_drain_all changes

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

This series includes several changes that, in different ways, touch the
aio.c main loop.

Patches 1 and 2 let the tools use timers.

Patches 3 to 6 simplify the aio loop by removing the separate
process_queue callback and by adding a return value to qemu_aio_wait.
(Mostly extracted from my threading experiments, and retested.)

Patch 7 is Zhi Yong Wu's patch to drain requests correctly in the
presence of I/O throttling.  This version does not add bdrv_drain; see
the comments in the patch and on the mailing list.

Paolo Bonzini (6):
  qemu-io: use main_loop_wait
  qemu-tool: map vm_clock to rt_clock
  posix-aio: merge posix_aio_process_queue and posix_aio_read
  aio: remove process_queue callback and qemu_aio_process_queue
  aio: return "AIO in progress" state from qemu_aio_wait
  aio: simplify qemu_aio_wait

Zhi Yong Wu (1):
  block: add the support to drain throttled requests

 aio.c              |  176 +++++++++++++++++++++------------------------------
 block.c            |   21 ++++++-
 block/curl.c       |   10 +--
 block/iscsi.c      |    4 +-
 block/nbd.c        |    8 +-
 block/rbd.c        |    5 +-
 block/sheepdog.c   |   11 ++--
 cmd.c              |   10 ++--
 linux-aio.c        |    2 +-
 posix-aio-compat.c |   45 +++++---------
 qemu-aio.h         |   19 +-----
 qemu-io.c          |    7 +-
 qemu-tool.c        |    3 +-
 13 files changed, 139 insertions(+), 182 deletions(-)

--
1.7.7.6
* [Qemu-devel] [PATCH 1/7] qemu-io: use main_loop_wait

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

This will let timers run during aio_read and aio_write commands, though
not during synchronous commands.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cmd.c     |   10 +++++-----
 qemu-io.c |    7 ++++---
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/cmd.c b/cmd.c
index 0806e18..7ffbb71 100644
--- a/cmd.c
+++ b/cmd.c
@@ -25,6 +25,7 @@
 #include "cmd.h"
 #include "qemu-aio.h"
+#include "main-loop.h"
 
 #define _(x)	x	/* not gettext support yet */
 
@@ -146,7 +147,7 @@ static void prep_fetchline(void *opaque)
 {
     int *fetchable = opaque;
 
-    qemu_aio_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL, NULL, NULL);
+    qemu_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL);
     *fetchable= 1;
 }
 
@@ -193,12 +194,11 @@ void command_loop(void)
         if (!prompted) {
             printf("%s", get_prompt());
             fflush(stdout);
-            qemu_aio_set_fd_handler(STDIN_FILENO, prep_fetchline, NULL, NULL,
-                                    NULL, &fetchable);
+            qemu_set_fd_handler(STDIN_FILENO, prep_fetchline, NULL, &fetchable);
             prompted = 1;
         }
 
-        qemu_aio_wait();
+        main_loop_wait(false);
 
         if (!fetchable) {
             continue;
         }
@@ -221,7 +221,7 @@ void command_loop(void)
         prompted = 0;
         fetchable = 0;
     }
-    qemu_aio_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL, NULL, NULL);
+    qemu_set_fd_handler(STDIN_FILENO, NULL, NULL, NULL);
 }
 
 /* from libxcmd/input.c */
diff --git a/qemu-io.c b/qemu-io.c
index 3189530..e9b7369 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -15,6 +15,7 @@
 #include <libgen.h>
 
 #include "qemu-common.h"
+#include "main-loop.h"
 #include "block_int.h"
 #include "cmd.h"
 
@@ -294,7 +295,7 @@ static int do_aio_readv(QEMUIOVector *qiov, int64_t offset, int *total)
     bdrv_aio_readv(bs, offset >> 9, qiov, qiov->size >> 9,
                    aio_rw_done, &async_ret);
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        main_loop_wait(false);
     }
 
     *total = qiov->size;
@@ -308,7 +309,7 @@ static int do_aio_writev(QEMUIOVector *qiov, int64_t offset, int *total)
     bdrv_aio_writev(bs, offset >> 9, qiov, qiov->size >> 9,
                     aio_rw_done, &async_ret);
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        main_loop_wait(false);
     }
 
     *total = qiov->size;
@@ -351,7 +352,7 @@ static int do_aio_multiwrite(BlockRequest* reqs, int num_reqs, int *total)
     }
 
     while (async_ret.num_done < num_reqs) {
-        qemu_aio_wait();
+        main_loop_wait(false);
     }
 
     return async_ret.error < 0 ? async_ret.error : 1;
--
1.7.7.6
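The idiom the patch converts is worth seeing in one piece: qemu-io
implements synchronous-looking commands on top of asynchronous block
requests.  A minimal sketch, assuming the NOT_DONE marker and the
aio_rw_done callback from qemu-io.c (the wrapper name is invented for
illustration, and the other details are simplified):

    /* Issue an asynchronous read, then pump the main loop until the
     * completion callback overwrites the NOT_DONE marker.  With this
     * patch the wait goes through main_loop_wait(), so pending timers
     * fire too while the request is in flight.
     */
    static void aio_rw_done(void *opaque, int ret)
    {
        *(int *)opaque = ret;           /* replaces the NOT_DONE marker */
    }

    static int wait_for_read(BlockDriverState *bs, int64_t offset,
                             QEMUIOVector *qiov)
    {
        int async_ret = NOT_DONE;

        bdrv_aio_readv(bs, offset >> 9, qiov, qiov->size >> 9,
                       aio_rw_done, &async_ret);
        while (async_ret == NOT_DONE) {
            main_loop_wait(false);      /* fd handlers, bottom halves, timers */
        }
        return async_ret;               /* 0 on success, -errno on failure */
    }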
* [Qemu-devel] [PATCH 2/7] qemu-tool: map vm_clock to rt_clock

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

QED uses vm_clock timers so that images are not touched during and
after migration.  This however does not apply to qemu-io and qemu-img.
Treat vm_clock as a synonym for rt_clock there, and enable it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-tool.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/qemu-tool.c b/qemu-tool.c
index edb84f5..6579b00 100644
--- a/qemu-tool.c
+++ b/qemu-tool.c
@@ -61,7 +61,7 @@ void monitor_protocol_event(MonitorEvent event, QObject *data)
 
 int64_t cpu_get_clock(void)
 {
-    return 0;
+    return qemu_get_clock_ns(rt_clock);
 }
 
 int64_t cpu_get_icount(void)
@@ -87,7 +87,6 @@ int qemu_init_main_loop(void)
 {
     init_clocks();
     init_timer_alarm();
-    qemu_clock_enable(vm_clock, false);
     return main_loop_init();
 }
--
1.7.7.6
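For context, QED's consistency-check timer is the kind of vm_clock user
this unblocks.  A rough sketch of how such a timer is armed (names and
timeout are approximate, not taken from this diff):

    /* A vm_clock timer as QED uses one: the callback marks the image
     * clean after the device has been idle for a while.  Before this
     * patch, qemu-tool.c pinned cpu_get_clock() at zero and disabled
     * vm_clock, so a timer like this could never expire in a tool.
     */
    static void need_check_timer_cb(void *opaque)
    {
        /* ... clear the image's dirty flag ... */
    }

    static void arm_need_check_timer(void *opaque)
    {
        QEMUTimer *t = qemu_new_timer_ns(vm_clock, need_check_timer_cb,
                                         opaque);
        /* fire roughly five seconds of vm_clock time from now */
        qemu_mod_timer(t, qemu_get_clock_ns(vm_clock) +
                          5 * get_ticks_per_sec());
    }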
* [Qemu-devel] [PATCH 3/7] posix-aio: merge posix_aio_process_queue and posix_aio_read

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

posix_aio_read already calls qemu_aio_process_queue, and conversely
qemu_aio_process_queue is always followed by a select loop that calls
posix_aio_read.  No races are possible, so there is no need for a
separate process_queue callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 posix-aio-compat.c |   46 ++++++++++++++++++----------------------------
 1 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index d311d13..1066c60 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -468,26 +468,39 @@ static int qemu_paio_error(struct qemu_paiocb *aiocb)
     return ret;
 }
 
-static int posix_aio_process_queue(void *opaque)
+static void posix_aio_read(void *opaque)
 {
     PosixAioState *s = opaque;
     struct qemu_paiocb *acb, **pacb;
     int ret;
-    int result = 0;
+    ssize_t len;
+
+    /* read all bytes from signal pipe */
+    for (;;) {
+        char bytes[16];
+
+        len = read(s->rfd, bytes, sizeof(bytes));
+        if (len == -1 && errno == EINTR) {
+            continue; /* try again */
+        }
+        if (len == sizeof(bytes)) {
+            continue; /* more to read */
+        }
+        break;
+    }
 
     for(;;) {
         pacb = &s->first_aio;
         for(;;) {
             acb = *pacb;
             if (!acb)
-                return result;
+                return;
 
             ret = qemu_paio_error(acb);
             if (ret == ECANCELED) {
                 /* remove the request */
                 *pacb = acb->next;
                 qemu_aio_release(acb);
-                result = 1;
             } else if (ret != EINPROGRESS) {
                 /* end of aio */
                 if (ret == 0) {
@@ -507,35 +518,12 @@ static int posix_aio_process_queue(void *opaque)
                 /* call the callback */
                 acb->common.cb(acb->common.opaque, ret);
                 qemu_aio_release(acb);
-                result = 1;
                 break;
             } else {
                 pacb = &acb->next;
             }
         }
     }
-
-    return result;
-}
-
-static void posix_aio_read(void *opaque)
-{
-    PosixAioState *s = opaque;
-    ssize_t len;
-
-    /* read all bytes from signal pipe */
-    for (;;) {
-        char bytes[16];
-
-        len = read(s->rfd, bytes, sizeof(bytes));
-        if (len == -1 && errno == EINTR)
-            continue; /* try again */
-        if (len == sizeof(bytes))
-            continue; /* more to read */
-        break;
-    }
-
-    posix_aio_process_queue(s);
 }
 
 static int posix_aio_flush(void *opaque)
@@ -676,7 +664,7 @@ int paio_init(void)
     fcntl(s->wfd, F_SETFL, O_NONBLOCK);
 
     qemu_aio_set_fd_handler(s->rfd, posix_aio_read, NULL, posix_aio_flush,
-                            posix_aio_process_queue, s);
+                            NULL, s);
 
     ret = pthread_attr_init(&attr);
     if (ret)
--
1.7.7.6
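For reference, the producer side of that signal pipe is what makes the
merge safe: the only event source for posix_aio_read is the pipe
itself.  A simplified sketch of the notification path, based on the
SIGUSR2 handler elsewhere in posix-aio-compat.c (body abridged, not
part of this diff):

    /* Worker threads raise SIGUSR2 when a request completes; the
     * handler tickles the pipe whose read end is registered with the
     * AIO loop.  select() then wakes up and calls posix_aio_read(),
     * which drains the pipe and walks the request list -- so request
     * processing always runs in main-loop context, never concurrently.
     */
    static void aio_signal_handler(int signum)
    {
        if (posix_aio_state) {
            char byte = 0;
            ssize_t ret;

            ret = write(posix_aio_state->wfd, &byte, sizeof(byte));
            if (ret < 0 && errno != EAGAIN) {
                die("write()");
            }
        }
    }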
* [Qemu-devel] [PATCH 4/7] aio: remove process_queue callback and qemu_aio_process_queue

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

Both unused after the previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 aio.c              |   29 ++---------------------------
 block/curl.c       |   10 ++++------
 block/iscsi.c      |    4 ++--
 block/nbd.c        |    8 ++++----
 block/rbd.c        |    5 ++---
 block/sheepdog.c   |   11 +++++------
 linux-aio.c        |    2 +-
 posix-aio-compat.c |    3 +--
 qemu-aio.h         |   13 -------------
 9 files changed, 21 insertions(+), 64 deletions(-)

diff --git a/aio.c b/aio.c
index eb3bf42..f19b3c6 100644
--- a/aio.c
+++ b/aio.c
@@ -35,7 +35,6 @@ struct AioHandler
     IOHandler *io_read;
     IOHandler *io_write;
     AioFlushHandler *io_flush;
-    AioProcessQueue *io_process_queue;
     int deleted;
     void *opaque;
     QLIST_ENTRY(AioHandler) node;
@@ -58,7 +57,6 @@ int qemu_aio_set_fd_handler(int fd,
                             IOHandler *io_read,
                             IOHandler *io_write,
                             AioFlushHandler *io_flush,
-                            AioProcessQueue *io_process_queue,
                             void *opaque)
 {
     AioHandler *node;
@@ -91,7 +89,6 @@ int qemu_aio_set_fd_handler(int fd,
         node->io_read = io_read;
         node->io_write = io_write;
         node->io_flush = io_flush;
-        node->io_process_queue = io_process_queue;
         node->opaque = opaque;
     }
 
@@ -122,39 +119,17 @@ void qemu_aio_flush(void)
     } while (qemu_bh_poll() || ret > 0);
 }
 
-int qemu_aio_process_queue(void)
-{
-    AioHandler *node;
-    int ret = 0;
-
-    walking_handlers = 1;
-
-    QLIST_FOREACH(node, &aio_handlers, node) {
-        if (node->io_process_queue) {
-            if (node->io_process_queue(node->opaque)) {
-                ret = 1;
-            }
-        }
-    }
-
-    walking_handlers = 0;
-
-    return ret;
-}
-
 void qemu_aio_wait(void)
 {
     int ret;
 
-    if (qemu_bh_poll())
-        return;
-
     /*
      * If there are callbacks left that have been queued, we need to call then.
      * Return afterwards to avoid waiting needlessly in select().
      */
-    if (qemu_aio_process_queue())
+    if (qemu_bh_poll()) {
         return;
+    }
 
     do {
         AioHandler *node;
diff --git a/block/curl.c b/block/curl.c
index e9102e3..c5ff89f 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -89,19 +89,17 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
     switch (action) {
         case CURL_POLL_IN:
-            qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush,
-                                    NULL, s);
+            qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush, s);
             break;
         case CURL_POLL_OUT:
-            qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush,
-                                    NULL, s);
+            qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush, s);
             break;
         case CURL_POLL_INOUT:
             qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do,
-                                    curl_aio_flush, NULL, s);
+                                    curl_aio_flush, s);
             break;
         case CURL_POLL_REMOVE:
-            qemu_aio_set_fd_handler(fd, NULL, NULL, NULL, NULL, NULL);
+            qemu_aio_set_fd_handler(fd, NULL, NULL, NULL, NULL);
             break;
     }
 
diff --git a/block/iscsi.c b/block/iscsi.c
index bd3ca11..5222726 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -108,7 +108,7 @@ iscsi_set_events(IscsiLun *iscsilun)
     qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
                            (iscsi_which_events(iscsi) & POLLOUT)
                            ? iscsi_process_write : NULL,
-                           iscsi_process_flush, NULL, iscsilun);
+                           iscsi_process_flush, iscsilun);
 }
 
 static void
@@ -682,7 +682,7 @@ static void iscsi_close(BlockDriverState *bs)
     IscsiLun *iscsilun = bs->opaque;
     struct iscsi_context *iscsi = iscsilun->iscsi;
 
-    qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL, NULL, NULL);
+    qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL, NULL);
     iscsi_destroy_context(iscsi);
     memset(iscsilun, 0, sizeof(IscsiLun));
 }
diff --git a/block/nbd.c b/block/nbd.c
index 161b299..524c9cf 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -191,7 +191,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
     qemu_co_mutex_lock(&s->send_mutex);
     s->send_coroutine = qemu_coroutine_self();
     qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write,
-                            nbd_have_request, NULL, s);
+                            nbd_have_request, s);
     rc = nbd_send_request(s->sock, request);
     if (rc != -1 && iov) {
         ret = qemu_co_sendv(s->sock, iov, request->len, offset);
@@ -201,7 +201,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
         }
     }
     qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
-                            nbd_have_request, NULL, s);
+                            nbd_have_request, s);
     s->send_coroutine = NULL;
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -274,7 +274,7 @@ static int nbd_establish_connection(BlockDriverState *bs)
      * kick the reply mechanism.  */
     socket_set_nonblock(sock);
     qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
-                            nbd_have_request, NULL, s);
+                            nbd_have_request, s);
 
     s->sock = sock;
     s->size = size;
@@ -294,7 +294,7 @@ static void nbd_teardown_connection(BlockDriverState *bs)
     request.len = 0;
     nbd_send_request(s->sock, &request);
 
-    qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL, NULL, NULL);
+    qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL, NULL);
     closesocket(s->sock);
 }
 
diff --git a/block/rbd.c b/block/rbd.c
index 46a8579..6cd8448 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -504,7 +504,7 @@ static int qemu_rbd_open(BlockDriverState *bs, const char *filename, int flags)
     fcntl(s->fds[0], F_SETFL, O_NONBLOCK);
     fcntl(s->fds[1], F_SETFL, O_NONBLOCK);
     qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], qemu_rbd_aio_event_reader,
-                            NULL, qemu_rbd_aio_flush_cb, NULL, s);
+                            NULL, qemu_rbd_aio_flush_cb, s);
 
     return 0;
 
@@ -525,8 +525,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
     close(s->fds[0]);
     close(s->fds[1]);
-    qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL , NULL, NULL, NULL,
-                            NULL);
+    qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL, NULL, NULL, NULL);
 
     rbd_close(s->image);
     rados_ioctx_destroy(s->io_ctx);
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 00276f6f..60091c0 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -742,8 +742,7 @@ static int get_sheep_fd(BDRVSheepdogState *s)
         return -1;
     }
 
-    qemu_aio_set_fd_handler(fd, co_read_response, NULL, aio_flush_request,
-                            NULL, s);
+    qemu_aio_set_fd_handler(fd, co_read_response, NULL, aio_flush_request, s);
 
     return fd;
 }
@@ -912,7 +911,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     qemu_co_mutex_lock(&s->lock);
     s->co_send = qemu_coroutine_self();
     qemu_aio_set_fd_handler(s->fd, co_read_response, co_write_request,
-                            aio_flush_request, NULL, s);
+                            aio_flush_request, s);
     socket_set_cork(s->fd, 1);
 
     /* send a header */
@@ -934,7 +933,7 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     socket_set_cork(s->fd, 0);
     qemu_aio_set_fd_handler(s->fd, co_read_response, NULL,
-                            aio_flush_request, NULL, s);
+                            aio_flush_request, s);
     qemu_co_mutex_unlock(&s->lock);
 
     return 0;
@@ -1056,7 +1055,7 @@ static int sd_open(BlockDriverState *bs, const char *filename, int flags)
     g_free(buf);
     return 0;
 out:
-    qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL, NULL);
+    qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL);
     if (s->fd >= 0) {
         closesocket(s->fd);
     }
@@ -1270,7 +1269,7 @@ static void sd_close(BlockDriverState *bs)
         error_report("%s, %s", sd_strerror(rsp->result), s->name);
     }
 
-    qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL, NULL);
+    qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL);
     closesocket(s->fd);
     g_free(s->addr);
 }
diff --git a/linux-aio.c b/linux-aio.c
index 15261ec..fa0fbf3 100644
--- a/linux-aio.c
+++ b/linux-aio.c
@@ -214,7 +214,7 @@ void *laio_init(void)
         goto out_close_efd;
 
     qemu_aio_set_fd_handler(s->efd, qemu_laio_completion_cb, NULL,
-                            qemu_laio_flush_cb, NULL, s);
+                            qemu_laio_flush_cb, s);
 
     return s;
 
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index 1066c60..68361f5 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -663,8 +663,7 @@ int paio_init(void)
     fcntl(s->rfd, F_SETFL, O_NONBLOCK);
     fcntl(s->wfd, F_SETFL, O_NONBLOCK);
 
-    qemu_aio_set_fd_handler(s->rfd, posix_aio_read, NULL, posix_aio_flush,
-                            NULL, s);
+    qemu_aio_set_fd_handler(s->rfd, posix_aio_read, NULL, posix_aio_flush, s);
 
     ret = pthread_attr_init(&attr);
     if (ret)
diff --git a/qemu-aio.h b/qemu-aio.h
index 230c2f7..0fc8409 100644
--- a/qemu-aio.h
+++ b/qemu-aio.h
@@ -41,11 +41,6 @@ void qemu_aio_release(void *p);
 /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
 typedef int (AioFlushHandler)(void *opaque);
 
-/* Runs all currently allowed AIO callbacks of completed requests in the
- * respective AIO backend. Returns 0 if no requests was handled, non-zero
- * if at least one queued request was handled. */
-typedef int (AioProcessQueue)(void *opaque);
-
 /* Flush any pending AIO operation. This function will block until all
  * outstanding AIO operations have been completed or cancelled. */
 void qemu_aio_flush(void);
@@ -56,13 +51,6 @@ void qemu_aio_flush(void);
  * result of executing I/O completion or bh callbacks. */
 void qemu_aio_wait(void);
 
-/*
- * Runs all currently allowed AIO callbacks of completed requests. Returns 0
- * if no requests were handled, non-zero if at least one request was
- * processed.
- */
-int qemu_aio_process_queue(void);
-
 /* Register a file descriptor and associated callbacks. Behaves very similarly
  * to qemu_set_fd_handler2. Unlike qemu_set_fd_handler2, these callbacks will
  * be invoked when using either qemu_aio_wait() or qemu_aio_flush().
@@ -74,7 +62,6 @@ int qemu_aio_set_fd_handler(int fd,
                             IOHandler *io_read,
                             IOHandler *io_write,
                             AioFlushHandler *io_flush,
-                            AioProcessQueue *io_process_queue,
                             void *opaque);
 
 #endif
--
1.7.7.6
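After this patch, registering an fd with the AIO loop takes five
arguments.  A usage sketch with the new signature; the state struct and
handler names are hypothetical, but io_flush's contract is the one
documented in qemu-aio.h (return non-zero while requests are
outstanding):

    /* Hypothetical backend state, to show the reduced signature. */
    typedef struct MyBackend {
        int fd;
        int in_flight;      /* requests submitted but not yet completed */
    } MyBackend;

    static void my_read_handler(void *opaque)
    {
        /* ... consume completion notifications from the fd ... */
    }

    static int my_io_flush(void *opaque)
    {
        MyBackend *my = opaque;
        return my->in_flight > 0;   /* non-zero: the loop may block on us */
    }

    static void my_register(MyBackend *my)
    {
        qemu_aio_set_fd_handler(my->fd, my_read_handler, NULL,
                                my_io_flush, my);
    }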
* [Qemu-devel] [PATCH 5/7] aio: return "AIO in progress" state from qemu_aio_wait

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

The definition of when qemu_aio_flush should loop is much simpler than
it looks.  It just has to call qemu_aio_wait until it makes no progress
and all flush callbacks return false.  qemu_aio_wait is the logical
place to tell the caller about this, and the return code will also help
implement bdrv_drain.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 aio.c      |   48 ++++++++++++++++++++++--------------------------
 qemu-aio.h |    6 ++++--
 2 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/aio.c b/aio.c
index f19b3c6..71264fc 100644
--- a/aio.c
+++ b/aio.c
@@ -99,41 +99,30 @@ int qemu_aio_set_fd_handler(int fd,
 
 void qemu_aio_flush(void)
 {
-    AioHandler *node;
-    int ret;
-
-    do {
-        ret = 0;
-
-        /*
-         * If there are pending emulated aio start them now so flush
-         * will be able to return 1.
-         */
-        qemu_aio_wait();
-
-        QLIST_FOREACH(node, &aio_handlers, node) {
-            if (node->io_flush) {
-                ret |= node->io_flush(node->opaque);
-            }
-        }
-    } while (qemu_bh_poll() || ret > 0);
+    /* Always poll at least once; bottom halves may start new AIO so
+     * flush will be able to return 1.  However, they also might not :)
+     * so only block starting from the second call.
+     */
+    while (qemu_aio_wait());
 }
 
-void qemu_aio_wait(void)
+bool qemu_aio_wait(void)
 {
     int ret;
 
     /*
      * If there are callbacks left that have been queued, we need to call then.
-     * Return afterwards to avoid waiting needlessly in select().
+     * Do not call select in this case, because it is possible that the caller
+     * does not need a complete flush (as is the case for qemu_aio_wait loops).
      */
     if (qemu_bh_poll()) {
-        return;
+        return true;
     }
 
     do {
         AioHandler *node;
         fd_set rdfds, wrfds;
+        bool busy;
         int max_fd = -1;
 
         walking_handlers = 1;
@@ -142,14 +131,18 @@ void qemu_aio_wait(void)
         FD_ZERO(&wrfds);
 
         /* fill fd sets */
+        busy = false;
         QLIST_FOREACH(node, &aio_handlers, node) {
             /* If there aren't pending AIO operations, don't invoke callbacks.
              * Otherwise, if there are no AIO requests, qemu_aio_wait() would
              * wait indefinitely.
              */
-            if (node->io_flush && node->io_flush(node->opaque) == 0)
-                continue;
-
+            if (node->io_flush) {
+                if (node->io_flush(node->opaque) == 0) {
+                    continue;
+                }
+                busy = true;
+            }
             if (!node->deleted && node->io_read) {
                 FD_SET(node->fd, &rdfds);
                 max_fd = MAX(max_fd, node->fd + 1);
@@ -163,8 +156,9 @@ void qemu_aio_wait(void)
         walking_handlers = 0;
 
         /* No AIO operations? Get us out of here */
-        if (max_fd == -1)
-            break;
+        if (!busy) {
+            return false;
+        }
 
         /* wait until next event */
         ret = select(max_fd, &rdfds, &wrfds, NULL, NULL);
@@ -204,4 +198,6 @@ void qemu_aio_wait(void)
             walking_handlers = 0;
         }
     } while (ret == 0);
+
+    return true;
 }
diff --git a/qemu-aio.h b/qemu-aio.h
index 0fc8409..bfdd35f 100644
--- a/qemu-aio.h
+++ b/qemu-aio.h
@@ -48,8 +48,10 @@ void qemu_aio_flush(void);
 /* Wait for a single AIO completion to occur.  This function will wait
  * until a single AIO event has completed and it will ensure something
  * has moved before returning. This can issue new pending aio as
- * result of executing I/O completion or bh callbacks. */
-void qemu_aio_wait(void);
+ * result of executing I/O completion or bh callbacks.
+ *
+ * Return whether there is still any pending AIO operation. */
+bool qemu_aio_wait(void);
 
 /* Register a file descriptor and associated callbacks. Behaves very similarly
  * to qemu_set_fd_handler2. Unlike qemu_set_fd_handler2, these callbacks will
--
1.7.7.6
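The resulting caller pattern is worth spelling out.  The new
qemu_aio_flush above is the whole of it; a drain-style caller, as patch
7 will add, can interleave its own work between iterations.  A sketch
(the extra-work hook is illustrative, not from this patch):

    /* Pump the AIO loop to quiescence.  Each qemu_aio_wait() call
     * either runs queued bottom halves, dispatches one round of ready
     * fd handlers, or returns false once every io_flush callback
     * reports idle.
     */
    static void drain_sketch(void)
    {
        bool busy;

        do {
            busy = qemu_aio_wait();
            /* A caller like bdrv_drain_all can restart its own queued
             * requests here and set busy = true if it did anything. */
        } while (busy);
    }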
* [Qemu-devel] [PATCH 6/7] aio: simplify qemu_aio_wait

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel

The do...while loop can never loop, because select will just not return
0 when invoked with an infinite timeout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 aio.c |  133 +++++++++++++++++++++++++++++++---------------------------------
 1 files changed, 64 insertions(+), 69 deletions(-)

diff --git a/aio.c b/aio.c
index 71264fc..2d82dfe 100644
--- a/aio.c
+++ b/aio.c
@@ -108,7 +108,11 @@ void qemu_aio_flush(void)
 
 bool qemu_aio_wait(void)
 {
+    AioHandler *node;
+    fd_set rdfds, wrfds;
+    int max_fd = -1;
     int ret;
+    bool busy;
 
     /*
      * If there are callbacks left that have been queued, we need to call then.
@@ -119,85 +123,76 @@ bool qemu_aio_wait(void)
         return true;
     }
 
-    do {
-        AioHandler *node;
-        fd_set rdfds, wrfds;
-        bool busy;
-        int max_fd = -1;
+    walking_handlers = 1;
 
-        walking_handlers = 1;
+    FD_ZERO(&rdfds);
+    FD_ZERO(&wrfds);
 
-        FD_ZERO(&rdfds);
-        FD_ZERO(&wrfds);
-
-        /* fill fd sets */
-        busy = false;
-        QLIST_FOREACH(node, &aio_handlers, node) {
-            /* If there aren't pending AIO operations, don't invoke callbacks.
-             * Otherwise, if there are no AIO requests, qemu_aio_wait() would
-             * wait indefinitely.
-             */
-            if (node->io_flush) {
-                if (node->io_flush(node->opaque) == 0) {
-                    continue;
-                }
-                busy = true;
-            }
-            if (!node->deleted && node->io_read) {
-                FD_SET(node->fd, &rdfds);
-                max_fd = MAX(max_fd, node->fd + 1);
-            }
-            if (!node->deleted && node->io_write) {
-                FD_SET(node->fd, &wrfds);
-                max_fd = MAX(max_fd, node->fd + 1);
+    /* fill fd sets */
+    busy = false;
+    QLIST_FOREACH(node, &aio_handlers, node) {
+        /* If there aren't pending AIO operations, don't invoke callbacks.
+         * Otherwise, if there are no AIO requests, qemu_aio_wait() would
+         * wait indefinitely.
+         */
+        if (node->io_flush) {
+            if (node->io_flush(node->opaque) == 0) {
+                continue;
             }
+            busy = true;
+        }
+        if (!node->deleted && node->io_read) {
+            FD_SET(node->fd, &rdfds);
+            max_fd = MAX(max_fd, node->fd + 1);
+        }
+        if (!node->deleted && node->io_write) {
+            FD_SET(node->fd, &wrfds);
+            max_fd = MAX(max_fd, node->fd + 1);
         }
+    }
 
-        walking_handlers = 0;
+    walking_handlers = 0;
 
-        /* No AIO operations? Get us out of here */
-        if (!busy) {
-            return false;
-        }
+    /* No AIO operations?  Get us out of here */
+    if (!busy) {
+        return false;
+    }
 
-        /* wait until next event */
-        ret = select(max_fd, &rdfds, &wrfds, NULL, NULL);
-        if (ret == -1 && errno == EINTR)
-            continue;
-
-        /* if we have any readable fds, dispatch event */
-        if (ret > 0) {
-            walking_handlers = 1;
-
-            /* we have to walk very carefully in case
-             * qemu_aio_set_fd_handler is called while we're walking */
-            node = QLIST_FIRST(&aio_handlers);
-            while (node) {
-                AioHandler *tmp;
-
-                if (!node->deleted &&
-                    FD_ISSET(node->fd, &rdfds) &&
-                    node->io_read) {
-                    node->io_read(node->opaque);
-                }
-                if (!node->deleted &&
-                    FD_ISSET(node->fd, &wrfds) &&
-                    node->io_write) {
-                    node->io_write(node->opaque);
-                }
-
-                tmp = node;
-                node = QLIST_NEXT(node, node);
-
-                if (tmp->deleted) {
-                    QLIST_REMOVE(tmp, node);
-                    g_free(tmp);
-                }
+    /* wait until next event */
+    ret = select(max_fd, &rdfds, &wrfds, NULL, NULL);
+
+    /* if we have any readable fds, dispatch event */
+    if (ret > 0) {
+        walking_handlers = 1;
+
+        /* we have to walk very carefully in case
+         * qemu_aio_set_fd_handler is called while we're walking */
+        node = QLIST_FIRST(&aio_handlers);
+        while (node) {
+            AioHandler *tmp;
+
+            if (!node->deleted &&
+                FD_ISSET(node->fd, &rdfds) &&
+                node->io_read) {
+                node->io_read(node->opaque);
+            }
+            if (!node->deleted &&
+                FD_ISSET(node->fd, &wrfds) &&
+                node->io_write) {
+                node->io_write(node->opaque);
             }
 
-            walking_handlers = 0;
+            tmp = node;
+            node = QLIST_NEXT(node, node);
+
+            if (tmp->deleted) {
+                QLIST_REMOVE(tmp, node);
+                g_free(tmp);
+            }
         }
-    } while (ret == 0);
+
+        walking_handlers = 0;
+    }
 
     return true;
 }
--
1.7.7.6
* [Qemu-devel] [PATCH 7/7] block: add the support to drain throttled requests

From: Paolo Bonzini @ 2012-03-12 18:22 UTC
To: qemu-devel
Cc: Zhi Yong Wu

From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
[ Iterate until all block devices have processed all requests,
  add comments. - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/block.c b/block.c
index dac62a1..0d769a6 100644
--- a/block.c
+++ b/block.c
@@ -858,12 +858,31 @@ void bdrv_close_all(void)
  *
  * This function does not flush data to disk, use bdrv_flush_all() for that
  * after calling this function.
+ *
+ * Note that completion of an asynchronous I/O operation can trigger any
+ * number of other I/O operations on other devices---for example a coroutine
+ * can be arbitrarily complex and a constant flow of I/O to multiple devices
+ * can come until the coroutine is complete.  Because of this, it is not
+ * possible to drain a single device's I/O queue.
  */
 void bdrv_drain_all(void)
 {
     BlockDriverState *bs;
+    bool busy;
 
-    qemu_aio_flush();
+    do {
+        busy = qemu_aio_wait();
+
+        /* FIXME: We do not have timer support here, so this is effectively
+         * a busy wait.
+         */
+        QTAILQ_FOREACH(bs, &bdrv_states, list) {
+            if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
+                qemu_co_queue_restart_all(&bs->throttled_reqs);
+                busy = true;
+            }
+        }
+    } while (busy);
 
     /* If requests are still pending there is a bug somewhere */
     QTAILQ_FOREACH(bs, &bdrv_states, list) {
--
1.7.7.6
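To make the intended use concrete: drain-then-flush is the usual
pairing before a global operation such as a snapshot or device
removal.  An illustrative sketch, not part of the patch (the helper
name is hypothetical):

    /* Quiesce all block devices, then push completed writes out.  With
     * this patch, requests parked in bs->throttled_reqs by the I/O
     * throttling code are restarted and completed too, instead of
     * being left pending forever.
     */
    static void quiesce_all_block_devices(void)
    {
        bdrv_drain_all();   /* waits for in-flight and throttled requests */
        bdrv_flush_all();   /* then flushes data to disk */
    }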
* Re: [Qemu-devel] [PATCH 7/7] block: add the support to drain throttled requests

From: Zhi Yong Wu @ 2012-03-13 1:38 UTC
To: Paolo Bonzini
Cc: Zhi Yong Wu, qemu-devel

On Tue, Mar 13, 2012 at 2:22 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> [ Iterate until all block devices have processed all requests,
>   add comments. - Paolo ]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> [...]
>  void bdrv_drain_all(void)
>  {
>      BlockDriverState *bs;
> +    bool busy;
>
> -    qemu_aio_flush();
> +    do {
> +        busy = qemu_aio_wait();

busy is a bool, while qemu_aio_wait() returns void.

> +
> +        /* FIXME: We do not have timer support here, so this is effectively
> +         * a busy wait.
> +         */
> +        QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +            if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
> +                qemu_co_queue_restart_all(&bs->throttled_reqs);
> +                busy = true;
> +            }
> +        }
> +    } while (busy);
>
>      /* If requests are still pending there is a bug somewhere */
>      QTAILQ_FOREACH(bs, &bdrv_states, list) {
> [...]

--
Regards,

Zhi Yong Wu
* Re: [Qemu-devel] [PATCH 7/7] block: add the support to drain throttled requests

From: Zhi Yong Wu @ 2012-03-13 1:46 UTC
To: Paolo Bonzini
Cc: Zhi Yong Wu, qemu-devel

On Tue, Mar 13, 2012 at 2:22 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> [ Iterate until all block devices have processed all requests,
>   add comments. - Paolo ]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> [...]
>  void bdrv_drain_all(void)
>  {
>      BlockDriverState *bs;
> +    bool busy;
>
> -    qemu_aio_flush();
> +    do {
> +        busy = qemu_aio_wait();

You call qemu_aio_flush only once, so you can't make sure that new
requests which may still be enqueued to tracked_requests are flushed.

> +
> +        /* FIXME: We do not have timer support here, so this is effectively
> +         * a busy wait.
> +         */
> +        QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +            if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
> +                qemu_co_queue_restart_all(&bs->throttled_reqs);
> +                busy = true;
> +            }
> +        }
> +    } while (busy);
>
>      /* If requests are still pending there is a bug somewhere */
>      QTAILQ_FOREACH(bs, &bdrv_states, list) {
> [...]

--
Regards,

Zhi Yong Wu
* Re: [Qemu-devel] [PATCH 7/7] block: add the support to drain throttled requests

From: Zhi Yong Wu @ 2012-03-13 1:56 UTC
To: Paolo Bonzini
Cc: Zhi Yong Wu, qemu-devel

On Tue, Mar 13, 2012 at 2:22 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> [ Iterate until all block devices have processed all requests,
>   add comments. - Paolo ]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> [...]
>  void bdrv_drain_all(void)
>  {
>      BlockDriverState *bs;
> +    bool busy;
>
> -    qemu_aio_flush();
> +    do {
> +        busy = qemu_aio_wait();
> +
> +        /* FIXME: We do not have timer support here, so this is effectively
> +         * a busy wait.

Maybe we can temporarily disable I/O throttling and enable it again
after the flush completes.  But this might affect other guests.

> +         */
> +        QTAILQ_FOREACH(bs, &bdrv_states, list) {
> +            if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
> +                qemu_co_queue_restart_all(&bs->throttled_reqs);
> +                busy = true;
> +            }
> +        }
> +    } while (busy);
>
>      /* If requests are still pending there is a bug somewhere */
>      QTAILQ_FOREACH(bs, &bdrv_states, list) {
> [...]

--
Regards,

Zhi Yong Wu