* [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end
@ 2015-10-22 6:32 Fam Zheng
  2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 01/12] aio: Add "is_external" flag for event handlers Fam Zheng
                  ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block

v6: Add Kevin's rev-by in patches 1-3, 6-8, 10, 12.
    Add Jeff's rev-by in patches 1, 2, 6-8, 10.
    04: Fix spelling and wording in comments. [Jeff]
        Add assert at decrement. [Jeff]
    05: Fix bad rebase. [Jeff]
    09: Let blk_is_available come first. [Jeff, Kevin]
    11: Rewrite bdrv_qed_drain. [Jeff]

v5: Rebase onto Kevin's block tree.

v4: Rebase onto master to fix the "bdrv_move_feature_fields" issue.

v3: Call bdrv_drain unconditionally in bdrv_drained_begin.
    Document the internal I/O implications between bdrv_drain_begin and end.

The nested aio_poll() calls in the block layer have a bug: new r/w requests
from ioeventfds and NBD exports are processed there, which can break the
caller's semantics (qmp_transaction) or even invalidate pointers (bdrv_reopen).

Fam Zheng (12):
  aio: Add "is_external" flag for event handlers
  nbd: Mark fd handlers client type as "external"
  dataplane: Mark host notifiers' client type as "external"
  aio: introduce aio_{disable,enable}_external
  block: Introduce "drained begin/end" API
  block: Add "drained begin/end" for transactional external snapshot
  block: Add "drained begin/end" for transactional backup
  block: Add "drained begin/end" for transactional blockdev-backup
  block: Add "drained begin/end" for internal snapshot
  block: Introduce BlockDriver.bdrv_drain callback
  qed: Implement .bdrv_drain
  tests: Add test case for aio_disable_external

 aio-posix.c                     |  9 ++++-
 aio-win32.c                     |  8 +++-
 async.c                         |  3 +-
 block/curl.c                    | 14 ++++---
 block/io.c                      | 23 +++++++++++-
 block/iscsi.c                   |  9 ++---
 block/linux-aio.c               |  5 ++-
 block/nbd-client.c              | 10 +++--
 block/nfs.c                     | 17 ++++-----
 block/qed.c                     | 15 ++++++++
 block/sheepdog.c                | 38 ++++++++++++-------
 block/ssh.c                     |  5 ++-
 block/win32-aio.c               |  5 ++-
 blockdev.c                      | 38 ++++++++++++++++---
 hw/block/dataplane/virtio-blk.c |  5 ++-
 hw/scsi/virtio-scsi-dataplane.c | 22 +++++++----
 include/block/aio.h             | 40 ++++++++++++++++++++
 include/block/block.h           | 24 ++++++++++++
 include/block/block_int.h       |  8 ++++
 iohandler.c                     |  3 +-
 nbd.c                           |  4 +-
 tests/test-aio.c                | 82 ++++++++++++++++++++++++++++-------------
 22 files changed, 294 insertions(+), 93 deletions(-)

--
2.4.3

^ permalink raw reply [flat|nested] 18+ messages in thread
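As a rough illustration of the usage pattern this series establishes (a sketch
only; the function below and its name are hypothetical, not part of the
series), a caller that needs to run a nested event loop brackets it in a
drained section:

    /* Sketch: quiesce external request sources (ioeventfd, NBD server)
     * around an operation that runs a nested aio_poll(). */
    static void do_operation_quiesced(BlockDriverState *bs)
    {
        AioContext *ctx = bdrv_get_aio_context(bs);

        aio_context_acquire(ctx);
        bdrv_drained_begin(bs);    /* no new external requests; bs is drained */

        /* ... operation that may run nested aio_poll(), e.g. a
         * qmp_transaction action or bdrv_reopen() ... */

        bdrv_drained_end(bs);      /* re-enable external request sources */
        aio_context_release(ctx);
    }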
* [Qemu-devel] [PATCH v6 01/12] aio: Add "is_external" flag for event handlers 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng ` (10 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block All callers pass in false, and the real external ones will switch to true in coming patches. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- aio-posix.c | 6 ++++- aio-win32.c | 5 ++++ async.c | 3 ++- block/curl.c | 14 +++++----- block/iscsi.c | 9 +++---- block/linux-aio.c | 5 ++-- block/nbd-client.c | 10 ++++--- block/nfs.c | 17 +++++------- block/sheepdog.c | 38 ++++++++++++++++++--------- block/ssh.c | 5 ++-- block/win32-aio.c | 5 ++-- hw/block/dataplane/virtio-blk.c | 6 +++-- hw/scsi/virtio-scsi-dataplane.c | 24 +++++++++++------ include/block/aio.h | 2 ++ iohandler.c | 3 ++- nbd.c | 4 ++- tests/test-aio.c | 58 +++++++++++++++++++++++------------------ 17 files changed, 130 insertions(+), 84 deletions(-) diff --git a/aio-posix.c b/aio-posix.c index d477033..f0f9122 100644 --- a/aio-posix.c +++ b/aio-posix.c @@ -25,6 +25,7 @@ struct AioHandler IOHandler *io_write; int deleted; void *opaque; + bool is_external; QLIST_ENTRY(AioHandler) node; }; @@ -43,6 +44,7 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd) void aio_set_fd_handler(AioContext *ctx, int fd, + bool is_external, IOHandler *io_read, IOHandler *io_write, void *opaque) @@ -82,6 +84,7 @@ void aio_set_fd_handler(AioContext *ctx, node->io_read = io_read; node->io_write = io_write; node->opaque = opaque; + node->is_external = is_external; node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0); node->pfd.events |= (io_write ? 
G_IO_OUT | G_IO_ERR : 0); @@ -92,10 +95,11 @@ void aio_set_fd_handler(AioContext *ctx, void aio_set_event_notifier(AioContext *ctx, EventNotifier *notifier, + bool is_external, EventNotifierHandler *io_read) { aio_set_fd_handler(ctx, event_notifier_get_fd(notifier), - (IOHandler *)io_read, NULL, notifier); + is_external, (IOHandler *)io_read, NULL, notifier); } bool aio_prepare(AioContext *ctx) diff --git a/aio-win32.c b/aio-win32.c index 50a6867..3110d85 100644 --- a/aio-win32.c +++ b/aio-win32.c @@ -28,11 +28,13 @@ struct AioHandler { GPollFD pfd; int deleted; void *opaque; + bool is_external; QLIST_ENTRY(AioHandler) node; }; void aio_set_fd_handler(AioContext *ctx, int fd, + bool is_external, IOHandler *io_read, IOHandler *io_write, void *opaque) @@ -86,6 +88,7 @@ void aio_set_fd_handler(AioContext *ctx, node->opaque = opaque; node->io_read = io_read; node->io_write = io_write; + node->is_external = is_external; event = event_notifier_get_handle(&ctx->notifier); WSAEventSelect(node->pfd.fd, event, @@ -98,6 +101,7 @@ void aio_set_fd_handler(AioContext *ctx, void aio_set_event_notifier(AioContext *ctx, EventNotifier *e, + bool is_external, EventNotifierHandler *io_notify) { AioHandler *node; @@ -133,6 +137,7 @@ void aio_set_event_notifier(AioContext *ctx, node->e = e; node->pfd.fd = (uintptr_t)event_notifier_get_handle(e); node->pfd.events = G_IO_IN; + node->is_external = is_external; QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node); g_source_add_poll(&ctx->source, &node->pfd); diff --git a/async.c b/async.c index efce14b..bdc64a3 100644 --- a/async.c +++ b/async.c @@ -247,7 +247,7 @@ aio_ctx_finalize(GSource *source) } qemu_mutex_unlock(&ctx->bh_lock); - aio_set_event_notifier(ctx, &ctx->notifier, NULL); + aio_set_event_notifier(ctx, &ctx->notifier, false, NULL); event_notifier_cleanup(&ctx->notifier); rfifolock_destroy(&ctx->lock); qemu_mutex_destroy(&ctx->bh_lock); @@ -329,6 +329,7 @@ AioContext *aio_context_new(Error **errp) } g_source_set_can_recurse(&ctx->source, true); aio_set_event_notifier(ctx, &ctx->notifier, + false, (EventNotifierHandler *) event_notifier_dummy_cb); ctx->thread_pool = NULL; diff --git a/block/curl.c b/block/curl.c index 032cc8a..8994182 100644 --- a/block/curl.c +++ b/block/curl.c @@ -154,18 +154,20 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action, DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd); switch (action) { case CURL_POLL_IN: - aio_set_fd_handler(s->aio_context, fd, curl_multi_read, - NULL, state); + aio_set_fd_handler(s->aio_context, fd, false, + curl_multi_read, NULL, state); break; case CURL_POLL_OUT: - aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state); + aio_set_fd_handler(s->aio_context, fd, false, + NULL, curl_multi_do, state); break; case CURL_POLL_INOUT: - aio_set_fd_handler(s->aio_context, fd, curl_multi_read, - curl_multi_do, state); + aio_set_fd_handler(s->aio_context, fd, false, + curl_multi_read, curl_multi_do, state); break; case CURL_POLL_REMOVE: - aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL); + aio_set_fd_handler(s->aio_context, fd, false, + NULL, NULL, NULL); break; } diff --git a/block/iscsi.c b/block/iscsi.c index 93f1ee4..9a628b7 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -291,8 +291,8 @@ iscsi_set_events(IscsiLun *iscsilun) int ev = iscsi_which_events(iscsi); if (ev != iscsilun->events) { - aio_set_fd_handler(iscsilun->aio_context, - iscsi_get_fd(iscsi), + aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsi), + false, (ev & POLLIN) ? 
iscsi_process_read : NULL, (ev & POLLOUT) ? iscsi_process_write : NULL, iscsilun); @@ -1280,9 +1280,8 @@ static void iscsi_detach_aio_context(BlockDriverState *bs) { IscsiLun *iscsilun = bs->opaque; - aio_set_fd_handler(iscsilun->aio_context, - iscsi_get_fd(iscsilun->iscsi), - NULL, NULL, NULL); + aio_set_fd_handler(iscsilun->aio_context, iscsi_get_fd(iscsilun->iscsi), + false, NULL, NULL, NULL); iscsilun->events = 0; if (iscsilun->nop_timer) { diff --git a/block/linux-aio.c b/block/linux-aio.c index c991443..88b0520 100644 --- a/block/linux-aio.c +++ b/block/linux-aio.c @@ -287,7 +287,7 @@ void laio_detach_aio_context(void *s_, AioContext *old_context) { struct qemu_laio_state *s = s_; - aio_set_event_notifier(old_context, &s->e, NULL); + aio_set_event_notifier(old_context, &s->e, false, NULL); qemu_bh_delete(s->completion_bh); } @@ -296,7 +296,8 @@ void laio_attach_aio_context(void *s_, AioContext *new_context) struct qemu_laio_state *s = s_; s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s); - aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb); + aio_set_event_notifier(new_context, &s->e, false, + qemu_laio_completion_cb); } void *laio_init(void) diff --git a/block/nbd-client.c b/block/nbd-client.c index e1bb919..b7fd17a 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -124,7 +124,7 @@ static int nbd_co_send_request(BlockDriverState *bs, s->send_coroutine = qemu_coroutine_self(); aio_context = bdrv_get_aio_context(bs); - aio_set_fd_handler(aio_context, s->sock, + aio_set_fd_handler(aio_context, s->sock, false, nbd_reply_ready, nbd_restart_write, bs); if (qiov) { if (!s->is_unix) { @@ -144,7 +144,8 @@ static int nbd_co_send_request(BlockDriverState *bs, } else { rc = nbd_send_request(s->sock, request); } - aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs); + aio_set_fd_handler(aio_context, s->sock, false, + nbd_reply_ready, NULL, bs); s->send_coroutine = NULL; qemu_co_mutex_unlock(&s->send_mutex); return rc; @@ -348,14 +349,15 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num, void nbd_client_detach_aio_context(BlockDriverState *bs) { aio_set_fd_handler(bdrv_get_aio_context(bs), - nbd_get_client_session(bs)->sock, NULL, NULL, NULL); + nbd_get_client_session(bs)->sock, + false, NULL, NULL, NULL); } void nbd_client_attach_aio_context(BlockDriverState *bs, AioContext *new_context) { aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock, - nbd_reply_ready, NULL, bs); + false, nbd_reply_ready, NULL, bs); } void nbd_client_close(BlockDriverState *bs) diff --git a/block/nfs.c b/block/nfs.c index 887a98e..fd79f89 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -63,11 +63,10 @@ static void nfs_set_events(NFSClient *client) { int ev = nfs_which_events(client->context); if (ev != client->events) { - aio_set_fd_handler(client->aio_context, - nfs_get_fd(client->context), + aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context), + false, (ev & POLLIN) ? nfs_process_read : NULL, - (ev & POLLOUT) ? nfs_process_write : NULL, - client); + (ev & POLLOUT) ? 
nfs_process_write : NULL, client); } client->events = ev; @@ -242,9 +241,8 @@ static void nfs_detach_aio_context(BlockDriverState *bs) { NFSClient *client = bs->opaque; - aio_set_fd_handler(client->aio_context, - nfs_get_fd(client->context), - NULL, NULL, NULL); + aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context), + false, NULL, NULL, NULL); client->events = 0; } @@ -263,9 +261,8 @@ static void nfs_client_close(NFSClient *client) if (client->fh) { nfs_close(client->context, client->fh); } - aio_set_fd_handler(client->aio_context, - nfs_get_fd(client->context), - NULL, NULL, NULL); + aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context), + false, NULL, NULL, NULL); nfs_destroy_context(client->context); } memset(client, 0, sizeof(NFSClient)); diff --git a/block/sheepdog.c b/block/sheepdog.c index e7e58b7..d80e4ed 100644 --- a/block/sheepdog.c +++ b/block/sheepdog.c @@ -651,14 +651,16 @@ static coroutine_fn void do_co_req(void *opaque) unsigned int *rlen = srco->rlen; co = qemu_coroutine_self(); - aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co); + aio_set_fd_handler(srco->aio_context, sockfd, false, + NULL, restart_co_req, co); ret = send_co_req(sockfd, hdr, data, wlen); if (ret < 0) { goto out; } - aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co); + aio_set_fd_handler(srco->aio_context, sockfd, false, + restart_co_req, NULL, co); ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr)); if (ret != sizeof(*hdr)) { @@ -683,7 +685,8 @@ static coroutine_fn void do_co_req(void *opaque) out: /* there is at most one request for this sockfd, so it is safe to * set each handler to NULL. */ - aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL); + aio_set_fd_handler(srco->aio_context, sockfd, false, + NULL, NULL, NULL); srco->ret = ret; srco->finished = true; @@ -735,7 +738,8 @@ static coroutine_fn void reconnect_to_sdog(void *opaque) BDRVSheepdogState *s = opaque; AIOReq *aio_req, *next; - aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL); + aio_set_fd_handler(s->aio_context, s->fd, false, NULL, + NULL, NULL); close(s->fd); s->fd = -1; @@ -938,7 +942,8 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp) return fd; } - aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s); + aio_set_fd_handler(s->aio_context, fd, false, + co_read_response, NULL, s); return fd; } @@ -1199,7 +1204,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req, qemu_co_mutex_lock(&s->lock); s->co_send = qemu_coroutine_self(); - aio_set_fd_handler(s->aio_context, s->fd, + aio_set_fd_handler(s->aio_context, s->fd, false, co_read_response, co_write_request, s); socket_set_cork(s->fd, 1); @@ -1218,7 +1223,8 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req, } out: socket_set_cork(s->fd, 0); - aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s); + aio_set_fd_handler(s->aio_context, s->fd, false, + co_read_response, NULL, s); s->co_send = NULL; qemu_co_mutex_unlock(&s->lock); } @@ -1368,7 +1374,8 @@ static void sd_detach_aio_context(BlockDriverState *bs) { BDRVSheepdogState *s = bs->opaque; - aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL); + aio_set_fd_handler(s->aio_context, s->fd, false, NULL, + NULL, NULL); } static void sd_attach_aio_context(BlockDriverState *bs, @@ -1377,7 +1384,8 @@ static void sd_attach_aio_context(BlockDriverState *bs, BDRVSheepdogState *s = bs->opaque; s->aio_context = new_context; - 
aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s); + aio_set_fd_handler(new_context, s->fd, false, + co_read_response, NULL, s); } /* TODO Convert to fine grained options */ @@ -1490,7 +1498,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags, g_free(buf); return 0; out: - aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL); + aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, + false, NULL, NULL, NULL); if (s->fd >= 0) { closesocket(s->fd); } @@ -1528,7 +1537,8 @@ static void sd_reopen_commit(BDRVReopenState *state) BDRVSheepdogState *s = state->bs->opaque; if (s->fd) { - aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL); + aio_set_fd_handler(s->aio_context, s->fd, false, + NULL, NULL, NULL); closesocket(s->fd); } @@ -1551,7 +1561,8 @@ static void sd_reopen_abort(BDRVReopenState *state) } if (re_s->fd) { - aio_set_fd_handler(s->aio_context, re_s->fd, NULL, NULL, NULL); + aio_set_fd_handler(s->aio_context, re_s->fd, false, + NULL, NULL, NULL); closesocket(re_s->fd); } @@ -1935,7 +1946,8 @@ static void sd_close(BlockDriverState *bs) error_report("%s, %s", sd_strerror(rsp->result), s->name); } - aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL); + aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, + false, NULL, NULL, NULL); closesocket(s->fd); g_free(s->host_spec); } diff --git a/block/ssh.c b/block/ssh.c index d35b51f..af025c0 100644 --- a/block/ssh.c +++ b/block/ssh.c @@ -800,14 +800,15 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs) rd_handler, wr_handler); aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, - rd_handler, wr_handler, co); + false, rd_handler, wr_handler, co); } static coroutine_fn void clear_fd_handler(BDRVSSHState *s, BlockDriverState *bs) { DPRINTF("s->sock=%d", s->sock); - aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL); + aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, + false, NULL, NULL, NULL); } /* A non-blocking call returned EAGAIN, so yield, ensuring the diff --git a/block/win32-aio.c b/block/win32-aio.c index 64e8682..bbf2f01 100644 --- a/block/win32-aio.c +++ b/block/win32-aio.c @@ -174,7 +174,7 @@ int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile) void win32_aio_detach_aio_context(QEMUWin32AIOState *aio, AioContext *old_context) { - aio_set_event_notifier(old_context, &aio->e, NULL); + aio_set_event_notifier(old_context, &aio->e, false, NULL); aio->is_aio_context_attached = false; } @@ -182,7 +182,8 @@ void win32_aio_attach_aio_context(QEMUWin32AIOState *aio, AioContext *new_context) { aio->is_aio_context_attached = true; - aio_set_event_notifier(new_context, &aio->e, win32_aio_completion_cb); + aio_set_event_notifier(new_context, &aio->e, false, + win32_aio_completion_cb); } QEMUWin32AIOState *win32_aio_init(void) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 6106e46..f8716bc 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -283,7 +283,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) /* Get this show started by hooking up our callbacks */ aio_context_acquire(s->ctx); - aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify); + aio_set_event_notifier(s->ctx, &s->host_notifier, false, + handle_notify); aio_context_release(s->ctx); return; @@ -319,7 +320,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) aio_context_acquire(s->ctx); /* Stop notifications for new requests from guest */ - 
aio_set_event_notifier(s->ctx, &s->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->host_notifier, false, + NULL); /* Drain and switch bs back to the QEMU main loop */ blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context()); diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 1248fd9..21f51df 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -60,7 +60,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s, r = g_new(VirtIOSCSIVring, 1); r->host_notifier = *virtio_queue_get_host_notifier(vq); r->guest_notifier = *virtio_queue_get_guest_notifier(vq); - aio_set_event_notifier(s->ctx, &r->host_notifier, handler); + aio_set_event_notifier(s->ctx, &r->host_notifier, false, + handler); r->parent = s; @@ -71,7 +72,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s, return r; fail_vring: - aio_set_event_notifier(s->ctx, &r->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &r->host_notifier, false, + NULL); k->set_host_notifier(qbus->parent, n, false); g_free(r); return NULL; @@ -162,14 +164,17 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s) int i; if (s->ctrl_vring) { - aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, + false, NULL); } if (s->event_vring) { - aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, + false, NULL); } if (s->cmd_vrings) { for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) { - aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, + false, NULL); } } } @@ -290,10 +295,13 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s) aio_context_acquire(s->ctx); - aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL); - aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, + false, NULL); + aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, + false, NULL); for (i = 0; i < vs->conf.num_queues; i++) { - aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL); + aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, + false, NULL); } blk_drain_all(); /* ensure there are no in-flight requests */ diff --git a/include/block/aio.h b/include/block/aio.h index 400b1b0..12f1141 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -299,6 +299,7 @@ bool aio_poll(AioContext *ctx, bool blocking); */ void aio_set_fd_handler(AioContext *ctx, int fd, + bool is_external, IOHandler *io_read, IOHandler *io_write, void *opaque); @@ -312,6 +313,7 @@ void aio_set_fd_handler(AioContext *ctx, */ void aio_set_event_notifier(AioContext *ctx, EventNotifier *notifier, + bool is_external, EventNotifierHandler *io_read); /* Return a GSource that lets the main loop poll the file descriptors attached diff --git a/iohandler.c b/iohandler.c index 55f8501..eb68083 100644 --- a/iohandler.c +++ b/iohandler.c @@ -55,7 +55,8 @@ void qemu_set_fd_handler(int fd, void *opaque) { iohandler_init(); - aio_set_fd_handler(iohandler_ctx, fd, fd_read, fd_write, opaque); + aio_set_fd_handler(iohandler_ctx, fd, false, + fd_read, fd_write, opaque); } /* reaping of zombies. 
right now we're not passing the status to diff --git a/nbd.c b/nbd.c index 74859cb..fbc66be 100644 --- a/nbd.c +++ b/nbd.c @@ -1446,6 +1446,7 @@ static void nbd_set_handlers(NBDClient *client) { if (client->exp && client->exp->ctx) { aio_set_fd_handler(client->exp->ctx, client->sock, + false, client->can_read ? nbd_read : NULL, client->send_coroutine ? nbd_restart_write : NULL, client); @@ -1455,7 +1456,8 @@ static void nbd_set_handlers(NBDClient *client) static void nbd_unset_handlers(NBDClient *client) { if (client->exp && client->exp->ctx) { - aio_set_fd_handler(client->exp->ctx, client->sock, NULL, NULL, NULL); + aio_set_fd_handler(client->exp->ctx, client->sock, + false, NULL, NULL, NULL); } } diff --git a/tests/test-aio.c b/tests/test-aio.c index 217e337..03cd45d 100644 --- a/tests/test-aio.c +++ b/tests/test-aio.c @@ -118,6 +118,12 @@ static void *test_acquire_thread(void *opaque) return NULL; } +static void set_event_notifier(AioContext *ctx, EventNotifier *notifier, + EventNotifierHandler *handler) +{ + aio_set_event_notifier(ctx, notifier, false, handler); +} + static void dummy_notifier_read(EventNotifier *unused) { g_assert(false); /* should never be invoked */ @@ -131,7 +137,7 @@ static void test_acquire(void) /* Dummy event notifier ensures aio_poll() will block */ event_notifier_init(¬ifier, false); - aio_set_event_notifier(ctx, ¬ifier, dummy_notifier_read); + set_event_notifier(ctx, ¬ifier, dummy_notifier_read); g_assert(!aio_poll(ctx, false)); /* consume aio_notify() */ qemu_mutex_init(&data.start_lock); @@ -149,7 +155,7 @@ static void test_acquire(void) aio_context_release(ctx); qemu_thread_join(&thread); - aio_set_event_notifier(ctx, ¬ifier, NULL); + set_event_notifier(ctx, ¬ifier, NULL); event_notifier_cleanup(¬ifier); g_assert(data.thread_acquired); @@ -308,11 +314,11 @@ static void test_set_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 0 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 0); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 0); event_notifier_cleanup(&data.e); @@ -322,7 +328,7 @@ static void test_wait_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 1 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 0); g_assert_cmpint(data.active, ==, 1); @@ -336,7 +342,7 @@ static void test_wait_event_notifier(void) g_assert_cmpint(data.n, ==, 1); g_assert_cmpint(data.active, ==, 0); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 1); @@ -347,7 +353,7 @@ static void test_flush_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 0); g_assert_cmpint(data.active, ==, 10); @@ -363,7 +369,7 @@ static void test_flush_event_notifier(void) g_assert_cmpint(data.active, ==, 0); g_assert(!aio_poll(ctx, false)); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, 
&data.e, NULL); g_assert(!aio_poll(ctx, false)); event_notifier_cleanup(&data.e); } @@ -374,7 +380,7 @@ static void test_wait_event_notifier_noflush(void) EventNotifierTestData dummy = { .n = 0, .active = 1 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 0); @@ -387,7 +393,7 @@ static void test_wait_event_notifier_noflush(void) /* An active event notifier forces aio_poll to look at EventNotifiers. */ event_notifier_init(&dummy.e, false); - aio_set_event_notifier(ctx, &dummy.e, event_ready_cb); + set_event_notifier(ctx, &dummy.e, event_ready_cb); event_notifier_set(&data.e); g_assert(aio_poll(ctx, false)); @@ -407,10 +413,10 @@ static void test_wait_event_notifier_noflush(void) g_assert_cmpint(dummy.n, ==, 1); g_assert_cmpint(dummy.active, ==, 0); - aio_set_event_notifier(ctx, &dummy.e, NULL); + set_event_notifier(ctx, &dummy.e, NULL); event_notifier_cleanup(&dummy.e); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 2); @@ -428,7 +434,7 @@ static void test_timer_schedule(void) * an fd to wait on. Fixing this breaks other tests. So create a dummy one. */ event_notifier_init(&e, false); - aio_set_event_notifier(ctx, &e, dummy_io_handler_read); + set_event_notifier(ctx, &e, dummy_io_handler_read); aio_poll(ctx, false); aio_timer_init(ctx, &data.timer, data.clock_type, @@ -467,7 +473,7 @@ static void test_timer_schedule(void) g_assert(!aio_poll(ctx, false)); g_assert_cmpint(data.n, ==, 2); - aio_set_event_notifier(ctx, &e, NULL); + set_event_notifier(ctx, &e, NULL); event_notifier_cleanup(&e); timer_del(&data.timer); @@ -638,11 +644,11 @@ static void test_source_set_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 0 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 0); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 0); event_notifier_cleanup(&data.e); @@ -652,7 +658,7 @@ static void test_source_wait_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 1 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 0); g_assert_cmpint(data.active, ==, 1); @@ -666,7 +672,7 @@ static void test_source_wait_event_notifier(void) g_assert_cmpint(data.n, ==, 1); g_assert_cmpint(data.active, ==, 0); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 1); @@ -677,7 +683,7 @@ static void test_source_flush_event_notifier(void) { EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 0); g_assert_cmpint(data.active, ==, 10); @@ -693,7 +699,7 @@ static void test_source_flush_event_notifier(void) g_assert_cmpint(data.active, 
==, 0); g_assert(!g_main_context_iteration(NULL, false)); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); while (g_main_context_iteration(NULL, false)); event_notifier_cleanup(&data.e); } @@ -704,7 +710,7 @@ static void test_source_wait_event_notifier_noflush(void) EventNotifierTestData dummy = { .n = 0, .active = 1 }; event_notifier_init(&data.e, false); - aio_set_event_notifier(ctx, &data.e, event_ready_cb); + set_event_notifier(ctx, &data.e, event_ready_cb); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 0); @@ -717,7 +723,7 @@ static void test_source_wait_event_notifier_noflush(void) /* An active event notifier forces aio_poll to look at EventNotifiers. */ event_notifier_init(&dummy.e, false); - aio_set_event_notifier(ctx, &dummy.e, event_ready_cb); + set_event_notifier(ctx, &dummy.e, event_ready_cb); event_notifier_set(&data.e); g_assert(g_main_context_iteration(NULL, false)); @@ -737,10 +743,10 @@ static void test_source_wait_event_notifier_noflush(void) g_assert_cmpint(dummy.n, ==, 1); g_assert_cmpint(dummy.active, ==, 0); - aio_set_event_notifier(ctx, &dummy.e, NULL); + set_event_notifier(ctx, &dummy.e, NULL); event_notifier_cleanup(&dummy.e); - aio_set_event_notifier(ctx, &data.e, NULL); + set_event_notifier(ctx, &data.e, NULL); while (g_main_context_iteration(NULL, false)); g_assert_cmpint(data.n, ==, 2); @@ -759,7 +765,7 @@ static void test_source_timer_schedule(void) * an fd to wait on. Fixing this breaks other tests. So create a dummy one. */ event_notifier_init(&e, false); - aio_set_event_notifier(ctx, &e, dummy_io_handler_read); + set_event_notifier(ctx, &e, dummy_io_handler_read); do {} while (g_main_context_iteration(NULL, false)); aio_timer_init(ctx, &data.timer, data.clock_type, @@ -784,7 +790,7 @@ static void test_source_timer_schedule(void) g_assert_cmpint(data.n, ==, 2); g_assert(qemu_clock_get_ns(data.clock_type) > expiry); - aio_set_event_notifier(ctx, &e, NULL); + set_event_notifier(ctx, &e, NULL); event_notifier_cleanup(&e); timer_del(&data.timer); -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
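For reference, a minimal sketch of the extended signatures after this patch
(the fd, callbacks and opaque pointer below are hypothetical): block drivers
and other internal users pass false for the new argument, while the real
external request sources switch to true in the following patches.

    /* Sketch: an internal handler vs. an external (guest/NBD-facing) handler. */
    aio_set_fd_handler(ctx, fd, false /* is_external */,
                       my_read_cb, my_write_cb, opaque);
    aio_set_event_notifier(ctx, &notifier, true /* is_external */,
                           my_notifier_cb);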
* [Qemu-devel] [PATCH v6 02/12] nbd: Mark fd handlers client type as "external" 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 01/12] aio: Add "is_external" flag for event handlers Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 03/12] dataplane: Mark host notifiers' " Fam Zheng ` (9 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block This lets us distinguish them from internally used fds, and thus avoid handling unwanted events in nested aio_poll() calls. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- nbd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/nbd.c b/nbd.c index fbc66be..dab1ebb 100644 --- a/nbd.c +++ b/nbd.c @@ -1446,7 +1446,7 @@ static void nbd_set_handlers(NBDClient *client) { if (client->exp && client->exp->ctx) { aio_set_fd_handler(client->exp->ctx, client->sock, - false, + true, client->can_read ? nbd_read : NULL, client->send_coroutine ? nbd_restart_write : NULL, client); @@ -1457,7 +1457,7 @@ static void nbd_unset_handlers(NBDClient *client) { if (client->exp && client->exp->ctx) { aio_set_fd_handler(client->exp->ctx, client->sock, - false, NULL, NULL, NULL); + true, NULL, NULL, NULL); } } -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 03/12] dataplane: Mark host notifiers' client type as "external" 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 01/12] aio: Add "is_external" flag for event handlers Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 02/12] nbd: Mark fd handlers client type as "external" Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng ` (8 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block They will be excluded by type in the nested event loops in block layer, so that unwanted events won't be processed there. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- hw/block/dataplane/virtio-blk.c | 5 ++--- hw/scsi/virtio-scsi-dataplane.c | 18 ++++++++---------- 2 files changed, 10 insertions(+), 13 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index f8716bc..c42ddeb 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -283,7 +283,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) /* Get this show started by hooking up our callbacks */ aio_context_acquire(s->ctx); - aio_set_event_notifier(s->ctx, &s->host_notifier, false, + aio_set_event_notifier(s->ctx, &s->host_notifier, true, handle_notify); aio_context_release(s->ctx); return; @@ -320,8 +320,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) aio_context_acquire(s->ctx); /* Stop notifications for new requests from guest */ - aio_set_event_notifier(s->ctx, &s->host_notifier, false, - NULL); + aio_set_event_notifier(s->ctx, &s->host_notifier, true, NULL); /* Drain and switch bs back to the QEMU main loop */ blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context()); diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 21f51df..0d8d71e 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -60,8 +60,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s, r = g_new(VirtIOSCSIVring, 1); r->host_notifier = *virtio_queue_get_host_notifier(vq); r->guest_notifier = *virtio_queue_get_guest_notifier(vq); - aio_set_event_notifier(s->ctx, &r->host_notifier, false, - handler); + aio_set_event_notifier(s->ctx, &r->host_notifier, true, handler); r->parent = s; @@ -72,8 +71,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s, return r; fail_vring: - aio_set_event_notifier(s->ctx, &r->host_notifier, false, - NULL); + aio_set_event_notifier(s->ctx, &r->host_notifier, true, NULL); k->set_host_notifier(qbus->parent, n, false); g_free(r); return NULL; @@ -165,16 +163,16 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s) if (s->ctrl_vring) { aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, - false, NULL); + true, NULL); } if (s->event_vring) { aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, - false, NULL); + true, NULL); } if (s->cmd_vrings) { for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) { aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, - false, NULL); + true, NULL); } } } @@ -296,12 +294,12 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s) aio_context_acquire(s->ctx); aio_set_event_notifier(s->ctx, 
&s->ctrl_vring->host_notifier, - false, NULL); + true, NULL); aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, - false, NULL); + true, NULL); for (i = 0; i < vs->conf.num_queues; i++) { aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, - false, NULL); + true, NULL); } blk_drain_all(); /* ensure there are no in-flight requests */ -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 04/12] aio: introduce aio_{disable, enable}_external 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (2 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 03/12] dataplane: Mark host notifiers' " Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 05/12] block: Introduce "drained begin/end" API Fam Zheng ` (7 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block Signed-off-by: Fam Zheng <famz@redhat.com> --- aio-posix.c | 3 ++- aio-win32.c | 3 ++- include/block/aio.h | 38 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 42 insertions(+), 2 deletions(-) diff --git a/aio-posix.c b/aio-posix.c index f0f9122..0467f23 100644 --- a/aio-posix.c +++ b/aio-posix.c @@ -261,7 +261,8 @@ bool aio_poll(AioContext *ctx, bool blocking) /* fill pollfds */ QLIST_FOREACH(node, &ctx->aio_handlers, node) { - if (!node->deleted && node->pfd.events) { + if (!node->deleted && node->pfd.events + && aio_node_check(ctx, node->is_external)) { add_pollfd(node); } } diff --git a/aio-win32.c b/aio-win32.c index 3110d85..43c4c79 100644 --- a/aio-win32.c +++ b/aio-win32.c @@ -309,7 +309,8 @@ bool aio_poll(AioContext *ctx, bool blocking) /* fill fd sets */ count = 0; QLIST_FOREACH(node, &ctx->aio_handlers, node) { - if (!node->deleted && node->io_notify) { + if (!node->deleted && node->io_notify + && aio_node_check(ctx, node->is_external)) { events[count++] = event_notifier_get_handle(node->e); } } diff --git a/include/block/aio.h b/include/block/aio.h index 12f1141..bcc7d43 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -122,6 +122,8 @@ struct AioContext { /* TimerLists for calling timers - one per clock type */ QEMUTimerListGroup tlg; + + int external_disable_cnt; }; /** @@ -375,4 +377,40 @@ static inline void aio_timer_init(AioContext *ctx, */ int64_t aio_compute_timeout(AioContext *ctx); +/** + * aio_disable_external: + * @ctx: the aio context + * + * Disable the further processing of external clients. + */ +static inline void aio_disable_external(AioContext *ctx) +{ + atomic_inc(&ctx->external_disable_cnt); +} + +/** + * aio_enable_external: + * @ctx: the aio context + * + * Enable the processing of external clients. + */ +static inline void aio_enable_external(AioContext *ctx) +{ + assert(ctx->external_disable_cnt > 0); + atomic_dec(&ctx->external_disable_cnt); +} + +/** + * aio_node_check: + * @ctx: the aio context + * @is_external: Whether or not the checked node is an external event source. + * + * Check if the node's is_external flag is okay to be polled by the ctx at this + * moment. True means green light. + */ +static inline bool aio_node_check(AioContext *ctx, bool is_external) +{ + return !is_external || !atomic_read(&ctx->external_disable_cnt); +} + #endif -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
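A short usage sketch of the new pair (the surrounding context is hypothetical
here; patch 05 wires this into bdrv_drained_begin/end):

    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_disable_external(ctx);  /* handlers registered with is_external == true
                                 * are skipped by aio_poll() from now on */
    /* ... nested aio_poll() only dispatches internal event sources ... */
    aio_enable_external(ctx);   /* counter drops back to zero, external fds
                                 * are polled again */

The disable/enable pair nests via external_disable_cnt, so two callers can
independently quiesce the same AioContext.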
* [Qemu-devel] [PATCH v6 05/12] block: Introduce "drained begin/end" API 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (3 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 04/12] aio: introduce aio_{disable, enable}_external Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng ` (6 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block The semantics is that after bdrv_drained_begin(bs), bs will not get new external requests until the matching bdrv_drained_end(bs). Signed-off-by: Fam Zheng <famz@redhat.com> --- block/io.c | 17 +++++++++++++++++ include/block/block.h | 19 +++++++++++++++++++ include/block/block_int.h | 2 ++ 3 files changed, 38 insertions(+) diff --git a/block/io.c b/block/io.c index 2fd7a1d..5ac6256 100644 --- a/block/io.c +++ b/block/io.c @@ -2624,3 +2624,20 @@ void bdrv_flush_io_queue(BlockDriverState *bs) } bdrv_start_throttled_reqs(bs); } + +void bdrv_drained_begin(BlockDriverState *bs) +{ + if (!bs->quiesce_counter++) { + aio_disable_external(bdrv_get_aio_context(bs)); + } + bdrv_drain(bs); +} + +void bdrv_drained_end(BlockDriverState *bs) +{ + assert(bs->quiesce_counter > 0); + if (--bs->quiesce_counter > 0) { + return; + } + aio_enable_external(bdrv_get_aio_context(bs)); +} diff --git a/include/block/block.h b/include/block/block.h index 28d903c..5d722a7 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -610,4 +610,23 @@ void bdrv_io_plug(BlockDriverState *bs); void bdrv_io_unplug(BlockDriverState *bs); void bdrv_flush_io_queue(BlockDriverState *bs); +/** + * bdrv_drained_begin: + * + * Begin a quiesced section for exclusive access to the BDS, by disabling + * external request sources including NBD server and device model. Note that + * this doesn't block timers or coroutines from submitting more requests, which + * means block_job_pause is still necessary. + * + * This function can be recursive. + */ +void bdrv_drained_begin(BlockDriverState *bs); + +/** + * bdrv_drained_end: + * + * End a quiescent section started by bdrv_drained_begin(). + */ +void bdrv_drained_end(BlockDriverState *bs); + #endif diff --git a/include/block/block_int.h b/include/block/block_int.h index e472a03..e317b14 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -448,6 +448,8 @@ struct BlockDriverState { /* threshold limit for writes, in bytes. "High water mark". */ uint64_t write_threshold_offset; NotifierWithReturn write_threshold_notifier; + + int quiesce_counter; }; struct BlockBackendRootState { -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
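Since the quiesce counter nests, drained sections may be stacked; a brief
sketch of the counting behaviour introduced here:

    bdrv_drained_begin(bs);  /* quiesce_counter 0 -> 1, aio_disable_external() */
    bdrv_drained_begin(bs);  /* 1 -> 2, external stays disabled, bs drained again */
    /* ... exclusive access to bs ... */
    bdrv_drained_end(bs);    /* 2 -> 1 */
    bdrv_drained_end(bs);    /* 1 -> 0, aio_enable_external() */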
* [Qemu-devel] [PATCH v6 06/12] block: Add "drained begin/end" for transactional external snapshot 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (4 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 05/12] block: Introduce "drained begin/end" API Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng ` (5 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block This ensures the atomicity of the transaction by avoiding processing of external requests such as those from ioeventfd. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- blockdev.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index 35e5719..e4a5eb4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1569,6 +1569,7 @@ static void external_snapshot_prepare(BlkTransactionState *common, /* Acquire AioContext now so any threads operating on old_bs stop */ state->aio_context = bdrv_get_aio_context(state->old_bs); aio_context_acquire(state->aio_context); + bdrv_drained_begin(state->old_bs); if (!bdrv_is_inserted(state->old_bs)) { error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); @@ -1638,8 +1639,6 @@ static void external_snapshot_commit(BlkTransactionState *common) * don't want to abort all of them if one of them fails the reopen */ bdrv_reopen(state->old_bs, state->old_bs->open_flags & ~BDRV_O_RDWR, NULL); - - aio_context_release(state->aio_context); } static void external_snapshot_abort(BlkTransactionState *common) @@ -1649,7 +1648,14 @@ static void external_snapshot_abort(BlkTransactionState *common) if (state->new_bs) { bdrv_unref(state->new_bs); } +} + +static void external_snapshot_clean(BlkTransactionState *common) +{ + ExternalSnapshotState *state = + DO_UPCAST(ExternalSnapshotState, common, common); if (state->aio_context) { + bdrv_drained_end(state->old_bs); aio_context_release(state->aio_context); } } @@ -1809,6 +1815,7 @@ static const BdrvActionOps actions[] = { .prepare = external_snapshot_prepare, .commit = external_snapshot_commit, .abort = external_snapshot_abort, + .clean = external_snapshot_clean, }, [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = { .instance_size = sizeof(DriveBackupState), -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 07/12] block: Add "drained begin/end" for transactional backup 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (5 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 06/12] block: Add "drained begin/end" for transactional external snapshot Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng ` (4 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block This ensures the atomicity of the transaction by avoiding processing of external requests such as those from ioeventfd. Move the assignment to state->bs up right after bdrv_drained_begin, so that we can use it in the clean callback. The abort callback will still check bs->job and state->job, so it's OK. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- blockdev.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index e4a5eb4..0a7848b 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1684,9 +1684,16 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp) return; } + if (!blk_is_available(blk)) { + error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device); + return; + } + /* AioContext is released in .clean() */ state->aio_context = blk_get_aio_context(blk); aio_context_acquire(state->aio_context); + bdrv_drained_begin(blk_bs(blk)); + state->bs = blk_bs(blk); qmp_drive_backup(backup->device, backup->target, backup->has_format, backup->format, @@ -1702,7 +1709,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp) return; } - state->bs = blk_bs(blk); state->job = state->bs->job; } @@ -1722,6 +1728,7 @@ static void drive_backup_clean(BlkTransactionState *common) DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common); if (state->aio_context) { + bdrv_drained_end(state->bs); aio_context_release(state->aio_context); } } -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 08/12] block: Add "drained begin/end" for transactional blockdev-backup 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (6 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 07/12] block: Add "drained begin/end" for transactional backup Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng ` (3 subsequent siblings) 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block Similar to the previous patch, make sure that external events are not dispatched during transaction operations. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- blockdev.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index 0a7848b..52f44b2 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1756,6 +1756,11 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp) return; } + if (!blk_is_available(blk)) { + error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, backup->device); + return; + } + target = blk_by_name(backup->target); if (!target) { error_setg(errp, "Device '%s' not found", backup->target); @@ -1770,6 +1775,8 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp) return; } aio_context_acquire(state->aio_context); + state->bs = blk_bs(blk); + bdrv_drained_begin(state->bs); qmp_blockdev_backup(backup->device, backup->target, backup->sync, @@ -1782,7 +1789,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp) return; } - state->bs = blk_bs(blk); state->job = state->bs->job; } @@ -1802,6 +1808,7 @@ static void blockdev_backup_clean(BlkTransactionState *common) BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common); if (state->aio_context) { + bdrv_drained_end(state->bs); aio_context_release(state->aio_context); } } -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (7 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 08/12] block: Add "drained begin/end" for transactional blockdev-backup Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 10:18 ` Kevin Wolf 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng ` (2 subsequent siblings) 11 siblings, 1 reply; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block This ensures the atomicity of the transaction by avoiding processing of external requests such as those from ioeventfd. state->bs is assigned right after bdrv_drained_begin. Because it was used as the flag for deletion or not in abort, now we need a separate flag - InternalSnapshotState.created. Signed-off-by: Fam Zheng <famz@redhat.com> --- blockdev.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index 52f44b2..adc0e69 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState { BlockDriverState *bs; AioContext *aio_context; QEMUSnapshotInfo sn; + bool created; } InternalSnapshotState; static void internal_snapshot_prepare(BlkTransactionState *common, @@ -1414,6 +1415,9 @@ static void internal_snapshot_prepare(BlkTransactionState *common, } bs = blk_bs(blk); + state->bs = bs; + bdrv_drained_begin(bs); + if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, errp)) { return; } @@ -1465,7 +1469,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common, } /* 4. succeed, mark a snapshot is created */ - state->bs = bs; + state->created = true; } static void internal_snapshot_abort(BlkTransactionState *common) @@ -1476,7 +1480,7 @@ static void internal_snapshot_abort(BlkTransactionState *common) QEMUSnapshotInfo *sn = &state->sn; Error *local_error = NULL; - if (!bs) { + if (!state->created) { return; } @@ -1497,6 +1501,7 @@ static void internal_snapshot_clean(BlkTransactionState *common) common, common); if (state->aio_context) { + bdrv_drained_end(state->bs); aio_context_release(state->aio_context); } } -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng @ 2015-10-22 10:18 ` Kevin Wolf 2015-10-22 11:05 ` Jeff Cody 0 siblings, 1 reply; 18+ messages in thread From: Kevin Wolf @ 2015-10-22 10:18 UTC (permalink / raw) To: Fam Zheng; +Cc: pbonzini, jcody, stefanha, qemu-devel, qemu-block Am 22.10.2015 um 08:32 hat Fam Zheng geschrieben: > This ensures the atomicity of the transaction by avoiding processing of > external requests such as those from ioeventfd. > > state->bs is assigned right after bdrv_drained_begin. Because it was > used as the flag for deletion or not in abort, now we need a separate > flag - InternalSnapshotState.created. > > Signed-off-by: Fam Zheng <famz@redhat.com> > --- > blockdev.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/blockdev.c b/blockdev.c > index 52f44b2..adc0e69 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState { > BlockDriverState *bs; > AioContext *aio_context; > QEMUSnapshotInfo sn; > + bool created; > } InternalSnapshotState; > > static void internal_snapshot_prepare(BlkTransactionState *common, > @@ -1414,6 +1415,9 @@ static void internal_snapshot_prepare(BlkTransactionState *common, > } > bs = blk_bs(blk); > > + state->bs = bs; > + bdrv_drained_begin(bs); > + > if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, errp)) { > return; > } More context: /* AioContext is released in .clean() */ state->aio_context = blk_get_aio_context(blk); aio_context_acquire(state->aio_context); if (!blk_is_available(blk)) { error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); return; } bs = blk_bs(blk); state->bs = bs; bdrv_drained_begin(bs); If we error out because of !blk_is_available(blk), we will still call bdrv_drained_end() in .clean even though bdrv_drained_begin() hasn't been called yet. Kevin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot 2015-10-22 10:18 ` Kevin Wolf @ 2015-10-22 11:05 ` Jeff Cody 0 siblings, 0 replies; 18+ messages in thread From: Jeff Cody @ 2015-10-22 11:05 UTC (permalink / raw) To: Kevin Wolf; +Cc: stefanha, pbonzini, Fam Zheng, qemu-devel, qemu-block On Thu, Oct 22, 2015 at 12:18:11PM +0200, Kevin Wolf wrote: > Am 22.10.2015 um 08:32 hat Fam Zheng geschrieben: > > This ensures the atomicity of the transaction by avoiding processing of > > external requests such as those from ioeventfd. > > > > state->bs is assigned right after bdrv_drained_begin. Because it was > > used as the flag for deletion or not in abort, now we need a separate > > flag - InternalSnapshotState.created. > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > --- > > blockdev.c | 9 +++++++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/blockdev.c b/blockdev.c > > index 52f44b2..adc0e69 100644 > > --- a/blockdev.c > > +++ b/blockdev.c > > @@ -1370,6 +1370,7 @@ typedef struct InternalSnapshotState { > > BlockDriverState *bs; > > AioContext *aio_context; > > QEMUSnapshotInfo sn; > > + bool created; > > } InternalSnapshotState; > > > > static void internal_snapshot_prepare(BlkTransactionState *common, > > @@ -1414,6 +1415,9 @@ static void internal_snapshot_prepare(BlkTransactionState *common, > > } > > bs = blk_bs(blk); > > > > + state->bs = bs; > > + bdrv_drained_begin(bs); > > + > > if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_INTERNAL_SNAPSHOT, errp)) { > > return; > > } > > More context: > > /* AioContext is released in .clean() */ > state->aio_context = blk_get_aio_context(blk); > aio_context_acquire(state->aio_context); > > if (!blk_is_available(blk)) { > error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); > return; > } > bs = blk_bs(blk); > > state->bs = bs; > bdrv_drained_begin(bs); > > If we error out because of !blk_is_available(blk), we will still call > bdrv_drained_end() in .clean even though bdrv_drained_begin() hasn't > been called yet. > Right, if the bdrv_drained_begin() is moved down as it is in this patch, then the state->aio_context assignment needs moved down as well. ^ permalink raw reply [flat|nested] 18+ messages in thread
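To make the suggested fix concrete, here is a sketch of the reordered prepare
path (following Kevin's and Jeff's comments above and the ordering already
used in the drive-backup patch; a sketch only, not the actual follow-up patch):

    /* Sketch: check availability before acquiring anything, so that .clean()
     * never runs bdrv_drained_end()/aio_context_release() without their
     * matching counterparts. */
    if (!blk_is_available(blk)) {
        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
        return;
    }

    /* AioContext is released in .clean() */
    state->aio_context = blk_get_aio_context(blk);
    aio_context_acquire(state->aio_context);

    bs = blk_bs(blk);
    state->bs = bs;
    bdrv_drained_begin(bs);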
* [Qemu-devel] [PATCH v6 10/12] block: Introduce BlockDriver.bdrv_drain callback 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (8 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 09/12] block: Add "drained begin/end" for internal snapshot Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain Fam Zheng 2015-10-22 6:33 ` [Qemu-devel] [PATCH v6 12/12] tests: Add test case for aio_disable_external Fam Zheng 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block Drivers can have internal request sources that generate IO, like the need_check_timer in QED. Since we want quiesced periods that contain nested event loops in the block layer, we need to have a way to disable such event sources. Block drivers must implement the "bdrv_drain" callback if they have any internal sources that can generate I/O activity, like a timer or a worker thread (even in a library) that can schedule a QEMUBH in an asynchronous callback. Update the comments of bdrv_drain and bdrv_drained_begin accordingly. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- block/io.c | 6 +++++- include/block/block.h | 9 +++++++-- include/block/block_int.h | 6 ++++++ 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 5ac6256..999df63 100644 --- a/block/io.c +++ b/block/io.c @@ -235,7 +235,8 @@ bool bdrv_requests_pending(BlockDriverState *bs) } /* - * Wait for pending requests to complete on a single BlockDriverState subtree + * Wait for pending requests to complete on a single BlockDriverState subtree, + * and suspend block driver's internal I/O until next request arrives. * * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState * AioContext. @@ -248,6 +249,9 @@ void bdrv_drain(BlockDriverState *bs) { bool busy = true; + if (bs->drv && bs->drv->bdrv_drain) { + bs->drv->bdrv_drain(bs); + } while (busy) { /* Keep iterating */ bdrv_flush_io_queue(bs); diff --git a/include/block/block.h b/include/block/block.h index 5d722a7..cf459d6 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -615,8 +615,13 @@ void bdrv_flush_io_queue(BlockDriverState *bs); * * Begin a quiesced section for exclusive access to the BDS, by disabling * external request sources including NBD server and device model. Note that - * this doesn't block timers or coroutines from submitting more requests, which - * means block_job_pause is still necessary. + * this doesn't prevent timers or coroutines from submitting more requests, + * which means block_job_pause is still necessary. + * + * If new I/O requests are submitted after bdrv_drained_begin is called before + * bdrv_drained_end, more internal I/O might be going on after the request has + * been completed. If you don't want this, you have to issue another bdrv_drain + * or use a nested bdrv_drained_begin/end section. * * This function can be recursive.
*/ diff --git a/include/block/block_int.h b/include/block/block_int.h index e317b14..73eba05 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -288,6 +288,12 @@ struct BlockDriver { */ int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo); + /** + * Drain and stop any internal sources of requests in the driver, and + * remain so until next I/O callback (e.g. bdrv_co_writev) is called. + */ + void (*bdrv_drain)(BlockDriverState *bs); + QLIST_ENTRY(BlockDriver) list; }; -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
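As a hedged illustration of the new hook (the driver, state type, and field below are made up for the example, not taken from the series): a driver whose only internal request source is a timer can satisfy the contract by deleting that timer in the callback, so nothing generates I/O again until the next request callback re-arms it.

/* Hypothetical driver; BDRVFooState and retry_timer are assumed names. */
static void bdrv_foo_drain(BlockDriverState *bs)
{
    BDRVFooState *s = bs->opaque;

    /* Stop the driver's internal request source; it may be re-armed by the
     * next I/O callback, as the block_int.h comment allows. */
    timer_del(s->retry_timer);
}

static BlockDriver bdrv_foo = {
    .format_name = "foo",
    /* ... other callbacks ... */
    .bdrv_drain  = bdrv_foo_drain,
};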
* [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (9 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 10/12] block: Introduce BlockDriver.bdrv_drain callback Fam Zheng @ 2015-10-22 6:32 ` Fam Zheng 2015-10-22 10:53 ` Kevin Wolf 2015-10-22 6:33 ` [Qemu-devel] [PATCH v6 12/12] tests: Add test case for aio_disable_external Fam Zheng 11 siblings, 1 reply; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:32 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block The "need_check_timer" is used to clear the "NEED_CHECK" flag in the image header after a grace period once metadata update has finished. In compliance to the bdrv_drain semantics we should make sure it remains deleted once .bdrv_drain is called. We cannot reuse qed_need_check_timer_cb because it calls bdrv_aio_flush with a completion callback that starts more I/O, bdrv_drain cannot cope with this. Also the assertions in qed_need_check_timer_cb and qed_unplug_allocating_write_reqs are not guaranteed in the case of bdrv_qed_drain. Signed-off-by: Fam Zheng <famz@redhat.com> --- block/qed.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/block/qed.c b/block/qed.c index 5ea05d4..d63a7de 100644 --- a/block/qed.c +++ b/block/qed.c @@ -375,6 +375,20 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs, } } +static void bdrv_qed_drain(BlockDriverState *bs) +{ + BDRVQEDState *s = bs->opaque; + int ret; + + /* Cancel timer and start doing I/O that were meant to happen when it + * fires, that way we get bdrv_drain() taking care of the requests + * correctly. */ + qed_cancel_need_check_timer(s); + qed_plug_allocating_write_reqs(s); + ret = bdrv_flush(s->bs); + qed_clear_need_check(s, ret); +} + static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags, Error **errp) { @@ -1676,6 +1690,7 @@ static BlockDriver bdrv_qed = { .bdrv_check = bdrv_qed_check, .bdrv_detach_aio_context = bdrv_qed_detach_aio_context, .bdrv_attach_aio_context = bdrv_qed_attach_aio_context, + .bdrv_drain = bdrv_qed_drain, }; static void bdrv_qed_init(void) -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
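For readers without the tree at hand, an approximate paraphrase of the existing timer callback referenced above (a sketch, not quoted verbatim from block/qed.c) shows why it cannot simply be reused here: its flush is asynchronous, and the completion callback issues more I/O after bdrv_drain() would already have returned.

/* Approximate sketch of qed_need_check_timer_cb(), for comparison only. */
static void qed_need_check_timer_cb(void *opaque)
{
    BDRVQEDState *s = opaque;

    /* The timer is only expected to fire once allocating writes have drained. */
    assert(!s->allocating_write_reqs_plugged);

    qed_plug_allocating_write_reqs(s);

    /* Asynchronous: qed_clear_need_check() runs later, from the flush
     * completion, and updates the header -- i.e. more I/O after the fact. */
    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
}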
* Re: [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain Fam Zheng @ 2015-10-22 10:53 ` Kevin Wolf 2015-10-22 20:59 ` Paolo Bonzini 0 siblings, 1 reply; 18+ messages in thread From: Kevin Wolf @ 2015-10-22 10:53 UTC (permalink / raw) To: Fam Zheng; +Cc: pbonzini, jcody, stefanha, qemu-devel, qemu-block Am 22.10.2015 um 08:32 hat Fam Zheng geschrieben: > The "need_check_timer" is used to clear the "NEED_CHECK" flag in the > image header after a grace period once metadata update has finished. In > compliance to the bdrv_drain semantics we should make sure it remains > deleted once .bdrv_drain is called. > > We cannot reuse qed_need_check_timer_cb because it calls bdrv_aio_flush > with a completion callback that starts more I/O, bdrv_drain cannot cope > with this. For the record: I discussed this with Fam on IRC and we came to the conclusion that we should instead fix bdrv_drain() to correctly drain in-flight flushes. In order to achieve this, Fam will send a series that adds a BdrvTrackedRequest to all remaining asynchronous operations. So far we have identified flush, discard and aio_ioctl. At the same time he will send a rebased version of this series that goes back to asynchronous flushing in bdrv_qed_drain(). > Also the assertions in qed_need_check_timer_cb and > qed_unplug_allocating_write_reqs are not guaranteed in the case of > bdrv_qed_drain. > > Signed-off-by: Fam Zheng <famz@redhat.com> Kevin ^ permalink raw reply [flat|nested] 18+ messages in thread
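Purely as an illustration of the direction described above (the request type and the widened tracked_request_begin() signature are assumptions, not code from any posted series), flush would become visible to the drain loop roughly like this:

/* Sketch only: BDRV_TRACKED_FLUSH and this tracked_request_begin() signature
 * are assumed for illustration, as is the bdrv_co_flush_nolock() helper. */
int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
{
    BdrvTrackedRequest req;
    int ret;

    tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
    ret = bdrv_co_flush_nolock(bs);   /* hypothetical helper holding the old body */
    tracked_request_end(&req);
    return ret;
}

With the flush counted as a pending request, bdrv_requests_pending() stays true until its completion has run, so a driver's .bdrv_drain no longer has to avoid asynchronous flushes.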
* Re: [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain 2015-10-22 10:53 ` Kevin Wolf @ 2015-10-22 20:59 ` Paolo Bonzini 2015-10-23 3:07 ` Fam Zheng 0 siblings, 1 reply; 18+ messages in thread From: Paolo Bonzini @ 2015-10-22 20:59 UTC (permalink / raw) To: Kevin Wolf, Fam Zheng; +Cc: jcody, qemu-block, qemu-devel, stefanha On 22/10/2015 12:53, Kevin Wolf wrote: > Am 22.10.2015 um 08:32 hat Fam Zheng geschrieben: >> The "need_check_timer" is used to clear the "NEED_CHECK" flag in the >> image header after a grace period once metadata update has finished. In >> compliance to the bdrv_drain semantics we should make sure it remains >> deleted once .bdrv_drain is called. >> >> We cannot reuse qed_need_check_timer_cb because it calls bdrv_aio_flush >> with a completion callback that starts more I/O, bdrv_drain cannot cope >> with this. > > For the record: I discussed this with Fam on IRC and we came to the > conclusion that we should instead fix bdrv_drain() to correctly drain > in-flight flushes. > > In order to achieve this, Fam will send a series that adds a > BdrvTrackedRequest to all remaining asynchronous operations. So far we > have identified flush, discard and aio_ioctl. At the same time he will > send a rebased version of this series that goes back to asynchronous > flushing in bdrv_qed_drain(). This is a completely separate bug. Can you at least merge all patches except this, so that everything except QED can do snapshots with dataplane? Of course there will be a v7 at least for patch 9. Paolo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain 2015-10-22 20:59 ` Paolo Bonzini @ 2015-10-23 3:07 ` Fam Zheng 0 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-23 3:07 UTC (permalink / raw) To: Paolo Bonzini; +Cc: Kevin Wolf, jcody, qemu-block, qemu-devel, stefanha On Thu, 10/22 22:59, Paolo Bonzini wrote: > > > On 22/10/2015 12:53, Kevin Wolf wrote: > > Am 22.10.2015 um 08:32 hat Fam Zheng geschrieben: > >> The "need_check_timer" is used to clear the "NEED_CHECK" flag in the > >> image header after a grace period once metadata update has finished. In > >> compliance to the bdrv_drain semantics we should make sure it remains > >> deleted once .bdrv_drain is called. > >> > >> We cannot reuse qed_need_check_timer_cb because it calls bdrv_aio_flush > >> with a completion callback that starts more I/O, bdrv_drain cannot cope > >> with this. > > > > For the record: I discussed this with Fam on IRC and we came to the > > conclusion that we should instead fix bdrv_drain() to correctly drain > > in-flight flushes. > > > > In order to achieve this, Fam will send a series that adds a > > BdrvTrackedRequest to all remaining asynchronous operations. So far we > > have identified flush, discard and aio_ioctl. At the same time he will > > send a rebased version of this series that goes back to asynchronous > > flushing in bdrv_qed_drain(). > > This is a completely separate bug. Can you at least merge all patches > except this, so that everything except QED can do snapshots with > dataplane? Of course there will be a v7 at least for patch 9. > So I'll drop the two patches for QED in v7, and let them follow the bdrv_drain fixing series. Actually, I think the current being of the QED timer doesn't change guest visible data - if the callback would dequeue more allocating requests (qed_unplug_allocating_write_reqs), the loop in bdrv_drain will wait for them. So it's not a big problem for the sake of transaction after all. Fam ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH v6 12/12] tests: Add test case for aio_disable_external 2015-10-22 6:32 [Qemu-devel] [PATCH v6 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end Fam Zheng ` (10 preceding siblings ...) 2015-10-22 6:32 ` [Qemu-devel] [PATCH v6 11/12] qed: Implement .bdrv_drain Fam Zheng @ 2015-10-22 6:33 ` Fam Zheng 11 siblings, 0 replies; 18+ messages in thread From: Fam Zheng @ 2015-10-22 6:33 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, pbonzini, jcody, stefanha, qemu-block Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> --- tests/test-aio.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/tests/test-aio.c b/tests/test-aio.c index 03cd45d..1623803 100644 --- a/tests/test-aio.c +++ b/tests/test-aio.c @@ -374,6 +374,29 @@ static void test_flush_event_notifier(void) event_notifier_cleanup(&data.e); } +static void test_aio_external_client(void) +{ + int i, j; + + for (i = 1; i < 3; i++) { + EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true }; + event_notifier_init(&data.e, false); + aio_set_event_notifier(ctx, &data.e, true, event_ready_cb); + event_notifier_set(&data.e); + for (j = 0; j < i; j++) { + aio_disable_external(ctx); + } + for (j = 0; j < i; j++) { + assert(!aio_poll(ctx, false)); + assert(event_notifier_test_and_clear(&data.e)); + event_notifier_set(&data.e); + aio_enable_external(ctx); + } + assert(aio_poll(ctx, false)); + event_notifier_cleanup(&data.e); + } +} + static void test_wait_event_notifier_noflush(void) { EventNotifierTestData data = { .n = 0 }; @@ -832,6 +855,7 @@ int main(int argc, char **argv) g_test_add_func("/aio/event/wait", test_wait_event_notifier); g_test_add_func("/aio/event/wait/no-flush-cb", test_wait_event_notifier_noflush); g_test_add_func("/aio/event/flush", test_flush_event_notifier); + g_test_add_func("/aio/external-client", test_aio_external_client); g_test_add_func("/aio/timer/schedule", test_timer_schedule); g_test_add_func("/aio-gsource/flush", test_source_flush); -- 2.4.3 ^ permalink raw reply related [flat|nested] 18+ messages in thread
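For context on what the enable/disable counter exercised by this test models, here is a plausible shape for the drained-section API introduced earlier in the series (a simplified sketch consistent with the counting this test checks; the quiesce_counter field name is assumed here, and patch 05 has the actual hunks):

/* Simplified sketch of the pairing, not the exact patch 05 code. */
void bdrv_drained_begin(BlockDriverState *bs)
{
    if (!bs->quiesce_counter++) {
        aio_disable_external(bdrv_get_aio_context(bs));
    }
    bdrv_drain(bs);    /* flush requests that are already in flight */
}

void bdrv_drained_end(BlockDriverState *bs)
{
    assert(bs->quiesce_counter > 0);
    if (--bs->quiesce_counter == 0) {
        aio_enable_external(bdrv_get_aio_context(bs));
    }
}

The nesting loop in the test (i disables followed by i enables, polling in between) mirrors the recursive use that the bdrv_drained_begin comment in block.h permits.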