From: Stefan Hajnoczi <stefanha@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, famz@redhat.com,
stefanha@redhat.com, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 15/18] block: only call aio_poll on the current thread's AioContext
Date: Sun, 16 Oct 2016 17:40:10 +0100 [thread overview]
Message-ID: <20161016164010.GF2128@stefanha-x1.localdomain> (raw)
In-Reply-To: <1476380062-18001-16-git-send-email-pbonzini@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 5183 bytes --]
On Thu, Oct 13, 2016 at 07:34:19PM +0200, Paolo Bonzini wrote:
> aio_poll is not thread safe; for example bdrv_drain can hang if
> the last in-flight I/O operation is completed in the I/O thread after
> the main thread has checked bs->in_flight.
>
> The bug remains latent as long as all of it is called within
> aio_context_acquire/aio_context_release, but this will change soon.
>
> To fix this, if bdrv_drain is called from outside the I/O thread,
> signal the main AioContext through a dummy bottom half. The event
> loop then only runs in the I/O thread.
>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> async.c | 1 +
> block.c | 2 ++
> block/io.c | 7 +++++++
> include/block/block.h | 24 +++++++++++++++++++++---
> include/block/block_int.h | 1 +
> 5 files changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/async.c b/async.c
> index f30d011..fb37b03 100644
> --- a/async.c
> +++ b/async.c
> @@ -61,6 +61,7 @@ void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
> smp_wmb();
> ctx->first_bh = bh;
> qemu_mutex_unlock(&ctx->bh_lock);
> + aio_notify(ctx);
> }
>
> QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
> diff --git a/block.c b/block.c
> index fbe485c..a17baab 100644
> --- a/block.c
> +++ b/block.c
> @@ -2090,7 +2090,9 @@ int bdrv_reopen_multiple(AioContext *ctx, BlockReopenQueue *bs_queue, Error **er
>
> assert(bs_queue != NULL);
>
> + aio_context_release(ctx);
> bdrv_drain_all();
> + aio_context_acquire(ctx);
>
> QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
> if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, &local_err)) {
> diff --git a/block/io.c b/block/io.c
> index 7d3dcfc..cd28909 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -474,8 +474,15 @@ void bdrv_inc_in_flight(BlockDriverState *bs)
> atomic_inc(&bs->in_flight);
> }
>
> +static void dummy_bh_cb(void *opaque)
> +{
> +}
> +
> void bdrv_wakeup(BlockDriverState *bs)
> {
> + if (bs->wakeup) {
> + aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> + }
> }
Why use a dummy BH instead of aio_notify()?
>
> void bdrv_dec_in_flight(BlockDriverState *bs)
> diff --git a/include/block/block.h b/include/block/block.h
> index ba4318b..72d5d8e 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -343,9 +343,27 @@ void bdrv_drain_all(void);
> #define bdrv_poll_while(bs, cond) ({ \
> bool waited_ = false; \
> BlockDriverState *bs_ = (bs); \
> - while ((cond)) { \
> - aio_poll(bdrv_get_aio_context(bs_), true); \
> - waited_ = true; \
> + AioContext *ctx_ = bdrv_get_aio_context(bs_); \
> + if (aio_context_in_iothread(ctx_)) { \
> + while ((cond)) { \
> + aio_poll(ctx_, true); \
> + waited_ = true; \
> + } \
> + } else { \
> + assert(qemu_get_current_aio_context() == \
> + qemu_get_aio_context()); \
The assumption is that IOThread #1 will never call bdrv_poll_while() on
IOThread #2's AioContext. I believe that is true today. Is this what
you had in mind?
Please add a comment since it's not completely obvious from the assert
expression.
> + /* Ask bdrv_dec_in_flight to wake up the main \
> + * QEMU AioContext. \
> + */ \
> + assert(!bs_->wakeup); \
> + bs_->wakeup = true; \
> + while ((cond)) { \
> + aio_context_release(ctx_); \
> + aio_poll(qemu_get_aio_context(), true); \
> + aio_context_acquire(ctx_); \
> + waited_ = true; \
> + } \
> + bs_->wakeup = false; \
> } \
> waited_; })
>
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 11f877b..0516f62 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -470,6 +470,7 @@ struct BlockDriverState {
> NotifierWithReturnList before_write_notifiers;
>
> /* number of in-flight requests; overall and serialising */
> + bool wakeup;
> unsigned int in_flight;
> unsigned int serialising_in_flight;
>
> --
> 2.7.4
>
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]
next prev parent reply other threads:[~2016-10-16 16:40 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-13 17:34 [Qemu-devel] [PATCH 00/18] dataplane: remove RFifoLock (including almost all previously sent patches) Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 01/18] replication: interrupt failover if the main device is closed Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 02/18] blockjob: introduce .drain callback for jobs Paolo Bonzini
2016-10-16 10:02 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17 7:53 ` [Qemu-devel] " Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 03/18] mirror: use bdrv_drained_begin/bdrv_drained_end Paolo Bonzini
2016-10-14 9:43 ` Fam Zheng
2016-10-14 10:00 ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 04/18] block: add BDS field to count in-flight requests Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 05/18] block: change drain to look only at one child at a time Paolo Bonzini
2016-10-14 10:12 ` Fam Zheng
2016-10-13 17:34 ` [Qemu-devel] [PATCH 06/18] qed: Implement .bdrv_drain Paolo Bonzini
2016-10-14 10:33 ` Fam Zheng
2016-10-14 10:40 ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 07/18] block: introduce bdrv_poll_while and bdrv_wakeup Paolo Bonzini
2016-10-14 10:42 ` Fam Zheng
2016-10-14 10:43 ` Paolo Bonzini
2016-10-16 10:25 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17 7:54 ` [Qemu-devel] " Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 08/18] nfs: move nfs_set_events out of the while loops Paolo Bonzini
2016-10-16 10:37 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 09/18] nfs: use bdrv_poll_while and bdrv_wakeup Paolo Bonzini
2016-10-16 16:17 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 10/18] sheepdog: " Paolo Bonzini
2016-10-16 16:21 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 11/18] aio: introduce qemu_get_current_aio_context Paolo Bonzini
2016-10-16 16:28 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 12/18] iothread: detach all block devices before stopping them Paolo Bonzini
2016-10-14 14:50 ` Fam Zheng
2016-10-14 14:59 ` Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 13/18] replication: pass BlockDriverState to reopen_backing_file Paolo Bonzini
2016-10-16 16:31 ` Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 14/18] block: prepare bdrv_reopen_multiple to release AioContext Paolo Bonzini
2016-10-16 16:32 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 15/18] block: only call aio_poll on the current thread's AioContext Paolo Bonzini
2016-10-14 14:55 ` Fam Zheng
2016-10-16 16:40 ` Stefan Hajnoczi [this message]
2016-10-17 8:04 ` Paolo Bonzini
2016-10-18 10:10 ` Stefan Hajnoczi
2016-10-13 17:34 ` [Qemu-devel] [PATCH 16/18] iothread: release AioContext around aio_poll Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 17/18] qemu-thread: introduce QemuRecMutex Paolo Bonzini
2016-10-13 17:34 ` [Qemu-devel] [PATCH 18/18] aio: convert from RFifoLock to QemuRecMutex Paolo Bonzini
2016-10-16 16:43 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-10-17 8:58 ` [Qemu-devel] [PATCH 00/18] dataplane: remove RFifoLock (including almost all previously sent patches) Christian Borntraeger
2016-10-17 9:17 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161016164010.GF2128@stefanha-x1.localdomain \
--to=stefanha@gmail.com \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).