All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Zhiyong Ye <yezhiyong@bytedance.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, mreitz@redhat.com
Subject: Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
Date: Mon, 19 Jul 2021 12:24:53 +0200	[thread overview]
Message-ID: <YPVS9ZnMFXEJ6c9H@redhat.com> (raw)
In-Reply-To: <20210712053831.51722-2-yezhiyong@bytedance.com>

Am 12.07.2021 um 07:38 hat Zhiyong Ye geschrieben:
> When bdrv_set_aio_context_ignore() is called in the main loop to change
> the AioContext onto the IO thread, the bdrv_drain_invoke_entry() never
> gets to run and the IO thread hangs at co_schedule_bh_cb().
> 
> This is because the AioContext is occupied by the main thread after
> being attached to the IO thread, and the main thread poll in
> bdrv_drained_end() waiting for the IO request to be drained, but the IO
> thread cannot acquire the AioContext, which leads to deadlock.

This shouldn't be the case:

 * The caller must own the AioContext lock for the old AioContext of bs, but it
 * must not own the AioContext lock for new_context (unless new_context is the
 * same as the current context of bs).

Then bdrv_set_aio_context_ignore() switches the AioContext of bs, and
then calls bdrv_drained_end() while holding only the lock for the new
context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
should run without holding any AioContext locks.

If I'm not missing anything, the scenario you're seeing means that the
caller already held a lock for the new AioContext, so that it's locked
twice while AIO_WAIT_WHILE() drops the lock only once. This would be a
bug in the caller because the documentation I quoted explicitly forbids
holding the AioContext lock for the new context.

> Just like below:
> 
> <------>
> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
> (gdb) bt
> ...
> 3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
> 4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
> 5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
> 6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
> 7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
> 8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
> 9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831

Which version of QEMU are you using? In current git master, line 831 is
something entirely different.

Are you using something before commit c7040ff6? Because this is a commit
that fixed a virtio-scsi bug where it would hold the wrong lock before
calling blk_set_aio_context().

> 
> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
> (gdb) bt
> ...
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> 5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
> 6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
> 7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
> 8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
> 9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
> 11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
> (gdb) f 4
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> (gdb) p *mutex
> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
>       __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>       __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
>       '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
> <------>
> 
> Therefore, we should never poll anywhere in
> bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
> commit e037c09c has also already elaborated on why we can't poll at
> bdrv_do_drained_end().
> 
> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
> ---
>  block.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index be083f389e..ebbea72d64 100644
> --- a/block.c
> +++ b/block.c
> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      GSList *parents_to_process = NULL;
>      GSList *entry;
>      BdrvChild *child, *parent;
> +    int drained_end_counter = 0;
>  
>      g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>  
> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>          aio_context_release(old_context);
>      }
>  
> -    bdrv_drained_end(bs);
> +    bdrv_drained_end_no_poll(bs, &drained_end_counter);
>  
>      if (qemu_get_aio_context() != old_context) {
>          aio_context_acquire(old_context);
> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      if (qemu_get_aio_context() != new_context) {
>          aio_context_release(new_context);
>      }
> +    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);

This would be wrong because bs is already in the new context, but you
wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
lock for a context that isn't even locked, resulting in a crash.

>  }
>  
>  static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,

Kevin



  reply	other threads:[~2021-07-19 10:27 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-12  5:38 [PATCH 0/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context Zhiyong Ye
2021-07-12  5:38 ` [PATCH 1/1] " Zhiyong Ye
2021-07-19 10:24   ` Kevin Wolf [this message]
2021-07-20 13:07     ` Zhiyong Ye

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YPVS9ZnMFXEJ6c9H@redhat.com \
    --to=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=yezhiyong@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.