From: Zhiyong Ye <yezhiyong@bytedance.com>
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, Zhiyong Ye <yezhiyong@bytedance.com>,
mreitz@redhat.com
Subject: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
Date: Mon, 12 Jul 2021 13:38:31 +0800 [thread overview]
Message-ID: <20210712053831.51722-2-yezhiyong@bytedance.com> (raw)
In-Reply-To: <20210712053831.51722-1-yezhiyong@bytedance.com>
When bdrv_set_aio_context_ignore() is called in the main loop to change
the AioContext onto the IO thread, the bdrv_drain_invoke_entry() never
gets to run and the IO thread hangs at co_schedule_bh_cb().
This is because the AioContext is occupied by the main thread after
being attached to the IO thread, and the main thread poll in
bdrv_drained_end() waiting for the IO request to be drained, but the IO
thread cannot acquire the AioContext, which leads to deadlock.
Just like below:
<------>
[Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
(gdb) bt
...
3 0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
4 0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
5 0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
6 0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
7 0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
8 0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
9 0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
...
[Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
(gdb) bt
...
4 0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
5 0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
6 0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
7 0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
8 0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
9 0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
(gdb) f 4
4 0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
(gdb) p *mutex
$2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
__kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
'\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
<------>
Therefore, we should never poll anywhere in
bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
commit e037c09c has also already elaborated on why we can't poll at
bdrv_do_drained_end().
Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
---
block.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index be083f389e..ebbea72d64 100644
--- a/block.c
+++ b/block.c
@@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
GSList *parents_to_process = NULL;
GSList *entry;
BdrvChild *child, *parent;
+ int drained_end_counter = 0;
g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
@@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
aio_context_release(old_context);
}
- bdrv_drained_end(bs);
+ bdrv_drained_end_no_poll(bs, &drained_end_counter);
if (qemu_get_aio_context() != old_context) {
aio_context_acquire(old_context);
@@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
if (qemu_get_aio_context() != new_context) {
aio_context_release(new_context);
}
+ BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
}
static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
--
2.11.0
next prev parent reply other threads:[~2021-07-12 5:41 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-12 5:38 [PATCH 0/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context Zhiyong Ye
2021-07-12 5:38 ` Zhiyong Ye [this message]
2021-07-19 10:24 ` [PATCH 1/1] " Kevin Wolf
2021-07-20 13:07 ` Zhiyong Ye
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210712053831.51722-2-yezhiyong@bytedance.com \
--to=yezhiyong@bytedance.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).