[PATCH 0/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context

From: Zhiyong Ye
Date: 2021-07-12 5:38 UTC
To: qemu-block, qemu-devel
Cc: kwolf, Zhiyong Ye, mreitz

When bdrv_set_aio_context_ignore() is called in the main loop to change
the AioContext of a BlockDriverState onto an IO thread,
bdrv_drain_invoke_entry() never gets to run and the IO thread hangs at
co_schedule_bh_cb().

This is because the AioContext is still held by the main thread after
being attached to the IO thread: the main thread polls in
bdrv_drained_end() waiting for the IO requests to be drained, but the IO
thread cannot acquire the AioContext, which leads to deadlock.

Zhiyong Ye (1):
  block: Do not poll in bdrv_set_aio_context_ignore() when acquiring
    new_context

 block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--
2.11.0
[PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context

From: Zhiyong Ye
Date: 2021-07-12 5:38 UTC
To: qemu-block, qemu-devel
Cc: kwolf, Zhiyong Ye, mreitz

When bdrv_set_aio_context_ignore() is called in the main loop to change
the AioContext of a BlockDriverState onto an IO thread,
bdrv_drain_invoke_entry() never gets to run and the IO thread hangs at
co_schedule_bh_cb().

This is because the AioContext is still held by the main thread after
being attached to the IO thread: the main thread polls in
bdrv_drained_end() waiting for the IO requests to be drained, but the IO
thread cannot acquire the AioContext, which leads to deadlock. Just like
below:

<------>
[Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
(gdb) bt
...
3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
...

[Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
(gdb) bt
...
4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
(gdb) f 4
4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
(gdb) p *mutex
$2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
      __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
      '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
<------>

Therefore, we should never poll anywhere in
bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
commit e037c09c has already elaborated on why we cannot poll in
bdrv_do_drained_end().
Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
---
 block.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index be083f389e..ebbea72d64 100644
--- a/block.c
+++ b/block.c
@@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
     GSList *parents_to_process = NULL;
     GSList *entry;
     BdrvChild *child, *parent;
+    int drained_end_counter = 0;

     g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());

@@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
         aio_context_release(old_context);
     }

-    bdrv_drained_end(bs);
+    bdrv_drained_end_no_poll(bs, &drained_end_counter);

     if (qemu_get_aio_context() != old_context) {
         aio_context_acquire(old_context);
@@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
     if (qemu_get_aio_context() != new_context) {
         aio_context_release(new_context);
     }
+    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
 }

 static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
--
2.11.0
Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context

From: Kevin Wolf
Date: 2021-07-19 10:24 UTC
To: Zhiyong Ye
Cc: qemu-devel, qemu-block, mreitz

Am 12.07.2021 um 07:38 hat Zhiyong Ye geschrieben:
> When bdrv_set_aio_context_ignore() is called in the main loop to change
> the AioContext of a BlockDriverState onto an IO thread,
> bdrv_drain_invoke_entry() never gets to run and the IO thread hangs at
> co_schedule_bh_cb().
>
> This is because the AioContext is still held by the main thread after
> being attached to the IO thread: the main thread polls in
> bdrv_drained_end() waiting for the IO requests to be drained, but the IO
> thread cannot acquire the AioContext, which leads to deadlock.

This shouldn't be the case:

 * The caller must own the AioContext lock for the old AioContext of bs, but it
 * must not own the AioContext lock for new_context (unless new_context is the
 * same as the current context of bs).

Then bdrv_set_aio_context_ignore() switches the AioContext of bs, and
then calls bdrv_drained_end() while holding only the lock for the new
context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
should run without holding any AioContext locks.

If I'm not missing anything, the scenario you're seeing means that the
caller already held a lock for the new AioContext, so that it's locked
twice while AIO_WAIT_WHILE() drops the lock only once. This would be a
bug in the caller because the documentation I quoted explicitly forbids
holding the AioContext lock for the new context.

> Just like below:
>
> <------>
> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
> (gdb) bt
> ...
> 3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
> 4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
> 5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
> 6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
> 7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
> 8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
> 9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831

Which version of QEMU are you using? In current git master, line 831 is
something entirely different.

Are you using something before commit c7040ff6? Because this is a commit
that fixed a virtio-scsi bug where it would hold the wrong lock before
calling blk_set_aio_context().

>
> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
> (gdb) bt
> ...
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> 5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
> 6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
> 7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
> 8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
> 9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
> 11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
> (gdb) f 4
> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
> (gdb) p *mutex
> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
>       __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>       __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
>       '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
> <------>
>
> Therefore, we should never poll anywhere in
> bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
> commit e037c09c has already elaborated on why we cannot poll in
> bdrv_do_drained_end().
>
> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
> ---
>  block.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index be083f389e..ebbea72d64 100644
> --- a/block.c
> +++ b/block.c
> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      GSList *parents_to_process = NULL;
>      GSList *entry;
>      BdrvChild *child, *parent;
> +    int drained_end_counter = 0;
>
>      g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>
> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>          aio_context_release(old_context);
>      }
>
> -    bdrv_drained_end(bs);
> +    bdrv_drained_end_no_poll(bs, &drained_end_counter);
>
>      if (qemu_get_aio_context() != old_context) {
>          aio_context_acquire(old_context);
> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>      if (qemu_get_aio_context() != new_context) {
>          aio_context_release(new_context);
>      }
> +    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);

This would be wrong because bs is already in the new context, but you
wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
lock for a context that isn't even locked, resulting in a crash.

>  }
>
>  static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,

Kevin
Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context

From: Zhiyong Ye
Date: 2021-07-20 13:07 UTC
To: Kevin Wolf
Cc: qemu-devel, qemu-block, mreitz

Hi Kevin,

Thanks for your reply and detailed answer. It is true that
AIO_WAIT_WHILE() temporarily unlocks the new context as you said, which
is a point I overlooked.

I'm using QEMU version 5.2, and it works fine after I cherry-pick
commit c7040ff6 into it.

Thanks again!

Zhiyong

On 7/19/21 6:24 PM, Kevin Wolf wrote:
> Am 12.07.2021 um 07:38 hat Zhiyong Ye geschrieben:
>> When bdrv_set_aio_context_ignore() is called in the main loop to change
>> the AioContext of a BlockDriverState onto an IO thread,
>> bdrv_drain_invoke_entry() never gets to run and the IO thread hangs at
>> co_schedule_bh_cb().
>>
>> This is because the AioContext is still held by the main thread after
>> being attached to the IO thread: the main thread polls in
>> bdrv_drained_end() waiting for the IO requests to be drained, but the IO
>> thread cannot acquire the AioContext, which leads to deadlock.
>
> This shouldn't be the case:
>
>  * The caller must own the AioContext lock for the old AioContext of bs, but it
>  * must not own the AioContext lock for new_context (unless new_context is the
>  * same as the current context of bs).
>
> Then bdrv_set_aio_context_ignore() switches the AioContext of bs, and
> then calls bdrv_drained_end() while holding only the lock for the new
> context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
> should run without holding any AioContext locks.
>
> If I'm not missing anything, the scenario you're seeing means that the
> caller already held a lock for the new AioContext, so that it's locked
> twice while AIO_WAIT_WHILE() drops the lock only once.
> This would be a bug in the caller because the documentation I quoted
> explicitly forbids holding the AioContext lock for the new context.
>
>> Just like below:
>>
>> <------>
>> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
>> (gdb) bt
>> ...
>> 3  0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
>> 4  0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
>> 5  0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
>> 6  0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
>> 7  0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
>> 8  0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
>> 9  0x00005601f6da86f2 in blk_do_set_aio_context at ../block/block-backend.c:2026
>> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
>> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
>
> Which version of QEMU are you using? In current git master, line 831 is
> something entirely different.
>
> Are you using something before commit c7040ff6? Because this is a commit
> that fixed a virtio-scsi bug where it would hold the wrong lock before
> calling blk_set_aio_context().
>
>>
>> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
>> (gdb) bt
>> ...
>> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
>> 5  0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
>> 6  0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
>> 7  0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
>> 8  0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
>> 9  0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
>> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
>> 11 0x00007fd80d4fad0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
>> (gdb) f 4
>> 4  0x00005601f6eab6a8 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
>> (gdb) p *mutex
>> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers = 1,
>>       __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
>>       __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
>>       '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
>> <------>
>>
>> Therefore, we should never poll anywhere in
>> bdrv_set_aio_context_ignore() when acquiring the new context. In fact,
>> commit e037c09c has already elaborated on why we cannot poll in
>> bdrv_do_drained_end().
>>
>> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
>> ---
>>  block.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/block.c b/block.c
>> index be083f389e..ebbea72d64 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>      GSList *parents_to_process = NULL;
>>      GSList *entry;
>>      BdrvChild *child, *parent;
>> +    int drained_end_counter = 0;
>>
>>      g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>>
>> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>          aio_context_release(old_context);
>>      }
>>
>> -    bdrv_drained_end(bs);
>> +    bdrv_drained_end_no_poll(bs, &drained_end_counter);
>>
>>      if (qemu_get_aio_context() != old_context) {
>>          aio_context_acquire(old_context);
>> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
>>      if (qemu_get_aio_context() != new_context) {
>>          aio_context_release(new_context);
>>      }
>> +    BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
>
> This would be wrong because bs is already in the new context, but you
> wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
> lock for a context that isn't even locked, resulting in a crash.
>
>>  }
>>
>>  static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
>
> Kevin