* [Qemu-devel] [PATCH] block: avoid recursive AioContext acquire in bdrv_inactivate_all()
@ 2017-12-06 17:54 Stefan Hajnoczi
2017-12-06 18:40 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
0 siblings, 1 reply; 3+ messages in thread
From: Stefan Hajnoczi @ 2017-12-06 17:54 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-stable, Paolo Bonzini, Dr. David Alan Gilbert, qemu-block
From: Paolo Bonzini <pbonzini@redhat.com>
BDRV_POLL_WHILE() does not support recursive AioContext locking. It
only releases the AioContext lock once regardless of how many times the
caller has acquired it. This results in a hang since the IOThread does
not make progress while the AioContext is still locked.
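The locking problem described above can be modelled in isolation. The following is only a toy sketch, not QEMU code: `ToyRecLock` and the `toy_*` helpers are hypothetical names, and the lock is reduced to a nesting counter. It assumes that a polling helper drops the lock exactly once while waiting, as the commit message says of BDRV_POLL_WHILE().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a recursive lock: it only tracks the nesting depth. */
typedef struct {
    int depth; /* outstanding acquire calls made by the owning thread */
} ToyRecLock;

static void toy_acquire(ToyRecLock *l)
{
    l->depth++;
}

static void toy_release(ToyRecLock *l)
{
    assert(l->depth > 0);
    l->depth--;
}

/* Another thread can take the lock only once the depth is back to zero. */
static bool toy_available_to_others(const ToyRecLock *l)
{
    return l->depth == 0;
}

/*
 * Models a polling helper that releases the lock exactly once while it
 * waits, then reacquires it. Returns whether another thread could have
 * taken the lock during the wait.
 */
static bool toy_poll_releasing_once(ToyRecLock *l)
{
    bool other_thread_could_run;

    toy_release(l);
    other_thread_could_run = toy_available_to_others(l);
    toy_acquire(l);
    return other_thread_could_run;
}
```

With a single acquire the helper reports that another thread could run; with two nested acquires it reports that it could not, which corresponds to the IOThread never making progress.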
The following steps trigger the hang:
  $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
        -object iothread,id=iothread0 \
        -device virtio-scsi-pci,iothread=iothread0 \
        -drive if=none,id=drive0,file=test.img,format=raw \
        -device scsi-hd,drive=drive0 \
        -drive if=none,id=drive1,file=test.img,format=raw \
        -device scsi-hd,drive=drive1
  $ qemu-system-x86_64 ...same options... \
        -incoming tcp::1234
  (qemu) migrate tcp:127.0.0.1:1234
  ...hang...
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/block.c b/block.c
index 9a1a0d1e73..1c37ce4554 100644
--- a/block.c
+++ b/block.c
@@ -4320,9 +4320,15 @@ int bdrv_inactivate_all(void)
     BdrvNextIterator it;
     int ret = 0;
     int pass;
+    GSList *aio_ctxs = NULL, *ctx;

     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        aio_context_acquire(bdrv_get_aio_context(bs));
+        AioContext *aio_context = bdrv_get_aio_context(bs);
+
+        if (!g_slist_find(aio_ctxs, aio_context)) {
+            aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
+            aio_context_acquire(aio_context);
+        }
     }

     /* We do two passes of inactivation. The first pass calls to drivers'
@@ -4340,9 +4346,11 @@ int bdrv_inactivate_all(void)
     }

 out:
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        aio_context_release(bdrv_get_aio_context(bs));
+    for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+        AioContext *aio_context = ctx->data;
+        aio_context_release(aio_context);
     }
+    g_slist_free(aio_ctxs);

     return ret;
 }
--
2.14.3
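For readers without a QEMU tree at hand, the acquire-once-per-context pattern used by the patch can be sketched standalone. This is an illustration only: `ToyCtx`, `ToyHeldSet` and the `toy_*` functions are hypothetical names, and a small fixed array stands in for the GSList the patch uses. The test scenario assumes, as the reproducer suggests, that several block nodes can share one AioContext.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CTXS 16

/* Toy context: depth counts how many times its lock is currently held. */
typedef struct {
    int depth;
} ToyCtx;

/* Set of contexts already acquired (stand-in for the patch's GSList). */
typedef struct {
    ToyCtx *held[TOY_MAX_CTXS];
    size_t count;
} ToyHeldSet;

static bool toy_set_contains(const ToyHeldSet *set, const ToyCtx *ctx)
{
    for (size_t i = 0; i < set->count; i++) {
        if (set->held[i] == ctx) {
            return true;
        }
    }
    return false;
}

/* Walk all nodes, but acquire each distinct context only once. */
static void toy_acquire_unique(ToyHeldSet *set, ToyCtx **node_ctxs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ToyCtx *ctx = node_ctxs[i];

        if (!toy_set_contains(set, ctx)) {
            assert(set->count < TOY_MAX_CTXS);
            set->held[set->count++] = ctx;
            ctx->depth++;
        }
    }
}

/* Release exactly the contexts recorded in the set, then empty the set. */
static void toy_release_all(ToyHeldSet *set)
{
    for (size_t i = 0; i < set->count; i++) {
        set->held[i]->depth--;
    }
    set->count = 0;
}
```

Releasing from the recorded set rather than iterating over the nodes again keeps acquire and release symmetric even when several nodes map to the same context.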
* Re: [Qemu-devel] [Qemu-block] [PATCH] block: avoid recursive AioContext acquire in bdrv_inactivate_all()
2017-12-06 17:54 [Qemu-devel] [PATCH] block: avoid recursive AioContext acquire in bdrv_inactivate_all() Stefan Hajnoczi
@ 2017-12-06 18:40 ` Kevin Wolf
2017-12-07 14:46 ` Stefan Hajnoczi
0 siblings, 1 reply; 3+ messages in thread
From: Kevin Wolf @ 2017-12-06 18:40 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Paolo Bonzini, qemu-stable, qemu-block,
Dr. David Alan Gilbert
On 06.12.2017 at 18:54, Stefan Hajnoczi wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
>
> BDRV_POLL_WHILE() does not support recursive AioContext locking. It
> only releases the AioContext lock once regardless of how many times the
> caller has acquired it. This results in a hang since the IOThread does
> not make progress while the AioContext is still locked.
>
> The following steps trigger the hang:
>
> $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
> -object iothread,id=iothread0 \
> -device virtio-scsi-pci,iothread=iothread0 \
> -drive if=none,id=drive0,file=test.img,format=raw \
> -device scsi-hd,drive=drive0 \
> -drive if=none,id=drive1,file=test.img,format=raw \
> -device scsi-hd,drive=drive1
> $ qemu-system-x86_64 ...same options... \
> -incoming tcp::1234
> (qemu) migrate tcp:127.0.0.1:1234
> ...hang...
Please turn this into a test case.
We should probably also update docs/devel/multiple-iothreads.txt.
Currently it says:
  aio_context_acquire()/aio_context_release() calls may be nested.
  This means you can call them if you're not sure whether #2 applies.
While technically that's still correct as far as the lock is concerned,
the limitations of BDRV_POLL_WHILE() mean that in practice this is no
longer a viable option, at least in the context of the block layer.
Kevin
* Re: [Qemu-devel] [Qemu-block] [PATCH] block: avoid recursive AioContext acquire in bdrv_inactivate_all()
2017-12-06 18:40 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
@ 2017-12-07 14:46 ` Stefan Hajnoczi
0 siblings, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2017-12-07 14:46 UTC (permalink / raw)
To: Kevin Wolf
Cc: qemu-devel, Paolo Bonzini, qemu-stable, qemu-block,
Dr. David Alan Gilbert
On Wed, Dec 06, 2017 at 07:40:28PM +0100, Kevin Wolf wrote:
> On 06.12.2017 at 18:54, Stefan Hajnoczi wrote:
> > From: Paolo Bonzini <pbonzini@redhat.com>
> >
> > BDRV_POLL_WHILE() does not support recursive AioContext locking. It
> > only releases the AioContext lock once regardless of how many times the
> > caller has acquired it. This results in a hang since the IOThread does
> > not make progress while the AioContext is still locked.
> >
> > The following steps trigger the hang:
> >
> > $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
> > -object iothread,id=iothread0 \
> > -device virtio-scsi-pci,iothread=iothread0 \
> > -drive if=none,id=drive0,file=test.img,format=raw \
> > -device scsi-hd,drive=drive0 \
> > -drive if=none,id=drive1,file=test.img,format=raw \
> > -device scsi-hd,drive=drive1
> > $ qemu-system-x86_64 ...same options... \
> > -incoming tcp::1234
> > (qemu) migrate tcp:127.0.0.1:1234
> > ...hang...
>
> Please turn this into a test case.
>
> We should probably also update docs/devel/multiple-iothreads.txt.
> Currently it says:
>
> aio_context_acquire()/aio_context_release() calls may be nested.
> This means you can call them if you're not sure whether #2 applies.
>
> While technically that's still correct as far as the lock is concerned,
> the limitations of BDRV_POLL_WHILE() mean that in practice this is no
> longer a viable option, at least in the context of the block layer.
Good point, will fix both things in v2.
Stefan