* [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Stefan Hajnoczi @ 2015-07-27 16:33 UTC
To: qemu-devel
Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

See Patch 2 for details on the deadlock after two aio_context_acquire()
calls race. This caused dataplane to hang on startup.

Patch 1 is a memory leak fix for AioContext that's needed by Patch 2.

Stefan Hajnoczi (2):
  AioContext: avoid leaking BHs on cleanup
  AioContext: force event loop iteration using BH

 async.c             | 29 +++++++++++++++++++++++++++--
 include/block/aio.h |  3 +++
 2 files changed, 30 insertions(+), 2 deletions(-)

--
2.4.3

* [Qemu-devel] [PATCH for-2.4 1/2] AioContext: avoid leaking BHs on cleanup
From: Stefan Hajnoczi @ 2015-07-27 16:33 UTC
To: qemu-devel
Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

BHs are freed during aio_bh_poll(). This leads to memory leaks if there
is no aio_bh_poll() between qemu_bh_delete() and aio_ctx_finalize().

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/async.c b/async.c
index 9a98a74..9fab4c6 100644
--- a/async.c
+++ b/async.c
@@ -230,6 +230,18 @@ aio_ctx_finalize(GSource *source)
 {
     AioContext *ctx = (AioContext *) source;
 
+    qemu_mutex_lock(&ctx->bh_lock);
+    while (ctx->first_bh) {
+        QEMUBH *next = ctx->first_bh->next;
+
+        /* qemu_bh_delete() must have been called on BHs in this AioContext */
+        assert(ctx->first_bh->deleted);
+
+        g_free(ctx->first_bh);
+        ctx->first_bh = next;
+    }
+    qemu_mutex_unlock(&ctx->bh_lock);
+
     thread_pool_free(ctx->thread_pool);
     aio_set_event_notifier(ctx, &ctx->notifier, NULL);
     event_notifier_cleanup(&ctx->notifier);
--
2.4.3

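[To make the scenario in patch 1 concrete, here is a minimal sketch of the
lifecycle it fixes. It is an illustration, not a test from the tree; it
assumes the QEMU 2.4-era API visible in the patches (aio_context_new()
taking an Error **, aio_bh_new(), qemu_bh_delete(), aio_context_unref()),
and my_cb/leak_scenario are made-up names.]

#include "block/aio.h"
#include "qapi/error.h"

/* Hypothetical callback; it never runs in this scenario. */
static void my_cb(void *opaque)
{
}

static void leak_scenario(void)
{
    AioContext *ctx = aio_context_new(&error_abort);
    QEMUBH *bh = aio_bh_new(ctx, my_cb, NULL);  /* linked into ctx->first_bh */

    /* qemu_bh_delete() only sets bh->deleted = 1; the memory is normally
     * reclaimed lazily by the next aio_bh_poll() pass over the list. */
    qemu_bh_delete(bh);

    /* No aio_bh_poll() runs between the delete and teardown, so the deleted
     * BH is still linked into ctx->first_bh.  Before this patch,
     * aio_ctx_finalize() ignored the list and the QEMUBH leaked; with the
     * patch, finalize walks the list and frees it. */
    aio_context_unref(ctx);  /* drops the last reference -> aio_ctx_finalize() */
}
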
* [Qemu-devel] [PATCH for-2.4 2/2] AioContext: force event loop iteration using BH
From: Stefan Hajnoczi @ 2015-07-27 16:33 UTC
To: qemu-devel
Cc: cornelia.huck, borntraeger, Stefan Hajnoczi, Paolo Bonzini

The notify_me optimization introduced in commit eabc97797310
("AioContext: fix broken ctx->dispatching optimization") skips
event_notifier_set() calls when the event loop thread is not blocked in
ppoll(2).

This optimization causes a deadlock if two aio_context_acquire() calls
race. notify_me = 0 during the race so the winning thread can enter
ppoll(2) unaware that the other thread is waiting its turn to acquire
the AioContext.

This patch forces ppoll(2) to return by scheduling a BH instead of
calling aio_notify().

The following deadlock with virtio-blk dataplane is fixed:

  qemu ... -object iothread,id=iothread0 \
           -drive if=none,id=drive0,file=test.img,... \
           -device virtio-blk-pci,iothread=iothread0,drive=drive0

This command line results in a hang early on without this patch.

Thanks to Paolo Bonzini <pbonzini@redhat.com> for investigating this bug
with me.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c             | 17 +++++++++++++++--
 include/block/aio.h |  3 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/async.c b/async.c
index 9fab4c6..9ca7095 100644
--- a/async.c
+++ b/async.c
@@ -79,8 +79,10 @@ int aio_bh_poll(AioContext *ctx)
          * aio_notify again if necessary.
          */
         if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
-            if (!bh->idle)
+            /* Idle BHs and the notify BH don't count as progress */
+            if (!bh->idle && bh != ctx->notify_dummy_bh) {
                 ret = 1;
+            }
             bh->idle = 0;
             bh->cb(bh->opaque);
         }
@@ -230,6 +232,8 @@ aio_ctx_finalize(GSource *source)
 {
     AioContext *ctx = (AioContext *) source;
 
+    qemu_bh_delete(ctx->notify_dummy_bh);
+
     qemu_mutex_lock(&ctx->bh_lock);
     while (ctx->first_bh) {
         QEMUBH *next = ctx->first_bh->next;
@@ -297,8 +301,15 @@ static void aio_timerlist_notify(void *opaque)
 
 static void aio_rfifolock_cb(void *opaque)
 {
+    AioContext *ctx = opaque;
+
     /* Kick owner thread in case they are blocked in aio_poll() */
-    aio_notify(opaque);
+    qemu_bh_schedule(ctx->notify_dummy_bh);
+}
+
+static void notify_dummy_bh(void *opaque)
+{
+    /* Do nothing, we were invoked just to force the event loop to iterate */
 }
 
 static void event_notifier_dummy_cb(EventNotifier *e)
@@ -325,6 +336,8 @@ AioContext *aio_context_new(Error **errp)
     rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
     timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
+    ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
+
     return ctx;
 }

diff --git a/include/block/aio.h b/include/block/aio.h
index 9dd32e0..400b1b0 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -114,6 +114,9 @@ struct AioContext {
     bool notified;
     EventNotifier notifier;
 
+    /* Scheduling this BH forces the event loop to iterate */
+    QEMUBH *notify_dummy_bh;
+
     /* Thread pool for performing work and receiving completion callbacks */
     struct ThreadPool *thread_pool;
--
2.4.3

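[As a reading aid, the interleaving behind the deadlock can be
reconstructed roughly as follows. This is an annotated sketch based on
the commit message above, not text from the patch.]

/*
 * Thread A (owns ctx)                  Thread B (wants ctx)
 * -------------------                  --------------------
 * aio_context_acquire() succeeds
 *                                      aio_context_acquire() blocks on the
 *                                      RFifoLock, whose contention callback
 *                                      aio_rfifolock_cb() runs to kick A
 * ctx->notify_me == 0 because A has
 * not entered ppoll(2) yet, so the
 * old aio_notify() call was a no-op
 * A enters ppoll(2) and sleeps; no
 * event ever arrives and B never
 * gets the lock: deadlock
 *
 * With this patch aio_rfifolock_cb() schedules notify_dummy_bh instead.
 * A scheduled BH makes the event loop compute a zero timeout before it
 * blocks, so A's next ppoll(2) returns immediately; and if A is already
 * inside ppoll(2), notify_me is non-zero and the event notifier fires as
 * usual.  Either way A iterates and can hand the lock to B.
 */
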
* Re: [Qemu-devel] [PATCH for-2.4 2/2] AioContext: force event loop iteration using BH
From: Paolo Bonzini @ 2015-07-27 17:49 UTC
To: Stefan Hajnoczi, qemu-devel
Cc: cornelia.huck, borntraeger

On 27/07/2015 18:33, Stefan Hajnoczi wrote:
> The notify_me optimization introduced in commit eabc97797310
> ("AioContext: fix broken ctx->dispatching optimization") skips
> event_notifier_set() calls when the event loop thread is not blocked in
> ppoll(2).
>
> This optimization causes a deadlock if two aio_context_acquire() calls
> race. notify_me = 0 during the race so the winning thread can enter
> ppoll(2) unaware that the other thread is waiting its turn to acquire
> the AioContext.
>
> This patch forces ppoll(2) to return by scheduling a BH instead of
> calling aio_notify().
>
> The following deadlock with virtio-blk dataplane is fixed:
>
>   qemu ... -object iothread,id=iothread0 \
>            -drive if=none,id=drive0,file=test.img,... \
>            -device virtio-blk-pci,iothread=iothread0,drive=drive0
>
> This command line results in a hang early on without this patch.
>
> Thanks to Paolo Bonzini <pbonzini@redhat.com> for investigating this bug
> with me.

You're too good! :)

Series Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> [...]

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Cornelia Huck @ 2015-07-28  7:07 UTC
To: Stefan Hajnoczi
Cc: borntraeger, qemu-devel, Paolo Bonzini

On Mon, 27 Jul 2015 17:33:37 +0100
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> See Patch 2 for details on the deadlock after two aio_context_acquire()
> calls race. This caused dataplane to hang on startup.
>
> Patch 1 is a memory leak fix for AioContext that's needed by Patch 2.
>
> Stefan Hajnoczi (2):
>   AioContext: avoid leaking BHs on cleanup
>   AioContext: force event loop iteration using BH
>
>  async.c             | 29 +++++++++++++++++++++++++++--
>  include/block/aio.h |  3 +++
>  2 files changed, 30 insertions(+), 2 deletions(-)

Just gave this a try: The stripped-down guest that hangs during startup
on master is working fine with these patches applied, and my full setup
works as well.

So,

Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Cornelia Huck @ 2015-07-28  8:02 UTC
To: Stefan Hajnoczi
Cc: borntraeger, qemu-devel, Paolo Bonzini

On Tue, 28 Jul 2015 09:07:00 +0200
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> [...]
>
> Just gave this a try: The stripped-down guest that hangs during startup
> on master is working fine with these patches applied, and my full setup
> works as well.
>
> So,
>
> Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>

Uh-oh, spoke too soon. It starts, but when I try a virsh managedsave, I
get

qemu-system-s390x: /data/git/yyy/qemu/async.c:242: aio_ctx_finalize: Assertion `ctx->first_bh->deleted' failed.

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Stefan Hajnoczi @ 2015-07-28  8:34 UTC
To: Cornelia Huck
Cc: borntraeger, qemu-devel, Paolo Bonzini

On Tue, Jul 28, 2015 at 10:02:26AM +0200, Cornelia Huck wrote:
> [...]
>
> Uh-oh, spoke too soon. It starts, but when I try a virsh managedsave, I
> get
>
> qemu-system-s390x: /data/git/yyy/qemu/async.c:242: aio_ctx_finalize: Assertion `ctx->first_bh->deleted' failed.

Please pretty-print ctx->first_bh in gdb. In particular, which function
is ctx->first_bh->cb pointing to?

I tried reproducing with qemu-system-x86_64 and a RHEL 7 guest but
couldn't trigger the assertion failure.

This assertion means that there is an *existing* QEMUBH memory leak. It
is not introduced by this patch series. If we run out of time for QEMU
2.4, it would be acceptable to replace the assertion with:

    /* TODO track down leaked BHs and turn this into an assertion */
    if (ctx->first_bh->deleted) {
        g_free(ctx->first_bh);
    }

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Cornelia Huck @ 2015-07-28 10:26 UTC
To: Stefan Hajnoczi
Cc: borntraeger, qemu-devel, Paolo Bonzini

On Tue, 28 Jul 2015 09:34:46 +0100
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> [...]
>
> Please pretty-print ctx->first_bh in gdb. In particular, which function
> is ctx->first_bh->cb pointing to?

(gdb) p/x *(QEMUBH *)ctx->first_bh
$2 = {ctx = 0x9aab3730, cb = 0x801b7c5c, opaque = 0x3ff9800dee0, next =
0x3ff9800dfb0, scheduled = 0x0, idle = 0x0, deleted = 0x0}

cb is pointing at spawn_thread_bh_fn.

> I tried reproducing with qemu-system-x86_64 and a RHEL 7 guest but
> couldn't trigger the assertion failure.

I use the old x-data-plane attribute; if I turn it off, I don't hit the
assertion.

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Stefan Hajnoczi @ 2015-07-28 10:31 UTC
To: Cornelia Huck
Cc: Christian Borntraeger, qemu-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Jul 28, 2015 at 11:26 AM, Cornelia Huck
<cornelia.huck@de.ibm.com> wrote:
> [...]
>
> (gdb) p/x *(QEMUBH *)ctx->first_bh
> $2 = {ctx = 0x9aab3730, cb = 0x801b7c5c, opaque = 0x3ff9800dee0, next =
> 0x3ff9800dfb0, scheduled = 0x0, idle = 0x0, deleted = 0x0}
>
> cb is pointing at spawn_thread_bh_fn.
>
> I use the old x-data-plane attribute; if I turn it off, I don't hit the
> assertion.

Thanks. I understand how to reproduce it now: use -drive aio=threads
and do I/O during managedsave.

I suspect there are more cases of this. We need to clean it up during
QEMU 2.5.

For now let's continue leaking these BHs as we've always done.

Stefan

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Stefan Hajnoczi @ 2015-07-28 10:34 UTC
To: Cornelia Huck
Cc: Christian Borntraeger, qemu-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, Jul 28, 2015 at 11:31 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> [...]
>
> Thanks. I understand how to reproduce it now: use -drive aio=threads
> and do I/O during managedsave.
>
> I suspect there are more cases of this. We need to clean it up during
> QEMU 2.5.
>
> For now let's continue leaking these BHs as we've always done.

Actually, this case can be fixed in the patch by moving
thread_pool_free() before the BH cleanup loop.

But I still fear other parts of QEMU may be leaking BHs and we should
use a full release cycle to weed them out before introducing the
assertion.

Stefan

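[For readers following along, the reordering Stefan describes would look
roughly like the sketch below. It is an illustration of the suggestion,
not a committed patch; as the next message notes, a similar change did
not make the assertion go away in Cornelia's setup.]

static void
aio_ctx_finalize(GSource *source)
{
    AioContext *ctx = (AioContext *) source;

    qemu_bh_delete(ctx->notify_dummy_bh);

    /* Free the thread pool first so it can qemu_bh_delete() its own BHs
     * (such as the spawn_thread_bh_fn one in the backtrace above) before
     * the loop below asserts that every remaining BH has been deleted. */
    thread_pool_free(ctx->thread_pool);

    qemu_mutex_lock(&ctx->bh_lock);
    while (ctx->first_bh) {
        QEMUBH *next = ctx->first_bh->next;

        /* qemu_bh_delete() must have been called on BHs in this AioContext */
        assert(ctx->first_bh->deleted);

        g_free(ctx->first_bh);
        ctx->first_bh = next;
    }
    qemu_mutex_unlock(&ctx->bh_lock);

    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
    event_notifier_cleanup(&ctx->notifier);
    /* ...remaining cleanup unchanged... */
}
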
* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Cornelia Huck @ 2015-07-28 10:58 UTC
To: Stefan Hajnoczi
Cc: Christian Borntraeger, qemu-devel, Stefan Hajnoczi, Paolo Bonzini

On Tue, 28 Jul 2015 11:34:18 +0100
Stefan Hajnoczi <stefanha@gmail.com> wrote:

> [...]
>
> Actually, this case can be fixed in the patch by moving
> thread_pool_free() before the BH cleanup loop.

Tried that, may have done it wrong, because the assertion still hits.

> But I still fear other parts of QEMU may be leaking BHs and we should
> use a full release cycle to weed them out before introducing the
> assertion.

Yes, that's probably the best thing to do that late in the cycle.

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Paolo Bonzini @ 2015-07-28 12:18 UTC
To: Cornelia Huck, Stefan Hajnoczi
Cc: Christian Borntraeger, qemu-devel, Stefan Hajnoczi

On 28/07/2015 12:58, Cornelia Huck wrote:
> > > Thanks. I understand how to reproduce it now: use -drive aio=threads
> > > and do I/O during managedsave.
> > >
> > > I suspect there are more cases of this. We need to clean it up during
> > > QEMU 2.5.
> > >
> > > For now let's continue leaking these BHs as we've always done.
> >
> > Actually, this case can be fixed in the patch by moving
> > thread_pool_free() before the BH cleanup loop.
>
> Tried that, may have done it wrong, because the assertion still hits.

If you're doing savevm with a dataplane disk as the destination, that
cannot work; savevm doesn't attempt to acquire the AioContext so it is
not thread safe.

An even simpler reproducer for this bug, however, is to hot-unplug a
disk created with x-data-plane. It also shows another bug, fixed by
this patch:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..6106e46 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -223,8 +223,8 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     virtio_blk_data_plane_stop(s);
     blk_op_unblock_all(s->conf->conf.blk, s->blocker);
     error_free(s->blocker);
-    object_unref(OBJECT(s->iothread));
     qemu_bh_delete(s->bh);
+    object_unref(OBJECT(s->iothread));
     g_free(s);
 }

which I'll formally send shortly.

I would prefer to fix them all in 2.4 and risk regressions, because the
bugs are use-after-frees, i.e. pretty bad.

Paolo

* Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
From: Stefan Hajnoczi @ 2015-07-28 13:58 UTC
To: Paolo Bonzini
Cc: Cornelia Huck, Christian Borntraeger, qemu-devel, Stefan Hajnoczi

On Tue, Jul 28, 2015 at 1:18 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> I would prefer to fix them all in 2.4 and risk regressions, because the
> bugs are use-after-frees, i.e. pretty bad.

There may be existing use-after-free bugs, but keep in mind there are
other common cases:

1. Never touch the QEMUBH again. Simple leak.

2. Call qemu_bh_delete(). Leak but still not use-after-free, since the
   QEMUBH is still allocated.

The only scenario where a real use-after-free occurs is when
qemu_bh_schedule() is called after the AioContext was freed. We don't
need an assertion to detect that case, just assign bh->ctx = NULL to
cause a segfault if the AioContext is ever accessed again.

Stefan

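[Spelled out, combining this poisoning idea with the earlier suggestion
to free deleted BHs gives a finalize loop along these lines. This is a
sketch of the idea only; exactly where the poisoning would land is an
assumption, and none of this is a merged change.]

qemu_mutex_lock(&ctx->bh_lock);
while (ctx->first_bh) {
    QEMUBH *next = ctx->first_bh->next;

    if (ctx->first_bh->deleted) {
        /* Case 2 above: deleted but never reclaimed -- a plain leak,
         * safe to free here. */
        g_free(ctx->first_bh);
    } else {
        /* Someone may still hold a pointer to this BH, so leak the
         * struct but poison its context: a later qemu_bh_schedule()
         * reads bh->ctx and would now segfault loudly instead of
         * silently touching the freed AioContext. */
        ctx->first_bh->ctx = NULL;
    }
    ctx->first_bh = next;
}
qemu_mutex_unlock(&ctx->bh_lock);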