From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46093)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZK3q3-0001ej-8n
	for qemu-devel@nongnu.org; Tue, 28 Jul 2015 08:18:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZK3pz-0000rZ-Qs
	for qemu-devel@nongnu.org; Tue, 28 Jul 2015 08:18:31 -0400
Received: from mail-wi0-x22e.google.com ([2a00:1450:400c:c05::22e]:36855)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZK3pz-0000rV-JR
	for qemu-devel@nongnu.org; Tue, 28 Jul 2015 08:18:27 -0400
Received: by wicgb10 with SMTP id gb10so153947739wic.1
	for <qemu-devel@nongnu.org>; Tue, 28 Jul 2015 05:18:27 -0700 (PDT)
Sender: Paolo Bonzini <paolo.bonzini@gmail.com>
References: <1438014819-18125-1-git-send-email-stefanha@redhat.com>
	<20150728090700.73f001d3.cornelia.huck@de.ibm.com>
	<20150728100226.49bafc67.cornelia.huck@de.ibm.com>
	<20150728083446.GC32719@stefanha-thinkpad.redhat.com>
	<20150728122626.7d9eeabf.cornelia.huck@de.ibm.com>
	<CAJSP0QV7gJ_sYyVYS5VfbiE_afCdtE6ynHhXpOdh8FHL_XawBQ@mail.gmail.com>
	<CAJSP0QVjxooCYnOrNHVgjZ8MN7bD1p5-86eXuTvUuYfZS8XuQQ@mail.gmail.com>
	<20150728125857.174d2887.cornelia.huck@de.ibm.com>
From: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <55B77310.6050808@redhat.com>
Date: Tue, 28 Jul 2015 14:18:24 +0200
MIME-Version: 1.0
In-Reply-To: <20150728125857.174d2887.cornelia.huck@de.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after
 aio_context_acquire() race
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Cornelia Huck <cornelia.huck@de.ibm.com>, Stefan Hajnoczi <stefanha@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>, qemu-devel <qemu-devel@nongnu.org>, Stefan Hajnoczi <stefanha@redhat.com>


On 28/07/2015 12:58, Cornelia Huck wrote:
> > > Thanks.  I understand how to reproduce it now: use -drive aio=threads
> > > and do I/O during managedsave.
> > >
> > > I suspect there are more cases of this.  We need to clean it up during QEMU 2.5.
> > >
> > > For now let's continue leaking these BHs as we've always done.
> > 
> > Actually, this case can be fixed in the patch by moving
> > thread_pool_free() before the BH cleanup loop.
>
> Tried that, may have done it wrong, because the assertion still hits.

If you're doing savevm with a dataplane disk as the destination, that
cannot work; savevm doesn't attempt to acquire the AioContext, so it is
not thread-safe.

An even simpler reproducer for this bug, however, is to hot-unplug a 
disk created with x-data-plane.  It also shows another bug, fixed by 
this patch:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..6106e46 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -223,8 +223,8 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     virtio_blk_data_plane_stop(s);
     blk_op_unblock_all(s->conf->conf.blk, s->blocker);
     error_free(s->blocker);
-    object_unref(OBJECT(s->iothread));
     qemu_bh_delete(s->bh);
+    object_unref(OBJECT(s->iothread));
     g_free(s);
 }
 
which I'll formally send shortly.

I would prefer to fix them all in 2.4 and risk regressions, because the
bugs are use-after-frees, i.e. pretty bad.
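
To illustrate why the order of the two calls in the hunk above matters,
here is a minimal standalone sketch.  It assumes that the BH's memory is
owned by the iothread's AioContext and is released once the last
reference to the iothread goes away.  ToyCtx, ToyBH and the toy_*()
helpers are hypothetical stand-ins, not QEMU's real types or API, and a
"live" flag replaces the actual free() so that the sketch itself stays
well defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (NOT QEMU's real types): ToyCtx plays the role of the
 * iothread and its AioContext, ToyBH the role of a BH owned by it. */
typedef struct ToyBH {
    bool live;     /* false once the owning context has been finalized */
    bool deleted;  /* set by toy_bh_delete() to mark the BH for deletion */
} ToyBH;

typedef struct ToyCtx {
    int refcnt;
    ToyBH bh;      /* storage owned by the context */
} ToyCtx;

static void toy_ctx_init(ToyCtx *ctx)
{
    ctx->refcnt = 1;
    ctx->bh.live = true;
    ctx->bh.deleted = false;
}

/* Dropping the last reference finalizes the context and what it owns. */
static void toy_ctx_unref(ToyCtx *ctx)
{
    assert(ctx->refcnt > 0);
    if (--ctx->refcnt == 0) {
        ctx->bh.live = false;  /* real code would free the memory here */
    }
}

/* Returns false where real code would write to freed memory. */
static bool toy_bh_delete(ToyBH *bh)
{
    if (!bh->live) {
        return false;          /* models the use-after-free */
    }
    bh->deleted = true;
    return true;
}

/* Order before the patch: unref the owner, then delete the BH. */
static bool destroy_old_order(void)
{
    ToyCtx ctx;
    toy_ctx_init(&ctx);
    toy_ctx_unref(&ctx);
    return toy_bh_delete(&ctx.bh);
}

/* Order after the patch: delete the BH while the owner is still alive. */
static bool destroy_new_order(void)
{
    ToyCtx ctx;
    bool ok;
    toy_ctx_init(&ctx);
    ok = toy_bh_delete(&ctx.bh);
    toy_ctx_unref(&ctx);
    return ok;
}
```

In this model destroy_old_order() reports the bad access and
destroy_new_order() does not.  It only models the ordering hazard;
whether the real object_unref() drops the last reference depends on who
else still holds one.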

Paolo