From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46093)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZK3q3-0001ej-8n for qemu-devel@nongnu.org;
    Tue, 28 Jul 2015 08:18:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1ZK3pz-0000rZ-Qs for qemu-devel@nongnu.org;
    Tue, 28 Jul 2015 08:18:31 -0400
Received: from mail-wi0-x22e.google.com ([2a00:1450:400c:c05::22e]:36855)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZK3pz-0000rV-JR for qemu-devel@nongnu.org;
    Tue, 28 Jul 2015 08:18:27 -0400
Received: by wicgb10 with SMTP id gb10so153947739wic.1 for ;
    Tue, 28 Jul 2015 05:18:27 -0700 (PDT)
Sender: Paolo Bonzini
References: <1438014819-18125-1-git-send-email-stefanha@redhat.com>
    <20150728090700.73f001d3.cornelia.huck@de.ibm.com>
    <20150728100226.49bafc67.cornelia.huck@de.ibm.com>
    <20150728083446.GC32719@stefanha-thinkpad.redhat.com>
    <20150728122626.7d9eeabf.cornelia.huck@de.ibm.com>
    <20150728125857.174d2887.cornelia.huck@de.ibm.com>
From: Paolo Bonzini
Message-ID: <55B77310.6050808@redhat.com>
Date: Tue, 28 Jul 2015 14:18:24 +0200
MIME-Version: 1.0
In-Reply-To: <20150728125857.174d2887.cornelia.huck@de.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for-2.4 0/2] AioContext: fix deadlock after aio_context_acquire() race
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Cornelia Huck , Stefan Hajnoczi
Cc: Christian Borntraeger , qemu-devel , Stefan Hajnoczi

On 28/07/2015 12:58, Cornelia Huck wrote:
> > > Thanks. I understand how to reproduce it now: use -drive aio=threads
> > > and do I/O during managedsave.
> > >
> > > I suspect there are more cases of this. We need to clean it up during QEMU 2.5.
> > >
> > > For now let's continue leaking these BHs as we've always done.
> >
> > Actually, this case can be fixed in the patch by moving
> > thread_pool_free() before the BH cleanup loop.
>
> Tried that, may have done it wrong, because the assertion still hits.

If you're doing savevm with a dataplane disk as the destination, that
cannot work; savevm doesn't attempt to acquire the AioContext so it is
not thread safe.

An even simpler reproducer for this bug, however, is to hot-unplug a
disk created with x-data-plane. It also shows another bug, fixed by
this patch:

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..6106e46 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -223,8 +223,8 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     virtio_blk_data_plane_stop(s);
     blk_op_unblock_all(s->conf->conf.blk, s->blocker);
     error_free(s->blocker);
-    object_unref(OBJECT(s->iothread));
     qemu_bh_delete(s->bh);
+    object_unref(OBJECT(s->iothread));
     g_free(s);
 }

which I'll formally send shortly.

I would prefer to fix them all in 2.4 and risk regressions, because the
bugs are use-after-frees, i.e. pretty bad.

Paolo
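
For readers following the thread, the ordering issue in the diff above can
be illustrated with a small standalone sketch. This is a toy model, not
QEMU code: every Toy* type and toy_* function is hypothetical, and it makes
no claim about how QEMU stores or frees bottom halves internally. It only
shows the general rule the patch applies: release an object that depends on
its owner before dropping what may be the last reference to that owner.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner object, standing in for an I/O thread's context. */
typedef struct ToyContext {
    int refcount;   /* the context is freed when this drops to zero */
    int live_bhs;   /* callbacks still registered on this context */
} ToyContext;

/* Hypothetical callback object that points back at its owner. */
typedef struct ToyBH {
    ToyContext *ctx;
} ToyBH;

/* Hypothetical device state holding one owner reference and one callback. */
typedef struct ToyDataPlane {
    ToyContext *ctx;
    ToyBH *bh;
} ToyDataPlane;

/* Instrumentation so a caller can check what happened during teardown. */
static int contexts_freed;
static int bhs_alive_at_context_free;

static ToyContext *toy_context_new(void)
{
    ToyContext *ctx = calloc(1, sizeof(*ctx));
    assert(ctx != NULL);
    ctx->refcount = 1;
    return ctx;
}

static void toy_context_unref(ToyContext *ctx)
{
    if (--ctx->refcount == 0) {
        /* Record any callbacks that outlived their owner. */
        bhs_alive_at_context_free += ctx->live_bhs;
        contexts_freed++;
        free(ctx);
    }
}

static ToyBH *toy_bh_new(ToyContext *ctx)
{
    ToyBH *bh = calloc(1, sizeof(*bh));
    assert(bh != NULL);
    bh->ctx = ctx;
    ctx->live_bhs++;
    return bh;
}

/* Dereferences bh->ctx, so the owner must still be alive here. */
static void toy_bh_delete(ToyBH *bh)
{
    bh->ctx->live_bhs--;
    free(bh);
}

static ToyDataPlane *toy_data_plane_new(void)
{
    ToyDataPlane *s = calloc(1, sizeof(*s));
    assert(s != NULL);
    s->ctx = toy_context_new();
    s->bh = toy_bh_new(s->ctx);
    return s;
}

/* Teardown in the order the patch uses: dependent first, owner last. */
static void toy_data_plane_destroy(ToyDataPlane *s)
{
    toy_bh_delete(s->bh);        /* still needs s->ctx to be valid */
    toy_context_unref(s->ctx);   /* may free the context */
    free(s);
}
```

Swapping the first two calls in toy_data_plane_destroy() would make
toy_bh_delete() read from a context that has already been freed, which is
the same class of use-after-free discussed in this thread.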