From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55291)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1XUYSd-00065L-09
    for qemu-devel@nongnu.org; Thu, 18 Sep 2014 05:57:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1XUYSX-00011e-U0
    for qemu-devel@nongnu.org; Thu, 18 Sep 2014 05:57:10 -0400
Received: from mx1.redhat.com ([209.132.183.28]:33568)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1XUYSX-00011D-Lu
    for qemu-devel@nongnu.org; Thu, 18 Sep 2014 05:57:05 -0400
Message-ID: <541AAC64.4020006@redhat.com>
Date: Thu, 18 Sep 2014 11:56:52 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <5416C46D.7040105@ozlabs.ru> <541826CA.7050607@ozlabs.ru>
    <541828BF.8090301@redhat.com>
    <20140917090615.GB10699@stefanha-thinkpad.redhat.com>
    <54195395.9010201@redhat.com> <5419902B.1030309@ozlabs.ru>
    <541A50F6.4060703@ozlabs.ru>
In-Reply-To: <541A50F6.4060703@ozlabs.ru>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] migration: qemu-coroutine-lock.c:141:
    qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed
To: Alexey Kardashevskiy, Stefan Hajnoczi
Cc: Kevin Wolf, "qemu-devel@nongnu.org", Max Reitz, Stefan Hajnoczi,
    "Dr. David Alan Gilbert"

On 18/09/2014 05:26, Alexey Kardashevskiy wrote:
> On 09/18/2014 01:07 AM, Stefan Hajnoczi wrote:
>> On Wed, Sep 17, 2014 at 2:44 PM, Alexey Kardashevskiy wrote:
>>> On 09/17/2014 07:25 PM, Paolo Bonzini wrote:
>>> btw any better idea of a hack to try? Testers are pushing me - they want to
>>> upgrade the broken setup and I am blocking them :) Thanks!
>>
>> Paolo's qemu_co_mutex_lock(&s->lock) idea in qcow2_invalidate_cache()
>> is good. Have you tried that patch?
>
>
> Yes, did not help.
>
>>
>> I haven't checked the qcow2 code whether that works properly across
>> bdrv_close() (is the lock freed?) but in principle that's how you
>> protect against concurrent I/O.
>
> I thought we have to avoid qemu_coroutine_yield() in this particular case.
> I fail to see how the locks may help if we still do yeild. But the whole
> thing is already way behind of my understanding :) For example - how many
> BlockDriverState things are layered here? NBD -> QCOW2 -> RAW?

No, this is an NBD server.  So we have three users of the same QCOW2
image: migration, NBD server and virtio disk (not active while the bug
happens, and thus not depicted):

    NBD server  ->  QCOW2  <-  migration
                      |
                      v
                     File

The problem is that the NBD server accesses the QCOW2 image while
migration does qcow2_invalidate_cache.

Paolo
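
To illustrate how two users of one image can trip an assertion like
`mutex->locked == 1', here is a minimal single-threaded model.  It is
only a sketch under assumptions, not a diagnosis of the actual bug: it
assumes that invalidation rebuilds the per-image state, lock included,
while a request still holds that lock.  All names (ToyMutex,
ToyImageState, toy_*) are hypothetical and are not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a coroutine mutex: only the 'locked' flag is modelled. */
typedef struct {
    bool locked;
} ToyMutex;

/* Toy per-image state (hypothetical; not the real qcow2 driver state). */
typedef struct {
    ToyMutex lock;
    int cache_generation;
} ToyImageState;

static void toy_lock(ToyMutex *m)
{
    assert(!m->locked);   /* this model never has to wait for the lock */
    m->locked = true;
}

/* Returns false where real code would fail an assertion on 'locked'. */
static bool toy_unlock(ToyMutex *m)
{
    if (!m->locked) {
        return false;
    }
    m->locked = false;
    return true;
}

/* Assumption: invalidation wipes and rebuilds the whole state, lock included. */
static void toy_invalidate_unsafe(ToyImageState *s)
{
    int gen = s->cache_generation + 1;
    memset(s, 0, sizeof(*s));
    s->cache_generation = gen;
}

/* Guarded variant: refuses to rebuild while a request holds the lock.
 * Real code would have to wait for the holder instead of giving up. */
static bool toy_invalidate_guarded(ToyImageState *s)
{
    if (s->lock.locked) {
        return false;
    }
    toy_invalidate_unsafe(s);
    return true;
}

/* One request takes the lock, invalidation runs in the middle, and the
 * request then unlocks.  Returns whether the final unlock was valid. */
static bool toy_scenario(bool guarded)
{
    ToyImageState s = { .lock = { false }, .cache_generation = 0 };

    toy_lock(&s.lock);                 /* e.g. an NBD request in flight */
    if (guarded) {
        toy_invalidate_guarded(&s);    /* skipped: lock is held */
    } else {
        toy_invalidate_unsafe(&s);     /* wipes the held lock */
    }
    return toy_unlock(&s.lock);        /* request completes */
}
```

In this model toy_scenario(false) reports an invalid unlock, because
the lock was reset underneath its holder, while toy_scenario(true)
unlocks cleanly.  Whether the real failure follows this path depends on
what happens to the lock across bdrv_close(), which is the open question
quoted above.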