From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 19 Sep 2014 10:47:03 +0200
From: Kevin Wolf
Message-ID: <20140919084703.GA7667@noname.redhat.com>
References: <5416C46D.7040105@ozlabs.ru> <541826CA.7050607@ozlabs.ru> <541828BF.8090301@redhat.com> <20140916123431.GB4886@noname.str.redhat.com> <54182EAE.4000802@redhat.com> <20140916125223.GC4886@noname.str.redhat.com> <5418343A.1050600@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5418343A.1050600@redhat.com>
Subject: Re: [Qemu-devel] migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed
To: Paolo Bonzini
Cc: Alexey Kardashevskiy, "qemu-devel@nongnu.org", Max Reitz, Stefan Hajnoczi, "Dr. David Alan Gilbert"

On 16.09.2014 at 14:59, Paolo Bonzini wrote:
> On 16/09/2014 14:52, Kevin Wolf wrote:
> > Yes, that's true. We can't fix this problem in qcow2, though, because
> > it's a more general one. I think we must make sure that
> > bdrv_invalidate_cache() doesn't yield.
> >
> > Either by forbidding to run bdrv_invalidate_cache() in a coroutine and
> > moving the problem to the caller (where and why is it even called from a
> > coroutine?), or possibly by creating a new coroutine for the driver
> > callback and running that in a nested event loop that only handles
> > bdrv_invalidate_cache() callbacks, so that the NBD server doesn't get a
> > chance to process new requests in this thread.
>
> Incoming migration runs in a coroutine (the coroutine entry point is
> process_incoming_migration_co). But everything after qemu_fclose() can
> probably be moved into a separate bottom half, so that it gets out of
> coroutine context.

Alexey, you should probably try this (and add a bdrv_drain_all() in
bdrv_invalidate_cache) rather than messing around with qcow2 locks. This
isn't a problem that can be completely fixed in qcow2.

Kevin
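
For illustration only, the control flow suggested above can be modelled in
a few lines of standalone C. This is a toy sketch, not QEMU code: every
toy_* name is hypothetical, the single-slot queue merely stands in for the
main loop's bottom-half handling, and the in-flight counter stands in for
what a bdrv_drain_all() call would wait for. It only shows that work
scheduled from the coroutine runs later, from the main loop, outside
coroutine context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of the toy_* names below are QEMU APIs. */

typedef void ToyBHFunc(void *opaque);

typedef struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
    bool scheduled;
} ToyBH;

typedef struct ToyMigrationState {
    int inflight_requests;         /* requests still pending on the device */
    int inflight_at_invalidate;    /* value seen when caches were invalidated */
    bool stream_closed;            /* stands in for "qemu_fclose() was called" */
    bool caches_invalidated;
    bool invalidated_in_coroutine; /* true would reproduce the reported problem */
} ToyMigrationState;

static bool toy_in_coroutine;      /* stands in for a coroutine-context check */
static ToyBH toy_pending_bh;       /* single-slot bottom-half queue */

/* Queue a callback to be run by the next main loop iteration. */
static void toy_bh_schedule(ToyBHFunc *cb, void *opaque)
{
    toy_pending_bh.cb = cb;
    toy_pending_bh.opaque = opaque;
    toy_pending_bh.scheduled = true;
}

/* Run the pending bottom half, if any. Returns true if one was run. */
static bool toy_main_loop_iteration(void)
{
    if (!toy_pending_bh.scheduled) {
        return false;
    }
    toy_pending_bh.scheduled = false;
    toy_pending_bh.cb(toy_pending_bh.opaque);
    return true;
}

/* Models "drain all requests, then invalidate the caches". */
static void toy_invalidate_cache_all(ToyMigrationState *s)
{
    s->inflight_requests = 0;                        /* the "drain" step */
    s->inflight_at_invalidate = s->inflight_requests;
    s->invalidated_in_coroutine = toy_in_coroutine;
    s->caches_invalidated = true;
}

/* Bottom half: everything that used to follow the stream close. */
static void toy_incoming_migration_bh(void *opaque)
{
    toy_invalidate_cache_all(opaque);
}

/* Models the coroutine entry point: close the stream, then defer the rest. */
static void toy_incoming_migration_co(ToyMigrationState *s)
{
    toy_in_coroutine = true;
    s->stream_closed = true;
    toy_bh_schedule(toy_incoming_migration_bh, s);
    toy_in_coroutine = false;      /* the coroutine terminates here */
}

/* Returns 1 if the deferred work ran after the coroutine, outside of it. */
int toy_run_demo(void)
{
    ToyMigrationState s = { .inflight_requests = 2 };

    toy_incoming_migration_co(&s);
    if (s.caches_invalidated) {
        return 0;                  /* must not have run inside the coroutine */
    }
    while (toy_main_loop_iteration()) {
        /* keep running bottom halves until none are pending */
    }
    return s.stream_closed && s.caches_invalidated &&
           !s.invalidated_in_coroutine && s.inflight_at_invalidate == 0;
}
```

Calling toy_run_demo() from a main() returns 1 when the invalidation ran
from the main loop rather than from the coroutine. In the real code the
scheduling and the drain would of course go through QEMU's own bottom-half
and block-layer functions, which this sketch does not attempt to reproduce.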