From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:45861)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1TGVgo-0007yv-Tl
	for qemu-devel@nongnu.org; Tue, 25 Sep 2012 10:00:50 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1TGVgW-0000PM-TQ
	for qemu-devel@nongnu.org; Tue, 25 Sep 2012 10:00:42 -0400
Received: from mx1.redhat.com ([209.132.183.28]:29937)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from ) id 1TGVgW-0000Og-Kk
	for qemu-devel@nongnu.org; Tue, 25 Sep 2012 10:00:24 -0400
Message-ID: <5061B8E5.7070000@redhat.com>
Date: Tue, 25 Sep 2012 16:00:05 +0200
From: Kevin Wolf
MIME-Version: 1.0
References: <5055A643.8060505@dlhnet.de> <5056E221.8020106@redhat.com>
	<5057842F.6090506@dlhnet.de> <50584CC6.2030207@dlhnet.de>
	<50584D84.2080802@redhat.com> <50595CFE.7050208@dlhnet.de>
In-Reply-To: <50595CFE.7050208@dlhnet.de>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Peter Lieven
Cc: Paolo Bonzini, "qemu-devel@nongnu.org", "kvm@vger.kernel.org"

On 19.09.2012 07:49, Peter Lieven wrote:
> On 09/18/12 12:31, Kevin Wolf wrote:
>> On 18.09.2012 12:28, Peter Lieven wrote:
>>> On 09/17/12 22:12, Peter Lieven wrote:
>>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>>> Hi,
>>>>>>
>>>>>> when trying to block migrate a VM from one node to another, the source
>>>>>> VM crashed with the following assertion:
>>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>>
>>>>>> Is this something already addressed/known?
>>>>> Not that I'm aware of, at least.
>>>>>
>>>>> Block migration doesn't seem to check whether the device is already in
>>>>> use, maybe this is the problem. Not sure why it would be in use, though,
>>>>> and in my quick test it didn't crash.
>>>>>
>>>>> So we need some more information: What's your command line, did you do
>>>>> anything specific in the monitor with block devices, what does the
>>>>> stacktrace look like, etc.?
>>>> kevin, it seems that i can very easily force a crash if I cancel a
>>>> running block migration.
>>> if I understand correctly what happens, there are aio callbacks coming in
>>> after blk_mig_cleanup() has been called.
>>>
>>> what is the proper way to detect this in blk_mig_read_cb()?
>> You could try this, it doesn't detect the situation in
>> blk_mig_read_cb(), but ensures that all callbacks happen before we do
>> the actual cleanup (completely untested):
> after testing it for half an hour i can say, it seems to fix the problem.
> no segfaults and also no other assertions.
>
> while searching I have seen that the queues blk_list and bmds_list are
> initialized at qemu startup. wouldn't it be better to initialize them at
> init_blk_migration or at least check that they are really empty? i have
> also seen that prev_time_offset is not initialized.

Probably. If you sent this as a proper patch with a SoB, I wouldn't
reject it, but considering that block migration is deprecated anyway, I
won't bother myself as long as there's no real bug.

Kevin
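
For readers of the archive, the idea discussed in the thread (let every
outstanding read callback run before the cleanup, so that none arrives
afterwards) can be illustrated with a small toy model. This is only a
sketch under stated assumptions and is not the patch from this thread:
ToyDevice, ToyMigration and all toy_* functions are hypothetical names
invented for illustration, and the "callbacks" are simulated
synchronously instead of coming from a real AIO layer. Only the shape of
the assertion (the in_use flag must actually change) is taken from the
report above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT QEMU's real types or functions. */
typedef struct ToyDevice {
    bool in_use;
} ToyDevice;

typedef struct ToyMigration {
    ToyDevice *dev;
    int in_flight;    /* reads submitted whose callback has not run yet */
    int completed;    /* reads whose callback has run */
    bool cleaned_up;  /* set once cleanup has happened */
} ToyMigration;

/* Same shape as the reported assertion: the flag must really change. */
static void toy_set_in_use(ToyDevice *dev, bool in_use)
{
    assert(dev->in_use != in_use);
    dev->in_use = in_use;
}

/* Initialise all state when a migration starts, not at program startup. */
static void toy_init(ToyMigration *m, ToyDevice *dev)
{
    m->dev = dev;
    m->in_flight = 0;
    m->completed = 0;
    m->cleaned_up = false;
    toy_set_in_use(dev, true);
}

static void toy_submit_read(ToyMigration *m)
{
    m->in_flight++;
}

/* Completion callback: it must never run after cleanup. */
static void toy_read_cb(ToyMigration *m)
{
    assert(!m->cleaned_up);
    m->in_flight--;
    m->completed++;
}

/* Simulated drain: deliver every outstanding completion. */
static void toy_drain(ToyMigration *m)
{
    while (m->in_flight > 0) {
        toy_read_cb(m);
    }
}

/* Cancel: drain first, then clean up, so no callback can arrive late. */
static void toy_cancel(ToyMigration *m)
{
    toy_drain(m);
    toy_set_in_use(m->dev, false);
    m->cleaned_up = true;
}
```

In this model, calling toy_cancel() with reads still in flight is safe
because the drain runs first. Dropping the toy_drain() call and then
delivering a late toy_read_cb() would trip the assert in the callback,
which is roughly the class of problem described for the cancel case.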