From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Lieven
Subject: Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
Date: Wed, 19 Sep 2012 07:49:50 +0200
Message-ID: <50595CFE.7050208@dlhnet.de>
References: <5055A643.8060505@dlhnet.de> <5056E221.8020106@redhat.com> <5057842F.6090506@dlhnet.de> <50584CC6.2030207@dlhnet.de> <50584D84.2080802@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "qemu-devel@nongnu.org", "kvm@vger.kernel.org", Paolo Bonzini
To: Kevin Wolf
Return-path:
Received: from ssl.dlhnet.de ([91.198.192.8]:33670 "EHLO ssl.dlh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753487Ab2ISFtw (ORCPT ); Wed, 19 Sep 2012 01:49:52 -0400
In-Reply-To: <50584D84.2080802@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 09/18/12 12:31, Kevin Wolf wrote:
> On 18.09.2012 12:28, Peter Lieven wrote:
>> On 09/17/12 22:12, Peter Lieven wrote:
>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>> Hi,
>>>>>
>>>>> when trying to block migrate a VM from one node to another, the source
>>>>> VM crashed with the following assertion:
>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>
>>>>> Is this something already addressed/known?
>>>> Not that I'm aware of, at least.
>>>>
>>>> Block migration doesn't seem to check whether the device is already in
>>>> use, maybe this is the problem. Not sure why it would be in use, though,
>>>> and in my quick test it didn't crash.
>>>>
>>>> So we need some more information: What's your command line, did you do
>>>> anything specific in the monitor with block devices, what does the
>>>> stacktrace look like, etc.?
>>> kevin, it seems that i can very easily force a crash if I cancel a
>>> running block migration.
>> if I understand correctly what happens, there are aio callbacks coming in
>> after blk_mig_cleanup() has been called.
>>
>> what is the proper way to detect this in blk_mig_read_cb()?
> You could try this, it doesn't detect the situation in
> blk_mig_read_cb(), but ensures that all callbacks happen before we do
> the actual cleanup (completely untested):

after testing it for half an hour i can say it seems to fix the problem.
no segfaults and also no other assertions.

while searching I have seen that the queues blk_list and bmds_list are
initialized at qemu startup. wouldn't it be better to initialize them in
init_blk_migration, or at least check that they are really empty?
i have also seen that prev_time_offset is not initialized.

thank you,
peter

something like this:

--- qemu-kvm-1.2.0/block-migration.c.orig 2012-09-17 21:14:44.458429855 +0200
+++ qemu-kvm-1.2.0/block-migration.c 2012-09-17 21:15:40.599736962 +0200
@@ -311,8 +311,12 @@ static void init_blk_migration(QEMUFile
     block_mig_state.prev_progress = -1;
     block_mig_state.bulk_completed = 0;
     block_mig_state.total_time = 0;
+    block_mig_state.prev_time_offset = 0;
     block_mig_state.reads = 0;
 
+    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
+    QSIMPLEQ_INIT(&block_mig_state.blk_list);
+
     bdrv_iterate(init_blk_migration_it, NULL);
 }
 
@@ -760,9 +764,6 @@ SaveVMHandlers savevm_block_handlers = {
 
 void blk_mig_init(void)
 {
-    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
-    QSIMPLEQ_INIT(&block_mig_state.blk_list);
-
     register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
                          &block_mig_state);
 }

> diff --git a/block-migration.c b/block-migration.c
> index 7def8ab..ed93301 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -519,6 +519,8 @@ static void blk_mig_cleanup(void)
>      BlkMigDevState *bmds;
>      BlkMigBlock *blk;
>
> +    bdrv_drain_all();
> +
>      set_dirty_tracking(0);
>
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
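
for illustration only, here is a minimal standalone sketch of why draining before cleanup avoids the late callbacks. it is not QEMU code: PendingRead, submit_read(), drain_all() and mig_cleanup() are hypothetical names, the "event loop" is just a linked list, and drain_all() is only a toy analogue of what I assume bdrv_drain_all() achieves (all outstanding completions run before it returns).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an in-flight AIO read; not a QEMU type. */
typedef struct PendingRead {
    struct PendingRead *next;
    void (*cb)(void *opaque);
    void *opaque;
} PendingRead;

static PendingRead *pending_head; /* simulated queue of outstanding completions */
static int submitted;             /* reads issued but not yet completed */
static int in_use;                /* toy counterpart of the in_use flag */

/* Completion callback: it must never run once cleanup has finished. */
static void read_cb(void *opaque)
{
    (void)opaque;
    assert(in_use);
    submitted--;
}

/* Queue one simulated read whose completion is delivered later. */
static void submit_read(void)
{
    PendingRead *r = malloc(sizeof(*r));
    if (!r) {
        abort();
    }
    r->cb = read_cb;
    r->opaque = NULL;
    r->next = pending_head;
    pending_head = r;
    submitted++;
}

/* Toy drain: deliver every outstanding completion before returning. */
static void drain_all(void)
{
    while (pending_head) {
        PendingRead *r = pending_head;
        pending_head = r->next;
        r->cb(r->opaque);
        free(r);
    }
}

/* Cleanup drains first, so no callback can arrive after state is torn down. */
static void mig_cleanup(void)
{
    drain_all();
    assert(submitted == 0);
    in_use = 0;
}
```

setting in_use to 1, calling submit_read() a few times and then mig_cleanup() leaves submitted at 0 and the pending list empty. if the drain_all() call were dropped from mig_cleanup(), the assertion on submitted would fire, which is the toy version of callbacks arriving after blk_mig_cleanup().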