From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5419AE8D.3070007@redhat.com>
Date: Wed, 17 Sep 2014 17:53:49 +0200
From: Paolo Bonzini
References: <5416C46D.7040105@ozlabs.ru> <541826CA.7050607@ozlabs.ru>
 <541828BF.8090301@redhat.com> <20140917090615.GB10699@stefanha-thinkpad.redhat.com>
 <54195395.9010201@redhat.com>
Subject: Re: [Qemu-devel] migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed
To: Stefan Hajnoczi
Cc: Kevin Wolf, Alexey Kardashevskiy, Michal Privoznik,
 "qemu-devel@nongnu.org", "Dr. David Alan Gilbert", Stefan Hajnoczi,
 Max Reitz

On 17/09/2014 17:04, Stefan Hajnoczi wrote:
> On Wed, Sep 17, 2014 at 10:25 AM, Paolo Bonzini wrote:
>> On 17/09/2014 11:06, Stefan Hajnoczi wrote:
>>> I think the fundamental problem here is that the mirror block job
>>> on the source host does not synchronize with live migration.
>>>
>>> Remember the mirror block job iterates on the dirty bitmap
>>> whenever it feels like.
>>>
>>> There is no guarantee that the mirror block job has quiesced before
>>> migration handover takes place, right?
>>
>> Libvirt does that.  Migration is started only once storage mirroring
>> is out of the bulk phase, and the handover looks like:
>>
>> 1) migration completes
>>
>> 2) because the source VM is stopped, the disk has quiesced on the source
>
> But the mirror block job might still be writing out dirty blocks.

Right, but it quiesces after (3).

>> 3) libvirt sends block-job-complete
>
> No, it sends block-job-cancel after the source QEMU's migration has
> completed.  See the qemuMigrationCancelDriveMirror() call in
> src/qemu/qemu_migration.c:qemuMigrationRun().

No problem, block-job-cancel and block-job-complete are the same except
for pivoting to the destination.

>> 4) libvirt receives BLOCK_JOB_COMPLETED.  The disk has now quiesced on
>> the destination as well.
>
> I don't see where this happens in the libvirt source code.  Libvirt
> doesn't care about block job events for drive-mirror during migration.
>
> And that's why there could still be I/O going on (since
> block-job-cancel is asynchronous).

Oops, this would be a bug!  block-job-complete and block-job-cancel are
asynchronous.  CCing Michal Privoznik who wrote the libvirt code.

Paolo

>> 5) the VM is started on the destination
>>
>> 6) the NBD server is stopped on the destination and the source VM is quit.
>>
>> It is actually a feature that storage migration is completed
>> asynchronously with respect to RAM migration.  The problem is that
>> qcow2_invalidate_cache happens between (3) and (5), and it doesn't
>> like the concurrent I/O received by the NBD server.
>
> I agree that qcow2_invalidate_cache() (and any other invalidate cache
> implementations) need to allow concurrent I/O requests.
>
> Either I'm misreading the libvirt code or libvirt is not actually
> ensuring that the block job on the source has cancelled/completed
> before the guest is resumed on the destination.  So I think there is
> still a bug, maybe Eric can verify this?
>
> Stefan
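[A toy Python model of the race discussed above, for readers following along.
This is not QEMU or libvirt code; all class and method names are hypothetical.
It only illustrates why an asynchronous block-job-cancel means the source's
mirror job may still be flushing dirty blocks after the QMP command returns,
unless the caller waits for the job-ended event before resuming the guest on
the destination.]

```python
import threading
import time

class MirrorJob:
    """Hypothetical stand-in for QEMU's drive-mirror block job."""
    def __init__(self, dirty_blocks):
        self.dirty = list(dirty_blocks)   # toy dirty bitmap
        self.flushed = []
        self.ended = threading.Event()    # analogue of BLOCK_JOB_CANCELLED
        self._cancel = threading.Event()

    def run(self):
        # Drain the dirty bitmap; a cancel request only takes effect
        # between iterations, mirroring the asynchronous QMP command.
        while self.dirty and not self._cancel.is_set():
            time.sleep(0.01)              # simulated I/O latency
            self.flushed.append(self.dirty.pop(0))
        self.ended.set()

    def cancel(self):
        # Analogue of block-job-cancel: returns immediately, the job
        # has NOT necessarily quiesced when this call comes back.
        self._cancel.set()

job = MirrorJob(range(5))
worker = threading.Thread(target=job.run)
worker.start()

job.cancel()                  # asynchronous: I/O may still be in flight
# job.ended.is_set() may be False right here -- that is the race.

job.ended.wait()              # the missing step: wait for the job event
worker.join()
assert job.ended.is_set()     # only now has the source disk truly quiesced
```

In this sketch, checking `job.ended.is_set()` immediately after `cancel()`
is the libvirt behavior being questioned; the `job.ended.wait()` line plays
the role of libvirt waiting for the block-job event before starting the VM
on the destination.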