qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Alexey Kardashevskiy <aik@ozlabs.ru>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Max Reitz <mreitz@redhat.com>,
	"libvir-list@redhat.com" <libvir-list@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed
Date: Wed, 17 Sep 2014 09:17:20 -0600	[thread overview]
Message-ID: <5419A600.1070209@redhat.com> (raw)
In-Reply-To: <CAJSP0QXq_OY7+UOPO4c1_uu4RsOoWzPVncZ6JDcGF1OYcUNijg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2803 bytes --]

[adding libvirt list]

On 09/17/2014 09:04 AM, Stefan Hajnoczi wrote:
> On Wed, Sep 17, 2014 at 10:25 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Il 17/09/2014 11:06, Stefan Hajnoczi ha scritto:
>>> I think the fundamental problem here is that the mirror block job
>>> on the source host does not synchronize with live migration.
>>>
>>> Remember the mirror block job iterates on the dirty bitmap
>>> whenever it feels like.
>>>
>>> There is no guarantee that the mirror block job has quiesced before
>>> migration handover takes place, right?
>>
>> Libvirt does that.  Migration is started only once storage mirroring
>> is out of the bulk phase, and the handover looks like:
>>
>> 1) migration completes
>>
>> 2) because the source VM is stopped, the disk has quiesced on the source
> 
> But the mirror block job might still be writing out dirty blocks.
> 
>> 3) libvirt sends block-job-complete
> 
> No, it sends block-job-cancel after the source QEMU's migration has
> completed.  See the qemuMigrationCancelDriveMirror() call in
> src/qemu/qemu_migration.c:qemuMigrationRun().
> 
>> 4) libvirt receives BLOCK_JOB_COMPLETED.  The disk has now quiesced on
>> the destination as well.
> 
> I don't see where this happens in the libvirt source code.  Libvirt
> doesn't care about block job events for drive-mirror during migration.
> 
> And that's why there could still be I/O going on (since
> block-job-cancel is asynchronous).
> 
>> 5) the VM is started on the destination
>>
>> 6) the NBD server is stopped on the destination and the source VM is quit.
>>
>> It is actually a feature that storage migration is completed
>> asynchronously with respect to RAM migration.  The problem is that
>> qcow2_invalidate_cache happens between (3) and (5), and it doesn't
>> like the concurrent I/O received by the NBD server.
> 
> I agree that qcow2_invalidate_cache() (and any other invalidate cache
> implementations) need to allow concurrent I/O requests.
> 
> Either I'm misreading the libvirt code or libvirt is not actually
> ensuring that the block job on the source has cancelled/completed
> before the guest is resumed on the destination.  So I think there is
> still a bug, maybe Eric can verify this?

You may indeed be correct that libvirt is not waiting long enough for
the block job to be gone on the source before resuming on the
destination.  I didn't write that particular code, so I'm cc'ing the
libvirt list, but I can try and take a look into it, since it's related
to code I've recently touched in getting libvirt to support active layer
block commit.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]

  reply	other threads:[~2014-09-17 15:17 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-15 10:50 [Qemu-devel] migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed Alexey Kardashevskiy
2014-09-16 12:02 ` Alexey Kardashevskiy
2014-09-16 12:10   ` Paolo Bonzini
2014-09-16 12:34     ` Kevin Wolf
2014-09-16 12:35       ` Paolo Bonzini
2014-09-16 12:52         ` Kevin Wolf
2014-09-16 12:59           ` Paolo Bonzini
2014-09-19  8:47             ` Kevin Wolf
2014-09-23  8:47               ` [Qemu-devel] [RFC PATCH] qcow2: Fix race in cache invalidation Alexey Kardashevskiy
2014-09-24  7:30                 ` Alexey Kardashevskiy
2014-09-24  9:48                 ` Kevin Wolf
2014-09-25  8:41                   ` Alexey Kardashevskiy
2014-09-25  8:57                     ` Kevin Wolf
2014-09-25  9:55                       ` Alexey Kardashevskiy
2014-09-25 10:20                         ` Kevin Wolf
2014-09-25 12:29                           ` Alexey Kardashevskiy
2014-09-25 12:39                             ` Kevin Wolf
2014-09-25 14:05                               ` Alexey Kardashevskiy
2014-09-28 11:14                                 ` Alexey Kardashevskiy
2014-09-17  6:46       ` [Qemu-devel] migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed Alexey Kardashevskiy
2014-09-16 14:52     ` Alexey Kardashevskiy
2014-09-17  9:06     ` Stefan Hajnoczi
2014-09-17  9:25       ` Paolo Bonzini
2014-09-17 13:44         ` Alexey Kardashevskiy
2014-09-17 15:07           ` Stefan Hajnoczi
2014-09-18  3:26             ` Alexey Kardashevskiy
2014-09-18  9:56               ` Paolo Bonzini
2014-09-19  8:23                 ` Alexey Kardashevskiy
2014-09-17 15:04         ` Stefan Hajnoczi
2014-09-17 15:17           ` Eric Blake [this message]
2014-09-17 15:53           ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5419A600.1070209@redhat.com \
    --to=eblake@redhat.com \
    --cc=aik@ozlabs.ru \
    --cc=dgilbert@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=libvir-list@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).