From: "Daniel P. Berrange" <berrange@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, jdenemar@redhat.com,
wangjie88@huawei.com, quintela@redhat.com, peterx@redhat.com,
mreitz@redhat.com, eblake@redhat.com, fuweiwei2@huawei.com
Subject: Re: [Qemu-devel] [PATCH 0/7] migration: pause-before-device
Date: Thu, 12 Oct 2017 11:02:44 +0100
Message-ID: <20171012100244.GG16125@redhat.com>
In-Reply-To: <20171011191317.24157-1-dgilbert@redhat.com>
On Wed, Oct 11, 2017 at 08:13:10PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Hi,
> This set attempts to make a race condition between migration and
> drive-mirror (and other block users) soluble by allowing the migration
> to be paused after the source qemu releases the block devices but
> before the serialisation of the device state.
>
> The symptom of this failure, as reported by Wangjie, is a:
> _co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed
>
> and the source qemu dying; so the problem is pretty nasty.
> This has only been seen on 2.9 onwards, but the theory is that
> prior to 2.9 it might have been happening anyway and we were
> perhaps getting unreported corruptions (lost writes); so this
> really needs fixing.
>
> This flow came from discussions between Kevin and me, and we can't
> see a way of fixing it without exposing a new state to the management
> layer.
>
> The flow is now:
>
> (qemu) migrate_set_capability pause-before-device on
How about 'switchover-cleanup'?
> (qemu) migrate -d ...
> (qemu) info migrate
> ...
> Migration status: pause-before-device
and 'switchover'
> ...
> << issue commands to clean up any block jobs>>
>
> (qemu) migrate_continue pause-before-device
> (qemu) info migrate
> ...
> Migration status: completed
>
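For anyone wiring this up from a management layer, the quoted HMP flow might look roughly like the sketch below when expressed as QMP messages. This is only an illustration: the capability and state names are the ones from the cover letter (still under discussion), `migrate-continue` taking a `state` argument is an assumption based on the HMP example, and the helper names `qmp`, `switchover_commands` and `is_paused` are hypothetical.

```python
# Sketch only: builds the QMP messages a management client might send.
# The capability/state names come from the cover letter and may change;
# the 'state' argument of migrate-continue is an assumption.
PAUSE_CAP = "pause-before-device"
PAUSE_STATE = "pause-before-device"


def qmp(execute, **arguments):
    """Build one QMP command as a plain dict."""
    cmd = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return cmd


def switchover_commands(uri):
    """Return (commands sent before the pause, commands sent after it)."""
    before = [
        qmp("migrate-set-capabilities",
            capabilities=[{"capability": PAUSE_CAP, "state": True}]),
        qmp("migrate", uri=uri),
    ]
    # Issued only once block jobs have been cleaned up.
    after = [qmp("migrate-continue", state=PAUSE_STATE)]
    return before, after


def is_paused(query_migrate_reply):
    """True if a query-migrate reply reports the paused state."""
    status = query_migrate_reply.get("return", {}).get("status")
    return status == PAUSE_STATE
```

The client would send the first list, poll `query-migrate` until `is_paused` is true, clean up its block jobs, and then send the second list.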
> This set has been _very_ lightly tested just at the normal migration
> code, without the addition of the drive mirror; so this is a first
> cut. I'd appreciate some feedback from libvirt on whether the interface
> is OK and ideally a hack to test it in a full libvirt setup to see
> if we hit any other issues.
>
> The precopy flow is:
> active->pause-before-device->completed
>
> The postcopy flow is:
> active->pause-before-device->postcopy-active->completed
>
> Although the behaviour with postcopy only gets interesting when
> we add something like Max's active-sync.
>
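A management layer that wants to sanity-check the status values it observes could encode the two quoted flows as a small transition table. The sketch below assumes only the states listed in the cover letter; failure and cancel paths are deliberately left out, and the helper names are hypothetical.

```python
# Sketch only: the two flows quoted above, as ordered lists of statuses.
# Error/cancel states are not modelled here.
PRECOPY = ["active", "pause-before-device", "completed"]
POSTCOPY = ["active", "pause-before-device", "postcopy-active", "completed"]


def allowed_transitions(*flows):
    """Collect every (from, to) pair that appears in the given flows."""
    edges = set()
    for flow in flows:
        edges.update(zip(flow, flow[1:]))
    return edges


def is_valid(observed, edges):
    """Check that each consecutive pair of observed statuses is allowed."""
    return all(step in edges for step in zip(observed, observed[1:]))
```

With both flows loaded, a sequence that jumps straight from `active` to `completed` is rejected, which is the behaviour one would expect while the capability is enabled.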
> Please argue about the command and state naming.
Argued above :-)
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|