qemu-devel.nongnu.org archive mirror
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Laine Stump <laine@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>,
	Peter Krempa <pkrempa@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Libvirt <libvir-list@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, Jiri Denemark <jdenemar@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH] failover: allow to pause the VM during the migration
Date: Fri, 1 Oct 2021 10:01:43 +0100
Message-ID: <YVbOd7dYNst3BBfA@redhat.com>
In-Reply-To: <f1898bf0-dadb-4e2e-a45a-9087d2c63678@redhat.com>

On Thu, Sep 30, 2021 at 04:17:44PM -0400, Laine Stump wrote:
> On 9/30/21 1:09 PM, Laurent Vivier wrote:
> > If we want to save a snapshot of a VM to a file, we used to follow the
> > following steps:
> > 
> > 1- stop the VM:
> >     (qemu) stop
> > 
> > 2- migrate the VM to a file:
> >     (qemu) migrate "exec:cat > snapshot"
> > 
> > 3- resume the VM:
> >     (qemu) cont
> > 
> > After that we can restore the snapshot with:
> >    qemu-system-x86_64 ... -incoming "exec:cat snapshot"
> >    (qemu) cont
> 
> This is the basics of what libvirt does for a snapshot, and steps 1+2 are
> what it does for a "managedsave" (where it saves the snapshot to disk and
> then terminates the qemu process, for later re-animation).
> 
> In those cases, it seems like this new parameter could work for us - instead
> of explicitly pausing the guest prior to migrating it to disk, we would set
> this new parameter to on, then directly migrate-to-disk (relying on qemu to
> do the pause). Care will need to be taken to assure that error recovery
> behaves the same though.

What libvirt does is actually quite different from this in a significant
way.  In the HMP example here 'migrate' is a blocking command that does
not return until migration is finished.

Libvirt uses QMP, and 'migrate' there is an asynchronous command that merely
launches the migration and returns control to the client.

IOW, what libvirt does is

    stop
    migrate
    while status != failed && status != completed
       query-migrate
       
       ...also receive any QMP migration events...

       ...possibly modify migration parameters...

    cont
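
As a minimal sketch of that loop in Python: `qmp` here is a hypothetical
helper (not a libvirt or QEMU API) that sends one QMP command and returns
the contents of its "return" member. The command names and the "status"
field of query-migrate are the QMP ones; the polling interval and the
handling of "cancelled" as a terminal state are assumptions of the sketch.

```python
import time
from typing import Callable

# Values of query-migrate's "status" after which the loop should end.
TERMINAL = {"completed", "failed", "cancelled"}


def save_while_paused(qmp: Callable[..., dict], uri: str, poll: float = 0.1) -> str:
    """Pause the guest, start the migration, poll until it ends, resume.

    `qmp(command, **arguments)` is a hypothetical transport helper.
    """
    qmp("stop")
    # Over QMP, 'migrate' only launches the migration and returns at once.
    qmp("migrate", uri=uri)

    status = None
    while status not in TERMINAL:
        status = qmp("query-migrate").get("status")
        if status not in TERMINAL:
            time.sleep(poll)

    qmp("cont")
    return status
```

A real client would also consume QMP migration events rather than rely on
polling alone, and may adjust migration parameters inside the loop.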

With this pattern I'm not seeing any need for a new migration parameter
for libvirt. The migration status lets us distinguish when QEMU is in
the "waiting for unplug" phase vs the "active" phase. So AFAICT, libvirt
can do:

    migrate
    while status != failed && status != completed
       query-migrate

       ...also receive any QMP migration events...

       if status changed from wait-for-unplug to active
         stop

       ...possibly modify migration parameters...

    cont
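
The revised flow above can be sketched in Python as follows. As before,
`qmp` is a hypothetical helper that sends one QMP command and returns its
"return" member. The sketch assumes the unplug phase is reported by
query-migrate as status "wait-unplug" (called "wait-for-unplug" above), and
it only polls, so it can miss a transition that happens between two polls;
reacting to QMP migration events would narrow that gap.

```python
import time
from typing import Callable

# Values of query-migrate's "status" after which the loop should end.
TERMINAL = {"completed", "failed", "cancelled"}


def save_with_deferred_stop(qmp: Callable[..., dict], uri: str, poll: float = 0.1) -> str:
    """Start the migration first, and pause the guest only once the
    unplug phase has finished and the migration has become active."""
    qmp("migrate", uri=uri)

    previous = None
    stopped = False
    while True:
        status = qmp("query-migrate").get("status")
        # Guest CPUs must keep running while QEMU waits for the unplug,
        # so 'stop' is issued only on the wait-unplug -> active change.
        if previous == "wait-unplug" and status == "active" and not stopped:
            qmp("stop")
            stopped = True
        if status in TERMINAL:
            break
        previous = status
        time.sleep(poll)

    qmp("cont")
    return status
```

Between the status change and the 'stop' taking effect the guest is still
running, which is the small window discussed next.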


There is a small window here when the guest CPUs are running
but migration is active.  In most cases for libvirt that is
harmless.  If there are cases where libvirt needs a strong
guarantee to synchronize the 'stop' with some other option,
then the newly proposed "pause-vm" parameter has the same problem,
as libvirt can't synchronize against that either.


> There are a couple of cases when libvirt apparently *doesn't* pause the
> guest during the migrate-to-disk, both having to do with saving a coredump
> of the guest. Since I really have no idea of how common/important that is
> (or even if my assessment of the code is correct), I'm Cc'ing this patch to
> libvir-list to make sure it catches the attention of someone who knows the
> answers and implications.

IIUC, the problem with unplug only happens when libvirt pauses
the guest. So surely if there are some scenarios where we're not
pausing the guest, there's no problem to solve for those.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



