qemu-devel.nongnu.org archive mirror
From: Anthony Liguori <anthony@codemonkey.ws>
To: dlaor@redhat.com
Cc: qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Live migration protocol, device features, ABIs and other beasts
Date: Sun, 22 Nov 2009 09:49:26 -0600
Message-ID: <4B095D86.700@codemonkey.ws>
In-Reply-To: <4B0952C9.9010803@redhat.com>

Dor Laor wrote:
> In the last couple of days we discovered some issues regarding stable 
> ABI and the robustness of the live migration protocol. Let's just jump 
> right into it, ordered by complexity:
>
> 1. Control *every* feature exposed to the guest by qemu cmdline:
>
>    While thinking about cross-version migration and reviewing some
>    patches, I noticed that there are many cases where we use feature
>    bits to expose functionality to the guest driver (for example,
>    VIRTIO_BLK_F_BARRIER) but do not control them from the qemu cmdline.
>
>    The result is that a guest running on a newer qemu cannot live
>    migrate to an older qemu that lacks the barrier feature.
>
>    Like this barrier example, there are probably many cases where we
>    keep the device/driver ABI but forget the new/old release ABI.
>
>    The solution here is simpler: every guest-visible change should
>    translate into a cmdline option. This is part of the machine type
>    and in addition should be configurable.
>    It's an issue we should all keep in the back of our heads and bring
>    up whenever a new capability/change is introduced.

s/cmdline/qdev/g and I agree with you.  There's nothing protocol 
specific about this though.
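
A minimal sketch of what that could look like, in Python rather than
actual qdev code. The machine type names, the "barrier" property name and
the helper function are all hypothetical, and the bit position is only
illustrative. The point is that the feature bits offered to the guest are
derived from properties, with per-machine-type defaults that can be
overridden explicitly:

```python
# Illustrative bit position for the barrier feature (assumption).
VIRTIO_BLK_F_BARRIER = 0

# Hypothetical per-machine-type defaults for a guest-visible property.
MACHINE_DEFAULTS = {
    "pc-new": {"barrier": True},
    "pc-old": {"barrier": False},
}


def host_features(machine_type, overrides=None):
    """Compute the feature bits offered to the guest from properties."""
    props = dict(MACHINE_DEFAULTS[machine_type])
    props.update(overrides or {})  # explicit per-device override wins
    features = 0
    if props["barrier"]:
        features |= 1 << VIRTIO_BLK_F_BARRIER
    return features
```

With this shape, a guest started with the older machine type (or with the
property explicitly turned off) never sees the feature, regardless of
which qemu version is running it.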

> 2. Live migration inherent problem.
>
>    Currently, even with VMState, the protocol is not flexible enough.
>    We ran into a problem when we needed to fix the pvclock migration
>    issue. The fix added 2 fields to the save/load state and thus
>    needed a new version number.
>    The trouble is that the load function does not accept sections with
>    versions greater than the one it supports.

This is a feature, not a bug.  You cannot migrate from a newer qemu to 
an older one.  There's simply no way to support this in a sane way.
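
The version check being discussed can be sketched roughly as follows.
This is Python, not the actual savevm code; the "clock" section, its
field names and the table of supported versions are made up for
illustration. Older streams are accepted and missing fields get
defaults, while anything newer than what the loader knows is refused:

```python
class MigrationError(Exception):
    """Raised when an incoming section cannot be loaded safely."""


def load_clock(version_id, fields):
    # Hypothetical device: version 2 added two fields. A version 1
    # stream does not carry them, so they fall back to defaults.
    state = {"ticks": fields["ticks"], "extra_a": 0, "extra_b": 0}
    if version_id >= 2:
        state["extra_a"] = fields["extra_a"]
        state["extra_b"] = fields["extra_b"]
    return state


# section name -> (highest version this build understands, loader)
SUPPORTED = {"clock": (2, load_clock)}


def load_section(name, version_id, fields):
    if name not in SUPPORTED:
        raise MigrationError("unknown section: " + name)
    max_version, loader = SUPPORTED[name]
    if version_id > max_version:
        # Newer than anything we know about: refuse instead of guessing.
        raise MigrationError(
            "%s: version %d > supported %d" % (name, version_id, max_version)
        )
    return loader(version_id, fields)
```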

>    We cannot even create a new 'hack section' for new code since the
>    sections are ordered and expected to be exact match on the
>    destination.
>
>    The result is that new->old migration cannot work. This is not even
>    cross-release! It means that even a small bug fix in the current
>    release prevents live migration between various instances of the
>    code. It forces us to choose between fixing the pvclock migration
>    issue and allowing new->old migration. Another ugly hack is to add
>    a cmdline option that controls this behavior. Still, it's a pain
>    for the mgmt stack and users.

This is a pretty normal policy (backwards compat but not forwards compat).

>    The solution here is more complex. One can claim that we should allow
>    newer sections to be accepted by current code (and send the section
>    size) and send optional sections. This would be a nasty workaround.
>
>    IMHO we should 'specify' the migration protocol and introduce
>    capabilities, feature bits, etc. This way we'll have a robust,
>    extensible protocol that will withstand any potential issue. Both
>    Michael Tsirkin and I suggested it at the time vmstate was
>    introduced. Vmstate is good for the code but it's not a protocol.

I don't see how this fixes anything.  If you used feature bits, how do 
you migrate from a version that has a feature bit that an older version 
doesn't know about?  Do you just ignore it?

Migration needs to be conservative.  There should be only two possible 
outcomes: 1) a successful live migration or 2) graceful failure with the 
source VM still running correctly.  Silently ignoring things that could 
affect the guest's behavior means that it's possible that after failure, 
the guest will fail in an unexpected way.
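
To make the two outcomes concrete, here is a small Python sketch. The
dict-based VM and the function names are hypothetical and this is not
how qemu implements migration. An unknown feature bit causes a refusal
rather than being ignored, and on refusal the source is resumed:

```python
class MigrationRefused(Exception):
    """The destination cannot faithfully represent the incoming state."""


def check_features(offered, known):
    # Any bit the destination does not understand is a hard error,
    # because ignoring it could silently change guest-visible behavior.
    unknown = offered & ~known
    if unknown:
        raise MigrationRefused(f"unknown feature bits: {unknown:#x}")


def migrate(vm, dest_known_features):
    """Return True on success; on failure, leave the source VM running."""
    vm["running"] = False  # stop the source for the final handover
    try:
        check_features(vm["features"], dest_known_features)
    except MigrationRefused:
        vm["running"] = True  # graceful failure: the source resumes
        return False
    return True  # the destination now owns the guest
```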

> Which protocol should we use? You're smarter than me, please suggest
> one.
> wrt the above guest abi issue, we should write a qemu spec with clear 
> definitions for devices, drivers, versions, etc.

I don't think there's a problem with what we have now.  The only thing I 
think we should add is a vendor sub-versioning mechanism.  
Unfortunately, we have downstreams that make lots of changes.  Today, 
since we have a single version space, there are inevitable versioning 
clashes because of the shared namespace.  A sub-versioning mechanism 
would give downstreams a way to backport features and change the device 
models in such a way that the versioning doesn't clash with upstream.

It would also provide a way to determine whether two downstreams are 
compatible with each other, which is a pretty neat concept.

This could be done as a small, incremental change to the current protocol.
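
One possible shape for such a mechanism, sketched in Python. The tuple
layout, the vendor names and the compatibility rule are all assumptions
made for illustration; nothing here is an existing qemu format. Each
section carries the upstream version plus an optional vendor tag and
vendor revision:

```python
from typing import NamedTuple, Optional


class SectionVersion(NamedTuple):
    upstream: int                  # shared upstream version number
    vendor: Optional[str] = None   # None means unmodified upstream
    vendor_rev: int = 0            # vendor-private revision counter


def can_load(incoming, supported):
    """Decide whether a destination can accept an incoming section."""
    if incoming.upstream > supported.upstream:
        return False  # same rule as today: no new->old
    if incoming.vendor is None:
        return True   # pure upstream stream (assumed loadable by vendors)
    if incoming.vendor != supported.vendor:
        return False  # vendor changes are only understood by that vendor
    return incoming.vendor_rev <= supported.vendor_rev
```

Because the vendor revision lives in its own namespace, a downstream
backport bumps only vendor_rev and never collides with a later upstream
version number.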

Regards,

Anthony Liguori

