From: Anthony Liguori <anthony@codemonkey.ws>
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] More robust migration
Date: Fri, 20 Feb 2009 09:15:58 -0600
Message-ID: <499EC92E.9000401@codemonkey.ws>
In-Reply-To: <499EBFD8.50307@amd.com>
Hi Andre,
Andre Przywara wrote:
> Hi,
>
> after fiddling around with migration (and the data dumped into the
> stream) I found the current concept possesses some shortcomings.
Yikes :-) FWIW, I focused a lot on robustness in the implementation, so
hopefully much of what you mention below turns out to be a conscious
decision with very specific reasoning.
> I am interested in your opinions whether it is worth to implement a
> new improved format.
FWIW, the format is sufficiently versioned that it isn't necessary to
completely change it (not that I think it needs changing).
> Issues I would like to address:
> 1. Transfer configuration data. Currently there is no VM configuration
> data transferred with the stream.
Yes, the difficulty here is that we need to transfer the machine
configuration but not the host configuration. Management tools should
decide how to configure the host on the target side but we should be
passing the machine configuration.
If you've been following the config file threads, I've mentioned this as
a use case for the current design a number of times. We would pass a
flattened device tree as another savevm section with a well known name
(like "machine"). Given the semantics of the current migration
protocol, this would ensure that the machine generated on the remote
node was exactly the same as the source node.
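To make the idea concrete, here is a minimal sketch of emitting such a
section ahead of any device state. It is not actual QEMU code: the Stream
type, the put_* helpers, and the header layout (name length, name, 32-bit
version) are assumptions invented for illustration. Only the section name
"machine" comes from the proposal above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical append-only output stream; NOT QEMU's real stream type. */
typedef struct {
    uint8_t buf[256];
    size_t  len;
} Stream;

static void put_bytes(Stream *s, const void *p, size_t n)
{
    assert(s->len + n <= sizeof(s->buf));
    memcpy(s->buf + s->len, p, n);
    s->len += n;
}

static void put_be32(Stream *s, uint32_t v)
{
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
                     (uint8_t)(v >> 8),  (uint8_t)v };
    put_bytes(s, b, sizeof(b));
}

/* Assumed header layout: 1-byte name length, name, big-endian version. */
static void put_section_header(Stream *s, const char *name, uint32_t version)
{
    size_t n = strlen(name);
    assert(n <= 255);
    uint8_t len = (uint8_t)n;
    put_bytes(s, &len, 1);
    put_bytes(s, name, n);
    put_be32(s, version);
}

/* Emit the machine description under the well-known name "machine". */
static void save_machine(Stream *s, const void *fdt, uint32_t fdt_size)
{
    put_section_header(s, "machine", 1);
    put_be32(s, fdt_size);        /* payload-internal size, sketch only */
    put_bytes(s, fdt, fdt_size);  /* the flattened device tree blob */
}
```

The receiving side would look the section up by name like any other, so
the machine description gets the same version checking as device state.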
> One has to start QEMU/KVM with the _exact_ same parameters on the
> other side to allow migration. If there would be a pseudo-device
> (transferred first) holding these parameters (and other runtime
> dependent stuff like kvm_enabled()) this would ease migration a lot.
FWIW, there's nothing preventing migrating from TCG -> KVM.
One can debate whether host config should be migrated too. I'd argue
that in the core migration protocol, host config should not be present.
You can have an easier-to-use migration protocol (like the old ssh
protocol) that also transfers host config. But in the general case,
you want management tools to be able to manipulate host config upon
migration.
> 2. Introduce a length field to the header of each device.
IMHO, this would reduce robustness. It's also difficult because of the
way savevm registration works. You don't know how large a section is
until it's written and migration streams are not seekable.
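A rough illustration of why that is awkward, assuming a hypothetical
append-only Buf type and a SaveFn callback (neither is QEMU's real
interface): since the stream cannot be rewound, the only way to emit the
length first is to run the save callback to completion into a staging
buffer and copy the result afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical append-only byte buffer, standing in for the stream. */
typedef struct {
    uint8_t buf[128];
    size_t  len;
} Buf;

static void buf_put(Buf *b, const void *p, size_t n)
{
    assert(b->len + n <= sizeof(b->buf));
    memcpy(b->buf + b->len, p, n);
    b->len += n;
}

/* A device save callback writes however many bytes it needs. */
typedef void (*SaveFn)(Buf *out, void *opaque);

/*
 * Length-prefixed variant: nothing reaches the wire until the callback
 * has finished, because the length is only known at that point.
 */
static void save_with_length(Buf *wire, SaveFn save, void *opaque)
{
    Buf staging = { .len = 0 };
    save(&staging, opaque);
    uint8_t len = (uint8_t)staging.len;  /* staging holds <= 128 bytes */
    buf_put(wire, &len, 1);
    buf_put(wire, staging.buf, staging.len);
}

/* Made-up device with two 1-byte registers, for demonstration only. */
static void save_demo_device(Buf *out, void *opaque)
{
    const uint8_t *regs = opaque;
    buf_put(out, regs, 2);
}
```

For a small device the extra copy is cheap. For a section whose size
depends on how much state it has to send, the staging buffer has to grow
to hold all of it before a single byte goes out.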
> This would allow to skip unknown (or unwanted) devices.
No good can come from this. If you have an unknown section, you must
raise an error and stop the migration. What if this is for a device
that the guest is interacting with? The device just disappears after
migration? All savevm state is state that affects the functionality of
a guest. Throwing away this state will change the functionality of the
VM and migration should not affect guest functionality.
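As a sketch of the strict behaviour I mean, assuming a made-up handler
table (the Handler struct, the error codes, and load_section are
illustrative names, not QEMU's): a section is either matched by name and
accepted at a version the target understands, or the whole load fails.
There is deliberately no "skip" path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int (*LoadFn)(const uint8_t *data, size_t len);

/* Hypothetical registration entry for one section type. */
typedef struct {
    const char *name;
    uint32_t    max_version;  /* newest version this target can load */
    LoadFn      load;
} Handler;

enum { LOAD_OK = 0, ERR_UNKNOWN_SECTION = -1, ERR_VERSION_TOO_NEW = -2 };

/* Returns LOAD_OK only if the section is known and its version is supported. */
static int load_section(const Handler *table, size_t n,
                        const char *name, uint32_t version,
                        const uint8_t *data, size_t len)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            if (version > table[i].max_version) {
                return ERR_VERSION_TOO_NEW;
            }
            return table[i].load(data, len);
        }
    }
    /* Unknown section: fail the migration instead of dropping state. */
    return ERR_UNKNOWN_SECTION;
}

/* Trivial loader used only to exercise the lookup. */
static int load_noop(const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    return LOAD_OK;
}
```

With a table that only knows "timer" up to version 2, a "timer" section
at version 3 or an "hpet" section at any version aborts the load, and the
source keeps running the guest.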
> I know this imposes a bit of a challenge, because the length is not
> always known in advance, but one could overcome this (by using the
> buffer to patch in the length later for instance).
What are the use cases where you think this would be beneficial? I
really see the change in semantics from the old way (throwing away
unknown sections) to the new way (requiring strict versioning and
validating all sections) as being a huge step toward robustness.
>
> 3. Make the device versioning really bulletproof. Currently some
> devices dump different data depending on runtime (or better
> time-of-creation) state (for instance hw/i8254.c: if (s->irq_timer)...).
If you look carefully, s->irq_timer will always be set. The checks are
unnecessary.
> Another example is the (x86?) CPU state, which differs with KVM
> en/disabled.
Not in upstream QEMU...
> Some devices even dump host system dependent structures (like struct
> iovec in virtio-blk.c).
That is awful and needs to be fixed. It should never have been
committed like that.
>
> Also one could create some kind of (limited) upward compatibility, so
> older QEMU versions ignore additional, but optional fields in a device
> state (similar to the ext2 compatibility scheme). Maybe this could be
> done by an external converter program.
To me, ignoring is always a bad thing. It's almost always going to be
unsafe. Doesn't this decrease robustness by being less conservative?
> 4. Allow optional devices. Some devices are always started (like
> HPET), although they don't need to be used by the OS. If one migrates
> such a guest from say KVM-83 to KVM-81, it will fail, because KVM-81
> does not support HPET. One could migrate the device only if it has
> been used.
There's no way you can migrate from KVM-83 to KVM-81 if you've enabled
the HPET. It cannot be made to work.
There is a -no-hpet option though. If you are a management tool that
needs to support migration from multiple versions, you should use
-no-hpet. Also, if you need to migrate from KVM-81 to KVM-83, you
should use -no-hpet with KVM-83 to avoid changing the guest visible state.
In the long run, the machine configuration file will address this in a
more thorough manner. FWIW, -no-hpet was added specifically to deal
with migration.
> In general I would like to know whether QEMU migration is intended to
> be used in such a flexible manner or whether the requirement of the
> exact same software version on both side is not a limitation in
> everyday use.
My primary goal for migration is robustness. I do not think it's a good
idea to support any circumstances that could introduce changes in guest
visible state during a live migration.
Live migration is a critical feature for many production environments.
To be useful IMHO, it has to be bullet-proof.
Regards,
Anthony Liguori
> Awaiting your comments!
>
> Regards,
> Andre.
>