From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Liran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	kvm@vger.kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org,
	rth@twiddle.net, jmattson@google.com
Subject: Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state
Date: Thu, 1 Nov 2018 19:07:31 +0000
Message-ID: <20181101190730.GF2726@work-vm>
In-Reply-To: <13842591-5801-493B-A82D-FE44610FE5A1@oracle.com>

* Liran Alon (liran.alon@oracle.com) wrote:
> 
> 
> > On 1 Nov 2018, at 17:56, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> > * Liran Alon (liran.alon@oracle.com) wrote:
> >> 
> >> 
> >>> On 1 Nov 2018, at 15:10, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>> 
> >>> * Liran Alon (liran.alon@oracle.com) wrote:
> >>>> 
> >>>> 
> >>>>> On 31 Oct 2018, at 20:59, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>>>> 
> >>>>> * Liran Alon (liran.alon@oracle.com) wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> On 31 Oct 2018, at 20:19, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>>>>> 
> >>>>>>> On 31/10/2018 19:17, Eduardo Habkost wrote:
> >>>>>>>> On Wed, Oct 31, 2018 at 03:03:34AM +0200, Liran Alon wrote:
> >>>>>>>>> Ping.
> >>>>>>>>> Patch was submitted almost two months ago and I haven’t seen any response for the v2 of this series.
> >>>>>>>> Sorry for the long delay.  This was on my queue of patches to be
> >>>>>>>> reviewed, but I'm failing to keep up with the rate of incoming
> >>>>>>>> patches.  I will try to review the series next week.
> >>>>>>> 
> >>>>>>> I have already reviewed it; unfortunately I have missed the soft freeze
> >>>>>>> for posting the version I had also been working on when Liran posted
> >>>>>>> these patches.
> >>>>>>> 
> >>>>>>> Paolo
> >>>>>> 
> >>>>>> Paolo, note that this is v2 of this patch series. It’s not the one you have already reviewed.
> >>>>>> It now correctly handles the case you mentioned in review of v1 of migrating with various nested_state buffer sizes.
> >>>>>> The following scenarios were tested:
> >>>>>> (a) src and dest have same nested state size.
> >>>>>> 	==> Migration succeeds.
> >>>>>> (b) src doesn't have nested state while dest does.
> >>>>>> 	==> Migration succeeds and src doesn't send its nested state.
> >>>>>> (c) src has nested state while dest doesn't.
> >>>>>> 	==> Migration fails as it cannot restore nested state.
> >>>>>> (d) dest has bigger max nested state size than src.
> >>>>>> 	==> Migration succeeds.
> >>>>>> (e) dest has smaller max nested state size than src but enough to store its saved nested state.
> >>>>>> 	==> Migration succeeds.
> >>>>>> (f) dest has smaller max nested state size than src but not enough to store its saved nested state.
> >>>>>> 	==> Migration fails.
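
Scenarios (a) to (f) above appear to reduce to a single comparison. The
sketch below assumes the only inputs are the length of the nested state
the source actually saved and the maximum the destination kernel
reports. The enum and function names are hypothetical and not taken from
the patch. A length of 0 stands for "no nested state" or "not
supported".

```c
#include <stddef.h>

/* Hypothetical outcome type for illustration; not a QEMU definition. */
enum nested_migration_result {
    NESTED_MIGRATION_OK,
    NESTED_MIGRATION_FAIL,
};

/*
 * src_state_len: bytes of nested state the source saved (0 if none).
 * dst_max_len:   max nested state size reported at destination
 *                (0 if the destination does not support nested state).
 */
static enum nested_migration_result
nested_migration_outcome(size_t src_state_len, size_t dst_max_len)
{
    if (src_state_len == 0) {
        /* (b): nothing was sent, so there is nothing to restore. */
        return NESTED_MIGRATION_OK;
    }
    if (dst_max_len == 0) {
        /* (c): state was sent but destination cannot restore it. */
        return NESTED_MIGRATION_FAIL;
    }
    /* (a), (d), (e) succeed; (f) fails. */
    return src_state_len <= dst_max_len ? NESTED_MIGRATION_OK
                                        : NESTED_MIGRATION_FAIL;
}
```

Note that the source's own maximum is not an input here: only the size
of what was actually saved matters at the destination.
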
> >>>>> 
> >>>>> Is it possible to tell these limits before the start of the migration,
> >>>>> or do we only find out that a nested migration won't work by trying it?
> >>>>> 
> >>>>> Dave
> >>>> 
> >>>> It is possible for the destination host to query what its max nested state size is.
> >>>> (This is what is returned from "kvm_check_extension(s, KVM_CAP_NESTED_STATE);". See kvm_init() code.)
> >>> 
> >>> Is this max size a function of:
> >>> a) The host CPU
> >>> b) The host kernel
> >>> c) Some configuration
> >>> 
> >>> or all of those?
> >>> 
> >>> What about the maximum size that will be sent?
> >> 
> >> The max size is a function of (b). It depends on your KVM capabilities.
> >> The size that will be sent is also the max size at source.
> > 
> > So if I have matching host kernels it should always work?
> 
> Yes.

OK, that's a good start.

> > What happens if I upgrade the source kernel to increase its maximum
> > nested size, can I force it to keep things small for some VMs?
> 
> Currently, the IOCTL which saves the nested state has only a single version, which could potentially fill the entire size (depending on current guest state).
> Saving the nested state obviously always attempts to save all relevant information it has, because otherwise you would deliberately withhold from the destination part of the
> state it needs to continue running the migrated guest correctly.
> 
> It is true that I do expect future versions of this IOCTL to be able to “set nested state” of older versions (smaller one which lack some information)
> but I do expect migration to fail gracefully (and continue running guest at source) if destination is not capable of restoring all state sent from source
> (When destination max nested state size is smaller than source nested state size).

That older version thing would be good; then we would tie that to the
versioned machine types and/or CPU models somehow; that way every
migration of a 3.2 qemu with a given CPU would work irrespective of host
version.

> > 
> >>> 
> >>>> However, I didn’t find any suitable mechanism in QEMU Live-Migration code to negotiate
> >>>> this destination host information with source prior to migration. Which kinda makes sense as
> >>>> this code is also used to my understanding in standard suspend/resume flow. In that scenario,
> >>>> there is no other side to negotiate with.
> >>> 
> >>> At the moment, if the user has appropriately configured their QEMU and
> >>> there are no IO errors, the migration should not fail; and the
> >>> management layer can be responsible for configuring the QEMU and checking
> >>> compatibilities - 
> >> 
> >> I agree with this. The post_load() checks I have done are extra measures to prevent a failing migration.
> >> Management layer can perform extra work before the migration to make sure that source can actually migrate to destination.
> >> Taking the max size of the nested state into account.
> >> 
> >>> 
> >>>> So currently what happens in my patch is that source prepares the nested state buffer and sends it to destination as part of VMState,
> >>>> and destination attempts to load this nested state in its nested_state_post_load() function.
> >>>> If destination kernel cannot handle loading the nested state it has received from source, it fails the migration by returning
> >>>> failure from nested_state_post_load().
> >>> 
> >>> That's minimally OK, but ideally we'd have a way of knowing before we
> >>> start the migration if it was going to work, that way the management
> >>> layer can refuse it before it spends ages transferring data and then
> >>> trying to switch over - recovery from a failed migration can be a bit
> >>> tricky/risky.
> >>> 
> >>> I can't see it being much use in a cloud environment if the migration
> >>> will sometimes fail without the cloud admin being able to do something
> >>> about it.
> >> 
> >> With current QEMU Live-Migration code, I believe this is the responsibility of the Control-Plane.
> >> It’s no different from the fact that the Control-Plane is also responsible for launching the destination QEMU with the same vHW configuration
> >> as the source QEMU. If there was some kind of negotiation mechanism in QEMU that verified that these vHW configurations match *before*
> >> starting the migration (and not as part of the Control-Plane), I would have also extended that negotiation to consider the max size of nested state.
> >> But with current mechanisms, this is solely the responsibility of the Control-Plane.
> > 
> > Yep, that's fine - all we have to make sure of is that:
> >  a) The control plane can somehow find out the maximums (via qemu or
> > something else on the host) so it can make that decision.
> 
> This is easy for QEMU or something else on the host to verify. Just a matter of sending IOCTL to KVM.

OK that's probably fine, we should check with libvirt people what they would like.
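
For reference, the host-side query could look roughly like the sketch
below. It assumes, as stated earlier in the thread, that
KVM_CHECK_EXTENSION with KVM_CAP_NESTED_STATE returns the maximum nested
state size. The helper name and the fallback #define are mine, for
building against older headers; this is not code from the patch.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Fallback for older kernel headers (assumed value, check linux/kvm.h). */
#ifndef KVM_CAP_NESTED_STATE
#define KVM_CAP_NESTED_STATE 157
#endif

/*
 * Hypothetical helper: returns the max nested state size in bytes,
 * 0 if the capability is not supported, or -1 if the KVM device
 * cannot be opened or the ioctl fails.
 */
static int query_max_nested_state_size(const char *kvm_path)
{
    int fd = open(kvm_path, O_RDWR);
    if (fd < 0) {
        return -1;
    }

    int ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_NESTED_STATE);
    close(fd);

    return ret < 0 ? -1 : ret;
}
```

A management layer would call this with "/dev/kvm" on the destination
and compare the result with what the source expects to send.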

> >  b) Think about how upgrades will work when one host is newer.
> 
> I made sure that it is possible for destination to accept nested state of size smaller than its max nested state size.
> Therefore, it should allow upgrades from one host to a newer host.

OK; the tricky thing is when you upgrade one host in a small cluster as
you start doing an upgrade, and then once it's got its first VM you
can't migrate away from it until others are updated; that gets messy.
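
To make that concrete, here is a small sketch that counts the hosts a VM
could still migrate to. It assumes the only constraint is that the
target's maximum nested state size covers the VM's saved nested state.
The helper name is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: counts hosts, other than the one the VM is
 * currently on (src_index), whose max nested state size is large
 * enough for state_len bytes of saved nested state.
 */
static size_t count_migration_targets(size_t state_len,
                                      const size_t *host_max,
                                      size_t n_hosts,
                                      size_t src_index)
{
    size_t targets = 0;

    for (size_t i = 0; i < n_hosts; i++) {
        if (i != src_index && host_max[i] >= state_len) {
            targets++;
        }
    }
    return targets;
}
```

For example, with per-host maximums of 4096, 4096 and 8192 bytes
(made-up numbers), a VM on the third host carrying 8192 bytes of nested
state has no valid target, while a VM on the first host carrying 4096
bytes has two.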

Dave

> (If of course IOCTL which “set nested state” is written correctly in a compatible way).
> 
> -Liran
> 
> > 
> > Dave
> > 
> >>> 
> >>>> Do you have a better suggestion?
> >>> 
> >>> I'll need to understand a bit more about the nested state.
> >> 
> >> Feel free to ask more questions about it and I will try to answer.
> >> 
> >> Thanks,
> >> -Liran
> >> 
> >>> 
> >>> Dave
> >>> 
> >>>> -Liran
> >>>> 
> >>>>> 
> >>>>>> -Liran
> >>>>>> 
> >>>>>> 
> >>>>> --
> >>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>>> 
> >>> --
> >>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >> 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
