qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>,
	jmattson@google.com, ehabkost@redhat.com, kvm@vger.kernel.org,
	mtosatti@redhat.com, rth@twiddle.net, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state
Date: Fri, 2 Nov 2018 12:35:05 +0000	[thread overview]
Message-ID: <20181102123504.GA2387@work-vm> (raw)
In-Reply-To: <12c26c34-8dd1-a442-7826-86b93ff978f8@redhat.com>

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 02/11/2018 04:46, Liran Alon wrote:
> >> On Thu, Nov1, 2018 at 09:45 AM, Jim Mattson <jmattson@google.com> wrote:
> > 
> >>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> >>> So if I have matching host kernels it should always work?
> >>> What happens if I upgrade the source kernel to increase it's maximum
> >>> nested size, can I force it to keep things small for some VMs?
> > 
> >> Any change to the format of the nested state should be gated by a
> >> KVM_CAP set by userspace. (Unlike, say, how the
> >> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
> >> in commit f077825a8758d.) KVM has traditionally been quite bad about
> >> maintaining backwards compatibility, but I hope the community is more
> >> cognizant of the issues now.
> > 
> >> As a cloud provider, one would only enable the new capability from
> >> userspace once all hosts in the pool have a kernel that supports it.
> >> During the transition, the capability would not be enabled on the
> >> hosts with a new kernel, and these hosts would continue to provide
> >> nested state that could be consumed by hosts running the older kernel.
> > 
> > Hmm this makes sense.
> > 
> > This means though that the patch I have submitted here isn't good enough.
> > My patch currently assumes that when it attempts to get nested state from KVM,
> > QEMU should always set nested_state->size to max size supported by KVM as received
> > from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> > (See kvm_get_nested_state() introduced on my patch).
> > This indeed won't allow migration from host with new KVM to host with old KVM if
> > nested_state size was enlarged between these KVM versions.
> > Which is obviously an issue.
> 
> Actually I think this is okay, because unlike the "new" capability was
> enabled, KVM would always reduce nested_state->size to a value that is
> compatible with current kernels.
> 
> > But on second thought, I'm not sure that this is the right approach as-well.
> > We don't really want the used version of nested_state to be determined on kvm_init().
> > * On source QEMU, we actually want to determine it when preparing for migration based
> > on to the support given by our destination host. If it's an old host, we would like to
> > save an old version nested_state and if it's a new host, we will like to save our newest
> > supported nested_state.
> 
> No, that's wrong because it would lead to losing state.  If the source
> QEMU supports more state than the destination QEMU, and the current VM
> state needs to transmit it for migration to be _correct_, then migration
> to that destination QEMU must fail.
> 
> In particular, enabling the new KVM capability needs to be gated by a
> new machine type and/or -cpu flag, if migration compatibility is needed.
>  (In particular, this is one reason why I haven't considered this series
> for 3.1.  Right now, migration of nested hypervisors is completely
> busted but if we make it "almost" work, pre-3.1 machine types would not
> ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> it's better for users if we wait for one release more, and add support
> for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> 
> Personally, I would like to say that, starting from QEMU 3.2, enabling
> nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> good way to keep some sanity.  Any opinions on that?

That seems a bit mean; there's a lot of people already using nested.

Dave

> Paolo
> 
> > Therefore, I don't think that we want this versioning to be based on KVM_CAP at all.
> > It seems that we would want the process to behave as follows:
> > 1) Mgmt-layer at dest queries dest host max supported nested_state size.
> >    (Which should be returned from kvm_check_extension(KVM_CAP_NESTED_STATE))
> > 2) Mgmt-layer at source initiate migration to dest with requesting QEMU to send nested_state 
> >    matching dest max supported nested_state size.
> >    When saving nested state using KVM_GET_NESTED_STATE IOCTL, QEMU will specify in nested_state->size
> >    the *requested* size to be saved and KVM should be able to save only the information which matches
> >    the version that worked with that size.
> > 3) After some sanity checks on received migration stream, dest host use KVM_SET_NESTED_STATE IOCTL.
> >    This IOCTL should deduce which information it should deploy based on given nested_state->size.
> > 
> > This also makes me wonder if it's not just nicer to use nested_state->flags to specify which
> > information is actually present on nested_state instead of managing versioning with nested_state->size.
> > 
> > What are your opinions on this?
> > 
> > -Liran
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2018-11-02 12:35 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-16 12:46 [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state Liran Alon
2018-09-16 12:46 ` [Qemu-devel] [QEMU PATCH v2 1/2] i386: Compile CPUX86State xsave_buf only when support KVM or HVF Liran Alon
2018-09-16 12:46 ` [Qemu-devel] [QEMU PATCH v2 2/2] KVM: i386: Add support for save and restore nested state Liran Alon
2018-10-08 17:21 ` [Qemu-devel] [QEMU PATCH v2 0/2]: " Liran Alon
2018-10-15 18:10   ` Liran Alon
2018-10-31  1:03     ` Liran Alon
2018-10-31 18:17       ` Eduardo Habkost
2018-10-31 18:19         ` Paolo Bonzini
2018-10-31 18:50           ` Liran Alon
2018-10-31 18:59             ` Dr. David Alan Gilbert
2018-10-31 23:17               ` Liran Alon
2018-11-01 13:10                 ` Dr. David Alan Gilbert
2018-11-01 15:23                   ` Liran Alon
2018-11-01 15:56                     ` Dr. David Alan Gilbert
2018-11-01 16:45                       ` Jim Mattson
2018-11-02  3:46                         ` Liran Alon
2018-11-02  9:40                           ` Paolo Bonzini
2018-11-02 12:35                             ` Dr. David Alan Gilbert [this message]
2018-11-02 12:40                               ` Daniel P. Berrangé
2018-11-04 22:12                               ` Paolo Bonzini
2018-11-02 12:59                             ` Liran Alon
2018-11-02 16:44                               ` Jim Mattson
2018-11-02 16:58                                 ` Daniel P. Berrangé
2018-11-02 17:01                                   ` Jim Mattson
2018-11-02 16:54                             ` Daniel P. Berrangé
2018-11-02 16:58                               ` Dr. David Alan Gilbert
2018-11-04 22:19                               ` Paolo Bonzini
2018-11-12 16:18                                 ` Daniel P. Berrangé
2018-11-12 16:50                                   ` Dr. David Alan Gilbert
2018-11-12 16:53                                     ` Paolo Bonzini
2018-11-12 16:54                                     ` Daniel P. Berrangé
2018-11-13  0:00                                       ` Liran Alon
2018-11-13  0:07                                         ` Jim Mattson
2018-11-13  0:09                                           ` Liran Alon
2018-11-12 23:58                                     ` Liran Alon
2018-11-02 16:39                           ` Jim Mattson
2018-11-03  2:02                             ` Liran Alon
2018-11-08  0:13                               ` Liran Alon
2018-11-08  0:45                                 ` Jim Mattson
2018-11-08  9:50                                   ` Paolo Bonzini
2018-11-08  9:57                                     ` Liran Alon
2018-11-08 17:02                                       ` Paolo Bonzini
2018-11-08 18:41                                         ` Liran Alon
2018-11-08 20:34                                           ` Paolo Bonzini
2018-11-12 14:51                                           ` Dr. David Alan Gilbert
2018-11-01 19:03                       ` Liran Alon
2018-11-01 19:07                         ` Dr. David Alan Gilbert
2018-11-01 19:41                           ` Jim Mattson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181102123504.GA2387@work-vm \
    --to=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=liran.alon@oracle.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).