From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 0/30] nVMX: Nested VMX, v9
Date: Mon, 23 May 2011 16:52:47 +0300
Message-ID: <4DDA66AF.7020505@redhat.com>
References: <4DC7CD81.2070305@redhat.com>
 <20110511082027.GG19019@redhat.com>
 <20110512154228.GA7943@fermat.math.technion.ac.il>
 <20110512155727.GA20193@redhat.com>
 <20110512163115.GA13138@fermat.math.technion.ac.il>
 <20110512165157.GC20193@redhat.com>
 <20110522193239.GA13130@fermat.math.technion.ac.il>
 <4DDA2E72.8070907@redhat.com>
 <20110523130226.GC23407@8bytes.org>
 <4DDA5C30.10107@redhat.com>
 <20110523134052.GD23407@8bytes.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Nadav Har'El", Gleb Natapov, kvm@vger.kernel.org, abelg@il.ibm.com
To: Joerg Roedel
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:20522 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753004Ab1EWNw6 (ORCPT ); Mon, 23 May 2011 09:52:58 -0400
In-Reply-To: <20110523134052.GD23407@8bytes.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 05/23/2011 04:40 PM, Joerg Roedel wrote:
> On Mon, May 23, 2011 at 04:08:00PM +0300, Avi Kivity wrote:
> > On 05/23/2011 04:02 PM, Joerg Roedel wrote:
>
> >> About live-migration with nesting, we had discussed the idea of just
> >> doing a VMEXIT(INTR) if the vcpu runs nested and we want to migrate.
> >> The problem was that the hypervisor may not expect an INTR intercept.
> >>
> >> How about doing an implicit VMEXIT in this case and an implicit VMRUN
> >> after the vcpu is migrated?
> >
> > What if there's something in EXIT_INT_INFO?
>
> On real SVM hardware EXIT_INT_INFO should only contain something for
> exception and npt intercepts. These are all handled in the kernel and do
> not cause an exit to user-space so that no valid EXIT_INT_INFO should be
> around when we actually go back to user-space (so that migration can
> happen).
>
> The exception might be the #PF/NPT intercept when the guest is doing
> very obscure things like putting an exception/interrupt handler on mmio
> memory, but that isn't really supported by KVM anyway so I doubt we
> should care.
>
> Unless I miss something here we should be safe by just not looking at
> EXIT_INT_INFO while migrating.

Agree.

> >> The nested hypervisor will not see the
> >> vmexit and the vcpu will be in a state where it is safe to migrate. This
> >> should work for nested-vmx too if the guest-state is written back to
> >> guest memory on VMEXIT. Is this the case?
> >
> > It is the case with the current implementation, and we can/should make
> > it so in future implementations, just before exit to userspace. Or at
> > least provide an ABI to sync memory.
> >
> > But I don't see why we shouldn't just migrate all the hidden state (in
> > guest mode flag, svm host paging mode, svm host interrupt state, vmcb
> > address/vmptr, etc.). It's more state, but no thinking is involved, so
> > it's clearly superior.
>
> An issue is that there is different state to migrate for Intel and AMD
> hosts. If we keep all that information in guest memory the kvm kernel
> module can handle those details and all KVM needs to migrate is the
> in-guest-mode flag and the gpa of the vmcb/vmcs which is currently
> executed. This state should be enough for Intel and AMD nesting.

I think for Intel there is no hidden state apart from in-guest-mode
(there is the VMPTR, but it is an actual register accessible via
instructions). For svm we can keep the hidden state in the host
state-save area (including the vmcb pointer). The only risk is that svm
will gain hardware support for nesting, and will choose a different
format than ours.

An alternative is a fake MSR for storing this data, or just another
get/set ioctl pair. We'll have a flags field that says which fields are
filled in.

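
To make the get/set idea concrete, here is a rough userspace sketch of what
such a flags-tagged blob could look like. Everything in it is made up for
illustration: struct nested_state, struct vcpu_hidden, the NESTED_STATE_*
flags and both helpers are hypothetical names, not an existing KVM ABI, and
the field list just mirrors the hidden state mentioned in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits: each one says a group of fields is filled in. */
#define NESTED_STATE_GUEST_MODE (1u << 0) /* vcpu was running a nested guest */
#define NESTED_STATE_VMCB_GPA   (1u << 1) /* vmcb_gpa is valid (svm) */
#define NESTED_STATE_SVM_HOST   (1u << 2) /* svm host fields are valid */
#define NESTED_STATE_KNOWN \
	(NESTED_STATE_GUEST_MODE | NESTED_STATE_VMCB_GPA | NESTED_STATE_SVM_HOST)

/* Hypothetical blob passed through the get/set pair. */
struct nested_state {
	uint32_t flags;                /* which fields below are filled in */
	uint32_t pad;
	uint64_t vmcb_gpa;             /* gpa of the vmcb currently executed */
	uint64_t svm_host_paging_mode;
	uint64_t svm_host_intr_state;
};

/* Stand-in for the hidden per-vcpu state kept by the kernel module. */
struct vcpu_hidden {
	int in_guest_mode;
	int is_svm;
	uint64_t current_vmcb_gpa;
	uint64_t host_paging_mode;
	uint64_t host_intr_state;
};

/* "get": export only what applies, and mark it in flags. */
void nested_state_get(const struct vcpu_hidden *v, struct nested_state *s)
{
	memset(s, 0, sizeof(*s));
	if (!v->in_guest_mode)
		return;
	s->flags |= NESTED_STATE_GUEST_MODE;
	if (v->is_svm) {
		s->flags |= NESTED_STATE_VMCB_GPA | NESTED_STATE_SVM_HOST;
		s->vmcb_gpa = v->current_vmcb_gpa;
		s->svm_host_paging_mode = v->host_paging_mode;
		s->svm_host_intr_state = v->host_intr_state;
	}
}

/* "set": refuse flags we do not understand, restore the marked fields. */
int nested_state_set(struct vcpu_hidden *v, const struct nested_state *s)
{
	if (s->flags & ~NESTED_STATE_KNOWN)
		return -1;
	v->in_guest_mode = !!(s->flags & NESTED_STATE_GUEST_MODE);
	if (s->flags & NESTED_STATE_VMCB_GPA)
		v->current_vmcb_gpa = s->vmcb_gpa;
	if (s->flags & NESTED_STATE_SVM_HOST) {
		v->host_paging_mode = s->svm_host_paging_mode;
		v->host_intr_state = s->svm_host_intr_state;
	}
	return 0;
}
```

On Intel only the guest-mode bit would be set in this sketch, since the
VMPTR is reachable through ordinary instructions. Rejecting unknown flag
bits on set is one possible way to let the format grow later without the
destination silently dropping state.
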
> The next benefit is that it works seamlessly even if the state that
> needs to be transferred is extended (e.g. by emulating a new
> virtualization hardware feature). This support can be implemented in the
> kernel module and no changes to qemu are required.

I agree it's a benefit. But I don't like making the fake vmexit part of
live migration; if it turns out to be the wrong choice, it's hard to undo.

-- 
error compiling committee.c: too many arguments to function