From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1 Date: Wed, 23 Jun 2010 11:07:18 +0300 Message-ID: <4C21C0B6.50404@redhat.com> References: <1276431753-nyh@il.ibm.com> <201006131225.o5DCP79H012922@rice.haifa.ibm.com> <4C15E95D.9000300@redhat.com> <20100622145441.GA23496@fermat.math.technion.ac.il> <20100622165322.GA29629@fermat.math.technion.ac.il> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org To: "Nadav Har'El" Return-path: Received: from mx1.redhat.com ([209.132.183.28]:55331 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751190Ab0FWIHX (ORCPT ); Wed, 23 Jun 2010 04:07:23 -0400 In-Reply-To: <20100622165322.GA29629@fermat.math.technion.ac.il> Sender: kvm-owner@vger.kernel.org List-ID: On 06/22/2010 07:53 PM, Nadav Har'El wrote: > On Tue, Jun 22, 2010, Nadav Har'El wrote about "Re: [PATCH 5/24] Introduce vmcs12: a VMCS structure for L1": > >>> Note that this structure becomes an ABI, it cannot change except in a >>> backward compatible way due to the need for live migration. So I'd like >>> a documentation patch that adds a description of the content to >>> Documentation/kvm/. It can be as simple as listing the structure >>> definition. >>> > I decided that if I add a file in Documentation/kvm, it would be very useful > for it to describe the nested vmx feature in general, in addition to the > structure that you asked documented. So here is the new patch I propose: > > > Great, that's always helpful. > ---- > Subject: [PATCH 25/25] Documentation > > This patch includes a brief introduction to the nested vmx feature in the > Documentation/kvm directory. The document also includes a copy of the > vmcs12 structure, as requested by Avi Kivity. > > > + > +We describe in much greater detail the theory behind the nested VMX feature, > +its implementation and its performance characteristics, in IBM Research report > +H-0282, "The Turtles Project: Design and Implementation of Nested > +Virtualization", available at: > + > + http://bit.ly/a0o9te > Please put the true url in here. > + > +Known limitations > +----------------- > + > +The current code support running Linux under a nested KVM using shadow > +page table (with bypass_guest_pf disabled). Might as well remove this, since nvmx will not be merged with such a gaping hole. In theory I ought to reject anything that doesn't comply with the spec. In practice I'll accept deviations from the spec, so long as - those features aren't used by common guests - when the features are attempted to be used, kvm will issue a warning I don't think PFEC matching ought to present any implementation difficulty. > +ABIs > +---- > + > +Nested VMX aims to present a standard and (eventually) fully-functional VMX > +implementation for the a guest hypervisor to use. As such, the official > +specification of the ABI that it provides is Intel's VMX specification, > +namely volume 3B of their "Intel 64 and IA-32 Architectures Software > +Developer's Manual". Not all of VMX's features are currently fully supported, > +but the goal is to eventually support them all, starting with the VMX features > +which are used in practice by popular hypervisors (KVM and others). > + > +As a VMX implementation, nested VMX presents a VMCS structure to L1. > +As mandated by the spec, other than the two fields revision_id and abort, > +this structure is *opaque* to its user, who is not supposed to know or care > +about its internal structure. Rather, the structure is accessed through the > +VMREAD and VMWRITE instructions. > +Still, for debugging purposes, KVM developers might be interested to know the > +internals of this structure; This is struct vmcs12 from arch/x86/kvm/vmx.c. > +For convenience, we repeat its content here. If the internals of this structure > +changes, this can break live migration across KVM versions. VMCS12_REVISION > +(from vmx.c) should be changed if struct vmcs12 or its inner struct shadow_vmcs > +is ever changed. > This is indeed great for debugging, we can add qemu commands to inspect a vmcs (and add the vmptr to 'info registers'). > + > +struct __packed vmcs12 { > + /* According to the Intel spec, a VMCS region must start with the > + * following two fields. Then follow implementation-specific data. > + */ > + u32 revision_id; > + u32 abort; > + > + struct shadow_vmcs shadow_vmcs; > + > + bool launch_state; /* set to 0 by VMCLEAR, to 1 by VMLAUNCH */ > + > + int cpu; > + int launched; > Why is cpu needed? In what way is launched != launch_state? Please use explicitly sized types. > +struct __packed shadow_vmcs { > > > + u32 host_ia32_sysenter_cs; > u32 pad; > + unsigned long cr0_guest_host_mask; > + unsigned long cr4_guest_host_mask; > + unsigned long cr0_read_shadow; > + unsigned long cr4_read_shadow; > + unsigned long cr3_target_value0; > + unsigned long cr3_target_value1; > + unsigned long cr3_target_value2; > + unsigned long cr3_target_value3; > + unsigned long exit_qualification; > + unsigned long guest_linear_address; > + unsigned long guest_cr0; > + unsigned long guest_cr3; > + unsigned long guest_cr4; > + unsigned long guest_es_base; > + unsigned long guest_cs_base; > + unsigned long guest_ss_base; > + unsigned long guest_ds_base; > + unsigned long guest_fs_base; > + unsigned long guest_gs_base; > + unsigned long guest_ldtr_base; > + unsigned long guest_tr_base; > + unsigned long guest_gdtr_base; > + unsigned long guest_idtr_base; > + unsigned long guest_dr7; > + unsigned long guest_rsp; > + unsigned long guest_rip; > + unsigned long guest_rflags; > + unsigned long guest_pending_dbg_exceptions; > + unsigned long guest_sysenter_esp; > + unsigned long guest_sysenter_eip; > + unsigned long host_cr0; > + unsigned long host_cr3; > + unsigned long host_cr4; > + unsigned long host_fs_base; > + unsigned long host_gs_base; > + unsigned long host_tr_base; > + unsigned long host_gdtr_base; > + unsigned long host_idtr_base; > + unsigned long host_ia32_sysenter_esp; > + unsigned long host_ia32_sysenter_eip; > + unsigned long host_rsp; > + unsigned long host_rip; > Use u64 instead of unsigned long, otherwise the size changes during live migration from a 32-bit host to a 64-bit host. Reserve tons of space here. > +}; > + > + > -- error compiling committee.c: too many arguments to function