From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mukesh Rathor Subject: Re: [V11 PATCH 17/21] PVH xen: vmcs related changes Date: Fri, 23 Aug 2013 17:26:03 -0700 Message-ID: <20130823172603.2fa09825@mantra.us.oracle.com> References: <1377220750-19514-1-git-send-email-mukesh.rathor@oracle.com> <1377220750-19514-18-git-send-email-mukesh.rathor@oracle.com> <52173C7302000078000EDF05@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1VD1gB-0007S3-NR for xen-devel@lists.xenproject.org; Sat, 24 Aug 2013 00:26:12 +0000 In-Reply-To: <52173C7302000078000EDF05@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel List-Id: xen-devel@lists.xenproject.org On Fri, 23 Aug 2013 09:41:55 +0100 "Jan Beulich" wrote: > >>> On 23.08.13 at 03:19, Mukesh Rathor > >>> wrote: > > This patch contains vmcs changes related for PVH, mainly creating a > > VMCS for PVH guest. > > > > Changes in V11: > > - Remove pvh_construct_vmcs and make it part of construct_vmcs > > > > - v->arch.hvm_vmx.secondary_exec_control &= > > ~SECONDARY_EXEC_ENABLE_VPID; > > + /* Intel SDM: resvd bits are 0 */ > > + v->arch.hvm_vmx.secondary_exec_control = bits; > > + } > > + else > > + { > > + v->arch.hvm_vmx.secondary_exec_control = > > vmx_secondary_exec_control; + > > + /* Disable VPID for now: we decide when to enable it on > > VMENTER. */ > > + v->arch.hvm_vmx.secondary_exec_control &= > > ~SECONDARY_EXEC_ENABLE_VPID; > > + } > > So this difference in VPID handling needs some explanation, as I > don't immediately see how you adjust the code referred to by the > non-PVH comment above. /* Phase I PVH: we run with minimal secondary exec features */ u32 bits = SECONDARY_EXEC_ENABLE_EPT | SECONDARY_EXEC_ENABLE_VPID; Somehow I had concluded it was harmless to have it that way, but I can't remember now. Since, we have common function now, it could just do the HVM way too, ie, disabled here since we've already checked cpu_has_vmx_vpid in pvh_check_requirements. > > if ( paging_mode_hap(d) ) > > { > > v->arch.hvm_vmx.exec_control &= ~(CPU_BASED_INVLPG_EXITING > > | CPU_BASED_CR3_LOAD_EXITING | > > CPU_BASED_CR3_STORE_EXITING); > > + if ( is_pvh_vcpu(v) ) > > + { > > + u32 bits = CPU_BASED_ACTIVATE_SECONDARY_CONTROLS | > > + CPU_BASED_ACTIVATE_MSR_BITMAP; > > + v->arch.hvm_vmx.exec_control |= bits; > > + > > + bits = CPU_BASED_USE_TSC_OFFSETING | > > CPU_BASED_TPR_SHADOW | > > + CPU_BASED_VIRTUAL_NMI_PENDING; > > + v->arch.hvm_vmx.exec_control &= ~bits; > > + } > > Did you notice that the original adjustments were truly HAP-related? > Putting your PVH code here just because it takes HAP as a prereq is > not the way to go. Wherever the flags you want set get set for the > HVM case, you should add your adjustments (if any are necessary > at all). Ok. They, the adjustments, are necessary in Phase I. > This is misplaced too - we're past the point of determining the set of > flags, and already in the process of committing them. The code > again should go alongside where the corresponding HVM code sits. > > > - if ( cpu_has_vmx_ple ) > > + if ( cpu_has_vmx_ple && !is_pvh_vcpu(v) ) > > { > > __vmwrite(PLE_GAP, ple_gap); > > __vmwrite(PLE_WINDOW, ple_window); > > Why would this be conditional upon !PVH? We don't have intercept for PLE right now for PVH in phase I. Easy to add tho. > > + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS, > > msr_type); > > + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP, > > msr_type); > > + vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, > > msr_type); + > > if ( cpu_has_vmx_pat && paging_mode_hap(d) ) > > - vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, > > MSR_TYPE_R | MSR_TYPE_W); > > + vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, > > msr_type); > > These changes all look pointless; if you really think they're > worthwhile cleanup, they don't belong here. They violate coding style (lines longer than 80). but anyways.. > > @@ -1032,16 +1150,36 @@ static int construct_vmcs(struct vcpu *v) > > > > v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK > > | (paging_mode_hap(d) ? 0 : (1U << TRAP_page_fault)) > > + | (is_pvh_vcpu(v) ? (1U << TRAP_int3) | (1U << > > TRAP_debug) : 0) | (1U << TRAP_no_device); > > What's so special about PVH that it requires these extra intercepts? HVM does the same in vmx_update_debug_state() called from vmx_update_guest_cr() which we don't call for cr0. We need the two for gdbsx and kdb, and they are harmless too as we inject back into the guest if we don't own the exception, jfyi. > So you set hw_cr[0] while I saw you saying hw_cr[4] won't be > touched by PVH in a reply to the v10 series. Such > inconsistencies are just calling for subsequent bugs. Either PVH > set all valid hw_cr[] fields, or it clearly states somewhere that > none of them are used (and then the assignment above ought to > go away). With commonising PVH with HVM, hw_cr gets used. Otherwise the intention was not to use it. I don't see a huge problem at this point with this. > > @@ -1297,6 +1440,9 @@ void vmx_do_resume(struct vcpu *v) > > hvm_asid_flush_vcpu(v); > > } > > > > + if ( is_pvh_vcpu(v) ) > > + reset_stack_and_jump(vmx_asm_do_vmentry); > > + > > debug_state = v->domain->debugger_attached > > || > > v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_INT3] || > > v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP]; > > Missing a "fixme" annotation? PVH supports gdbsx, I don't think we support PV for this external debugger, and hence PVH support is not intended also. thanks Mukesh