From: Boris Ostrovsky
Subject: Re: [PATCH v14 for-xen-4.5 17/21] x86/VPMU: Handle PMU interrupts for PV guests
Date: Mon, 27 Oct 2014 15:43:19 -0400
Message-ID: <544EA057.1030007@oracle.com>
In-Reply-To: <544E86D90200007800042917@mail.emea.novell.com>
References: <1413580689-2750-1-git-send-email-boris.ostrovsky@oracle.com>
 <1413580689-2750-18-git-send-email-boris.ostrovsky@oracle.com>
 <544E86D90200007800042917@mail.emea.novell.com>
To: Jan Beulich
Cc: kevin.tian@intel.com, keir@xen.org, suravee.suthikulpanit@amd.com,
 andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com,
 xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com,
 jun.nakajima@intel.com, dgdegra@tycho.nsa.gov
List-Id: xen-devel@lists.xenproject.org

On 10/27/2014 12:54 PM, Jan Beulich wrote:
>>>> On 17.10.14 at 23:18, wrote:
>> --- a/xen/arch/x86/hvm/vpmu.c
>> +++ b/xen/arch/x86/hvm/vpmu.c
>> @@ -81,46 +81,206 @@ static void __init parse_vpmu_param(char *s)
>>
>>  void vpmu_lvtpc_update(uint32_t val)
>>  {
>> -    struct vpmu_struct *vpmu = vcpu_vpmu(current);
>> +    struct vcpu *curr = current;
>> +    struct vpmu_struct *vpmu = vcpu_vpmu(curr);
>>
>>      vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED);
>> -    apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
>> +
>> +    /* Postpone APIC updates for PV(H) guests if PMU interrupt is pending */
>> +    if ( is_hvm_vcpu(curr) || !vpmu->xenpmu_data ||
>> +         !(vpmu->xenpmu_data->pmu.pmu_flags & PMU_CACHED) )
> Isn't this the pointer that pvpmu_finish() deallocates (and needs to
> clear)? If so, there's a race between it being cleared and used. If you
> need it in places like this, perhaps you'd be better off never clearing
> it and leaving the MFN allocated?

This will be one of the places that check for VPMU_CONTEXT_ALLOCATED.
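
To illustrate what I mean -- a rough sketch only, not necessarily the
exact form the next version will take:

    void vpmu_lvtpc_update(uint32_t val)
    {
        struct vcpu *curr = current;
        struct vpmu_struct *vpmu = vcpu_vpmu(curr);

        vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED);

        /* Don't look at xenpmu_data unless the context is known to exist */
        if ( is_hvm_vcpu(curr) ||
             !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ||
             !(vpmu->xenpmu_data->pmu.pmu_flags & PMU_CACHED) )
            apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
    }

This assumes VPMU_CONTEXT_ALLOCATED is never cleared while xenpmu_data
can still be dereferenced here.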

>
>>  void vpmu_do_interrupt(struct cpu_user_regs *regs)
>>  {
>> -    struct vcpu *v = current;
>> -    struct vpmu_struct *vpmu = vcpu_vpmu(v);
>> +    struct vcpu *sampled = current, *sampling;
>> +    struct vpmu_struct *vpmu;
>> +
>> +    /* dom0 will handle interrupt for special domains (e.g. idle domain) */
>> +    if ( sampled->domain->domain_id >= DOMID_FIRST_RESERVED )
>> +    {
>> +        sampling = choose_hwdom_vcpu();
>> +        if ( !sampling )
>> +            return;
>> +    }
>> +    else
>> +        sampling = sampled;
>> +
>> +    vpmu = vcpu_vpmu(sampling);
>> +    if ( !is_hvm_vcpu(sampling) )
>> +    {
>> +        /* PV(H) guest */
>> +        const struct cpu_user_regs *cur_regs;
>> +        uint64_t *flags = &vpmu->xenpmu_data->pmu.pmu_flags;
>> +        uint32_t domid = DOMID_SELF;
>> +
>> +        if ( !vpmu->xenpmu_data )
>> +            return;
>> +
>> +        if ( *flags & PMU_CACHED )
>> +            return;
>> +
>> +        if ( is_pvh_vcpu(sampling) &&
>> +             !vpmu->arch_vpmu_ops->do_interrupt(regs) )
>> +            return;
>> +
>> +        /* PV guest will be reading PMU MSRs from xenpmu_data */
>> +        vpmu_set(vpmu, VPMU_CONTEXT_SAVE | VPMU_CONTEXT_LOADED);
>> +        vpmu->arch_vpmu_ops->arch_vpmu_save(sampling);
>> +        vpmu_reset(vpmu, VPMU_CONTEXT_SAVE | VPMU_CONTEXT_LOADED);
>> +
>> +        *flags = 0;
>> +
>> +        /* Store appropriate registers in xenpmu_data */
>> +        /* FIXME: 32-bit PVH should go here as well */
>> +        if ( is_pv_32bit_vcpu(sampling) )
>> +        {
>> +            /*
>> +             * 32-bit dom0 cannot process Xen's addresses (which are 64 bit)
>> +             * and therefore we treat it the same way as a non-privileged
>> +             * PV 32-bit domain.
>> +             */
>> +            struct compat_pmu_regs *cmp;
>> +
>> +            cur_regs = guest_cpu_user_regs();
>> +
>> +            cmp = (void *)&vpmu->xenpmu_data->pmu.r.regs;
>> +            cmp->ip = cur_regs->rip;
>> +            cmp->sp = cur_regs->rsp;
>> +            cmp->flags = cur_regs->eflags;
>> +            cmp->ss = cur_regs->ss;
>> +            cmp->cs = cur_regs->cs;
>> +            if ( (cmp->cs & 3) != 1 )
>> +                *flags |= PMU_SAMPLE_USER;
>> +        }
>> +        else
>> +        {
>> +            struct xen_pmu_regs *r = &vpmu->xenpmu_data->pmu.r.regs;
>> +
>> +            if ( (vpmu_mode & XENPMU_MODE_SELF) )
>> +                cur_regs = guest_cpu_user_regs();
>> +            else if ( (regs->rip >= XEN_VIRT_START) &&
>> +                      (regs->rip < XEN_VIRT_END) &&
>> +                      is_hardware_domain(sampling->domain))
> I'm pretty sure that already on the previous round I said that using
> only RIP for determining whether the sample occurred in hypervisor
> context is not enough.

Hmm, I did change this to !guest_mode(), but I must have reverted it
while rebasing.

>
>> +            {
>> +                cur_regs = regs;
>> +                domid = DOMID_XEN;
>> +            }
>> +            else
>> +                cur_regs = guest_cpu_user_regs();
>> +
>> +            r->ip = cur_regs->rip;
>> +            r->sp = cur_regs->rsp;
>> +            r->flags = cur_regs->eflags;
>> +
>> +            if ( !has_hvm_container_vcpu(sampled) )
>> +            {
>> +                r->ss = cur_regs->ss;
>> +                r->cs = cur_regs->cs;
>> +                if ( !(sampled->arch.flags & TF_kernel_mode) )
>> +                    *flags |= PMU_SAMPLE_USER;
>> +            }
>> +            else
>> +            {
>> +                struct segment_register seg;
>> +
>> +                hvm_get_segment_register(sampled, x86_seg_cs, &seg);
>> +                r->cs = seg.sel;
>> +                if ( (r->cs & 3) != 0 )
>> +                    *flags |= PMU_SAMPLE_USER;
> So is the VM86 mode case here intentionally being ignored?

We pass EFLAGS so the guest can check the VM bit. Is this not sufficient?

> And is there a particular reason you look at the selector's RPL instead
> of DPL, and CS instead of SS?

Should be DPL indeed. But why is SS better than CS?

-boris
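
P.S. To make sure we mean the same thing on the DPL point, the check I'd
switch to looks roughly like this (sketch only, assuming the attr.fields
layout of struct segment_register):

    struct segment_register seg;

    hvm_get_segment_register(sampled, x86_seg_cs, &seg);
    r->cs = seg.sel;
    /* Privilege check on the descriptor's DPL, not the selector's RPL */
    if ( seg.attr.fields.dpl != 0 )
        *flags |= PMU_SAMPLE_USER;

The same pattern would apply if we end up reading SS instead of CS.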