From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v24 12/15] x86/VPMU: Handle PMU interrupts for PV(H) guests Date: Mon, 15 Jun 2015 13:17:04 -0400 Message-ID: <557F0890.8080008@oracle.com> References: <1433948671-29504-1-git-send-email-boris.ostrovsky@oracle.com> <1433948671-29504-13-git-send-email-boris.ostrovsky@oracle.com> <557F104C0200007800085040@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <557F104C0200007800085040@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: kevin.tian@intel.com, suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com, xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com, jun.nakajima@intel.com, dgdegra@tycho.nsa.gov List-Id: xen-devel@lists.xenproject.org On 06/15/2015 11:50 AM, Jan Beulich wrote: >>>> On 10.06.15 at 17:04, wrote: >> @@ -211,27 +214,65 @@ static inline void context_load(struct vcpu *v) >> } >> } >> >> -static void amd_vpmu_load(struct vcpu *v) >> +static int amd_vpmu_load(struct vcpu *v, bool_t from_guest) >> { >> struct vpmu_struct *vpmu = vcpu_vpmu(v); >> - struct xen_pmu_amd_ctxt *ctxt = vpmu->context; >> - uint64_t *ctrl_regs = vpmu_reg_pointer(ctxt, ctrls); >> + struct xen_pmu_amd_ctxt *ctxt; >> + uint64_t *ctrl_regs; >> + unsigned int i; >> >> vpmu_reset(vpmu, VPMU_FROZEN); >> >> - if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >> + if ( !from_guest && vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) ) >> { >> - unsigned int i; >> + ctxt = vpmu->context; >> + ctrl_regs = vpmu_reg_pointer(ctxt, ctrls); >> >> for ( i = 0; i < num_counters; i++ ) >> wrmsrl(ctrls[i], ctrl_regs[i]); >> >> - return; >> + return 0; >> + } >> + >> + if ( from_guest ) > Generally I dislike such redundancy ( > if ( cond1 && cond2 ) > return; > if ( !cond1 ) > ... > which can be written without checking cond1 twice) - do you really > think it is beneficial for overall readability to have it that way? I thought it was more readable as the first clause means that we are in a quick VPMU load from context switch (and so we do a return from it) while the second is a part of a full VPMU load. > >> + { >> + unsigned int num_enabled = 0; >> + struct xen_pmu_amd_ctxt *guest_ctxt = &vpmu->xenpmu_data->pmu.c.amd; >> + >> + ASSERT(!is_hvm_vcpu(v)); >> + >> + ctxt = vpmu->context; >> + ctrl_regs = vpmu_reg_pointer(ctxt, ctrls); >> + >> + memcpy(&ctxt->regs[0], &guest_ctxt->regs[0], regs_sz); > So that's the live structure, not any staging area iiuc. What state > is the guest going to be in when validation fails (and you can't > restore the original state)? And what guarantees that nothing > elsewhere in the hypervisor uses the data _before_ the > validation below succeeds? We don't load this data into hardware until we have validated it. On failed validation guest will receive hypercall error --- it's up to the guest to decide what to do. The hypervisor will not use this data as it will still be flagged as PMU_CACHED, i.e. invalid/stale. (That's why I say in the comment below that re-initializing it is really not necessary) > >> + for ( i = 0; i < num_counters; i++ ) >> + { >> + if ( (ctrl_regs[i] & CTRL_RSVD_MASK) != ctrl_rsvd[i] ) >> + { >> + /* >> + * Not necessary to re-init context since we should never load >> + * it until guest provides valid values. But just to be safe. >> + */ >> + amd_vpmu_init_regs(ctxt); >> + return -EINVAL; >> + } >> + >> + if ( is_pmu_enabled(ctrl_regs[i]) ) >> + num_enabled++; >> + } >> + >> + if ( num_enabled ) > Looks like a boolean flag would do - the exact count doesn't seem to > be of interest here or in later patches? So the reason why I use a counter here is because keeping track of VPMU_RUNNING state is currently broken on AMD, I noticed it when I was updating this patch. amd_vpmu_do_wrmsr() will reset VPMU_RUNNING if the MSR write is disabling current counter, even though there may still be other counters running. This may be related to HVM brokenness of AMD counters that I mentioned a while ago where the guest, when running with multiple counters, sometimes gets unexpected NMIs. (This goes back all the way to to 4.1.) I don't want to fix this in this series but I will likely need to count number of active counters when I do (just like I do for Intel). I can use a boolean now though since I am not dealing with this problem here. -boris