From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH v2 11/13] x86/PMU: Handle PMU interrupts for PV guests
Date: Wed, 25 Sep 2013 15:40:13 +0100
Message-ID: <5242F5CD.3000804@citrix.com>
References: <1379670132-1748-1-git-send-email-boris.ostrovsky@oracle.com>
 <1379670132-1748-12-git-send-email-boris.ostrovsky@oracle.com>
 <5243106702000078000F6524@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1VOqGI-0004iO-17 for xen-devel@lists.xenproject.org;
 Wed, 25 Sep 2013 14:40:18 +0000
In-Reply-To: <5243106702000078000F6524@nat28.tlf.novell.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: jun.nakajima@intel.com, George.Dunlap@eu.citrix.com, jacob.shin@amd.com,
 eddie.dong@intel.com, dietmar.hahn@ts.fujitsu.com,
 suravee.suthikulpanit@amd.com, xen-devel , Boris Ostrovsky
List-Id: xen-devel@lists.xenproject.org

On 25/09/13 15:33, Jan Beulich wrote:
>>>> On 20.09.13 at 11:42, Boris Ostrovsky wrote:
>> Add support for handling PMU interrupts for PV guests, make these interrupts
>> NMI instead of PMU_APIC_VECTOR vector. Depending on vpmu_mode forward the
>> interrupts to appropriate guest (mode is VPMU_ON) or to dom0 (VPMU_DOM0).
> Is using NMIs here a necessity? I guess not, in which case I'd really
> like this to be a (perhaps even non-default) option controllable via
> command line option.
>
>> - * This interrupt handles performance counters interrupt
>> - */
>> -
>> -void pmu_apic_interrupt(struct cpu_user_regs *regs)
>> -{
>> -    ack_APIC_irq();
>> -    vpmu_do_interrupt(regs);
>> -}
> So this was the only caller of vpmu_do_interrupt(); no new one gets
> added in this patch afaics, and I don't recall having seen addition of
> another caller in earlier patches. What's the deal?
>
>> @@ -99,17 +106,97 @@ int vpmu_do_rdmsr(unsigned int msr, uint64_t *msr_content)
>>  int vpmu_do_interrupt(struct cpu_user_regs *regs)
>>  {
>>      struct vcpu *v = current;
>> -    struct vpmu_struct *vpmu = vcpu_vpmu(v);
>> +    struct vpmu_struct *vpmu;
>>
>> -    if ( vpmu->arch_vpmu_ops )
>> +
>> +    /* dom0 will handle this interrupt */
>> +    if ( (vpmu_mode & XENPMU_MODE_PRIV) ||
>> +         (v->domain->domain_id >= DOMID_FIRST_RESERVED) )
>> +    {
>> +        if ( smp_processor_id() >= dom0->max_vcpus )
>> +            return 0;
>> +        v = dom0->vcpu[smp_processor_id()];
>> +    }
>> +
>> +    vpmu = vcpu_vpmu(v);
>> +    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
>> +        return 0;
>> +
>> +    if ( !is_hvm_domain(v->domain) || (vpmu_mode & XENPMU_MODE_PRIV) )
>> +    {
>> +        /* PV guest or dom0 is doing system profiling */
>> +        void *p;
>> +        struct cpu_user_regs *gregs;
>> +
>> +        p = &v->arch.vpmu.xenpmu_data->pmu.regs;
>> +
>> +        /* PV guest will be reading PMU MSRs from xenpmu_data */
>> +        vpmu_save_force(v);
>> +
>> +        /* Store appropriate registers in xenpmu_data
>> +         *
>> +         * Note: '!current->is_running' is possible when 'set_current(next)'
>> +         * for the (HVM) guest has been called but 'reset_stack_and_jump()'
>> +         * has not (i.e. the guest is not actually running yet).
>> +         */
>> +        if ( !is_hvm_domain(current->domain) ||
>> +             ((vpmu_mode & XENPMU_MODE_PRIV) && !current->is_running) )
>> +        {
>> +            /*
>> +             * 32-bit dom0 cannot process Xen's addresses (which are 64 bit)
>> +             * and therefore we treat it the same way as a non-priviledged
>> +             * PV 32-bit domain.
>> +             */
>> +            if ( is_pv_32bit_domain(current->domain) )
>> +            {
>> +                struct compat_cpu_user_regs cmp;
>> +
>> +                gregs = guest_cpu_user_regs();
>> +                XLAT_cpu_user_regs(&cmp, gregs);
>> +                memcpy(p, &cmp, sizeof(struct compat_cpu_user_regs));
>> +            }
>> +            else if ( (current->domain != dom0) && !is_idle_vcpu(current) &&
>> +                      !(vpmu_mode & XENPMU_MODE_PRIV) )
>> +            {
>> +                /* PV guest */
>> +                gregs = guest_cpu_user_regs();
>> +                memcpy(p, gregs, sizeof(struct cpu_user_regs));
>> +            }
>> +            else
>> +                memcpy(p, regs, sizeof(struct cpu_user_regs));
>> +        }
>> +        else
>> +        {
>> +            /* HVM guest */
>> +            struct segment_register cs;
>> +
>> +            gregs = guest_cpu_user_regs();
>> +            hvm_get_segment_register(current, x86_seg_cs, &cs);
>> +            gregs->cs = cs.attr.fields.dpl;
>> +
>> +            memcpy(p, gregs, sizeof(struct cpu_user_regs));
>> +        }
>> +
>> +        v->arch.vpmu.xenpmu_data->domain_id = current->domain->domain_id;
>> +        v->arch.vpmu.xenpmu_data->vcpu_id = current->vcpu_id;
>> +        v->arch.vpmu.xenpmu_data->pcpu_id = smp_processor_id();
>> +
>> +        raise_softirq(PMU_SOFTIRQ);
>> +        vpmu_set(vpmu, VPMU_WAIT_FOR_FLUSH);
>> +
>> +        return 1;
>> +    }
>> +    else if ( vpmu->arch_vpmu_ops )
>>      {
>> -        struct vlapic *vlapic = vcpu_vlapic(v);
>> +        /* HVM guest */
>> +        struct vlapic *vlapic;
>>          u32 vlapic_lvtpc;
>>          unsigned char int_vec;
>>
>>          if ( !vpmu->arch_vpmu_ops->do_interrupt(regs) )
>>              return 0;
>>
>> +        vlapic = vcpu_vlapic(v);
>>          if ( !is_vlapic_lvtpc_enabled(vlapic) )
>>              return 1;
>>
> Assuming the plan is to run this in NMI context - this is _a lot_ of
> stuff you want to do. Did you carefully audit all paths for being
> NMI-safe?
>
> Jan

vpmu_save() is not safe from an NMI context, as its non-NMI context uses
local_irq_disable() to achieve consistency.

~Andrew
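
To illustrate the vpmu_save() concern, here is a small user-space
simulation. It is not Xen code. It assumes a two-field state that must stay
consistent, and models interrupt masking with a plain flag. Every name in it
(sim_ctx, sim_irq_disable(), deliver_irq(), deliver_nmi(),
update_with_event()) is hypothetical and exists only for this sketch. A
maskable interrupt arriving mid-update is deferred until the mask is
dropped, whereas an NMI runs at once and sees half-updated state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context: the invariant is a == b outside an update. */
struct sim_ctx { unsigned long a, b; };

struct sim_ctx ctx;
bool irqs_masked;   /* stands in for the CPU's interrupt-enable state */
bool irq_pending;   /* a maskable interrupt latched while masked */
int torn_reads;     /* how often a handler observed a != b */

/* Common handler body: record whether it saw inconsistent state. */
void handler(void)
{
    if ( ctx.a != ctx.b )
        torn_reads++;
}

/* Maskable interrupt: held back while interrupts are masked. */
void deliver_irq(void)
{
    if ( irqs_masked )
        irq_pending = true;
    else
        handler();
}

/* NMI: the mask has no effect on it. */
void deliver_nmi(void)
{
    handler();
}

void sim_irq_disable(void)
{
    irqs_masked = true;
}

void sim_irq_enable(void)
{
    irqs_masked = false;
    if ( irq_pending )
    {
        irq_pending = false;
        handler();
    }
}

/*
 * Update both fields under the simulated mask, with 'event' firing
 * between the two writes. Returns the number of torn observations.
 */
int update_with_event(void (*event)(void))
{
    torn_reads = 0;
    sim_irq_disable();
    ctx.a++;
    event();            /* arrives in the middle of the update */
    ctx.b++;
    sim_irq_enable();
    return torn_reads;
}
```

Calling update_with_event(deliver_irq) returns 0, because the handler only
runs after both fields have been written. Calling
update_with_event(deliver_nmi) returns 1, because the handler runs between
the two writes. Code that is to be reachable from an NMI handler therefore
needs some other way to stay consistent than masking interrupts.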