From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v12 for-xen-4.5 16/20] x86/VPMU: Handle PMU interrupts for PV guests Date: Tue, 30 Sep 2014 11:07:08 -0400 Message-ID: <542AC71C.6020804@oracle.com> References: <1411673336-32736-1-git-send-email-boris.ostrovsky@oracle.com> <1411673336-32736-17-git-send-email-boris.ostrovsky@oracle.com> <542A81B6020000780003ADD4@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <542A81B6020000780003ADD4@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: kevin.tian@intel.com, keir@xen.org, suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com, xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com, jun.nakajima@intel.com, dgdegra@tycho.nsa.gov List-Id: xen-devel@lists.xenproject.org On 09/30/2014 04:11 AM, Jan Beulich wrote: >>>> On 25.09.14 at 21:28, wrote: >> --- a/xen/arch/x86/hvm/vpmu.c >> +++ b/xen/arch/x86/hvm/vpmu.c >> @@ -80,44 +80,191 @@ static void __init parse_vpmu_param(char *s) >> >> void vpmu_lvtpc_update(uint32_t val) >> { >> - struct vpmu_struct *vpmu = vcpu_vpmu(current); >> + struct vcpu *curr = current; >> + struct vpmu_struct *vpmu = vcpu_vpmu(curr); >> >> vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED); >> - apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc); >> + >> + /* Postpone APIC updates for PV(H) guests if PMU interrupt is pending >> */ >> + if ( is_hvm_domain(curr->domain) || > Please don't open-code is_hvm_vcpu() (more instances elsewhere). Why is this open-coded? > >> + !(vpmu->xenpmu_data && (vpmu->xenpmu_data->pmu_flags & PMU_CACHED)) ) > I think readability would benefit if you resolved the !(&&) to !||! > (making it the proper inverse of what you do in vpmu_do_wrmsr() > and vpmu_do_rdmsr()). > >> +static struct vcpu *choose_hwdom_vcpu(void) >> +{ >> + struct vcpu *v; >> + unsigned idx = smp_processor_id() % hardware_domain->max_vcpus; >> + >> + if ( hardware_domain->vcpu == NULL ) >> + return NULL; >> + >> + v = hardware_domain->vcpu[idx]; >> + >> + /* >> + * If index is not populated search downwards the vcpu array until >> + * a valid vcpu can be found >> + */ >> + while ( !v && idx-- ) >> + v = hardware_domain->vcpu[idx]; > Each time I get here I wonder what case this is good for. I thought we can have a case when first hardware_domain->vcpu[idx] is NULL so we walk the array down until we find the first non-NULL vcpu. Hot unplug, for example (we may be calling this from NMI context). > >> int vpmu_do_interrupt(struct cpu_user_regs *regs) >> { >> - struct vcpu *v = current; >> - struct vpmu_struct *vpmu = vcpu_vpmu(v); >> + struct vcpu *sampled = current, *sampling; >> + struct vpmu_struct *vpmu; >> + >> + /* dom0 will handle interrupt for special domains (e.g. idle domain) */ >> + if ( sampled->domain->domain_id >= DOMID_FIRST_RESERVED ) >> + { >> + sampling = choose_hwdom_vcpu(); >> + if ( !sampling ) >> + return 0; >> + } >> + else >> + sampling = sampled; >> + >> + vpmu = vcpu_vpmu(sampling); >> + if ( !is_hvm_domain(sampling->domain) ) >> + { >> + /* PV(H) guest */ >> + const struct cpu_user_regs *cur_regs; >> + >> + if ( !vpmu->xenpmu_data ) >> + return 0; >> + >> + if ( vpmu->xenpmu_data->pmu_flags & PMU_CACHED ) >> + return 1; >> + >> + if ( is_pvh_domain(sampled->domain) && > Here and below - is this really the right condition? I.e. is the > opposite case (doing nothing here, but the one further down > having an else) really meant to cover both HVM and PV? The outer > !is_hvm_() doesn't count here as that acts on sampling, not > sampled. This is test for an error in do_interrupt() --- if it reported a failure then there is no reason to proceed further. > >> + !vpmu->arch_vpmu_ops->do_interrupt(regs) ) >> + return 0; >> + >> >> + >> + /* Non-privileged domains are always in XENPMU_MODE_SELF mode */ >> + if ( (vpmu_mode & XENPMU_MODE_SELF) || >> + (!is_hardware_domain(sampled->domain) && >> + !is_idle_vcpu(sampled)) ) >> + cur_regs = guest_cpu_user_regs(); >> + else >> + cur_regs = regs; >> + >> + r->rip = cur_regs->rip; >> + r->rsp = cur_regs->rsp; >> + >> + if ( !is_pvh_domain(sampled->domain) ) > (This is the other instance.) This actually should be !has_hvm_container_domain(sampled->domain). > >> + { >> + r->cs = cur_regs->cs; >> + if ( sampled->arch.flags & TF_kernel_mode ) >> + r->cs &= ~3; > And once again I wonder how the consumer of this data is to tell > apart guest kernel and hypervisor addresses. Based on the RIP --- perf, for example, searches through various symbol tables. I suppose I can set xenpmu_data->domain_id below to either DOMID_SELF for guest and DOMID_XEN for the hypervisor. > >> + } >> + else >> + { >> + struct segment_register seg_cs; >> + >> + hvm_get_segment_register(sampled, x86_seg_cs, &seg_cs); >> + r->cs = seg_cs.sel; >> + } >> + } >> + >> + vpmu->xenpmu_data->domain_id = DOMID_SELF; >> + vpmu->xenpmu_data->vcpu_id = sampled->vcpu_id; >> + vpmu->xenpmu_data->pcpu_id = smp_processor_id(); >> + >> + vpmu->xenpmu_data->pmu_flags |= PMU_CACHED; >> + vpmu->hw_lapic_lvtpc |= APIC_LVT_MASKED; >> + apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc); >> + >> + send_guest_vcpu_virq(sampling, VIRQ_XENPMU); >> + >> + return 1; >> + } >> -boris