From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: patch for restricted vPMU modes
Date: Mon, 23 Nov 2015 09:35:05 -0500
Message-ID: <56532419.3010807@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Brendan Gregg, xen-devel@lists.xen.org, Dietmar Hahn
List-Id: xen-devel@lists.xenproject.org

(+ Dietmar)

On 11/20/2015 07:21 PM, Brendan Gregg wrote:
> G'Day,
>
> The vpmu feature of Xen is incredibly useful for performance analysis,
> however, it's currently all counters or nothing. In secure
> environments, there can be hesitation to enable access to all PMCs
> (there are hundreds of them). I've included a prototype patch that
> introduces two new restricted vpmu modes:
>
> vpmu=ipc: As the most restricted minimum set. This enables cycles,
> reference cycles, and instructions only. This is enough to calculate
> instructions per cycle (IPC).
>
> vpmu=arch: This enables the 7 pre-defined architectural events as
> listed in cpuid, and in Table 18-1 of the Intel software developer's
> manual, vol 3B.
>
> There can be a third mode added later on, with a larger set (including
> micro-ops PMCs).

These new features would need a corresponding change in Linux for PV
guests (or for dom0, to change the feature set globally). Before that,
though, the XENPMU_feature_set clause in do_xenpmu_op() will have to be
updated to handle the new modes.

>
> I've included the short patch below for Xen 4.6.0, which provides
> these modes (it also fixes a minor copy-and-paste error with
> core2_get_fixed_pmc_count(), which I believe was accessing the wrong
> register). I am not a veteran Xen programmer, so please feel free to
> edit or rewrite this patch.
> In case this email messes it up, it's also on:
> https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
>
> I've shown testing of four modes (off, on, ipc, arch) here:
> https://gist.github.com/brendangregg/b7318c0f49bf906dc8df
>
> For example, here is Linux perf running in a PVHVM guest with the new
> vpmu=ipc mode:
>
> root@vm0hvm:~# perf stat -d ./noploop
>
>  Performance counter stats for './noploop':
>
>        1511.326375 task-clock (msec)         #    0.999 CPUs utilized
>                 24 context-switches          #    0.016 K/sec
>                  0 cpu-migrations            #    0.000 K/sec
>                113 page-faults               #    0.075 K/sec
>      5,028,638,883 cycles                    #    3.327 GHz
>                  0 stalled-cycles-frontend   #    0.00% frontend cycles idle
>                  0 stalled-cycles-backend    #    0.00% backend cycles idle
>     20,043,427,933 instructions              #    3.99 insns per cycle
>                  0 branches                  #    0.000 K/sec
>                  0 branch-misses             #    0.00% of all branches
>                  0 L1-dcache-loads           #    0.000 K/sec
>                  0 L1-dcache-load-misses     #    0.00% of all L1-dcache hits
>                  0 LLC-loads                 #    0.000 K/sec
>                    LLC-load-misses:HG
>
> Note that IPC is shown ("insns per cycle"), but other counters are not.
>
>
> ---patch---
> diff -ur xen-4.6.0-clean/docs/misc/xen-command-line.markdown xen-4.6.0-brendan/docs/misc/xen-command-line.markdown
> --- xen-4.6.0-clean/docs/misc/xen-command-line.markdown  2015-10-05 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/docs/misc/xen-command-line.markdown  2015-11-20 15:29:05.663781176 -0800
> @@ -1444,7 +1444,7 @@
>  flushes on VM entry and exit, increasing performance.
>  ### vpmu
> -> `= ( bts )`
> +> `= ( | bts | ipc | arch )`
>  > Default: `off`
> @@ -1460,6 +1460,15 @@
>  If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS)
>  feature is switched on on Intel processors supporting this feature.
> +vpmu=ipc enables performance monitoring, but restricts the counters to the
> +most minimum set possible: instructions, cycles, and reference cycles. These
> +can be used to calculate instructions per cycle (IPC).
> +
> +vpmu=arch enables performance monitoring, but restricts the counters to the
> +pre-defined architectural events only. These are exposed by cpuid, and listed
> +in Table 18-1 from the Intel 64 and IA-32 Architectures Software Developer's
> +Manual, Volume 3B, System Programming Guide, Part 2.
> +
>  Note that if **watchdog** option is also specified vpmu will be turned off.
>  *Warning:*
> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c
> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c  2015-10-05 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c  2015-11-20 15:29:50.847781176 -0800
> @@ -43,9 +43,11 @@
>  CHECK_pmu_params;
>  /*
> - * "vpmu" : vpmu generally enabled
> - * "vpmu=off" : vpmu generally disabled
> - * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
> + * "vpmu" : vpmu generally enabled (all counters)
> + * "vpmu=off" : vpmu generally disabled
> + * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
> + * "vpmu=ipc" : vpmu enabled for IPC counters only (most restrictive)
> + * "vpmu=arch" : vpmu enabled for predef arch counters only (restrictive)
>   */
>  static unsigned int __read_mostly opt_vpmu_enabled;
>  unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
> @@ -67,6 +69,10 @@
>      default:
>          if ( !strcmp(s, "bts") )
>              vpmu_features |= XENPMU_FEATURE_INTEL_BTS;
> +        else if ( !strcmp(s, "ipc") )
> +            vpmu_features |= XENPMU_FEATURE_IPC_ONLY;
> +        else if ( !strcmp(s, "arch") )
> +            vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
>          else if ( *s )
>          {
>              printk("VPMU: unknown flag: %s - vpmu disabled!\n", s);
> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c
> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c  2015-10-05 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c  2015-11-20 15:29:42.571781176 -0800
> @@ -166,10 +166,10 @@
>   */
>  static int core2_get_fixed_pmc_count(void)
>  {
> -    u32 eax;
> +    u32 edx;
> -    eax = cpuid_eax(0xa);
> -    return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
> +    edx = cpuid_edx(0xa);
> +    return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
>  }

This needs to go into a separate patch, since it fixes a bug.

>  /* edx bits 5-12: Bit width of fixed-function performance counters */
> @@ -652,12 +652,52 @@
>      tmp = msr - MSR_P6_EVNTSEL(0);
>      if ( tmp >= 0 && tmp < arch_pmc_cnt )
>      {
> +        int umaskevent, blocked = 0;

These should be uint64_t and bool_t, respectively.

>          struct xen_pmu_cntr_pair *xen_pmu_cntr_pair =
>              vpmu_reg_pointer(core2_vpmu_cxt, arch_counters);
>          if ( msr_content & ARCH_CTRL_MASK )
>              return -EINVAL;
> +        /* PMC filters */
> +        umaskevent = msr_content & MSR_IA32_CMT_EVTSEL_UE_MASK;

I don't see this mask defined anywhere. (I assume it's 0xffffffff).

Also, if either of those two flags is set, we probably want to block
accesses to MSR_IA32_DS_AREA and MSR_IA32_PEBS_ENABLE as well.
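
To make the intended filtering easier to discuss, here is a minimal
standalone sketch of the check as a helper function. It is not the patch
itself. The helper name pmc_event_blocked() and the mask name
EVTSEL_UMASK_EVENT_MASK are hypothetical, and the mask value 0xffff is an
assumption: the case labels in the patch only encode the event select
(bits 0-7) and unit mask (bits 8-15), so a wider mask would let the USR,
OS and EN bits a guest normally sets defeat the comparison. The flag
values are the ones proposed in the pmu.h hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in the patch's pmu.h hunk. */
#define XENPMU_FEATURE_INTEL_BTS (1u << 0)
#define XENPMU_FEATURE_IPC_ONLY  (1u << 1)
#define XENPMU_FEATURE_ARCH_ONLY (1u << 2)

/*
 * ASSUMPTION: only the event select (bits 0-7) and unit mask (bits 8-15)
 * take part in the comparison. The macro name is hypothetical.
 */
#define EVTSEL_UMASK_EVENT_MASK 0xffffULL

/* Hypothetical helper: true if a guest EVNTSEL write should be rejected. */
static bool pmc_event_blocked(uint32_t features, uint64_t msr_content)
{
    uint64_t umaskevent = msr_content & EVTSEL_UMASK_EVENT_MASK;

    /* Neither restricted mode is active: nothing to filter. */
    if ( !(features & (XENPMU_FEATURE_IPC_ONLY | XENPMU_FEATURE_ARCH_ONLY)) )
        return false;

    /* Events allowed in both restricted modes (enough to compute IPC). */
    switch ( umaskevent )
    {
    case 0x003c: /* unhalted core cycles */
    case 0x013c: /* unhalted reference cycles */
    case 0x00c0: /* instructions retired */
        return false;
    }

    /* Remaining architectural events, allowed in "arch" mode only. */
    if ( features & XENPMU_FEATURE_ARCH_ONLY )
    {
        switch ( umaskevent )
        {
        case 0x4f2e: /* LLC references */
        case 0x412e: /* LLC misses */
        case 0x00c4: /* branch instructions retired */
        case 0x00c5: /* branch misses retired */
            return false;
        }
    }

    return true;
}
```

With this shape, a write of 0x4300c0 (instructions retired with USR, OS
and EN set) passes in either restricted mode, while 0x43412e (LLC misses)
passes only with XENPMU_FEATURE_ARCH_ONLY. Returning early instead of
tracking a "blocked" variable also avoids the int-versus-bool_t question.
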
> +        if ( vpmu_features & XENPMU_FEATURE_IPC_ONLY ||
> +             vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
> +        {
> +            blocked = 1;
> +            switch ( umaskevent )
> +            {
> +            /*
> +             * See Table 18-1 from the Intel 64 and IA-32 Architectures Software
> +             * Developer's Manual, Volume 3B, System Programming Guide, Part 2.
> +             */
> +            case 0x003c: /* unhalted core cycles */
> +            case 0x013c: /* unhalted ref cycles */
> +            case 0x00c0: /* instruction retired */
> +                blocked = 0;
> +            default:
> +                break;
> +            }
> +        }
> +
> +        if ( vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
> +        {
> +            /* additional counters beyond IPC only; blocked already set */
> +            switch ( umaskevent )
> +            {
> +            case 0x4f2e: /* LLC reference */
> +            case 0x412e: /* LLC misses */
> +            case 0x00c4: /* branch instruction retired */
> +            case 0x00c5: /* branch */
> +                blocked = 0;
> +            default:
> +                break;
> +            }
> +        }
> +
> +        if ( blocked )
> +            return -EINVAL;
> +
>          if ( has_hvm_container_vcpu(v) )
>              vmx_read_guest_msr(MSR_CORE_PERF_GLOBAL_CTRL,
>                                 &core2_vpmu_cxt->global_ctrl);
> diff -ur xen-4.6.0-clean/xen/include/public/pmu.h xen-4.6.0-brendan/xen/include/public/pmu.h
> --- xen-4.6.0-clean/xen/include/public/pmu.h  2015-10-05 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/include/public/pmu.h  2015-11-20 15:30:08.887781176 -0800
> @@ -84,9 +84,17 @@
>  /*
>   * PMU features:
> - * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
> + * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
> + * - XENPMU_FEATURE_IPC_ONLY: Restrict PMC to the most minimum set possible.
> + *                             Instructions, cycles, and ref cycles. Can be
> + *                             used to calculate instructions-per-cycle (IPC).
> + * - XENPMU_FEATURE_ARCH_ONLY: Restrict PMCs to the Intel pre-defined
> + *                             architectural events exposed by cpuid and
> + *                             listed in Table 18-1 of the developer's manual.

The new flags need "(ignored on AMD)" as well.

-boris

>   */
> -#define XENPMU_FEATURE_INTEL_BTS 1
> +#define XENPMU_FEATURE_INTEL_BTS (1<<0)
> +#define XENPMU_FEATURE_IPC_ONLY  (1<<1)
> +#define XENPMU_FEATURE_ARCH_ONLY (1<<2)
>  /*
>   * Shared PMU data between hypervisor and PV(H) domains.
> ---patch---
>
>
> Brendan
>
> --
> Brendan Gregg, Senior Performance Architect, Netflix
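
Because the features are now distinct bits, both the command-line parser
and any XENPMU_feature_set-style validation can treat them as a mask. The
sketch below illustrates that idea only. vpmu_flag_from_string() and
vpmu_features_valid() are hypothetical helpers, not existing Xen
functions, and the "reject unknown bits" check is an assumption about
what an updated feature_set clause might look like. The flag values are
those from the quoted pmu.h hunk.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Flag values as proposed in the patch's pmu.h hunk. */
#define XENPMU_FEATURE_INTEL_BTS (1u << 0)
#define XENPMU_FEATURE_IPC_ONLY  (1u << 1)
#define XENPMU_FEATURE_ARCH_ONLY (1u << 2)

/* Hypothetical: every feature bit the hypervisor would accept. */
#define XENPMU_FEATURE_KNOWN_MASK \
    (XENPMU_FEATURE_INTEL_BTS | XENPMU_FEATURE_IPC_ONLY | \
     XENPMU_FEATURE_ARCH_ONLY)

/*
 * Hypothetical helper: map one "vpmu=" sub-option to its feature bit.
 * Returns 0 when the string is not a known sub-option.
 */
static uint32_t vpmu_flag_from_string(const char *s)
{
    if ( !strcmp(s, "bts") )
        return XENPMU_FEATURE_INTEL_BTS;
    if ( !strcmp(s, "ipc") )
        return XENPMU_FEATURE_IPC_ONLY;
    if ( !strcmp(s, "arch") )
        return XENPMU_FEATURE_ARCH_ONLY;
    return 0;
}

/*
 * Hypothetical validation for a feature-set request: accept the value
 * only if it contains no bits outside the known set.
 */
static bool vpmu_features_valid(uint32_t features)
{
    return (features & ~XENPMU_FEATURE_KNOWN_MASK) == 0;
}
```

For example, vpmu_flag_from_string("arch") yields (1u << 2), and a
request carrying bit 3 would be rejected by vpmu_features_valid() until
a corresponding feature is defined.
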