* patch for restricted vPMU modes
@ 2015-11-21 5:32 Brendan Gregg
2015-11-23 11:08 ` Jan Beulich
0 siblings, 1 reply; 6+ messages in thread
From: Brendan Gregg @ 2015-11-21 5:32 UTC (permalink / raw)
To: xen-devel
G'Day,
The vpmu feature of Xen is incredibly useful for performance analysis,
however, it's currently all counters or nothing. In secure environments,
there can be hesitation to enable access to all PMCs (there are hundreds of
them). I've included a prototype patch that introduces two new restricted
vpmu modes:
vpmu=ipc: As the most restricted minimum set. This enables cycles,
reference cycles, and instructions only. This is enough to calculate
instructions per cycle (IPC).
vpmu=arch: This enables the 7 pre-defined architectural events as listed in
cpuid, and in Table 18-1 of the Intel software developer's manual, vol 3B.
There can be a third mode added later on, with a larger set (including
micro-ops PMCs).
I've included the short patch below for Xen 4.6.0, which provides these
modes (it also fixes a minor copy-and-paste error with
core2_get_fixed_pmc_count(), which I believe was accessing the wrong
register). I am not a veteran Xen programmer, so please feel free to edit
or rewrite this patch. In case this email messes it up, it's also on:
https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
I've shown testing of four modes (off, on, ipc, arch) here:
https://gist.github.com/brendangregg/b7318c0f49bf906dc8df
For example, here is Linux perf running in a PVHVM guest with the new
vpmu=ipc mode:
root@vm0hvm:~# perf stat -d ./noploop
Performance counter stats for './noploop':
1511.326375 task-clock (msec) # 0.999 CPUs utilized
24 context-switches # 0.016 K/sec
0 cpu-migrations # 0.000 K/sec
113 page-faults # 0.075 K/sec
5,028,638,883 cycles # 3.327 GHz
0 stalled-cycles-frontend # 0.00% frontend cycles idle
0 stalled-cycles-backend # 0.00% backend cycles idle
20,043,427,933 instructions # 3.99 insns per cycle
0 branches # 0.000 K/sec
0 branch-misses # 0.00% of all branches
0 L1-dcache-loads # 0.000 K/sec
0 L1-dcache-load-misses # 0.00% of all L1-dcache hits
0 LLC-loads # 0.000 K/sec
<not supported> LLC-load-misses:HG
Note that IPC is shown ("insns per cycle"), but other counters are not.
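For reference, the derived figures in that output are plain ratios of the raw counters. The sketch below shows the arithmetic; the function names ipc() and ghz() are made up for illustration and are not part of perf or Xen.

```c
#include <stdint.h>

/* Instructions per cycle; returns 0.0 when no cycles were counted. */
double ipc(uint64_t instructions, uint64_t cycles)
{
    return cycles ? (double)instructions / (double)cycles : 0.0;
}

/* Average clock rate in GHz from a cycle count and elapsed task-clock (msec). */
double ghz(uint64_t cycles, double task_clock_msec)
{
    /* cycles / (msec * 1e6) == cycles per nanosecond == GHz */
    return task_clock_msec > 0.0 ? (double)cycles / (task_clock_msec * 1e6) : 0.0;
}
```

With the numbers above, ipc(20043427933, 5028638883) comes to roughly 3.99 and ghz(5028638883, 1511.326375) to roughly 3.327, which agrees with the perf annotations.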
---patch---
diff -ur xen-4.6.0-clean/docs/misc/xen-command-line.markdown xen-4.6.0-brendan/docs/misc/xen-command-line.markdown
--- xen-4.6.0-clean/docs/misc/xen-command-line.markdown 2015-10-05 07:33:39.000000000 -0700
+++ xen-4.6.0-brendan/docs/misc/xen-command-line.markdown 2015-11-20 15:29:05.663781176 -0800
@@ -1444,7 +1444,7 @@
flushes on VM entry and exit, increasing performance.
### vpmu
-> `= ( bts )`
+> `= ( <boolean> | bts | ipc | arch )`
> Default: `off`
@@ -1460,6 +1460,15 @@
If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS)
feature is switched on on Intel processors supporting this feature.
+vpmu=ipc enables performance monitoring, but restricts the counters to the
+most minimum set possible: instructions, cycles, and reference cycles. These
+can be used to calculate instructions per cycle (IPC).
+
+vpmu=arch enables performance monitoring, but restricts the counters to the
+pre-defined architectural events only. These are exposed by cpuid, and listed
+in Table 18-1 from the Intel 64 and IA-32 Architectures Software Developer's
+Manual, Volume 3B, System Programming Guide, Part 2.
+
Note that if **watchdog** option is also specified vpmu will be turned off.
*Warning:*
diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c
--- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c 2015-10-05 07:33:39.000000000 -0700
+++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c 2015-11-20 15:29:50.847781176 -0800
@@ -43,9 +43,11 @@
CHECK_pmu_params;
/*
- * "vpmu" : vpmu generally enabled
- * "vpmu=off" : vpmu generally disabled
- * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
+ * "vpmu" : vpmu generally enabled (all counters)
+ * "vpmu=off" : vpmu generally disabled
+ * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
+ * "vpmu=ipc" : vpmu enabled for IPC counters only (most restrictive)
+ * "vpmu=arch" : vpmu enabled for predef arch counters only (restrictive)
*/
static unsigned int __read_mostly opt_vpmu_enabled;
unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
@@ -67,6 +69,10 @@
default:
if ( !strcmp(s, "bts") )
vpmu_features |= XENPMU_FEATURE_INTEL_BTS;
+ else if ( !strcmp(s, "ipc") )
+ vpmu_features |= XENPMU_FEATURE_IPC_ONLY;
+ else if ( !strcmp(s, "arch") )
+ vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
else if ( *s )
{
printk("VPMU: unknown flag: %s - vpmu disabled!\n", s);
diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c
--- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c 2015-10-05 07:33:39.000000000 -0700
+++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c 2015-11-20 15:29:42.571781176 -0800
@@ -166,10 +166,10 @@
*/
static int core2_get_fixed_pmc_count(void)
{
- u32 eax;
+ u32 edx;
- eax = cpuid_eax(0xa);
- return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
+ edx = cpuid_edx(0xa);
+ return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
}
/* edx bits 5-12: Bit width of fixed-function performance counters */
@@ -652,12 +652,52 @@
tmp = msr - MSR_P6_EVNTSEL(0);
if ( tmp >= 0 && tmp < arch_pmc_cnt )
{
+ int umaskevent, blocked = 0;
struct xen_pmu_cntr_pair *xen_pmu_cntr_pair =
vpmu_reg_pointer(core2_vpmu_cxt, arch_counters);
if ( msr_content & ARCH_CTRL_MASK )
return -EINVAL;
+ /* PMC filters */
+ umaskevent = msr_content & MSR_IA32_CMT_EVTSEL_UE_MASK;
+ if ( vpmu_features & XENPMU_FEATURE_IPC_ONLY ||
+ vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
+ {
+ blocked = 1;
+ switch ( umaskevent )
+ {
+ /*
+ * See Table 18-1 from the Intel 64 and IA-32 Architectures Software
+ * Developer's Manual, Volume 3B, System Programming Guide, Part 2.
+ */
+ case 0x003c: /* unhalted core cycles */
+ case 0x013c: /* unhalted ref cycles */
+ case 0x00c0: /* instruction retired */
+ blocked = 0;
+ default:
+ break;
+ }
+ }
+
+ if ( vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
+ {
+ /* additional counters beyond IPC only; blocked already set */
+ switch ( umaskevent )
+ {
+ case 0x4f2e: /* LLC reference */
+ case 0x412e: /* LLC misses */
+ case 0x00c4: /* branch instruction retired */
+ case 0x00c5: /* branch misses retired */
+ blocked = 0;
+ default:
+ break;
+ }
+ }
+
+ if ( blocked )
+ return -EINVAL;
+
if ( has_hvm_container_vcpu(v) )
vmx_read_guest_msr(MSR_CORE_PERF_GLOBAL_CTRL,
&core2_vpmu_cxt->global_ctrl);
diff -ur xen-4.6.0-clean/xen/include/public/pmu.h xen-4.6.0-brendan/xen/include/public/pmu.h
--- xen-4.6.0-clean/xen/include/public/pmu.h 2015-10-05 07:33:39.000000000 -0700
+++ xen-4.6.0-brendan/xen/include/public/pmu.h 2015-11-20 15:30:08.887781176 -0800
@@ -84,9 +84,17 @@
/*
* PMU features:
- * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
+ * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
+ * - XENPMU_FEATURE_IPC_ONLY: Restrict PMC to the most minimum set possible.
+ * Instructions, cycles, and ref cycles. Can be
+ * used to calculate instructions-per-cycle (IPC).
+ * - XENPMU_FEATURE_ARCH_ONLY: Restrict PMCs to the Intel pre-defined
+ * architectural events exposed by cpuid and
+ * listed in Table 18-1 of the developer's manual.
*/
-#define XENPMU_FEATURE_INTEL_BTS 1
+#define XENPMU_FEATURE_INTEL_BTS (1<<0)
+#define XENPMU_FEATURE_IPC_ONLY (1<<1)
+#define XENPMU_FEATURE_ARCH_ONLY (1<<2)
/*
* Shared PMU data between hypervisor and PV(H) domains.
---patch---
Brendan
--
Brendan Gregg, Senior Performance Architect, Netflix
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: patch for restricted vPMU modes
2015-11-21 5:32 Brendan Gregg
@ 2015-11-23 11:08 ` Jan Beulich
2015-11-23 11:12 ` Andrew Cooper
0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2015-11-23 11:08 UTC (permalink / raw)
To: Brendan Gregg; +Cc: xen-devel
>>> On 21.11.15 at 06:32, <bgregg@netflix.com> wrote:
> I've included the short patch below for Xen 4.6.0, which provides these
> modes (it also fixes a minor copy-and-paste error with
> core2_get_fixed_pmc_count(), which I believe was accessing the wrong
> register). I am not a veteran Xen programmer, so please feel free to edit
> or rewrite this patch. In case this email messes it up, it's also on:
> https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
Thanks for the contribution, but I'm sorry - this is not how things work.
Unless someone else wants to pick this up (and perhaps even then) the
patch lacks proper attributes (like a Signed-off-by tag), should be
against -unstable instead of any released version, and I don't think
anyone's going to go grab it from a web page to apply (i.e. if you
can't get your mail client to handle it properly when inlined, attach it
in addition to inlining).
See http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches.
> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c 2015-10-05
> 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c 2015-11-20
> 15:29:42.571781176 -0800
> @@ -166,10 +166,10 @@
> */
> static int core2_get_fixed_pmc_count(void)
> {
> - u32 eax;
> + u32 edx;
>
> - eax = cpuid_eax(0xa);
> - return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
> + edx = cpuid_edx(0xa);
> + return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
> }
Without going into much detail on the actual patch, this caught my
eye: Either you're fixing a pretty blatant bug here, or this change
just can't be right. In the former case, such a fix should be
submitted as a separate patch.
Jan
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: patch for restricted vPMU modes
2015-11-23 11:08 ` Jan Beulich
@ 2015-11-23 11:12 ` Andrew Cooper
0 siblings, 0 replies; 6+ messages in thread
From: Andrew Cooper @ 2015-11-23 11:12 UTC (permalink / raw)
To: Jan Beulich, Brendan Gregg; +Cc: xen-devel
On 23/11/15 11:08, Jan Beulich wrote:
>>>> On 21.11.15 at 06:32, <bgregg@netflix.com> wrote:
>> I've included the short patch below for Xen 4.6.0, which provides these
>> modes (it also fixes a minor copy-and-paste error with
>> core2_get_fixed_pmc_count(), which I believe was accessing the wrong
>> register). I am not a veteran Xen programmer, so please feel free to edit
>> or rewrite this patch. In case this email messes it up, it's also on:
>> https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
> Thanks for the contribution, but I'm sorry - this is not how things work.
> Unless someone else wants to pick this up (and perhaps even then) the
> patch lacks proper attributes (like a Signed-off-by tag), should be
> against -unstable instead of any released version, and I don't think
> anyone's going to go grab it from a web page to apply (i.e. if you
> can't get your mail client to handle it properly when inlined, attach it
> in addition to inlining).
>
> See http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches.
>
>> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c 2015-10-05
>> 07:33:39.000000000 -0700
>> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c 2015-11-20
>> 15:29:42.571781176 -0800
>> @@ -166,10 +166,10 @@
>> */
>> static int core2_get_fixed_pmc_count(void)
>> {
>> - u32 eax;
>> + u32 edx;
>>
>> - eax = cpuid_eax(0xa);
>> - return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
>> + edx = cpuid_edx(0xa);
>> + return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
>> }
> Without going into much detail on the actual patch, this caught my
> eye: Either you're fixing a pretty blatant bug here, or this change
> just can't be right. In the former case, such a fix should be
> submitted as a separate patch.
Blatant bug. The number of fixed function perf counters is bits 4:0 of edx.
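To make the field layout concrete, here is a minimal sketch of pulling both fixed-counter fields out of CPUID leaf 0xa EDX. The EX_* names are local stand-ins invented for this example and are only assumed to behave like Xen's MASK_EXTR and PMU_FIXED_* definitions; the bit positions come from this thread (bits 4:0 for the count) and from the comment in the patch context (bits 5-12 for the width).

```c
#include <stdint.h>

/* Local stand-ins for illustration; not copied from the Xen tree. */
#define EX_MASK_EXTR(v, m)   (((v) & (m)) / ((m) & -(m)))
#define EX_FIXED_NR_MASK     0x0000001fu  /* edx bits 4:0: number of fixed counters */
#define EX_FIXED_WIDTH_MASK  0x00001fe0u  /* edx bits 12:5: fixed counter bit width */

/* Number of fixed-function counters reported in CPUID.0xa:EDX. */
unsigned int ex_fixed_pmc_count(uint32_t edx)
{
    return EX_MASK_EXTR(edx, EX_FIXED_NR_MASK);
}

/* Bit width of the fixed-function counters reported in CPUID.0xa:EDX. */
unsigned int ex_fixed_pmc_width(uint32_t edx)
{
    return EX_MASK_EXTR(edx, EX_FIXED_WIDTH_MASK);
}
```

As a made-up sample input, edx = 0x603 decodes to 3 fixed counters that are 48 bits wide. Applying the same mask to eax instead would read bits of the version ID field, which is the bug being fixed.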
~Andrew
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: patch for restricted vPMU modes
[not found] <CAJN39ohu5fvs9rKowg5iaZPJeni+Xn2i+cPkbUgX++0GksgtDQ@mail.gmail.com>
@ 2015-11-23 14:35 ` Boris Ostrovsky
2015-11-23 22:01 ` Brendan Gregg
0 siblings, 1 reply; 6+ messages in thread
From: Boris Ostrovsky @ 2015-11-23 14:35 UTC (permalink / raw)
To: Brendan Gregg, xen-devel, Dietmar Hahn
(+ Dietmar)
On 11/20/2015 07:21 PM, Brendan Gregg wrote:
> G'Day,
>
> The vpmu feature of Xen is incredibly useful for performance analysis,
> however, it's currently all counters or nothing. In secure
> environments, there can be hesitation to enable access to all PMCs
> (there are hundreds of them). I've included a prototype patch that
> introduces two new restricted vpmu modes:
>
> vpmu=ipc: As the most restricted minimum set. This enables cycles,
> reference cycles, and instructions only. This is enough to calculate
> instructions per cycle (IPC).
>
> vpmu=arch: This enables the 7 pre-defined architectural events as
> listed in cpuid, and in Table 18-1 of the Intel software developer's
> manual, vol 3B.
>
> There can be a third mode added later on, with a larger set (including
> micro-ops PMCs).
These new features would need a corresponding change in Linux for PV
guests (or for dom0 to change feature set globally). But before that
do_xenpmu_op()'s XENPMU_feature_set clause will have to be updated to
deal with new modes.
>
> I've included the short patch below for Xen 4.6.0, which provides
> these modes (it also fixes a minor copy-and-paste error with
> core2_get_fixed_pmc_count(), which I believe was accessing the wrong
> register). I am not a veteran Xen programmer, so please feel free to
> edit or rewrite this patch. In case this email messes it up, it's also
> on:
> https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
>
> I've shown testing of four modes (off, on, ipc, arch) here:
> https://gist.github.com/brendangregg/b7318c0f49bf906dc8df
>
> For example, here is Linux perf running in a PVHVM guest with the new
> vpmu=ipc mode:
>
> root@vm0hvm:~# perf stat -d ./noploop
>
> Performance counter stats for './noploop':
>
> 1511.326375 task-clock (msec) # 0.999 CPUs utilized
> 24 context-switches # 0.016 K/sec
> 0 cpu-migrations # 0.000 K/sec
> 113 page-faults # 0.075 K/sec
> 5,028,638,883 cycles # 3.327 GHz
> 0 stalled-cycles-frontend # 0.00% frontend
> cycles idle
> 0 stalled-cycles-backend # 0.00% backend
> cycles idle
> 20,043,427,933 instructions # 3.99 insns per cycle
> 0 branches # 0.000 K/sec
> 0 branch-misses # 0.00% of all branches
> 0 L1-dcache-loads # 0.000 K/sec
> 0 L1-dcache-load-misses # 0.00% of all
> L1-dcache hits
> 0 LLC-loads # 0.000 K/sec
> <not supported> LLC-load-misses:HG
>
> Note that IPC is shown ("insns per cycle"), but other counters are not.
>
>
> ---patch---
> diff -ur xen-4.6.0-clean/docs/misc/xen-command-line.markdown
> xen-4.6.0-brendan/docs/misc/xen-command-line.markdown
> --- xen-4.6.0-clean/docs/misc/xen-command-line.markdown2015-10-05
> 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/docs/misc/xen-command-line.markdown2015-11-20
> 15:29:05.663781176 -0800
> @@ -1444,7 +1444,7 @@
> flushes on VM entry and exit, increasing performance.
> ### vpmu
> -> `= ( bts )`
> +> `= ( <boolean> | bts | ipc | arch )`
> > Default: `off`
> @@ -1460,6 +1460,15 @@
> If 'vpmu=bts' is specified the virtualisation of the Branch Trace
> Store (BTS)
> feature is switched on on Intel processors supporting this feature.
> +vpmu=ipc enables performance monitoring, but restricts the counters
> to the
> +most minimum set possible: instructions, cycles, and reference
> cycles. These
> +can be used to calculate instructions per cycle (IPC).
> +
> +vpmu=arch enables performance monitoring, but restricts the counters
> to the
> +pre-defined architectural events only. These are exposed by cpuid,
> and listed
> +in Table 18-1 from the Intel 64 and IA-32 Architectures Software
> Developer's
> +Manual, Volume 3B, System Programming Guide, Part 2.
> +
> Note that if **watchdog** option is also specified vpmu will be
> turned off.
> *Warning:*
> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c
> xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c
> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c2015-10-05
> 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c2015-11-20
> 15:29:50.847781176 -0800
> @@ -43,9 +43,11 @@
> CHECK_pmu_params;
> /*
> - * "vpmu" : vpmu generally enabled
> - * "vpmu=off" : vpmu generally disabled
> - * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
> + * "vpmu" : vpmu generally enabled (all counters)
> + * "vpmu=off" : vpmu generally disabled
> + * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
> + * "vpmu=ipc" : vpmu enabled for IPC counters only (most restrictive)
> + * "vpmu=arch" : vpmu enabled for predef arch counters only (restrictive)
> */
> static unsigned int __read_mostly opt_vpmu_enabled;
> unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
> @@ -67,6 +69,10 @@
> default:
> if ( !strcmp(s, "bts") )
> vpmu_features |= XENPMU_FEATURE_INTEL_BTS;
> + else if ( !strcmp(s, "ipc") )
> + vpmu_features |= XENPMU_FEATURE_IPC_ONLY;
> + else if ( !strcmp(s, "arch") )
> + vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
> else if ( *s )
> {
> printk("VPMU: unknown flag: %s - vpmu disabled!\n", s);
> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c
> xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c
> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c2015-10-05
> 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c2015-11-20
> 15:29:42.571781176 -0800
> @@ -166,10 +166,10 @@
> */
> static int core2_get_fixed_pmc_count(void)
> {
> - u32 eax;
> + u32 edx;
> - eax = cpuid_eax(0xa);
> - return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
> + edx = cpuid_edx(0xa);
> + return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
> }
This would need to be made into a separate patch since it fixes a bug.
> /* edx bits 5-12: Bit width of fixed-function performance counters */
> @@ -652,12 +652,52 @@
> tmp = msr - MSR_P6_EVNTSEL(0);
> if ( tmp >= 0 && tmp < arch_pmc_cnt )
> {
> + int umaskevent, blocked = 0;
Should be uint64_t and bool_t.
> struct xen_pmu_cntr_pair *xen_pmu_cntr_pair =
> vpmu_reg_pointer(core2_vpmu_cxt, arch_counters);
> if ( msr_content & ARCH_CTRL_MASK )
> return -EINVAL;
> + /* PMC filters */
> + umaskevent = msr_content & MSR_IA32_CMT_EVTSEL_UE_MASK;
I don't see this mask defined anywhere. (I assume it's 0xffffffff).
Also, if either of those two flags is set we probably want to block
MSR_IA32_DS_AREA and MSR_IA32_PEBS_ENABLE accesses as well.
> + if ( vpmu_features & XENPMU_FEATURE_IPC_ONLY ||
> + vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
> + {
> + blocked = 1;
> + switch ( umaskevent )
> + {
> + /*
> + * See Table 18-1 from the Intel 64 and IA-32
> Architectures Software
> + * Developer's Manual, Volume 3B, System Programming
> Guide, Part 2.
> + */
> + case 0x003c:/* unhalted core cycles */
> + case 0x013c:/* unhalted ref cycles */
> + case 0x00c0:/* instruction retired */
> + blocked = 0;
> + default:
> + break;
> + }
> + }
> +
> + if ( vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
> + {
> + /* additional counters beyond IPC only; blocked
> already set */
> + switch ( umaskevent )
> + {
> + case 0x4f2e:/* LLC reference */
> + case 0x412e:/* LLC misses */
> + case 0x00c4:/* branch instruction retired */
> + case 0x00c5:/* branch */
> + blocked = 0;
> + default:
> + break;
> + }
> + }
> +
> + if ( blocked )
> + return -EINVAL;
> +
> if ( has_hvm_container_vcpu(v) )
> vmx_read_guest_msr(MSR_CORE_PERF_GLOBAL_CTRL,
> &core2_vpmu_cxt->global_ctrl);
> diff -ur xen-4.6.0-clean/xen/include/public/pmu.h
> xen-4.6.0-brendan/xen/include/public/pmu.h
> --- xen-4.6.0-clean/xen/include/public/pmu.h2015-10-05
> 07:33:39.000000000 -0700
> +++ xen-4.6.0-brendan/xen/include/public/pmu.h2015-11-20
> 15:30:08.887781176 -0800
> @@ -84,9 +84,17 @@
> /*
> * PMU features:
> - * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
> + * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
> + * - XENPMU_FEATURE_IPC_ONLY: Restrict PMC to the most minimum set
> possible.
> + * Instructions, cycles, and ref cycles.
> Can be
> + * used to calculate
> instructions-per-cycle (IPC).
> + * - XENPMU_FEATURE_ARCH_ONLY: Restrict PMCs to the Intel pre-defined
> + * architectural events exposed by cpuid and
> + * listed in Table 18-1 of the
> developer's manual.
Needs "(ignored on AMD)"
-boris
> */
> -#define XENPMU_FEATURE_INTEL_BTS 1
> +#define XENPMU_FEATURE_INTEL_BTS (1<<0)
> +#define XENPMU_FEATURE_IPC_ONLY (1<<1)
> +#define XENPMU_FEATURE_ARCH_ONLY (1<<2)
> /*
> * Shared PMU data between hypervisor and PV(H) domains.
> ---patch---
>
>
> Brendan
>
> --
> Brendan Gregg, Senior Performance Architect, Netflix
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: patch for restricted vPMU modes
2015-11-23 14:35 ` patch for restricted vPMU modes Boris Ostrovsky
@ 2015-11-23 22:01 ` Brendan Gregg
2015-11-23 22:54 ` Boris Ostrovsky
0 siblings, 1 reply; 6+ messages in thread
From: Brendan Gregg @ 2015-11-23 22:01 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Dietmar Hahn, xen-devel
On Mon, Nov 23, 2015 at 6:35 AM, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> (+ Dietmar)
>
> On 11/20/2015 07:21 PM, Brendan Gregg wrote:
>
>> G'Day,
>>
>> The vpmu feature of Xen is incredibly useful for performance analysis,
>> however, it's currently all counters or nothing. In secure environments,
>> there can be hesitation to enable access to all PMCs (there are hundreds of
>> them). I've included a prototype patch that introduces two new restricted
>> vpmu modes:
>>
>> vpmu=ipc: As the most restricted minimum set. This enables cycles,
>> reference cycles, and instructions only. This is enough to calculate
>> instructions per cycle (IPC).
>>
>> vpmu=arch: This enables the 7 pre-defined architectural events as listed
>> in cpuid, and in Table 18-1 of the Intel software developer's manual, vol
>> 3B.
>>
>> There can be a third mode added later on, with a larger set (including
>> micro-ops PMCs).
>>
>
> These new features would need a corresponding change in Linux for PV
> guests (or for dom0 to change feature set globally). But before that
> do_xenpmu_op()'s XENPMU_feature_set clause will have to be updated to deal
> with new modes.
>
Ok, thanks. For now this is HVM, and I can EINVAL in do_xenpmu_op() for the
new modes, like with BTS. (Do you know if the later Linux changes would be
more than a feature version of pmu_mode_store/pmu_mode_show?)
>
>
>> I've included the short patch below for Xen 4.6.0, which provides these
>> modes (it also fixes a minor copy-and-paste error with
>> core2_get_fixed_pmc_count(), which I believe was accessing the wrong
>> register). I am not a veteran Xen programmer, so please feel free to edit
>> or rewrite this patch. In case this email messes it up, it's also on:
>> https://github.com/brendangregg/Misc/blob/master/xen/xen-4.6.0-vpmu-filter.diff
>>
>> I've shown testing of four modes (off, on, ipc, arch) here:
>> https://gist.github.com/brendangregg/b7318c0f49bf906dc8df
>>
>> For example, here is Linux perf running in a PVHVM guest with the new
>> vpmu=ipc mode:
>>
>> root@vm0hvm:~# perf stat -d ./noploop
>>
>> Performance counter stats for './noploop':
>>
>> 1511.326375 task-clock (msec) # 0.999 CPUs utilized
>> 24 context-switches # 0.016 K/sec
>> 0 cpu-migrations # 0.000 K/sec
>> 113 page-faults # 0.075 K/sec
>> 5,028,638,883 cycles # 3.327 GHz
>> 0 stalled-cycles-frontend # 0.00% frontend cycles
>> idle
>> 0 stalled-cycles-backend # 0.00% backend cycles
>> idle
>> 20,043,427,933 instructions # 3.99 insns per cycle
>> 0 branches # 0.000 K/sec
>> 0 branch-misses # 0.00% of all branches
>> 0 L1-dcache-loads # 0.000 K/sec
>> 0 L1-dcache-load-misses # 0.00% of all L1-dcache
>> hits
>> 0 LLC-loads # 0.000 K/sec
>> <not supported> LLC-load-misses:HG
>>
>> Note that IPC is shown ("insns per cycle"), but other counters are not.
>>
>>
>> ---patch---
>> diff -ur xen-4.6.0-clean/docs/misc/xen-command-line.markdown
>> xen-4.6.0-brendan/docs/misc/xen-command-line.markdown
>> --- xen-4.6.0-clean/docs/misc/xen-command-line.markdown2015-10-05
>> 07:33:39.000000000 -0700
>> +++ xen-4.6.0-brendan/docs/misc/xen-command-line.markdown2015-11-20
>> 15:29:05.663781176 -0800
>> @@ -1444,7 +1444,7 @@
>> flushes on VM entry and exit, increasing performance.
>> ### vpmu
>> -> `= ( bts )`
>> +> `= ( <boolean> | bts | ipc | arch )`
>> > Default: `off`
>> @@ -1460,6 +1460,15 @@
>> If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store
>> (BTS)
>> feature is switched on on Intel processors supporting this feature.
>> +vpmu=ipc enables performance monitoring, but restricts the counters to
>> the
>> +most minimum set possible: instructions, cycles, and reference cycles.
>> These
>> +can be used to calculate instructions per cycle (IPC).
>> +
>> +vpmu=arch enables performance monitoring, but restricts the counters to
>> the
>> +pre-defined architectural events only. These are exposed by cpuid, and
>> listed
>> +in Table 18-1 from the Intel 64 and IA-32 Architectures Software
>> Developer's
>> +Manual, Volume 3B, System Programming Guide, Part 2.
>> +
>> Note that if **watchdog** option is also specified vpmu will be turned
>> off.
>> *Warning:*
>> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c
>> xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c
>> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu.c2015-10-05 07:33:39.000000000
>> -0700
>> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu.c2015-11-20
>> 15:29:50.847781176 -0800
>>
>> @@ -43,9 +43,11 @@
>> CHECK_pmu_params;
>> /*
>> - * "vpmu" : vpmu generally enabled
>> - * "vpmu=off" : vpmu generally disabled
>> - * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
>> + * "vpmu" : vpmu generally enabled (all counters)
>> + * "vpmu=off" : vpmu generally disabled
>> + * "vpmu=bts" : vpmu enabled and Intel BTS feature switched on.
>> + * "vpmu=ipc" : vpmu enabled for IPC counters only (most restrictive)
>> + * "vpmu=arch" : vpmu enabled for predef arch counters only (restrictive)
>> */
>> static unsigned int __read_mostly opt_vpmu_enabled;
>> unsigned int __read_mostly vpmu_mode = XENPMU_MODE_OFF;
>> @@ -67,6 +69,10 @@
>> default:
>> if ( !strcmp(s, "bts") )
>> vpmu_features |= XENPMU_FEATURE_INTEL_BTS;
>> + else if ( !strcmp(s, "ipc") )
>> + vpmu_features |= XENPMU_FEATURE_IPC_ONLY;
>> + else if ( !strcmp(s, "arch") )
>> + vpmu_features |= XENPMU_FEATURE_ARCH_ONLY;
>> else if ( *s )
>> {
>> printk("VPMU: unknown flag: %s - vpmu disabled!\n", s);
>> diff -ur xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c
>> xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c
>> --- xen-4.6.0-clean/xen/arch/x86/cpu/vpmu_intel.c2015-10-05
>> 07:33:39.000000000 -0700
>> +++ xen-4.6.0-brendan/xen/arch/x86/cpu/vpmu_intel.c2015-11-20
>> 15:29:42.571781176 -0800
>> @@ -166,10 +166,10 @@
>> */
>> static int core2_get_fixed_pmc_count(void)
>> {
>> - u32 eax;
>> + u32 edx;
>> - eax = cpuid_eax(0xa);
>> - return MASK_EXTR(eax, PMU_FIXED_NR_MASK);
>> + edx = cpuid_edx(0xa);
>> + return MASK_EXTR(edx, PMU_FIXED_NR_MASK);
>> }
>>
>
> This would need to be made into a separate patch since it fixes a bug.
Ok, thanks.
>
>
> /* edx bits 5-12: Bit width of fixed-function performance counters */
>> @@ -652,12 +652,52 @@
>> tmp = msr - MSR_P6_EVNTSEL(0);
>> if ( tmp >= 0 && tmp < arch_pmc_cnt )
>> {
>> + int umaskevent, blocked = 0;
>>
>
> Should be uint64_t and bool_t.
Ok, thanks.
>
>
> struct xen_pmu_cntr_pair *xen_pmu_cntr_pair =
>> vpmu_reg_pointer(core2_vpmu_cxt, arch_counters);
>> if ( msr_content & ARCH_CTRL_MASK )
>> return -EINVAL;
>> + /* PMC filters */
>> + umaskevent = msr_content & MSR_IA32_CMT_EVTSEL_UE_MASK;
>>
>
>
> I don't see this mask defined anywhere. (I assume it's 0xffffffff).
>
Ah, sorry, will be there in v2 (I'm switching to git send-email, thanks
Jan). It's 0x0000ffff.
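With the mask value stated, the filter in the patch can be read as a standalone predicate. The following is only a sketch under assumptions: the XENPMU_FEATURE_* values are taken from the patch, EX_EVTSEL_UE_MASK is a stand-in name for the 0x0000ffff mask, ex_evtsel_blocked() is a hypothetical helper, and the types follow the uint64_t/bool suggestion from the review (plain C bool here rather than Xen's bool_t).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in the patch; the EX_ names are illustrative only. */
#define XENPMU_FEATURE_INTEL_BTS  (1u << 0)
#define XENPMU_FEATURE_IPC_ONLY   (1u << 1)
#define XENPMU_FEATURE_ARCH_ONLY  (1u << 2)
#define EX_EVTSEL_UE_MASK         0x0000ffffULL  /* unit mask (15:8) + event select (7:0) */

/* Returns true if a write of msr_content to an event select MSR should be refused. */
bool ex_evtsel_blocked(uint64_t msr_content, uint32_t features)
{
    uint64_t umaskevent = msr_content & EX_EVTSEL_UE_MASK;

    /* Unrestricted mode: nothing is filtered. */
    if ( !(features & (XENPMU_FEATURE_IPC_ONLY | XENPMU_FEATURE_ARCH_ONLY)) )
        return false;

    switch ( umaskevent )
    {
    case 0x003c: /* unhalted core cycles */
    case 0x013c: /* unhalted ref cycles */
    case 0x00c0: /* instruction retired */
        return false;                       /* allowed in both restricted modes */
    case 0x4f2e: /* LLC reference */
    case 0x412e: /* LLC misses */
    case 0x00c4: /* branch instruction retired */
    case 0x00c5: /* branch misses retired */
        return !(features & XENPMU_FEATURE_ARCH_ONLY);
    default:
        return true;                        /* everything else is blocked */
    }
}
```

Because only the low 16 bits are compared, control bits such as USR, OS and EN in the upper part of the event select value do not affect the decision.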
>
> Also, if either of those two flags is set we probably want to block
> MSR_IA32_DS_AREA and MSR_IA32_PEBS_ENABLE accesses as well.
>
Ok, yes, good idea.
>
>
> + if ( vpmu_features & XENPMU_FEATURE_IPC_ONLY ||
>> + vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
>> + {
>> + blocked = 1;
>> + switch ( umaskevent )
>> + {
>> + /*
>> + * See Table 18-1 from the Intel 64 and IA-32
>> Architectures Software
>> + * Developer's Manual, Volume 3B, System Programming
>> Guide, Part 2.
>> + */
>> + case 0x003c:/* unhalted core cycles */
>> + case 0x013c:/* unhalted ref cycles */
>> + case 0x00c0:/* instruction retired */
>> + blocked = 0;
>> + default:
>> + break;
>> + }
>> + }
>> +
>> + if ( vpmu_features & XENPMU_FEATURE_ARCH_ONLY )
>> + {
>> + /* additional counters beyond IPC only; blocked already
>> set */
>> + switch ( umaskevent )
>> + {
>> + case 0x4f2e:/* LLC reference */
>> + case 0x412e:/* LLC misses */
>> + case 0x00c4:/* branch instruction retired */
>> + case 0x00c5:/* branch */
>> + blocked = 0;
>> + default:
>> + break;
>> + }
>> + }
>> +
>> + if ( blocked )
>> + return -EINVAL;
>> +
>> if ( has_hvm_container_vcpu(v) )
>> vmx_read_guest_msr(MSR_CORE_PERF_GLOBAL_CTRL,
>> &core2_vpmu_cxt->global_ctrl);
>> diff -ur xen-4.6.0-clean/xen/include/public/pmu.h
>> xen-4.6.0-brendan/xen/include/public/pmu.h
>> --- xen-4.6.0-clean/xen/include/public/pmu.h2015-10-05 07:33:39.000000000
>> -0700
>> +++ xen-4.6.0-brendan/xen/include/public/pmu.h2015-11-20
>> 15:30:08.887781176 -0800
>> @@ -84,9 +84,17 @@
>> /*
>> * PMU features:
>> - * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
>> + * - XENPMU_FEATURE_INTEL_BTS: Intel BTS support (ignored on AMD)
>> + * - XENPMU_FEATURE_IPC_ONLY: Restrict PMC to the most minimum set
>> possible.
>> + * Instructions, cycles, and ref cycles.
>> Can be
>> + * used to calculate instructions-per-cycle
>> (IPC).
>> + * - XENPMU_FEATURE_ARCH_ONLY: Restrict PMCs to the Intel pre-defined
>> + * architectural events exposed by cpuid and
>> + * listed in Table 18-1 of the developer's
>> manual.
>>
>
> Needs "(ignored on AMD)"
Ok, thanks.
Brendan
>
>
> -boris
>
>
>
> */
>> -#define XENPMU_FEATURE_INTEL_BTS 1
>> +#define XENPMU_FEATURE_INTEL_BTS (1<<0)
>> +#define XENPMU_FEATURE_IPC_ONLY (1<<1)
>> +#define XENPMU_FEATURE_ARCH_ONLY (1<<2)
>> /*
>> * Shared PMU data between hypervisor and PV(H) domains.
>> ---patch---
>>
>>
>> Brendan
>>
>> --
>> Brendan Gregg, Senior Performance Architect, Netflix
>>
>
>
--
Brendan Gregg, Senior Performance Architect, Netflix
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: patch for restricted vPMU modes
2015-11-23 22:01 ` Brendan Gregg
@ 2015-11-23 22:54 ` Boris Ostrovsky
0 siblings, 0 replies; 6+ messages in thread
From: Boris Ostrovsky @ 2015-11-23 22:54 UTC (permalink / raw)
To: Brendan Gregg; +Cc: Dietmar Hahn, xen-devel
On 11/23/2015 05:01 PM, Brendan Gregg wrote:
>
>
> On Mon, Nov 23, 2015 at 6:35 AM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>
>
> On 11/20/2015 07:21 PM, Brendan Gregg wrote:
>
> These new features would need a corresponding change in Linux for
> PV guests (or for dom0 to change feature set globally). But before
> that do_xenpmu_op()'s XENPMU_feature_set clause will have to be
> updated to deal with new modes.
>
>
> Ok, thanks. For now this is HVM, and I can EINVAL in do_xenpmu_op()
> for the new modes, like with BTS. (Do you know if the later Linux
> changes would be more than a feature version of
> pmu_mode_store/pmu_mode_show?)
You are right --- Linux doesn't need anything, it just passes the
features directly to the hypervisor. pmu_mode_store/show convert strings
to numbers but we don't use strings for features.
So the only thing that's needed is to allow the hypervisor to set/clear
those two bits in vpmu_features in do_xenpmu_op().
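Roughly, that amounts to widening the set of feature bits the handler accepts. The sketch below is hypothetical: ex_feature_set() and EX_FEATURE_MASK are names invented here, the flag values come from the patch, and the real XENPMU_feature_set clause has other checks that this does not try to reproduce.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as proposed in the patch. */
#define XENPMU_FEATURE_INTEL_BTS  (1u << 0)
#define XENPMU_FEATURE_IPC_ONLY   (1u << 1)
#define XENPMU_FEATURE_ARCH_ONLY  (1u << 2)

/* Hypothetical: every feature bit the handler is willing to accept. */
#define EX_FEATURE_MASK (XENPMU_FEATURE_INTEL_BTS | \
                         XENPMU_FEATURE_IPC_ONLY  | \
                         XENPMU_FEATURE_ARCH_ONLY)

/* Store the requested feature bits, or fail with -EINVAL on unknown bits. */
int ex_feature_set(uint32_t requested, uint32_t *features)
{
    if ( requested & ~EX_FEATURE_MASK )
        return -EINVAL;

    *features = requested;
    return 0;
}
```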
>
> Also, if either of those two flags is set we probably want to
> block MSR_IA32_DS_AREA and MSR_IA32_PEBS_ENABLE accesses as well.
>
>
> Ok, yes, good idea.
But: DS area is needed by BTS, so if that feature is set we want to allow
guests to write it. As for PEBS (which also uses DS area), that's really a
nop, so whether or not you block it won't make any difference. But we
should probably still do it.
I don't know whether anyone else uses debug store. The way SDM's section
18.11.4 describes it there don't seem to be any other users.
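One possible reading of that policy, as a sketch only: ex_msr_write_blocked() is a hypothetical helper, the flag values come from the patch, and the MSR numbers are the architectural ones (IA32_PEBS_ENABLE at 0x3f1, IA32_DS_AREA at 0x600). It assumes that BTS combined with a restricted mode should still be allowed to write the DS area.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in the patch. */
#define XENPMU_FEATURE_INTEL_BTS  (1u << 0)
#define XENPMU_FEATURE_IPC_ONLY   (1u << 1)
#define XENPMU_FEATURE_ARCH_ONLY  (1u << 2)

/* Architectural MSR indices. */
#define MSR_IA32_PEBS_ENABLE      0x000003f1u
#define MSR_IA32_DS_AREA          0x00000600u

/* Returns true if a guest write to msr should be refused in a restricted mode. */
bool ex_msr_write_blocked(uint32_t msr, uint32_t features)
{
    bool restricted =
        (features & (XENPMU_FEATURE_IPC_ONLY | XENPMU_FEATURE_ARCH_ONLY)) != 0;

    if ( !restricted )
        return false;

    switch ( msr )
    {
    case MSR_IA32_PEBS_ENABLE:
        return true;                                   /* always blocked when restricted */
    case MSR_IA32_DS_AREA:
        return !(features & XENPMU_FEATURE_INTEL_BTS); /* BTS still needs the DS area */
    default:
        return false;                                  /* other MSRs are handled elsewhere */
    }
}
```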
-boris
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-11-23 22:54 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CAJN39ohu5fvs9rKowg5iaZPJeni+Xn2i+cPkbUgX++0GksgtDQ@mail.gmail.com>
2015-11-23 14:35 ` patch for restricted vPMU modes Boris Ostrovsky
2015-11-23 22:01 ` Brendan Gregg
2015-11-23 22:54 ` Boris Ostrovsky
2015-11-21 5:32 Brendan Gregg
2015-11-23 11:08 ` Jan Beulich
2015-11-23 11:12 ` Andrew Cooper