* [PATCH 1/1] perf, Add support for Xeon-Phi PMU
@ 2012-09-20 17:03 Vince Weaver
2012-09-24 17:48 ` Meadows, Lawrence F
2012-09-25 11:32 ` Peter Zijlstra
0 siblings, 2 replies; 12+ messages in thread
From: Vince Weaver @ 2012-09-20 17:03 UTC (permalink / raw)
To: linux-kernel
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F
Hello
Included below is a patch that adds perf support for the Xeon-Phi PMU,
as documented in the "Intel Xeon Phi Coprocessor (codename: Knights
Corner) Performance Monitoring Units" manual.
Even though it is a co-processor, a Phi runs a full Linux environment
and can support performance counters.
This is just barebones support; it does not add support for
interesting new features such as the SPFLT instruction that
allows starting/stopping events without entering the kernel.
The PMU internally is just like that of an original Pentium, but
a P6-like MSR interface is provided. The interface is different enough
from a real P6 that it's not easy (or practical) to re-use the code in
perf_event_p6.c.
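As a rough illustration of the P6-like MSR interface described above, the sketch below derives the per-counter MSR numbers and packs an event-select value. The MSR numbers, the counter count, and the bit layout are taken from the patch in this message; the helper names (phi_perfctr_msr, phi_eventsel_msr, phi_make_config) are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* MSR numbers as defined in the patch's msr-index.h hunk. */
#define MSR_PHI_PERFCTR0  0x00000020
#define MSR_PHI_EVNTSEL0  0x00000028
#define PHI_NUM_COUNTERS  2   /* .num_counters in the patch */

/* Hypothetical helper: counter MSR for counter idx, or -1 if out of range. */
static int phi_perfctr_msr(unsigned int idx)
{
	return idx < PHI_NUM_COUNTERS ? (int)(MSR_PHI_PERFCTR0 + idx) : -1;
}

/* Hypothetical helper: event-select MSR for counter idx, or -1 if out of range. */
static int phi_eventsel_msr(unsigned int idx)
{
	return idx < PHI_NUM_COUNTERS ? (int)(MSR_PHI_EVNTSEL0 + idx) : -1;
}

/*
 * Hypothetical helper: pack an event-select value using the field layout
 * from the patch's PMU_FORMAT_ATTR lines:
 * event 0-7, umask 8-15, edge 18, inv 23, cmask 24-31.
 */
static uint64_t phi_make_config(uint8_t event, uint8_t umask,
				int edge, int inv, uint8_t cmask)
{
	return (uint64_t)event |
	       ((uint64_t)umask << 8) |
	       ((uint64_t)(edge != 0) << 18) |
	       ((uint64_t)(inv != 0) << 23) |
	       ((uint64_t)cmask << 24);
}
```

For example, phi_make_config(0xcb, 0x10, 0, 0, 0) gives 0x10cb, the value the patch's cache table uses for L2_READ_MISS.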
One additional complication: some of the cache events map to
event "0". This causes problems because the generic events code
assumes "0" means not-available. I'm not sure the best way to address
that problem.
Vince
vincent.weaver@maine.edu
diff -Nur linux-3.6-rc6/arch/x86/include/asm/msr-index.h linux-3.6-rc6-mic/arch/x86/include/asm/msr-index.h
--- linux-3.6-rc6/arch/x86/include/asm/msr-index.h 2012-09-16 17:58:51.000000000 -0400
+++ linux-3.6-rc6-mic/arch/x86/include/asm/msr-index.h 2012-09-20 11:55:14.854332191 -0400
@@ -121,6 +121,11 @@
#define MSR_P6_EVNTSEL0 0x00000186
#define MSR_P6_EVNTSEL1 0x00000187
+#define MSR_PHI_PERFCTR0 0x00000020
+#define MSR_PHI_PERFCTR1 0x00000021
+#define MSR_PHI_EVNTSEL0 0x00000028
+#define MSR_PHI_EVNTSEL1 0x00000029
+
/* AMD64 MSRs. Not complete. See the architecture manual for a more
complete list. */
diff -Nur linux-3.6-rc6/arch/x86/kernel/cpu/Makefile linux-3.6-rc6-mic/arch/x86/kernel/cpu/Makefile
--- linux-3.6-rc6/arch/x86/kernel/cpu/Makefile 2012-09-16 17:58:51.000000000 -0400
+++ linux-3.6-rc6-mic/arch/x86/kernel/cpu/Makefile 2012-09-20 12:15:21.042331190 -0400
@@ -32,7 +32,7 @@
ifdef CONFIG_PERF_EVENTS
obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd.o
-obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_p4.o
+obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_phi.o perf_event_p4.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o
endif
diff -Nur linux-3.6-rc6/arch/x86/kernel/cpu/perfctr-watchdog.c linux-3.6-rc6-mic/arch/x86/kernel/cpu/perfctr-watchdog.c
--- linux-3.6-rc6/arch/x86/kernel/cpu/perfctr-watchdog.c 2012-09-16 17:58:51.000000000 -0400
+++ linux-3.6-rc6-mic/arch/x86/kernel/cpu/perfctr-watchdog.c 2012-09-20 11:56:51.430332108 -0400
@@ -56,6 +56,8 @@
switch (boot_cpu_data.x86) {
case 6:
return msr - MSR_P6_PERFCTR0;
+ case 11:
+ return msr - MSR_PHI_PERFCTR0;
case 15:
return msr - MSR_P4_BPU_PERFCTR0;
}
@@ -82,6 +84,8 @@
switch (boot_cpu_data.x86) {
case 6:
return msr - MSR_P6_EVNTSEL0;
+ case 11:
+ return msr - MSR_PHI_EVNTSEL0;
case 15:
return msr - MSR_P4_BSU_ESCR0;
}
diff -Nur linux-3.6-rc6/arch/x86/kernel/cpu/perf_event.h linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event.h
--- linux-3.6-rc6/arch/x86/kernel/cpu/perf_event.h 2012-09-16 17:58:51.000000000 -0400
+++ linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event.h 2012-09-20 12:49:25.782329486 -0400
@@ -624,6 +624,8 @@
int p6_pmu_init(void);
+int phi_pmu_init(void);
+
#else /* CONFIG_CPU_SUP_INTEL */
static inline void reserve_ds_buffers(void)
diff -Nur linux-3.6-rc6/arch/x86/kernel/cpu/perf_event_intel.c linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event_intel.c
--- linux-3.6-rc6/arch/x86/kernel/cpu/perf_event_intel.c 2012-09-16 17:58:51.000000000 -0400
+++ linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event_intel.c 2012-09-20 12:14:37.082331232 -0400
@@ -1906,6 +1906,8 @@
switch (boot_cpu_data.x86) {
case 0x6:
return p6_pmu_init();
+ case 0xb:
+ return phi_pmu_init();
case 0xf:
return p4_pmu_init();
}
diff -Nur linux-3.6-rc6/arch/x86/kernel/cpu/perf_event_phi.c linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event_phi.c
--- linux-3.6-rc6/arch/x86/kernel/cpu/perf_event_phi.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-3.6-rc6-mic/arch/x86/kernel/cpu/perf_event_phi.c 2012-09-20 12:48:28.454329539 -0400
@@ -0,0 +1,236 @@
+#include <linux/perf_event.h>
+#include <linux/types.h>
+
+#include "perf_event.h"
+
+static const u64 phi_perfmon_event_map[] =
+{
+ [PERF_COUNT_HW_CPU_CYCLES] = 0x002a,
+ [PERF_COUNT_HW_INSTRUCTIONS] = 0x0016,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0028,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x0029,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x0012,
+ [PERF_COUNT_HW_BRANCH_MISSES] = 0x002b,
+};
+
+static __initconst u64 phi_hw_cache_event_ids
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] =
+{
+ [ C(L1D) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0x10000, /* DATA_READ+cheat bit */
+ [ C(RESULT_MISS) ] = 0x0003, /* DATA_READ_MISS */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0001, /* DATA_WRITE */
+ [ C(RESULT_MISS) ] = 0x0004, /* DATA_WRITE_MISS */
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0011, /* L1_DATA_PF1 */
+ [ C(RESULT_MISS) ] = 0x001c, /* L1_DATA_PF1_MISS */
+ },
+ },
+ [ C(L1I ) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0x000c, /* CODE_READ */
+ [ C(RESULT_MISS) ] = 0x000e, /* CODE_CACHE_MISS */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = -1,
+ [ C(RESULT_MISS) ] = -1,
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0,
+ [ C(RESULT_MISS) ] = 0x0,
+ },
+ },
+ [ C(LL ) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0,
+ [ C(RESULT_MISS) ] = 0x10cb, /* L2_READ_MISS */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = 0x10cc, /* L2_WRITE_HIT ?? */
+ [ C(RESULT_MISS) ] = 0,
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = 0x10fc, /* L2_DATA_PF2 */
+ [ C(RESULT_MISS) ] = 0x10fe, /* L2_DATA_PF2_MISS */
+ },
+ },
+ [ C(DTLB) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0x10000, /* DATA_READ+cheat bit */
+ [ C(RESULT_MISS) ] = 0x0002, /* DATA_PAGE_WALK */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0001, /* DATA_WRITE */
+ [ C(RESULT_MISS) ] = 0x0002, /* DATA_PAGE_WALK */
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0,
+ [ C(RESULT_MISS) ] = 0x0,
+ },
+ },
+ [ C(ITLB) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0x000c, /* CODE_READ */
+ [ C(RESULT_MISS) ] = 0x000d, /* CODE_PAGE_WALK */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = -1,
+ [ C(RESULT_MISS) ] = -1,
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = -1,
+ [ C(RESULT_MISS) ] = -1,
+ },
+ },
+ [ C(BPU ) ] = {
+ [ C(OP_READ) ] = {
+ [ C(RESULT_ACCESS) ] = 0x0012, /* BRANCHES */
+ [ C(RESULT_MISS) ] = 0x002b, /* BRANCHES_MISPREDICTED */
+ },
+ [ C(OP_WRITE) ] = {
+ [ C(RESULT_ACCESS) ] = -1,
+ [ C(RESULT_MISS) ] = -1,
+ },
+ [ C(OP_PREFETCH) ] = {
+ [ C(RESULT_ACCESS) ] = -1,
+ [ C(RESULT_MISS) ] = -1,
+ },
+ },
+};
+
+
+static u64 phi_pmu_event_map(int hw_event)
+{
+ return phi_perfmon_event_map[hw_event];
+}
+
+static struct event_constraint phi_event_constraints[] =
+{
+ INTEL_EVENT_CONSTRAINT(0xc3, 0x1), /* HWP_L2HIT */
+ INTEL_EVENT_CONSTRAINT(0xc4, 0x1), /* HWP_L2MISS */
+ INTEL_EVENT_CONSTRAINT(0xc8, 0x1), /* L2_READ_HIT_E */
+ INTEL_EVENT_CONSTRAINT(0xc9, 0x1), /* L2_READ_HIT_M */
+ INTEL_EVENT_CONSTRAINT(0xca, 0x1), /* L2_READ_HIT_S */
+ INTEL_EVENT_CONSTRAINT(0xcb, 0x1), /* L2_READ_MISS */
+ INTEL_EVENT_CONSTRAINT(0xcc, 0x1), /* L2_WRITE_HIT */
+ INTEL_EVENT_CONSTRAINT(0xce, 0x1), /* L2_STRONGLY_ORDERED_STREAMING_VSTORES_MISS */
+ INTEL_EVENT_CONSTRAINT(0xcf, 0x1), /* L2_WEAKLY_ORDERED_STREAMING_VSTORE_MISS */
+ INTEL_EVENT_CONSTRAINT(0xd7, 0x1), /* L2_VICTIM_REQ_WITH_DATA */
+ INTEL_EVENT_CONSTRAINT(0xe3, 0x1), /* SNP_HITM_BUNIT */
+ INTEL_EVENT_CONSTRAINT(0xe6, 0x1), /* SNP_HIT_L2 */
+ INTEL_EVENT_CONSTRAINT(0xe7, 0x1), /* SNP_HITM_L2 */
+ INTEL_EVENT_CONSTRAINT(0xf1, 0x1), /* L2_DATA_READ_MISS_CACHE_FILL */
+ INTEL_EVENT_CONSTRAINT(0xf2, 0x1), /* L2_DATA_WRITE_MISS_CACHE_FILL */
+ INTEL_EVENT_CONSTRAINT(0xf6, 0x1), /* L2_DATA_READ_MISS_MEM_FILL */
+ INTEL_EVENT_CONSTRAINT(0xf7, 0x1), /* L2_DATA_WRITE_MISS_MEM_FILL */
+ INTEL_EVENT_CONSTRAINT(0xfc, 0x1), /* L2_DATA_PF2 */
+ INTEL_EVENT_CONSTRAINT(0xfd, 0x1), /* L2_DATA_PF2_DROP */
+ INTEL_EVENT_CONSTRAINT(0xfe, 0x1), /* L2_DATA_PF2_MISS */
+ INTEL_EVENT_CONSTRAINT(0xff, 0x1), /* L2_DATA_HIT_INFLIGHT_PF2 */
+ EVENT_CONSTRAINT_END
+};
+
+#define MSR_MIC_IA32_PERF_GLOBAL_STATUS 0x0000002d
+#define MSR_MIC_IA32_PERF_GLOBAL_OVF_CONTROL 0x0000002e
+#define MSR_MIC_IA32_PERF_GLOBAL_CTRL 0x0000002f
+
+
+static void phi_pmu_disable_all(void)
+{
+ u64 val;
+
+ rdmsrl(MSR_MIC_IA32_PERF_GLOBAL_CTRL, val);
+ val &= ~0x3;
+ wrmsrl(MSR_MIC_IA32_PERF_GLOBAL_CTRL, val);
+}
+
+static void phi_pmu_enable_all(int added)
+{
+ u64 val;
+
+ rdmsrl(MSR_MIC_IA32_PERF_GLOBAL_CTRL, val);
+ val |= 0x3;
+ wrmsrl(MSR_MIC_IA32_PERF_GLOBAL_CTRL, val);
+}
+
+static inline void
+phi_pmu_disable_event(struct perf_event *event)
+{
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 val = 0;
+
+ if (cpuc->enabled)
+ val |= ARCH_PERFMON_EVENTSEL_ENABLE;
+
+ (void)wrmsrl_safe(hwc->config_base + hwc->idx, val);
+}
+
+static void phi_pmu_enable_event(struct perf_event *event)
+{
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 val;
+
+ val = hwc->config;
+ if (cpuc->enabled)
+ val |= ARCH_PERFMON_EVENTSEL_ENABLE;
+
+ (void)wrmsrl_safe(hwc->config_base + hwc->idx, val);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-7" );
+PMU_FORMAT_ATTR(umask, "config:8-15" );
+PMU_FORMAT_ATTR(edge, "config:18" );
+PMU_FORMAT_ATTR(inv, "config:23" );
+PMU_FORMAT_ATTR(cmask, "config:24-31" );
+
+static struct attribute *intel_phi_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_cmask.attr,
+ NULL,
+};
+
+static __initconst struct x86_pmu phi_pmu = {
+ .name = "phi",
+ .handle_irq = x86_pmu_handle_irq,
+ .disable_all = phi_pmu_disable_all,
+ .enable_all = phi_pmu_enable_all,
+ .enable = phi_pmu_enable_event,
+ .disable = phi_pmu_disable_event,
+ .hw_config = x86_pmu_hw_config,
+ .schedule_events = x86_schedule_events,
+ .eventsel = MSR_PHI_EVNTSEL0,
+ .perfctr = MSR_PHI_PERFCTR0,
+ .event_map = phi_pmu_event_map,
+ .max_events = ARRAY_SIZE(phi_perfmon_event_map),
+ .apic = 1,
+ .max_period = (1ULL << 31) - 1,
+ .version = 0,
+ .num_counters = 2,
+ /* in theory 40 bits, early silicon is buggy though */
+ .cntval_bits = 32,
+ .cntval_mask = (1ULL << 32) - 1,
+ .get_event_constraints = x86_get_event_constraints,
+ .event_constraints = phi_event_constraints,
+ .format_attrs = intel_phi_formats_attr,
+};
+
+__init int phi_pmu_init(void)
+{
+ x86_pmu = phi_pmu;
+
+ memcpy(hw_cache_event_ids, phi_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+
+ return 0;
+}
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-20 17:03 [PATCH 1/1] perf, Add support for Xeon-Phi PMU Vince Weaver
@ 2012-09-24 17:48 ` Meadows, Lawrence F
2012-09-25 11:32 ` Peter Zijlstra
1 sibling, 0 replies; 12+ messages in thread
From: Meadows, Lawrence F @ 2012-09-24 17:48 UTC (permalink / raw)
To: Vince Weaver, linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian@gmail.com
There was a hack in my patch to work around the event code 0 problem; did you not get that? Anyway, I'll compare this one to mine. I just got back from vacation, so bear with me.
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-20 17:03 [PATCH 1/1] perf, Add support for Xeon-Phi PMU Vince Weaver
2012-09-24 17:48 ` Meadows, Lawrence F
@ 2012-09-25 11:32 ` Peter Zijlstra
2012-09-25 11:42 ` Cyrill Gorcunov
1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2012-09-25 11:32 UTC (permalink / raw)
To: Vince Weaver
Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F,
Cyrill Gorcunov
On Thu, 2012-09-20 at 13:03 -0400, Vince Weaver wrote:
> One additional complication: some of the cache events map to
> event "0". This causes problems because the generic events code
> assumes "0" means not-available. I'm not sure the best way to address
> that problem.
For all except P4 we could remap the 0 value to -2, that has all high
bits set (like the -1) which aren't used by hardware.
P4 is stuffing two registers in the 64bit config space and actually has
them all in use I think.. Cyrill?
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 11:32 ` Peter Zijlstra
@ 2012-09-25 11:42 ` Cyrill Gorcunov
2012-09-25 11:51 ` Cyrill Gorcunov
2012-09-25 12:01 ` Peter Zijlstra
0 siblings, 2 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 11:42 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Vince Weaver, linux-kernel, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 01:32:38PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-09-20 at 13:03 -0400, Vince Weaver wrote:
> > One additional complication: some of the cache events map to
> > event "0". This causes problems because the generic events code
> > assumes "0" means not-available. I'm not sure the best way to address
> > that problem.
>
> For all except P4 we could remap the 0 value to -2, that has all high
> bits set (like the -1) which aren't used by hardware.
>
> P4 is stuffing two registers in the 64bit config space and actually has
> them all in use I think.. Cyrill?
Yeah, we use almost all 64 bits in config. I tried to describe the bitmaps
in perf_event_p4.h (see Notes on internal configuration of ESCR+CCCR tuples).
Guys, let me re-read this whole mail thread first, since I have no clue
what this remapping is about ;)
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 11:42 ` Cyrill Gorcunov
@ 2012-09-25 11:51 ` Cyrill Gorcunov
2012-09-25 12:01 ` Peter Zijlstra
1 sibling, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 11:51 UTC (permalink / raw)
To: Peter Zijlstra, Vince Weaver, linux-kernel, Paul Mackerras,
Ingo Molnar, Arnaldo Carvalho de Melo, eranian,
Meadows, Lawrence F
On Tue, Sep 25, 2012 at 03:42:25PM +0400, Cyrill Gorcunov wrote:
> On Tue, Sep 25, 2012 at 01:32:38PM +0200, Peter Zijlstra wrote:
> > On Thu, 2012-09-20 at 13:03 -0400, Vince Weaver wrote:
> > > One additional complication: some of the cache events map to
> > > event "0". This causes problems because the generic events code
> > > assumes "0" means not-available. I'm not sure the best way to address
> > > that problem.
> >
> > For all except P4 we could remap the 0 value to -2, that has all high
> > bits set (like the -1) which aren't used by hardware.
> >
> > P4 is stuffing two registers in the 64bit config space and actually has
> > them all in use I think.. Cyrill?
>
> Yeah, we use almost all 64 bits in config. I tried to describe the bitmaps
> in perf_event_p4.h (see Notes on internal configuration of ESCR+CCCR tuples).
>
> Guys, let me re-read this whole mail thread first, since I have no clue
> what this remapping is about ;)
If we need some special mark in config, I can try to free the high bit in
@config and move it somewhere in the low 32 bits (there are bits 28-29 which
I can use for that). I.e., I can provide the sign bit; would that be enough?
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 11:42 ` Cyrill Gorcunov
2012-09-25 11:51 ` Cyrill Gorcunov
@ 2012-09-25 12:01 ` Peter Zijlstra
2012-09-25 12:05 ` stephane eranian
2012-09-25 13:27 ` Cyrill Gorcunov
1 sibling, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2012-09-25 12:01 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Vince Weaver, linux-kernel, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F
On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
> Guys, let me re-read this whole mail thread first, since I have no clue
> what this remapping is about ;)
x86_setup_perfctr() / set_ext_hw_attr() have special purposed 0 and -1
config values to mean -ENOENT and -EINVAL resp.
This means neither config value can be a 'real' event. Now it turns out
Xeon-Phi has an actual event 0, which is masked by these special case
thingies.
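The masking described above can be sketched in isolation as follows. This is a simplified stand-in, not the kernel's actual x86_setup_perfctr() or set_ext_hw_attr() code; the function name lookup_event is hypothetical, and only the 0 / -1 convention is taken from this thread.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Simplified model of the generic lookup convention described in this
 * thread: a table value of 0 means "not available" (-ENOENT) and -1
 * means "invalid" (-EINVAL). Anything else is passed through as the
 * event config.
 */
static int lookup_event(uint64_t table_val, uint64_t *config)
{
	if (table_val == 0)
		return -ENOENT;		/* also swallows a real event code 0 */
	if (table_val == (uint64_t)-1)
		return -EINVAL;
	*config = table_val;
	return 0;
}
```

Under this convention a table entry holding the raw code 0 is rejected, which is why the patch stores DATA_READ as 0x10000 (the "cheat bit") rather than as 0.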
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 12:01 ` Peter Zijlstra
@ 2012-09-25 12:05 ` stephane eranian
2012-09-25 12:22 ` Cyrill Gorcunov
2012-09-25 13:27 ` Cyrill Gorcunov
1 sibling, 1 reply; 12+ messages in thread
From: stephane eranian @ 2012-09-25 12:05 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Cyrill Gorcunov, Vince Weaver, linux-kernel, Paul Mackerras,
Ingo Molnar, Arnaldo Carvalho de Melo, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 2:01 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
>
>> Guys, let me re-read this whole mail thread first, since I have no clue
>> what this remapping is about ;)
>
> x86_setup_perfctr() / set_ext_hw_attr() have special purposed 0 and -1
> config values to mean -ENOENT and -EINVAL resp.
>
> This means neither config value can be a 'real' event. Now it turns out
> Xeon-Phi has an actual event 0, which is masked by these special case
> thingies.
Then how about using -1 or -2 for ENOENT and EINVAL?
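One way to read this suggestion is sketched below: move both sentinels to negative values so that 0 becomes an ordinary event code. The names EVT_NOT_AVAILABLE, EVT_INVALID, and lookup_event_alt are hypothetical, and the mapping of -1 to -ENOENT and -2 to -EINVAL is just the order given in the question above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical sentinel names; the values follow the suggestion above. */
#define EVT_NOT_AVAILABLE ((uint64_t)-1)	/* reported as -ENOENT */
#define EVT_INVALID       ((uint64_t)-2)	/* reported as -EINVAL */

/* Sketch of a lookup in which 0 is a valid event code. */
static int lookup_event_alt(uint64_t table_val, uint64_t *config)
{
	if (table_val == EVT_NOT_AVAILABLE)
		return -ENOENT;
	if (table_val == EVT_INVALID)
		return -EINVAL;
	*config = table_val;	/* 0 now passes through as a real event */
	return 0;
}
```

A side effect worth noting: C zero-fills array entries that an initializer does not name, so with this scheme every unsupported slot in every existing event table would have to be set to a sentinel explicitly, or it would silently be treated as event 0.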
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 12:05 ` stephane eranian
@ 2012-09-25 12:22 ` Cyrill Gorcunov
2012-09-25 12:23 ` Cyrill Gorcunov
0 siblings, 1 reply; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 12:22 UTC (permalink / raw)
To: stephane eranian
Cc: Peter Zijlstra, Vince Weaver, linux-kernel, Paul Mackerras,
Ingo Molnar, Arnaldo Carvalho de Melo, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 02:05:58PM +0200, stephane eranian wrote:
> On Tue, Sep 25, 2012 at 2:01 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
> >
> >> Guys, let me re-read this whole mail thread first, since I have no clue
> >> what this remapping is about ;)
> >
> > x86_setup_perfctr() / set_ext_hw_attr() have special purposed 0 and -1
> > config values to mean -ENOENT and -EINVAL resp.
> >
> > This means neither config value can be a 'real' event. Now it turns out
> > Xeon-Phi has an actual event 0, which is masked by these special case
> > thingies.
>
> Then how about using -1 or -2 for ENOENT and EINVAL?
-2 can't be a valid P4 config as far as I can tell.
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 12:22 ` Cyrill Gorcunov
@ 2012-09-25 12:23 ` Cyrill Gorcunov
0 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 12:23 UTC (permalink / raw)
To: stephane eranian
Cc: Peter Zijlstra, Vince Weaver, linux-kernel, Paul Mackerras,
Ingo Molnar, Arnaldo Carvalho de Melo, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 04:22:29PM +0400, Cyrill Gorcunov wrote:
> On Tue, Sep 25, 2012 at 02:05:58PM +0200, stephane eranian wrote:
> > On Tue, Sep 25, 2012 at 2:01 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
> > >
> > >> Guys, let me re-read this whole mail thread first, since I have no clue
> > >> what this remapping is about ;)
> > >
> > > x86_setup_perfctr() / set_ext_hw_attr() have special purposed 0 and -1
> > > config values to mean -ENOENT and -EINVAL resp.
> > >
> > > This means neither config value can be a 'real' event. Now it turns out
> > > Xeon-Phi has an actual event 0, which is masked by these special case
> > > thingies.
> >
> > Then how about using -1 or -2 for ENOENT and EINVAL?
>
> -2 can't be a valid P4 config as far as I can tell.
I mean such a value can be easily recognized by the P4 code and treated
specially if needed.
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 12:01 ` Peter Zijlstra
2012-09-25 12:05 ` stephane eranian
@ 2012-09-25 13:27 ` Cyrill Gorcunov
2012-09-25 14:45 ` Vince Weaver
1 sibling, 1 reply; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 13:27 UTC (permalink / raw)
To: Peter Zijlstra, Vince Weaver
Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 02:01:26PM +0200, Peter Zijlstra wrote:
> On Tue, 2012-09-25 at 15:42 +0400, Cyrill Gorcunov wrote:
>
> > Guys, let me re-read this whole mail thread first since I have no clue
> > what this remapping is about ;)
>
> x86_setup_perfctr() / set_ext_hw_attr() have special purposed 0 and -1
> config values to mean -ENOENT and -EINVAL resp.
>
> This means neither config value can be a 'real' event. Now it turns out
> Xeon-Phi has an actual event 0, which is masked by these special case
> thingies.
So guys, if I understand everything correctly, the idea is to use
-1/-2 as the initial @config value for unsupported events, right? Vince,
might it be easier to use bit 19 as a flag marking a valid event and clear
it when you write to the MSR? Then we would not have to change the "zero is
reserved" semantics (OTOH I'm not sure it won't become a problem somewhere
in the future with some new CPU :)
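A rough sketch of the flag-bit idea, assuming a userspace model: a software-only marker bit is set on every valid table entry so that event 0 no longer collides with the "zero is reserved" check, and the marker is stripped before the value would be written to the MSR. The names `EV_VALID_FLAG`, `ev_mark_valid`, `ev_check` and `ev_to_msr` are hypothetical, and the choice of bit 19 simply follows the suggestion above.

```c
#include <errno.h>
#include <stdint.h>

/* Software-only "valid event" marker (assumed name; bit per the suggestion). */
#define EV_VALID_FLAG (1ULL << 19)

/* Tag a raw event encoding as valid when filling the event table. */
static uint64_t ev_mark_valid(uint64_t raw)
{
	return raw | EV_VALID_FLAG;
}

/* Unchanged sentinel check: 0 => -ENOENT, -1 => -EINVAL. */
static int ev_check(uint64_t val)
{
	if (val == 0)
		return -ENOENT;
	if (val == (uint64_t)-1)
		return -EINVAL;
	return 0;
}

/* Strip the marker so the hardware never sees it. */
static uint64_t ev_to_msr(uint64_t val)
{
	return val & ~EV_VALID_FLAG;
}
```

With this scheme a tagged event 0 passes the check and still reaches the hardware as 0, while an untagged, zero-initialized entry keeps meaning "not available".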
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 13:27 ` Cyrill Gorcunov
@ 2012-09-25 14:45 ` Vince Weaver
2012-09-25 14:53 ` Cyrill Gorcunov
0 siblings, 1 reply; 12+ messages in thread
From: Vince Weaver @ 2012-09-25 14:45 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Peter Zijlstra, Vince Weaver, linux-kernel, Paul Mackerras,
Ingo Molnar, Arnaldo Carvalho de Melo, eranian,
Meadows, Lawrence F
On Tue, 25 Sep 2012, Cyrill Gorcunov wrote:
> So guys, if I understand everything correctly, the idea is to use
> -1/-2 as the initial @config value for unsupported events, right? Vince,
> might it be easier to use bit 19 as a flag marking a valid event and clear
> it when you write to the MSR? Then we would not have to change the "zero is
> reserved" semantics (OTOH I'm not sure it won't become a problem somewhere
> in the future with some new CPU :)
Well, we wouldn't want to use a reserved bit.
In theory we could re-use bit 22 (enable) or bit 20 (APIC enable)
because those values should in theory be set elsewhere and could probably
be masked out at an appropriate place.
Is -2 really a valid cache event on Pentium 4?
Though I admit patching all of the various PMU drivers to use -1/-2 rather
than 0/-1 will be a pain, especially as many of them just default to 0
with no initialization currently.
Vince Weaver
vincent.weaver@maine.edu
* Re: [PATCH 1/1] perf, Add support for Xeon-Phi PMU
2012-09-25 14:45 ` Vince Weaver
@ 2012-09-25 14:53 ` Cyrill Gorcunov
0 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2012-09-25 14:53 UTC (permalink / raw)
To: Vince Weaver
Cc: Peter Zijlstra, linux-kernel, Paul Mackerras, Ingo Molnar,
Arnaldo Carvalho de Melo, eranian, Meadows, Lawrence F
On Tue, Sep 25, 2012 at 10:45:02AM -0400, Vince Weaver wrote:
>
> On Tue, 25 Sep 2012, Cyrill Gorcunov wrote:
>
> > So guys, if I understand everything correctly, the idea is to use
> > -1/-2 as the initial @config value for unsupported events, right? Vince,
> > might it be easier to use bit 19 as a flag marking a valid event and clear
> > it when you write to the MSR? Then we would not have to change the "zero is
> > reserved" semantics (OTOH I'm not sure it won't become a problem somewhere
> > in the future with some new CPU :)
>
> Well, we wouldn't want to use a reserved bit.
> In theory we could re-use bit 22 (enable) or bit 20 (APIC enable)
> because those values should in theory be set elsewhere and could probably
> be masked out at an appropriate place.
>
> Is -2 really a valid cache event on Pentium 4?
Nope, there can't be a config with -2 as a valid value. So we can use -2
if needed, as far as I can tell (-1 can't be valid either).
> Though I admit patching all of the various PMU drivers to use -1/-2 rather
> than 0/-1 will be a pain, especially as many of them just default to 0
> with no initialization currently.
Yup, but if it's needed I can tune up the p4 code (though I'll need
some help with testing since I have no p4 CPU anymore).
Thread overview: 12+ messages
2012-09-20 17:03 [PATCH 1/1] perf, Add support for Xeon-Phi PMU Vince Weaver
2012-09-24 17:48 ` Meadows, Lawrence F
2012-09-25 11:32 ` Peter Zijlstra
2012-09-25 11:42 ` Cyrill Gorcunov
2012-09-25 11:51 ` Cyrill Gorcunov
2012-09-25 12:01 ` Peter Zijlstra
2012-09-25 12:05 ` stephane eranian
2012-09-25 12:22 ` Cyrill Gorcunov
2012-09-25 12:23 ` Cyrill Gorcunov
2012-09-25 13:27 ` Cyrill Gorcunov
2012-09-25 14:45 ` Vince Weaver
2012-09-25 14:53 ` Cyrill Gorcunov