* [PATCH 0/2] perf_events: correct event assignments on Intel processors @ 2009-10-06 14:42 Stephane Eranian 2009-10-06 14:42 ` [PATCH 1/2] perf_events: check for filters on fixed counter events Stephane Eranian 0 siblings, 1 reply; 21+ messages in thread From: Stephane Eranian @ 2009-10-06 14:42 UTC (permalink / raw) To: linux-kernel Cc: mingo, paulus, a.p.zijlstra, perfmon2-devel, Stephane Eranian This series of patches corrects the assignment of events to counters for Intel x86 processors. Not all events can be measured on any counter. Not all filters can be used on fixed counters. The current implementation silently returns bogus counts when an event is programmed into the wrong counter. Signed-off-by: Stephane Eranian <eranian@gmail.com> --- arch/x86/include/asm/perf_event.h | 11 ++++ arch/x86/kernel/cpu/perf_event.c | 112 +++++++++++++++++++++++++++++++++++- 2 files changed, 119 insertions(+), 4 deletions(-)
* [PATCH 1/2] perf_events: check for filters on fixed counter events 2009-10-06 14:42 [PATCH 0/2] perf_events: correct event assignments on Intel processors Stephane Eranian @ 2009-10-06 14:42 ` Stephane Eranian 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian 2009-10-09 14:22 ` [tip:perf/core] perf_events: Check for filters on fixed counter events tip-bot for Stephane Eranian 0 siblings, 2 replies; 21+ messages in thread From: Stephane Eranian @ 2009-10-06 14:42 UTC (permalink / raw) To: linux-kernel Cc: mingo, paulus, a.p.zijlstra, perfmon2-devel, Stephane Eranian Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Rejected filters are: inv, edge, cnt-mask Signed-off-by: Stephane Eranian <eranian@gmail.com> --- arch/x86/include/asm/perf_event.h | 11 +++++++++++ arch/x86/kernel/cpu/perf_event.c | 6 ++++++ 2 files changed, 17 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index ad7ce3f..719273b 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -28,6 +28,17 @@ */ #define ARCH_PERFMON_EVENT_MASK 0xffff +/* + * filter mask to validate fixed counter events. + * the following filters disqualify for fixed counters: + * - inv + * - edge + * - cnt-mask + * The other filters are supported by fixed counters. + * The any-thread option is supported starting with v3. 
+ */ +#define ARCH_PERFMON_EVENT_FILTER_MASK 0xff840000 + #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL 0x3c #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK (0x00 << 8) #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX 0 diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index b5801c3..1d16bd6 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -1349,6 +1349,12 @@ fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) if (!x86_pmu.num_events_fixed) return -1; + /* + * fixed counters do not take all possible filters + */ + if (hwc->config & ARCH_PERFMON_EVENT_FILTER_MASK) + return -1; + if (unlikely(hw_event == x86_pmu.event_map(PERF_COUNT_HW_INSTRUCTIONS))) return X86_PMC_IDX_FIXED_INSTRUCTIONS; if (unlikely(hw_event == x86_pmu.event_map(PERF_COUNT_HW_CPU_CYCLES))) -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 14:42 ` [PATCH 1/2] perf_events: check for filters on fixed counter events Stephane Eranian @ 2009-10-06 14:42 ` Stephane Eranian 2009-10-06 16:29 ` Peter Zijlstra ` (2 more replies) 2009-10-09 14:22 ` [tip:perf/core] perf_events: Check for filters on fixed counter events tip-bot for Stephane Eranian 1 sibling, 3 replies; 21+ messages in thread From: Stephane Eranian @ 2009-10-06 14:42 UTC (permalink / raw) To: linux-kernel Cc: mingo, paulus, a.p.zijlstra, perfmon2-devel, Stephane Eranian On some Intel processors, not all events can be measured in all counters. Some events can only be measured in one particular counter, for instance. Assigning an event to the wrong counter does not crash the machine but yields bogus counts, i.e., a silent error. This patch changes the event-to-counter assignment logic to take into account event constraints for Intel P6, Core and Nehalem processors. There are no constraints on Intel Atom. There are constraints on Intel Yonah (Core Duo), but they are not provided in this patch given that this processor is not yet supported by perf_events. As a result of the constraints, it is possible for some event groups to never actually be loaded onto the PMU if they contain two events which can only be measured on a single counter. That situation can be detected with the scaling information extracted with read(). 
Signed-off-by: Stephane Eranian <eranian@gmail.com> --- arch/x86/kernel/cpu/perf_event.c | 106 ++++++++++++++++++++++++++++++++++++-- 1 files changed, 102 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index 1d16bd6..06ca1b2 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -77,6 +77,18 @@ struct cpu_hw_events { struct debug_store *ds; }; +struct evt_cstr { + unsigned long idxmsk[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; + int code; +}; + +#define EVT_CSTR0(c, m) { .code = (c), .idxmsk[0] = (m) } +#define EVT_CSTR_END { .code = 0, .idxmsk[0] = 0 } + +#define for_each_evt_cstr(e, c) \ + for((e) = (c); (e)->idxmsk[0]; (e)++) + + /* * struct x86_pmu - generic x86 pmu */ @@ -102,6 +114,7 @@ struct x86_pmu { u64 intel_ctrl; void (*enable_bts)(u64 config); void (*disable_bts)(void); + int (*get_event_idx)(struct hw_perf_event *hwc); }; static struct x86_pmu x86_pmu __read_mostly; @@ -110,6 +123,8 @@ static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = { .enabled = 1, }; +static const struct evt_cstr *evt_cstr; + /* * Not sure about some of these */ @@ -155,6 +170,15 @@ static u64 p6_pmu_raw_event(u64 hw_event) return hw_event & P6_EVNTSEL_MASK; } +static const struct evt_cstr intel_p6_evt[]={ + EVT_CSTR0(0xc1, 0x1), /* FLOPS */ + EVT_CSTR0(0x10, 0x1), /* FP_COMP_OPS_EXE */ + EVT_CSTR0(0x11, 0x1), /* FP_ASSIST */ + EVT_CSTR0(0x12, 0x2), /* MUL */ + EVT_CSTR0(0x13, 0x2), /* DIV */ + EVT_CSTR0(0x14, 0x1), /* CYCLES_DIV_BUSY */ + EVT_CSTR_END +}; /* * Intel PerfMon v3. Used on Core2 and later. 
@@ -170,6 +194,33 @@ static const u64 intel_perfmon_event_map[] = [PERF_COUNT_HW_BUS_CYCLES] = 0x013c, }; +static const struct evt_cstr intel_core_evt[]={ + EVT_CSTR0(0x10, 0x1), /* FP_COMP_OPS_EXE */ + EVT_CSTR0(0x11, 0x2), /* FP_ASSIST */ + EVT_CSTR0(0x12, 0x2), /* MUL */ + EVT_CSTR0(0x13, 0x2), /* DIV */ + EVT_CSTR0(0x14, 0x1), /* CYCLES_DIV_BUSY */ + EVT_CSTR0(0x18, 0x1), /* IDLE_DURING_DIV */ + EVT_CSTR0(0x19, 0x2), /* DELAYED_BYPASS */ + EVT_CSTR0(0xa1, 0x1), /* RS_UOPS_DISPATCH_CYCLES */ + EVT_CSTR0(0xcb, 0x1), /* MEM_LOAD_RETIRED */ + EVT_CSTR_END +}; + +static const struct evt_cstr intel_nhm_evt[]={ + EVT_CSTR0(0x40, 0x3), /* L1D_CACHE_LD */ + EVT_CSTR0(0x41, 0x3), /* L1D_CACHE_ST */ + EVT_CSTR0(0x42, 0x3), /* L1D_CACHE_LOCK */ + EVT_CSTR0(0x43, 0x3), /* L1D_ALL_REF */ + EVT_CSTR0(0x4e, 0x3), /* L1D_PREFETCH */ + EVT_CSTR0(0x4c, 0x3), /* LOAD_HIT_PRE */ + EVT_CSTR0(0x51, 0x3), /* L1D */ + EVT_CSTR0(0x52, 0x3), /* L1D_CACHE_PREFETCH_LOCK_FB_HIT */ + EVT_CSTR0(0x53, 0x3), /* L1D_CACHE_LOCK_FB_HIT */ + EVT_CSTR0(0xc5, 0x3), /* CACHE_LOCK_CYCLES */ + EVT_CSTR_END +}; + static u64 intel_pmu_event_map(int hw_event) { return intel_perfmon_event_map[hw_event]; @@ -932,6 +983,8 @@ static int __hw_perf_event_init(struct perf_event *event) */ hwc->config = ARCH_PERFMON_EVENTSEL_INT; + hwc->idx = -1; + /* * Count user and OS events unless requested not to. */ @@ -1366,6 +1419,45 @@ fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) } /* + * generic counter allocator: get next free counter + */ +static int gen_get_event_idx(struct hw_perf_event *hwc) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + int idx; + + idx = find_first_zero_bit(cpuc->used_mask, x86_pmu.num_events); + return idx == x86_pmu.num_events ? 
-1 : idx; +} + +/* + * intel-specific counter allocator: check event constraints + */ +static int intel_get_event_idx(struct hw_perf_event *hwc) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + const struct evt_cstr *evt; + int i, code; + + if (!evt_cstr) + goto skip; + + code = hwc->config & 0xff; + + for_each_evt_cstr(evt, evt_cstr) { + if (code == evt->code) { + for_each_bit(i, evt->idxmsk, X86_PMC_IDX_MAX) { + if (!test_and_set_bit(i, cpuc->used_mask)) + return i; + } + return -1; + } + } +skip: + return gen_get_event_idx(hwc); +} + +/* * Find a PMC slot for the freshly enabled / scheduled in event: */ static int x86_pmu_enable(struct perf_event *event) @@ -1402,11 +1494,10 @@ static int x86_pmu_enable(struct perf_event *event) } else { idx = hwc->idx; /* Try to get the previous generic event again */ - if (test_and_set_bit(idx, cpuc->used_mask)) { + if (idx == -1 || test_and_set_bit(idx, cpuc->used_mask)) { try_generic: - idx = find_first_zero_bit(cpuc->used_mask, - x86_pmu.num_events); - if (idx == x86_pmu.num_events) + idx = x86_pmu.get_event_idx(hwc); + if (idx == -1) return -EAGAIN; set_bit(idx, cpuc->used_mask); @@ -1883,6 +1974,7 @@ static struct x86_pmu p6_pmu = { */ .event_bits = 32, .event_mask = (1ULL << 32) - 1, + .get_event_idx = intel_get_event_idx, }; static struct x86_pmu intel_pmu = { @@ -1906,6 +1998,7 @@ static struct x86_pmu intel_pmu = { .max_period = (1ULL << 31) - 1, .enable_bts = intel_pmu_enable_bts, .disable_bts = intel_pmu_disable_bts, + .get_event_idx = intel_get_event_idx, }; static struct x86_pmu amd_pmu = { @@ -1926,6 +2019,7 @@ static struct x86_pmu amd_pmu = { .apic = 1, /* use highest bit to detect overflow */ .max_period = (1ULL << 47) - 1, + .get_event_idx = gen_get_event_idx, }; static int p6_pmu_init(void) @@ -1938,10 +2032,12 @@ static int p6_pmu_init(void) case 7: case 8: case 11: /* Pentium III */ + evt_cstr = intel_p6_evt; break; case 9: case 13: /* Pentium M */ + evt_cstr = intel_p6_evt; break; 
default: pr_cont("unsupported p6 CPU model %d ", @@ -2013,12 +2109,14 @@ static int intel_pmu_init(void) sizeof(hw_cache_event_ids)); pr_cont("Core2 events, "); + evt_cstr = intel_core_evt; break; default: case 26: memcpy(hw_cache_event_ids, nehalem_hw_cache_event_ids, sizeof(hw_cache_event_ids)); + evt_cstr = intel_nhm_evt; pr_cont("Nehalem/Corei7 events, "); break; case 28: -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian @ 2009-10-06 16:29 ` Peter Zijlstra 2009-10-06 17:26 ` stephane eranian 2009-10-09 13:55 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Ingo Molnar 2009-10-09 14:22 ` [tip:perf/core] perf_events: Add " tip-bot for Stephane Eranian 2 siblings, 1 reply; 21+ messages in thread From: Peter Zijlstra @ 2009-10-06 16:29 UTC (permalink / raw) To: Stephane Eranian Cc: linux-kernel, mingo, paulus, perfmon2-devel, Stephane Eranian On Tue, 2009-10-06 at 16:42 +0200, Stephane Eranian wrote: > This patch changes the event to counter assignment logic to take > into account event constraints for Intel P6, Core and Nehalem > processors. There are no constraints on Intel Atom. There are > constraints on Intel Yonah (Core Duo) but they are not provided > in this patch given that this processor is not yet supported by > perf_events. I don't think there's much missing for that, right? I don't actually have that hardware, so I can't test it. > As a result of the constraints, it is possible for some event groups > to never actually be loaded onto the PMU if they contain two events > which can only be measured on a single counter. That situation can be > detected with the scaling information extracted with read(). Right, that's a pre-existing bug in the x86 code (we can create groups larger than the PMU) and should be fixed. Patch looks nice though.
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 16:29 ` Peter Zijlstra @ 2009-10-06 17:26 ` stephane eranian 2009-10-06 18:57 ` [perfmon2] " Vince Weaver 2009-10-07 10:31 ` Peter Zijlstra 0 siblings, 2 replies; 21+ messages in thread From: stephane eranian @ 2009-10-06 17:26 UTC (permalink / raw) To: Peter Zijlstra; +Cc: linux-kernel, mingo, paulus, perfmon2-devel On Tue, Oct 6, 2009 at 6:29 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Tue, 2009-10-06 at 16:42 +0200, Stephane Eranian wrote: > >> This patch changes the event to counter assignment logic to take >> into account event constraints for Intel P6, Core and Nehalem >> processors. There are no constraints on Intel Atom. There are >> constraints on Intel Yonah (Core Duo) but they are not provided >> in this patch given that this processor is not yet supported by >> perf_events. > > I don't think there's much missing for that, right? > Yonah implements architectural perfmon v1. It has two generic counters but no fixed counters. You already handled that part. But what it does not have is all the GLOBAL_* MSRs. You could consider it as P6, but the difference is that the two counters can be started and stopped independently. Given the organization of the code, Yonah presents a hybrid configuration. That is where the problem is. So either: - you go the architected PMU path, but you protect all accesses to GLOBAL_* - you go the P6 path, and you make the counters independent. > I don't actually have that hardware, so I can't test it. > I don't have it either, but I can find people who have that. >> As a result of the constraints, it is possible for some event groups >> to never actually be loaded onto the PMU if they contain two events >> which can only be measured on a single counter. That situation can be >> detected with the scaling information extracted with read(). 
> > Right, that's a pre-existing bug in the x86 code (we can create groups > larger than the PMU) and should be fixed. > Nope, this happens even if you specify only two events on a two-counter PMU. For instance, if I want to break down the number of multiplications between user and kernel to compute a ratio, I would measure MUL twice: perf stat -e MUL:u,MUL:k Assuming that with perf you could express that you want those events grouped. This would always yield zero. I am not using all the counters, but my two events compete for the same counter, here counter 0. The problem is that for the tool it is hard to print some meaningful messages to help the user. You can detect something is wrong with the group because time_enabled will be zero. But there are multiple reasons why this can happen. But what is more problematic is if I build a group with an event without a constraint and one with a constraint. For instance, I want to measure BACLEARS and MUL in the same group. If I make BACLEARS the group leader then the group will never be loaded. That's because the assignment code looks at each event individually. Thus it will assign BACLEARS to the first available counter, i.e., counter 0. Then it will try to assign MUL, which can only run on counter 0, and it will always fail. The assignment code needs to look at groups, not individual events. Then, it will be able to find a working assignment: MUL -> counter 0, BACLEARS -> counter 1. By design of this API, the user should never be concerned about ordering the events in a group a certain way to get a successful assignment to counters. This should all be handled by the kernel. > Patch looks nice though. > Thanks.
* Re: [perfmon2] [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 17:26 ` stephane eranian @ 2009-10-06 18:57 ` Vince Weaver 2009-10-07 10:31 ` Peter Zijlstra 1 sibling, 0 replies; 21+ messages in thread From: Vince Weaver @ 2009-10-06 18:57 UTC (permalink / raw) To: eranian; +Cc: Peter Zijlstra, perfmon2-devel, mingo, paulus, linux-kernel On Tue, 6 Oct 2009, stephane eranian wrote: > On Tue, Oct 6, 2009 at 6:29 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > > On Tue, 2009-10-06 at 16:42 +0200, Stephane Eranian wrote: > > > I don't actually have that hardware, so I can't test it. > > > I don't have that either but I can find people who have that. I have Yonah hardware and would be glad to run tests. For what it's worth, the perfmon2 Yonah support (at least on the laptop I have) stopped working after 2.6.23. It looked like a problem with interrupts not being delivered properly. I never had time to track down whether it was a perfmon2 issue or a generic kernel issue. It was still broken as of 2.6.29. Vince
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 17:26 ` stephane eranian 2009-10-06 18:57 ` [perfmon2] " Vince Weaver @ 2009-10-07 10:31 ` Peter Zijlstra 2009-10-07 11:15 ` Paul Mackerras 2009-10-09 14:22 ` [tip:perf/core] perf, x86: Add simple group validation tip-bot for Peter Zijlstra 1 sibling, 2 replies; 21+ messages in thread From: Peter Zijlstra @ 2009-10-07 10:31 UTC (permalink / raw) To: eranian; +Cc: linux-kernel, mingo, paulus, perfmon2-devel On Tue, 2009-10-06 at 19:26 +0200, stephane eranian wrote: > >> As a result of the constraints, it is possible for some event groups > >> to never actually be loaded onto the PMU if they contain two events > >> which can only be measured on a single counter. That situation can be > >> detected with the scaling information extracted with read(). > > > > Right, that's a pre-existing bug in the x86 code (we can create groups > > larger than the PMU) and should be fixed. > > > Nope, this happens even if you specify only two events on a two-counter > PMU. For instance, if I want to break down the number of multiplications > between user and kernel to compute a ratio, I would measure MUL twice: > > perf stat -e MUL:u,MUL:k > > Assuming that with perf you could express that you want those events grouped. > This would always yield zero. I am not using all the counters but my two events > compete for the same counter, here counter 0. Right, so what I meant was, we could already specify unschedulable groups by adding more events than there are counters; these constraints make that worse. > The problem is that for the tool it is hard to print some meaningful > messages to help > the user. You can detect something is wrong with the group because time_enabled > will be zero. But there are multiple reasons why this can happen. 
Something like the below patch (on top of your two patches, compile tested only) would give better feedback by returning -ENOSPC when a group doesn't fit (still relies on the counter order). > But what is more problematic is if I build a group with an event > without a constraint > and one with a constraint. For instance, I want to measure BACLEARS and MUL in > the same group. If I make BACLEARS the group leader then the group will never > be loaded. That's because the assignment code looks at each event individually. > Thus it will assign BACLEARS to the first available counter, i.e., > counter 0. Then it will > try to assign MUL, which can only run on counter 0, and it will always > fail. The assignment > code needs to look at groups not individual events. Then, it will be > able to find a working > assignment: MUL -> counter0, BACLEARS -> counter 1. > > By design of this API, the user should never be concerned about > ordering the events > in a group a certain way to get a successful assignment to counters. > This should all > be handled by the kernel. Agreed, the POWER implementation actually does this quite nicely, maybe we should borrow some of its code for scheduling groups. 
--- Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event.c @@ -114,7 +114,8 @@ struct x86_pmu { u64 intel_ctrl; void (*enable_bts)(u64 config); void (*disable_bts)(void); - int (*get_event_idx)(struct hw_perf_event *hwc); + int (*get_event_idx)(struct cpu_hw_events *cpuc, + struct hw_perf_event *hwc); }; static struct x86_pmu x86_pmu __read_mostly; @@ -520,7 +521,7 @@ static u64 intel_pmu_raw_event(u64 hw_ev #define CORE_EVNTSEL_UNIT_MASK 0x0000FF00ULL #define CORE_EVNTSEL_EDGE_MASK 0x00040000ULL #define CORE_EVNTSEL_INV_MASK 0x00800000ULL -#define CORE_EVNTSEL_REG_MASK 0xFF000000ULL +#define CORE_EVNTSEL_REG_MASK 0xFF000000ULL #define CORE_EVNTSEL_MASK \ (CORE_EVNTSEL_EVENT_MASK | \ @@ -1387,8 +1388,7 @@ static void amd_pmu_enable_event(struct x86_pmu_enable_event(hwc, idx); } -static int -fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) +static int fixed_mode_idx(struct hw_perf_event *hwc) { unsigned int hw_event; @@ -1421,9 +1421,9 @@ fixed_mode_idx(struct perf_event *event, /* * generic counter allocator: get next free counter */ -static int gen_get_event_idx(struct hw_perf_event *hwc) +static int +gen_get_event_idx(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); int idx; idx = find_first_zero_bit(cpuc->used_mask, x86_pmu.num_events); @@ -1433,16 +1433,16 @@ static int gen_get_event_idx(struct hw_p /* * intel-specific counter allocator: check event constraints */ -static int intel_get_event_idx(struct hw_perf_event *hwc) +static int +intel_get_event_idx(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); const struct evt_cstr *evt; int i, code; if (!evt_cstr) goto skip; - code = hwc->config & 0xff; + code = hwc->config & CORE_EVNTSEL_EVENT_MASK; 
for_each_evt_cstr(evt, evt_cstr) { if (code == evt->code) { @@ -1454,26 +1454,22 @@ static int intel_get_event_idx(struct hw } } skip: - return gen_get_event_idx(hwc); + return gen_get_event_idx(cpuc, hwc); } -/* - * Find a PMC slot for the freshly enabled / scheduled in event: - */ -static int x86_pmu_enable(struct perf_event *event) +static int +x86_schedule_event(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); - struct hw_perf_event *hwc = &event->hw; int idx; - idx = fixed_mode_idx(event, hwc); + idx = fixed_mode_idx(hwc); if (idx == X86_PMC_IDX_FIXED_BTS) { /* BTS is already occupied. */ if (test_and_set_bit(idx, cpuc->used_mask)) return -EAGAIN; hwc->config_base = 0; - hwc->event_base = 0; + hwc->event_base = 0; hwc->idx = idx; } else if (idx >= 0) { /* @@ -1496,17 +1492,33 @@ static int x86_pmu_enable(struct perf_ev /* Try to get the previous generic event again */ if (idx == -1 || test_and_set_bit(idx, cpuc->used_mask)) { try_generic: - idx = x86_pmu.get_event_idx(hwc); + idx = x86_pmu.get_event_idx(cpuc, hwc); if (idx == -1) return -EAGAIN; set_bit(idx, cpuc->used_mask); hwc->idx = idx; } - hwc->config_base = x86_pmu.eventsel; - hwc->event_base = x86_pmu.perfctr; + hwc->config_base = x86_pmu.eventsel; + hwc->event_base = x86_pmu.perfctr; } + return idx; +} + +/* + * Find a PMC slot for the freshly enabled / scheduled in event: + */ +static int x86_pmu_enable(struct perf_event *event) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + struct hw_perf_event *hwc = &event->hw; + int idx; + + idx = x86_schedule_event(cpuc, hwc); + if (idx < 0) + return idx; + perf_events_lapic_init(); x86_pmu.disable(hwc, idx); @@ -2209,11 +2221,47 @@ static const struct pmu pmu = { .unthrottle = x86_pmu_unthrottle, }; +static int +validate_event(struct cpu_hw_events *cpuc, struct perf_event *event) +{ + struct hw_perf_event fake_event = event->hw; + + if (event->pmu != &pmu) + return 0; + + 
return x86_schedule_event(cpuc, &fake_event); +} + +static int validate_group(struct perf_event *event) +{ + struct perf_event *sibling, *leader = event->group_leader; + struct cpu_hw_events fake_pmu; + + memset(&fake_pmu, 0, sizeof(fake_pmu)); + + if (!validate_event(&fake_pmu, leader)) + return -ENOSPC; + + list_for_each_entry(sibling, &leader->sibling_list, group_entry) { + if (!validate_event(&fake_pmu, sibling)) + return -ENOSPC; + } + + if (!validate_event(&fake_pmu, event)) + return -ENOSPC; + + return 0; +} + const struct pmu *hw_perf_event_init(struct perf_event *event) { int err; err = __hw_perf_event_init(event); + if (!err) { + if (event->group_leader != event) + err = validate_group(event); + } if (err) { if (event->destroy) event->destroy(event); ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 10:31 ` Peter Zijlstra @ 2009-10-07 11:15 ` Paul Mackerras 2009-10-07 12:31 ` stephane eranian 2009-10-09 14:22 ` [tip:perf/core] perf, x86: Add simple group validation tip-bot for Peter Zijlstra 1 sibling, 1 reply; 21+ messages in thread From: Paul Mackerras @ 2009-10-07 11:15 UTC (permalink / raw) To: Peter Zijlstra; +Cc: eranian, linux-kernel, mingo, perfmon2-devel Peter Zijlstra writes: > > By design of this API, the user should never be concerned about > > ordering the events > > in a group a certain way to get a successful assignment to counters. > > This should all > > be handled by the kernel. > > Agreed, the POWER implementation actually does this quite nicely, maybe > we should borrow some of its code for scheduling groups. Yeah, I'm quite pleased with how that code turned out, and I'd be happy to help adapt it for other architectures. The one design handles all the POWER PMUs from POWER4 with multiple layers of event multiplexers feeding an event bus (and some events available through more than one multiplexer) through to the much simpler and more straightforward POWER7. Paul. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 11:15 ` Paul Mackerras @ 2009-10-07 12:31 ` stephane eranian 2009-10-07 20:46 ` David Miller 2009-10-08 23:18 ` Paul Mackerras 0 siblings, 2 replies; 21+ messages in thread From: stephane eranian @ 2009-10-07 12:31 UTC (permalink / raw) To: Paul Mackerras; +Cc: Peter Zijlstra, linux-kernel, mingo, perfmon2-devel Paul, On Wed, Oct 7, 2009 at 1:15 PM, Paul Mackerras <paulus@samba.org> wrote: > Peter Zijlstra writes: > >> > By design of this API, the user should never be concerned about >> > ordering the events >> > in a group a certain way to get a successful assignment to counters. >> > This should all >> > be handled by the kernel. >> >> Agreed, the POWER implementation actually does this quite nicely, maybe >> we should borrow some of its code for scheduling groups. > > Yeah, I'm quite pleased with how that code turned out, and I'd be > happy to help adapt it for other architectures. The one design > handles all the POWER PMUs from POWER4 with multiple layers of event > multiplexers feeding an event bus (and some events available through > more than one multiplexer) through to the much simpler and more > straightforward POWER7. > I am not an expert on PPC PMU register constraints, but I took a quick look at the code and in particular hw_perf_enable(), where the action seems to be. Given that in kernel/perf_events.c the PMU-specific layer is invoked on a per-event basis in event_sched_in(), you need to have a way to look at the registers you have already assigned. I think this is what PPC does: it stops the PMU and re-runs the assignment code. But for that it needs to maintain a per-cpu structure which has the current event -> counter assignment. What PPC does is probably the only way to do this given the interface between generic and machine-specific code. 
The one advantage I see is that it works inside an event group but also across event groups, because that code does not look at group boundaries; it only looks at the events and the number of available registers. The downside is that you duplicate state. Did I get this right, Paul?
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 12:31 ` stephane eranian @ 2009-10-07 20:46 ` David Miller 2009-10-07 21:30 ` stephane eranian 2009-10-08 20:08 ` Ingo Molnar 2009-10-08 23:18 ` Paul Mackerras 1 sibling, 2 replies; 21+ messages in thread From: David Miller @ 2009-10-07 20:46 UTC (permalink / raw) To: eranian, eranian Cc: paulus, a.p.zijlstra, linux-kernel, mingo, perfmon2-devel From: stephane eranian <eranian@googlemail.com> Date: Wed, 7 Oct 2009 14:31:58 +0200 > What PPC does is probably the only way to do this given the interface between > generic and machine-specific code. The one advantage I see is that it works > inside an event group but also across event groups because that code does not > look at group boundary, it only looks at the events and the number of available > registers. The downside is that you duplicate state. > > Did I get this right, Paul? That's basically how his code works, yes. I intend on duplicating it to some extent on sparc64 since I'm operating in a similar problem space. So if at least some of this engine went to a generic place, there'd be at least a 3rd user :-) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 20:46 ` David Miller @ 2009-10-07 21:30 ` stephane eranian 2009-10-08 20:08 ` Ingo Molnar 1 sibling, 0 replies; 21+ messages in thread From: stephane eranian @ 2009-10-07 21:30 UTC (permalink / raw) To: David Miller; +Cc: paulus, a.p.zijlstra, linux-kernel, mingo, perfmon2-devel On Wed, Oct 7, 2009 at 10:46 PM, David Miller <davem@davemloft.net> wrote: > From: stephane eranian <eranian@googlemail.com> > Date: Wed, 7 Oct 2009 14:31:58 +0200 > >> What PPC does is probably the only way to do this given the interface between >> generic and machine-specific code. The one advantage I see is that it works >> inside an event group but also across event groups because that code does not >> look at group boundary, it only looks at the events and the number of available >> registers. The downside is that you duplicate state. >> >> Did I get this right, Paul? > > That's basically how his code works, yes. I intend on duplicating > it to some extent on sparc64 since I'm operating in a similar > problem space. > > So if at least some of this engine went to a generic place, there'd > be at least a 3rd user :-) Based on my experience, the assignment problem is best handled in the architecture or model specific code. On some PMU models, it is way more complicated than just two events competing for the same counter. That's the case on Itanium, for instance. And that is also the case with AMD64 Northbridge events. Something I forgot to mention earlier is that when you re-run the assignment code for the new event, no already assigned event can be kicked out in favor of the new event. An existing event can be moved to another counter at worst. Otherwise, you will evict an event which, the generic layer thinks, is currently running. > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 20:46 ` David Miller 2009-10-07 21:30 ` stephane eranian @ 2009-10-08 20:08 ` Ingo Molnar 2009-10-08 20:28 ` stephane eranian 1 sibling, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2009-10-08 20:08 UTC (permalink / raw) To: David Miller Cc: eranian, eranian, paulus, a.p.zijlstra, linux-kernel, perfmon2-devel * David Miller <davem@davemloft.net> wrote: > From: stephane eranian <eranian@googlemail.com> > Date: Wed, 7 Oct 2009 14:31:58 +0200 > > > What PPC does is probably the only way to do this given the interface between > > generic and machine-specific code. The one advantage I see is that it works > > inside an event group but also across event groups because that code does not > > look at group boundary, it only looks at the events and the number of available > > registers. The downside is that you duplicate state. > > > > Did I get this right, Paul? > > That's basically how his code works, yes. I intend on duplicating it > to some extent on sparc64 since I'm operating in a similar problem > space. > > So if at least some of this engine went to a generic place, there'd be > at least a 3rd user :-) Yeah, i'd definitely suggest to generalize this. We've missed updating PowerPC lowlevel details a couple of times in perf core updates, just because it's in a non-obvious place. Even if it's used by just a single arch, generic code is much more visible. PowerPC really has this somewhat weird track record of privatizing generic facilities and smugly keeping it to themselves as a competitive advantage ;-) Reminds me of the old semaphore code which was the best on PowerPC, for years. Let's not go there again :) Ingo
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-08 20:08 ` Ingo Molnar @ 2009-10-08 20:28 ` stephane eranian 2009-10-12 9:05 ` Ingo Molnar 0 siblings, 1 reply; 21+ messages in thread From: stephane eranian @ 2009-10-08 20:28 UTC (permalink / raw) To: Ingo Molnar Cc: David Miller, paulus, a.p.zijlstra, linux-kernel, perfmon2-devel On Thu, Oct 8, 2009 at 10:08 PM, Ingo Molnar <mingo@elte.hu> wrote: > > * David Miller <davem@davemloft.net> wrote: > >> From: stephane eranian <eranian@googlemail.com> >> Date: Wed, 7 Oct 2009 14:31:58 +0200 >> >> > What PPC does is probably the only way to do this given the interface between >> > generic and machine-specific code. The one advantage I see is that it works >> > inside an event group but also across event groups because that code does not >> > look at group boundary, it only looks at the events and the number of available >> > registers. The downside is that you duplicate state. >> > >> > Did I get this right, Paul? >> >> That's basically how his code works, yes. I intend on duplicating it >> to some extent on sparc64 since I'm operating in a similar problem >> space. >> >> So if at least some of this engine went to a generic place, there'd be >> at least a 3rd user :-) > > Yeah, i'd definitely suggest to generalize this. We've missed updating > PowerPC lowlevel details a couple of times in perf core updates, just > because it's in a non-obvious place. Even if it's used by just a single > arch, generic code is much more visible. > It is not clear how you can do this without creating a monster. As I said the constraints can be far more difficult than just event X => allowed_counter_bitmask. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-08 20:28 ` stephane eranian @ 2009-10-12 9:05 ` Ingo Molnar 2009-10-13 7:17 ` stephane eranian 0 siblings, 1 reply; 21+ messages in thread From: Ingo Molnar @ 2009-10-12 9:05 UTC (permalink / raw) To: eranian; +Cc: David Miller, paulus, a.p.zijlstra, linux-kernel, perfmon2-devel * stephane eranian <eranian@googlemail.com> wrote: > On Thu, Oct 8, 2009 at 10:08 PM, Ingo Molnar <mingo@elte.hu> wrote: > > > > * David Miller <davem@davemloft.net> wrote: > > > >> From: stephane eranian <eranian@googlemail.com> > >> Date: Wed, 7 Oct 2009 14:31:58 +0200 > >> > >> > What PPC does is probably the only way to do this given the interface between > >> > generic and machine-specific code. The one advantage I see is that it works > >> > inside an event group but also across event groups because that code does not > >> > look at group boundary, it only looks at the events and the number of available > >> > registers. The downside is that you duplicate state. > >> > > >> > Did I get this right, Paul? > >> > >> That's basically how his code works, yes. I intend on duplicating it > >> to some extent on sparc64 since I'm operating in a similar problem > >> space. > >> > >> So if at least some of this engine went to a generic place, there'd be > >> at least a 3rd user :-) > > > > Yeah, i'd definitely suggest to generalize this. We've missed updating > > PowerPC lowlevel details a couple of times in perf core updates, just > > because it's in a non-obvious place. Even if it's used by just a single > > arch, generic code is much more visible. > > > > It is not clear how you can do this without creating a monster. As I > said the constraints can be far more difficult than just event X => > allowed_counter_bitmask. The event constraints we are interested in come from the physics of CPUs, not from inherent properties of particular architectures. 
If you check the various constraints you'll see there's many repeating patterns and many of those will repeat between architectures. Arbitrary, random constraints (that stem from design stupidity/laziness) can be kept at arch level, as a quirk in essence. Spreading them all out into architecture code is the far worse solution, it creates a fragile distributed monster with repeating patterns - instead we want a manageable central monster ;-) [We are also quite good at controlling and shrinking monsters in the core kernel.] So we could start all this by factoring out the sane looking bits of PowerPC and x86 constraints into generic helpers, and go step by step starting from that point. Would you be interested in trying that, to finish what you started with 'struct event_constraint' in arch/x86/kernel/cpu/perf_event.c? Thanks, Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-12 9:05 ` Ingo Molnar @ 2009-10-13 7:17 ` stephane eranian 2009-10-13 7:29 ` Ingo Molnar 0 siblings, 1 reply; 21+ messages in thread From: stephane eranian @ 2009-10-13 7:17 UTC (permalink / raw) To: Ingo Molnar Cc: David Miller, paulus, a.p.zijlstra, linux-kernel, perfmon2-devel On Mon, Oct 12, 2009 at 11:05 AM, Ingo Molnar <mingo@elte.hu> wrote: > > The event constraints we are interested in come from the physics of > CPUs, not from inherent properties of particular architectures. > I don't understand this statement. > If you check the various constraints you'll see there's many repeating > patterns and many of those will repeat between architectures. > Some are similar and I have mentioned them but also many are specific. > Arbitrary, random constraints (that stem from design stupidity/laziness) > can be kept at arch level, as a quirk in essence. > We have had a discussion about event constraints early on. I think you and I have a different appreciation on why they exist. I don't think it would be fruitful to restart this. Constraints exist, they will most likely always be there. You can choose to ignore them, drop the constrained features, or add the code to deal with them when the feature is worth it. > Spreading them all out into architecture code is the far worse solution, > it creates a fragile distributed monster with repeating patterns - > instead we want a manageable central monster ;-) [We are also quite good > at controlling and shrinking monsters in the core kernel.] > I don't understand this either. Why would architecture specific code be more fragile ? > So we could start all this by factoring out the sane looking bits of > PowerPC and x86 constraints into generic helpers, and go step by step > starting from that point. > > Would you be interested in trying that, to finish what you started with > 'struct event_constraint' in arch/x86/kernel/cpu/perf_event.c? 
I have already started this effort because what is there now, though correct, is still not satisfactory. But I have decided to first try to implement it in the X86 specific code to gauge what is actually needed. Then, we may be able to promote some code to the generic layer. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-13 7:17 ` stephane eranian @ 2009-10-13 7:29 ` Ingo Molnar 0 siblings, 0 replies; 21+ messages in thread From: Ingo Molnar @ 2009-10-13 7:29 UTC (permalink / raw) To: eranian; +Cc: David Miller, paulus, a.p.zijlstra, linux-kernel, perfmon2-devel * stephane eranian <eranian@googlemail.com> wrote: > > Spreading them all out into architecture code is the far worse > > solution, it creates a fragile distributed monster with repeating > > patterns - instead we want a manageable central monster ;-) [We are > > also quite good at controlling and shrinking monsters in the core > > kernel.] > > I don't understand this either. > Why would architecture specific code be more fragile ? Because similar code spread out and partly duplicated in 22 architectures is an order of magnitude less maintainable than a core library. Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-07 12:31 ` stephane eranian 2009-10-07 20:46 ` David Miller @ 2009-10-08 23:18 ` Paul Mackerras 1 sibling, 0 replies; 21+ messages in thread From: Paul Mackerras @ 2009-10-08 23:18 UTC (permalink / raw) To: eranian; +Cc: Peter Zijlstra, linux-kernel, mingo, perfmon2-devel stephane eranian writes: > I am not an expert on PPC PMU register constraints but I took a quick look > at the code and in particular hw_perf_enable() where the action seems to be. > > Given that in kernel/perf_events.c, the PMU specific layer is invoked on a per > event basis in event_sched_in(), you need to have a way to look at the registers > you have already assigned. I think this is what PPC does. It stops the PMU and > re-runs the assignment code. But for that it needs to maintain a > per-cpu structure > which has the current event -> counter assignment. The idea is that when switching contexts, the core code does hw_perf_disable, then calls hw_perf_group_sched_in for each group that it wants to have on the PMU, then calls hw_perf_enable. So what the powerpc code does is to defer the actual assignment of perf_events to hardware counters until the hw_perf_enable call. As each group is added, I do the constraint checking to ensure that the group can go on, but I don't do the assignment of perf_events to hardware counters or the computation of PMU control register values. I have a way of encoding all the constraints into a pair of 64-bit values for each event such that I can tell very quickly (using only some quick integer arithmetic) whether it's possible to add a given event to the set that is on the PMU without violating any constraints. There is a bit of extra complexity that comes in because there are sometimes alternative event codes for the same event. 
So as each event is added to the set to go on the PMU, if the initial constraint check indicates that it can't go on, I then go and do a search over the space of alternative codes (for all of the events currently in the set plus the one I want to add) to see if there's a way to get everything on using alternative codes for some events. That sounds expensive but it turns out not to be because only a few events have alternative codes, and there are generally only a couple of alternative codes for those events. The event codes that I use encode settings for the various multiplexers plus an indication of what set of counters the event can be counted on. If an event can be counted on all or some subset of counters with the same settings for all the relevant multiplexers, then I use a single code for it. If an event can be counted for example on hardware counter 1 with selector code 0xf0, or hardware counter 2 with selector code 0x12, then I use two alternative event codes for that event. So this all means that I can map an event code into two 64-bit values -- a value/mask pair. That mapping is processor-specific, but the code that checks whether a set of events is feasible is generic. The idea is that the 64-bit value/mask pair is divided into bitfields, each of which describes one constraint. The big comment at the end of arch/powerpc/include/asm/perf_event.h describes the three different types of constraints that can be represented and how that works as a bitfield. It turns out that this is very powerful and very fast, since the constraint checking is just a few adds, ands and ors, done on the whole 64-bit value/mask pairs (there is no need to iterate over individual bitfields). I hope this makes it a bit clearer. Let me know if I need to expand further. Paul. ^ permalink raw reply [flat|nested] 21+ messages in thread
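[A minimal sketch of the "select-field" half of the value/mask scheme Paul describes. The adder fields for counting shared resources are omitted, and the struct and function names are invented here; the real encoding is documented in the big comment in arch/powerpc/include/asm/perf_event.h:

```c
/*
 * Sketch of the value/mask constraint idea: each event maps to a
 * (value, mask) pair whose bitfields encode the multiplexer settings
 * it needs.  Two events are compatible when every field they both use
 * is programmed identically, so the feasibility check is a few
 * ands/xors/ors on whole 64-bit words, with no iteration over
 * individual bitfields.
 */
struct cstr {
	unsigned long long value;
	unsigned long long mask;
};

/* try to fold event @ev into the accumulated state @acc; 0 on success */
static int cstr_add(struct cstr *acc, const struct cstr *ev)
{
	unsigned long long shared = acc->mask & ev->mask;

	/* conflicting setting in a field both events use? */
	if ((acc->value ^ ev->value) & shared)
		return -1;

	acc->value |= ev->value;
	acc->mask  |= ev->mask;
	return 0;
}
```

An event with two alternative codes would simply map to two (value, mask) pairs, and the search Paul mentions retries cstr_add() with the alternatives when the first combination fails.]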
* [tip:perf/core] perf, x86: Add simple group validation 2009-10-07 10:31 ` Peter Zijlstra 2009-10-07 11:15 ` Paul Mackerras @ 2009-10-09 14:22 ` tip-bot for Peter Zijlstra 1 sibling, 0 replies; 21+ messages in thread From: tip-bot for Peter Zijlstra @ 2009-10-09 14:22 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, a.p.zijlstra, tglx, eranian, mingo Commit-ID: fe9081cc9bdabb0be953a39ad977cea14e35bce5 Gitweb: http://git.kernel.org/tip/fe9081cc9bdabb0be953a39ad977cea14e35bce5 Author: Peter Zijlstra <a.p.zijlstra@chello.nl> AuthorDate: Thu, 8 Oct 2009 11:56:07 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Fri, 9 Oct 2009 15:56:14 +0200 perf, x86: Add simple group validation Refuse to add events when the group wouldn't fit onto the PMU anymore. Naive implementation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <1254911461.26976.239.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/cpu/perf_event.c | 90 +++++++++++++++++++++++++++++--------- 1 files changed, 69 insertions(+), 21 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index 9c75854..9961d84 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -114,7 +114,8 @@ struct x86_pmu { u64 intel_ctrl; void (*enable_bts)(u64 config); void (*disable_bts)(void); - int (*get_event_idx)(struct hw_perf_event *hwc); + int (*get_event_idx)(struct cpu_hw_events *cpuc, + struct hw_perf_event *hwc); }; static struct x86_pmu x86_pmu __read_mostly; @@ -523,7 +524,7 @@ static u64 intel_pmu_raw_event(u64 hw_event) #define CORE_EVNTSEL_UNIT_MASK 0x0000FF00ULL #define CORE_EVNTSEL_EDGE_MASK 0x00040000ULL #define CORE_EVNTSEL_INV_MASK 0x00800000ULL -#define CORE_EVNTSEL_REG_MASK 0xFF000000ULL +#define CORE_EVNTSEL_REG_MASK 0xFF000000ULL #define CORE_EVNTSEL_MASK \ (CORE_EVNTSEL_EVENT_MASK | \ @@ -1390,8 +1391,7 @@ static void 
amd_pmu_enable_event(struct hw_perf_event *hwc, int idx) x86_pmu_enable_event(hwc, idx); } -static int -fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) +static int fixed_mode_idx(struct hw_perf_event *hwc) { unsigned int hw_event; @@ -1424,9 +1424,9 @@ fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) /* * generic counter allocator: get next free counter */ -static int gen_get_event_idx(struct hw_perf_event *hwc) +static int +gen_get_event_idx(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); int idx; idx = find_first_zero_bit(cpuc->used_mask, x86_pmu.num_events); @@ -1436,16 +1436,16 @@ static int gen_get_event_idx(struct hw_perf_event *hwc) /* * intel-specific counter allocator: check event constraints */ -static int intel_get_event_idx(struct hw_perf_event *hwc) +static int +intel_get_event_idx(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); const struct event_constraint *event_constraint; int i, code; if (!event_constraint) goto skip; - code = hwc->config & 0xff; + code = hwc->config & CORE_EVNTSEL_EVENT_MASK; for_each_event_constraint(event_constraint, event_constraint) { if (code == event_constraint->code) { @@ -1457,26 +1457,22 @@ static int intel_get_event_idx(struct hw_perf_event *hwc) } } skip: - return gen_get_event_idx(hwc); + return gen_get_event_idx(cpuc, hwc); } -/* - * Find a PMC slot for the freshly enabled / scheduled in event: - */ -static int x86_pmu_enable(struct perf_event *event) +static int +x86_schedule_event(struct cpu_hw_events *cpuc, struct hw_perf_event *hwc) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); - struct hw_perf_event *hwc = &event->hw; int idx; - idx = fixed_mode_idx(event, hwc); + idx = fixed_mode_idx(hwc); if (idx == X86_PMC_IDX_FIXED_BTS) { /* BTS is already occupied. 
*/ if (test_and_set_bit(idx, cpuc->used_mask)) return -EAGAIN; hwc->config_base = 0; - hwc->event_base = 0; + hwc->event_base = 0; hwc->idx = idx; } else if (idx >= 0) { /* @@ -1499,17 +1495,33 @@ static int x86_pmu_enable(struct perf_event *event) /* Try to get the previous generic event again */ if (idx == -1 || test_and_set_bit(idx, cpuc->used_mask)) { try_generic: - idx = x86_pmu.get_event_idx(hwc); + idx = x86_pmu.get_event_idx(cpuc, hwc); if (idx == -1) return -EAGAIN; set_bit(idx, cpuc->used_mask); hwc->idx = idx; } - hwc->config_base = x86_pmu.eventsel; - hwc->event_base = x86_pmu.perfctr; + hwc->config_base = x86_pmu.eventsel; + hwc->event_base = x86_pmu.perfctr; } + return idx; +} + +/* + * Find a PMC slot for the freshly enabled / scheduled in event: + */ +static int x86_pmu_enable(struct perf_event *event) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + struct hw_perf_event *hwc = &event->hw; + int idx; + + idx = x86_schedule_event(cpuc, hwc); + if (idx < 0) + return idx; + perf_events_lapic_init(); x86_pmu.disable(hwc, idx); @@ -2212,11 +2224,47 @@ static const struct pmu pmu = { .unthrottle = x86_pmu_unthrottle, }; +static int +validate_event(struct cpu_hw_events *cpuc, struct perf_event *event) +{ + struct hw_perf_event fake_event = event->hw; + + if (event->pmu != &pmu) + return 0; + + return x86_schedule_event(cpuc, &fake_event); +} + +static int validate_group(struct perf_event *event) +{ + struct perf_event *sibling, *leader = event->group_leader; + struct cpu_hw_events fake_pmu; + + memset(&fake_pmu, 0, sizeof(fake_pmu)); + + if (!validate_event(&fake_pmu, leader)) + return -ENOSPC; + + list_for_each_entry(sibling, &leader->sibling_list, group_entry) { + if (!validate_event(&fake_pmu, sibling)) + return -ENOSPC; + } + + if (!validate_event(&fake_pmu, event)) + return -ENOSPC; + + return 0; +} + const struct pmu *hw_perf_event_init(struct perf_event *event) { int err; err = __hw_perf_event_init(event); + if (!err) { + if 
(event->group_leader != event) + err = validate_group(event); + } if (err) { if (event->destroy) event->destroy(event); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] perf_events: add event constraints support for Intel processors 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian 2009-10-06 16:29 ` Peter Zijlstra @ 2009-10-09 13:55 ` Ingo Molnar 2009-10-09 14:22 ` [tip:perf/core] perf_events: Add " tip-bot for Stephane Eranian 2 siblings, 0 replies; 21+ messages in thread From: Ingo Molnar @ 2009-10-09 13:55 UTC (permalink / raw) To: Stephane Eranian Cc: linux-kernel, paulus, a.p.zijlstra, perfmon2-devel, Stephane Eranian * Stephane Eranian <eranian@googlemail.com> wrote: > +struct evt_cstr { > + unsigned long idxmsk[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; > + int code; > +}; > + > +#define EVT_CSTR0(c, m) { .code = (c), .idxmsk[0] = (m) } > +#define EVT_CSTR_END { .code = 0, .idxmsk[0] = 0 } > + > +#define for_each_evt_cstr(e, c) \ > + for((e) = (c); (e)->idxmsk[0]; (e)++) Nice patch - but the naming here absolutely sucked, so i changed evt_cstr, idxmsk, CSTR, etc. to something more palatable. Field names and abstractions in Linux code really need to be meaningful, and the code has to be readable and understandable. Wdntusabbrntslkthtinlnx :) Ingo ^ permalink raw reply [flat|nested] 21+ messages in thread
* [tip:perf/core] perf_events: Add event constraints support for Intel processors 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian 2009-10-06 16:29 ` Peter Zijlstra 2009-10-09 13:55 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Ingo Molnar @ 2009-10-09 14:22 ` tip-bot for Stephane Eranian 2 siblings, 0 replies; 21+ messages in thread From: tip-bot for Stephane Eranian @ 2009-10-09 14:22 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, eranian, a.p.zijlstra, tglx, eranian, mingo Commit-ID: b690081d4d3f6a23541493f1682835c3cd5c54a1 Gitweb: http://git.kernel.org/tip/b690081d4d3f6a23541493f1682835c3cd5c54a1 Author: Stephane Eranian <eranian@googlemail.com> AuthorDate: Tue, 6 Oct 2009 16:42:09 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Fri, 9 Oct 2009 15:56:12 +0200 perf_events: Add event constraints support for Intel processors On some Intel processors, not all events can be measured in all counters. Some events can only be measured in one particular counter, for instance. Assigning an event to the wrong counter does not crash the machine but yields bogus counts, i.e., a silent error. This patch changes the event-to-counter assignment logic to take into account event constraints for Intel P6, Core and Nehalem processors. There are no constraints on Intel Atom. There are constraints on Intel Yonah (Core Duo) but they are not provided in this patch given that this processor is not yet supported by perf_events. As a result of the constraints, it is possible for some event groups to never actually be loaded onto the PMU if they contain two events which can only be measured on a single counter. That situation can be detected with the scaling information extracted with read(). 
Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/cpu/perf_event.c | 109 ++++++++++++++++++++++++++++++++++++-- 1 files changed, 105 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index 1d16bd6..9c75854 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -77,6 +77,18 @@ struct cpu_hw_events { struct debug_store *ds; }; +struct event_constraint { + unsigned long idxmsk[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; + int code; +}; + +#define EVENT_CONSTRAINT(c, m) { .code = (c), .idxmsk[0] = (m) } +#define EVENT_CONSTRAINT_END { .code = 0, .idxmsk[0] = 0 } + +#define for_each_event_constraint(e, c) \ + for ((e) = (c); (e)->idxmsk[0]; (e)++) + + /* * struct x86_pmu - generic x86 pmu */ @@ -102,6 +114,7 @@ struct x86_pmu { u64 intel_ctrl; void (*enable_bts)(u64 config); void (*disable_bts)(void); + int (*get_event_idx)(struct hw_perf_event *hwc); }; static struct x86_pmu x86_pmu __read_mostly; @@ -110,6 +123,8 @@ static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = { .enabled = 1, }; +static const struct event_constraint *event_constraint; + /* * Not sure about some of these */ @@ -155,6 +170,16 @@ static u64 p6_pmu_raw_event(u64 hw_event) return hw_event & P6_EVNTSEL_MASK; } +static const struct event_constraint intel_p6_event_constraints[] = +{ + EVENT_CONSTRAINT(0xc1, 0x1), /* FLOPS */ + EVENT_CONSTRAINT(0x10, 0x1), /* FP_COMP_OPS_EXE */ + EVENT_CONSTRAINT(0x11, 0x1), /* FP_ASSIST */ + EVENT_CONSTRAINT(0x12, 0x2), /* MUL */ + EVENT_CONSTRAINT(0x13, 0x2), /* DIV */ + EVENT_CONSTRAINT(0x14, 0x1), /* CYCLES_DIV_BUSY */ + EVENT_CONSTRAINT_END +}; /* * Intel PerfMon v3. Used on Core2 and later. 
@@ -170,6 +195,35 @@ static const u64 intel_perfmon_event_map[] = [PERF_COUNT_HW_BUS_CYCLES] = 0x013c, }; +static const struct event_constraint intel_core_event_constraints[] = +{ + EVENT_CONSTRAINT(0x10, 0x1), /* FP_COMP_OPS_EXE */ + EVENT_CONSTRAINT(0x11, 0x2), /* FP_ASSIST */ + EVENT_CONSTRAINT(0x12, 0x2), /* MUL */ + EVENT_CONSTRAINT(0x13, 0x2), /* DIV */ + EVENT_CONSTRAINT(0x14, 0x1), /* CYCLES_DIV_BUSY */ + EVENT_CONSTRAINT(0x18, 0x1), /* IDLE_DURING_DIV */ + EVENT_CONSTRAINT(0x19, 0x2), /* DELAYED_BYPASS */ + EVENT_CONSTRAINT(0xa1, 0x1), /* RS_UOPS_DISPATCH_CYCLES */ + EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED */ + EVENT_CONSTRAINT_END +}; + +static const struct event_constraint intel_nehalem_event_constraints[] = +{ + EVENT_CONSTRAINT(0x40, 0x3), /* L1D_CACHE_LD */ + EVENT_CONSTRAINT(0x41, 0x3), /* L1D_CACHE_ST */ + EVENT_CONSTRAINT(0x42, 0x3), /* L1D_CACHE_LOCK */ + EVENT_CONSTRAINT(0x43, 0x3), /* L1D_ALL_REF */ + EVENT_CONSTRAINT(0x4e, 0x3), /* L1D_PREFETCH */ + EVENT_CONSTRAINT(0x4c, 0x3), /* LOAD_HIT_PRE */ + EVENT_CONSTRAINT(0x51, 0x3), /* L1D */ + EVENT_CONSTRAINT(0x52, 0x3), /* L1D_CACHE_PREFETCH_LOCK_FB_HIT */ + EVENT_CONSTRAINT(0x53, 0x3), /* L1D_CACHE_LOCK_FB_HIT */ + EVENT_CONSTRAINT(0xc5, 0x3), /* CACHE_LOCK_CYCLES */ + EVENT_CONSTRAINT_END +}; + static u64 intel_pmu_event_map(int hw_event) { return intel_perfmon_event_map[hw_event]; @@ -932,6 +986,8 @@ static int __hw_perf_event_init(struct perf_event *event) */ hwc->config = ARCH_PERFMON_EVENTSEL_INT; + hwc->idx = -1; + /* * Count user and OS events unless requested not to. */ @@ -1366,6 +1422,45 @@ fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) } /* + * generic counter allocator: get next free counter + */ +static int gen_get_event_idx(struct hw_perf_event *hwc) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + int idx; + + idx = find_first_zero_bit(cpuc->used_mask, x86_pmu.num_events); + return idx == x86_pmu.num_events ? 
-1 : idx; +} + +/* + * intel-specific counter allocator: check event constraints + */ +static int intel_get_event_idx(struct hw_perf_event *hwc) +{ + struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + const struct event_constraint *event_constraint; + int i, code; + + if (!event_constraint) + goto skip; + + code = hwc->config & 0xff; + + for_each_event_constraint(event_constraint, event_constraint) { + if (code == event_constraint->code) { + for_each_bit(i, event_constraint->idxmsk, X86_PMC_IDX_MAX) { + if (!test_and_set_bit(i, cpuc->used_mask)) + return i; + } + return -1; + } + } +skip: + return gen_get_event_idx(hwc); +} + +/* * Find a PMC slot for the freshly enabled / scheduled in event: */ static int x86_pmu_enable(struct perf_event *event) @@ -1402,11 +1497,10 @@ static int x86_pmu_enable(struct perf_event *event) } else { idx = hwc->idx; /* Try to get the previous generic event again */ - if (test_and_set_bit(idx, cpuc->used_mask)) { + if (idx == -1 || test_and_set_bit(idx, cpuc->used_mask)) { try_generic: - idx = find_first_zero_bit(cpuc->used_mask, - x86_pmu.num_events); - if (idx == x86_pmu.num_events) + idx = x86_pmu.get_event_idx(hwc); + if (idx == -1) return -EAGAIN; set_bit(idx, cpuc->used_mask); @@ -1883,6 +1977,7 @@ static struct x86_pmu p6_pmu = { */ .event_bits = 32, .event_mask = (1ULL << 32) - 1, + .get_event_idx = intel_get_event_idx, }; static struct x86_pmu intel_pmu = { @@ -1906,6 +2001,7 @@ static struct x86_pmu intel_pmu = { .max_period = (1ULL << 31) - 1, .enable_bts = intel_pmu_enable_bts, .disable_bts = intel_pmu_disable_bts, + .get_event_idx = intel_get_event_idx, }; static struct x86_pmu amd_pmu = { @@ -1926,6 +2022,7 @@ static struct x86_pmu amd_pmu = { .apic = 1, /* use highest bit to detect overflow */ .max_period = (1ULL << 47) - 1, + .get_event_idx = gen_get_event_idx, }; static int p6_pmu_init(void) @@ -1938,10 +2035,12 @@ static int p6_pmu_init(void) case 7: case 8: case 11: /* Pentium III */ + event_constraint = 
intel_p6_event_constraints; break; case 9: case 13: /* Pentium M */ + event_constraint = intel_p6_event_constraints; break; default: pr_cont("unsupported p6 CPU model %d ", @@ -2013,12 +2112,14 @@ static int intel_pmu_init(void) sizeof(hw_cache_event_ids)); pr_cont("Core2 events, "); + event_constraint = intel_core_event_constraints; break; default: case 26: memcpy(hw_cache_event_ids, nehalem_hw_cache_event_ids, sizeof(hw_cache_event_ids)); + event_constraint = intel_nehalem_event_constraints; pr_cont("Nehalem/Corei7 events, "); break; case 28: ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [tip:perf/core] perf_events: Check for filters on fixed counter events 2009-10-06 14:42 ` [PATCH 1/2] perf_events: check for filters on fixed counter events Stephane Eranian 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian @ 2009-10-09 14:22 ` tip-bot for Stephane Eranian 1 sibling, 0 replies; 21+ messages in thread From: tip-bot for Stephane Eranian @ 2009-10-09 14:22 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, eranian, a.p.zijlstra, tglx, eranian, mingo Commit-ID: 04a705df47d1ea27ca2b066f24b1951c51792d0d Gitweb: http://git.kernel.org/tip/04a705df47d1ea27ca2b066f24b1951c51792d0d Author: Stephane Eranian <eranian@googlemail.com> AuthorDate: Tue, 6 Oct 2009 16:42:08 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Fri, 9 Oct 2009 15:56:10 +0200 perf_events: Check for filters on fixed counter events Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Reject filters are: inv, edge, cnt-mask. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/include/asm/perf_event.h | 13 ++++++++++++- arch/x86/kernel/cpu/perf_event.c | 6 ++++++ 2 files changed, 18 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index ad7ce3f..8d9f854 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -28,9 +28,20 @@ */ #define ARCH_PERFMON_EVENT_MASK 0xffff +/* + * filter mask to validate fixed counter events. 
+ * the following filters disqualify for fixed counters: + * - inv + * - edge + * - cnt-mask + * The other filters are supported by fixed counters. + * The any-thread option is supported starting with v3. + */ +#define ARCH_PERFMON_EVENT_FILTER_MASK 0xff840000 + #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL 0x3c #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK (0x00 << 8) -#define ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX 0 +#define ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX 0 #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT \ (1 << (ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX)) diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index b5801c3..1d16bd6 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -1349,6 +1349,12 @@ fixed_mode_idx(struct perf_event *event, struct hw_perf_event *hwc) if (!x86_pmu.num_events_fixed) return -1; + /* + * fixed counters do not take all possible filters + */ + if (hwc->config & ARCH_PERFMON_EVENT_FILTER_MASK) + return -1; + if (unlikely(hw_event == x86_pmu.event_map(PERF_COUNT_HW_INSTRUCTIONS))) return X86_PMC_IDX_FIXED_INSTRUCTIONS; if (unlikely(hw_event == x86_pmu.event_map(PERF_COUNT_HW_CPU_CYCLES))) ^ permalink raw reply related [flat|nested] 21+ messages in thread
end of thread, other threads:[~2009-10-13 7:29 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-10-06 14:42 [PATCH 0/2] perf_events: correct event assignments on Intel processors Stephane Eranian 2009-10-06 14:42 ` [PATCH 1/2] perf_events: check for filters on fixed counter events Stephane Eranian 2009-10-06 14:42 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Stephane Eranian 2009-10-06 16:29 ` Peter Zijlstra 2009-10-06 17:26 ` stephane eranian 2009-10-06 18:57 ` [perfmon2] " Vince Weaver 2009-10-07 10:31 ` Peter Zijlstra 2009-10-07 11:15 ` Paul Mackerras 2009-10-07 12:31 ` stephane eranian 2009-10-07 20:46 ` David Miller 2009-10-07 21:30 ` stephane eranian 2009-10-08 20:08 ` Ingo Molnar 2009-10-08 20:28 ` stephane eranian 2009-10-12 9:05 ` Ingo Molnar 2009-10-13 7:17 ` stephane eranian 2009-10-13 7:29 ` Ingo Molnar 2009-10-08 23:18 ` Paul Mackerras 2009-10-09 14:22 ` [tip:perf/core] perf, x86: Add simple group validation tip-bot for Peter Zijlstra 2009-10-09 13:55 ` [PATCH 2/2] perf_events: add event constraints support for Intel processors Ingo Molnar 2009-10-09 14:22 ` [tip:perf/core] perf_events: Add " tip-bot for Stephane Eranian 2009-10-09 14:22 ` [tip:perf/core] perf_events: Check for filters on fixed counter events tip-bot for Stephane Eranian