* [PATCH v2 0/2] perf/x86: simplify PEBS constraints @ 2014-06-27 13:48 Stephane Eranian 2014-06-27 13:48 ` [PATCH v2 1/2] " Stephane Eranian 2014-06-27 13:48 ` [PATCH v2 2/2] perf/x86: load latency event supports all PEBS counters Stephane Eranian 0 siblings, 2 replies; 6+ messages in thread From: Stephane Eranian @ 2014-06-27 13:48 UTC (permalink / raw) To: linux-kernel; +Cc: peterz, mingo, ak, jmario, dzickus, jolsa, acme This short series of patches greatly simplifies event constraints for PEBS events on Intel NHM/WSM,SNB/IVB/,HSW. The first patch removes all non counter specific or special constraint for PEBS event. It is based on the fact that enabling precise (PEBS) on a event that does not support PEBS is harmless. No samples is captured. Thus no need to specify PEBS events in pebs constraint tables. The first patch is based on PeterZ initial patch proposal posted on LKML (https://lkml.org/lkml/2014/6/19/343). The second patch simplifies the constraint for the Load Latency event. It was hardcoded to counter 3 when in fact, it can run perfectly well on any PEBS counter. The constraint was there to help simplify scheduling of the event which requires an extra MSR shared with all other counters on the PMU. But in Linux, we have an infrastructure to handle shared regs like this. The advantage is that we can now combine load latency and precise store sampling in a single run without multiplexing. Stephane Eranian (2): perf/x86: simplify PEBS constraints perf/x86: load latency event supports all PEBS counters arch/x86/kernel/cpu/perf_event.h | 4 ++ arch/x86/kernel/cpu/perf_event_intel.c | 12 +++- arch/x86/kernel/cpu/perf_event_intel_ds.c | 109 +++-------------------------- 3 files changed, 25 insertions(+), 100 deletions(-) -- 1.7.9.5 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/2] perf/x86: simplify PEBS constraints 2014-06-27 13:48 [PATCH v2 0/2] perf/x86: simplify PEBS constraints Stephane Eranian @ 2014-06-27 13:48 ` Stephane Eranian 2014-06-27 15:37 ` Andi Kleen 2014-06-27 13:48 ` [PATCH v2 2/2] perf/x86: load latency event supports all PEBS counters Stephane Eranian 1 sibling, 1 reply; 6+ messages in thread From: Stephane Eranian @ 2014-06-27 13:48 UTC (permalink / raw) To: linux-kernel; +Cc: peterz, mingo, ak, jmario, dzickus, jolsa, acme This patches simplifies the management of PEBS constraints for Intel processors. It assumes using a event in PEBS (precise sampling) mode when the event does not actually support it, is harmless and no sample is captured. Based on that assumption, it is possible to simplify all the PEBS constraint tables, by only listing the events which have specific PEBS counter constraints, e.g., run only on one of the 4 PEBS counters. Events which have extra registers requirements such as load latency events must stay in the constraint tables. But every other event can be eliminated. The patch adds INTEL_PEBS_CONSTRAINT() to each table as a 'catch all' constraint which matches any event code. Based on patch proposal by PeterZ on LKML. See https://lkml.org/lkml/2014/6/19/343. Signed-off-by: Stephane Eranian <eranian@google.com> --- arch/x86/kernel/cpu/perf_event.h | 4 ++ arch/x86/kernel/cpu/perf_event_intel.c | 12 +++- arch/x86/kernel/cpu/perf_event_intel_ds.c | 103 +++-------------------------- 3 files changed, 22 insertions(+), 97 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h index 3b2f9bd..cf48a02 100644 --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -264,6 +264,10 @@ struct cpu_hw_events { __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW) +/* catch all event code+umask */ +#define INTEL_PEBS_CONSTRAINT(n) \ + EVENT_CONSTRAINT(0x0000, n, 0x0000) + /* * We define the end marker as having a weight of -1 * to enable blacklisting of events using a counter bitmask diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index adb02aa..53b96c1 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -1735,8 +1735,16 @@ static int intel_pmu_hw_config(struct perf_event *event) if (ret) return ret; - if (event->attr.precise_ip && x86_pmu.pebs_aliases) - x86_pmu.pebs_aliases(event); + if (event->attr.precise_ip) { + /* + * not a valid event code and used as a PEBS code catch all + */ + if ((event->hw.config & INTEL_ARCH_EVENT_MASK) == 0x0000) + return -EINVAL; + + if (x86_pmu.pebs_aliases) + x86_pmu.pebs_aliases(event); + } if (intel_pmu_needs_lbr_smpl(event)) { ret = intel_pmu_setup_lbr_filter(event); diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 980970c..fdadec0 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -551,141 +551,54 @@ int intel_pmu_drain_bts_buffer(void) * PEBS */ struct event_constraint intel_core2_pebs_event_constraints[] = { - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY */ - INTEL_UEVENT_CONSTRAINT(0xfec1, 0x1), /* X87_OPS_RETIRED.ANY */ - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_RETIRED.MISPRED */ - INTEL_UEVENT_CONSTRAINT(0x1fc7, 0x1), /* SIMD_INST_RETURED.ANY */ - INTEL_EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED.* */ + INTEL_PEBS_CONSTRAINT(0x1), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_atom_pebs_event_constraints[] = { - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY */ - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* MISPREDICTED_BRANCH_RETIRED */ - INTEL_EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED.* */ + INTEL_PEBS_CONSTRAINT(0x1), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_slm_pebs_event_constraints[] = { - INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */ - INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */ - INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */ - INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */ - INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */ - INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */ - INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */ - INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */ - INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */ + INTEL_PEBS_CONSTRAINT(0x1), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_nehalem_pebs_event_constraints[] = { INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x010c, 0xf), /* MEM_STORE_RETIRED.DTLB_MISS */ - INTEL_EVENT_CONSTRAINT(0xc0, 0xf), /* INST_RETIRED.ANY */ - INTEL_EVENT_CONSTRAINT(0xc2, 0xf), /* UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x02c5, 0xf), /* BR_MISP_RETIRED.NEAR_CALL */ - INTEL_EVENT_CONSTRAINT(0xc7, 0xf), /* SSEX_UOPS_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x20c8, 0xf), /* ITLB_MISS_RETIRED */ - INTEL_EVENT_CONSTRAINT(0xcb, 0xf), /* MEM_LOAD_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xf7, 0xf), /* FP_ASSIST.* */ + INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_westmere_pebs_event_constraints[] = { INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x010c, 0xf), /* MEM_STORE_RETIRED.DTLB_MISS */ - INTEL_EVENT_CONSTRAINT(0xc0, 0xf), /* INSTR_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc2, 0xf), /* UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc7, 0xf), /* SSEX_UOPS_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x20c8, 0xf), /* ITLB_MISS_RETIRED */ - INTEL_EVENT_CONSTRAINT(0xcb, 0xf), /* MEM_LOAD_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xf7, 0xf), /* FP_ASSIST.* */ + INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_snb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */ + INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_ivb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ + INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END }; struct event_constraint intel_hsw_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */ - INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */ - INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */ INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */ - /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */ - INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf), - /* MEM_UOPS_RETIRED.STLB_MISS_STORES */ - INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf), - INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */ - INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */ - /* MEM_UOPS_RETIRED.SPLIT_STORES */ - INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf), - INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */ INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */ - INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */ - INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */ - INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */ - /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */ - INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf), - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */ - INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf), - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */ - INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf), - /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */ - INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf), - INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */ - INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */ - + INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END }; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] perf/x86: simplify PEBS constraints 2014-06-27 13:48 ` [PATCH v2 1/2] " Stephane Eranian @ 2014-06-27 15:37 ` Andi Kleen 2014-07-01 17:04 ` Stephane Eranian 0 siblings, 1 reply; 6+ messages in thread From: Andi Kleen @ 2014-06-27 15:37 UTC (permalink / raw) To: Stephane Eranian Cc: linux-kernel, peterz, mingo, jmario, dzickus, jolsa, acme On Fri, Jun 27, 2014 at 03:48:02PM +0200, Stephane Eranian wrote: > This patches simplifies the management of PEBS constraints > for Intel processors. I implemented this slightly differently, also fixing some more issues on the way and checking the flags properly. Not yet fully tested. --- >From a2db81ed6036b6bde04f82ff710aed03f4d5bf39 Mon Sep 17 00:00:00 2001 From: Andi Kleen <ak@linux.intel.com> Date: Thu, 26 Jun 2014 17:21:33 -0700 Subject: [PATCH] perf, x86: Revamp PEBS event selection As already discussed earlier in email. The basic idea is that it does not make sense to list all PEBS events individually. The list is very long, sometimes outdated and the hardware doesn't need it. If an event does not support PEBS it will just not count, there is no security issue. This vastly simplifies the PEBS event selection. Bugs fixed: - We do not allow setting forbidden flags with PEBS anymore (SDM 18.9.4), except for the special cycle event. This is done using a new constraint macro that also matches on the event flags. - We now allow DataLA on all Haswell events, not just a small subset. In general all PEBS events that tag memory accesses support DataLA on Haswell. Otherwise the reported address is just zero. This allows address profiling on vastly more events. - We did not allow all PEBS events on Haswell. This includes the changes proposed by Stephane earlier and obsoletes his patchkit. I only did Sandy Bridge and Silvermont and later so far, mostly because these are the parts I could directly confirm the hardware behavior with hardware architects. Cc: eranian@google.com Signed-off-by: Andi Kleen <ak@linux.intel.com> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 8249df4..8dfc9fd 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -51,6 +51,14 @@ ARCH_PERFMON_EVENTSEL_EDGE | \ ARCH_PERFMON_EVENTSEL_INV | \ ARCH_PERFMON_EVENTSEL_CMASK) +#define X86_ALL_EVENT_FLAGS \ + (ARCH_PERFMON_EVENTSEL_EDGE | \ + ARCH_PERFMON_EVENTSEL_INV | \ + ARCH_PERFMON_EVENTSEL_CMASK | \ + ARCH_PERFMON_EVENTSEL_ANY | \ + ARCH_PERFMON_EVENTSEL_PIN_CONTROL | \ + HSW_IN_TX | \ + HSW_IN_TX_CHECKPOINTED) #define AMD64_RAW_EVENT_MASK \ (X86_RAW_EVENT_MASK | \ AMD64_EVENTSEL_EVENT) diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h index 3b2f9bd..37b2841 100644 --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -252,17 +252,20 @@ struct cpu_hw_events { EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK) #define INTEL_PLD_CONSTRAINT(c, n) \ - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT) #define INTEL_PST_CONSTRAINT(c, n) \ - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST) -/* DataLA version of store sampling without extra enable bit. */ -#define INTEL_PST_HSW_CONSTRAINT(c, n) \ - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ - HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW) +/* Event constraint, but match on all event flags too. */ +#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \ + EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS) + +#define INTEL_FLAGS_EVENT_CONSTRAINT_DATALA(c, n) \ + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ + HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST) /* * We define the end marker as having a weight of -1 diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 980970c..053e7f2 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -567,28 +567,10 @@ struct event_constraint intel_atom_pebs_event_constraints[] = { }; struct event_constraint intel_slm_pebs_event_constraints[] = { - INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */ - INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */ - INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */ - INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */ - INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */ - INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */ - INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */ - INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */ - INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */ - INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */ - INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */ - INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */ + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + /* Allow all events as PEBS with no flags */ + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0x1), EVENT_CONSTRAINT_END }; @@ -624,68 +606,34 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = { struct event_constraint intel_snb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */ + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */ + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + /* Allow all events as PEBS with no flags */ + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0xf), EVENT_CONSTRAINT_END }; struct event_constraint intel_ivb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */ + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + /* Allow all events as PEBS with no flags */ + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0xf), EVENT_CONSTRAINT_END }; struct event_constraint intel_hsw_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ - INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */ - INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */ - INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */ - /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */ - INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf), - /* MEM_UOPS_RETIRED.STLB_MISS_STORES */ - INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf), - INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */ - INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */ - /* MEM_UOPS_RETIRED.SPLIT_STORES */ - INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf), - INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */ - INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */ - INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */ - INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */ - INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */ - /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */ - INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf), - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */ - INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf), - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */ - INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf), - /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */ - INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf), - INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */ - INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */ - + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */ + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), + /* Allow all events as PEBS with no flags */ + /* We allow DATALA for all PEBS events, will be 0 if not supported */ + INTEL_FLAGS_EVENT_CONSTRAINT_DATALA(0xffff, 0xf), EVENT_CONSTRAINT_END }; ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] perf/x86: simplify PEBS constraints 2014-06-27 15:37 ` Andi Kleen @ 2014-07-01 17:04 ` Stephane Eranian 2014-07-01 18:20 ` Andi Kleen 0 siblings, 1 reply; 6+ messages in thread From: Stephane Eranian @ 2014-07-01 17:04 UTC (permalink / raw) To: Andi Kleen Cc: LKML, Peter Zijlstra, mingo@elte.hu, Joe Mario, Don Zickus, Jiri Olsa, Arnaldo Carvalho de Melo On Fri, Jun 27, 2014 at 5:37 PM, Andi Kleen <ak@linux.intel.com> wrote: > > On Fri, Jun 27, 2014 at 03:48:02PM +0200, Stephane Eranian wrote: > > This patches simplifies the management of PEBS constraints > > for Intel processors. > > I implemented this slightly differently, also fixing some > more issues on the way and checking the flags properly. > Not yet fully tested. > > --- > > From a2db81ed6036b6bde04f82ff710aed03f4d5bf39 Mon Sep 17 00:00:00 2001 > From: Andi Kleen <ak@linux.intel.com> > Date: Thu, 26 Jun 2014 17:21:33 -0700 > Subject: [PATCH] perf, x86: Revamp PEBS event selection > > As already discussed earlier in email. > > The basic idea is that it does not make sense to list all PEBS > events individually. The list is very long, sometimes outdated > and the hardware doesn't need it. If an event does not support > PEBS it will just not count, there is no security issue. > > This vastly simplifies the PEBS event selection. > > Bugs fixed: > - We do not allow setting forbidden flags with PEBS anymore > (SDM 18.9.4), except for the special cycle event. > This is done using a new constraint macro that also > matches on the event flags. I like the fact that you handle the event flags. Indeed, PEBS does not work with filters. > > - We now allow DataLA on all Haswell events, not just > a small subset. In general all PEBS events that tag memory > accesses support DataLA on Haswell. Otherwise the reported > address is just zero. This allows address profiling > on vastly more events. > - We did not allow all PEBS events on Haswell. > > This includes the changes proposed by Stephane earlier and obsoletes > his patchkit. > It is missing NHM, WSM, Atom, mine did cover that. > > I only did Sandy Bridge and Silvermont and later so far, mostly because these > are the parts I could directly confirm the hardware behavior with hardware > architects. > The patch does not work as expected, unfortunately. See below. > > Cc: eranian@google.com > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h > index 8249df4..8dfc9fd 100644 > --- a/arch/x86/include/asm/perf_event.h > +++ b/arch/x86/include/asm/perf_event.h > @@ -51,6 +51,14 @@ > ARCH_PERFMON_EVENTSEL_EDGE | \ > ARCH_PERFMON_EVENTSEL_INV | \ > ARCH_PERFMON_EVENTSEL_CMASK) > +#define X86_ALL_EVENT_FLAGS \ > + (ARCH_PERFMON_EVENTSEL_EDGE | \ > + ARCH_PERFMON_EVENTSEL_INV | \ > + ARCH_PERFMON_EVENTSEL_CMASK | \ > + ARCH_PERFMON_EVENTSEL_ANY | \ > + ARCH_PERFMON_EVENTSEL_PIN_CONTROL | \ > + HSW_IN_TX | \ > + HSW_IN_TX_CHECKPOINTED) > #define AMD64_RAW_EVENT_MASK \ > (X86_RAW_EVENT_MASK | \ > AMD64_EVENTSEL_EVENT) > diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h > index 3b2f9bd..37b2841 100644 > --- a/arch/x86/kernel/cpu/perf_event.h > +++ b/arch/x86/kernel/cpu/perf_event.h > @@ -252,17 +252,20 @@ struct cpu_hw_events { > EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK) > > #define INTEL_PLD_CONSTRAINT(c, n) \ > - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ > + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ > HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT) > > #define INTEL_PST_CONSTRAINT(c, n) \ > - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ > + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ > HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST) > > -/* DataLA version of store sampling without extra enable bit. */ > -#define INTEL_PST_HSW_CONSTRAINT(c, n) \ > - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ > - HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW) > +/* Event constraint, but match on all event flags too. */ > +#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \ > + EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS) > + > +#define INTEL_FLAGS_EVENT_CONSTRAINT_DATALA(c, n) \ > + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \ > + HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST) > > /* > * We define the end marker as having a weight of -1 > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c > index 980970c..053e7f2 100644 > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c > @@ -567,28 +567,10 @@ struct event_constraint intel_atom_pebs_event_constraints[] = { > }; > > struct event_constraint intel_slm_pebs_event_constraints[] = { > - INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */ > - INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */ > - INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */ > - INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */ > - INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */ > - INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */ > - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */ > - INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */ > - INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */ > - INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */ > - INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */ > - INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */ > - INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */ > - INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */ > - INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */ > - INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */ > - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */ > - INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */ > - INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */ > - INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */ > - INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */ > - INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */ > + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), > + /* Allow all events as PEBS with no flags */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0x1), This macro and EVENT_CONSTRAINT() as you have them will never any event (even those with the filters cleared). The similar macro was also wrong in Peter's patch. I fixed that in mine. But now, you also want to reject any event with filters. That cannot work as is. The reason is that the matching statement is: if ((attr->config & c->mask) == c->code) You have a c->code = 0xffff. That does not match any event. You need to the the mask to ~INTEL_ARCH_EVENT_MASK & X86_ALL_EVENT_FLAGS. And c->code = 0x0, then I think it should work. So I believe the correct code should be: INTEL_FLAGS_EVENT_ALL_CONSTRAINT(0x0, 0x1), With that new macro: +/* Event constraint, but match on all event flags too. */ +#define INTEL_FLAGS_EVENT_ALL_CONSTRAINT(c, n) \ + EVENT_CONSTRAINT(c, n, (~INTEL_ARCH_EVENT_MASK & X86_ALL_EVENT_FLAGS)) I checked that and it works with random event codes now! > > @@ -624,68 +606,34 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = { > > struct event_constraint intel_snb_pebs_event_constraints[] = { > INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ > - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ > - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ > - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ > - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ > - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ > - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ > - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */ > + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ > + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */ > + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), > + /* Allow all events as PEBS with no flags */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0xf), > EVENT_CONSTRAINT_END > }; > > struct event_constraint intel_ivb_pebs_event_constraints[] = { > INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ > - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ > - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ > - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */ > - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ > - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ > - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */ > - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */ > + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ > + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */ > + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), > + /* Allow all events as PEBS with no flags */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0xffff, 0xf), > EVENT_CONSTRAINT_END > }; > > struct event_constraint intel_hsw_pebs_event_constraints[] = { > INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ > - INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ > - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */ > - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */ > - INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */ > - INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */ > - INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */ > - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */ > - /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */ > - INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf), > - /* MEM_UOPS_RETIRED.STLB_MISS_STORES */ > - INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf), > - INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */ > - INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */ > - /* MEM_UOPS_RETIRED.SPLIT_STORES */ > - INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf), > - INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */ > - INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */ > - INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */ > - INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */ > - INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */ > - /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */ > - INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf), > - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */ > - INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf), > - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */ > - INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf), > - /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */ > - INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf), > - INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */ > - INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */ > - > + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */ > + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */ > + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf), > + /* Allow all events as PEBS with no flags */ > + /* We allow DATALA for all PEBS events, will be 0 if not supported */ > + INTEL_FLAGS_EVENT_CONSTRAINT_DATALA(0xffff, 0xf), > EVENT_CONSTRAINT_END > }; > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] perf/x86: simplify PEBS constraints 2014-07-01 17:04 ` Stephane Eranian @ 2014-07-01 18:20 ` Andi Kleen 0 siblings, 0 replies; 6+ messages in thread From: Andi Kleen @ 2014-07-01 18:20 UTC (permalink / raw) To: Stephane Eranian Cc: LKML, Peter Zijlstra, mingo@elte.hu, Joe Mario, Don Zickus, Jiri Olsa, Arnaldo Carvalho de Melo Hi Stephane, Please review the followon patch I sent later. This version had some issues as you noticed. > > his patchkit. > > > It is missing NHM, WSM, Atom, mine did cover that. True. I guess we could just use yours for that. > > I only did Sandy Bridge and Silvermont and later so far, mostly because these > > are the parts I could directly confirm the hardware behavior with hardware > > architects. > > > The patch does not work as expected, unfortunately. See below. See the other patch. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] perf/x86: load latency event supports all PEBS counters 2014-06-27 13:48 [PATCH v2 0/2] perf/x86: simplify PEBS constraints Stephane Eranian 2014-06-27 13:48 ` [PATCH v2 1/2] " Stephane Eranian @ 2014-06-27 13:48 ` Stephane Eranian 1 sibling, 0 replies; 6+ messages in thread From: Stephane Eranian @ 2014-06-27 13:48 UTC (permalink / raw) To: linux-kernel; +Cc: peterz, mingo, ak, jmario, dzickus, jolsa, acme This patch relaxes the event constraint for the Load Latency PEBS event on SNB/IVB/HSW. It was limited to counter 3 only for artificial reasons, i.e., to make scheduling it easier. This event uses an extra MSR which has to be shared among all events of a PMU. In Linux, there is a sophisticated infratstructure to handle shared regs, and we don't need to constrain the event to one counter. The advantage of this patch is that it makes it possible to combine load latency and precise store events without multiplexing. Signed-off-by: Stephane Eranian <eranian@google.com> --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index fdadec0..9dbfc9b 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -579,7 +579,7 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = { struct event_constraint intel_snb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END @@ -587,7 +587,7 @@ struct event_constraint intel_snb_pebs_event_constraints[] = { struct event_constraint intel_ivb_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */ INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */ INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END @@ -596,7 +596,7 @@ struct event_constraint intel_ivb_pebs_event_constraints[] = { struct event_constraint intel_hsw_pebs_event_constraints[] = { INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */ INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */ - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */ + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */ INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */ INTEL_PEBS_CONSTRAINT(0xf), /* catch all PEBS */ EVENT_CONSTRAINT_END -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-07-01 18:20 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-06-27 13:48 [PATCH v2 0/2] perf/x86: simplify PEBS constraints Stephane Eranian 2014-06-27 13:48 ` [PATCH v2 1/2] " Stephane Eranian 2014-06-27 15:37 ` Andi Kleen 2014-07-01 17:04 ` Stephane Eranian 2014-07-01 18:20 ` Andi Kleen 2014-06-27 13:48 ` [PATCH v2 2/2] perf/x86: load latency event supports all PEBS counters Stephane Eranian
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox