* Fix Skylake PEBS data source for perf
@ 2017-06-22 0:03 Andi Kleen
2017-06-22 0:03 ` [PATCH 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Andi Kleen @ 2017-06-22 0:03 UTC (permalink / raw)
To: peterz; +Cc: acme, linux-kernel
Fix data source reporting for Skylake and Skylake Server.
The encodings have changed to express support for L4 and persistent
memory.
The first patch is an (independent) cleanup.
The second is for the kernel and the third for perf/tools.
The kernel part and perf tools will compile independently.
v1:
Initial post
v2:
Merged some patches.
Change encoding to use special bit for each combination instead
of modifiers.
v3:
Switch to new generic lvlnum indication
v4:
Repost. No changes.
* [PATCH 1/4] perf/x86: Move Nehalem PEBS code to flag
2017-06-22 0:03 Fix Skylake PEBS data source for perf Andi Kleen
@ 2017-06-22 0:03 ` Andi Kleen
2017-06-22 0:03 ` [PATCH 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2017-06-22 0:03 UTC (permalink / raw)
To: peterz; +Cc: acme, linux-kernel, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Minor cleanup: use an explicit x86_pmu flag to handle the
missing Lock / TLB information on Nehalem, instead of always
checking the model number for each PEBS sample.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/events/intel/core.c | 1 +
arch/x86/events/intel/ds.c | 5 +----
arch/x86/events/perf_event.h | 3 ++-
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index da9047eec7ba..dec9b4bf0752 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3775,6 +3775,7 @@ __init int intel_pmu_init(void)
intel_pmu_pebs_data_source_nhm();
x86_add_quirk(intel_nehalem_quirk);
+ x86_pmu.pebs_no_tlb = 1;
pr_cont("Nehalem events, ");
break;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c6d23ffe422d..7732999f5e2a 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -149,8 +149,6 @@ static u64 load_latency_data(u64 status)
{
union intel_x86_pebs_dse dse;
u64 val;
- int model = boot_cpu_data.x86_model;
- int fam = boot_cpu_data.x86;
dse.val = status;
@@ -162,8 +160,7 @@ static u64 load_latency_data(u64 status)
/*
* Nehalem models do not support TLB, Lock infos
*/
- if (fam == 0x6 && (model == 26 || model == 30
- || model == 31 || model == 46)) {
+ if (x86_pmu.pebs_no_tlb) {
val |= P(TLB, NA) | P(LOCK, NA);
return val;
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 53728eea1bed..a6d9d6570957 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -591,7 +591,8 @@ struct x86_pmu {
pebs :1,
pebs_active :1,
pebs_broken :1,
- pebs_prec_dist :1;
+ pebs_prec_dist :1,
+ pebs_no_tlb :1;
int pebs_record_size;
int pebs_buffer_size;
void (*drain_pebs)(struct pt_regs *regs);
--
2.9.4
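
The cleanup above can be pictured outside the kernel as follows. This
is only a minimal sketch: struct toy_pmu, toy_pmu_init(),
toy_load_latency_data() and the TOY_* bit values are hypothetical
stand-ins, not kernel symbols. It decides once at init time whether
the CPU is one of the Nehalem models from the removed check (family 6,
models 26, 30, 31, 46) and then tests a single flag per sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PERF_MEM_* encodings used by P(). */
#define TOY_TLB_NA   (1ULL << 0)
#define TOY_LOCK_NA  (1ULL << 1)
#define TOY_TLB_HIT  (1ULL << 2)

/* Hypothetical miniature of struct x86_pmu: one capability bit. */
struct toy_pmu {
	unsigned int pebs_no_tlb : 1;
};

/* Init time: look at family/model once and record the result. */
static void toy_pmu_init(struct toy_pmu *pmu, int family, int model)
{
	assert(pmu != NULL);
	pmu->pebs_no_tlb = (family == 6 &&
			    (model == 26 || model == 30 ||
			     model == 31 || model == 46));
}

/* Per sample: only the flag is tested, no model comparison. */
static uint64_t toy_load_latency_data(const struct toy_pmu *pmu, uint64_t val)
{
	if (pmu->pebs_no_tlb)
		return val | TOY_TLB_NA | TOY_LOCK_NA;
	/* The real code decodes TLB/lock bits here; simplified to a hit. */
	return val | TOY_TLB_HIT;
}
```

In the patch itself the flag is set in the Nehalem case of
intel_pmu_init(), so no model comparison is needed in the sketch's
equivalent of toy_pmu_init() either.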
* [PATCH 2/4] perf/x86: Fix data source decoding for Skylake
2017-06-22 0:03 Fix Skylake PEBS data source for perf Andi Kleen
2017-06-22 0:03 ` [PATCH 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
@ 2017-06-22 0:03 ` Andi Kleen
2017-07-03 10:38 ` Thomas Gleixner
2017-06-22 0:03 ` [PATCH 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
` (2 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2017-06-22 0:03 UTC (permalink / raw)
To: peterz; +Cc: acme, linux-kernel, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Skylake changed the encoding of the PEBS data source field.
Some combinations are not available anymore, but some new cases
e.g. for L4 cache hit are added.
Fix up the conversion table for Skylake, similar to what had been done
for Nehalem.
On Skylake server the encoding for L4 actually means persistent
memory. Handle this case too.
To properly describe it in the abstracted perf format I had to add
some new fields. Since a hit can have only one level, add a new
field that is an enumeration, not a bit field, to describe
the level. It can describe any level. Some numbers are also
used to describe PMEM and LFB.
Also add a new generic remote flag that can be combined with
the generic level to signify a remote cache.
And there is an extension field for the snoop indication to handle
the Forward state.
I didn't add a generic flag for hops because it's not needed
for Skylake.
I changed the existing encodings for older CPUs to also fill in the
new level and remote fields.
v2: Merge with persistent memory patch.
Add explicit bit for each case instead of using generic modifier.
v3: Rework with new lvlnum and remote fields.
Change older CPUs to report the new fields too.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/events/intel/core.c | 2 ++
arch/x86/events/intel/ds.c | 51 ++++++++++++++++++++++++++---------------
arch/x86/events/perf_event.h | 2 ++
include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
4 files changed, 64 insertions(+), 21 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index dec9b4bf0752..08e53f36d697 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4052,6 +4052,8 @@ __init int intel_pmu_init(void)
skl_format_attr);
WARN_ON(!x86_pmu.format_attrs);
x86_pmu.cpu_events = hsw_events_attrs;
+ intel_pmu_pebs_data_source_skl(
+ boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X);
pr_cont("Skylake events, ");
break;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7732999f5e2a..2ced5d42edab 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -49,34 +49,47 @@ union intel_x86_pebs_dse {
*/
#define P(a, b) PERF_MEM_S(a, b)
#define OP_LH (P(OP, LOAD) | P(LVL, HIT))
+#define LEVEL(x) P(LVLNUM, x)
+#define REM P(REMOTE, REMOTE)
#define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
/* Version for Sandy Bridge and later */
static u64 pebs_data_source[] = {
- P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
- OP_LH | P(LVL, L1) | P(SNOOP, NONE), /* 0x01: L1 local */
- OP_LH | P(LVL, LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
- OP_LH | P(LVL, L2) | P(SNOOP, NONE), /* 0x03: L2 hit */
- OP_LH | P(LVL, L3) | P(SNOOP, NONE), /* 0x04: L3 hit */
- OP_LH | P(LVL, L3) | P(SNOOP, MISS), /* 0x05: L3 hit, snoop miss */
- OP_LH | P(LVL, L3) | P(SNOOP, HIT), /* 0x06: L3 hit, snoop hit */
- OP_LH | P(LVL, L3) | P(SNOOP, HITM), /* 0x07: L3 hit, snoop hitm */
- OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HIT), /* 0x08: L3 miss snoop hit */
- OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
- OP_LH | P(LVL, LOC_RAM) | P(SNOOP, HIT), /* 0x0a: L3 miss, shared */
- OP_LH | P(LVL, REM_RAM1) | P(SNOOP, HIT), /* 0x0b: L3 miss, shared */
- OP_LH | P(LVL, LOC_RAM) | SNOOP_NONE_MISS,/* 0x0c: L3 miss, excl */
- OP_LH | P(LVL, REM_RAM1) | SNOOP_NONE_MISS,/* 0x0d: L3 miss, excl */
- OP_LH | P(LVL, IO) | P(SNOOP, NONE), /* 0x0e: I/O */
- OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
+ P(OP, LOAD) | P(LVL, MISS) | LEVEL(L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
+ OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x01: L1 local */
+ OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
+ OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x03: L2 hit */
+ OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, NONE), /* 0x04: L3 hit */
+ OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, MISS), /* 0x05: L3 hit, snoop miss */
+ OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HIT), /* 0x06: L3 hit, snoop hit */
+ OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM), /* 0x07: L3 hit, snoop hitm */
+ OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HIT), /* 0x08: L3 miss snoop hit */
+ OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
+ OP_LH | P(LVL, LOC_RAM) | LEVEL(RAM) | P(SNOOP, HIT), /* 0x0a: L3 miss, shared */
+ OP_LH | P(LVL, REM_RAM1) | REM | LEVEL(L3) | P(SNOOP, HIT), /* 0x0b: L3 miss, shared */
+ OP_LH | P(LVL, LOC_RAM) | LEVEL(RAM) | SNOOP_NONE_MISS, /* 0x0c: L3 miss, excl */
+ OP_LH | P(LVL, REM_RAM1) | LEVEL(RAM) | REM | SNOOP_NONE_MISS, /* 0x0d: L3 miss, excl */
+ OP_LH | P(LVL, IO) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0e: I/O */
+ OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */
};
/* Patch up minor differences in the bits */
void __init intel_pmu_pebs_data_source_nhm(void)
{
- pebs_data_source[0x05] = OP_LH | P(LVL, L3) | P(SNOOP, HIT);
- pebs_data_source[0x06] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
- pebs_data_source[0x07] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
+ pebs_data_source[0x05] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HIT);
+ pebs_data_source[0x06] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+ pebs_data_source[0x07] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+}
+
+void __init intel_pmu_pebs_data_source_skl(bool pmem)
+{
+ u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4);
+
+ pebs_data_source[0x08] = OP_LH | pmem_or_l4 | P(SNOOP, HIT);
+ pebs_data_source[0x09] = OP_LH | pmem_or_l4 | REM | P(SNOOP, HIT);
+ pebs_data_source[0x0b] = OP_LH | LEVEL(RAM) | REM | P(SNOOP, NONE);
+ pebs_data_source[0x0c] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOPX, FWD);
+ pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM);
}
static u64 precise_store_data(u64 status)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index a6d9d6570957..d7571f248652 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -946,6 +946,8 @@ void intel_pmu_lbr_init_knl(void);
void intel_pmu_pebs_data_source_nhm(void);
+void intel_pmu_pebs_data_source_skl(bool pmem);
+
int intel_pmu_setup_lbr_filter(struct perf_event *event);
void intel_pt_interrupt(void);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b1c0b187acfe..7cfeb54e0b5a 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -931,14 +931,20 @@ union perf_mem_data_src {
mem_snoop:5, /* snoop mode */
mem_lock:2, /* lock instr */
mem_dtlb:7, /* tlb access */
- mem_rsvd:31;
+ mem_lvl_num:4, /* memory hierarchy level number */
+ mem_remote:1, /* remote */
+ mem_snoopx:2, /* snoop mode, ext */
+ mem_rsvd:24;
};
};
#elif defined(__BIG_ENDIAN_BITFIELD)
union perf_mem_data_src {
__u64 val;
struct {
- __u64 mem_rsvd:31,
+ __u64 mem_rsvd:24,
+ mem_snoopx:2, /* snoop mode, ext */
+ mem_remote:1, /* remote */
+ mem_lvl_num:4, /* memory hierarchy level number */
mem_dtlb:7, /* tlb access */
mem_lock:2, /* lock instr */
mem_snoop:5, /* snoop mode */
@@ -975,6 +981,22 @@ union perf_mem_data_src {
#define PERF_MEM_LVL_UNC 0x2000 /* Uncached memory */
#define PERF_MEM_LVL_SHIFT 5
+#define PERF_MEM_REMOTE_REMOTE 0x01 /* Remote */
+#define PERF_MEM_REMOTE_SHIFT 37
+
+#define PERF_MEM_LVLNUM_L1 0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM 0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM 0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA 0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT 33
+
/* snoop mode */
#define PERF_MEM_SNOOP_NA 0x01 /* not available */
#define PERF_MEM_SNOOP_NONE 0x02 /* no snoop */
@@ -983,6 +1005,10 @@ union perf_mem_data_src {
#define PERF_MEM_SNOOP_HITM 0x10 /* snoop hit modified */
#define PERF_MEM_SNOOP_SHIFT 19
+#define PERF_MEM_SNOOPX_FWD 0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT 38
+
/* locked instruction */
#define PERF_MEM_LOCK_NA 0x01 /* not available */
#define PERF_MEM_LOCK_LOCKED 0x02 /* locked transaction */
--
2.9.4
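
As a rough illustration of where the new fields sit in the 64-bit
data source value, the sketch below derives the shifts from the field
widths in the little-endian layout (5 + 14 + 5 + 2 + 7 = 33 bits
precede mem_lvl_num). The toy_* helpers and W_*/SHIFT_* names are
hypothetical and exist only for this example; the level numbers are
the ones from the patch. By this arithmetic mem_lvl_num starts at bit
33, mem_remote at bit 37 and mem_snoopx at bit 38.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit widths of the fields, in little-endian declaration order. */
enum {
	W_OP = 5, W_LVL = 14, W_SNOOP = 5, W_LOCK = 2, W_DTLB = 7,
	W_LVLNUM = 4, W_REMOTE = 1, W_SNOOPX = 2,
};

/* Each field starts where the previous one ends. */
enum {
	SHIFT_LVLNUM = W_OP + W_LVL + W_SNOOP + W_LOCK + W_DTLB,
	SHIFT_REMOTE = SHIFT_LVLNUM + W_LVLNUM,
	SHIFT_SNOOPX = SHIFT_REMOTE + W_REMOTE,
};

/* Level numbers copied from the patch. */
#define TOY_LVLNUM_L4   0x04
#define TOY_LVLNUM_PMEM 0x0e

/* Hypothetical helper: encode a level number plus the remote flag. */
static uint64_t toy_encode(unsigned int lvlnum, bool remote)
{
	assert(lvlnum < (1u << W_LVLNUM));
	return ((uint64_t)lvlnum << SHIFT_LVLNUM) |
	       ((uint64_t)remote << SHIFT_REMOTE);
}

/* Extract the 4-bit level number again. */
static unsigned int toy_lvlnum(uint64_t src)
{
	return (unsigned int)((src >> SHIFT_LVLNUM) & ((1u << W_LVLNUM) - 1));
}

/* Extract the remote flag again. */
static bool toy_remote(uint64_t src)
{
	return (src >> SHIFT_REMOTE) & 1;
}

/* Mirrors the pmem_or_l4 choice in intel_pmu_pebs_data_source_skl(). */
static uint64_t toy_skl_l4_or_pmem(bool pmem, bool remote)
{
	return toy_encode(pmem ? TOY_LVLNUM_PMEM : TOY_LVLNUM_L4, remote);
}
```

Because the level is an enumeration rather than a bit mask, a sample
carries exactly one level number, and "remote" is expressed by the
separate flag instead of a dedicated level value.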
* [PATCH 3/4] perf, tools: Add support for printing new mem_info encodings
2017-06-22 0:03 Fix Skylake PEBS data source for perf Andi Kleen
2017-06-22 0:03 ` [PATCH 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
2017-06-22 0:03 ` [PATCH 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
@ 2017-06-22 0:03 ` Andi Kleen
2017-06-22 0:03 ` [PATCH 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
2017-06-22 12:38 ` Fix Skylake PEBS data source for perf Thomas Gleixner
4 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2017-06-22 0:03 UTC (permalink / raw)
To: peterz; +Cc: acme, linux-kernel, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add decoding for the new lvlnum, remote and snoopx mem_info fields
added earlier to the kernel, so that "perf mem report" and
other tools can print them properly.
v2: Merge with persistent memory patch.
Switch to new bit encoding for each combination.
v3: Switch to generic lvlnum field.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
tools/perf/util/mem-events.c | 43 ++++++++++++++++++++++++++++++++---
2 files changed, 68 insertions(+), 5 deletions(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index b1c0b187acfe..7cfeb54e0b5a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -931,14 +931,20 @@ union perf_mem_data_src {
mem_snoop:5, /* snoop mode */
mem_lock:2, /* lock instr */
mem_dtlb:7, /* tlb access */
- mem_rsvd:31;
+ mem_lvl_num:4, /* memory hierarchy level number */
+ mem_remote:1, /* remote */
+ mem_snoopx:2, /* snoop mode, ext */
+ mem_rsvd:24;
};
};
#elif defined(__BIG_ENDIAN_BITFIELD)
union perf_mem_data_src {
__u64 val;
struct {
- __u64 mem_rsvd:31,
+ __u64 mem_rsvd:24,
+ mem_snoopx:2, /* snoop mode, ext */
+ mem_remote:1, /* remote */
+ mem_lvl_num:4, /* memory hierarchy level number */
mem_dtlb:7, /* tlb access */
mem_lock:2, /* lock instr */
mem_snoop:5, /* snoop mode */
@@ -975,6 +981,22 @@ union perf_mem_data_src {
#define PERF_MEM_LVL_UNC 0x2000 /* Uncached memory */
#define PERF_MEM_LVL_SHIFT 5
+#define PERF_MEM_REMOTE_REMOTE 0x01 /* Remote */
+#define PERF_MEM_REMOTE_SHIFT 37
+
+#define PERF_MEM_LVLNUM_L1 0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM 0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM 0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA 0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT 33
+
/* snoop mode */
#define PERF_MEM_SNOOP_NA 0x01 /* not available */
#define PERF_MEM_SNOOP_NONE 0x02 /* no snoop */
@@ -983,6 +1005,10 @@ union perf_mem_data_src {
#define PERF_MEM_SNOOP_HITM 0x10 /* snoop hit modified */
#define PERF_MEM_SNOOP_SHIFT 19
+#define PERF_MEM_SNOOPX_FWD 0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT 38
+
/* locked instruction */
#define PERF_MEM_LOCK_NA 0x01 /* not available */
#define PERF_MEM_LOCK_LOCKED 0x02 /* locked transaction */
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 06f5a3a4295c..ced4f3fff035 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -166,11 +166,20 @@ static const char * const mem_lvl[] = {
"Uncached",
};
+static const char * const mem_lvlnum[] = {
+ [PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
+ [PERF_MEM_LVLNUM_LFB] = "LFB",
+ [PERF_MEM_LVLNUM_RAM] = "RAM",
+ [PERF_MEM_LVLNUM_PMEM] = "PMEM",
+ [PERF_MEM_LVLNUM_NA] = "N/A",
+};
+
int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
{
size_t i, l = 0;
u64 m = PERF_MEM_LVL_NA;
u64 hit, miss;
+ int printed;
if (mem_info)
m = mem_info->data_src.mem_lvl;
@@ -184,17 +193,37 @@ int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
/* already taken care of */
m &= ~(PERF_MEM_LVL_HIT|PERF_MEM_LVL_MISS);
+
+ if (mem_info && mem_info->data_src.mem_remote) {
+ strcat(out, "Remote ");
+ l += 7;
+ }
+
+ printed = 0;
for (i = 0; m && i < ARRAY_SIZE(mem_lvl); i++, m >>= 1) {
if (!(m & 0x1))
continue;
- if (l) {
+ if (printed++) {
strcat(out, " or ");
l += 4;
}
l += scnprintf(out + l, sz - l, mem_lvl[i]);
}
- if (*out == '\0')
- l += scnprintf(out, sz - l, "N/A");
+
+ if (mem_info && mem_info->data_src.mem_lvl_num) {
+ int lvl = mem_info->data_src.mem_lvl_num;
+ if (printed++) {
+ strcat(out, " or ");
+ l += 4;
+ }
+ if (mem_lvlnum[lvl])
+ l += scnprintf(out + l, sz - l, mem_lvlnum[lvl]);
+ else
+ l += scnprintf(out + l, sz - l, "L%d", lvl);
+ }
+
+ if (l == 0)
+ l += scnprintf(out + l, sz - l, "N/A");
if (hit)
l += scnprintf(out + l, sz - l, " hit");
if (miss)
@@ -231,6 +260,14 @@ int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
}
l += scnprintf(out + l, sz - l, snoop_access[i]);
}
+ if (mem_info &&
+ (mem_info->data_src.mem_snoopx & PERF_MEM_SNOOPX_FWD)) {
+ if (l) {
+ strcat(out, " or ");
+ l += 4;
+ }
+ l += scnprintf(out + l, sz - l, "Fwd");
+ }
if (*out == '\0')
l += scnprintf(out, sz - l, "N/A");
--
2.9.4
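
The level-number printing above boils down to a table lookup with a
numeric fallback. The following user-space sketch shows that idea
under a few assumptions: toy_lvl_name() and toy_scnprintf() are
hypothetical names, toy_scnprintf() is a stand-in that returns the
number of characters actually stored, and the name is passed through
"%s" rather than used as the format string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Named level numbers, values as in the patch; other slots stay NULL. */
static const char *const toy_lvlnum_names[16] = {
	[0x0b] = "Any cache",
	[0x0c] = "LFB",
	[0x0d] = "RAM",
	[0x0e] = "PMEM",
	[0x0f] = "N/A",
};

/* Stand-in for scnprintf(): returns the characters actually stored. */
static int toy_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Print "Remote " if set, then the level name or "L<n>" as fallback. */
static int toy_lvl_name(char *out, size_t sz, unsigned int lvl, int remote)
{
	int l = 0;

	assert(lvl < 16);	/* mem_lvl_num is a 4-bit field */
	if (remote)
		l += toy_scnprintf(out + l, sz - l, "Remote ");
	if (toy_lvlnum_names[lvl])
		l += toy_scnprintf(out + l, sz - l, "%s", toy_lvlnum_names[lvl]);
	else
		l += toy_scnprintf(out + l, sz - l, "L%u", lvl);
	return l;
}
```

Levels 1 to 4 have no table entry, so they come out as "L1" to "L4"
through the fallback branch, which is why the table only needs the
non-numeric names.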
* [PATCH 4/4] perf, tools: Add test cases for new data source encoding
2017-06-22 0:03 Fix Skylake PEBS data source for perf Andi Kleen
` (2 preceding siblings ...)
2017-06-22 0:03 ` [PATCH 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
@ 2017-06-22 0:03 ` Andi Kleen
2017-06-22 12:38 ` Fix Skylake PEBS data source for perf Thomas Gleixner
4 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2017-06-22 0:03 UTC (permalink / raw)
To: peterz; +Cc: acme, linux-kernel, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add some simple tests to perf test to test data source printing.
v2: Make the tests actually check for the correct name of Forward
v3: Adjust to new encoding
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 ++++
tools/perf/tests/mem.c | 53 +++++++++++++++++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
4 files changed, 59 insertions(+)
create mode 100644 tools/perf/tests/mem.c
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 84222bdb8689..87bf3edb037c 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -34,6 +34,7 @@ perf-y += thread-map.o
perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
perf-y += bpf.o
perf-y += topology.o
+perf-y += mem.o
perf-y += cpumap.o
perf-y += stat.o
perf-y += event_update.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 3ccfd58a8c3c..2ebd18c487da 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -43,6 +43,10 @@ static struct test generic_tests[] = {
.func = test__basic_mmap,
},
{
+ .desc = "Test data source output",
+ .func = test__mem,
+ },
+ {
.desc = "Parse event definition strings",
.func = test__parse_events,
},
diff --git a/tools/perf/tests/mem.c b/tools/perf/tests/mem.c
new file mode 100644
index 000000000000..891b8284e0b3
--- /dev/null
+++ b/tools/perf/tests/mem.c
@@ -0,0 +1,53 @@
+#include "util/mem-events.h"
+#include "util/symbol.h"
+#include "linux/perf_event.h"
+#include "util/debug.h"
+#include "tests.h"
+#include <string.h>
+
+static int check(union perf_mem_data_src data_src,
+ const char *string)
+{
+ char out[100];
+ char failure[100];
+ struct mem_info mi = { .data_src = data_src };
+
+ int n;
+
+ n = perf_mem__snp_scnprintf(out, sizeof out, &mi);
+ n += perf_mem__lvl_scnprintf(out + n, sizeof out - n, &mi);
+ snprintf(failure, sizeof failure, "unexpected %s", out);
+ TEST_ASSERT_VAL(failure, !strcmp(string, out));
+ return 0;
+}
+
+int test__mem(int subtest __maybe_unused)
+{
+ int ret = 0;
+
+ ret |= check(((union perf_mem_data_src) {
+ .mem_lvl = PERF_MEM_LVL_HIT,
+ .mem_lvl_num = 4 }), "N/AL4 hit");
+
+ ret |= check(((union perf_mem_data_src) {
+ .mem_lvl = PERF_MEM_LVL_HIT,
+ .mem_lvl_num = 4,
+ .mem_remote = 1 }), "N/ARemote L4 hit");
+
+ ret |= check(((union perf_mem_data_src) {
+ .mem_lvl = PERF_MEM_LVL_MISS,
+ .mem_lvl_num = PERF_MEM_LVLNUM_PMEM }), "N/APMEM miss");
+
+ ret |= check(((union perf_mem_data_src) {
+ .mem_lvl = PERF_MEM_LVL_MISS,
+ .mem_lvl_num = PERF_MEM_LVLNUM_PMEM,
+ .mem_remote =1 }), "N/ARemote PMEM miss");
+
+ ret |= check(((union perf_mem_data_src) {
+ .mem_snoopx = PERF_MEM_SNOOPX_FWD,
+ .mem_lvl = PERF_MEM_LVL_MISS,
+ .mem_lvl_num = PERF_MEM_LVLNUM_RAM,
+ .mem_remote = 1 }), "FwdRemote RAM miss");
+
+ return ret;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 577363809c9b..c106a146a188 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -57,6 +57,7 @@ int test__python_use(int subtest);
int test__bp_signal(int subtest);
int test__bp_signal_overflow(int subtest);
int test__task_exit(int subtest);
+int test__mem(int subtest);
int test__sw_clock_freq(int subtest);
int test__code_reading(int subtest);
int test__sample_parsing(int subtest);
--
2.9.4
* Re: Fix Skylake PEBS data source for perf
2017-06-22 0:03 Fix Skylake PEBS data source for perf Andi Kleen
` (3 preceding siblings ...)
2017-06-22 0:03 ` [PATCH 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
@ 2017-06-22 12:38 ` Thomas Gleixner
4 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2017-06-22 12:38 UTC (permalink / raw)
To: Andi Kleen; +Cc: peterz, acme, linux-kernel
Andi,
On Wed, 21 Jun 2017, Andi Kleen wrote:
I asked for proper cover letters with a proper PATCH prefix for the first
submission and a PATCH vN prefix for subsequent submissions politely more
than once.
I'm tired of your obnoxious refusal to cooperate with other people.
From now on patches which ignore the requested and well documented patch
format requirements are going to be NAKed automatically.
Thanks,
tglx
* Re: [PATCH 2/4] perf/x86: Fix data source decoding for Skylake
2017-06-22 0:03 ` [PATCH 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
@ 2017-07-03 10:38 ` Thomas Gleixner
0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2017-07-03 10:38 UTC (permalink / raw)
To: Andi Kleen; +Cc: peterz, acme, linux-kernel, Andi Kleen
On Wed, 21 Jun 2017, Andi Kleen wrote:
> I didn't add a generic flag for hops because it's not needed
> for Skylake.
When is that going to come?
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b1c0b187acfe..7cfeb54e0b5a 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -931,14 +931,20 @@ union perf_mem_data_src {
> mem_snoop:5, /* snoop mode */
> mem_lock:2, /* lock instr */
> mem_dtlb:7, /* tlb access */
> - mem_rsvd:31;
> + mem_lvl_num:4, /* memory hierarchy level number */
> + mem_remote:1, /* remote */
> + mem_snoopx:2, /* snoop mode, ext */
Is that extending snoopx another time?
The question really is, whether the 2 bits are going to be sufficient for a
while or are we going to add snoopx2 in a few months from now.
Thanks,
tglx