* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs
From: Tomasz Fujak @ 2010-01-20 9:11 UTC
To: linux-arm-kernel
Hi,
While I managed to build and run the early version (back from December), I was unable to find the newest sources (infra + ARMv6, ARMv7 support).
Where do I find them?
The following patches add a sysfs entry that describes each hardware event in human-readable form, "0x%llx\t%lld-%lld\t%s\t%s" % (event_value, minval, maxval, name, description), and the means to populate the file.
This version covers ARMv6 and ARMv7 (Cortex-A[89]).
The intended use is twofold: for users to read the list directly and for tools (like perf).
This series includes:
[PATCH v1 1/2] perfevent: Add performance event structure definition and 'extevents' sysfs entry
[PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6, Cortex-A8 and Cortex-A9 exported
Thanks,
--
Tomasz Fujak
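
For tools that want to consume the file, one line of output could be parsed roughly as sketched below. This is only a sketch: struct extevent and extevent_parse() are hypothetical names, and the field order follows the sprintf() call in patch 1/2 (value, name, min-max, description), which differs from the order given in the format string above.

```c
#include <stdio.h>

/* Hypothetical userspace mirror of one line of the 'extevents' file. */
struct extevent {
	unsigned long long raw;		/* raw event number, RAW bit stripped */
	long long min_period;
	long long max_period;
	char name[64];
	char desc[256];
};

/*
 * Parse "0x<hex>\t<name>\t<min>-<max>\t<description>".
 * Returns 0 on success, -1 if the line does not match.
 */
static int extevent_parse(const char *line, struct extevent *ev)
{
	int n = sscanf(line, "0x%llx\t%63[^\t]\t%lld-%lld\t%255[^\n]",
		       &ev->raw, ev->name, &ev->min_period,
		       &ev->max_period, ev->desc);

	return n == 5 ? 0 : -1;
}
```

A caller would read the sysfs file line by line (for instance with fgets()) and hand each line to extevent_parse(); lines that fail to parse can simply be skipped.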

* [PATCH v1 1/2] perfevent: Add performance event structure definition and 'extevents' sysfs entry
From: Tomasz Fujak @ 2010-01-20 9:11 UTC
To: linux-arm-kernel

This patch adds a structure that contains a single hardware performance
event definition (including name and description fields), and a sysfs
entry suited to export a machine-dependent list of events.

Signed-off-by: Tomasz Fujak <t.fujak@samsung.com>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 include/linux/perf_event.h |   19 +++++++++++++++++++
 kernel/perf_event.c        |   32 ++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9e70126..4dc4d73 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -447,6 +447,12 @@ enum perf_callchain_context {
 
 #define PERF_MAX_STACK_DEPTH 255
 
+#define PERF_EVENT_RAW_BIT (1ULL << 63)
+#define PERF_EVENT_RAW_TO_CONFIG(_val) ((_val) | PERF_EVENT_RAW_BIT)
+#define PERF_EVENT_CONFIG_TO_RAW(_val) ((_val) & ~PERF_EVENT_RAW_BIT)
+#define PERF_EVENT_IS_RAW(_val) ((_val) & PERF_EVENT_RAW_BIT)
+
+
 struct perf_callchain_entry {
 	__u64 nr;
 	__u64 ip[PERF_MAX_STACK_DEPTH];
@@ -538,6 +544,19 @@ struct perf_mmap_data {
 	void *data_pages[0];
 };
 
+struct perf_event_description {
+	struct list_head list;
+
+	/* type : 1, subsystem [0..7], id [56..63]*/
+	__u64 config;
+	__u64 min_value; /* min. wakeup period */
+	__u64 max_value; /* max. wakeup period */
+	__u32 flags; /* ??? */
+	__u32 reserved[3];
+	char *name;
+	char *description;
+};
+
 struct perf_pending_entry {
 	struct perf_pending_entry *next;
 	void (*func)(struct perf_pending_entry *);
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 7f29643..4223870 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -97,6 +97,13 @@ void __weak hw_perf_enable(void) { barrier(); }
 void __weak hw_perf_event_setup(int cpu) { barrier(); }
 void __weak hw_perf_event_setup_online(int cpu) { barrier(); }
 
+static LIST_HEAD(perf_event_empty);
+
+const struct list_head __weak *hw_perf_event_get_list(void)
+{
+	return &perf_event_empty;
+}
+
 int __weak
 hw_perf_group_sched_in(struct perf_event *group_leader,
 	       struct perf_cpu_context *cpuctx,
@@ -5097,6 +5104,23 @@ perf_set_overcommit(struct sysdev_class *class, const char *buf, size_t count)
 	return count;
 }
 
+static ssize_t perf_show_extevents(struct sysdev_class *class, char *buf)
+{
+	char *str = buf;
+	const struct list_head *head = hw_perf_event_get_list();
+	const struct perf_event_description *entry;
+
+	list_for_each_entry(entry, head, list)
+		if (PERF_EVENT_IS_RAW(entry->config))
+			str += sprintf(str, "0x%llx\t%s\t%lld-%lld\t%s\n",
+				PERF_EVENT_CONFIG_TO_RAW(entry->config),
+				entry->name, entry->min_value,
+				entry->max_value, entry->description);
+
+	return str - buf;
+}
+
+
 static SYSDEV_CLASS_ATTR(
 	reserve_percpu,
 	0644,
@@ -5111,9 +5135,17 @@ static SYSDEV_CLASS_ATTR(
 	perf_set_overcommit
 	);
 
+static SYSDEV_CLASS_ATTR(
+	extevents,
+	0444,
+	perf_show_extevents,
+	NULL
+	);
+
 static struct attribute *perfclass_attrs[] = {
 	&attr_reserve_percpu.attr,
 	&attr_overcommit.attr,
+	&attr_extevents.attr,
 	NULL
 };
 
-- 
1.5.4.3
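
A sysfs show callback is handed a single PAGE_SIZE buffer, so an unbounded sprintf() loop like the one in perf_show_extevents() may run past it once the event table grows. The userspace sketch below is not part of the posted patch; struct event_desc, show_extevents() and the short macro names are stand-ins chosen for illustration. It applies the same RAW-bit filtering but formats with a hard size limit and stops on a line boundary. It also prints the range as unsigned, whereas the patch uses %lld.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit manipulation as the PERF_EVENT_RAW_* macros in the patch. */
#define RAW_BIT			(1ULL << 63)
#define RAW_TO_CONFIG(v)	((v) | RAW_BIT)
#define CONFIG_TO_RAW(v)	((v) & ~RAW_BIT)
#define IS_RAW(v)		((v) & RAW_BIT)

/* Hypothetical userspace stand-in for struct perf_event_description. */
struct event_desc {
	uint64_t config;
	uint64_t min_value;
	uint64_t max_value;
	const char *name;
	const char *description;
};

/*
 * Format every raw event into buf without writing more than size bytes.
 * Stops at the first entry that does not fit completely, so the output
 * always ends on a line boundary. Returns the number of bytes written,
 * not counting the terminating NUL.
 */
static size_t show_extevents(char *buf, size_t size,
			     const struct event_desc *tbl, size_t count)
{
	size_t len = 0;
	size_t i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < count; i++) {
		int n;

		if (!IS_RAW(tbl[i].config))
			continue;

		n = snprintf(buf + len, size - len,
			     "0x%llx\t%s\t%llu-%llu\t%s\n",
			     (unsigned long long)CONFIG_TO_RAW(tbl[i].config),
			     tbl[i].name,
			     (unsigned long long)tbl[i].min_value,
			     (unsigned long long)tbl[i].max_value,
			     tbl[i].description);
		if (n < 0 || (size_t)n >= size - len) {
			buf[len] = '\0';	/* drop the partial line */
			break;
		}
		len += (size_t)n;
	}

	return len;
}
```

In the kernel the equivalent change would presumably be to pass PAGE_SIZE as the limit; whether truncation or several smaller files is the better answer is a design question for the series.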

* [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6, Cortex-A8 and Cortex-A9 exported
From: Tomasz Fujak @ 2010-01-20 9:11 UTC
To: linux-arm-kernel

Signed-off-by: Tomasz Fujak <t.fujak@samsung.com>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/kernel/perf_event.c |  341 +++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 337 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 8d24be3..64573a2 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -26,6 +26,17 @@
 
 static const struct pmu_irqs *pmu_irqs;
 
+#define PERF_EVENT_DESC_ENTRY(_val, _min, _max, _name, _desc) { \
+	.config = PERF_EVENT_RAW_TO_CONFIG(_val),\
+	.min_value = (_min),\
+	.max_value = (_max),\
+	.name = (_name),\
+	.description = (_desc)\
+}
+
+#define minv 0
+#define maxv 0
+
 /*
  * Hardware lock to serialize accesses to PMU registers. Needed for the
  * read/modify/write sequences.
@@ -84,6 +95,7 @@ struct arm_pmu {
 
 /* Set at runtime when we know what CPU type we are. */
 static struct arm_pmu *armpmu;
+static LIST_HEAD(perf_events_arm);
 
 #define HW_OP_UNSUPPORTED 0xFFFF
 
@@ -96,6 +108,17 @@ static unsigned armpmu_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 				     [PERF_COUNT_HW_CACHE_OP_MAX]
 				     [PERF_COUNT_HW_CACHE_RESULT_MAX];
 
+static void
+perf_event_add_events(struct list_head *head,
+		      struct perf_event_description *array,
+		      unsigned int count)
+{
+	unsigned int idx = 0;
+
+	while (idx < count)
+		__list_add(&array[idx++].list, head->prev, head);
+}
+
 static const int
 armpmu_map_cache_event(u64 config)
 {
@@ -673,6 +696,56 @@ static const unsigned armv6_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 	},
 };
 
+static struct perf_event_description armv6_event_description[] = {
+	/* armv6 events */
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_ICACHE_MISS, minv, maxv,
+		"ICACHE_MISS", "Instruction cache miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_IBUF_STALL, minv, maxv,
+		"IBUF_STALL", "Instruction fetch stall cycle"
+		" (either uTLB or I-cache miss)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DDEP_STALL, minv, maxv,
+		"DDEP_STALL", "Data dependency stall cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_ITLB_MISS, minv, maxv,
+		"ITLB_MISS", "Instruction uTLB miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DTLB_MISS, minv, maxv,
+		"DTLB_MISS", "Data uTLB miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_BR_EXEC, minv, maxv,
+		"BR_EXEC", "Branch instruction executed "
+		"(even if the PC hasn't been affected)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_BR_MISPREDICT, minv, maxv,
+		"BR_MISPREDICT", "Branch mispredicted"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_INSTR_EXEC, minv, maxv,
+		"INSTR_EXEC", "Instruction executed (may be incremented"
+		" by 2 on some occasion)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_HIT, minv, maxv,
+		"DCACHE_HIT", "Data cache hit for cacheable locations "
+		"(cache ops don't count)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_ACCESS, minv, maxv,
+		"DCACHE_ACCESS", "Data cache access, all locations (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_MISS, minv, maxv,
+		"DCACHE_MISS", "Data cache miss (cache ops don't count)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_WBACK, minv, maxv,
+		"DCACHE_WBACK", "Data cache writeback (once for "
+		"half a cache line)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_SW_PC_CHANGE, minv, maxv,
+		"SW_PC_CHANGE", "Software PC change (does not count if the "
+		"mode is changed, i.e. at SVC)"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_MAIN_TLB_MISS, minv, maxv,
+		"MAIN_TLB_MISS", "Main TLB (not uTLB) miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_EXPL_D_ACCESS, minv, maxv,
+		"EXPL_D_ACCESS", "Explicit external data access, DCache "
+		"linefill, Uncached, write-through"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_LSU_FULL_STALL, minv, maxv,
+		"LSU_FULL_STALL", "Stall cycle due to full Load/Store"
+		" Unit queue"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_WBUF_DRAINED, minv, maxv,
+		"WBUF_DRAINED", "Write buffer drained because of DSB or "
+		"Strongly Ordered memory operation"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_CPU_CYCLES, minv, maxv,
+		"CPU_CYCLES", "CPU cycles"),
+	PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_NOP, minv, maxv, "NOP", "???")
+};
+
 static inline unsigned long
 armv6_pmcr_read(void)
 {
@@ -1223,6 +1296,248 @@ static const unsigned armv7_a8_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 	},
 };
 
+static struct perf_event_description armv7_event_description[] = {
+	/* armv7 generic events */
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMNC_SW_INCR, minv, maxv,
+		"PMNC_SW_INCR", "Software increment (write to a "
+		"dedicated register)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_IFETCH_MISS, minv, maxv,
+		"IFETCH_MISS", "Instruction fetch miss that causes "
+		"refill. Speculative misses count unless they don't "
+		"make to the execution, maintenance operations don't"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ITLB_MISS, minv, maxv,
+		"ITLB_MISS", "Instruction TLB miss that causes a refill."
+		" Both speculative and explicit accesses count"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_REFILL, minv, maxv,
+		"DCACHE_REFILL", "Data cache refill. Same rules as ITLB_MISS"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_ACCESS, minv, maxv,
+		"DCACHE_ACCESS", "Data cache access. Same rules as ITLB_MISS"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DTLB_REFILL, minv, maxv,
+		"DTLB_REFILL", "Data TLB refill. Same rules as ITLB_MISS"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DREAD, minv, maxv, "DREAD",
+		"Data read executed (including SWP)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DWRITE, minv, maxv, "DWRITE",
+		"Data write executed (including SWP)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXC_TAKEN, minv, maxv,
+		"EXC_TAKEN", "Exception taken"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXC_EXECUTED, minv, maxv,
+		"EXC_EXECUTED", "Exception return executed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CID_WRITE, minv, maxv,
+		"CID_WRITE", "Context ID register written"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_WRITE, minv, maxv, "PC_WRITE",
+		"Software change of the PC (R15)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_IMM_BRANCH, minv, maxv,
+		"PC_IMM_BRANCH", "Immediate branch (B[L], BLX, CB[N]Z, HB[L],"
+		" HBLP), including conditional that fail"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_UNALIGNED_ACCESS, minv, maxv,
+		"UNALIGNED_ACCESS", "Data access unaligned to the transfer"
+		" size"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_MIS_PRED, minv, maxv,
+		"BRANCH_MISS_PRED", "Branch misprediction or not predicted"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CLOCK_CYCLES, minv, maxv,
+		"CLOCK_CYCLES", "Cycle count"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_MIS_USED, minv, maxv,
+		"BRANCH_MIS_USED", "Branch or other program flow change that "
+		"could have been predicted"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CPU_CYCLES, minv, maxv,
+		"CPU_CYCLES", "measures cpu cycles, the only allowed event"
+		" for the first counter")
+};
+
+static struct perf_event_description cortexa8_event_description[] = {
+	/* Cortex A8 specific events */
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INSTR_EXECUTED, minv, maxv,
+		"INSTR_EXECUTED", "Instruction executed (including conditional"
+		" that don't pass)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_PROC_RETURN, minv, maxv,
+		"PC_PROC_RETURN", "Procedure return (BX LR; MOV PC, LR; POP "
+		"{.., PC} and such)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_WRITE_BUFFER_FULL, minv, maxv,
+		"WRITE_BUFFER_FULL", "Write buffer full cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_STORE_MERGED, minv, maxv,
+		"L2_STORE_MERGED", "Store that is merged in the L2 memory"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_STORE_BUFF, minv, maxv,
+		"L2_STORE_BUFF", "A bufferable store from load/store to L2"
+		" cache, evictions and cast out data don't count (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_ACCESS, minv, maxv, "L2_ACCESS",
+		"L2 cache access"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_CACH_MISS, minv, maxv,
+		"L2_CACH_MISS", "L2 cache miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_AXI_READ_CYCLES, minv, maxv,
+		"AXI_READ_CYCLES", "AXI read data transfers"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_AXI_WRITE_CYCLES, minv, maxv,
+		"AXI_WRITE_CYCLES", "AXI write data transfers"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MEMORY_REPLAY, minv, maxv,
+		"MEMORY_REPLAY", "Replay event in the memory subsystem (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_UNALIGNED_ACCESS_REPLAY, minv, maxv,
+		"UNALIGNED_ACCESS_REPLAY", "An unaligned memory access that"
+		" results in a replay (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_DATA_MISS, minv, maxv,
+		"L1_DATA_MISS", "L1 data cache miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_INST_MISS, minv, maxv,
+		"L1_INST_MISS", "L1 instruction cache miss"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_DATA_COLORING, minv, maxv,
+		"L1_DATA_COLORING", "L1 access that triggers eviction or cast"
+		" out (page coloring alias)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_NEON_DATA, minv, maxv,
+		"L1_NEON_DATA", "A NEON access that hits the L1 DCache"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_NEON_CACH_DATA, minv, maxv,
+		"L1_NEON_CACH_DATA", "A cacheable NEON access that hits the"
+		" L1 DCache"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_NEON, minv, maxv, "L2_NEON",
+		"A NEON access memory access that results in L2 being"
+		" accessed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_NEON_HIT, minv, maxv,
+		"L2_NEON_HIT", "A NEON hit in the L2"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_INST, minv, maxv, "L1_INST",
+		"A L1 instruction access (CP15 cache ops don't count)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_RETURN_MIS_PRED, minv, maxv,
+		"PC_RETURN_MIS_PRED", "A return stack misprediction because"
+		" of incorrect stack address"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_FAILED, minv, maxv,
+		"PC_BRANCH_FAILED", "Branch misprediction (both ways)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_TAKEN, minv, maxv,
+		"PC_BRANCH_TAKEN", "Predictable branch predicted taken"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_EXECUTED, minv, maxv,
+		"PC_BRANCH_EXECUTED", "Predictable branch executed taken"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_OP_EXECUTED, minv, maxv,
+		"OP_EXECUTED", "uOP executed (an instruction or a "
+		"multi-instruction step)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_INST_STALL, minv, maxv,
+		"CYCLES_INST_STALL", "Instruction issue unit idle cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_INST, minv, maxv,
+		"CYCLES_INST", "Instruction issued (multicycle instruction "
+		"counts for one)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_NEON_DATA_STALL, minv, maxv,
+		"CYCLES_NEON_DATA_STALL", "Cycles the CPU waits on MRC "
+		"from NEON"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_NEON_INST_STALL, minv, maxv,
+		"CYCLES_NEON_INST_STALL", "Stall cycles caused by full NEON"
+		" queue (either ins. queue or load queue)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_NEON_CYCLES, minv, maxv,
+		"NEON_CYCLES", "Cycles that both processors (ARM & NEON)"
+		" are not idle"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU0_EVENTS, minv, maxv,
+		"PMU0_EVENTS", "Event on external input source (PMUEXTIN[0])"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU1_EVENTS, minv, maxv,
+		"PMU1_EVENTS", "Event on external input source (PMUEXTIN[1])"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU_EVENTS, minv, maxv,
+		"PMU_EVENTS", "Event on either of the external input sources"
+		" (PMUEXTIN[0,1])")
+};
+
+static struct perf_event_description cortexa9_event_description[] = {
+	/* ARMv7 Cortex-A9 specific event types */
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAVA_HW_BYTECODE_EXEC, minv, maxv,
+		"JAVA_HW_BYTECODE_EXEC", "Java bytecode executed in HW"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAVA_SW_BYTECODE_EXEC, minv, maxv,
+		"JAVA_SW_BYTECODE_EXEC", "Java bytecode executed in SW"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAZELLE_BRANCH_EXEC, minv, maxv,
+		"JAZELLE_BRANCH_EXEC", "Jazelle backward branch"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_COHERENT_LINE_MISS, minv, maxv,
+		"COHERENT_LINE_MISS", "???"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_COHERENT_LINE_HIT, minv, maxv,
+		"COHERENT_LINE_HIT", "???"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ICACHE_DEP_STALL_CYCLES, minv,
+		maxv, "ICACHE_DEP_STALL_CYCLES", "Instruction cache "
+		"dependent stall"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_DEP_STALL_CYCLES, minv,
+		maxv, "DCACHE_DEP_STALL_CYCLES", "Data cache dependent stall"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_TLB_MISS_DEP_STALL_CYCLES, minv,
+		maxv, "TLB_MISS_DEP_STALL_CYCLES", "Main TLB miss stall"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_STREX_EXECUTED_PASSED, minv, maxv,
+		"STREX_EXECUTED_PASSED", "STREX passed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_STREX_EXECUTED_FAILED, minv, maxv,
+		"STREX_EXECUTED_FAILED", "STREX failed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_EVICTION, minv, maxv,
+		"DATA_EVICTION", "Cache data eviction (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISSUE_STAGE_NO_INST, minv, maxv,
+		"ISSUE_STAGE_NO_INST", "No instruction issued cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISSUE_STAGE_EMPTY, minv, maxv,
+		"ISSUE_STAGE_EMPTY", "Empty issue unit cycles"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INST_OUT_OF_RENAME_STAGE, minv,
+		maxv, "INST_OUT_OF_RENAME_STAGE", "???"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PREDICTABLE_FUNCT_RETURNS, minv,
+		maxv, "PREDICTABLE_FUNCT_RETURNS", "Predictable return "
+		"occured (?)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MAIN_UNIT_EXECUTED_INST, minv,
+		maxv, "MAIN_UNIT_EXECUTED_INST", "Pipe 0 instruction "
+		"executed (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_SECOND_UNIT_EXECUTED_INST, minv,
+		maxv, "SECOND_UNIT_EXECUTED_INST", "Pipe 1 instruction "
+		"executed (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_LD_ST_UNIT_EXECUTED_INST, minv,
+		maxv, "LD_ST_UNIT_EXECUTED_INST", "Load/Store Unit instruction"
+		" executed (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_FP_EXECUTED_INST, minv, maxv,
+		"FP_EXECUTED_INST", "VFP instruction executed (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_NEON_EXECUTED_INST, minv, maxv,
+		"NEON_EXECUTED_INST", "NEON instruction executed (?)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLD_FULL_DEP_STALL_CYCLES,
+		minv, maxv, "PLD_FULL_DEP_STALL_CYCLES", "PLD stall cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_WR_DEP_STALL_CYCLES, minv,
+		maxv, "DATA_WR_DEP_STALL_CYCLES", "Write stall cycle"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ITLB_MISS_DEP_STALL_CYCLES, minv,
+		maxv, "ITLB_MISS_DEP_STALL_CYCLES", "Instruction stall due to"
+		" main TLB miss (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DTLB_MISS_DEP_STALL_CYCLES, minv,
+		maxv, "DTLB_MISS_DEP_STALL_CYCLES", "Data stall due to main TLB"
+		" miss (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MICRO_ITLB_MISS_DEP_STALL_CYCLES,
+		minv, maxv, "MICRO_ITLB_MISS_DEP_STALL_CYCLES", "Instruction "
+		"stall due to uTLB miss (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MICRO_DTLB_MISS_DEP_STALL_CYCLES,
+		minv, maxv, "MICRO_DTLB_MISS_DEP_STALL_CYCLES", "Data stall "
+		"due to micro uTLB miss (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DMB_DEP_STALL_CYCLES, minv, maxv,
+		"DMB_DEP_STALL_CYCLES", "DMB stall (?)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INTGR_CLK_ENABLED_CYCLES, minv,
+		maxv, "INTGR_CLK_ENABLED_CYCLES", "Integer core clock "
+		"disabled (?)"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_ENGINE_CLK_EN_CYCLES, minv,
+		maxv, "DATA_ENGINE_CLK_EN_CYCLES", "Data engine clock disabled"
+		" (?)"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISB_INST, minv, maxv, "ISB_INST",
+		"ISB executed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DSB_INST, minv, maxv, "DSB_INST",
+		"DSB executed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DMB_INST, minv, maxv, "DMB_INST",
+		"DMB executed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXT_INTERRUPTS, minv, maxv,
+		"EXT_INTERRUPTS", "External interrupt"),
+
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_CACHE_LINE_RQST_COMPLETED,
+		minv, maxv, "PLE_CACHE_LINE_RQST_COMPLETED", "PLE (Preload "
+		"engine) cache line request completed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_CACHE_LINE_RQST_SKIPPED, minv,
+		maxv, "PLE_CACHE_LINE_RQST_SKIPPED", "PLE cache line "
+		"request skipped"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_FIFO_FLUSH, minv, maxv,
+		"PLE_FIFO_FLUSH", "PLE FIFO flush"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_RQST_COMPLETED, minv, maxv,
+		"PLE_RQST_COMPLETED", "PLE request completed"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_FIFO_OVERFLOW, minv, maxv,
+		"PLE_FIFO_OVERFLOW", "PLE FIFO overflow"),
+	PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_RQST_PROG, minv, maxv,
+		"PLE_RQST_PROG", "PLE request programmed")
+};
+
+
+/* ********************************************************** */
+
 /*
  * Cortex-A9 HW events mapping
  */
@@ -1798,6 +2113,11 @@ static struct arm_pmu armv7pmu = {
 	.max_period = (1LLU << 32) - 1,
 };
 
+const struct list_head *hw_perf_event_get_list(void)
+{
+	return &perf_events_arm;
+}
+
 static int __init
 init_hw_perf_events(void)
 {
@@ -1820,11 +2140,16 @@ init_hw_perf_events(void)
 			memcpy(armpmu_perf_cache_map, armv6_perf_cache_map,
 				sizeof(armv6_perf_cache_map));
 			perf_max_events = armv6pmu.num_events;
+
+			perf_event_add_events(&perf_events_arm, armv6_event_description,
+				ARRAY_SIZE(armv6_event_description));
 		}
 		/*
 		 * ARMv7 detection
 		 */
 		else if (cpu_architecture() == CPU_ARCH_ARMv7) {
+			perf_event_add_events(&perf_events_arm, armv7_event_description,
+				ARRAY_SIZE(armv7_event_description));
 			/*
 			 * Cortex-A8 detection
 			 */
@@ -1834,6 +2159,10 @@ init_hw_perf_events(void)
 					sizeof(armv7_a8_perf_cache_map));
 				armv7pmu.event_map = armv7_a8_pmu_event_map;
 				armpmu = &armv7pmu;
+
+				perf_event_add_events(&perf_events_arm,
+					cortexa8_event_description,
+					ARRAY_SIZE(cortexa8_event_description));
 			} else
 			/*
 			 * Cortex-A9 detection
@@ -1846,8 +2175,12 @@ init_hw_perf_events(void)
 					sizeof(armv7_a9_perf_cache_map));
 				armv7pmu.event_map = armv7_a9_pmu_event_map;
 				armpmu = &armv7pmu;
-			} else
-				perf_max_events = -1;
+
+				perf_event_add_events(&perf_events_arm,
+					cortexa9_event_description,
+					ARRAY_SIZE(cortexa9_event_description));
+			} else
+				perf_max_events = -1;
 
 			if (armpmu) {
 				u32 nb_cnt;
@@ -1867,11 +2200,11 @@ init_hw_perf_events(void)
 		perf_max_events = -1;
 	}
 
-	if (armpmu)
+	if (armpmu)
 		pr_info("enabled with %s PMU driver, %d counters available\n",
 			armpmu->name, armpmu->num_events);
 
-	return 0;
+	return 0;
 }
 arch_initcall(init_hw_perf_events);
 
-- 
1.5.4.3

* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs
From: Peter Zijlstra @ 2010-01-20 9:16 UTC
To: linux-arm-kernel

On Wed, 2010-01-20 at 10:11 +0100, Tomasz Fujak wrote:
> Hi,
>
> While I managed to build and run the early version (back from
> December), I was unable to find the newest sources (infra + ARMv6,
> ARMv7 support).
> Where do I find them?
>
> The following patches provide a sysfs entry with hardware event human
> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> (event_value, minval, maxval, name, description) and means to populate
> the file.
> The version posted contains ARMv6, ARMv7 (Cortex-A[89]) support in
> this matter.
>
> The intended use is twofold: for users to read the list directly and
> for tools (like perf).
>
> This series includes:
> [PATCH v1 1/2] perfevent: Add performance event structure definition
> and 'extevents' sysfs entry
> [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6,
> Cortex-A8 and Cortex-A9 exported

Why do this in kernel space? Listing available events seems like
something we can do from userspace just fine.

* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs
From: Tomasz Fujak @ 2010-01-20 9:46 UTC
To: linux-arm-kernel

> -----Original Message-----
> From: linux-arm-kernel-bounces at lists.infradead.org
> [mailto:linux-arm-kernel-bounces at lists.infradead.org] On Behalf Of
> Peter Zijlstra
> Sent: Wednesday, January 20, 2010 10:17 AM
> To: Tomasz Fujak
> Cc: jpihet at mvista.com; p.osciak at samsung.com; jamie.iles at picochip.com;
> will.deacon at arm.com; linux-kernel at vger.kernel.org;
> kyungmin.park at samsung.com; mingo at elte.hu;
> linux-arm-kernel at lists.infradead.org; m.szyprowski at samsung.com
> Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event
> description in sysfs
>
> On Wed, 2010-01-20 at 10:11 +0100, Tomasz Fujak wrote:
> > Hi,
> >
> > While I managed to build and run the early version (back from
> > December), I was unable to find the newest sources (infra + ARMv6,
> > ARMv7 support).
> > Where do I find them?
> >
> > The following patches provide a sysfs entry with hardware event human
> > readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> > (event_value, minval, maxval, name, description) and means to
> > populate the file.
> > The version posted contains ARMv6, ARMv7 (Cortex-A[89]) support in
> > this matter.
> >
> > The intended use is twofold: for users to read the list directly and
> > for tools (like perf).
> >
> > This series includes:
> > [PATCH v1 1/2] perfevent: Add performance event structure definition
> > and 'extevents' sysfs entry
> > [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6,
> > Cortex-A8 and Cortex-A9 exported
>
> Why do this in kernel space? Listing available events seems like
> something we can do from userspace just fine.

Sure we could, it's the other option. But it does not appeal to me.
Userspace tools (like perf, for which the above is meant) would need to
come with their own version of the list, which must match the host
platform. Right now perf just forwards the raw event number to the
kernel and that's it. Potentially it could bind a set of supported
events to a platform (how to detect which platform we execute on?).
But how do we handle different revisions and minor changes within a
single platform?

That's why I think the kernel should expose the supported events, at
least with an identifier suitable to unambiguously detect which
HW-defined event it is. In the proposed approach I also provided a name
and a description.

Right now if one wants to set a counter with some non-generic value, a
datasheet comes in handy. And Joe the average user does not necessarily
know the exact machine he/she has, let alone the datasheet.
With this approach the user is armed with the event definition, which
helps them work around outdated/unsupported tools.

* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs
From: Michał Nazarewicz @ 2010-01-20 9:57 UTC
To: linux-arm-kernel

>> The following patches provide a sysfs entry with hardware event human
>> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
>> (event_value, minval, maxval, name, description) and means to populate
>> the file.
>>
>> The intended use is twofold: for users to read the list directly and
>> for tools (like perf).

On Wed, 20 Jan 2010 10:16:39 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
> Why do this in kernel space? Listing available events seems like
> something we can do from userspace just fine.

IMO kernel knows better what hardware it's running on and user space
should not care and if this list were to be kept in user space it
would have to detect the processor it's running on and act accordingly.

Also, keeping the list in user space could lead to different software
maintaining separate lists which would get out of sync. I think it's
easier to update a single list in kernel than wait till all the
software packages update theirs.

This also means that different tools would use different names and
descriptions for the events which would only increase confusion.

Moreover, since kernel already does the hard work of detecting CPU
it may provide a list as well.

But I'm just a humble coder, what do I know... ;)

-- 
Best regards,                                         _     _
 .o. | Liege of Serenely Enlightened Majesty of     o' \,=./ `o
 ..o | Computer Science,  Michał "mina86" Nazarewicz   (o o)
 ooo +---[mina86 at mina86.com]---[mina86 at jabber.org]---ooO--(_)--Ooo--

* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs
From: Jamie Iles @ 2010-01-20 13:31 UTC
To: linux-arm-kernel

On Wed, Jan 20, 2010 at 10:57:08AM +0100, Michał Nazarewicz wrote:
>>> The following patches provide a sysfs entry with hardware event human
>>> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
>>> (event_value, minval, maxval, name, description) and means to populate
>>> the file.
>>>
>>> The intended use is twofold: for users to read the list directly and
>>> for tools (like perf).
>
> On Wed, 20 Jan 2010 10:16:39 +0100, Peter Zijlstra <peterz@infradead.org> wrote:
>> Why do this in kernel space? Listing available events seems like
>> something we can do from userspace just fine.
>
> IMO kernel knows better what hardware it's running on and user space
> should not care and if this list were to be kept in user space it
> would have to detect the processor it's running on and act accordingly.
>
> Also, keeping the list in user space could lead to different software
> maintaining separate lists which would get out of sync. I think it's
> easier to update a single list in kernel than wait till all the
> software packages update theirs.
>
> This also means that different tools would use different names and
> descriptions for the events which would only increase confusion.

Personally I think this is a good idea. At the moment 'perf list' gives lots
of events that the system isn't capable of counting. Admittedly it's fairly
easy to see if they are supported but it would be nice if the list reflected
the countable events. perf already does this for the tracing events so it
would be nice if it did the same for the hardware events. I guess the same
hierarchy would be nice too.

The main problem I can envisage is that different CPUs could use slightly
different names for the same event.

Jamie
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 13:31 ` Jamie Iles @ 2010-01-20 13:39 ` Peter Zijlstra 2010-01-20 13:55 ` Russell King - ARM Linux 0 siblings, 1 reply; 23+ messages in thread From: Peter Zijlstra @ 2010-01-20 13:39 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-01-20 at 13:31 +0000, Jamie Iles wrote: > Personally I think this is a good idea. At the moment 'perf list' gives lots > of events that the system isn't capable of counting. Admittedly it's fairly > easy to see if they are supported but it would be nice if the list reflected > the countable events. perf already does this for the tracing events so it > would be nice if it did the same for the hardware events. I guess the same > hierarchy would be nice too. This seems to be missing the patch that extends perf list to report the support and counting status for the events on the current machine :-) Furthermore, /proc/cpuinfo should be enough information to come up with an arch specific set of events to be translated into raw. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 13:39 ` Peter Zijlstra @ 2010-01-20 13:55 ` Russell King - ARM Linux 2010-01-20 14:01 ` Peter Zijlstra 0 siblings, 1 reply; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 13:55 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 02:39:39PM +0100, Peter Zijlstra wrote: > Furthermore, /proc/cpuinfo should be enough information to come up with > an arch specific set of events to be translated into raw. Unfortunately, it isn't. CPU identification has become a fairly murky business on ARM that the information exported from /proc/cpuinfo can no longer precisely identify the CPU itself. For example, we just treat Cortex A8 and A9 as "ARMv7" because from the kernel's point of view, they're the same. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 13:55 ` Russell King - ARM Linux @ 2010-01-20 14:01 ` Peter Zijlstra 2010-01-20 14:09 ` Michał Nazarewicz 2010-01-20 14:41 ` Russell King - ARM Linux 0 siblings, 2 replies; 23+ messages in thread From: Peter Zijlstra @ 2010-01-20 14:01 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote: > > Unfortunately, it isn't. CPU identification has become a fairly murky > business on ARM that the information exported from /proc/cpuinfo can > no longer precisely identify the CPU itself. > > For example, we just treat Cortex A8 and A9 as "ARMv7" because from the > kernel's point of view, they're the same. Would it make sense to extend arm's cpuinfo to include enough information so that userspace can indeed do this? It seems to me userspace might care about the exact platform they're running on. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:01 ` Peter Zijlstra @ 2010-01-20 14:09 ` Michał Nazarewicz 2010-01-20 14:16 ` Peter Zijlstra 2010-01-20 14:41 ` Russell King - ARM Linux 1 sibling, 1 reply; 23+ messages in thread From: Michał Nazarewicz @ 2010-01-20 14:09 UTC (permalink / raw) To: linux-arm-kernel On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <peterz@infradead.org> wrote: > It seems to me userspace might care about the exact platform they're > running on. In my humble opinion, user space should never care about platform it's running on. Interfaces provided by kernel should suffice to implement abstraction layer between user space and hardware. If we abandon that we're back in DOS times. But hey, again, that's just my opinion. -- Best regards, _ _ .o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o ..o | Computer Science, Michał "mina86" Nazarewicz (o o) ooo +---[mina86 at mina86.com]---[mina86 at jabber.org]---ooO--(_)--Ooo-- ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:09 ` Michał Nazarewicz @ 2010-01-20 14:16 ` Peter Zijlstra 2010-01-20 14:26 ` Peter Zijlstra 2010-01-20 14:54 ` Michał Nazarewicz 0 siblings, 2 replies; 23+ messages in thread From: Peter Zijlstra @ 2010-01-20 14:16 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote: > On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <peterz@infradead.org> wrote: > > It seems to me userspace might care about the exact platform they're > > running on. > > In my humble opinion, user space should never care about platform it's > running on. Interfaces provided by kernel should suffice to implement > abstraction layer between user space and hardware. If we abandon that > we're back in DOS times. But hey, again, that's just my opinion. Well, you're completely right. But the often sad reality is that perfect abstraction is either impossible or prohibitively expensive. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:16 ` Peter Zijlstra @ 2010-01-20 14:26 ` Peter Zijlstra 2010-01-20 14:45 ` Russell King - ARM Linux 2010-01-20 14:54 ` Michał Nazarewicz 1 sibling, 1 reply; 23+ messages in thread From: Peter Zijlstra @ 2010-01-20 14:26 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-01-20 at 15:16 +0100, Peter Zijlstra wrote: > On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote: > > On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <peterz@infradead.org> wrote: > > > It seems to me userspace might care about the exact platform they're > > > running on. > > > > In my humble opinion, user space should never care about platform it's > > running on. Interfaces provided by kernel should suffice to implement > > abstraction layer between user space and hardware. If we abandon that > > we're back in DOS times. But hey, again, that's just my opinion. > > Well, you're completely right. But the often sad reality is that perfect > abstraction is either impossible or prohibitively expensive. And then there is the simple matter of knowing what kind of box it is without having to resort to a screwdriver or worse. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:26 ` Peter Zijlstra @ 2010-01-20 14:45 ` Russell King - ARM Linux 0 siblings, 0 replies; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 14:45 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 03:26:49PM +0100, Peter Zijlstra wrote: > On Wed, 2010-01-20 at 15:16 +0100, Peter Zijlstra wrote: > > On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote: > > > On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <peterz@infradead.org> wrote: > > > > It seems to me userspace might care about the exact platform they're > > > > running on. > > > > > > In my humble opinion, user space should never care about platform it's > > > running on. Interfaces provided by kernel should suffice to implement > > > abstraction layer between user space and hardware. If we abandon that > > > we're back in DOS times. But hey, again, that's just my opinion. > > > > Well, you're completely right. But the often sad reality is that perfect > > abstraction is either impossible or prohibitively expensive. > > And then there is the simple matter of knowing what kind of box it is > without having to resort to a screwdriver or worse. If you're expecting the CPU to tell you that, give up now. The CPU will tell you about the CPU core, not the SoC. All SoCs that have an ARM926 core in them report that they are an ARM926 CPU; that doesn't tell you that the surrounding hardware is an Atmel SoC, Samsung SoC, etc. Even some buggy CPUs which aren't an ARM926 report themselves as an ARM926 (Feroceon) while being incompatible with the ARM926 on several levels. (Apparently, the argument being that they wanted ARM926 software to run on Feroceon, or something like that.) That's why we have the value passed in from the boot loader; there's no other way to tell what SoC you're running on. 
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:16 ` Peter Zijlstra 2010-01-20 14:26 ` Peter Zijlstra @ 2010-01-20 14:54 ` Michał Nazarewicz 1 sibling, 0 replies; 23+ messages in thread From: Michał Nazarewicz @ 2010-01-20 14:54 UTC (permalink / raw) To: linux-arm-kernel >> On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <peterz@infradead.org> wrote: >>> It seems to me userspace might care about the exact platform they're >>> running on. > On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote: >> In my humble opinion, user space should never care about platform it's >> running on. Interfaces provided by kernel should suffice to implement >> abstraction layer between user space and hardware. If we abandon that >> we're back in DOS times. But hey, again, that's just my opinion. On Wed, 20 Jan 2010 15:16:19 +0100, Peter Zijlstra <peterz@infradead.org> wrote: > Well, you're completely right. But the often sad reality is that perfect > abstraction is either impossible or prohibitively expensive. Yes, I agree and am aware of that, but I think it's not the case with performance events. It is possible for kernel to provide such a list and at the same time it's not that expensive (it's a matter of hardcoding a list in the source and possibly altering it a bit according to hardware detection which is done anyway). Of course, it's not all gold -- maintaining such a list increases complexity of the kernel and adds burden of keeping the lists in sync with reality. Still, however, in my opinion, the advantages of the list maintained in kernel are greater than disadvantages and so I'd opt in for that solution. (Of course, I'm not some kind of ARM Linux guru so I may be simply wrong.) -- Best regards, _ _ .o. 
| Liege of Serenely Enlightened Majesty of o' \,=./ `o ..o | Computer Science, Michał "mina86" Nazarewicz (o o) ooo +---[mina86 at mina86.com]---[mina86 at jabber.org]---ooO--(_)--Ooo-- ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:01 ` Peter Zijlstra 2010-01-20 14:09 ` Michał Nazarewicz @ 2010-01-20 14:41 ` Russell King - ARM Linux 2010-01-20 15:03 ` Jamie Iles 2010-01-20 16:26 ` Jamie Lokier 1 sibling, 2 replies; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 14:41 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 03:01:20PM +0100, Peter Zijlstra wrote: > On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote: > > > > Unfortunately, it isn't. CPU identification has become a fairly murky > > business on ARM that the information exported from /proc/cpuinfo can > > no longer precisely identify the CPU itself. > > > > For example, we just treat Cortex A8 and A9 as "ARMv7" because from the > > kernel's point of view, they're the same. > > Would it make sense to extend arm's cpuinfo to include enough > information so that userspace can indeed do this? The idea that "I'm running on a Cortex A9" is no longer provided by the new CPU ID scheme. Instead, what's now provided is a set of registers which describe various individual features of the CPU: - ThumbEE ISA level, Jazelle ISA level, Thumb ISA level, ARM ISA level. - Programmer model (not much here that userspace would be interested in) - Debug model (memory mapped/co-processor, v6 debug architecture, v7 debug architecture.) - Four 32-bit registers describing the memory model. Note that pre-ARMv6k does not provide this information. Plus, the interpretation of these registers change between ARMv6k and ARMv7 - and I wouldn't be surprised if the interpretation changes in the future - just like the 'cache type' register completely changed format on ARMv7. > It seems to me userspace might care about the exact platform they're > running on. 
It may have wanted to care at one time, but as time goes on, knowing what the high-level chip is will become irrelevant, and is actually the wrong question. 
The real questions that userspace needs to ask are the specific ones, such as "what ARM ISA level is supported? what Thumb ISA level is supported? what debug model is implemented?" Given that history has shown that identification schemes on ARM change in extremely annoying ways, I don't think decoding these registers to some kind of textual representation for /proc/cpuinfo is the right approach. It might instead make more sense to just export the entire set of CPU ID registers to userspace, and let userspace grapple with the complexities of decoding the information it wants from them. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:41 ` Russell King - ARM Linux @ 2010-01-20 15:03 ` Jamie Iles 2010-01-20 15:42 ` Russell King - ARM Linux 2010-01-20 16:26 ` Jamie Lokier 1 sibling, 1 reply; 23+ messages in thread From: Jamie Iles @ 2010-01-20 15:03 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 02:41:40PM +0000, Russell King - ARM Linux wrote: > Given that history has shown that identification schemes on ARM change > in extremely annoying ways, I don't think decoding these registers to > some kind of textual representation for /proc/cpuinfo is the right > approach. It might instead make more sense to just export the entire > set of CPU ID registers to userspace, and let userspace grapple with > the complexities of decoding the information it wants from them. Yes, this would probably be the best generic solution, but in the specific case of ARM perfevents, the kernel code already has to decode some of the CPU ID registers to work out what set of events to use. Why make userspace do all of this decoding again? The x86 code sets up the x86_pmu depending on CPU type so this is doing a similar thing (although it is easier for x86). Having perf do all of this decoding for all of the supported CPU types when the kernel has already done it once and maintaining 2 sets of event lists seems a bit fiddly compared to simply exporting the supported events from the kernel... Jamie ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 15:03 ` Jamie Iles @ 2010-01-20 15:42 ` Russell King - ARM Linux 2010-01-20 16:18 ` Jamie Iles 0 siblings, 1 reply; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 15:42 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 03:03:03PM +0000, Jamie Iles wrote: > On Wed, Jan 20, 2010 at 02:41:40PM +0000, Russell King - ARM Linux wrote: > > Given that history has shown that identification schemes on ARM change > > in extremely annoying ways, I don't think decoding these registers to > > some kind of textual representation for /proc/cpuinfo is the right > > approach. It might instead make more sense to just export the entire > > set of CPU ID registers to userspace, and let userspace grapple with > > the complexities of decoding the information it wants from them. > Yes, this would probably be the best generic solution, but in the specific > case of ARM perfevents, the kernel code already has to decode some of the CPU > ID registers to work out what set of events to use. Why make userspace do all > of this decoding again? The x86 code sets up the x86_pmu depending on CPU type > so this is doing a similar thing (although it is easier for x86). If you're referring to reading the main CPU ID register and relying on the part number telling you what CPU you're running on, that's unreliable if you're only checking the part number - you at least need to check the implementer. If you want to do ID checking via the main ID register, there are some clashes even if you take the implementer field into account. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 15:42 ` Russell King - ARM Linux @ 2010-01-20 16:18 ` Jamie Iles 0 siblings, 0 replies; 23+ messages in thread From: Jamie Iles @ 2010-01-20 16:18 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 03:42:50PM +0000, Russell King - ARM Linux wrote: > If you're referring to reading the main CPU ID register and relying > on the part number telling you what CPU you're running on, that's > unreliable if you're only checking the part number - you at least > need to check the implementer. > > If you want to do ID checking via the main ID register, there are > some clashes even if you take the implementer field into account. Ok, so for the kernel based code I should check the implementer and part number then. For now we can make sure that the implementer is ARM and add others if they have compatible PMUs and hope that there aren't any clashes with nasty side effects. Jamie ^ permalink raw reply [flat|nested] 23+ messages in thread
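The check agreed on above (implementer first, then part number) could look roughly like the sketch below. It assumes the current main ID register layout, with the implementer in bits [31:24] and the primary part number in bits [15:4], and the usual values 0x41 for ARM Ltd, 0xC08 for Cortex-A8 and 0xC09 for Cortex-A9. The enum and the function name classify_midr are invented for illustration and are not from the posted patches. As noted in the thread, older encodings of this register exist and some parts clash, so this is not a complete identification scheme.

```c
#include <stdint.h>

/* Field extraction for the main ID register (assumed current layout). */
#define MIDR_IMPLEMENTER(midr)  (((midr) >> 24) & 0xffu)
#define MIDR_PARTNUM(midr)      (((midr) >> 4) & 0xfffu)

#define IMPLEMENTER_ARM   0x41u   /* ASCII 'A', ARM Ltd */
#define PART_CORTEX_A8    0xc08u
#define PART_CORTEX_A9    0xc09u

/* Hypothetical result type: which event list to use, if any. */
enum pmu_kind { PMU_UNKNOWN, PMU_CORTEX_A8, PMU_CORTEX_A9 };

static enum pmu_kind classify_midr(uint32_t midr)
{
    /* Reject anything not implemented by ARM before trusting the part number. */
    if (MIDR_IMPLEMENTER(midr) != IMPLEMENTER_ARM)
        return PMU_UNKNOWN;

    switch (MIDR_PARTNUM(midr)) {
    case PART_CORTEX_A8:
        return PMU_CORTEX_A8;
    case PART_CORTEX_A9:
        return PMU_CORTEX_A9;
    default:
        return PMU_UNKNOWN;
    }
}
```

Returning PMU_UNKNOWN for every other implementer matches the conservative approach proposed in the message: support for further implementers would be added only once their PMUs are known to be compatible.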
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 14:41 ` Russell King - ARM Linux 2010-01-20 15:03 ` Jamie Iles @ 2010-01-20 16:26 ` Jamie Lokier 2010-01-20 16:35 ` Russell King - ARM Linux 1 sibling, 1 reply; 23+ messages in thread From: Jamie Lokier @ 2010-01-20 16:26 UTC (permalink / raw) To: linux-arm-kernel Russell King - ARM Linux wrote: > On Wed, Jan 20, 2010 at 03:01:20PM +0100, Peter Zijlstra wrote: > > On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote: > > > > > > Unfortunately, it isn't. CPU identification has become a fairly murky > > > business on ARM that the information exported from /proc/cpuinfo can > > > no longer precisely identify the CPU itself. > > > > > > For example, we just treat Cortex A8 and A9 as "ARMv7" because from the > > > kernel's point of view, they're the same. > > > > Would it make sense to extend arm's cpuinfo to include enough > > information so that userspace can indeed do this? > > The idea that "I'm running on a Cortex A9" is no longer provided by the > new CPU ID scheme. Instead, what's now provided is a set of registers > which describe various individual features of the CPU: > > - ThumbEE ISA level, Jazelle ISA level, Thumb ISA level, ARM ISA level. > - Programmer model (not much here that userspace would be interested in) > - Debug model (memory mapped/co-processor, v6 debug architecture, v7 debug > architecture.) > - Four 32-bit registers describing the memory model. > > Note that pre-ARMv6k does not provide this information. Plus, the > interpretation of these registers change between ARMv6k and ARMv7 - > and I wouldn't be surprised if the interpretation changes in the > future - just like the 'cache type' register completely changed format > on ARMv7. > > > It seems to me userspace might care about the exact platform they're > > running on. 
> > It may have wanted to care at one time, but as time goes on, knowing what > the high-level chip is will become irrelevant, and is actually the > wrong question. > > The real questions that userspace needs to ask are the specific ones, > such as "what ARM ISA level is supported? what Thumb ISA level is > supported? what debug model is implemented?" > > Given that history has shown that identification schemes on ARM change > in extremely annoying ways, I don't think decoding these registers to > some kind of textual representation for /proc/cpuinfo is the right > approach. It might instead make more sense to just export the entire > set of CPU ID registers to userspace, and let userspace grapple with > the complexities of decoding the information it wants from them. In practice, the list of capabilities works well on x86 in /proc/cpuinfo: flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm They are based on the feature bits from the CPU's cpuid instruction, but the kernel does things like apply errata quirks to remove bits that don't work on a particular implementation and show the lowest common denominator when there are multiple CPUs. Userspace tends to look for features it cares about (e.g. sse means sse instructions are available), and doesn't need to know anything about murky details of different CPUs. Many of the features aren't relevant to userspace; the rest tend to indicate the presence of particular instructions. On ARM, it would be great to have a simple set of features in /proc/cpuinfo indicating which instruction sets are available (and reliable). -- Jamie ^ permalink raw reply [flat|nested] 23+ messages in thread
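As an illustration of the "look for the feature you care about" usage described above, here is a small sketch of a whole-token match against a flags line. The function name has_flag is an assumption for this example, and the caller is assumed to have already read the "flags" line from /proc/cpuinfo into a string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `flag` appears as a complete whitespace-delimited token
 * in `flags`, so that looking for "sse" does not match "sse2".
 */
static bool has_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the string start or after whitespace... */
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must end at the string end or before whitespace. */
        bool ends = p[len] == '\0' || p[len] == ' ' ||
                    p[len] == '\t' || p[len] == '\n';

        if (starts && ends)
            return true;
        p += len;
    }
    return false;
}
```

The token boundary check is what makes a plain text list usable for this purpose: a bare substring search would report sse on any CPU that lists sse2.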
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 16:26 ` Jamie Lokier @ 2010-01-20 16:35 ` Russell King - ARM Linux 0 siblings, 0 replies; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 16:35 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 04:26:47PM +0000, Jamie Lokier wrote: > In practice, the list of capabilities works well on x86 in /proc/cpuinfo: > > flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm > > They are based on the feature bits from the CPU's cpuid instruction, > but the kernel does things like apply errata quirks to remove bits > that don't work on a particular implementation and show the lowest common > denominator when there are multiple CPUs. You're assuming that there's a fixed set of feature bits on ARM. There aren't. What you have is a main ID register up until ARMv6, which has about four different encodings. On some CPUs, this is the only ID register offered, and within that subset, some different CPUs (eg, implemented by different manufacturers, or indeed the same manufacturer) have the same ID register value, despite being rather different. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 9:11 [PATCH/RFC v1 0/2] Human readable performance event description in sysfs Tomasz Fujak ` (2 preceding siblings ...) 2010-01-20 9:16 ` [PATCH/RFC v1 0/2] Human readable performance event description in sysfs Peter Zijlstra @ 2010-01-20 9:57 ` Russell King - ARM Linux 2010-01-20 10:21 ` Tomasz Fujak 3 siblings, 1 reply; 23+ messages in thread From: Russell King - ARM Linux @ 2010-01-20 9:57 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jan 20, 2010 at 10:11:44AM +0100, Tomasz Fujak wrote: > The following patches provide a sysfs entry with hardware event human > readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" % > (event_value, minval, maxval, name, description) I think your patch is in violation of this from Documentation/filesystems/sysfs.txt: Attributes ~~~~~~~~~ ... Attributes should be ASCII text files, preferably with only one value per file. It is noted that it may not be efficient to contain only one value per file, so it is socially acceptable to express an array of values of the same type. Mixing types, expressing multiple lines of data, and doing fancy formatting of data is heavily frowned upon. Doing these things may get you publically humiliated and your code rewritten without notice. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC v1 0/2] Human readable performance event description in sysfs 2010-01-20 9:57 ` Russell King - ARM Linux @ 2010-01-20 10:21 ` Tomasz Fujak 0 siblings, 0 replies; 23+ messages in thread From: Tomasz Fujak @ 2010-01-20 10:21 UTC (permalink / raw) To: linux-arm-kernel > -----Original Message----- > From: linux-arm-kernel-bounces at lists.infradead.org [mailto:linux-arm- > kernel-bounces at lists.infradead.org] On Behalf Of Russell King - ARM > Linux > Sent: Wednesday, January 20, 2010 10:58 AM > To: Tomasz Fujak > Cc: jpihet at mvista.com; peterz at infradead.org; p.osciak at samsung.com; > jamie.iles at picochip.com; will.deacon at arm.com; linux- > kernel at vger.kernel.org; kyungmin.park at samsung.com; mingo at elte.hu; > linux-arm-kernel at lists.infradead.org; m.szyprowski at samsung.com > Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event > description in sysfs > > On Wed, Jan 20, 2010 at 10:11:44AM +0100, Tomasz Fujak wrote: > > The following patches provide a sysfs entry with hardware event human > > readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" % > > (event_value, minval, maxval, name, description) > > I think your patch is in violation of this from > Documentation/filesystems/sysfs.txt: > > Attributes > ~~~~~~~~~ > ... > Attributes should be ASCII text files, preferably with only one value > per file. It is noted that it may not be efficient to contain only one > value per file, so it is socially acceptable to express an array of > values of the same type. > > Mixing types, expressing multiple lines of data, and doing fancy > formatting of data is heavily frowned upon. Doing these things may get > you publically humiliated and your code rewritten without notice. 1. There are numerous exceptions: $ find /sys -exec grep -HC ^ {} \; 2>/dev/null | grep ":[3-9]$" | grep -c yielded 43 on my machine. 
Some of them list multiple lines with fancy formatting each (i.e.: /sys/class/Bluetooth/l2cap or devices/pci*/resource) 2. There are sysfs entries regarding the performance counters already: 'overcommit' and 'reserve_percpu' They are simple, I admit, but I find it useful to have all relevant things in one place. If the above does not convince you, I could move the file to the debugfs. ^ permalink raw reply [flat|nested] 23+ messages in thread