* [PATCH v2 0/5] Cavium ThunderX PMU support
2016-01-28 14:54 UTC
From: Jan Glauber
To: linux-arm-kernel

Hi Mark & Will,

I'm resending the arm64 PMU patches. The only difference from the first
version is that I dropped the "x" in "thunder" in order to be consistent
with the existing device tree name.

Thanks,
Jan

Jan Glauber (5):
  arm64/perf: Rename Cortex A57 events
  arm64/perf: Add Cavium ThunderX PMU support
  arm64: dts: Add Cavium ThunderX specific PMU
  arm64/perf: Enable PMCR long cycle counter bit
  arm64/perf: Extend event mask for ARMv8.1

 Documentation/devicetree/bindings/arm/pmu.txt |   1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  |   5 +
 arch/arm64/kernel/perf_event.c                | 145 ++++++++++++++++++++------
 drivers/perf/arm_pmu.c                        |   5 +-
 include/linux/perf/arm_pmu.h                  |   4 +-
 5 files changed, 125 insertions(+), 35 deletions(-)

--
1.9.1
* [PATCH v2 1/5] arm64/perf: Rename Cortex A57 events
2016-01-28 14:54 UTC
From: Jan Glauber
To: linux-arm-kernel

The implemented Cortex A57 events are not A57-specific. They are
recommended by ARM and can also be found on other ARMv8 SoCs such as
Cavium ThunderX. Therefore move these events to the common PMUv3 table.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index f7ab14c..32fe656 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -87,17 +87,17 @@
 #define ARMV8_PMUV3_PERFCTR_L2D_TLB			0x2F
 #define ARMV8_PMUV3_PERFCTR_L21_TLB			0x30
 
+/* Recommended events. */
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD		0x40
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST		0x41
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD		0x42
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST		0x43
+#define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD		0x4C
+#define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST		0x4D
+
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREFETCH_LINEFILL		0xC2
 
-/* ARMv8 Cortex-A57 and Cortex-A72 specific event types. */
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD		0x40
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST		0x41
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD		0x42
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST		0x43
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_LD		0x4c
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_ST		0x4d
-
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
 	PERF_MAP_ALL_UNSUPPORTED,
@@ -174,16 +174,16 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 					      [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
 	PERF_CACHE_MAP_ALL_UNSUPPORTED,
 
-	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD,
-	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST,
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST,
 	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
 	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
 
-	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_LD,
-	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_ST,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST,
 
 	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
--
1.9.1
* [PATCH v2 2/5] arm64/perf: Add Cavium ThunderX PMU support
2016-01-28 14:55 UTC
From: Jan Glauber
To: linux-arm-kernel

Support PMU events on Cavium's ThunderX SoC. ThunderX supports some
additional counters compared to the default ARMv8 PMUv3:

- branch instructions counter
- stall frontend & backend counters
- L1 dcache load & store counters
- L1 icache counters
- iTLB & dTLB counters
- L1 dcache & icache prefetch counters

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 69 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 32fe656..c038e4e 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -94,10 +94,19 @@
 #define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST		0x43
 #define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD		0x4C
 #define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST		0x4D
+#define ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_LD		0x4E
+#define ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_ST		0x4F
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREFETCH_LINEFILL		0xC2
 
+/* ARMv8 Cavium Thunder specific event types. */
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_MISS_ST		0xE9
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_ACCESS	0xEA
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_MISS	0xEB
+#define ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_ACCESS	0xEC
+#define ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_MISS	0xED
+
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
 	PERF_MAP_ALL_UNSUPPORTED,
@@ -131,6 +140,18 @@ static const unsigned armv8_a57_perf_map[PERF_COUNT_HW_MAX] = {
 	[PERF_COUNT_HW_BUS_CYCLES]		= ARMV8_PMUV3_PERFCTR_BUS_CYCLES,
 };
 
+static const unsigned armv8_thunder_perf_map[PERF_COUNT_HW_MAX] = {
+	PERF_MAP_ALL_UNSUPPORTED,
+	[PERF_COUNT_HW_CPU_CYCLES]		= ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+	[PERF_COUNT_HW_CACHE_MISSES]		= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= ARMV8_PMUV3_PERFCTR_PC_WRITE,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND]	= ARMV8_PMUV3_PERFCTR_STALL_FRONTEND,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND]	= ARMV8_PMUV3_PERFCTR_STALL_BACKEND,
+};
+
 static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 						[PERF_COUNT_HW_CACHE_OP_MAX]
 						[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
@@ -193,6 +214,36 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
 };
 
+static const unsigned armv8_thunder_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
+						  [PERF_COUNT_HW_CACHE_OP_MAX]
+						  [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	PERF_CACHE_MAP_ALL_UNSUPPORTED,
+
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_THUNDER_PERFCTR_L1_DCACHE_MISS_ST,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_ACCESS,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_MISS,
+
+	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
+	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_ACCESS,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_MISS,
+
+	[C(DTLB)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_LD,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_ST,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST,
+
+	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
+
+	[C(BPU)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+};
+
 #define ARMV8_EVENT_ATTR_RESOLVE(m) #m
 #define ARMV8_EVENT_ATTR(name, config) \
 	PMU_EVENT_ATTR_STRING(name, armv8_event_attr_##name, \
@@ -324,7 +375,6 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 	NULL,
 };
 
-
 /*
  * Perf Events' indices
  */
@@ -743,6 +793,13 @@ static int armv8_a57_map_event(struct perf_event *event)
 				ARMV8_EVTYPE_EVENT);
 }
 
+static int armv8_thunder_map_event(struct perf_event *event)
+{
+	return armpmu_map_event(event, &armv8_thunder_perf_map,
+				&armv8_thunder_perf_cache_map,
+				ARMV8_EVTYPE_EVENT);
+}
+
 static void armv8pmu_read_num_pmnc_events(void *info)
 {
 	int *nb_cnt = info;
@@ -811,11 +868,21 @@ static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 	return armv8pmu_probe_num_events(cpu_pmu);
 }
 
+static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
+{
+	armv8_pmu_init(cpu_pmu);
+	cpu_pmu->name			= "armv8_cavium_thunder";
+	cpu_pmu->map_event		= armv8_thunder_map_event;
+	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	return armv8pmu_probe_num_events(cpu_pmu);
+}
+
 static const struct of_device_id armv8_pmu_of_device_ids[] = {
 	{.compatible = "arm,armv8-pmuv3",	.data = armv8_pmuv3_init},
 	{.compatible = "arm,cortex-a53-pmu",	.data = armv8_a53_pmu_init},
 	{.compatible = "arm,cortex-a57-pmu",	.data = armv8_a57_pmu_init},
 	{.compatible = "arm,cortex-a72-pmu",	.data = armv8_a72_pmu_init},
+	{.compatible = "cavium,thunder-pmu",	.data = armv8_thunder_pmu_init},
	{},
 };
--
1.9.1
* [PATCH v2 3/5] arm64: dts: Add Cavium ThunderX specific PMU
2016-01-28 14:55 UTC
From: Jan Glauber
To: linux-arm-kernel

Add a compatible string for the Cavium ThunderX PMU.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 Documentation/devicetree/bindings/arm/pmu.txt | 1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 5651883..d3999a1 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -25,6 +25,7 @@ Required properties:
 	"qcom,scorpion-pmu"
 	"qcom,scorpion-mp-pmu"
 	"qcom,krait-pmu"
+	"cavium,thunder-pmu"
 - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
               interrupt (PPI) then 1 interrupt should be specified.
 
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
index 9cb7cf9..2eb9b22 100644
--- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
@@ -360,6 +360,11 @@
 			     <1 10 0xff01>;
 	};
 
+	pmu {
+		compatible = "cavium,thunder-pmu", "arm,armv8-pmuv3";
+		interrupts = <1 7 4>;
+	};
+
 	soc {
 		compatible = "simple-bus";
 		#address-cells = <2>;
--
1.9.1
* [PATCH v2 4/5] arm64/perf: Enable PMCR long cycle counter bit
2016-01-28 14:55 UTC
From: Jan Glauber
To: linux-arm-kernel

With the long cycle counter bit (LC) disabled, the cycle counter does not
work on the ThunderX SoC (ThunderX only implements AArch64). Also,
according to the documentation, LC == 0 is deprecated.

To keep the code simple the patch does not introduce 64-bit-wide counter
functions. Instead, writing the cycle counter always sets the upper
32 bits so overflow interrupts are generated as before.

Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index c038e4e..5e4275e 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
 #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
 #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
+#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
 #define ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
 #define ARMV8_PMCR_N_MASK	0x1f
 #define ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
@@ -494,9 +495,16 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
 	if (!armv8pmu_counter_valid(cpu_pmu, idx))
 		pr_err("CPU%u writing wrong counter %d\n",
 			smp_processor_id(), idx);
-	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
-		asm volatile("msr pmccntr_el0, %0" :: "r" (value));
-	else if (armv8pmu_select_counter(idx) == idx)
+	else if (idx == ARMV8_IDX_CYCLE_COUNTER) {
+		/*
+		 * Set the upper 32bits as this is a 64bit counter but we only
+		 * count using the lower 32bits and we want an interrupt when
+		 * it overflows.
+		 */
+		u64 value64 = 0xffffffff00000000ULL | value;
+
+		asm volatile("msr pmccntr_el0, %0" :: "r" (value64));
+	} else if (armv8pmu_select_counter(idx) == idx)
 		asm volatile("msr pmxevcntr_el0, %0" :: "r" (value));
 }
 
@@ -768,8 +776,11 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
-	/* Initialize & Reset PMNC: C and P bits. */
-	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C);
+	/*
+	 * Initialize & Reset PMNC. Request overflow on 64 bit but
+	 * cheat in armv8pmu_write_counter().
+	 */
+	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C | ARMV8_PMCR_LC);
 }
 
 static int armv8_pmuv3_map_event(struct perf_event *event)
--
1.9.1
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1
2016-01-28 14:55 UTC
From: Jan Glauber
To: linux-arm-kernel

ARMv8.1 increases the PMU event number space. Detect the presence of
this PMUv3 type and extend the event mask.

The event mask is moved to struct arm_pmu so different event masks
can exist, depending on the PMU type.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 33 +++++++++++++++++++--------------
 drivers/perf/arm_pmu.c         |  5 +++--
 include/linux/perf/arm_pmu.h   |  4 ++--
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 5e4275e..78b24cb 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -419,7 +419,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define ARMV8_EVTYPE_MASK	0xc80003ff	/* Mask for writable bits */
+#define ARMV8_EVTYPE_FLT_MASK	0xc8000000	/* Writable filter bits */
 #define ARMV8_EVTYPE_EVENT	0x3ff		/* Mask for EVENT bits */
 
 /*
@@ -510,10 +510,8 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
 
 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
-	if (armv8pmu_select_counter(idx) == idx) {
-		val &= ARMV8_EVTYPE_MASK;
+	if (armv8pmu_select_counter(idx) == idx)
 		asm volatile("msr pmxevtyper_el0, %0" :: "r" (val));
-	}
 }
 
 static inline int armv8pmu_enable_counter(int idx)
@@ -570,6 +568,7 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
 	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
 	int idx = hwc->idx;
+	u32 val;
 
 	/*
 	 * Enable counter and interrupt, and set the counter to count
@@ -585,7 +584,8 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	/*
 	 * Set event (if destined for PMNx counters).
 	 */
-	armv8pmu_write_evtype(idx, hwc->config_base);
+	val = hwc->config_base & (ARMV8_EVTYPE_FLT_MASK | cpu_pmu->event_mask);
+	armv8pmu_write_evtype(idx, val);
 
 	/*
 	 * Enable interrupt for this counter
@@ -716,7 +716,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
 	int idx;
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
-	unsigned long evtype = hwc->config_base & ARMV8_EVTYPE_EVENT;
+	unsigned long evtype = hwc->config_base & cpu_pmu->event_mask;
 
 	/* Always place a cycle counter into the cycle counter. */
 	if (evtype == ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES) {
@@ -786,29 +786,25 @@ static void armv8pmu_reset(void *info)
 static int armv8_pmuv3_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_pmuv3_perf_map,
-				&armv8_pmuv3_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_pmuv3_perf_cache_map);
 }
 
 static int armv8_a53_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_a53_perf_map,
-				&armv8_a53_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_a53_perf_cache_map);
 }
 
 static int armv8_a57_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_a57_perf_map,
-				&armv8_a57_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_a57_perf_cache_map);
 }
 
 static int armv8_thunder_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_thunder_perf_map,
-				&armv8_thunder_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_thunder_perf_cache_map);
 }
 
 static void armv8pmu_read_num_pmnc_events(void *info)
@@ -831,6 +827,8 @@ static int armv8pmu_probe_num_events(struct arm_pmu *arm_pmu)
 
 static void armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
+	u64 id;
+
 	cpu_pmu->handle_irq		= armv8pmu_handle_irq,
 	cpu_pmu->enable			= armv8pmu_enable_event,
 	cpu_pmu->disable		= armv8pmu_disable_event,
@@ -842,6 +840,13 @@ static void armv8_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->reset			= armv8pmu_reset,
 	cpu_pmu->max_period		= (1LLU << 32) - 1,
 	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
+
+	/* detect ARMv8.1 PMUv3 with extended event mask */
+	id = read_cpuid(ID_AA64DFR0_EL1);
+	if (((id >> 8) & 0xf) == 4)
+		cpu_pmu->event_mask = 0xffff;	/* ARMv8.1 extended events */
+	else
+		cpu_pmu->event_mask = ARMV8_EVTYPE_EVENT;
 }
 
 static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 166637f..79e681f 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -79,9 +79,10 @@ armpmu_map_event(struct perf_event *event,
 		 const unsigned (*cache_map)
 				[PERF_COUNT_HW_CACHE_MAX]
 				[PERF_COUNT_HW_CACHE_OP_MAX]
-				[PERF_COUNT_HW_CACHE_RESULT_MAX],
-		 u32 raw_event_mask)
+				[PERF_COUNT_HW_CACHE_RESULT_MAX])
 {
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	u32 raw_event_mask = armpmu->event_mask;
 	u64 config = event->attr.config;
 	int type = event->attr.type;
 
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 83b5e34..9a4c3a9 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,7 @@ struct arm_pmu {
 	void		(*free_irq)(struct arm_pmu *);
 	int		(*map_event)(struct perf_event *event);
 	int		num_events;
+	int		event_mask;
 	atomic_t	active_events;
 	struct mutex	reserve_mutex;
 	u64		max_period;
@@ -119,8 +120,7 @@ int armpmu_map_event(struct perf_event *event,
 		     const unsigned (*event_map)[PERF_COUNT_HW_MAX],
 		     const unsigned (*cache_map)[PERF_COUNT_HW_CACHE_MAX]
						[PERF_COUNT_HW_CACHE_OP_MAX]
-						[PERF_COUNT_HW_CACHE_RESULT_MAX],
-		     u32 raw_event_mask);
+						[PERF_COUNT_HW_CACHE_RESULT_MAX]);
 
 struct pmu_probe_info {
 	unsigned int cpuid;
--
1.9.1
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 2016-01-28 14:55 ` [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber @ 2016-01-28 16:14 ` kbuild test robot 2016-01-28 17:33 ` kbuild test robot 2016-01-29 8:29 ` Jan Glauber 2 siblings, 0 replies; 11+ messages in thread From: kbuild test robot @ 2016-01-28 16:14 UTC (permalink / raw) To: linux-arm-kernel Hi Jan, [auto build test ERROR on robh/for-next] [also build test ERROR on v4.5-rc1 next-20160128] [cannot apply to tip/perf/core] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Jan-Glauber/Cavium-ThunderX-PMU-support/20160128-225855 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux for-next config: arm-imx_v6_v7_defconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm All errors (new ones prefixed by >>): arch/arm/kernel/perf_event_v6.c: In function 'armv6_map_event': >> arch/arm/kernel/perf_event_v6.c:483:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv6_perf_map, ^ In file included from arch/arm/kernel/perf_event_v6.c:39:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v6.c: In function 'armv6mpcore_map_event': arch/arm/kernel/perf_event_v6.c:533:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv6mpcore_perf_map, ^ In file included from arch/arm/kernel/perf_event_v6.c:39:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ -- arch/arm/kernel/perf_event_v7.c: In function 'armv7_a8_map_event': >> arch/arm/kernel/perf_event_v7.c:1111:9: error: too many 
arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv7_a8_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'armv7_a9_map_event': arch/arm/kernel/perf_event_v7.c:1117:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv7_a9_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'armv7_a5_map_event': arch/arm/kernel/perf_event_v7.c:1123:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv7_a5_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'armv7_a15_map_event': arch/arm/kernel/perf_event_v7.c:1129:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv7_a15_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'armv7_a7_map_event': arch/arm/kernel/perf_event_v7.c:1135:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &armv7_a7_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'armv7_a12_map_event': arch/arm/kernel/perf_event_v7.c:1141:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, 
&armv7_a12_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'krait_map_event': arch/arm/kernel/perf_event_v7.c:1147:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &krait_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'krait_map_event_no_branch': arch/arm/kernel/perf_event_v7.c:1153:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &krait_perf_map_no_branch, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ arch/arm/kernel/perf_event_v7.c: In function 'scorpion_map_event': arch/arm/kernel/perf_event_v7.c:1159:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &scorpion_perf_map, ^ In file included from arch/arm/kernel/perf_event_v7.c:28:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ vim +/armpmu_map_event +483 arch/arm/kernel/perf_event_v6.c 43eab878 Will Deacon 2010-11-13 477 armv6_pmcr_write(val); 0f78d2d5 Mark Rutland 2011-04-28 478 raw_spin_unlock_irqrestore(&events->pmu_lock, flags); 43eab878 Will Deacon 2010-11-13 479 } 43eab878 Will Deacon 2010-11-13 480 e1f431b5 Mark Rutland 2011-04-28 481 static int armv6_map_event(struct perf_event *event) e1f431b5 Mark Rutland 2011-04-28 482 { 6dbc0029 Will Deacon 2012-07-29 @483 return armpmu_map_event(event, &armv6_perf_map, e1f431b5 Mark Rutland 2011-04-28 484 &armv6_perf_cache_map, 0xFF); e1f431b5 Mark Rutland 2011-04-28 485 } e1f431b5 Mark Rutland 2011-04-28 486 :::::: The code 
at line 483 was first introduced by commit :::::: 6dbc00297095122ea89e016ce6affad0b7c0ddac ARM: perf: prepare for moving CPU PMU code into separate file :::::: TO: Will Deacon <will.deacon@arm.com> :::::: CC: Will Deacon <will.deacon@arm.com> --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/octet-stream Size: 29108 bytes Desc: not available URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160129/982f6557/attachment-0001.obj> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 2016-01-28 14:55 ` [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber 2016-01-28 16:14 ` kbuild test robot @ 2016-01-28 17:33 ` kbuild test robot 2016-01-29 8:27 ` Jan Glauber 2016-01-29 8:29 ` Jan Glauber 2 siblings, 1 reply; 11+ messages in thread From: kbuild test robot @ 2016-01-28 17:33 UTC (permalink / raw) To: linux-arm-kernel Hi Jan, [auto build test ERROR on robh/for-next] [also build test ERROR on v4.5-rc1 next-20160128] [cannot apply to tip/perf/core] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url: https://github.com/0day-ci/linux/commits/Jan-Glauber/Cavium-ThunderX-PMU-support/20160128-225855 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux for-next config: arm-corgi_defconfig (attached as .config) reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=arm All errors (new ones prefixed by >>): arch/arm/kernel/perf_event_xscale.c: In function 'xscale_map_event': >> arch/arm/kernel/perf_event_xscale.c:360:9: error: too many arguments to function 'armpmu_map_event' return armpmu_map_event(event, &xscale_perf_map, ^ In file included from arch/arm/kernel/perf_event_xscale.c:21:0: include/linux/perf/arm_pmu.h:119:5: note: declared here int armpmu_map_event(struct perf_event *event, ^ vim +/armpmu_map_event +360 arch/arm/kernel/perf_event_xscale.c 43eab878 Will Deacon 2010-11-13 354 break; 43eab878 Will Deacon 2010-11-13 355 } 43eab878 Will Deacon 2010-11-13 356 } 43eab878 Will Deacon 2010-11-13 357 e1f431b5 Mark Rutland 2011-04-28 358 static int xscale_map_event(struct perf_event *event) e1f431b5 Mark Rutland 2011-04-28 359 { 6dbc0029 Will Deacon 2012-07-29 @360 return armpmu_map_event(event, &xscale_perf_map, e1f431b5 Mark Rutland 2011-04-28 361 
&xscale_perf_cache_map, 0xFF); e1f431b5 Mark Rutland 2011-04-28 362 } e1f431b5 Mark Rutland 2011-04-28 363 :::::: The code at line 360 was first introduced by commit :::::: 6dbc00297095122ea89e016ce6affad0b7c0ddac ARM: perf: prepare for moving CPU PMU code into separate file :::::: TO: Will Deacon <will.deacon@arm.com> :::::: CC: Will Deacon <will.deacon@arm.com> --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -------------- next part -------------- A non-text attachment was scrubbed... Name: .config.gz Type: application/octet-stream Size: 18550 bytes Desc: not available URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160129/db8caebe/attachment-0001.obj> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 2016-01-28 17:33 ` kbuild test robot @ 2016-01-29 8:27 ` Jan Glauber 0 siblings, 0 replies; 11+ messages in thread From: Jan Glauber @ 2016-01-29 8:27 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 29, 2016 at 01:33:35AM +0800, kbuild test robot wrote: > Hi Jan, > > [auto build test ERROR on robh/for-next] > [also build test ERROR on v4.5-rc1 next-20160128] > [cannot apply to tip/perf/core] > [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] > > url: https://github.com/0day-ci/linux/commits/Jan-Glauber/Cavium-ThunderX-PMU-support/20160128-225855 > base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux for-next > config: arm-corgi_defconfig (attached as .config) > reproduce: > wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > make.cross ARCH=arm > > All errors (new ones prefixed by >>): > > arch/arm/kernel/perf_event_xscale.c: In function 'xscale_map_event': > >> arch/arm/kernel/perf_event_xscale.c:360:9: error: too many arguments to function 'armpmu_map_event' > return armpmu_map_event(event, &xscale_perf_map, > ^ I forgot the arm32 parts of this patch. I'll resend this patch only, the other patches don't touch arm32. --Jan ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 2016-01-28 14:55 ` [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber 2016-01-28 16:14 ` kbuild test robot 2016-01-28 17:33 ` kbuild test robot @ 2016-01-29 8:29 ` Jan Glauber 2016-01-29 17:01 ` David Daney 2 siblings, 1 reply; 11+ messages in thread From: Jan Glauber @ 2016-01-29 8:29 UTC (permalink / raw) To: linux-arm-kernel ARMv8.1 increases the PMU event number space. Detect the presence of this PMUv3 type and extend the event mask. The event mask is moved to struct arm_pmu so different event masks can exist, depending on the PMU type. Signed-off-by: Jan Glauber <jglauber@cavium.com> --- arch/arm/kernel/perf_event_v6.c | 6 ++++-- arch/arm/kernel/perf_event_v7.c | 29 +++++++++++++++++++---------- arch/arm/kernel/perf_event_xscale.c | 4 +++- arch/arm64/kernel/perf_event.c | 33 +++++++++++++++++++-------------- drivers/perf/arm_pmu.c | 5 +++-- include/linux/perf/arm_pmu.h | 4 ++-- 6 files changed, 50 insertions(+), 31 deletions(-) diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c index 09413e7..d6769f5 100644 --- a/arch/arm/kernel/perf_event_v6.c +++ b/arch/arm/kernel/perf_event_v6.c @@ -481,7 +481,7 @@ static void armv6mpcore_pmu_disable_event(struct perf_event *event) static int armv6_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv6_perf_map, - &armv6_perf_cache_map, 0xFF); + &armv6_perf_cache_map); } static void armv6pmu_init(struct arm_pmu *cpu_pmu) @@ -494,6 +494,7 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->get_event_idx = armv6pmu_get_event_idx; cpu_pmu->start = armv6pmu_start; cpu_pmu->stop = armv6pmu_stop; + cpu_pmu->event_mask = 0xFF; cpu_pmu->map_event = armv6_map_event; cpu_pmu->num_events = 3; cpu_pmu->max_period = (1LLU << 32) - 1; @@ -531,7 +532,7 @@ static int armv6_1176_pmu_init(struct arm_pmu *cpu_pmu) static int armv6mpcore_map_event(struct perf_event *event) { return armpmu_map_event(event, 
&armv6mpcore_perf_map, - &armv6mpcore_perf_cache_map, 0xFF); + &armv6mpcore_perf_cache_map); } static int armv6mpcore_pmu_init(struct arm_pmu *cpu_pmu) @@ -545,6 +546,7 @@ static int armv6mpcore_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->get_event_idx = armv6pmu_get_event_idx; cpu_pmu->start = armv6pmu_start; cpu_pmu->stop = armv6pmu_stop; + cpu_pmu->event_mask = 0xFF; cpu_pmu->map_event = armv6mpcore_map_event; cpu_pmu->num_events = 3; cpu_pmu->max_period = (1LLU << 32) - 1; diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index 4152158..8aab098 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -1042,7 +1042,7 @@ static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc, int idx; struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; - unsigned long evtype = hwc->config_base & ARMV7_EVTYPE_EVENT; + unsigned long evtype = hwc->config_base & cpu_pmu->event_mask; /* Always place a cycle counter into the cycle counter. 
*/ if (evtype == ARMV7_PERFCTR_CPU_CYCLES) { @@ -1109,55 +1109,55 @@ static void armv7pmu_reset(void *info) static int armv7_a8_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a8_perf_map, - &armv7_a8_perf_cache_map, 0xFF); + &armv7_a8_perf_cache_map); } static int armv7_a9_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a9_perf_map, - &armv7_a9_perf_cache_map, 0xFF); + &armv7_a9_perf_cache_map); } static int armv7_a5_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a5_perf_map, - &armv7_a5_perf_cache_map, 0xFF); + &armv7_a5_perf_cache_map); } static int armv7_a15_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a15_perf_map, - &armv7_a15_perf_cache_map, 0xFF); + &armv7_a15_perf_cache_map); } static int armv7_a7_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a7_perf_map, - &armv7_a7_perf_cache_map, 0xFF); + &armv7_a7_perf_cache_map); } static int armv7_a12_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv7_a12_perf_map, - &armv7_a12_perf_cache_map, 0xFF); + &armv7_a12_perf_cache_map); } static int krait_map_event(struct perf_event *event) { return armpmu_map_event(event, &krait_perf_map, - &krait_perf_cache_map, 0xFFFFF); + &krait_perf_cache_map); } static int krait_map_event_no_branch(struct perf_event *event) { return armpmu_map_event(event, &krait_perf_map_no_branch, - &krait_perf_cache_map, 0xFFFFF); + &krait_perf_cache_map); } static int scorpion_map_event(struct perf_event *event) { return armpmu_map_event(event, &scorpion_perf_map, - &scorpion_perf_cache_map, 0xFFFFF); + &scorpion_perf_cache_map); } static void armv7pmu_init(struct arm_pmu *cpu_pmu) @@ -1196,6 +1196,7 @@ static int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a8"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a8_map_event; cpu_pmu->pmu.attr_groups = 
armv7_pmuv1_attr_groups; return armv7_probe_num_events(cpu_pmu); @@ -1205,6 +1206,7 @@ static int armv7_a9_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a9"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a9_map_event; cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups; return armv7_probe_num_events(cpu_pmu); @@ -1214,6 +1216,7 @@ static int armv7_a5_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a5"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a5_map_event; cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups; return armv7_probe_num_events(cpu_pmu); @@ -1223,6 +1226,7 @@ static int armv7_a15_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a15"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a15_map_event; cpu_pmu->set_event_filter = armv7pmu_set_event_filter; cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups; @@ -1233,6 +1237,7 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a7"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a7_map_event; cpu_pmu->set_event_filter = armv7pmu_set_event_filter; cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups; @@ -1243,6 +1248,7 @@ static int armv7_a12_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_cortex_a12"; + cpu_pmu->event_mask = ARMV7_EVTYPE_EVENT; cpu_pmu->map_event = armv7_a12_map_event; cpu_pmu->set_event_filter = armv7pmu_set_event_filter; cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups; @@ -1628,6 +1634,7 @@ static int krait_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_krait"; + cpu_pmu->event_mask = 0xFFFFF; /* Some early versions of Krait don't support PC write events */ if (of_property_read_bool(cpu_pmu->plat_device->dev.of_node, "qcom,no-pc-write")) @@ 
-1957,6 +1964,7 @@ static int scorpion_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_scorpion"; + cpu_pmu->event_mask = 0xFFFFF; cpu_pmu->map_event = scorpion_map_event; cpu_pmu->reset = scorpion_pmu_reset; cpu_pmu->enable = scorpion_pmu_enable_event; @@ -1970,6 +1978,7 @@ static int scorpion_mp_pmu_init(struct arm_pmu *cpu_pmu) { armv7pmu_init(cpu_pmu); cpu_pmu->name = "armv7_scorpion_mp"; + cpu_pmu->event_mask = 0xFFFFF; cpu_pmu->map_event = scorpion_map_event; cpu_pmu->reset = scorpion_pmu_reset; cpu_pmu->enable = scorpion_pmu_enable_event; diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c index aa0499e..8708691 100644 --- a/arch/arm/kernel/perf_event_xscale.c +++ b/arch/arm/kernel/perf_event_xscale.c @@ -358,7 +358,7 @@ static inline void xscale1pmu_write_counter(struct perf_event *event, u32 val) static int xscale_map_event(struct perf_event *event) { return armpmu_map_event(event, &xscale_perf_map, - &xscale_perf_cache_map, 0xFF); + &xscale_perf_cache_map); } static int xscale1pmu_init(struct arm_pmu *cpu_pmu) @@ -372,6 +372,7 @@ static int xscale1pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->get_event_idx = xscale1pmu_get_event_idx; cpu_pmu->start = xscale1pmu_start; cpu_pmu->stop = xscale1pmu_stop; + cpu_pmu->event_mask = 0xFF; cpu_pmu->map_event = xscale_map_event; cpu_pmu->num_events = 3; cpu_pmu->max_period = (1LLU << 32) - 1; @@ -742,6 +743,7 @@ static int xscale2pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->get_event_idx = xscale2pmu_get_event_idx; cpu_pmu->start = xscale2pmu_start; cpu_pmu->stop = xscale2pmu_stop; + cpu_pmu->event_mask = 0xFF; cpu_pmu->map_event = xscale_map_event; cpu_pmu->num_events = 5; cpu_pmu->max_period = (1LLU << 32) - 1; diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 5e4275e..78b24cb 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -419,7 +419,7 @@ static const struct attribute_group 
*armv8_pmuv3_attr_groups[] = { /* * PMXEVTYPER: Event selection reg */ -#define ARMV8_EVTYPE_MASK 0xc80003ff /* Mask for writable bits */ +#define ARMV8_EVTYPE_FLT_MASK 0xc8000000 /* Writable filter bits */ #define ARMV8_EVTYPE_EVENT 0x3ff /* Mask for EVENT bits */ /* @@ -510,10 +510,8 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value) static inline void armv8pmu_write_evtype(int idx, u32 val) { - if (armv8pmu_select_counter(idx) == idx) { - val &= ARMV8_EVTYPE_MASK; + if (armv8pmu_select_counter(idx) == idx) asm volatile("msr pmxevtyper_el0, %0" :: "r" (val)); - } } static inline int armv8pmu_enable_counter(int idx) @@ -570,6 +568,7 @@ static void armv8pmu_enable_event(struct perf_event *event) struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events); int idx = hwc->idx; + u32 val; /* * Enable counter and interrupt, and set the counter to count @@ -585,7 +584,8 @@ static void armv8pmu_enable_event(struct perf_event *event) /* * Set event (if destined for PMNx counters). */ - armv8pmu_write_evtype(idx, hwc->config_base); + val = hwc->config_base & (ARMV8_EVTYPE_FLT_MASK | cpu_pmu->event_mask); + armv8pmu_write_evtype(idx, val); /* * Enable interrupt for this counter @@ -716,7 +716,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc, int idx; struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; - unsigned long evtype = hwc->config_base & ARMV8_EVTYPE_EVENT; + unsigned long evtype = hwc->config_base & cpu_pmu->event_mask; /* Always place a cycle counter into the cycle counter. 
*/ if (evtype == ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES) { @@ -786,29 +786,25 @@ static void armv8pmu_reset(void *info) static int armv8_pmuv3_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv8_pmuv3_perf_map, - &armv8_pmuv3_perf_cache_map, - ARMV8_EVTYPE_EVENT); + &armv8_pmuv3_perf_cache_map); } static int armv8_a53_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv8_a53_perf_map, - &armv8_a53_perf_cache_map, - ARMV8_EVTYPE_EVENT); + &armv8_a53_perf_cache_map); } static int armv8_a57_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv8_a57_perf_map, - &armv8_a57_perf_cache_map, - ARMV8_EVTYPE_EVENT); + &armv8_a57_perf_cache_map); } static int armv8_thunder_map_event(struct perf_event *event) { return armpmu_map_event(event, &armv8_thunder_perf_map, - &armv8_thunder_perf_cache_map, - ARMV8_EVTYPE_EVENT); + &armv8_thunder_perf_cache_map); } static void armv8pmu_read_num_pmnc_events(void *info) @@ -831,6 +827,8 @@ static int armv8pmu_probe_num_events(struct arm_pmu *arm_pmu) static void armv8_pmu_init(struct arm_pmu *cpu_pmu) { + u64 id; + cpu_pmu->handle_irq = armv8pmu_handle_irq, cpu_pmu->enable = armv8pmu_enable_event, cpu_pmu->disable = armv8pmu_disable_event, @@ -842,6 +840,13 @@ static void armv8_pmu_init(struct arm_pmu *cpu_pmu) cpu_pmu->reset = armv8pmu_reset, cpu_pmu->max_period = (1LLU << 32) - 1, cpu_pmu->set_event_filter = armv8pmu_set_event_filter; + + /* detect ARMv8.1 PMUv3 with extended event mask */ + id = read_cpuid(ID_AA64DFR0_EL1); + if (((id >> 8) & 0xf) == 4) + cpu_pmu->event_mask = 0xffff; /* ARMv8.1 extended events */ + else + cpu_pmu->event_mask = ARMV8_EVTYPE_EVENT; } static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu) diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 166637f..79e681f 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -79,9 +79,10 @@ armpmu_map_event(struct perf_event *event, const unsigned (*cache_map) 
[PERF_COUNT_HW_CACHE_MAX] [PERF_COUNT_HW_CACHE_OP_MAX] - [PERF_COUNT_HW_CACHE_RESULT_MAX], - u32 raw_event_mask) + [PERF_COUNT_HW_CACHE_RESULT_MAX]) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + u32 raw_event_mask = armpmu->event_mask; u64 config = event->attr.config; int type = event->attr.type; diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h index 83b5e34..9a4c3a9 100644 --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -101,6 +101,7 @@ struct arm_pmu { void (*free_irq)(struct arm_pmu *); int (*map_event)(struct perf_event *event); int num_events; + int event_mask; atomic_t active_events; struct mutex reserve_mutex; u64 max_period; @@ -119,8 +120,7 @@ int armpmu_map_event(struct perf_event *event, const unsigned (*event_map)[PERF_COUNT_HW_MAX], const unsigned (*cache_map)[PERF_COUNT_HW_CACHE_MAX] [PERF_COUNT_HW_CACHE_OP_MAX] - [PERF_COUNT_HW_CACHE_RESULT_MAX], - u32 raw_event_mask); + [PERF_COUNT_HW_CACHE_RESULT_MAX]); struct pmu_probe_info { unsigned int cpuid; -- 1.9.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 2016-01-29 8:29 ` Jan Glauber @ 2016-01-29 17:01 ` David Daney 0 siblings, 0 replies; 11+ messages in thread From: David Daney @ 2016-01-29 17:01 UTC (permalink / raw) To: linux-arm-kernel Jan, There was already a "[PATCH v2 5/5]" that differs from this one. Perhaps you should resend the entire patch set and mark it v3. Thanks, David Daney On 01/29/2016 12:29 AM, Jan Glauber wrote: > ARMv8.1 increases the PMU event number space. Detect the > presence of this PMUv3 type and extend the event mask. > > The event mask is moved to struct arm_pmu so different event masks > can exist, depending on the PMU type. > > Signed-off-by: Jan Glauber <jglauber@cavium.com> > --- > arch/arm/kernel/perf_event_v6.c | 6 ++++-- > arch/arm/kernel/perf_event_v7.c | 29 +++++++++++++++++++---------- > arch/arm/kernel/perf_event_xscale.c | 4 +++- > arch/arm64/kernel/perf_event.c | 33 +++++++++++++++++++-------------- > drivers/perf/arm_pmu.c | 5 +++-- > include/linux/perf/arm_pmu.h | 4 ++-- > 6 files changed, 50 insertions(+), 31 deletions(-) > > diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c > index 09413e7..d6769f5 100644 ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-01-29 17:01 UTC | newest] Thread overview: 11+ messages -- 2016-01-28 14:54 [PATCH v2 0/5] Cavium ThunderX PMU support Jan Glauber 2016-01-28 14:54 ` [PATCH v2 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber 2016-01-28 14:55 ` [PATCH v2 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber 2016-01-28 14:55 ` [PATCH v2 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber 2016-01-28 14:55 ` [PATCH v2 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber 2016-01-28 14:55 ` [PATCH v2 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber 2016-01-28 16:14 ` kbuild test robot 2016-01-28 17:33 ` kbuild test robot 2016-01-29 8:27 ` Jan Glauber 2016-01-29 8:29 ` Jan Glauber 2016-01-29 17:01 ` David Daney