linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/5] Cavium ThunderX PMU support
@ 2016-01-14 12:55 Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Hi Will & Mark,

Please have a look at these patches for perf PMU support. The Cavium
PMU stuff should be pretty generic, but the long cycle counter bit
changes all ARMv8 PMUs and needs careful review.

Thanks,
Jan

Jan Glauber (5):
  arm64/perf: Rename Cortex A57 events
  arm64/perf: Add Cavium ThunderX PMU support
  arm64: dts: Add Cavium ThunderX specific PMU
  arm64/perf: Enable PMCR long cycle counter bit
  arm64/perf: Extend event mask for ARMv8.1

 Documentation/devicetree/bindings/arm/pmu.txt |   1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  |   5 +
 arch/arm64/kernel/perf_event.c                | 145 ++++++++++++++++++++------
 drivers/perf/arm_pmu.c                        |   5 +-
 include/linux/perf/arm_pmu.h                  |   4 +-
 5 files changed, 125 insertions(+), 35 deletions(-)

-- 
1.9.1


* [RFC PATCH 1/5] arm64/perf: Rename Cortex A57 events
  2016-01-14 12:55 [RFC PATCH 0/5] Cavium ThunderX PMU support Jan Glauber
@ 2016-01-14 12:55 ` Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

The implemented Cortex-A57 events are not A57-specific.
They are recommended by ARM and can be found on other
ARMv8 SoCs like Cavium ThunderX too. Therefore move
these events to the common PMUv3 table.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index f7ab14c..32fe656 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -87,17 +87,17 @@
 #define ARMV8_PMUV3_PERFCTR_L2D_TLB				0x2F
 #define ARMV8_PMUV3_PERFCTR_L21_TLB				0x30
 
+/* Recommended events. */
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD			0x40
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST			0x41
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD			0x42
+#define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST			0x43
+#define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD			0x4C
+#define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST			0x4D
+
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREFETCH_LINEFILL			0xC2
 
-/* ARMv8 Cortex-A57 and Cortex-A72 specific event types. */
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD			0x40
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST			0x41
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD			0x42
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST			0x43
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_LD			0x4c
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_ST			0x4d
-
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
 	PERF_MAP_ALL_UNSUPPORTED,
@@ -174,16 +174,16 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 					      [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
 	PERF_CACHE_MAP_ALL_UNSUPPORTED,
 
-	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD,
-	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST,
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST,
 
 	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
 	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
 
-	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_LD,
-	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_ST,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST,
 
 	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
 
-- 
1.9.1


* [RFC PATCH 2/5] arm64/perf: Add Cavium ThunderX PMU support
  2016-01-14 12:55 [RFC PATCH 0/5] Cavium ThunderX PMU support Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
@ 2016-01-14 12:55 ` Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support PMU events on Cavium's ThunderX SoC. ThunderX supports
some additional counters compared to the default ARMv8 PMUv3:

- branch instructions counter
- stall frontend & backend counters
- L1 dcache load & store counters
- L1 icache counters
- iTLB & dTLB counters
- L1 dcache & icache prefetch counters
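
As a rough illustration of how such a cache map resolves a generic
(cache, op, result) triple to a hardware event number, here is a
user-space sketch. The event numbers are the ones used in the diff
below; the enum names, struct hw_event, map_cache_event() and the -1
return for unsupported combinations are hypothetical stand-ins, not
the kernel's actual armpmu_map_event() machinery.

```c
#include <assert.h>

/* Hypothetical stand-ins for the perf cache/op/result indices. */
enum cache_id     { CACHE_L1D, CACHE_DTLB, CACHE_MAX };
enum cache_op     { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };
enum cache_result { RESULT_ACCESS, RESULT_MISS, RESULT_MAX };

/* Zero-initialised entries have supported == 0, i.e. "unsupported". */
struct hw_event {
	int supported;
	unsigned int number;
};

#define HW(n) { 1, (n) }

/* Event numbers taken from the ThunderX cache map in the patch. */
static const struct hw_event thunderx_cache_map[CACHE_MAX][OP_MAX][RESULT_MAX] = {
	[CACHE_L1D][OP_READ][RESULT_ACCESS]      = HW(0x40),
	[CACHE_L1D][OP_READ][RESULT_MISS]        = HW(0x42),
	[CACHE_L1D][OP_WRITE][RESULT_ACCESS]     = HW(0x41),
	[CACHE_L1D][OP_WRITE][RESULT_MISS]       = HW(0xE9),
	[CACHE_L1D][OP_PREFETCH][RESULT_ACCESS]  = HW(0xEA),
	[CACHE_L1D][OP_PREFETCH][RESULT_MISS]    = HW(0xEB),

	[CACHE_DTLB][OP_READ][RESULT_ACCESS]     = HW(0x4E),
	[CACHE_DTLB][OP_READ][RESULT_MISS]       = HW(0x4C),
	[CACHE_DTLB][OP_WRITE][RESULT_ACCESS]    = HW(0x4F),
	[CACHE_DTLB][OP_WRITE][RESULT_MISS]      = HW(0x4D),
};

/* Returns the hardware event number, or -1 if the combination is unsupported. */
static int map_cache_event(unsigned int cache, unsigned int op, unsigned int result)
{
	const struct hw_event *e;

	if (cache >= CACHE_MAX || op >= OP_MAX || result >= RESULT_MAX)
		return -1;

	e = &thunderx_cache_map[cache][op][result];
	if (!e->supported)
		return -1;

	/* Every event number in this table fits the 10-bit event field. */
	assert(e->number <= 0x3ff);
	return (int)e->number;
}
```

With this table, an L1D write miss resolves to the ThunderX-specific
event 0xE9, while a DTLB prefetch lookup is reported as unsupported
because the patch defines no such mapping.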

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 69 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 32fe656..cf2cc39 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -94,10 +94,19 @@
 #define ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_ST			0x43
 #define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD			0x4C
 #define ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST			0x4D
+#define ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_LD			0x4E
+#define ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_ST			0x4F
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREFETCH_LINEFILL			0xC2
 
+/* ARMv8 Cavium ThunderX specific event types. */
+#define ARMV8_THUNDERX_PERFCTR_L1_DCACHE_MISS_ST		0xE9
+#define ARMV8_THUNDERX_PERFCTR_L1_DCACHE_PREF_ACCESS		0xEA
+#define ARMV8_THUNDERX_PERFCTR_L1_DCACHE_PREF_MISS		0xEB
+#define ARMV8_THUNDERX_PERFCTR_L1_ICACHE_PREF_ACCESS		0xEC
+#define ARMV8_THUNDERX_PERFCTR_L1_ICACHE_PREF_MISS		0xED
+
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
 	PERF_MAP_ALL_UNSUPPORTED,
@@ -131,6 +140,18 @@ static const unsigned armv8_a57_perf_map[PERF_COUNT_HW_MAX] = {
 	[PERF_COUNT_HW_BUS_CYCLES]		= ARMV8_PMUV3_PERFCTR_BUS_CYCLES,
 };
 
+static const unsigned armv8_thunderx_perf_map[PERF_COUNT_HW_MAX] = {
+	PERF_MAP_ALL_UNSUPPORTED,
+	[PERF_COUNT_HW_CPU_CYCLES]		= ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+	[PERF_COUNT_HW_CACHE_MISSES]		= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= ARMV8_PMUV3_PERFCTR_PC_WRITE,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = ARMV8_PMUV3_PERFCTR_STALL_FRONTEND,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND]	= ARMV8_PMUV3_PERFCTR_STALL_BACKEND,
+};
+
 static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 						[PERF_COUNT_HW_CACHE_OP_MAX]
 						[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
@@ -193,6 +214,36 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
 };
 
+static const unsigned armv8_thunderx_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
+						   [PERF_COUNT_HW_CACHE_OP_MAX]
+						   [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	PERF_CACHE_MAP_ALL_UNSUPPORTED,
+
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_THUNDERX_PERFCTR_L1_DCACHE_MISS_ST,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDERX_PERFCTR_L1_DCACHE_PREF_ACCESS,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDERX_PERFCTR_L1_DCACHE_PREF_MISS,
+
+	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
+	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDERX_PERFCTR_L1_ICACHE_PREF_ACCESS,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDERX_PERFCTR_L1_ICACHE_PREF_MISS,
+
+	[C(DTLB)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_LD,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_DTLB_ACCESS_ST,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_DTLB_REFILL_ST,
+
+	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
+
+	[C(BPU)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+};
+
 #define ARMV8_EVENT_ATTR_RESOLVE(m) #m
 #define ARMV8_EVENT_ATTR(name, config) \
 	PMU_EVENT_ATTR_STRING(name, armv8_event_attr_##name, \
@@ -324,7 +375,6 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 	NULL,
 };
 
-
 /*
  * Perf Events' indices
  */
@@ -743,6 +793,13 @@ static int armv8_a57_map_event(struct perf_event *event)
 				ARMV8_EVTYPE_EVENT);
 }
 
+static int armv8_thunderx_map_event(struct perf_event *event)
+{
+	return armpmu_map_event(event, &armv8_thunderx_perf_map,
+				&armv8_thunderx_perf_cache_map,
+				ARMV8_EVTYPE_EVENT);
+}
+
 static void armv8pmu_read_num_pmnc_events(void *info)
 {
 	int *nb_cnt = info;
@@ -811,11 +868,21 @@ static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 	return armv8pmu_probe_num_events(cpu_pmu);
 }
 
+static int armv8_thunderx_pmu_init(struct arm_pmu *cpu_pmu)
+{
+	armv8_pmu_init(cpu_pmu);
+	cpu_pmu->name			= "armv8_cavium_thunderx";
+	cpu_pmu->map_event		= armv8_thunderx_map_event;
+	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	return armv8pmu_probe_num_events(cpu_pmu);
+}
+
 static const struct of_device_id armv8_pmu_of_device_ids[] = {
 	{.compatible = "arm,armv8-pmuv3",	.data = armv8_pmuv3_init},
 	{.compatible = "arm,cortex-a53-pmu",	.data = armv8_a53_pmu_init},
 	{.compatible = "arm,cortex-a57-pmu",	.data = armv8_a57_pmu_init},
 	{.compatible = "arm,cortex-a72-pmu",	.data = armv8_a72_pmu_init},
+	{.compatible = "cavium,thunderx-pmu",	.data = armv8_thunderx_pmu_init},
 	{},
 };
 
-- 
1.9.1


* [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-01-14 12:55 [RFC PATCH 0/5] Cavium ThunderX PMU support Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber
@ 2016-01-14 12:55 ` Jan Glauber
  2016-01-14 14:47   ` Mark Rutland
  2016-01-14 12:55 ` [RFC PATCH 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber
  4 siblings, 1 reply; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Add a compatible string for the Cavium ThunderX PMU.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 Documentation/devicetree/bindings/arm/pmu.txt | 1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 5651883..9bd0a33 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -25,6 +25,7 @@ Required properties:
 	"qcom,scorpion-pmu"
 	"qcom,scorpion-mp-pmu"
 	"qcom,krait-pmu"
+	"cavium,thunderx-pmu"
 - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
                interrupt (PPI) then 1 interrupt should be specified.
 
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
index 9cb7cf9..84ac556 100644
--- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
@@ -360,6 +360,11 @@
 		             <1 10 0xff01>;
 	};
 
+	pmu {
+		compatible = "cavium,thunderx-pmu", "arm,armv8-pmuv3";
+		interrupts = <1 7 4>;
+	};
+
 	soc {
 		compatible = "simple-bus";
 		#address-cells = <2>;
-- 
1.9.1


* [RFC PATCH 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-01-14 12:55 [RFC PATCH 0/5] Cavium ThunderX PMU support Jan Glauber
                   ` (2 preceding siblings ...)
  2016-01-14 12:55 ` [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
@ 2016-01-14 12:55 ` Jan Glauber
  2016-01-14 12:55 ` [RFC PATCH 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber
  4 siblings, 0 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

With the long cycle counter bit (LC) disabled, the cycle counter does
not work on the ThunderX SoC (because ThunderX only implements AArch64).
Also, according to the documentation, LC == 0 is deprecated.

To keep the code simple, the patch does not introduce 64-bit wide counter
functions. Instead, writing the cycle counter always sets the upper
32 bits, so overflow interrupts are generated as before.
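
The arithmetic behind that can be checked in user space. The sketch
below is only an illustration: cycle_counter_preload() mirrors the
value64 computation in the diff, while the two ticks_until_wrap*()
helpers are hypothetical names introduced here to compare the 64-bit
and 32-bit cases.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same computation as value64 in the patch: force the upper 32 bits to
 * ones so the 64-bit counter wraps exactly when the lower 32 bits wrap.
 */
static uint64_t cycle_counter_preload(uint32_t value)
{
	return 0xffffffff00000000ULL | value;
}

/* Increments needed until a 64-bit counter starting at 'start' wraps to 0. */
static uint64_t ticks_until_wrap64(uint64_t start)
{
	/* start == 0 would need 2^64 ticks, which does not fit in 64 bits. */
	assert(start != 0);
	return 0ULL - start;	/* unsigned arithmetic, modulo 2^64 */
}

/* Increments needed until a 32-bit counter starting at 'start' wraps to 0. */
static uint64_t ticks_until_wrap32(uint32_t start)
{
	return (uint64_t)UINT32_MAX - start + 1;
}
```

For any 32-bit start value, ticks_until_wrap64(cycle_counter_preload(v))
equals ticks_until_wrap32(v), which is why the existing 32-bit period
handling keeps working with LC set.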

Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index cf2cc39..d8d5d59 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
 #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
 #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
+#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
 #define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
 #define	ARMV8_PMCR_N_MASK	0x1f
 #define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
@@ -494,9 +495,16 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
 	if (!armv8pmu_counter_valid(cpu_pmu, idx))
 		pr_err("CPU%u writing wrong counter %d\n",
 			smp_processor_id(), idx);
-	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
-		asm volatile("msr pmccntr_el0, %0" :: "r" (value));
-	else if (armv8pmu_select_counter(idx) == idx)
+	else if (idx == ARMV8_IDX_CYCLE_COUNTER) {
+		/*
+		 * Set the upper 32bits as this is a 64bit counter but we only
+		 * count using the lower 32bits and we want an interrupt when
+		 * it overflows.
+		 */
+		u64 value64 = 0xffffffff00000000ULL | value;
+
+		asm volatile("msr pmccntr_el0, %0" :: "r" (value64));
+	} else if (armv8pmu_select_counter(idx) == idx)
 		asm volatile("msr pmxevcntr_el0, %0" :: "r" (value));
 }
 
@@ -768,8 +776,11 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
-	/* Initialize & Reset PMNC: C and P bits. */
-	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C);
+	/*
+	 * Initialize & Reset PMNC. Request overflow on 64 bit but
+	 * cheat in armv8pmu_write_counter().
+	 */
+	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C | ARMV8_PMCR_LC);
 }
 
 static int armv8_pmuv3_map_event(struct perf_event *event)
-- 
1.9.1


* [RFC PATCH 5/5] arm64/perf: Extend event mask for ARMv8.1
  2016-01-14 12:55 [RFC PATCH 0/5] Cavium ThunderX PMU support Jan Glauber
                   ` (3 preceding siblings ...)
  2016-01-14 12:55 ` [RFC PATCH 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
@ 2016-01-14 12:55 ` Jan Glauber
  4 siblings, 0 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 12:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

ARMv8.1 increases the PMU event number space. Detect the
presence of this PMUv3 type and extend the event mask.

The event mask is moved to struct arm_pmu so different event masks
can exist, depending on the PMU type.
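
A user-space sketch of the detection and masking logic, for
illustration only. The constants match the diff below; the function
names pmu_event_mask() and evtype_value() are hypothetical, and the
register value is passed in as a plain integer instead of being read
with read_cpuid().

```c
#include <assert.h>
#include <stdint.h>

#define EVTYPE_FLT_MASK		0xc8000000u	/* writable filter bits */
#define EVTYPE_EVENT		0x3ffu		/* ARMv8.0 event field */
#define EVTYPE_EVENT_EXT	0xffffu		/* ARMv8.1 extended event field */

/*
 * Pick the event mask from an ID_AA64DFR0_EL1 value. Bits [11:8] hold
 * the PMU version; the patch treats the value 4 as the ARMv8.1 PMUv3.
 */
static uint32_t pmu_event_mask(uint64_t id_aa64dfr0)
{
	unsigned int pmuver = (id_aa64dfr0 >> 8) & 0xf;

	return pmuver == 4 ? EVTYPE_EVENT_EXT : EVTYPE_EVENT;
}

/* Value written to the event type register: filter bits plus event number. */
static uint32_t evtype_value(uint32_t config_base, uint32_t event_mask)
{
	/* The event field must not overlap the filter bits. */
	assert((event_mask & EVTYPE_FLT_MASK) == 0);
	return config_base & (EVTYPE_FLT_MASK | event_mask);
}
```

With the 10-bit mask, a config of 0x80004012 is truncated to
0x80000012; with the extended mask it is written unchanged.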

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 33 +++++++++++++++++++--------------
 drivers/perf/arm_pmu.c         |  5 +++--
 include/linux/perf/arm_pmu.h   |  4 ++--
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index d8d5d59..d448a75 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -419,7 +419,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define	ARMV8_EVTYPE_MASK	0xc80003ff	/* Mask for writable bits */
+#define	ARMV8_EVTYPE_FLT_MASK	0xc8000000	/* Writable filter bits */
 #define	ARMV8_EVTYPE_EVENT	0x3ff		/* Mask for EVENT bits */
 
 /*
@@ -510,10 +510,8 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
 
 static inline void armv8pmu_write_evtype(int idx, u32 val)
 {
-	if (armv8pmu_select_counter(idx) == idx) {
-		val &= ARMV8_EVTYPE_MASK;
+	if (armv8pmu_select_counter(idx) == idx)
 		asm volatile("msr pmxevtyper_el0, %0" :: "r" (val));
-	}
 }
 
 static inline int armv8pmu_enable_counter(int idx)
@@ -570,6 +568,7 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
 	struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
 	int idx = hwc->idx;
+	u32 val;
 
 	/*
 	 * Enable counter and interrupt, and set the counter to count
@@ -585,7 +584,8 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	/*
 	 * Set event (if destined for PMNx counters).
 	 */
-	armv8pmu_write_evtype(idx, hwc->config_base);
+	val = hwc->config_base & (ARMV8_EVTYPE_FLT_MASK | cpu_pmu->event_mask);
+	armv8pmu_write_evtype(idx, val);
 
 	/*
 	 * Enable interrupt for this counter
@@ -716,7 +716,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
 	int idx;
 	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
-	unsigned long evtype = hwc->config_base & ARMV8_EVTYPE_EVENT;
+	unsigned long evtype = hwc->config_base & cpu_pmu->event_mask;
 
 	/* Always place a cycle counter into the cycle counter. */
 	if (evtype == ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES) {
@@ -786,29 +786,25 @@ static void armv8pmu_reset(void *info)
 static int armv8_pmuv3_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_pmuv3_perf_map,
-				&armv8_pmuv3_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_pmuv3_perf_cache_map);
 }
 
 static int armv8_a53_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_a53_perf_map,
-				&armv8_a53_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_a53_perf_cache_map);
 }
 
 static int armv8_a57_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_a57_perf_map,
-				&armv8_a57_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_a57_perf_cache_map);
 }
 
 static int armv8_thunderx_map_event(struct perf_event *event)
 {
 	return armpmu_map_event(event, &armv8_thunderx_perf_map,
-				&armv8_thunderx_perf_cache_map,
-				ARMV8_EVTYPE_EVENT);
+				&armv8_thunderx_perf_cache_map);
 }
 
 static void armv8pmu_read_num_pmnc_events(void *info)
@@ -831,6 +827,8 @@ static int armv8pmu_probe_num_events(struct arm_pmu *arm_pmu)
 
 static void armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
+	u64 id;
+
 	cpu_pmu->handle_irq		= armv8pmu_handle_irq,
 	cpu_pmu->enable			= armv8pmu_enable_event,
 	cpu_pmu->disable		= armv8pmu_disable_event,
@@ -842,6 +840,13 @@ static void armv8_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->reset			= armv8pmu_reset,
 	cpu_pmu->max_period		= (1LLU << 32) - 1,
 	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
+
+	/* detect ARMv8.1 PMUv3 with extended event mask */
+	id = read_cpuid(ID_AA64DFR0_EL1);
+	if (((id >> 8) & 0xf) == 4)
+		cpu_pmu->event_mask = 0xffff;	/* ARMv8.1 extended events */
+	else
+		cpu_pmu->event_mask = ARMV8_EVTYPE_EVENT;
 }
 
 static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 166637f..79e681f 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -79,9 +79,10 @@ armpmu_map_event(struct perf_event *event,
 		 const unsigned (*cache_map)
 				[PERF_COUNT_HW_CACHE_MAX]
 				[PERF_COUNT_HW_CACHE_OP_MAX]
-				[PERF_COUNT_HW_CACHE_RESULT_MAX],
-		 u32 raw_event_mask)
+				[PERF_COUNT_HW_CACHE_RESULT_MAX])
 {
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	u32 raw_event_mask = armpmu->event_mask;
 	u64 config = event->attr.config;
 	int type = event->attr.type;
 
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 83b5e34..9a4c3a9 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,7 @@ struct arm_pmu {
 	void		(*free_irq)(struct arm_pmu *);
 	int		(*map_event)(struct perf_event *event);
 	int		num_events;
+	int		event_mask;
 	atomic_t	active_events;
 	struct mutex	reserve_mutex;
 	u64		max_period;
@@ -119,8 +120,7 @@ int armpmu_map_event(struct perf_event *event,
 		     const unsigned (*event_map)[PERF_COUNT_HW_MAX],
 		     const unsigned (*cache_map)[PERF_COUNT_HW_CACHE_MAX]
 						[PERF_COUNT_HW_CACHE_OP_MAX]
-						[PERF_COUNT_HW_CACHE_RESULT_MAX],
-		     u32 raw_event_mask);
+						[PERF_COUNT_HW_CACHE_RESULT_MAX]);
 
 struct pmu_probe_info {
 	unsigned int cpuid;
-- 
1.9.1


* Re: [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-01-14 12:55 ` [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
@ 2016-01-14 14:47   ` Mark Rutland
  2016-01-14 16:06     ` Jan Glauber
  0 siblings, 1 reply; 8+ messages in thread
From: Mark Rutland @ 2016-01-14 14:47 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel, Jan Glauber

Hi,

As it's the middle of the merge window, it will be a while before this
sees a full review. In future, it would be better to wait until -rc1
before posting new patches (appropriately rebased).

I did spot one thing however.

On Thu, Jan 14, 2016 at 01:55:43PM +0100, Jan Glauber wrote:
> Add a compatible string for the Cavium ThunderX PMU.
> 
> Signed-off-by: Jan Glauber <jglauber@cavium.com>
> ---
>  Documentation/devicetree/bindings/arm/pmu.txt | 1 +
>  arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
> index 5651883..9bd0a33 100644
> --- a/Documentation/devicetree/bindings/arm/pmu.txt
> +++ b/Documentation/devicetree/bindings/arm/pmu.txt
> @@ -25,6 +25,7 @@ Required properties:
>  	"qcom,scorpion-pmu"
>  	"qcom,scorpion-mp-pmu"
>  	"qcom,krait-pmu"
> +	"cavium,thunderx-pmu"
>  - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
>                 interrupt (PPI) then 1 interrupt should be specified.
>  
> diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> index 9cb7cf9..84ac556 100644
> --- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> +++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> @@ -360,6 +360,11 @@
>  		             <1 10 0xff01>;
>  	};
>  
> +	pmu {
> +		compatible = "cavium,thunderx-pmu", "arm,armv8-pmuv3";

In current dts, "cavium,thunder" is used as the CPU compatible string.

Please decide whether you want to call the CPU "Thunder", or
"Thunder-X", and ensure that all compatible strings are consistent.

Please also ensure that any related names exposed to userspace (e.g.
the PMU name) follow this consistent naming.

Thanks,
Mark.

> +		interrupts = <1 7 4>;
> +	};
> +
>  	soc {
>  		compatible = "simple-bus";
>  		#address-cells = <2>;
> -- 
> 1.9.1
> 


* Re: [RFC PATCH 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-01-14 14:47   ` Mark Rutland
@ 2016-01-14 16:06     ` Jan Glauber
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Glauber @ 2016-01-14 16:06 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Jan Glauber, Will Deacon, linux-kernel, linux-arm-kernel

On Thu, Jan 14, 2016 at 02:47:12PM +0000, Mark Rutland wrote:
> Hi,
> 
> As it's the middle of the merge window, it will be a while before this
> sees a full review. In future, it would be better to wait until -rc1
> before posting new patches (appropriately rebased).

OK, fair enough. I'll repost after -rc1 if the patches don't fit
anymore.

> I did spot one thing however.
> 
> On Thu, Jan 14, 2016 at 01:55:43PM +0100, Jan Glauber wrote:
> > Add a compatible string for the Cavium ThunderX PMU.
> > 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  Documentation/devicetree/bindings/arm/pmu.txt | 1 +
> >  arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
> >  2 files changed, 6 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
> > index 5651883..9bd0a33 100644
> > --- a/Documentation/devicetree/bindings/arm/pmu.txt
> > +++ b/Documentation/devicetree/bindings/arm/pmu.txt
> > @@ -25,6 +25,7 @@ Required properties:
> >  	"qcom,scorpion-pmu"
> >  	"qcom,scorpion-mp-pmu"
> >  	"qcom,krait-pmu"
> > +	"cavium,thunderx-pmu"
> >  - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
> >                 interrupt (PPI) then 1 interrupt should be specified.
> >  
> > diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > index 9cb7cf9..84ac556 100644
> > --- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > +++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > @@ -360,6 +360,11 @@
> >  		             <1 10 0xff01>;
> >  	};
> >  
> > +	pmu {
> > +		compatible = "cavium,thunderx-pmu", "arm,armv8-pmuv3";
> 
> In current dts, "cavium,thunder" is used as the CPU compatible string.
> 
> Please decide whether you want to call the CPU "Thunder", or
> "Thunder-X", and ensure that all compatible strings are consistent.

I think we should keep the already existing name then.

thanks,
Jan

