linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/5] Cavium ThunderX PMU support
@ 2016-02-18 16:50 Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This should address all comments. With the simplified event mask,
arm is no longer touched and the cpuid check is gone as well.

Changes to v3:
- renamed A57 events to IMPDEF
- changed comment about 64 bit cycle counter overflow
- unconditionally increase event mask

Changes to v2:
- fixed arm compile errors

Changes to v1:
- renamed thunderx dt pmu binding to thunder

Jan

--------------------------------------------------------

Jan Glauber (5):
  arm64/perf: Rename Cortex A57 events
  arm64/perf: Add Cavium ThunderX PMU support
  arm64: dts: Add Cavium ThunderX specific PMU
  arm64/perf: Enable PMCR long cycle counter bit
  arm64/perf: Extend event mask for ARMv8.1

 Documentation/devicetree/bindings/arm/pmu.txt |   1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  |   5 ++
 arch/arm64/kernel/perf_event.c                | 120 +++++++++++++++++++++-----
 3 files changed, 105 insertions(+), 21 deletions(-)

-- 
1.9.1


* [PATCH v4 1/5] arm64/perf: Rename Cortex A57 events
  2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
@ 2016-02-18 16:50 ` Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

The implemented Cortex A57 events are, strictly speaking, not
A57 specific. They are ARM-recommended implementation-defined events
and can also be found on other ARMv8 SoCs such as Cavium ThunderX.

Therefore rename these events to allow using them in other
implementations too.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index f7ab14c..2adbcb5 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -90,13 +90,13 @@
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREFETCH_LINEFILL			0xC2
 
-/* ARMv8 Cortex-A57 and Cortex-A72 specific event types. */
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD			0x40
-#define ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST			0x41
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD			0x42
-#define ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST			0x43
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_LD			0x4c
-#define ARMV8_A57_PERFCTR_DTLB_REFILL_ST			0x4d
+/* ARMv8 implementation defined event types. */
+#define ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_LD		0x40
+#define ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_ST		0x41
+#define ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_LD		0x42
+#define ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_ST		0x43
+#define ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_LD			0x4c
+#define ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_ST			0x4d
 
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
@@ -174,16 +174,16 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 					      [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
 	PERF_CACHE_MAP_ALL_UNSUPPORTED,
 
-	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_LD,
-	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_LD,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_ACCESS_ST,
-	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_L1_DCACHE_REFILL_ST,
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_ST,
 
 	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
 	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
 
-	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_LD,
-	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_A57_PERFCTR_DTLB_REFILL_ST,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_ST,
 
 	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
 
-- 
1.9.1


* [PATCH v4 2/5] arm64/perf: Add Cavium ThunderX PMU support
  2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
@ 2016-02-18 16:50 ` Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

Support PMU events on Cavium's ThunderX SoC. ThunderX supports
some additional counters compared to the default ARMv8 PMUv3:

- branch instructions counter
- stall frontend & backend counters
- L1 dcache load & store counters
- L1 icache counters
- iTLB & dTLB counters
- L1 dcache & icache prefetch counters

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 69 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 2adbcb5..0ed05f6 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -97,6 +97,15 @@
 #define ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_ST		0x43
 #define ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_LD			0x4c
 #define ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_ST			0x4d
+#define ARMV8_IMPDEF_PERFCTR_DTLB_ACCESS_LD			0x4e
+#define ARMV8_IMPDEF_PERFCTR_DTLB_ACCESS_ST			0x4f
+
+/* ARMv8 Cavium ThunderX specific event types. */
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_MISS_ST			0xE9
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_ACCESS		0xea
+#define ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_MISS		0xeb
+#define ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_ACCESS		0xec
+#define ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_MISS		0xed
 
 /* PMUv3 HW events mapping. */
 static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
@@ -131,6 +140,18 @@ static const unsigned armv8_a57_perf_map[PERF_COUNT_HW_MAX] = {
 	[PERF_COUNT_HW_BUS_CYCLES]		= ARMV8_PMUV3_PERFCTR_BUS_CYCLES,
 };
 
+static const unsigned armv8_thunder_perf_map[PERF_COUNT_HW_MAX] = {
+	PERF_MAP_ALL_UNSUPPORTED,
+	[PERF_COUNT_HW_CPU_CYCLES]		= ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+	[PERF_COUNT_HW_CACHE_MISSES]		= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= ARMV8_PMUV3_PERFCTR_PC_WRITE,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = ARMV8_PMUV3_PERFCTR_STALL_FRONTEND,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND]	= ARMV8_PMUV3_PERFCTR_STALL_BACKEND,
+};
+
 static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 						[PERF_COUNT_HW_CACHE_OP_MAX]
 						[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
@@ -193,6 +214,36 @@ static const unsigned armv8_a57_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
 	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
 };
 
+static const unsigned armv8_thunder_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
+						   [PERF_COUNT_HW_CACHE_OP_MAX]
+						   [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	PERF_CACHE_MAP_ALL_UNSUPPORTED,
+
+	[C(L1D)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_LD,
+	[C(L1D)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_REFILL_LD,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_IMPDEF_PERFCTR_L1_DCACHE_ACCESS_ST,
+	[C(L1D)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_THUNDER_PERFCTR_L1_DCACHE_MISS_ST,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_ACCESS,
+	[C(L1D)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDER_PERFCTR_L1_DCACHE_PREF_MISS,
+
+	[C(L1I)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS,
+	[C(L1I)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_ACCESS)] = ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_ACCESS,
+	[C(L1I)][C(OP_PREFETCH)][C(RESULT_MISS)] = ARMV8_THUNDER_PERFCTR_L1_ICACHE_PREF_MISS,
+
+	[C(DTLB)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_IMPDEF_PERFCTR_DTLB_ACCESS_LD,
+	[C(DTLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_LD,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_ACCESS)] = ARMV8_IMPDEF_PERFCTR_DTLB_ACCESS_ST,
+	[C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_IMPDEF_PERFCTR_DTLB_REFILL_ST,
+
+	[C(ITLB)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_ITLB_REFILL,
+
+	[C(BPU)][C(OP_READ)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_READ)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+	[C(BPU)][C(OP_WRITE)][C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+};
+
 #define ARMV8_EVENT_ATTR_RESOLVE(m) #m
 #define ARMV8_EVENT_ATTR(name, config) \
 	PMU_EVENT_ATTR_STRING(name, armv8_event_attr_##name, \
@@ -324,7 +375,6 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 	NULL,
 };
 
-
 /*
  * Perf Events' indices
  */
@@ -743,6 +793,13 @@ static int armv8_a57_map_event(struct perf_event *event)
 				ARMV8_EVTYPE_EVENT);
 }
 
+static int armv8_thunder_map_event(struct perf_event *event)
+{
+	return armpmu_map_event(event, &armv8_thunder_perf_map,
+				&armv8_thunder_perf_cache_map,
+				ARMV8_EVTYPE_EVENT);
+}
+
 static void armv8pmu_read_num_pmnc_events(void *info)
 {
 	int *nb_cnt = info;
@@ -811,11 +868,21 @@ static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 	return armv8pmu_probe_num_events(cpu_pmu);
 }
 
+static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
+{
+	armv8_pmu_init(cpu_pmu);
+	cpu_pmu->name			= "armv8_cavium_thunder";
+	cpu_pmu->map_event		= armv8_thunder_map_event;
+	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	return armv8pmu_probe_num_events(cpu_pmu);
+}
+
 static const struct of_device_id armv8_pmu_of_device_ids[] = {
 	{.compatible = "arm,armv8-pmuv3",	.data = armv8_pmuv3_init},
 	{.compatible = "arm,cortex-a53-pmu",	.data = armv8_a53_pmu_init},
 	{.compatible = "arm,cortex-a57-pmu",	.data = armv8_a57_pmu_init},
 	{.compatible = "arm,cortex-a72-pmu",	.data = armv8_a72_pmu_init},
+	{.compatible = "cavium,thunder-pmu",	.data = armv8_thunder_pmu_init},
 	{},
 };
 
-- 
1.9.1


* [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber
@ 2016-02-18 16:50 ` Jan Glauber
  2016-02-18 17:32   ` Will Deacon
  2016-02-18 16:50 ` [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
  2016-02-18 16:50 ` [PATCH v4 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber
  4 siblings, 1 reply; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

Add a compatible string for the Cavium ThunderX PMU.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 Documentation/devicetree/bindings/arm/pmu.txt | 1 +
 arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 5651883..d3999a1 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -25,6 +25,7 @@ Required properties:
 	"qcom,scorpion-pmu"
 	"qcom,scorpion-mp-pmu"
 	"qcom,krait-pmu"
+	"cavium,thunder-pmu"
 - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
                interrupt (PPI) then 1 interrupt should be specified.
 
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
index 9cb7cf9..2eb9b22 100644
--- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
@@ -360,6 +360,11 @@
 		             <1 10 0xff01>;
 	};
 
+	pmu {
+		compatible = "cavium,thunder-pmu", "arm,armv8-pmuv3";
+		interrupts = <1 7 4>;
+	};
+
 	soc {
 		compatible = "simple-bus";
 		#address-cells = <2>;
-- 
1.9.1


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
                   ` (2 preceding siblings ...)
  2016-02-18 16:50 ` [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
@ 2016-02-18 16:50 ` Jan Glauber
  2016-02-18 17:34   ` Will Deacon
  2016-02-29 15:39   ` Will Deacon
  2016-02-18 16:50 ` [PATCH v4 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber
  4 siblings, 2 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

With the long cycle counter bit (LC) disabled, the cycle counter does
not work on the ThunderX SoC (ThunderX only implements AArch64).
Also, according to the documentation, LC == 0 is deprecated.

To keep the code simple, the patch does not introduce 64-bit wide counter
functions. Instead, writing the cycle counter always sets the upper
32 bits, so overflow interrupts are generated as before.
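
The effect of forcing the upper 32 bits can be sketched in standalone C,
outside the kernel. This is only an illustration under stated assumptions:
the register is modelled as a plain uint64_t, and the names
CYCLE_PRELOAD_HI, cycle_counter_preload() and increments_until_overflow()
are hypothetical, not taken from the patch. It needs a C11 compiler for
static_assert.

```c
#include <assert.h>
#include <stdint.h>

/* Upper half forced to all ones, as in the patch's value64 expression. */
#define CYCLE_PRELOAD_HI 0xffffffff00000000ULL

/* Hypothetical helper: the 64-bit value that would be written to the
 * cycle counter for a given 32-bit start value. */
static inline uint64_t cycle_counter_preload(uint32_t value)
{
	return CYCLE_PRELOAD_HI | value;
}

/* Hypothetical helper: increments left before a non-zero 64-bit counter
 * wraps to zero, i.e. 2^64 - counter in unsigned arithmetic. */
static inline uint64_t increments_until_overflow(uint64_t counter)
{
	return ~counter + 1;
}

/* A start value of 0 leaves exactly 2^32 increments, which is the same
 * distance a plain 32-bit counter would have to its own overflow. */
static_assert((~(CYCLE_PRELOAD_HI | 0u) + 1) == 0x100000000ULL,
	      "64-bit overflow must coincide with the 32-bit one");
```

With the upper half preloaded this way, the 64-bit overflow lands at the
same point as an overflow of the lower 32 bits, so the existing 32-bit
counter handling can presumably stay unchanged.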

Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 0ed05f6..c68fa98 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
 #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
 #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
+#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
 #define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
 #define	ARMV8_PMCR_N_MASK	0x1f
 #define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
@@ -494,9 +495,16 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
 	if (!armv8pmu_counter_valid(cpu_pmu, idx))
 		pr_err("CPU%u writing wrong counter %d\n",
 			smp_processor_id(), idx);
-	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
-		asm volatile("msr pmccntr_el0, %0" :: "r" (value));
-	else if (armv8pmu_select_counter(idx) == idx)
+	else if (idx == ARMV8_IDX_CYCLE_COUNTER) {
+		/*
+		 * Set the upper 32bits as this is a 64bit counter but we only
+		 * count using the lower 32bits and we want an interrupt when
+		 * it overflows.
+		 */
+		u64 value64 = 0xffffffff00000000ULL | value;
+
+		asm volatile("msr pmccntr_el0, %0" :: "r" (value64));
+	} else if (armv8pmu_select_counter(idx) == idx)
 		asm volatile("msr pmxevcntr_el0, %0" :: "r" (value));
 }
 
@@ -768,8 +776,11 @@ static void armv8pmu_reset(void *info)
 		armv8pmu_disable_intens(idx);
 	}
 
-	/* Initialize & Reset PMNC: C and P bits. */
-	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C);
+	/*
+	 * Initialize & Reset PMNC. Request overflow interrupt for
+	 * 64 bit cycle counter but cheat in armv8pmu_write_counter().
+	 */
+	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C | ARMV8_PMCR_LC);
 }
 
 static int armv8_pmuv3_map_event(struct perf_event *event)
-- 
1.9.1


* [PATCH v4 5/5] arm64/perf: Extend event mask for ARMv8.1
  2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
                   ` (3 preceding siblings ...)
  2016-02-18 16:50 ` [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
@ 2016-02-18 16:50 ` Jan Glauber
  4 siblings, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

ARMv8.1 increases the PMU event number space to 16 bits, so increase
the EVTYPE mask accordingly.
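
A small standalone sketch of what the wider mask changes. The two mask
values are the ones from the diff below; the macro names
EVTYPE_EVENT_OLD and EVTYPE_EVENT_NEW, the helper event_field() and the
event number 0x4000 are hypothetical and chosen only to show an event
above the old 10-bit range. It needs a C11 compiler for static_assert.

```c
#include <assert.h>
#include <stdint.h>

#define EVTYPE_EVENT_OLD 0x3ff   /* 10-bit event field (before the patch) */
#define EVTYPE_EVENT_NEW 0xffff  /* 16-bit event field (after the patch)  */

/* Hypothetical helper: extract the event number from a config value. */
static inline uint32_t event_field(uint32_t config, uint32_t mask)
{
	return config & mask;
}

/* An event number above 0x3ff is silently truncated by the old mask... */
static_assert((0x4000 & EVTYPE_EVENT_OLD) == 0, "old mask drops bit 14");
/* ...and passes through unchanged with the new one. */
static_assert((0x4000 & EVTYPE_EVENT_NEW) == 0x4000, "new mask keeps bit 14");
```

Event numbers that already fit in 10 bits, such as 0xE9, are unaffected
by the change.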

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/perf_event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index c68fa98..8746781 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -419,8 +419,8 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
 /*
  * PMXEVTYPER: Event selection reg
  */
-#define	ARMV8_EVTYPE_MASK	0xc80003ff	/* Mask for writable bits */
-#define	ARMV8_EVTYPE_EVENT	0x3ff		/* Mask for EVENT bits */
+#define	ARMV8_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
+#define	ARMV8_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
 
 /*
  * Event filters for PMUv3
-- 
1.9.1


* [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-02-18 16:50 ` [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
@ 2016-02-18 17:32   ` Will Deacon
  2016-02-18 18:37     ` David Daney
  2016-02-22 12:40     ` Jan Glauber
  0 siblings, 2 replies; 17+ messages in thread
From: Will Deacon @ 2016-02-18 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 18, 2016 at 05:50:12PM +0100, Jan Glauber wrote:
> Add a compatible string for the Cavium ThunderX PMU.

Stupid question, but is "thunder" the name of the CPU or the SoC or ...?

Whatever we use to describe the PMU should probably also identify the
CPU uniquely.

Will

> Signed-off-by: Jan Glauber <jglauber@cavium.com>
> ---
>  Documentation/devicetree/bindings/arm/pmu.txt | 1 +
>  arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
> index 5651883..d3999a1 100644
> --- a/Documentation/devicetree/bindings/arm/pmu.txt
> +++ b/Documentation/devicetree/bindings/arm/pmu.txt
> @@ -25,6 +25,7 @@ Required properties:
>  	"qcom,scorpion-pmu"
>  	"qcom,scorpion-mp-pmu"
>  	"qcom,krait-pmu"
> +	"cavium,thunder-pmu"
>  - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
>                 interrupt (PPI) then 1 interrupt should be specified.
>  
> diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> index 9cb7cf9..2eb9b22 100644
> --- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> +++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> @@ -360,6 +360,11 @@
>  		             <1 10 0xff01>;
>  	};
>  
> +	pmu {
> +		compatible = "cavium,thunder-pmu", "arm,armv8-pmuv3";
> +		interrupts = <1 7 4>;
> +	};
> +
>  	soc {
>  		compatible = "simple-bus";
>  		#address-cells = <2>;
> -- 
> 1.9.1
> 


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 16:50 ` [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
@ 2016-02-18 17:34   ` Will Deacon
  2016-02-18 18:28     ` Jan Glauber
                       ` (2 more replies)
  2016-02-29 15:39   ` Will Deacon
  1 sibling, 3 replies; 17+ messages in thread
From: Will Deacon @ 2016-02-18 17:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> With the long cycle counter bit (LC) disabled the cycle counter is not
> working on ThunderX SOC (ThunderX only implements Aarch64).
> Also, according to documentation LC == 0 is deprecated.
> 
> To keep the code simple the patch does not introduce 64 bit wide counter
> functions. Instead writing the cycle counter always sets the upper
> 32 bits so overflow interrupts are generated as before.
> 
> Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>

What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch?

Will


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 17:34   ` Will Deacon
@ 2016-02-18 18:28     ` Jan Glauber
  2016-02-18 18:57     ` David Daney
  2016-02-22 12:45     ` Jan Glauber
  2 siblings, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-18 18:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote:
> On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> > With the long cycle counter bit (LC) disabled the cycle counter is not
> > working on ThunderX SOC (ThunderX only implements Aarch64).
> > Also, according to documentation LC == 0 is deprecated.
> > 
> > To keep the code simple the patch does not introduce 64 bit wide counter
> > functions. Instead writing the cycle counter always sets the upper
> > 32 bits so overflow interrupts are generated as before.
> > 
> > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> 
> What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch?
> 
> Will

I've modified Andrew's patch. I assumed his formal S-o-B is not
required. Please correct me if I'm wrong.

Jan


* [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-02-18 17:32   ` Will Deacon
@ 2016-02-18 18:37     ` David Daney
  2016-02-22 12:40     ` Jan Glauber
  1 sibling, 0 replies; 17+ messages in thread
From: David Daney @ 2016-02-18 18:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/18/2016 09:32 AM, Will Deacon wrote:
> On Thu, Feb 18, 2016 at 05:50:12PM +0100, Jan Glauber wrote:
>> Add a compatible string for the Cavium ThunderX PMU.
>
> Stupid question, but is "thunder" the name of the CPU or the SoC or ...?

At a high level Cavium ThunderX (tm) is a family of SoCs.  Since the SoC 
contains many different functional blocks ...

>
> Whatever we use to describe the PMU, should probably also identify the
> CPU uniquely.

... In the context of this patch, "cavium,thunder-pmu" refers to the PMU 
of Cavium's implementation of the ARMv8 Processing Element (PE) 
specification (i.e. the CPU), as found on the CN88XX family of SoCs.

If we think of this in terms of MIDR_EL1, that would be:

    Implementer: 0x43
    PartNum: 0xA1
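
For illustration, these two fields can be pulled out of a MIDR_EL1 value
with plain shifts and masks (Implementer is bits [31:24], PartNum is bits
[15:4]). This is a minimal sketch: the helper names and EXAMPLE_MIDR are
hypothetical, and only the Implementer and PartNum fields of the example
value come from the numbers above; the variant, architecture and revision
fields in it are made up. It needs a C11 compiler for static_assert.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for two MIDR_EL1 fields. */
static inline uint32_t midr_implementer(uint32_t midr)
{
	return (midr >> 24) & 0xff;	/* bits [31:24] */
}

static inline uint32_t midr_partnum(uint32_t midr)
{
	return (midr >> 4) & 0xfff;	/* bits [15:4] */
}

/* Made-up register value carrying Implementer 0x43 and PartNum 0xA1. */
#define EXAMPLE_MIDR 0x430f0a10u

static_assert(((EXAMPLE_MIDR >> 24) & 0xff) == 0x43, "implementer field");
static_assert(((EXAMPLE_MIDR >> 4) & 0xfff) == 0x0a1, "part number field");
```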


>
> Will
>
>> Signed-off-by: Jan Glauber <jglauber@cavium.com>
>> ---
>>   Documentation/devicetree/bindings/arm/pmu.txt | 1 +
>>   arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
>>   2 files changed, 6 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
>> index 5651883..d3999a1 100644
>> --- a/Documentation/devicetree/bindings/arm/pmu.txt
>> +++ b/Documentation/devicetree/bindings/arm/pmu.txt
>> @@ -25,6 +25,7 @@ Required properties:
>>   	"qcom,scorpion-pmu"
>>   	"qcom,scorpion-mp-pmu"
>>   	"qcom,krait-pmu"
>> +	"cavium,thunder-pmu"
>>   - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
>>                  interrupt (PPI) then 1 interrupt should be specified.
>>
>> diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
>> index 9cb7cf9..2eb9b22 100644
>> --- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
>> +++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
>> @@ -360,6 +360,11 @@
>>   		             <1 10 0xff01>;
>>   	};
>>
>> +	pmu {
>> +		compatible = "cavium,thunder-pmu", "arm,armv8-pmuv3";
>> +		interrupts = <1 7 4>;
>> +	};
>> +
>>   	soc {
>>   		compatible = "simple-bus";
>>   		#address-cells = <2>;
>> --
>> 1.9.1
>>
>


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 17:34   ` Will Deacon
  2016-02-18 18:28     ` Jan Glauber
@ 2016-02-18 18:57     ` David Daney
  2016-02-22 12:45     ` Jan Glauber
  2 siblings, 0 replies; 17+ messages in thread
From: David Daney @ 2016-02-18 18:57 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/18/2016 09:34 AM, Will Deacon wrote:
> On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
>> With the long cycle counter bit (LC) disabled the cycle counter is not
>> working on ThunderX SOC (ThunderX only implements Aarch64).
>> Also, according to documentation LC == 0 is deprecated.
>>
>> To keep the code simple the patch does not introduce 64 bit wide counter
>> functions. Instead writing the cycle counter always sets the upper
>> 32 bits so overflow interrupts are generated as before.
>>
>> Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
>
> What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch?

I don't believe we need Andrew's S-o-B as the assertion of the 
Developer's Certificate of Origin 1.1 clauses (a), (b) and (d) is being 
made.  Specifically, clause (c) does not apply.

However this may be a gray area, so we could put on Andrew's S-o-B if 
that would make everybody happier.

David Daney


>
> Will
>


* [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU
  2016-02-18 17:32   ` Will Deacon
  2016-02-18 18:37     ` David Daney
@ 2016-02-22 12:40     ` Jan Glauber
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-02-22 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 18, 2016 at 05:32:48PM +0000, Will Deacon wrote:
> On Thu, Feb 18, 2016 at 05:50:12PM +0100, Jan Glauber wrote:
> > Add a compatible string for the Cavium ThunderX PMU.
> 
> Stupid question, but is "thunder" the name of the CPU or the SoC or ...?
> 
> Whatever we use to describe the PMU, should probably also identify the
> CPU uniquely.

The CPU is currently:

compatible = "cavium,thunder", "arm,armv8";

We clearly need better names in case of a subsequent CPU, but for now
I think we should stick to the existing name.

Jan

> Will
> 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  Documentation/devicetree/bindings/arm/pmu.txt | 1 +
> >  arch/arm64/boot/dts/cavium/thunder-88xx.dtsi  | 5 +++++
> >  2 files changed, 6 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
> > index 5651883..d3999a1 100644
> > --- a/Documentation/devicetree/bindings/arm/pmu.txt
> > +++ b/Documentation/devicetree/bindings/arm/pmu.txt
> > @@ -25,6 +25,7 @@ Required properties:
> >  	"qcom,scorpion-pmu"
> >  	"qcom,scorpion-mp-pmu"
> >  	"qcom,krait-pmu"
> > +	"cavium,thunder-pmu"
> >  - interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
> >                 interrupt (PPI) then 1 interrupt should be specified.
> >  
> > diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > index 9cb7cf9..2eb9b22 100644
> > --- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > +++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
> > @@ -360,6 +360,11 @@
> >  		             <1 10 0xff01>;
> >  	};
> >  
> > +	pmu {
> > +		compatible = "cavium,thunder-pmu", "arm,armv8-pmuv3";
> > +		interrupts = <1 7 4>;
> > +	};
> > +
> >  	soc {
> >  		compatible = "simple-bus";
> >  		#address-cells = <2>;
> > -- 
> > 1.9.1
> > 


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 17:34   ` Will Deacon
  2016-02-18 18:28     ` Jan Glauber
  2016-02-18 18:57     ` David Daney
@ 2016-02-22 12:45     ` Jan Glauber
  2016-02-22 13:41       ` Will Deacon
  2 siblings, 1 reply; 17+ messages in thread
From: Jan Glauber @ 2016-02-22 12:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote:
> On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> > With the long cycle counter bit (LC) disabled the cycle counter is not
> > working on ThunderX SOC (ThunderX only implements Aarch64).
> > Also, according to documentation LC == 0 is deprecated.
> > 
> > To keep the code simple the patch does not introduce 64 bit wide counter
> > functions. Instead writing the cycle counter always sets the upper
> > 32 bits so overflow interrupts are generated as before.
> > 
> > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> 
> What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch?

Hi Will,

Please let me know if I should repost or not, FWIW I got Andrew's S-o-B on the
patch.

Thanks, Jan

> Will


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-22 12:45     ` Jan Glauber
@ 2016-02-22 13:41       ` Will Deacon
  0 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2016-02-22 13:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 22, 2016 at 01:45:14PM +0100, Jan Glauber wrote:
> On Thu, Feb 18, 2016 at 05:34:28PM +0000, Will Deacon wrote:
> > On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> > > With the long cycle counter bit (LC) disabled the cycle counter is not
> > > working on ThunderX SOC (ThunderX only implements Aarch64).
> > > Also, according to documentation LC == 0 is deprecated.
> > > 
> > > To keep the code simple the patch does not introduce 64 bit wide counter
> > > functions. Instead writing the cycle counter always sets the upper
> > > 32 bits so overflow interrupts are generated as before.
> > > 
> > > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> > 
> > What does this mean? Do we need Andrew's S-o-B, or is this a fresh patch?
> 
> Please let me know if I should repost or not, FWIW I got Andrew's S-o-B on the
> patch.

I think it's fine. This should all be in -next as of last Friday anyhow.

Will


* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-18 16:50 ` [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
  2016-02-18 17:34   ` Will Deacon
@ 2016-02-29 15:39   ` Will Deacon
  2016-03-01  7:21     ` Jan Glauber
  2016-03-01 15:10     ` Jan Glauber
  1 sibling, 2 replies; 17+ messages in thread
From: Will Deacon @ 2016-02-29 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jan,

I've queued this lot on my perf/updates branch, but I just noticed an
oddity whilst dealing with some potential conflicts with the kvm tree.

On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> With the long cycle counter bit (LC) disabled the cycle counter is not
> working on ThunderX SOC (ThunderX only implements AArch64).
> Also, according to documentation LC == 0 is deprecated.
> 
> To keep the code simple the patch does not introduce 64 bit wide counter
> functions. Instead writing the cycle counter always sets the upper
> 32 bits so overflow interrupts are generated as before.
> 
> Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> 
> Signed-off-by: Jan Glauber <jglauber@cavium.com>
> ---
>  arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 0ed05f6..c68fa98 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
>  #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
>  #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
>  #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
> +#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
>  #define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
>  #define	ARMV8_PMCR_N_MASK	0x1f
>  #define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */

You haven't extended this mask to cover the LC bit, so it will be ignored
by armv8pmu_pmcr_write afaict.

How did you test this? I can easily update the mask, but it would be
good to know that it doesn't end up causing a breakage.

Will
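
The effect of the missing mask bit can be sketched in a few lines of C.
This is only an illustration, assuming that armv8pmu_pmcr_write ANDs the
value with ARMV8_PMCR_MASK before writing the register, as the comment
above suggests. The helper pmcr_apply_mask and the names PMCR_MASK_OLD
and PMCR_MASK_NEW are hypothetical; the bit definitions are taken from
the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

#define ARMV8_PMCR_DP   (1u << 5)  /* Disable CCNT if non-invasive debug */
#define ARMV8_PMCR_LC   (1u << 6)  /* Overflow on 64 bit cycle counter */

#define PMCR_MASK_OLD   0x3fu      /* bits 0..5 only: LC (bit 6) is outside */
#define PMCR_MASK_NEW   0x7fu      /* bits 0..6: LC is covered */

/*
 * Hypothetical stand-in for the masking step assumed to happen before
 * the PMCR write: any bit outside 'mask' is silently dropped.
 */
static uint32_t pmcr_apply_mask(uint32_t val, uint32_t mask)
{
	return val & mask;
}

/* Returns non-zero if the LC bit survives masking with 'mask'. */
static int pmcr_lc_survives(uint32_t mask)
{
	uint32_t val = pmcr_apply_mask(ARMV8_PMCR_LC | ARMV8_PMCR_DP, mask);

	/* DP sits inside both masks, so it must always survive. */
	assert(val & ARMV8_PMCR_DP);
	return (val & ARMV8_PMCR_LC) != 0;
}
```

With PMCR_MASK_OLD the LC bit is stripped while DP is kept; with
PMCR_MASK_NEW both bits reach the (simulated) register write.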

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-29 15:39   ` Will Deacon
@ 2016-03-01  7:21     ` Jan Glauber
  2016-03-01 15:10     ` Jan Glauber
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-03-01  7:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 29, 2016 at 03:39:35PM +0000, Will Deacon wrote:
> Hi Jan,
> 
> I've queued this lot on my perf/updates branch, but I just noticed an
> oddity whilst dealing with some potential conflicts with the kvm tree.
> 
> On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> > With the long cycle counter bit (LC) disabled the cycle counter is not
> > > working on ThunderX SOC (ThunderX only implements AArch64).
> > Also, according to documentation LC == 0 is deprecated.
> > 
> > To keep the code simple the patch does not introduce 64 bit wide counter
> > functions. Instead writing the cycle counter always sets the upper
> > 32 bits so overflow interrupts are generated as before.
> > 
> > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> > 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
> >  1 file changed, 16 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 0ed05f6..c68fa98 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
> >  #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
> >  #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
> >  #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
> > +#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
> >  #define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
> >  #define	ARMV8_PMCR_N_MASK	0x1f
> >  #define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
> 
> You haven't extended this mask to cover the LC bit, so it will be ignored
> by armv8pmu_pmcr_write afaict.

This is weird. I've double checked and I missed this mask. Annoying.
Nevertheless it works for me without the LC bit set.

> How did you test this? I can easily update the mask, but it would be
> > good to know that it doesn't end up causing a breakage.
 
For testing I used:
- perf top and perf record & report
- looked at interrupt numbers in /proc/interrupts

Without the patch _no_ samples at all are recorded and the interrupt does
not occur. With the patch I get samples and see a reasonable number of
interrupts.

Extending the mask so the LC bit is covered would make sense, I'm going
to test this now.

Jan

> Will

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit
  2016-02-29 15:39   ` Will Deacon
  2016-03-01  7:21     ` Jan Glauber
@ 2016-03-01 15:10     ` Jan Glauber
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Glauber @ 2016-03-01 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 29, 2016 at 03:39:35PM +0000, Will Deacon wrote:
> Hi Jan,
> 
> I've queued this lot on my perf/updates branch, but I just noticed an
> oddity whilst dealing with some potential conflicts with the kvm tree.
> 
> On Thu, Feb 18, 2016 at 05:50:13PM +0100, Jan Glauber wrote:
> > With the long cycle counter bit (LC) disabled the cycle counter is not
> > > working on ThunderX SOC (ThunderX only implements AArch64).
> > Also, according to documentation LC == 0 is deprecated.
> > 
> > To keep the code simple the patch does not introduce 64 bit wide counter
> > functions. Instead writing the cycle counter always sets the upper
> > 32 bits so overflow interrupts are generated as before.
> > 
> > Original patch from Andrew Pinksi <Andrew.Pinksi@caviumnetworks.com>
> > 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++-----
> >  1 file changed, 16 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 0ed05f6..c68fa98 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -405,6 +405,7 @@ static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
> >  #define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
> >  #define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
> >  #define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug*/
> > +#define ARMV8_PMCR_LC		(1 << 6) /* Overflow on 64 bit cycle counter */
> >  #define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
> >  #define	ARMV8_PMCR_N_MASK	0x1f
> >  #define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
> 
> You haven't extended this mask to cover the LC bit, so it will be ignored
> by armv8pmu_pmcr_write afaict.
> 
> How did you test this? I can easily update the mask, but it would be
> > good to know that it doesn't end up causing a breakage.

Please update the mask, I've tested with ARMV8_PMCR_MASK set to 0x7f
and it works fine.

It seems like it would work even without the LC bit set because we
set the upper bits again after an interrupt, but reading the documentation
we should really use ARMV8_PMCR_LC.

thanks,
Jan
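
The reasoning about the upper bits can be illustrated with plain integer
arithmetic. The sketch below is not the kernel code; it only assumes, as
the commit message says, that every write to the cycle counter forces
the upper 32 bits to ones. All function names are hypothetical. Under
that assumption a 64-bit counter wraps after exactly as many cycles as a
32-bit counter holding the same lower half, so the overflow interrupt
arrives at the same point in both modes.

```c
#include <assert.h>
#include <stdint.h>

/* Value written to the 64-bit cycle counter: upper 32 bits forced to ones. */
static uint64_t cycle_counter_preset(uint32_t value)
{
	return 0xffffffff00000000ULL | value;
}

/* Cycles until a 64-bit counter wraps to zero (2^64 - counter, mod 2^64). */
static uint64_t cycles_to_wrap64(uint64_t counter)
{
	return 0ULL - counter;
}

/* Cycles until a 32-bit counter wraps to zero (2^32 - counter). */
static uint64_t cycles_to_wrap32(uint32_t counter)
{
	return (1ULL << 32) - counter;
}

/* Checks that both counter widths overflow after the same cycle count. */
static void check_same_overflow_point(uint32_t value)
{
	assert(cycles_to_wrap64(cycle_counter_preset(value)) ==
	       cycles_to_wrap32(value));
}
```

For example, a programmed value of 0xfffffff0 overflows after 16 cycles
in either case. This only models the arithmetic; whether a given
implementation raises the interrupt with LC == 0 is a hardware question
the sketch cannot answer.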

> Will

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-03-01 15:10 UTC | newest]

Thread overview: 17+ messages
2016-02-18 16:50 [PATCH v4 0/5] Cavium ThunderX PMU support Jan Glauber
2016-02-18 16:50 ` [PATCH v4 1/5] arm64/perf: Rename Cortex A57 events Jan Glauber
2016-02-18 16:50 ` [PATCH v4 2/5] arm64/perf: Add Cavium ThunderX PMU support Jan Glauber
2016-02-18 16:50 ` [PATCH v4 3/5] arm64: dts: Add Cavium ThunderX specific PMU Jan Glauber
2016-02-18 17:32   ` Will Deacon
2016-02-18 18:37     ` David Daney
2016-02-22 12:40     ` Jan Glauber
2016-02-18 16:50 ` [PATCH v4 4/5] arm64/perf: Enable PMCR long cycle counter bit Jan Glauber
2016-02-18 17:34   ` Will Deacon
2016-02-18 18:28     ` Jan Glauber
2016-02-18 18:57     ` David Daney
2016-02-22 12:45     ` Jan Glauber
2016-02-22 13:41       ` Will Deacon
2016-02-29 15:39   ` Will Deacon
2016-03-01  7:21     ` Jan Glauber
2016-03-01 15:10     ` Jan Glauber
2016-02-18 16:50 ` [PATCH v4 5/5] arm64/perf: Extend event mask for ARMv8.1 Jan Glauber
