All of lore.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion
@ 2023-05-31 20:14 Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 1/6] arm: pmu: pmu-chain-promotion: Improve debug messages Eric Auger
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

On some HW (ThunderXv2), some random failures of
pmu-chain-promotion test can be observed.

pmu-chain-promotion is composed of several subtests
which run 2 mem_access loops. The initial value of
the counter is set so that no overflow is expected on
the first loop run and overflow is expected on the second.
However it is observed that sometimes we get an overflow
on the first run. It looks related to some variability of
the mem_acess count. This variability is observed on all
HW I have access to, with different span though. On
ThunderX2 HW it looks the margin that is currently taken
is too small and we regularly hit failure.

although the first goal of this series is to increase
the count/margin used in those tests, it also attempts
to improve the pmu-chain-promotion logs, add some barriers
in the mem-access loop, clarify the chain counter
enable/disable sequence.

A new 'pmu-mem-access-reliability' is also introduced to
detect issues with MEM_ACCESS event variability and make
the debug easier.

Obviously one can wonder if this variability is something normal
and does not hide any other bug. I hope this series will raise
additional discussions about this.

https://github.com/eauger/kut/tree/pmu-chain-promotion-fixes-v2

History:
v1 -> v2:
- Take into account Alexandru's & Mark's comments. Added some
  R-b's and T-b's.

Eric Auger (6):
  arm: pmu: pmu-chain-promotion: Improve debug messages
  arm: pmu: pmu-chain-promotion: Introduce defines for count and margin
    values
  arm: pmu: Add extra DSB barriers in the mem_access loop
  arm: pmu: Fix chain counter enable/disable sequences
  arm: pmu: Add pmu-mem-access-reliability test
  arm: pmu-chain-promotion: Increase the count and margin values

 arm/pmu.c         | 196 +++++++++++++++++++++++++++++++++-------------
 arm/unittests.cfg |   6 ++
 2 files changed, 148 insertions(+), 54 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 1/6] arm: pmu: pmu-chain-promotion: Improve debug messages
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 2/6] arm: pmu: pmu-chain-promotion: Introduce defines for count and margin values Eric Auger
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

The pmu-chain-promotion test is composed of several subtests.
In case of failures, the current logs are really dificult to
analyze since they look very similar and sometimes duplicated
for each subtest. Add prefixes for each subtest and introduce
a macro that prints the registers we are mostly interested in,
namerly the 2 first counters and the overflow counter.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

---

v1 -> v2:
- Added Alexandru's R-b
- s/2d/2nd
---
 arm/pmu.c | 63 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index f6e95012..cc2e733e 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -715,6 +715,11 @@ static void test_chained_sw_incr(bool unused)
 	report_info("overflow=0x%lx, #0=0x%lx #1=0x%lx", read_sysreg(pmovsclr_el0),
 		    read_regn_el0(pmevcntr, 0), read_regn_el0(pmevcntr, 1));
 }
+#define PRINT_REGS(__s) \
+	report_info("%s #1=0x%lx #0=0x%lx overflow=0x%lx", __s, \
+		    read_regn_el0(pmevcntr, 1), \
+		    read_regn_el0(pmevcntr, 0), \
+		    read_sysreg(pmovsclr_el0))
 
 static void test_chain_promotion(bool unused)
 {
@@ -725,6 +730,7 @@ static void test_chain_promotion(bool unused)
 		return;
 
 	/* Only enable CHAIN counter */
+	report_prefix_push("subtest1");
 	pmu_reset();
 	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
@@ -732,83 +738,81 @@ static void test_chain_promotion(bool unused)
 	isb();
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	PRINT_REGS("post");
 	report(!read_regn_el0(pmevcntr, 0),
 		"chain counter not counting if even counter is disabled");
+	report_prefix_pop();
 
 	/* Only enable even counter */
+	report_prefix_push("subtest2");
 	pmu_reset();
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32);
 	write_sysreg_s(0x1, PMCNTENSET_EL0);
 	isb();
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	PRINT_REGS("post");
 	report(!read_regn_el0(pmevcntr, 1) && (read_sysreg(pmovsclr_el0) == 0x1),
 		"odd counter did not increment on overflow if disabled");
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
-	report_info("CHAIN counter #1 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 1));
-	report_info("overflow counter 0x%lx", read_sysreg(pmovsclr_el0));
+	report_prefix_pop();
 
 	/* start at 0xFFFFFFDC, +20 with CHAIN enabled, +20 with CHAIN disabled */
+	report_prefix_push("subtest3");
 	pmu_reset();
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
 	isb();
+	PRINT_REGS("init");
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+	PRINT_REGS("After 1st loop");
 
 	/* disable the CHAIN event */
 	write_sysreg_s(0x2, PMCNTENCLR_EL0);
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+	PRINT_REGS("After 2nd loop");
 	report(read_sysreg(pmovsclr_el0) == 0x1,
 		"should have triggered an overflow on #0");
 	report(!read_regn_el0(pmevcntr, 1),
 		"CHAIN counter #1 shouldn't have incremented");
+	report_prefix_pop();
 
 	/* start at 0xFFFFFFDC, +20 with CHAIN disabled, +20 with CHAIN enabled */
 
+	report_prefix_push("subtest4");
 	pmu_reset();
 	write_sysreg_s(0x1, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
 	isb();
-	report_info("counter #0 = 0x%lx, counter #1 = 0x%lx overflow=0x%lx",
-		    read_regn_el0(pmevcntr, 0), read_regn_el0(pmevcntr, 1),
-		    read_sysreg(pmovsclr_el0));
+	PRINT_REGS("init");
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+	PRINT_REGS("After 1st loop");
 
 	/* enable the CHAIN event */
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	isb();
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+
+	PRINT_REGS("After 2nd loop");
 
 	report((read_regn_el0(pmevcntr, 1) == 1) &&
 		(read_sysreg(pmovsclr_el0) == 0x1),
 		"CHAIN counter enabled: CHAIN counter was incremented and overflow");
-
-	report_info("CHAIN counter #1 = 0x%lx, overflow=0x%lx",
-		read_regn_el0(pmevcntr, 1), read_sysreg(pmovsclr_el0));
+	report_prefix_pop();
 
 	/* start as MEM_ACCESS/CPU_CYCLES and move to CHAIN/MEM_ACCESS */
+	report_prefix_push("subtest5");
 	pmu_reset();
 	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
 	isb();
+	PRINT_REGS("init");
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+	PRINT_REGS("After 1st loop");
 
 	/* 0 becomes CHAINED */
 	write_sysreg_s(0x0, PMCNTENSET_EL0);
@@ -817,37 +821,34 @@ static void test_chain_promotion(bool unused)
 	write_regn_el0(pmevcntr, 1, 0x0);
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("MEM_ACCESS counter #0 has value 0x%lx",
-		    read_regn_el0(pmevcntr, 0));
+	PRINT_REGS("After 2nd loop");
 
 	report((read_regn_el0(pmevcntr, 1) == 1) &&
 		(read_sysreg(pmovsclr_el0) == 0x1),
 		"32b->64b: CHAIN counter incremented and overflow");
-
-	report_info("CHAIN counter #1 = 0x%lx, overflow=0x%lx",
-		read_regn_el0(pmevcntr, 1), read_sysreg(pmovsclr_el0));
+	report_prefix_pop();
 
 	/* start as CHAIN/MEM_ACCESS and move to MEM_ACCESS/CPU_CYCLES */
+	report_prefix_push("subtest6");
 	pmu_reset();
 	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
+	PRINT_REGS("init");
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
-	report_info("counter #0=0x%lx, counter #1=0x%lx",
-			read_regn_el0(pmevcntr, 0), read_regn_el0(pmevcntr, 1));
+	PRINT_REGS("After 1st loop");
 
 	write_sysreg_s(0x0, PMCNTENSET_EL0);
 	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 
 	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	PRINT_REGS("After 2nd loop");
 	report(read_sysreg(pmovsclr_el0) == 1,
 		"overflow is expected on counter 0");
-	report_info("counter #0=0x%lx, counter #1=0x%lx overflow=0x%lx",
-			read_regn_el0(pmevcntr, 0), read_regn_el0(pmevcntr, 1),
-			read_sysreg(pmovsclr_el0));
+	report_prefix_pop();
 }
 
 static bool expect_interrupts(uint32_t bitmap)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 2/6] arm: pmu: pmu-chain-promotion: Introduce defines for count and margin values
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 1/6] arm: pmu: pmu-chain-promotion: Improve debug messages Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop Eric Auger
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

The pmu-chain-promotion test is composed of separate subtests.

Some of them apply some settings on a first MEM_ACCESS loop
iterations, change the settings and run another MEM_ACCESS loop.

The PRE_OVERFLOW2 MEM_ACCESS counter init value is defined so that
the first loop does not overflow and the second loop overflows.

At the moment the MEM_ACCESS count number is hardcoded to 20 and
PRE_OVERFLOW2 is set to UINT32_MAX - 20 - 15 where 15 acts as a
margin.

Introduce defines for the count number and the margin so that it
becomes easier to change them.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

---

v1 -> v2:
- 2d -> 2nd and use @COUNT in comment
- Added Alexandru's R-b
---
 arm/pmu.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index cc2e733e..51c0fe80 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -55,11 +55,18 @@
 #define EXT_COMMON_EVENTS_LOW	0x4000
 #define EXT_COMMON_EVENTS_HIGH	0x403F
 
-#define ALL_SET_32			0x00000000FFFFFFFFULL
+#define ALL_SET_32		0x00000000FFFFFFFFULL
 #define ALL_CLEAR		0x0000000000000000ULL
 #define PRE_OVERFLOW_32		0x00000000FFFFFFF0ULL
-#define PRE_OVERFLOW2_32	0x00000000FFFFFFDCULL
 #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
+#define COUNT 20
+#define MARGIN 15
+/*
+ * PRE_OVERFLOW2 is set so that 1st @COUNT iterations do not
+ * produce 32b overflow and 2nd @COUNT iterations do. To accommodate
+ * for some observed variability we take into account a given @MARGIN
+ */
+#define PRE_OVERFLOW2_32		(ALL_SET_32 - COUNT - MARGIN)
 
 #define PRE_OVERFLOW(__overflow_at_64bits)				\
 	(__overflow_at_64bits ? PRE_OVERFLOW_64 : PRE_OVERFLOW_32)
@@ -737,7 +744,7 @@ static void test_chain_promotion(bool unused)
 	write_sysreg_s(0x2, PMCNTENSET_EL0);
 	isb();
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("post");
 	report(!read_regn_el0(pmevcntr, 0),
 		"chain counter not counting if even counter is disabled");
@@ -750,13 +757,13 @@ static void test_chain_promotion(bool unused)
 	write_sysreg_s(0x1, PMCNTENSET_EL0);
 	isb();
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("post");
 	report(!read_regn_el0(pmevcntr, 1) && (read_sysreg(pmovsclr_el0) == 0x1),
 		"odd counter did not increment on overflow if disabled");
 	report_prefix_pop();
 
-	/* start at 0xFFFFFFDC, +20 with CHAIN enabled, +20 with CHAIN disabled */
+	/* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */
 	report_prefix_push("subtest3");
 	pmu_reset();
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
@@ -764,12 +771,12 @@ static void test_chain_promotion(bool unused)
 	isb();
 	PRINT_REGS("init");
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
 	/* disable the CHAIN event */
 	write_sysreg_s(0x2, PMCNTENCLR_EL0);
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 2nd loop");
 	report(read_sysreg(pmovsclr_el0) == 0x1,
 		"should have triggered an overflow on #0");
@@ -777,7 +784,7 @@ static void test_chain_promotion(bool unused)
 		"CHAIN counter #1 shouldn't have incremented");
 	report_prefix_pop();
 
-	/* start at 0xFFFFFFDC, +20 with CHAIN disabled, +20 with CHAIN enabled */
+	/* 1st COUNT with CHAIN disabled, next COUNT with CHAIN enabled */
 
 	report_prefix_push("subtest4");
 	pmu_reset();
@@ -786,13 +793,13 @@ static void test_chain_promotion(bool unused)
 	isb();
 	PRINT_REGS("init");
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
 	/* enable the CHAIN event */
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	isb();
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 
 	PRINT_REGS("After 2nd loop");
 
@@ -811,7 +818,7 @@ static void test_chain_promotion(bool unused)
 	isb();
 	PRINT_REGS("init");
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
 	/* 0 becomes CHAINED */
@@ -820,7 +827,7 @@ static void test_chain_promotion(bool unused)
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 1, 0x0);
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 2nd loop");
 
 	report((read_regn_el0(pmevcntr, 1) == 1) &&
@@ -837,14 +844,14 @@ static void test_chain_promotion(bool unused)
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	PRINT_REGS("init");
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
 	write_sysreg_s(0x0, PMCNTENSET_EL0);
 	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 
-	mem_access_loop(addr, 20, pmu.pmcr_ro | PMU_PMCR_E);
+	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 2nd loop");
 	report(read_sysreg(pmovsclr_el0) == 1,
 		"overflow is expected on counter 0");
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 1/6] arm: pmu: pmu-chain-promotion: Improve debug messages Eric Auger
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 2/6] arm: pmu: pmu-chain-promotion: Introduce defines for count and margin values Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-06-16  9:42   ` Alexandru Elisei
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences Eric Auger
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

The mem access loop currently features ISB barriers only. However
the mem_access loop counts the number of accesses to memory. ISB
do not garantee the PE cannot reorder memory access. Let's
add a DSB ISH before the write to PMCR_EL0 that enables the PMU
to make sure any previous memory access aren't counted in the
loop, another one after the PMU gets enabled (to make sure loop
memory accesses cannot be reordered before the PMU gets enabled)
and a last one after the last iteration, before disabling the PMU.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>

---

v1 -> v2:
- added yet another DSB after PMU enabled as suggested by Alexandru

This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
---
 arm/pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arm/pmu.c b/arm/pmu.c
index 51c0fe80..74dd4c10 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -301,13 +301,16 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
 {
 	uint64_t pmcr64 = pmcr;
 asm volatile(
+	"       dsb     ish\n"
 	"       msr     pmcr_el0, %[pmcr]\n"
 	"       isb\n"
+	"       dsb     ish\n"
 	"       mov     x10, %[loop]\n"
 	"1:     sub     x10, x10, #1\n"
 	"       ldr	x9, [%[addr]]\n"
 	"       cmp     x10, #0x0\n"
 	"       b.gt    1b\n"
+	"       dsb     ish\n"
 	"       msr     pmcr_el0, xzr\n"
 	"       isb\n"
 	:
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
                   ` (2 preceding siblings ...)
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-06-16 10:50   ` Alexandru Elisei
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test Eric Auger
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

In some ARM ARM ddi0487 revisions it is said that
disabling/enabling a pair of counters that are paired
by a CHAIN event should follow a given sequence:

Enable the high counter first, isb, enable low counter
Disable the low counter first, isb, disable the high counter

This was the case in Fc. However this is not written anymore
in subsequent revions such as Ia.

Anyway, just in case, and because it also makes the code a
little bit simpler, introduce 2 helpers to enable/disable chain
counters that execute those sequences and replace the existing
PMCNTENCLR/ENSET calls (at least this cannot do any harm).

Also fix 2 write_sysreg_s(0x0, PMCNTENSET_EL0) in subtest 5 & 6
and replace them by PMCNTENCLR writes since writing 0 in
PMCNTENSET_EL0 has no effect.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v1 -> v2:
- fix the enable_chain_counter()/disable_chain_counter()
  sequence, ie. swap n + 1 / n as reported by Alexandru.
- fix an other comment using the 'low' terminology
---
 arm/pmu.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index 74dd4c10..74c9f6f9 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -731,6 +731,22 @@ static void test_chained_sw_incr(bool unused)
 		    read_regn_el0(pmevcntr, 0), \
 		    read_sysreg(pmovsclr_el0))
 
+static void enable_chain_counter(int even)
+{
+	write_sysreg_s(BIT(even + 1), PMCNTENSET_EL0); /* Enable the high counter first */
+	isb();
+	write_sysreg_s(BIT(even), PMCNTENSET_EL0); /* Enable the low counter */
+	isb();
+}
+
+static void disable_chain_counter(int even)
+{
+	write_sysreg_s(BIT(even), PMCNTENCLR_EL0); /* Disable the low counter first*/
+	isb();
+	write_sysreg_s(BIT(even + 1), PMCNTENCLR_EL0); /* Disable the high counter */
+	isb();
+}
+
 static void test_chain_promotion(bool unused)
 {
 	uint32_t events[] = {MEM_ACCESS, CHAIN};
@@ -769,16 +785,17 @@ static void test_chain_promotion(bool unused)
 	/* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */
 	report_prefix_push("subtest3");
 	pmu_reset();
-	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
-	isb();
+	enable_chain_counter(0);
 	PRINT_REGS("init");
 
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
 	/* disable the CHAIN event */
-	write_sysreg_s(0x2, PMCNTENCLR_EL0);
+	disable_chain_counter(0);
+	write_sysreg_s(0x1, PMCNTENSET_EL0); /* Enable the low counter */
+	isb();
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 2nd loop");
 	report(read_sysreg(pmovsclr_el0) == 0x1,
@@ -799,9 +816,11 @@ static void test_chain_promotion(bool unused)
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
-	/* enable the CHAIN event */
-	write_sysreg_s(0x3, PMCNTENSET_EL0);
+	/* Disable the low counter first and enable the chain counter */
+	write_sysreg_s(0x1, PMCNTENCLR_EL0);
 	isb();
+	enable_chain_counter(0);
+
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 
 	PRINT_REGS("After 2nd loop");
@@ -825,10 +844,10 @@ static void test_chain_promotion(bool unused)
 	PRINT_REGS("After 1st loop");
 
 	/* 0 becomes CHAINED */
-	write_sysreg_s(0x0, PMCNTENSET_EL0);
+	write_sysreg_s(0x3, PMCNTENCLR_EL0);
 	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
-	write_sysreg_s(0x3, PMCNTENSET_EL0);
 	write_regn_el0(pmevcntr, 1, 0x0);
+	enable_chain_counter(0);
 
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 2nd loop");
@@ -844,13 +863,13 @@ static void test_chain_promotion(bool unused)
 	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
 	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
-	write_sysreg_s(0x3, PMCNTENSET_EL0);
+	enable_chain_counter(0);
 	PRINT_REGS("init");
 
 	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
 	PRINT_REGS("After 1st loop");
 
-	write_sysreg_s(0x0, PMCNTENSET_EL0);
+	disable_chain_counter(0);
 	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
 	write_sysreg_s(0x3, PMCNTENSET_EL0);
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
                   ` (3 preceding siblings ...)
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-06-16 11:52   ` Alexandru Elisei
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 6/6] arm: pmu-chain-promotion: Increase the count and margin values Eric Auger
  2023-06-07 19:07 ` [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Andrew Jones
  6 siblings, 1 reply; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

Add a new basic test that runs MEM_ACCESS loop over
100 iterations and make sure the number of measured
MEM_ACCESS never overflows the margin. Some other
pmu tests rely on this pattern and if the MEM_ACCESS
measurement is not reliable, it is better to report
it beforehand and not confuse the user any further.

Without the subsequent patch, this typically fails on
ThunderXv2 with the following logs:

INFO: pmu: pmu-mem-access-reliability: 32-bit overflows:
overflow=1 min=21 max=41 COUNT=20 MARGIN=15
FAIL: pmu: pmu-mem-access-reliability: 32-bit overflows:
mem_access count is reliable

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v1 -> v2:
- use mem-access instead of memaccess as suggested by Mark
- simplify the logic and add comments in the test loop
---
 arm/pmu.c         | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg |  6 +++++
 2 files changed, 64 insertions(+)

diff --git a/arm/pmu.c b/arm/pmu.c
index 74c9f6f9..925f277c 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -56,6 +56,7 @@
 #define EXT_COMMON_EVENTS_HIGH	0x403F
 
 #define ALL_SET_32		0x00000000FFFFFFFFULL
+#define ALL_SET_64		0xFFFFFFFFFFFFFFFFULL
 #define ALL_CLEAR		0x0000000000000000ULL
 #define PRE_OVERFLOW_32		0x00000000FFFFFFF0ULL
 #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
@@ -67,6 +68,10 @@
  * for some observed variability we take into account a given @MARGIN
  */
 #define PRE_OVERFLOW2_32		(ALL_SET_32 - COUNT - MARGIN)
+#define PRE_OVERFLOW2_64		(ALL_SET_64 - COUNT - MARGIN)
+
+#define PRE_OVERFLOW2(__overflow_at_64bits)				\
+	(__overflow_at_64bits ? PRE_OVERFLOW2_64 : PRE_OVERFLOW2_32)
 
 #define PRE_OVERFLOW(__overflow_at_64bits)				\
 	(__overflow_at_64bits ? PRE_OVERFLOW_64 : PRE_OVERFLOW_32)
@@ -747,6 +752,56 @@ static void disable_chain_counter(int even)
 	isb();
 }
 
+/*
+ * This test checks that a mem access loop featuring COUNT accesses
+ * does not overflow with an init value of PRE_OVERFLOW2. It also
+ * records the min/max access count to see how much the counting
+ * is (un)reliable
+ */
+static void test_mem_access_reliability(bool overflow_at_64bits)
+{
+	uint32_t events[] = {MEM_ACCESS};
+	void *addr = malloc(PAGE_SIZE);
+	uint64_t count, delta, max = 0, min = pmevcntr_mask();
+	uint64_t pre_overflow2 = PRE_OVERFLOW2(overflow_at_64bits);
+	uint64_t pmcr_lp = overflow_at_64bits ? PMU_PMCR_LP : 0;
+	bool overflow = false;
+
+	if (!satisfy_prerequisites(events, ARRAY_SIZE(events)) ||
+	    !check_overflow_prerequisites(overflow_at_64bits))
+		return;
+
+	pmu_reset();
+	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
+	for (int i = 0; i < 100; i++) {
+		pmu_reset();
+		write_regn_el0(pmevcntr, 0, pre_overflow2);
+		write_sysreg_s(0x1, PMCNTENSET_EL0);
+		isb();
+		mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
+		count = read_regn_el0(pmevcntr, 0);
+		if (count >= pre_overflow2) {
+			/* not counter overflow, as expected */
+			delta = count - pre_overflow2;
+		} else {
+			/*
+			 * unexpected counter overflow meaning the actual
+			 * mem access count, delta, is count + COUNT + MARGIN
+			 */
+			delta = count + COUNT + MARGIN;
+			overflow = true;
+			report_info("iter=%d count=%ld min=%ld max=%ld overflow!!!",
+				    i, delta, min, max);
+		}
+		/* record extreme value */
+		max = MAX(delta, max);
+		min = MIN(delta, min);
+	}
+	report_info("overflow=%d min=%ld max=%ld COUNT=%d MARGIN=%d",
+		    overflow, min, max, COUNT, MARGIN);
+	report(!overflow, "mem_access count is reliable");
+}
+
 static void test_chain_promotion(bool unused)
 {
 	uint32_t events[] = {MEM_ACCESS, CHAIN};
@@ -1204,6 +1259,9 @@ int main(int argc, char *argv[])
 	} else if (strcmp(argv[1], "pmu-basic-event-count") == 0) {
 		run_event_test(argv[1], test_basic_event_count, false);
 		run_event_test(argv[1], test_basic_event_count, true);
+	} else if (strcmp(argv[1], "pmu-mem-access-reliability") == 0) {
+		run_event_test(argv[1], test_mem_access_reliability, false);
+		run_event_test(argv[1], test_mem_access_reliability, true);
 	} else if (strcmp(argv[1], "pmu-mem-access") == 0) {
 		run_event_test(argv[1], test_mem_access, false);
 		run_event_test(argv[1], test_mem_access, true);
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 5e67b558..fe601cbb 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -90,6 +90,12 @@ groups = pmu
 arch = arm64
 extra_params = -append 'pmu-mem-access'
 
+[pmu-mem-access-reliability]
+file = pmu.flat
+groups = pmu
+arch = arm64
+extra_params = -append 'pmu-mem-access-reliability'
+
 [pmu-sw-incr]
 file = pmu.flat
 groups = pmu
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [kvm-unit-tests PATCH v2 6/6] arm: pmu-chain-promotion: Increase the count and margin values
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
                   ` (4 preceding siblings ...)
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test Eric Auger
@ 2023-05-31 20:14 ` Eric Auger
  2023-06-07 19:07 ` [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Andrew Jones
  6 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2023-05-31 20:14 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, alexandru.elisei
  Cc: mark.rutland

Let's increase the mem_access loop count by defining COUNT=250
(instead of 20) and define a more reasonable margin (100 instead
of 15 previously) so that it gives better chance to accommodate
for HW implementation variance. Those values were chosen arbitrarily
higher. Those values fix the random failures on ThunderX2 machines.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
---
 arm/pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index 925f277c..62d41ea9 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -60,8 +60,8 @@
 #define ALL_CLEAR		0x0000000000000000ULL
 #define PRE_OVERFLOW_32		0x00000000FFFFFFF0ULL
 #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
-#define COUNT 20
-#define MARGIN 15
+#define COUNT 250
+#define MARGIN 100
 /*
  * PRE_OVERFLOW2 is set so that 1st @COUNT iterations do not
  * produce 32b overflow and 2nd @COUNT iterations do. To accommodate
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion
  2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
                   ` (5 preceding siblings ...)
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 6/6] arm: pmu-chain-promotion: Increase the count and margin values Eric Auger
@ 2023-06-07 19:07 ` Andrew Jones
  2023-06-08 16:38   ` Alexandru Elisei
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Jones @ 2023-06-07 19:07 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, kvm, kvmarm, maz, will, oliver.upton, ricarkol,
	reijiw, alexandru.elisei, mark.rutland

On Wed, May 31, 2023 at 10:14:32PM +0200, Eric Auger wrote:
> On some HW (ThunderXv2), some random failures of
> pmu-chain-promotion test can be observed.
> 
> pmu-chain-promotion is composed of several subtests
> which run 2 mem_access loops. The initial value of
> the counter is set so that no overflow is expected on
> the first loop run and overflow is expected on the second.
> However it is observed that sometimes we get an overflow
> on the first run. It looks related to some variability of
> the mem_acess count. This variability is observed on all
> HW I have access to, with different span though. On
> ThunderX2 HW it looks the margin that is currently taken
> is too small and we regularly hit failure.
> 
> although the first goal of this series is to increase
> the count/margin used in those tests, it also attempts
> to improve the pmu-chain-promotion logs, add some barriers
> in the mem-access loop, clarify the chain counter
> enable/disable sequence.
> 
> A new 'pmu-mem-access-reliability' is also introduced to
> detect issues with MEM_ACCESS event variability and make
> the debug easier.
> 
> Obviously one can wonder if this variability is something normal
> and does not hide any other bug. I hope this series will raise
> additional discussions about this.
> 
> https://github.com/eauger/kut/tree/pmu-chain-promotion-fixes-v2
> 
> History:
> v1 -> v2:
> - Take into account Alexandru's & Mark's comments. Added some
>   R-b's and T-b's.
> 
> Eric Auger (6):
>   arm: pmu: pmu-chain-promotion: Improve debug messages
>   arm: pmu: pmu-chain-promotion: Introduce defines for count and margin
>     values
>   arm: pmu: Add extra DSB barriers in the mem_access loop
>   arm: pmu: Fix chain counter enable/disable sequences
>   arm: pmu: Add pmu-mem-access-reliability test
>   arm: pmu-chain-promotion: Increase the count and margin values
> 
>  arm/pmu.c         | 196 +++++++++++++++++++++++++++++++++-------------
>  arm/unittests.cfg |   6 ++
>  2 files changed, 148 insertions(+), 54 deletions(-)
> 
> -- 
> 2.38.1
>

Hi Eric,

I'm eager to merge this, but I'll give Alexandru some time to revisit it
since he had comments on the last revision.

Thanks,
drew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion
  2023-06-07 19:07 ` [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Andrew Jones
@ 2023-06-08 16:38   ` Alexandru Elisei
  0 siblings, 0 replies; 15+ messages in thread
From: Alexandru Elisei @ 2023-06-08 16:38 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Eric Auger, eric.auger.pro, kvm, kvmarm, maz, will, oliver.upton,
	ricarkol, reijiw, mark.rutland

Hi,

On Wed, Jun 07, 2023 at 09:07:09PM +0200, Andrew Jones wrote:
> On Wed, May 31, 2023 at 10:14:32PM +0200, Eric Auger wrote:
> > On some HW (ThunderXv2), some random failures of
> > pmu-chain-promotion test can be observed.
> > 
> > pmu-chain-promotion is composed of several subtests
> > which run 2 mem_access loops. The initial value of
> > the counter is set so that no overflow is expected on
> > the first loop run and overflow is expected on the second.
> > However it is observed that sometimes we get an overflow
> > on the first run. It looks related to some variability of
> > the mem_acess count. This variability is observed on all
> > HW I have access to, with different span though. On
> > ThunderX2 HW it looks the margin that is currently taken
> > is too small and we regularly hit failure.
> > 
> > although the first goal of this series is to increase
> > the count/margin used in those tests, it also attempts
> > to improve the pmu-chain-promotion logs, add some barriers
> > in the mem-access loop, clarify the chain counter
> > enable/disable sequence.
> > 
> > A new 'pmu-mem-access-reliability' is also introduced to
> > detect issues with MEM_ACCESS event variability and make
> > the debug easier.
> > 
> > Obviously one can wonder if this variability is something normal
> > and does not hide any other bug. I hope this series will raise
> > additional discussions about this.
> > 
> > https://github.com/eauger/kut/tree/pmu-chain-promotion-fixes-v2
> > 
> > History:
> > v1 -> v2:
> > - Take into account Alexandru's & Mark's comments. Added some
> >   R-b's and T-b's.
> > 
> > Eric Auger (6):
> >   arm: pmu: pmu-chain-promotion: Improve debug messages
> >   arm: pmu: pmu-chain-promotion: Introduce defines for count and margin
> >     values
> >   arm: pmu: Add extra DSB barriers in the mem_access loop
> >   arm: pmu: Fix chain counter enable/disable sequences
> >   arm: pmu: Add pmu-mem-access-reliability test
> >   arm: pmu-chain-promotion: Increase the count and margin values
> > 
> >  arm/pmu.c         | 196 +++++++++++++++++++++++++++++++++-------------
> >  arm/unittests.cfg |   6 ++
> >  2 files changed, 148 insertions(+), 54 deletions(-)
> > 
> > -- 
> > 2.38.1
> >
> 
> Hi Eric,
> 
> I'm eager to merge this, but I'll give Alexandru some time to revisit it
> since he had comments on the last revision.

I've just come back from holiday, I'll have a look next week.

Thanks,
Alex

> 
> Thanks,
> drew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop Eric Auger
@ 2023-06-16  9:42   ` Alexandru Elisei
  0 siblings, 0 replies; 15+ messages in thread
From: Alexandru Elisei @ 2023-06-16  9:42 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi,

On Wed, May 31, 2023 at 10:14:35PM +0200, Eric Auger wrote:
> The mem access loop currently features ISB barriers only. However
> the mem_access loop counts the number of accesses to memory. ISB
> do not garantee the PE cannot reorder memory access. Let's
> add a DSB ISH before the write to PMCR_EL0 that enables the PMU
> to make sure any previous memory access aren't counted in the
> loop, another one after the PMU gets enabled (to make sure loop
> memory accesses cannot be reordered before the PMU gets enabled)
> and a last one after the last iteration, before disabling the PMU.

Looks good to me:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

Thanks,
Alex

> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Suggested-by: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> ---
> 
> v1 -> v2:
> - added yet another DSB after PMU enabled as suggested by Alexandru
> 
> This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/
> ---
>  arm/pmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 51c0fe80..74dd4c10 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -301,13 +301,16 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr)
>  {
>  	uint64_t pmcr64 = pmcr;
>  asm volatile(
> +	"       dsb     ish\n"
>  	"       msr     pmcr_el0, %[pmcr]\n"
>  	"       isb\n"
> +	"       dsb     ish\n"
>  	"       mov     x10, %[loop]\n"
>  	"1:     sub     x10, x10, #1\n"
>  	"       ldr	x9, [%[addr]]\n"
>  	"       cmp     x10, #0x0\n"
>  	"       b.gt    1b\n"
> +	"       dsb     ish\n"
>  	"       msr     pmcr_el0, xzr\n"
>  	"       isb\n"
>  	:
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences Eric Auger
@ 2023-06-16 10:50   ` Alexandru Elisei
  2023-06-19 19:57     ` Eric Auger
  0 siblings, 1 reply; 15+ messages in thread
From: Alexandru Elisei @ 2023-06-16 10:50 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi,

On Wed, May 31, 2023 at 10:14:36PM +0200, Eric Auger wrote:
> In some ARM ARM ddi0487 revisions it is said that
> disabling/enabling a pair of counters that are paired
> by a CHAIN event should follow a given sequence:
> 
> Enable the high counter first, isb, enable low counter
> Disable the low counter first, isb, disable the high counter
> 
> This was the case in Fc. However this is not written anymore
> in subsequent revions such as Ia.
> 
> Anyway, just in case, and because it also makes the code a
> little bit simpler, introduce 2 helpers to enable/disable chain
> counters that execute those sequences and replace the existing
> PMCNTENCLR/ENSET calls (at least this cannot do any harm).
> 
> Also fix 2 write_sysreg_s(0x0, PMCNTENSET_EL0) in subtest 5 & 6
> and replace them by PMCNTENCLR writes since writing 0 in
> PMCNTENSET_EL0 has no effect.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v1 -> v2:
> - fix the enable_chain_counter()/disable_chain_counter()
>   sequence, ie. swap n + 1 / n as reported by Alexandru.
> - fix an other comment using the 'low' terminology
> ---
>  arm/pmu.c | 37 ++++++++++++++++++++++++++++---------
>  1 file changed, 28 insertions(+), 9 deletions(-)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 74dd4c10..74c9f6f9 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -731,6 +731,22 @@ static void test_chained_sw_incr(bool unused)
>  		    read_regn_el0(pmevcntr, 0), \
>  		    read_sysreg(pmovsclr_el0))
>  
> +static void enable_chain_counter(int even)
> +{
> +	write_sysreg_s(BIT(even + 1), PMCNTENSET_EL0); /* Enable the high counter first */
> +	isb();
> +	write_sysreg_s(BIT(even), PMCNTENSET_EL0); /* Enable the low counter */
> +	isb();
> +}
> +
> +static void disable_chain_counter(int even)
> +{
> +	write_sysreg_s(BIT(even), PMCNTENCLR_EL0); /* Disable the low counter first*/
> +	isb();
> +	write_sysreg_s(BIT(even + 1), PMCNTENCLR_EL0); /* Disable the high counter */
> +	isb();
> +}
> +
>  static void test_chain_promotion(bool unused)

Here's what test_chain_promotion() does for the first subtest:

static void test_chain_promotion(bool unused)
{
        uint32_t events[] = {MEM_ACCESS, CHAIN};
        void *addr = malloc(PAGE_SIZE);

        if (!satisfy_prerequisites(events, ARRAY_SIZE(events)))
                return;

        /* Only enable CHAIN counter */
        report_prefix_push("subtest1");
        pmu_reset();
        write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
        write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
        write_sysreg_s(0x2, PMCNTENSET_EL0);
        isb();

        mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);

And here's what test_chained_counters() does:

static void test_chained_counters(bool unused)
{
        uint32_t events[] = {CPU_CYCLES, CHAIN};
        uint64_t all_set = pmevcntr_mask();

        if (!satisfy_prerequisites(events, ARRAY_SIZE(events)))
                return;

        pmu_reset();

        write_regn_el0(pmevtyper, 0, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
        write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
        /* enable counters #0 and #1 */
        write_sysreg_s(0x3, PMCNTENSET_EL0);
        write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32);

        precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E);

Why the extra ISB in test_chain_promotion()? Or, if you want to look at it
the other way around, is the ISB missing from test_chained_counters()?

>  {
>  	uint32_t events[] = {MEM_ACCESS, CHAIN};
> @@ -769,16 +785,17 @@ static void test_chain_promotion(bool unused)
>  	/* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */
>  	report_prefix_push("subtest3");
>  	pmu_reset();
> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>  	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
> -	isb();
> +	enable_chain_counter(0);
>  	PRINT_REGS("init");

Here's how subtest3 ends up looking:

        report_prefix_push("subtest3");
        pmu_reset();
        write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
        enable_chain_counter(0);
        PRINT_REGS("init");

        mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);

And here's something similar from test_chained_counters():

        pmu_reset();
        write_sysreg_s(0x3, PMCNTENSET_EL0);

        write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32);
        write_regn_el0(pmevcntr, 1, 0x1);
        precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E);


Why does test_chain_promotion() use enable_chain_counter() and
test_chained_counters() doesn't?

Could probably find more examples of this in test_chain_promotion().

As an aside, it's extremely difficult to figure out how the counters are
programmed for a subtest. In the example above, you need to go back 2
subtests, to the start of test_chain_promotion(), to figure that out. And
that only gets worse the subtest number increases. test_chain_promotion()
would really benefit from being split into separate functions, each with
each own clear initial state. But that's for another patch, not for this
series.

Thanks,
Alex

>  
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  	PRINT_REGS("After 1st loop");
>  
>  	/* disable the CHAIN event */
> -	write_sysreg_s(0x2, PMCNTENCLR_EL0);
> +	disable_chain_counter(0);
> +	write_sysreg_s(0x1, PMCNTENSET_EL0); /* Enable the low counter */
> +	isb();
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  	PRINT_REGS("After 2nd loop");
>  	report(read_sysreg(pmovsclr_el0) == 0x1,
> @@ -799,9 +816,11 @@ static void test_chain_promotion(bool unused)
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  	PRINT_REGS("After 1st loop");
>  
> -	/* enable the CHAIN event */
> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
> +	/* Disable the low counter first and enable the chain counter */
> +	write_sysreg_s(0x1, PMCNTENCLR_EL0);
>  	isb();
> +	enable_chain_counter(0);
> +
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  
>  	PRINT_REGS("After 2nd loop");
> @@ -825,10 +844,10 @@ static void test_chain_promotion(bool unused)
>  	PRINT_REGS("After 1st loop");
>  
>  	/* 0 becomes CHAINED */
> -	write_sysreg_s(0x0, PMCNTENSET_EL0);
> +	write_sysreg_s(0x3, PMCNTENCLR_EL0);
>  	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>  	write_regn_el0(pmevcntr, 1, 0x0);
> +	enable_chain_counter(0);
>  
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  	PRINT_REGS("After 2nd loop");
> @@ -844,13 +863,13 @@ static void test_chain_promotion(bool unused)
>  	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
>  	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>  	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
> +	enable_chain_counter(0);
>  	PRINT_REGS("init");
>  
>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>  	PRINT_REGS("After 1st loop");
>  
> -	write_sysreg_s(0x0, PMCNTENSET_EL0);
> +	disable_chain_counter(0);
>  	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
>  	write_sysreg_s(0x3, PMCNTENSET_EL0);
>  
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test
  2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test Eric Auger
@ 2023-06-16 11:52   ` Alexandru Elisei
  2023-06-19 20:00     ` Eric Auger
  0 siblings, 1 reply; 15+ messages in thread
From: Alexandru Elisei @ 2023-06-16 11:52 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi,

The test looks much more readable now, some comments below.

On Wed, May 31, 2023 at 10:14:37PM +0200, Eric Auger wrote:
> Add a new basic test that runs MEM_ACCESS loop over
> 100 iterations and make sure the number of measured
> MEM_ACCESS never overflows the margin. Some other
> pmu tests rely on this pattern and if the MEM_ACCESS
> measurement is not reliable, it is better to report
> it beforehand and not confuse the user any further.
> 
> Without the subsequent patch, this typically fails on
> ThunderXv2 with the following logs:
> 
> INFO: pmu: pmu-mem-access-reliability: 32-bit overflows:
> overflow=1 min=21 max=41 COUNT=20 MARGIN=15
> FAIL: pmu: pmu-mem-access-reliability: 32-bit overflows:
> mem_access count is reliable
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> v1 -> v2:
> - use mem-access instead of memaccess as suggested by Mark
> - simplify the logic and add comments in the test loop
> ---
>  arm/pmu.c         | 58 +++++++++++++++++++++++++++++++++++++++++++++++
>  arm/unittests.cfg |  6 +++++
>  2 files changed, 64 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 74c9f6f9..925f277c 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -56,6 +56,7 @@
>  #define EXT_COMMON_EVENTS_HIGH	0x403F
>  
>  #define ALL_SET_32		0x00000000FFFFFFFFULL
> +#define ALL_SET_64		0xFFFFFFFFFFFFFFFFULL
>  #define ALL_CLEAR		0x0000000000000000ULL
>  #define PRE_OVERFLOW_32		0x00000000FFFFFFF0ULL
>  #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
> @@ -67,6 +68,10 @@
>   * for some observed variability we take into account a given @MARGIN
>   */
>  #define PRE_OVERFLOW2_32		(ALL_SET_32 - COUNT - MARGIN)
> +#define PRE_OVERFLOW2_64		(ALL_SET_64 - COUNT - MARGIN)
> +
> +#define PRE_OVERFLOW2(__overflow_at_64bits)				\
> +	(__overflow_at_64bits ? PRE_OVERFLOW2_64 : PRE_OVERFLOW2_32)
>  
>  #define PRE_OVERFLOW(__overflow_at_64bits)				\
>  	(__overflow_at_64bits ? PRE_OVERFLOW_64 : PRE_OVERFLOW_32)
> @@ -747,6 +752,56 @@ static void disable_chain_counter(int even)
>  	isb();
>  }
>  
> +/*
> + * This test checks that a mem access loop featuring COUNT accesses
> + * does not overflow with an init value of PRE_OVERFLOW2. It also
> + * records the min/max access count to see how much the counting
> + * is (un)reliable
> + */
> +static void test_mem_access_reliability(bool overflow_at_64bits)
> +{
> +	uint32_t events[] = {MEM_ACCESS};
> +	void *addr = malloc(PAGE_SIZE);
> +	uint64_t count, delta, max = 0, min = pmevcntr_mask();
> +	uint64_t pre_overflow2 = PRE_OVERFLOW2(overflow_at_64bits);
> +	uint64_t pmcr_lp = overflow_at_64bits ? PMU_PMCR_LP : 0;
> +	bool overflow = false;
> +
> +	if (!satisfy_prerequisites(events, ARRAY_SIZE(events)) ||
> +	    !check_overflow_prerequisites(overflow_at_64bits))
> +		return;
> +
> +	pmu_reset();
> +	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
> +	for (int i = 0; i < 100; i++) {
> +		pmu_reset();
> +		write_regn_el0(pmevcntr, 0, pre_overflow2);
> +		write_sysreg_s(0x1, PMCNTENSET_EL0);
> +		isb();
> +		mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
> +		count = read_regn_el0(pmevcntr, 0);
> +		if (count >= pre_overflow2) {
> +			/* not counter overflow, as expected */
> +			delta = count - pre_overflow2;

Personally, I find the names confusing. Since the test tries to see how
much the counting is unreliable, I would have have expected delta to be
difference between the expected number of events incremented (i.e., COUNT)
and the actual number of events recorded in the counter. I would rename
count to cntr_val and delta to num_events, but that might be just personal
bias and I leave it up to you if think this might be useful.

> +		} else {
> +			/*
> +			 * unexpected counter overflow meaning the actual
> +			 * mem access count, delta, is count + COUNT + MARGIN
> +			 */
> +			delta = count + COUNT + MARGIN;

This assumes that PRE_OVERFLOW2_{32,64} = ALL_SET{32,64} - COUNT - MARGIN,
which might change in the future.

I think a better way to do that is:

delta = count + all_set - pre_overflow2, where

all_set = overflow_at_64bits ? ALL_SET64 : ALL_SET32.

That is a lot easier to parse for someone who doesn't know exactly how
PRE_OVERFLOW2_* is defined, and more robust.

> +			overflow = true;
> +			report_info("iter=%d count=%ld min=%ld max=%ld overflow!!!",
> +				    i, delta, min, max);

I find this message extremely confusing: it does not print count (the value
read from counter 0) like the text displayed suggests, it prints delta,
which represents the number of events counted by the counter.

Besides those minor issues, the patch looks correct. Also ran the test on a
rockpro64 under KVM + qemu.

Thanks,
Alex

> +		}
> +		/* record extreme value */
> +		max = MAX(delta, max);
> +		min = MIN(delta, min);
> +	}
> +	report_info("overflow=%d min=%ld max=%ld COUNT=%d MARGIN=%d",
> +		    overflow, min, max, COUNT, MARGIN);
> +	report(!overflow, "mem_access count is reliable");
> +}
> +
>  static void test_chain_promotion(bool unused)
>  {
>  	uint32_t events[] = {MEM_ACCESS, CHAIN};
> @@ -1204,6 +1259,9 @@ int main(int argc, char *argv[])
>  	} else if (strcmp(argv[1], "pmu-basic-event-count") == 0) {
>  		run_event_test(argv[1], test_basic_event_count, false);
>  		run_event_test(argv[1], test_basic_event_count, true);
> +	} else if (strcmp(argv[1], "pmu-mem-access-reliability") == 0) {
> +		run_event_test(argv[1], test_mem_access_reliability, false);
> +		run_event_test(argv[1], test_mem_access_reliability, true);
>  	} else if (strcmp(argv[1], "pmu-mem-access") == 0) {
>  		run_event_test(argv[1], test_mem_access, false);
>  		run_event_test(argv[1], test_mem_access, true);
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index 5e67b558..fe601cbb 100644
> --- a/arm/unittests.cfg
> +++ b/arm/unittests.cfg
> @@ -90,6 +90,12 @@ groups = pmu
>  arch = arm64
>  extra_params = -append 'pmu-mem-access'
>  
> +[pmu-mem-access-reliability]
> +file = pmu.flat
> +groups = pmu
> +arch = arm64
> +extra_params = -append 'pmu-mem-access-reliability'
> +
>  [pmu-sw-incr]
>  file = pmu.flat
>  groups = pmu
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences
  2023-06-16 10:50   ` Alexandru Elisei
@ 2023-06-19 19:57     ` Eric Auger
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2023-06-19 19:57 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi Alexandru,
On 6/16/23 12:50, Alexandru Elisei wrote:
> Hi,
>
> On Wed, May 31, 2023 at 10:14:36PM +0200, Eric Auger wrote:
>> In some ARM ARM ddi0487 revisions it is said that
>> disabling/enabling a pair of counters that are paired
>> by a CHAIN event should follow a given sequence:
>>
>> Enable the high counter first, isb, enable low counter
>> Disable the low counter first, isb, disable the high counter
>>
>> This was the case in Fc. However this is not written anymore
>> in subsequent revions such as Ia.
>>
>> Anyway, just in case, and because it also makes the code a
>> little bit simpler, introduce 2 helpers to enable/disable chain
>> counters that execute those sequences and replace the existing
>> PMCNTENCLR/ENSET calls (at least this cannot do any harm).
>>
>> Also fix 2 write_sysreg_s(0x0, PMCNTENSET_EL0) in subtest 5 & 6
>> and replace them by PMCNTENCLR writes since writing 0 in
>> PMCNTENSET_EL0 has no effect.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v1 -> v2:
>> - fix the enable_chain_counter()/disable_chain_counter()
>>   sequence, ie. swap n + 1 / n as reported by Alexandru.
>> - fix an other comment using the 'low' terminology
>> ---
>>  arm/pmu.c | 37 ++++++++++++++++++++++++++++---------
>>  1 file changed, 28 insertions(+), 9 deletions(-)
>>
>> diff --git a/arm/pmu.c b/arm/pmu.c
>> index 74dd4c10..74c9f6f9 100644
>> --- a/arm/pmu.c
>> +++ b/arm/pmu.c
>> @@ -731,6 +731,22 @@ static void test_chained_sw_incr(bool unused)
>>  		    read_regn_el0(pmevcntr, 0), \
>>  		    read_sysreg(pmovsclr_el0))
>>  
>> +static void enable_chain_counter(int even)
>> +{
>> +	write_sysreg_s(BIT(even + 1), PMCNTENSET_EL0); /* Enable the high counter first */
>> +	isb();
>> +	write_sysreg_s(BIT(even), PMCNTENSET_EL0); /* Enable the low counter */
>> +	isb();
>> +}
>> +
>> +static void disable_chain_counter(int even)
>> +{
>> +	write_sysreg_s(BIT(even), PMCNTENCLR_EL0); /* Disable the low counter first*/
>> +	isb();
>> +	write_sysreg_s(BIT(even + 1), PMCNTENCLR_EL0); /* Disable the high counter */
>> +	isb();
>> +}
>> +
>>  static void test_chain_promotion(bool unused)
> Here's what test_chain_promotion() does for the first subtest:
>
> static void test_chain_promotion(bool unused)
> {
>         uint32_t events[] = {MEM_ACCESS, CHAIN};
>         void *addr = malloc(PAGE_SIZE);
>
>         if (!satisfy_prerequisites(events, ARRAY_SIZE(events)))
>                 return;
>
>         /* Only enable CHAIN counter */
>         report_prefix_push("subtest1");
>         pmu_reset();
>         write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
>         write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>         write_sysreg_s(0x2, PMCNTENSET_EL0);
>         isb();
>
>         mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>
> And here's what test_chained_counters() does:
>
> static void test_chained_counters(bool unused)
> {
>         uint32_t events[] = {CPU_CYCLES, CHAIN};
>         uint64_t all_set = pmevcntr_mask();
>
>         if (!satisfy_prerequisites(events, ARRAY_SIZE(events)))
>                 return;
>
>         pmu_reset();
>
>         write_regn_el0(pmevtyper, 0, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
>         write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>         /* enable counters #0 and #1 */
>         write_sysreg_s(0x3, PMCNTENSET_EL0);
>         write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32);
>
>         precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E);
>
> Why the extra ISB in test_chain_promotion()? Or, if you want to look at it
> the other way around, is the ISB missing from test_chained_counters()?

agreed. as mentionned below, enable_chain_counter() can also be
used here and this issues an isb().
>
>>  {
>>  	uint32_t events[] = {MEM_ACCESS, CHAIN};
>> @@ -769,16 +785,17 @@ static void test_chain_promotion(bool unused)
>>  	/* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */
>>  	report_prefix_push("subtest3");
>>  	pmu_reset();
>> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>>  	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
>> -	isb();
>> +	enable_chain_counter(0);
>>  	PRINT_REGS("init");
> Here's how subtest3 ends up looking:
>
>         report_prefix_push("subtest3");
>         pmu_reset();
>         write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
>         enable_chain_counter(0);
>         PRINT_REGS("init");
>
>         mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>
> And here's something similar from test_chained_counters():
>
>         pmu_reset();
>         write_sysreg_s(0x3, PMCNTENSET_EL0);
>
>         write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32);
>         write_regn_el0(pmevcntr, 1, 0x1);
>         precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E);
>
>
> Why does test_chain_promotion() use enable_chain_counter() and
> test_chained_counters() doesn't?
>
> Could probably find more examples of this in test_chain_promotion().

agreed. I used enable_chain_counter() in test_chain_promotion()
and test_chained_sw_incr() whenever possible. In other case, both
counters are not set in 'chained' mode.
>
> As an aside, it's extremely difficult to figure out how the counters are
> programmed for a subtest. In the example above, you need to go back 2
> subtests, to the start of test_chain_promotion(), to figure that out. And
> that only gets worse the subtest number increases. test_chain_promotion()
> would really benefit from being split into separate functions, each with
> each own clear initial state. But that's for another patch, not for this
> series.

Yes I do agree with you. This subset splitting was not the best idea
ever. That can be reworked separately indeed.

Thank you for your time!

Eric
>
> Thanks,
> Alex
>
>>  
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  	PRINT_REGS("After 1st loop");
>>  
>>  	/* disable the CHAIN event */
>> -	write_sysreg_s(0x2, PMCNTENCLR_EL0);
>> +	disable_chain_counter(0);
>> +	write_sysreg_s(0x1, PMCNTENSET_EL0); /* Enable the low counter */
>> +	isb();
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  	PRINT_REGS("After 2nd loop");
>>  	report(read_sysreg(pmovsclr_el0) == 0x1,
>> @@ -799,9 +816,11 @@ static void test_chain_promotion(bool unused)
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  	PRINT_REGS("After 1st loop");
>>  
>> -	/* enable the CHAIN event */
>> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>> +	/* Disable the low counter first and enable the chain counter */
>> +	write_sysreg_s(0x1, PMCNTENCLR_EL0);
>>  	isb();
>> +	enable_chain_counter(0);
>> +
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  
>>  	PRINT_REGS("After 2nd loop");
>> @@ -825,10 +844,10 @@ static void test_chain_promotion(bool unused)
>>  	PRINT_REGS("After 1st loop");
>>  
>>  	/* 0 becomes CHAINED */
>> -	write_sysreg_s(0x0, PMCNTENSET_EL0);
>> +	write_sysreg_s(0x3, PMCNTENCLR_EL0);
>>  	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>>  	write_regn_el0(pmevcntr, 1, 0x0);
>> +	enable_chain_counter(0);
>>  
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  	PRINT_REGS("After 2nd loop");
>> @@ -844,13 +863,13 @@ static void test_chain_promotion(bool unused)
>>  	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
>>  	write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0);
>>  	write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32);
>> -	write_sysreg_s(0x3, PMCNTENSET_EL0);
>> +	enable_chain_counter(0);
>>  	PRINT_REGS("init");
>>  
>>  	mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E);
>>  	PRINT_REGS("After 1st loop");
>>  
>> -	write_sysreg_s(0x0, PMCNTENSET_EL0);
>> +	disable_chain_counter(0);
>>  	write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0);
>>  	write_sysreg_s(0x3, PMCNTENSET_EL0);
>>  
>> -- 
>> 2.38.1
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test
  2023-06-16 11:52   ` Alexandru Elisei
@ 2023-06-19 20:00     ` Eric Auger
  2023-06-30 14:43       ` Alexandru Elisei
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Auger @ 2023-06-19 20:00 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi,

On 6/16/23 13:52, Alexandru Elisei wrote:
> Hi,
>
> The test looks much more readable now, some comments below.
>
> On Wed, May 31, 2023 at 10:14:37PM +0200, Eric Auger wrote:
>> Add a new basic test that runs MEM_ACCESS loop over
>> 100 iterations and make sure the number of measured
>> MEM_ACCESS never overflows the margin. Some other
>> pmu tests rely on this pattern and if the MEM_ACCESS
>> measurement is not reliable, it is better to report
>> it beforehand and not confuse the user any further.
>>
>> Without the subsequent patch, this typically fails on
>> ThunderXv2 with the following logs:
>>
>> INFO: pmu: pmu-mem-access-reliability: 32-bit overflows:
>> overflow=1 min=21 max=41 COUNT=20 MARGIN=15
>> FAIL: pmu: pmu-mem-access-reliability: 32-bit overflows:
>> mem_access count is reliable
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> v1 -> v2:
>> - use mem-access instead of memaccess as suggested by Mark
>> - simplify the logic and add comments in the test loop
>> ---
>>  arm/pmu.c         | 58 +++++++++++++++++++++++++++++++++++++++++++++++
>>  arm/unittests.cfg |  6 +++++
>>  2 files changed, 64 insertions(+)
>>
>> diff --git a/arm/pmu.c b/arm/pmu.c
>> index 74c9f6f9..925f277c 100644
>> --- a/arm/pmu.c
>> +++ b/arm/pmu.c
>> @@ -56,6 +56,7 @@
>>  #define EXT_COMMON_EVENTS_HIGH	0x403F
>>  
>>  #define ALL_SET_32		0x00000000FFFFFFFFULL
>> +#define ALL_SET_64		0xFFFFFFFFFFFFFFFFULL
>>  #define ALL_CLEAR		0x0000000000000000ULL
>>  #define PRE_OVERFLOW_32		0x00000000FFFFFFF0ULL
>>  #define PRE_OVERFLOW_64		0xFFFFFFFFFFFFFFF0ULL
>> @@ -67,6 +68,10 @@
>>   * for some observed variability we take into account a given @MARGIN
>>   */
>>  #define PRE_OVERFLOW2_32		(ALL_SET_32 - COUNT - MARGIN)
>> +#define PRE_OVERFLOW2_64		(ALL_SET_64 - COUNT - MARGIN)
>> +
>> +#define PRE_OVERFLOW2(__overflow_at_64bits)				\
>> +	(__overflow_at_64bits ? PRE_OVERFLOW2_64 : PRE_OVERFLOW2_32)
>>  
>>  #define PRE_OVERFLOW(__overflow_at_64bits)				\
>>  	(__overflow_at_64bits ? PRE_OVERFLOW_64 : PRE_OVERFLOW_32)
>> @@ -747,6 +752,56 @@ static void disable_chain_counter(int even)
>>  	isb();
>>  }
>>  
>> +/*
>> + * This test checks that a mem access loop featuring COUNT accesses
>> + * does not overflow with an init value of PRE_OVERFLOW2. It also
>> + * records the min/max access count to see how much the counting
>> + * is (un)reliable
>> + */
>> +static void test_mem_access_reliability(bool overflow_at_64bits)
>> +{
>> +	uint32_t events[] = {MEM_ACCESS};
>> +	void *addr = malloc(PAGE_SIZE);
>> +	uint64_t count, delta, max = 0, min = pmevcntr_mask();
>> +	uint64_t pre_overflow2 = PRE_OVERFLOW2(overflow_at_64bits);
>> +	uint64_t pmcr_lp = overflow_at_64bits ? PMU_PMCR_LP : 0;
>> +	bool overflow = false;
>> +
>> +	if (!satisfy_prerequisites(events, ARRAY_SIZE(events)) ||
>> +	    !check_overflow_prerequisites(overflow_at_64bits))
>> +		return;
>> +
>> +	pmu_reset();
>> +	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
>> +	for (int i = 0; i < 100; i++) {
>> +		pmu_reset();
>> +		write_regn_el0(pmevcntr, 0, pre_overflow2);
>> +		write_sysreg_s(0x1, PMCNTENSET_EL0);
>> +		isb();
>> +		mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
>> +		count = read_regn_el0(pmevcntr, 0);
>> +		if (count >= pre_overflow2) {
>> +			/* not counter overflow, as expected */
>> +			delta = count - pre_overflow2;
> Personally, I find the names confusing. Since the test tries to see how
> much the counting is unreliable, I would have have expected delta to be
> difference between the expected number of events incremented (i.e., COUNT)
> and the actual number of events recorded in the counter. I would rename
> count to cntr_val and delta to num_events, but that might be just personal
> bias and I leave it up to you if think this might be useful.
I followed your suggestion
>
>> +		} else {
>> +			/*
>> +			 * unexpected counter overflow meaning the actual
>> +			 * mem access count, delta, is count + COUNT + MARGIN
>> +			 */
>> +			delta = count + COUNT + MARGIN;
> This assumes that PRE_OVERFLOW2_{32,64} = ALL_SET{32,64} - COUNT - MARGIN,
> which might change in the future.
>
> I think a better way to do that is:
>
> delta = count + all_set - pre_overflow2, where
>
> all_set = overflow_at_64bits ? ALL_SET64 : ALL_SET32.
>
> That is a lot easier to parse for someone who doesn't know exactly how
> PRE_OVERFLOW2_* is defined, and more robust.

OK I also adopted this suggestion.
>
>> +			overflow = true;
>> +			report_info("iter=%d count=%ld min=%ld max=%ld overflow!!!",
>> +				    i, delta, min, max);
> I find this message extremely confusing: it does not print count (the value
> read from counter 0) like the text displayed suggests, it prints delta,
> which represents the number of events counted by the counter.
I now use num_events instead.
>
> Besides those minor issues, the patch looks correct. Also ran the test on a
> rockpro64 under KVM + qemu.
cool. Many thanks for the testing!

Eric
>
> Thanks,
> Alex
>
>> +		}
>> +		/* record extreme value */
>> +		max = MAX(delta, max);
>> +		min = MIN(delta, min);
>> +	}
>> +	report_info("overflow=%d min=%ld max=%ld COUNT=%d MARGIN=%d",
>> +		    overflow, min, max, COUNT, MARGIN);
>> +	report(!overflow, "mem_access count is reliable");
>> +}
>> +
>>  static void test_chain_promotion(bool unused)
>>  {
>>  	uint32_t events[] = {MEM_ACCESS, CHAIN};
>> @@ -1204,6 +1259,9 @@ int main(int argc, char *argv[])
>>  	} else if (strcmp(argv[1], "pmu-basic-event-count") == 0) {
>>  		run_event_test(argv[1], test_basic_event_count, false);
>>  		run_event_test(argv[1], test_basic_event_count, true);
>> +	} else if (strcmp(argv[1], "pmu-mem-access-reliability") == 0) {
>> +		run_event_test(argv[1], test_mem_access_reliability, false);
>> +		run_event_test(argv[1], test_mem_access_reliability, true);
>>  	} else if (strcmp(argv[1], "pmu-mem-access") == 0) {
>>  		run_event_test(argv[1], test_mem_access, false);
>>  		run_event_test(argv[1], test_mem_access, true);
>> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
>> index 5e67b558..fe601cbb 100644
>> --- a/arm/unittests.cfg
>> +++ b/arm/unittests.cfg
>> @@ -90,6 +90,12 @@ groups = pmu
>>  arch = arm64
>>  extra_params = -append 'pmu-mem-access'
>>  
>> +[pmu-mem-access-reliability]
>> +file = pmu.flat
>> +groups = pmu
>> +arch = arm64
>> +extra_params = -append 'pmu-mem-access-reliability'
>> +
>>  [pmu-sw-incr]
>>  file = pmu.flat
>>  groups = pmu
>> -- 
>> 2.38.1
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test
  2023-06-19 20:00     ` Eric Auger
@ 2023-06-30 14:43       ` Alexandru Elisei
  0 siblings, 0 replies; 15+ messages in thread
From: Alexandru Elisei @ 2023-06-30 14:43 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, kvm, kvmarm, andrew.jones, maz, will,
	oliver.upton, ricarkol, reijiw, mark.rutland

Hi Eric,

On Mon, Jun 19, 2023 at 10:00:11PM +0200, Eric Auger wrote:
> Hi,
> 
> On 6/16/23 13:52, Alexandru Elisei wrote:
> > Hi,
> >
> > The test looks much more readable now, some comments below.
> >
> > On Wed, May 31, 2023 at 10:14:37PM +0200, Eric Auger wrote:
> >> [..]
> >> +static void test_mem_access_reliability(bool overflow_at_64bits)
> >> +{
> >> +	uint32_t events[] = {MEM_ACCESS};
> >> +	void *addr = malloc(PAGE_SIZE);
> >> +	uint64_t count, delta, max = 0, min = pmevcntr_mask();
> >> +	uint64_t pre_overflow2 = PRE_OVERFLOW2(overflow_at_64bits);
> >> +	uint64_t pmcr_lp = overflow_at_64bits ? PMU_PMCR_LP : 0;
> >> +	bool overflow = false;
> >> +
> >> +	if (!satisfy_prerequisites(events, ARRAY_SIZE(events)) ||
> >> +	    !check_overflow_prerequisites(overflow_at_64bits))
> >> +		return;
> >> +
> >> +	pmu_reset();
> >> +	write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0);
> >> +	for (int i = 0; i < 100; i++) {
> >> +		pmu_reset();
> >> +		write_regn_el0(pmevcntr, 0, pre_overflow2);
> >> +		write_sysreg_s(0x1, PMCNTENSET_EL0);
> >> +		isb();
> >> +		mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E | pmcr_lp);
> >> +		count = read_regn_el0(pmevcntr, 0);
> >> +		if (count >= pre_overflow2) {
> >> +			/* not counter overflow, as expected */
> >> +			delta = count - pre_overflow2;
> > Personally, I find the names confusing. Since the test tries to see how
> > much the counting is unreliable, I would have have expected delta to be
> > difference between the expected number of events incremented (i.e., COUNT)
> > and the actual number of events recorded in the counter. I would rename
> > count to cntr_val and delta to num_events, but that might be just personal
> > bias and I leave it up to you if think this might be useful.
> I followed your suggestion

Sorry for that, I guess I didn't think things through :(

Thanks,
Alex

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-06-30 14:43 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-31 20:14 [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Eric Auger
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 1/6] arm: pmu: pmu-chain-promotion: Improve debug messages Eric Auger
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 2/6] arm: pmu: pmu-chain-promotion: Introduce defines for count and margin values Eric Auger
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 3/6] arm: pmu: Add extra DSB barriers in the mem_access loop Eric Auger
2023-06-16  9:42   ` Alexandru Elisei
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 4/6] arm: pmu: Fix chain counter enable/disable sequences Eric Auger
2023-06-16 10:50   ` Alexandru Elisei
2023-06-19 19:57     ` Eric Auger
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 5/6] arm: pmu: Add pmu-mem-access-reliability test Eric Auger
2023-06-16 11:52   ` Alexandru Elisei
2023-06-19 20:00     ` Eric Auger
2023-06-30 14:43       ` Alexandru Elisei
2023-05-31 20:14 ` [kvm-unit-tests PATCH v2 6/6] arm: pmu-chain-promotion: Increase the count and margin values Eric Auger
2023-06-07 19:07 ` [kvm-unit-tests PATCH v2 0/6] arm: pmu: Fix random failures of pmu-chain-promotion Andrew Jones
2023-06-08 16:38   ` Alexandru Elisei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.