* [kvm-unit-tests PATCH v9 0/3] ARM PMU tests
@ 2016-11-19 4:15 Wei Huang
  2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 1/3] arm: Add PMU test Wei Huang
  ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Wei Huang @ 2016-11-19 4:15 UTC (permalink / raw)
To: linux-arm-kernel

Changes from v8:
* Probe PMU version based on ID_DFR0
* pmccntr_read() now returns 64bit and can handle both 32bit and 64bit
  PMCCNTR based on PMU version.
* Add pmccntr_write() support
* Use a common printf format PRId64 to support 64bit variables smoothly
  in test functions
* Add barriers to several PMU write functions
* Verified on different execution modes

Note:
1) Current KVM code has bugs in handling PMCCFILTR write. A fix (see
below) is required for this unit test code to work correctly under KVM
mode.
https://lists.cs.columbia.edu/pipermail/kvmarm/2016-November/022134.html

Thanks,
-Wei

Wei Huang (3):
  arm: Add PMU test
  arm: pmu: Check cycle count increases
  arm: pmu: Add CPI checking

 arm/Makefile.common |   3 +-
 arm/pmu.c           | 339 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg   |  19 +++
 3 files changed, 360 insertions(+), 1 deletion(-)
 create mode 100644 arm/pmu.c

--
1.8.3.1

^ permalink raw reply [flat|nested] 6+ messages in thread
* [kvm-unit-tests PATCH v9 1/3] arm: Add PMU test 2016-11-19 4:15 [kvm-unit-tests PATCH v9 0/3] ARM PMU tests Wei Huang @ 2016-11-19 4:15 ` Wei Huang 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases Wei Huang 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking Wei Huang 2 siblings, 0 replies; 6+ messages in thread From: Wei Huang @ 2016-11-19 4:15 UTC (permalink / raw) To: linux-arm-kernel From: Christopher Covington <cov@codeaurora.org> Beginning with a simple sanity check of the control register, add a unit test for the ARM Performance Monitors Unit (PMU). Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Wei Huang <wei@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> --- arm/Makefile.common | 3 ++- arm/pmu.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++ arm/unittests.cfg | 5 ++++ 3 files changed, 81 insertions(+), 1 deletion(-) create mode 100644 arm/pmu.c diff --git a/arm/Makefile.common b/arm/Makefile.common index ccb554d..f98f422 100644 --- a/arm/Makefile.common +++ b/arm/Makefile.common @@ -11,7 +11,8 @@ endif tests-common = \ $(TEST_DIR)/selftest.flat \ - $(TEST_DIR)/spinlock-test.flat + $(TEST_DIR)/spinlock-test.flat \ + $(TEST_DIR)/pmu.flat all: test_cases diff --git a/arm/pmu.c b/arm/pmu.c new file mode 100644 index 0000000..9d9c53b --- /dev/null +++ b/arm/pmu.c @@ -0,0 +1,74 @@ +/* + * Test the ARM Performance Monitors Unit (PMU). + * + * Copyright (c) 2015-2016, The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU Lesser General Public License version 2.1 and + * only version 2.1 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU Lesser General Public License + * for more details. + */ +#include "libcflat.h" +#include "asm/barrier.h" + +#define PMU_PMCR_N_SHIFT 11 +#define PMU_PMCR_N_MASK 0x1f +#define PMU_PMCR_ID_SHIFT 16 +#define PMU_PMCR_ID_MASK 0xff +#define PMU_PMCR_IMP_SHIFT 24 +#define PMU_PMCR_IMP_MASK 0xff + +#if defined(__arm__) +static inline uint32_t pmcr_read(void) +{ + uint32_t ret; + + asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret)); + return ret; +} +#elif defined(__aarch64__) +static inline uint32_t pmcr_read(void) +{ + uint32_t ret; + + asm volatile("mrs %0, pmcr_el0" : "=r" (ret)); + return ret; +} +#endif + +/* + * As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't + * null. Also print out a couple other interesting fields for diagnostic + * purposes. For example, as of fall 2016, QEMU TCG mode doesn't implement + * event counters and therefore reports zero event counters, but hopefully + * support for at least the instructions event will be added in the future and + * the reported number of event counters will become nonzero. + */ +static bool check_pmcr(void) +{ + uint32_t pmcr; + + pmcr = pmcr_read(); + + printf("PMU implementer: %c\n", + (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK); + printf("Identification code: 0x%x\n", + (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK); + printf("Event counters: %d\n", + (pmcr >> PMU_PMCR_N_SHIFT) & PMU_PMCR_N_MASK); + + return ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) != 0; +} + +int main(void) +{ + report_prefix_push("pmu"); + + report("Control register", check_pmcr()); + + return report_summary(); +} diff --git a/arm/unittests.cfg b/arm/unittests.cfg index 3f6fa45..7645180 100644 --- a/arm/unittests.cfg +++ b/arm/unittests.cfg @@ -54,3 +54,8 @@ file = selftest.flat smp = $MAX_SMP extra_params = -append 'smp' groups = selftest + +# Test PMU support +[pmu] +file = pmu.flat +groups = pmu -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases 2016-11-19 4:15 [kvm-unit-tests PATCH v9 0/3] ARM PMU tests Wei Huang 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 1/3] arm: Add PMU test Wei Huang @ 2016-11-19 4:15 ` Wei Huang 2016-11-21 9:14 ` Andrew Jones 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking Wei Huang 2 siblings, 1 reply; 6+ messages in thread From: Wei Huang @ 2016-11-19 4:15 UTC (permalink / raw) To: linux-arm-kernel From: Christopher Covington <cov@codeaurora.org> Ensure that reads of the PMCCNTR_EL0 are monotonically increasing, even for the smallest delta of two subsequent reads. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Wei Huang <wei@redhat.com> --- arm/pmu.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 156 insertions(+) diff --git a/arm/pmu.c b/arm/pmu.c index 9d9c53b..fa87de4 100644 --- a/arm/pmu.c +++ b/arm/pmu.c @@ -15,6 +15,9 @@ #include "libcflat.h" #include "asm/barrier.h" +#define PMU_PMCR_E (1 << 0) +#define PMU_PMCR_C (1 << 2) +#define PMU_PMCR_LC (1 << 6) #define PMU_PMCR_N_SHIFT 11 #define PMU_PMCR_N_MASK 0x1f #define PMU_PMCR_ID_SHIFT 16 @@ -22,6 +25,14 @@ #define PMU_PMCR_IMP_SHIFT 24 #define PMU_PMCR_IMP_MASK 0xff +#define ID_DFR0_PERFMON_SHIFT 24 +#define ID_DFR0_PERFMON_MASK 0xf + +#define PMU_CYCLE_IDX 31 + +#define NR_SAMPLES 10 + +static unsigned int pmu_version; #if defined(__arm__) static inline uint32_t pmcr_read(void) { @@ -30,6 +41,69 @@ static inline uint32_t pmcr_read(void) asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret)); return ret; } + +static inline void pmcr_write(uint32_t value) +{ + asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r" (value)); + isb(); +} + +static inline void pmselr_write(uint32_t value) +{ + asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (value)); + isb(); +} + +static inline void pmxevtyper_write(uint32_t value) +{ + asm volatile("mcr p15, 0, %0, c9, c13, 1" : : "r" 
(value)); +} + +static inline uint64_t pmccntr_read(void) +{ + uint32_t lo, hi = 0; + + if (pmu_version == 0x3) + asm volatile("mrrc p15, 0, %0, %1, c9" : "=r" (lo), "=r" (hi)); + else + asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (lo)); + + return ((uint64_t)hi << 32) | lo; +} + +static inline void pmccntr_write(uint64_t value) +{ + uint32_t lo, hi; + + lo = value & 0xffffffff; + hi = (value >> 32) & 0xffffffff; + + if (pmu_version == 0x3) + asm volatile("mcrr p15, 0, %0, %1, c9" : : "r" (lo), "r" (hi)); + else + asm volatile("mcr p15, 0, %0, c9, c13, 0" : : "r" (lo)); +} + +static inline void pmcntenset_write(uint32_t value) +{ + asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (value)); +} + +/* PMCCFILTR is an obsolete name for PMXEVTYPER31 in ARMv7 */ +static inline void pmccfiltr_write(uint32_t value) +{ + pmselr_write(PMU_CYCLE_IDX); + pmxevtyper_write(value); + isb(); +} + +static inline uint32_t id_dfr0_read(void) +{ + uint32_t val; + + asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val)); + return val; +} #elif defined(__aarch64__) static inline uint32_t pmcr_read(void) { @@ -38,6 +112,44 @@ static inline uint32_t pmcr_read(void) asm volatile("mrs %0, pmcr_el0" : "=r" (ret)); return ret; } + +static inline void pmcr_write(uint32_t value) +{ + asm volatile("msr pmcr_el0, %0" : : "r" (value)); + isb(); +} + +static inline uint64_t pmccntr_read(void) +{ + uint64_t cycles; + + asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles)); + return cycles; +} + +static inline void pmccntr_write(uint64_t value) +{ + asm volatile("msr pmccntr_el0, %0" : : "r" (value)); +} + +static inline void pmcntenset_write(uint32_t value) +{ + asm volatile("msr pmcntenset_el0, %0" : : "r" (value)); +} + +static inline void pmccfiltr_write(uint32_t value) +{ + asm volatile("msr pmccfiltr_el0, %0" : : "r" (value)); + isb(); +} + +static inline uint32_t id_dfr0_read(void) +{ + uint32_t id; + + asm volatile("mrs %0, id_dfr0_el1" : "=r" (id)); + return id; +} #endif /* @@ -64,11 
+176,55 @@ static bool check_pmcr(void) return ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) != 0; } +/* + * Ensure that the cycle counter progresses between back-to-back reads. + */ +static bool check_cycles_increase(void) +{ + bool success = true; + + pmccntr_write(0); + pmcr_write(pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E); + + for (int i = 0; i < NR_SAMPLES; i++) { + uint64_t a, b; + + a = pmccntr_read(); + b = pmccntr_read(); + + if (a >= b) { + printf("Read %"PRId64" then %"PRId64".\n", a, b); + success = false; + break; + } + } + + pmcr_write(pmcr_read() & ~PMU_PMCR_E); + + return success; +} + +void pmu_init(void) +{ + uint32_t dfr0; + + /* probe pmu version */ + dfr0 = id_dfr0_read(); + pmu_version = (dfr0 >> ID_DFR0_PERFMON_SHIFT) & ID_DFR0_PERFMON_MASK; + printf("PMU version: %d\n", pmu_version); + + /* init for PMU event access, right now only care about cycle count */ + pmcntenset_write(1 << PMU_CYCLE_IDX); + pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */ +} + int main(void) { report_prefix_push("pmu"); + pmu_init(); report("Control register", check_pmcr()); + report("Monotonically increasing cycle count", check_cycles_increase()); return report_summary(); } -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases Wei Huang @ 2016-11-21 9:14 ` Andrew Jones 0 siblings, 0 replies; 6+ messages in thread From: Andrew Jones @ 2016-11-21 9:14 UTC (permalink / raw) To: linux-arm-kernel On Fri, Nov 18, 2016 at 10:15:41PM -0600, Wei Huang wrote: > From: Christopher Covington <cov@codeaurora.org> > > Ensure that reads of the PMCCNTR_EL0 are monotonically increasing, > even for the smallest delta of two subsequent reads. > > Signed-off-by: Christopher Covington <cov@codeaurora.org> > Signed-off-by: Wei Huang <wei@redhat.com> > --- > arm/pmu.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 156 insertions(+) > > diff --git a/arm/pmu.c b/arm/pmu.c > index 9d9c53b..fa87de4 100644 > --- a/arm/pmu.c > +++ b/arm/pmu.c > @@ -15,6 +15,9 @@ > #include "libcflat.h" > #include "asm/barrier.h" > > +#define PMU_PMCR_E (1 << 0) > +#define PMU_PMCR_C (1 << 2) > +#define PMU_PMCR_LC (1 << 6) > #define PMU_PMCR_N_SHIFT 11 > #define PMU_PMCR_N_MASK 0x1f > #define PMU_PMCR_ID_SHIFT 16 > @@ -22,6 +25,14 @@ > #define PMU_PMCR_IMP_SHIFT 24 > #define PMU_PMCR_IMP_MASK 0xff > > +#define ID_DFR0_PERFMON_SHIFT 24 > +#define ID_DFR0_PERFMON_MASK 0xf > + > +#define PMU_CYCLE_IDX 31 > + > +#define NR_SAMPLES 10 > + > +static unsigned int pmu_version; > #if defined(__arm__) > static inline uint32_t pmcr_read(void) > { > @@ -30,6 +41,69 @@ static inline uint32_t pmcr_read(void) > asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret)); > return ret; > } > + > +static inline void pmcr_write(uint32_t value) > +{ > + asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r" (value)); > + isb(); > +} > + > +static inline void pmselr_write(uint32_t value) > +{ > + asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (value)); > + isb(); > +} > + > +static inline void pmxevtyper_write(uint32_t value) > +{ > + asm volatile("mcr p15, 0, %0, 
c9, c13, 1" : : "r" (value)); > +} > + > +static inline uint64_t pmccntr_read(void) > +{ > + uint32_t lo, hi = 0; > + > + if (pmu_version == 0x3) > + asm volatile("mrrc p15, 0, %0, %1, c9" : "=r" (lo), "=r" (hi)); > + else > + asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (lo)); This is fine for this single test, but I was actually thinking we'd implement both the 32-bit and 64-bit accessors separately, allowing us to add an additional test that tries both on AArch32 (PMUv3). I.e. report_prefix_push("pmccntr"); report("32-bit read", pmccntr_read() != 0); /* test mrc access */ if (pmu_version == 0x3) { u64 cntr64 = pmccntr_read64(); /* test mrrc access */ report("64-bit read", (cntr64 >> 32) == 0 && (u32)cntr64 != 0); } else { report_skip("64-bit read, PMUv3 required"); } We can refactor when we add that test later though. > + > + return ((uint64_t)hi << 32) | lo; > +} > + > +static inline void pmccntr_write(uint64_t value) > +{ > + uint32_t lo, hi; > + > + lo = value & 0xffffffff; > + hi = (value >> 32) & 0xffffffff; > + > + if (pmu_version == 0x3) > + asm volatile("mcrr p15, 0, %0, %1, c9" : : "r" (lo), "r" (hi)); > + else > + asm volatile("mcr p15, 0, %0, c9, c13, 0" : : "r" (lo)); > +} > + > +static inline void pmcntenset_write(uint32_t value) > +{ > + asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (value)); > +} > + > +/* PMCCFILTR is an obsolete name for PMXEVTYPER31 in ARMv7 */ > +static inline void pmccfiltr_write(uint32_t value) > +{ > + pmselr_write(PMU_CYCLE_IDX); > + pmxevtyper_write(value); > + isb(); > +} > + > +static inline uint32_t id_dfr0_read(void) > +{ > + uint32_t val; > + > + asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val)); > + return val; > +} > #elif defined(__aarch64__) > static inline uint32_t pmcr_read(void) > { > @@ -38,6 +112,44 @@ static inline uint32_t pmcr_read(void) > asm volatile("mrs %0, pmcr_el0" : "=r" (ret)); > return ret; > } > + > +static inline void pmcr_write(uint32_t value) > +{ > + asm volatile("msr pmcr_el0, 
%0" : : "r" (value)); > + isb(); > +} > + > +static inline uint64_t pmccntr_read(void) > +{ > + uint64_t cycles; > + > + asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles)); > + return cycles; > +} > + > +static inline void pmccntr_write(uint64_t value) > +{ > + asm volatile("msr pmccntr_el0, %0" : : "r" (value)); > +} > + > +static inline void pmcntenset_write(uint32_t value) > +{ > + asm volatile("msr pmcntenset_el0, %0" : : "r" (value)); > +} > + > +static inline void pmccfiltr_write(uint32_t value) > +{ > + asm volatile("msr pmccfiltr_el0, %0" : : "r" (value)); > + isb(); > +} > + > +static inline uint32_t id_dfr0_read(void) > +{ > + uint32_t id; > + > + asm volatile("mrs %0, id_dfr0_el1" : "=r" (id)); > + return id; > +} > #endif > > /* > @@ -64,11 +176,55 @@ static bool check_pmcr(void) > return ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) != 0; > } > > +/* > + * Ensure that the cycle counter progresses between back-to-back reads. > + */ > +static bool check_cycles_increase(void) > +{ > + bool success = true; > + > + pmccntr_write(0); > + pmcr_write(pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E); > + > + for (int i = 0; i < NR_SAMPLES; i++) { > + uint64_t a, b; > + > + a = pmccntr_read(); > + b = pmccntr_read(); > + > + if (a >= b) { > + printf("Read %"PRId64" then %"PRId64".\n", a, b); > + success = false; > + break; > + } > + } > + > + pmcr_write(pmcr_read() & ~PMU_PMCR_E); > + > + return success; > +} > + > +void pmu_init(void) > +{ > + uint32_t dfr0; > + > + /* probe pmu version */ > + dfr0 = id_dfr0_read(); > + pmu_version = (dfr0 >> ID_DFR0_PERFMON_SHIFT) & ID_DFR0_PERFMON_MASK; > + printf("PMU version: %d\n", pmu_version); > + > + /* init for PMU event access, right now only care about cycle count */ > + pmcntenset_write(1 << PMU_CYCLE_IDX); > + pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */ These last two register writes are specific to the cycle count test, so I don't think they belong in a common pmu_init function. 
I'd keep them at the top of check_cycles_increase. > +} > + > int main(void) > { > report_prefix_push("pmu"); > > + pmu_init(); > report("Control register", check_pmcr()); > + report("Monotonically increasing cycle count", check_cycles_increase()); > > return report_summary(); > } > -- > 1.8.3.1 > Thanks, drew ^ permalink raw reply [flat|nested] 6+ messages in thread
* [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking 2016-11-19 4:15 [kvm-unit-tests PATCH v9 0/3] ARM PMU tests Wei Huang 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 1/3] arm: Add PMU test Wei Huang 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases Wei Huang @ 2016-11-19 4:15 ` Wei Huang 2016-11-21 9:41 ` Andrew Jones 2 siblings, 1 reply; 6+ messages in thread From: Wei Huang @ 2016-11-19 4:15 UTC (permalink / raw) To: linux-arm-kernel From: Christopher Covington <cov@codeaurora.org> Calculate the number of cycles per instruction (CPI) implied by ARM PMU cycle counter values. The code includes a strict checking facility intended for use with the -icount option in TCG mode, enabled via the configuration file. Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Wei Huang <wei@redhat.com> --- arm/pmu.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++- arm/unittests.cfg | 14 +++++++ 2 files changed, 124 insertions(+), 1 deletion(-) diff --git a/arm/pmu.c b/arm/pmu.c index fa87de4..b36c4fb 100644 --- a/arm/pmu.c +++ b/arm/pmu.c @@ -104,6 +104,25 @@ static inline uint32_t id_dfr0_read(void) asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val)); return val; } + +/* + * Extra instructions inserted by the compiler would be difficult to compensate + * for, so hand assemble everything between, and including, the PMCR accesses + * to start and stop counting.
+ */ +static inline void loop(int i, uint32_t pmcr) +{ + asm volatile( + " mcr p15, 0, %[pmcr], c9, c12, 0\n" + " isb\n" + "1: subs %[i], %[i], #1\n" + " bgt 1b\n" + " mcr p15, 0, %[z], c9, c12, 0\n" + " isb\n" + : [i] "+r" (i) + : [pmcr] "r" (pmcr), [z] "r" (0) + : "cc"); +} #elif defined(__aarch64__) static inline uint32_t pmcr_read(void) { @@ -150,6 +169,25 @@ static inline uint32_t id_dfr0_read(void) asm volatile("mrs %0, id_dfr0_el1" : "=r" (id)); return id; } + +/* + * Extra instructions inserted by the compiler would be difficult to compensate + * for, so hand assemble everything between, and including, the PMCR accesses + * to start and stop counting. + */ +static inline void loop(int i, uint32_t pmcr) +{ + asm volatile( + " msr pmcr_el0, %[pmcr]\n" + " isb\n" + "1: subs %[i], %[i], #1\n" + " b.gt 1b\n" + " msr pmcr_el0, xzr\n" + " isb\n" + : [i] "+r" (i) + : [pmcr] "r" (pmcr) + : "cc"); +} #endif /* @@ -204,6 +242,71 @@ static bool check_cycles_increase(void) return success; } +/* + * Execute a known number of guest instructions. Only odd instruction counts + * greater than or equal to 3 are supported by the in-line assembly code. The + * control register (PMCR_EL0) is initialized with the provided value (allowing + * for example for the cycle counter or event counters to be reset). At the end + * of the exact instruction loop, zero is written to PMCR_EL0 to disable + * counting, allowing the cycle counter or event counters to be read at the + * leisure of the calling code. + */ +static void measure_instrs(int num, uint32_t pmcr) +{ + int i = (num - 1) / 2; + + assert(num >= 3 && ((num - 1) % 2 == 0)); + loop(i, pmcr); +} + +/* + * Measure cycle counts for various known instruction counts. Ensure that the + * cycle counter progresses (similar to check_cycles_increase() but with more + * instructions and using reset and stop controls). If supplied a positive, + * nonzero CPI parameter, also strictly check that every measurement matches + * it. 
Strict CPI checking is used to test -icount mode. + */ +static bool check_cpi(int cpi) +{ + uint32_t pmcr = pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E; + + if (cpi > 0) + printf("Checking for CPI=%d.\n", cpi); + printf("instrs : cycles0 cycles1 ...\n"); + + for (unsigned int i = 3; i < 300; i += 32) { + uint64_t avg, sum = 0; + + printf("%d :", i); + for (int j = 0; j < NR_SAMPLES; j++) { + uint64_t cycles; + + pmccntr_write(0); + measure_instrs(i, pmcr); + cycles = pmccntr_read(); + printf(" %"PRId64"", cycles); + + /* + * The cycles taken by the loop above should fit in + * 32 bits easily. We check the upper 32 bits of the + * cycle counter to make sure there is no surprise. + */ + if (!cycles || (cpi > 0 && cycles != i * cpi) || + (cycles & 0xffffffff00000000)) { + printf("\n"); + return false; + } + + sum += cycles; + } + avg = sum / NR_SAMPLES; + printf(" sum=%"PRId64" avg=%"PRId64" avg_ipc=%"PRId64" " + "avg_cpi=%"PRId64"\n", sum, avg, i / avg, avg / i); + } + + return true; +} + void pmu_init(void) { uint32_t dfr0; @@ -218,13 +321,19 @@ void pmu_init(void) pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */ } -int main(void) +int main(int argc, char *argv[]) { + int cpi = 0; + + if (argc >= 1) + cpi = atol(argv[0]); + report_prefix_push("pmu"); pmu_init(); report("Control register", check_pmcr()); report("Monotonically increasing cycle count", check_cycles_increase()); + report("Cycle/instruction ratio", check_cpi(cpi)); return report_summary(); } diff --git a/arm/unittests.cfg b/arm/unittests.cfg index 7645180..2050dc8 100644 --- a/arm/unittests.cfg +++ b/arm/unittests.cfg @@ -59,3 +59,17 @@ groups = selftest [pmu] file = pmu.flat groups = pmu + +# Test PMU support (TCG) with -icount IPC=1 +[pmu-tcg-icount-1] +file = pmu.flat +extra_params = -icount 0 -append '1' +groups = pmu +accel = tcg + +# Test PMU support (TCG) with -icount IPC=256 +[pmu-tcg-icount-256] +file = pmu.flat +extra_params = -icount 8 -append '256' +groups = pmu +accel
= tcg -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking 2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking Wei Huang @ 2016-11-21 9:41 ` Andrew Jones 0 siblings, 0 replies; 6+ messages in thread From: Andrew Jones @ 2016-11-21 9:41 UTC (permalink / raw) To: linux-arm-kernel On Fri, Nov 18, 2016 at 10:15:42PM -0600, Wei Huang wrote: > From: Christopher Covington <cov@codeaurora.org> > > Calculate the numbers of cycles per instruction (CPI) implied by ARM > PMU cycle counter values. The code includes a strict checking facility > intended for the -icount option in TCG mode in the configuration file. > > Signed-off-by: Christopher Covington <cov@codeaurora.org> > Signed-off-by: Wei Huang <wei@redhat.com> > --- > arm/pmu.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++- > arm/unittests.cfg | 14 +++++++ > 2 files changed, 124 insertions(+), 1 deletion(-) > > diff --git a/arm/pmu.c b/arm/pmu.c > index fa87de4..b36c4fb 100644 > --- a/arm/pmu.c > +++ b/arm/pmu.c > @@ -104,6 +104,25 @@ static inline uint32_t id_dfr0_read(void) > asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val)); > return val; > } > + > +/* > + * Extra instructions inserted by the compiler would be difficult to compensate > + * for, so hand assemble everything between, and including, the PMCR accesses > + * to start and stop counting. > + */ > +static inline void loop(int i, uint32_t pmcr) Thought you were going to rename this function. 
> +{ > + asm volatile( > + " mcr p15, 0, %[pmcr], c9, c12, 0\n" > + " isb\n" > + "1: subs %[i], %[i], #1\n" > + " bgt 1b\n" > + " mcr p15, 0, %[z], c9, c12, 0\n" > + " isb\n" > + : [i] "+r" (i) > + : [pmcr] "r" (pmcr), [z] "r" (0) > + : "cc"); > +} > #elif defined(__aarch64__) > static inline uint32_t pmcr_read(void) > { > @@ -150,6 +169,25 @@ static inline uint32_t id_dfr0_read(void) > asm volatile("mrs %0, id_dfr0_el1" : "=r" (id)); > return id; > } > + > +/* > + * Extra instructions inserted by the compiler would be difficult to compensate > + * for, so hand assemble everything between, and including, the PMCR accesses > + * to start and stop counting. > + */ > +static inline void loop(int i, uint32_t pmcr) > +{ > + asm volatile( > + " msr pmcr_el0, %[pmcr]\n" > + " isb\n" > + "1: subs %[i], %[i], #1\n" > + " b.gt 1b\n" > + " msr pmcr_el0, xzr\n" > + " isb\n" > + : [i] "+r" (i) > + : [pmcr] "r" (pmcr) > + : "cc"); > +} > #endif > > /* > @@ -204,6 +242,71 @@ static bool check_cycles_increase(void) > return success; > } > > +/* > + * Execute a known number of guest instructions. Only odd instruction counts > + * greater than or equal to 3 are supported by the in-line assembly code. The > + * control register (PMCR_EL0) is initialized with the provided value (allowing > + * for example for the cycle counter or event counters to be reset). At the end > + * of the exact instruction loop, zero is written to PMCR_EL0 to disable > + * counting, allowing the cycle counter or event counters to be read at the > + * leisure of the calling code. > + */ > +static void measure_instrs(int num, uint32_t pmcr) > +{ > + int i = (num - 1) / 2; > + > + assert(num >= 3 && ((num - 1) % 2 == 0)); > + loop(i, pmcr); > +} > + > +/* > + * Measure cycle counts for various known instruction counts. Ensure that the > + * cycle counter progresses (similar to check_cycles_increase() but with more > + * instructions and using reset and stop controls). 
If supplied a positive, > + * nonzero CPI parameter, also strictly check that every measurement matches > + * it. Strict CPI checking is used to test -icount mode. > + */ > +static bool check_cpi(int cpi) > +{ > + uint32_t pmcr = pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E; > + > + if (cpi > 0) > + printf("Checking for CPI=%d.\n", cpi); > + printf("instrs : cycles0 cycles1 ...\n"); > + > + for (unsigned int i = 3; i < 300; i += 32) { > + uint64_t avg, sum = 0; > + > + printf("%d :", i); > + for (int j = 0; j < NR_SAMPLES; j++) { > + uint64_t cycles; > + > + pmccntr_write(0); > + measure_instrs(i, pmcr); > + cycles = pmccntr_read(); > + printf(" %"PRId64"", cycles); > + > + /* > + * The cycles taken by the loop above should fit in > + * 32 bits easily. We check the upper 32 bits of the > + * cycle counter to make sure there is no supprise. > + */ > + if (!cycles || (cpi > 0 && cycles != i * cpi) || > + (cycles & 0xffffffff00000000)) { (cycles >> 32) != 0 would look better. > + printf("\n"); We have 3 cases where we return false here. How about doing the tests separately and adding descriptive print statements for each? if (!cycles) { printf("\ncycles not incrementing!\n"); return false; } else if (cpi > 0 && cycles != i * cpi) { ... } else if ((cycles >> 32) != 0) { ... } > + return false; > + } > + > + sum += cycles; > + } > + avg = sum / NR_SAMPLES; > + printf(" sum=%"PRId64" avg=%"PRId64" avg_ipc=%"PRId64" " > + "avg_cpi=%"PRId64"\n", sum, avg, i / avg, avg / i); > + } > + > + return true; > +} > + > void pmu_init(void) > { > uint32_t dfr0; > @@ -218,13 +321,19 @@ void pmu_init(void) > pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */ > } > > -int main(void) > +int main(int argc, char *argv[]) > { > + int cpi = 0; > + > + if (argc >= 1) ^ this '=' shouldn't be here > + cpi = atol(argv[0]); ^ sigh, this is still zero... Looks like you forgot all my comments from the last review round... 
> + > report_prefix_push("pmu"); > > pmu_init(); > report("Control register", check_pmcr()); > report("Monotonically increasing cycle count", check_cycles_increase()); > + report("Cycle/instruction ratio", check_cpi(cpi)); > > return report_summary(); > } > diff --git a/arm/unittests.cfg b/arm/unittests.cfg > index 7645180..2050dc8 100644 > --- a/arm/unittests.cfg > +++ b/arm/unittests.cfg > @@ -59,3 +59,17 @@ groups = selftest > [pmu] > file = pmu.flat > groups = pmu > + > +# Test PMU support (TCG) with -icount IPC=1 > +[pmu-tcg-icount-1] > +file = pmu.flat > +extra_params = -icount 0 -append '1' > +groups = pmu > +accel = tcg > + > +# Test PMU support (TCG) with -icount IPC=256 > +[pmu-tcg-icount-256] > +file = pmu.flat > +extra_params = -icount 8 -append '256' > +groups = pmu > +accel = tcg > -- > 1.8.3.1 > drew ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-11-21 9:41 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-19 4:15 [kvm-unit-tests PATCH v9 0/3] ARM PMU tests Wei Huang
2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 1/3] arm: Add PMU test Wei Huang
2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 2/3] arm: pmu: Check cycle count increases Wei Huang
2016-11-21 9:14   ` Andrew Jones
2016-11-19 4:15 ` [kvm-unit-tests PATCH v9 3/3] arm: pmu: Add CPI checking Wei Huang
2016-11-21 9:41   ` Andrew Jones