* [RFC PATCH v2 0/4] PMU partitioning driver support
@ 2025-02-08 2:01 Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 1/4] perf: arm_pmuv3: Generalize counter bitmasks Colton Lewis
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Colton Lewis @ 2025-02-08 2:01 UTC (permalink / raw)
To: kvm
Cc: Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Mark Rutland, Paolo Bonzini, Shuah Khan, linux-arm-kernel,
linux-kernel, kvmarm, linux-perf-users, linux-kselftest,
Colton Lewis
This series introduces support in the ARM PMUv3 driver for
partitioning PMU counters into two separate ranges by taking advantage
of the MDCR_EL2.HPMN register field.
The advantage of a partitioned PMU would be to allow KVM guests direct
access to a subset of PMU functionality, greatly reducing the overhead
of performance monitoring in guests.
While this feature could be accepted on its own merits, practically
there is a lot more to be done before it will be fully useful, so I'm
sending it as an RFC for now.
v2:
* Rebase to v6.14-rc1, and take into account Oliver's debug cleanup
changing some KVM handling of MDCR_EL2.
* Reorder patches to put related things closer together for easier
reading.
* Ensure the driver compiles for 32-bit ARM by correctly defining
32-bit access to MDCR_EL2, which is called HDCR there. These
accessors will not be called due to the below requirement for VHE
mode, but a functionally correct definition seemed preferable to
making them no-ops.
* Rename parameter to reserved_host_counters to better reflect
underlying hardware semantics and ensure host counters are easier to
preserve.
* Document parameter with MODULE_PARM_DESC
* Restrict partitioning to VHE mode only. Using HPMN restricts the
host counters to EL2 access only. Trying to make this work for the
driver running at EL1 would require hypercalls for every access of
those registers and a lot of additional handling at the KVM level.
This would defeat the whole purpose of having the feature at
all. Define has_vhe() on 32-bit arm to always return false.
* While Rob asked about making the parameter configurable at run time
rather than boot time through sysfs, that is potentially very tricky
because it is only safe to reprogram HPMN when we are certain no
one, host or guest, is using any counters that might change
ownership. If this is determined to be worthwhile, it should come
after getting the core infrastructure right.
* While Marc asked about reusing the sysreg masks instead of
redefining them for MDCR_EL2, that doesn't work because sysreg masks
are only defined for arm64. I'm not sure how to avoid the
duplication.
* Wrap code for determining if a counter index belongs to the host or
guest partition in functions
* Typo fixes and additional testing
v1:
https://lore.kernel.org/kvm/20250127222031.3078945-1-coltonlewis@google.com/
Colton Lewis (4):
perf: arm_pmuv3: Generalize counter bitmasks
perf: arm_pmuv3: Introduce module param to partition the PMU
perf: arm_pmuv3: Keep out of guest counter partition
KVM: arm64: Make guests see only counters they can access
arch/arm/include/asm/arm_pmuv3.h | 13 ++
arch/arm64/include/asm/arm_pmuv3.h | 10 ++
arch/arm64/kvm/debug.c | 3 +-
arch/arm64/kvm/pmu-emul.c | 8 +-
drivers/perf/arm_pmuv3.c | 117 ++++++++++++++++--
include/linux/perf/arm_pmu.h | 2 +
include/linux/perf/arm_pmuv3.h | 34 ++++-
.../selftests/kvm/arm64/vpmu_counter_access.c | 2 +-
8 files changed, 168 insertions(+), 21 deletions(-)
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
--
2.48.1.502.g6dc24dfdaf-goog
^ permalink raw reply [flat|nested] 9+ messages in thread
* [RFC PATCH v2 1/4] perf: arm_pmuv3: Generalize counter bitmasks
2025-02-08 2:01 [RFC PATCH v2 0/4] PMU partitioning driver support Colton Lewis
@ 2025-02-08 2:01 ` Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU Colton Lewis
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Colton Lewis @ 2025-02-08 2:01 UTC (permalink / raw)
To: kvm
Cc: Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Mark Rutland, Paolo Bonzini, Shuah Khan, linux-arm-kernel,
linux-kernel, kvmarm, linux-perf-users, linux-kselftest,
Colton Lewis
These bitmasks are valid for enable and interrupt registers as well as
overflow registers. Generalize the names.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
include/linux/perf/arm_pmuv3.h | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index d698efba28a2..c2448477c37f 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -223,16 +223,23 @@
ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \
ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP)
+/*
+ * Counter bitmask layouts for overflow, enable, and interrupts
+ */
+#define ARMV8_PMU_CNT_MASK_P GENMASK(30, 0)
+#define ARMV8_PMU_CNT_MASK_C BIT(31)
+#define ARMV8_PMU_CNT_MASK_F BIT_ULL(32) /* arm64 only */
+#define ARMV8_PMU_CNT_MASK_ALL (ARMV8_PMU_CNT_MASK_P | \
+ ARMV8_PMU_CNT_MASK_C | \
+ ARMV8_PMU_CNT_MASK_F)
/*
* PMOVSR: counters overflow flag status reg
*/
-#define ARMV8_PMU_OVSR_P GENMASK(30, 0)
-#define ARMV8_PMU_OVSR_C BIT(31)
-#define ARMV8_PMU_OVSR_F BIT_ULL(32) /* arm64 only */
+#define ARMV8_PMU_OVSR_P ARMV8_PMU_CNT_MASK_P
+#define ARMV8_PMU_OVSR_C ARMV8_PMU_CNT_MASK_C
+#define ARMV8_PMU_OVSR_F ARMV8_PMU_CNT_MASK_F
/* Mask for writable bits is both P and C fields */
-#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C | \
- ARMV8_PMU_OVSR_F)
-
+#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_CNT_MASK_ALL
/*
* PMXEVTYPER: Event selection reg
*/
--
2.48.1.502.g6dc24dfdaf-goog
* [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU
2025-02-08 2:01 [RFC PATCH v2 0/4] PMU partitioning driver support Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 1/4] perf: arm_pmuv3: Generalize counter bitmasks Colton Lewis
@ 2025-02-08 2:01 ` Colton Lewis
2025-02-10 7:25 ` Oliver Upton
2025-02-08 2:01 ` [RFC PATCH v2 3/4] perf: arm_pmuv3: Keep out of guest counter partition Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access Colton Lewis
3 siblings, 1 reply; 9+ messages in thread
From: Colton Lewis @ 2025-02-08 2:01 UTC (permalink / raw)
To: kvm
Cc: Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Mark Rutland, Paolo Bonzini, Shuah Khan, linux-arm-kernel,
linux-kernel, kvmarm, linux-perf-users, linux-kselftest,
Colton Lewis
For PMUv3, the register MDCR_EL2.HPMN partitions the PMU counters
into two ranges where counters 0..HPMN-1 are accessible by EL1 and, if
allowed, EL0, while counters HPMN..N are only accessible by EL2.
Introduce a module parameter in the PMUv3 driver to set this
register. The name reserved_host_counters reflects the intent to
reserve some counters for the host so the guest may eventually be
allowed direct access to a subset of PMU functionality for increased
performance.
Track HPMN and whether the PMU is partitioned in struct arm_pmu
because KVM will need to know this to handle guests correctly.
While FEAT_HPMN0 does allow HPMN to be set to 0, this patch
specifically disallows that case because it's not useful given the
intention to allow guests access to their own counters.
Due to the difficulty this feature would create for the driver running
at EL1 on the host, partitioning is only allowed in VHE mode.
Supporting nVHE mode would require a hypercall for every register
access because the counters reserved for the host by HPMN are only
accessible to EL2.
The parameter is only configurable at boot time. Making the parameter
configurable on a running system is dangerous because it is difficult
to know for certain that no counters are in use anywhere, which must
be guaranteed before it is safe to reprogram HPMN.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/arm/include/asm/arm_pmuv3.h | 13 +++++++++
arch/arm64/include/asm/arm_pmuv3.h | 10 +++++++
drivers/perf/arm_pmuv3.c | 46 ++++++++++++++++++++++++++++--
include/linux/perf/arm_pmu.h | 2 ++
include/linux/perf/arm_pmuv3.h | 7 +++++
5 files changed, 76 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/arm_pmuv3.h b/arch/arm/include/asm/arm_pmuv3.h
index 2ec0e5e83fc9..c5f496450e16 100644
--- a/arch/arm/include/asm/arm_pmuv3.h
+++ b/arch/arm/include/asm/arm_pmuv3.h
@@ -11,6 +11,8 @@
#define PMCCNTR __ACCESS_CP15_64(0, c9)
+#define HDCR __ACCESS_CP15(c1, 4, c1, 1)
+
#define PMCR __ACCESS_CP15(c9, 0, c12, 0)
#define PMCNTENSET __ACCESS_CP15(c9, 0, c12, 1)
#define PMCNTENCLR __ACCESS_CP15(c9, 0, c12, 2)
@@ -214,6 +216,7 @@ static inline void write_pmuserenr(u32 val)
static inline void write_pmuacr(u64 val) {}
+static inline bool has_vhe(void) { return false; }
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
@@ -277,4 +280,14 @@ static inline u64 read_pmceid1(void)
return val;
}
+static inline u32 read_mdcr(void)
+{
+ return read_sysreg(HDCR);
+}
+
+static inline void write_mdcr(u32 val)
+{
+ write_sysreg(val, HDCR);
+}
+
#endif
diff --git a/arch/arm64/include/asm/arm_pmuv3.h b/arch/arm64/include/asm/arm_pmuv3.h
index 8a777dec8d88..fc37e7e81e07 100644
--- a/arch/arm64/include/asm/arm_pmuv3.h
+++ b/arch/arm64/include/asm/arm_pmuv3.h
@@ -188,4 +188,14 @@ static inline bool is_pmuv3p9(int pmuver)
return pmuver >= ID_AA64DFR0_EL1_PMUVer_V3P9;
}
+static inline u64 read_mdcr(void)
+{
+ return read_sysreg(mdcr_el2);
+}
+
+static inline void write_mdcr(u64 val)
+{
+ write_sysreg(val, mdcr_el2);
+}
+
#endif
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 0e360feb3432..39109260b161 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -325,6 +325,7 @@ GEN_PMU_FORMAT_ATTR(threshold_compare);
GEN_PMU_FORMAT_ATTR(threshold);
static int sysctl_perf_user_access __read_mostly;
+static u8 reserved_host_counters __read_mostly;
static bool armv8pmu_event_is_64bit(struct perf_event *event)
{
@@ -500,6 +501,29 @@ static void armv8pmu_pmcr_write(u64 val)
write_pmcr(val);
}
+static u64 armv8pmu_mdcr_read(void)
+{
+ return read_mdcr();
+}
+
+static void armv8pmu_mdcr_write(u64 val)
+{
+ write_mdcr(val);
+ isb();
+}
+
+static void armv8pmu_partition(u8 hpmn)
+{
+ u64 mdcr = armv8pmu_mdcr_read();
+
+ mdcr &= ~ARMV8_PMU_MDCR_HPMN;
+ mdcr |= FIELD_PREP(ARMV8_PMU_MDCR_HPMN, hpmn);
+ /* Prevent guest counters counting at EL2 */
+ mdcr |= ARMV8_PMU_MDCR_HPMD;
+
+ armv8pmu_mdcr_write(mdcr);
+}
+
static int armv8pmu_has_overflowed(u64 pmovsr)
{
return !!(pmovsr & ARMV8_PMU_OVERFLOWED_MASK);
@@ -1069,6 +1093,9 @@ static void armv8pmu_reset(void *info)
bitmap_to_arr64(&mask, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS);
+ if (cpu_pmu->partitioned)
+ armv8pmu_partition(cpu_pmu->hpmn);
+
/* The counter and interrupt enable registers are unknown at reset. */
armv8pmu_disable_counter(mask);
armv8pmu_disable_intens(mask);
@@ -1205,6 +1232,7 @@ static void __armv8pmu_probe_pmu(void *info)
{
struct armv8pmu_probe_info *probe = info;
struct arm_pmu *cpu_pmu = probe->pmu;
+ u8 pmcr_n;
u64 pmceid_raw[2];
u32 pmceid[2];
int pmuver;
@@ -1215,10 +1243,20 @@ static void __armv8pmu_probe_pmu(void *info)
cpu_pmu->pmuver = pmuver;
probe->present = true;
+ pmcr_n = FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read());
/* Read the nb of CNTx counters supported from PMNC */
- bitmap_set(cpu_pmu->cntr_mask,
- 0, FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read()));
+ bitmap_set(cpu_pmu->cntr_mask, 0, pmcr_n);
+
+ if (has_vhe() &&
+ reserved_host_counters > 0 &&
+ reserved_host_counters < pmcr_n) {
+ cpu_pmu->hpmn = pmcr_n - reserved_host_counters;
+ cpu_pmu->partitioned = true;
+ } else {
+ cpu_pmu->hpmn = pmcr_n;
+ cpu_pmu->partitioned = false;
+ }
/* Add the CPU cycles counter */
set_bit(ARMV8_PMU_CYCLE_IDX, cpu_pmu->cntr_mask);
@@ -1516,3 +1554,7 @@ void arch_perf_update_userpage(struct perf_event *event,
userpg->cap_user_time_zero = 1;
userpg->cap_user_time_short = 1;
}
+
+module_param(reserved_host_counters, byte, 0);
+MODULE_PARM_DESC(reserved_host_counters,
+ "Partition the PMU into host and guest counters");
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4b5b83677e3f..ad97aabed25a 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,8 @@ struct arm_pmu {
void (*reset)(void *);
int (*map_event)(struct perf_event *event);
DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS);
+ u8 hpmn; /* MDCR_EL2.HPMN: counter partition pivot */
+ bool partitioned;
bool secure_access; /* 32-bit ARM only */
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index c2448477c37f..115ee39f693a 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -223,6 +223,13 @@
ARMV8_PMU_PMCR_X | ARMV8_PMU_PMCR_DP | \
ARMV8_PMU_PMCR_LC | ARMV8_PMU_PMCR_LP)
+/*
+ * Per-CPU MDCR: config reg
+ */
+#define ARMV8_PMU_MDCR_HPMN GENMASK(4, 0)
+#define ARMV8_PMU_MDCR_HPME BIT(7)
+#define ARMV8_PMU_MDCR_HPMD BIT(17)
+
/*
* Counter bitmask layouts for overflow, enable, and interrupts
*/
--
2.48.1.502.g6dc24dfdaf-goog
* [RFC PATCH v2 3/4] perf: arm_pmuv3: Keep out of guest counter partition
2025-02-08 2:01 [RFC PATCH v2 0/4] PMU partitioning driver support Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 1/4] perf: arm_pmuv3: Generalize counter bitmasks Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU Colton Lewis
@ 2025-02-08 2:01 ` Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access Colton Lewis
3 siblings, 0 replies; 9+ messages in thread
From: Colton Lewis @ 2025-02-08 2:01 UTC (permalink / raw)
To: kvm
Cc: Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Mark Rutland, Paolo Bonzini, Shuah Khan, linux-arm-kernel,
linux-kernel, kvmarm, linux-perf-users, linux-kselftest,
Colton Lewis
If the PMU is partitioned, keep the driver out of the guest counter
partition and only use the host counter partition. Partitioning is
defined by the MDCR_EL2.HPMN register field and saved in
cpu_pmu->hpmn. The range 0..HPMN-1 is accessible by EL1 and EL0 while
HPMN..PMCR.N is reserved for EL2.
Define some macros that take HPMN as an argument and construct
mutually exclusive bitmaps for testing which partition a particular
counter is in. Note that despite their different position in the
bitmap, the cycle and instruction counters are always in the guest
partition.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
drivers/perf/arm_pmuv3.c | 73 +++++++++++++++++++++++++++++-----
include/linux/perf/arm_pmuv3.h | 8 ++++
2 files changed, 71 insertions(+), 10 deletions(-)
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 39109260b161..01468ec9fead 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -754,15 +754,19 @@ static void armv8pmu_disable_event_irq(struct perf_event *event)
armv8pmu_disable_intens(BIT(event->hw.idx));
}
-static u64 armv8pmu_getreset_flags(void)
+static u64 armv8pmu_getreset_flags(struct arm_pmu *cpu_pmu)
{
u64 value;
/* Read */
value = read_pmovsclr();
+ if (cpu_pmu->partitioned)
+ value &= ARMV8_PMU_HOST_CNT_PART(cpu_pmu->hpmn);
+ else
+ value &= ARMV8_PMU_OVERFLOWED_MASK;
+
/* Write to clear flags */
- value &= ARMV8_PMU_OVERFLOWED_MASK;
write_pmovsclr(value);
return value;
@@ -789,6 +793,18 @@ static void armv8pmu_disable_user_access(void)
update_pmuserenr(0);
}
+static bool armv8pmu_is_guest_part(struct arm_pmu *cpu_pmu, u8 idx)
+{
+ return cpu_pmu->partitioned &&
+ (BIT(idx) & ARMV8_PMU_GUEST_CNT_PART(cpu_pmu->hpmn));
+}
+
+static bool armv8pmu_is_host_part(struct arm_pmu *cpu_pmu, u8 idx)
+{
+ return !cpu_pmu->partitioned ||
+ (BIT(idx) & ARMV8_PMU_HOST_CNT_PART(cpu_pmu->hpmn));
+}
+
static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
{
int i;
@@ -797,6 +813,8 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
if (is_pmuv3p9(cpu_pmu->pmuver)) {
u64 mask = 0;
for_each_set_bit(i, cpuc->used_mask, ARMPMU_MAX_HWEVENTS) {
+ if (armv8pmu_is_guest_part(cpu_pmu, i))
+ continue;
if (armv8pmu_event_has_user_read(cpuc->events[i]))
mask |= BIT(i);
}
@@ -805,6 +823,8 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
/* Clear any unused counters to avoid leaking their contents */
for_each_andnot_bit(i, cpu_pmu->cntr_mask, cpuc->used_mask,
ARMPMU_MAX_HWEVENTS) {
+ if (armv8pmu_is_guest_part(cpu_pmu, i))
+ continue;
if (i == ARMV8_PMU_CYCLE_IDX)
write_pmccntr(0);
else if (i == ARMV8_PMU_INSTR_IDX)
@@ -850,7 +870,10 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
armv8pmu_disable_user_access();
/* Enable all counters */
- armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
+ if (cpu_pmu->partitioned)
+ armv8pmu_mdcr_write(armv8pmu_mdcr_read() | ARMV8_PMU_MDCR_HPME);
+ else
+ armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
kvm_vcpu_pmu_resync_el0();
}
@@ -858,7 +881,10 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
{
/* Disable all counters */
- armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
+ if (cpu_pmu->partitioned)
+ armv8pmu_mdcr_write(armv8pmu_mdcr_read() & ~ARMV8_PMU_MDCR_HPME);
+ else
+ armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
}
static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
@@ -872,7 +898,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
/*
* Get and reset the IRQ flags
*/
- pmovsr = armv8pmu_getreset_flags();
+ pmovsr = armv8pmu_getreset_flags(cpu_pmu);
/*
* Did an overflow occur?
@@ -930,6 +956,8 @@ static int armv8pmu_get_single_idx(struct pmu_hw_events *cpuc,
int idx;
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) {
+ if (armv8pmu_is_guest_part(cpu_pmu, idx))
+ continue;
if (!test_and_set_bit(idx, cpuc->used_mask))
return idx;
}
@@ -946,6 +974,8 @@ static int armv8pmu_get_chain_idx(struct pmu_hw_events *cpuc,
* the lower idx must be even.
*/
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) {
+ if (armv8pmu_is_guest_part(cpu_pmu, idx))
+ continue;
if (!(idx & 0x1))
continue;
if (!test_and_set_bit(idx, cpuc->used_mask)) {
@@ -968,6 +998,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
/* Always prefer to place a cycle counter into the cycle counter. */
if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) &&
+ !cpu_pmu->partitioned &&
!armv8pmu_event_get_threshold(&event->attr)) {
if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask))
return ARMV8_PMU_CYCLE_IDX;
@@ -983,6 +1014,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
* may not know how to handle it.
*/
if ((evtype == ARMV8_PMUV3_PERFCTR_INST_RETIRED) &&
+ !cpu_pmu->partitioned &&
!armv8pmu_event_get_threshold(&event->attr) &&
test_bit(ARMV8_PMU_INSTR_IDX, cpu_pmu->cntr_mask) &&
!armv8pmu_event_want_user_access(event)) {
@@ -994,7 +1026,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
* Otherwise use events counters
*/
if (armv8pmu_event_is_chained(event))
- return armv8pmu_get_chain_idx(cpuc, cpu_pmu);
+ return armv8pmu_get_chain_idx(cpuc, cpu_pmu);
else
return armv8pmu_get_single_idx(cpuc, cpu_pmu);
}
@@ -1086,6 +1118,16 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
return 0;
}
+static void armv8pmu_reset_host_counters(struct arm_pmu *cpu_pmu)
+{
+ int idx;
+
+ for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) {
+ if (armv8pmu_is_host_part(cpu_pmu, idx))
+ armv8pmu_write_evcntr(idx, 0);
+ }
+}
+
static void armv8pmu_reset(void *info)
{
struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
@@ -1093,8 +1135,10 @@ static void armv8pmu_reset(void *info)
bitmap_to_arr64(&mask, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS);
- if (cpu_pmu->partitioned)
+ if (cpu_pmu->partitioned) {
armv8pmu_partition(cpu_pmu->hpmn);
+ mask &= ARMV8_PMU_HOST_CNT_PART(cpu_pmu->hpmn);
+ }
/* The counter and interrupt enable registers are unknown at reset. */
armv8pmu_disable_counter(mask);
@@ -1103,11 +1147,20 @@ static void armv8pmu_reset(void *info)
/* Clear the counters we flip at guest entry/exit */
kvm_clr_pmu_events(mask);
+
+ pmcr = ARMV8_PMU_PMCR_LC;
+
/*
- * Initialize & Reset PMNC. Request overflow interrupt for
- * 64 bit cycle counter but cheat in armv8pmu_write_counter().
+ * Initialize & Reset PMNC. Request overflow interrupt for 64
+ * bit cycle counter but cheat in armv8pmu_write_counter().
+ *
+ * When partitioned, there is no single bit to reset only the
+ * host counters, so reset them individually.
*/
- pmcr = ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_LC;
+ if (cpu_pmu->partitioned)
+ armv8pmu_reset_host_counters(cpu_pmu);
+ else
+ pmcr = ARMV8_PMU_PMCR_P | ARMV8_PMU_PMCR_C;
/* Enable long event counter support where available */
if (armv8pmu_has_long_event(cpu_pmu))
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index 115ee39f693a..5f8b143794ce 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -247,6 +247,14 @@
#define ARMV8_PMU_OVSR_F ARMV8_PMU_CNT_MASK_F
/* Mask for writable bits is both P and C fields */
#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_CNT_MASK_ALL
+
+/* Masks for guest and host counter partitions */
+#define ARMV8_PMU_HPMN_CNT_MASK(N) GENMASK((N) - 1, 0)
+#define ARMV8_PMU_GUEST_CNT_PART(N) (ARMV8_PMU_HPMN_CNT_MASK(N) | \
+ ARMV8_PMU_CNT_MASK_C | \
+ ARMV8_PMU_CNT_MASK_F)
+#define ARMV8_PMU_HOST_CNT_PART(N) (ARMV8_PMU_CNT_MASK_ALL & \
+ ~ARMV8_PMU_GUEST_CNT_PART(N))
/*
* PMXEVTYPER: Event selection reg
*/
--
2.48.1.502.g6dc24dfdaf-goog
* [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access
2025-02-08 2:01 [RFC PATCH v2 0/4] PMU partitioning driver support Colton Lewis
` (2 preceding siblings ...)
2025-02-08 2:01 ` [RFC PATCH v2 3/4] perf: arm_pmuv3: Keep out of guest counter partition Colton Lewis
@ 2025-02-08 2:01 ` Colton Lewis
2025-02-10 7:37 ` Oliver Upton
3 siblings, 1 reply; 9+ messages in thread
From: Colton Lewis @ 2025-02-08 2:01 UTC (permalink / raw)
To: kvm
Cc: Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Mark Rutland, Paolo Bonzini, Shuah Khan, linux-arm-kernel,
linux-kernel, kvmarm, linux-perf-users, linux-kselftest,
Colton Lewis
The ARM architecture specifies that when MDCR_EL2.HPMN is set, reads
of PMCR_EL0.N from EL1 and EL0, which includes KVM guests, return that
value.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
arch/arm64/kvm/debug.c | 3 +--
arch/arm64/kvm/pmu-emul.c | 8 +++++++-
tools/testing/selftests/kvm/arm64/vpmu_counter_access.c | 2 +-
3 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0e4c805e7e89..7c04db00bf6c 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -36,8 +36,7 @@ static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
* This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
* to disable guest access to the profiling and trace buffers
*/
- vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
- *host_data_ptr(nr_event_counters));
+ vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, read_mdcr());
vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
MDCR_EL2_TPMS |
MDCR_EL2_TTRF |
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 6c5950b9ceac..052ce8c721fe 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -993,12 +993,18 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
{
struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
+ u8 limit;
+
+ if (arm_pmu->partitioned)
+ limit = arm_pmu->hpmn - 1;
+ else
+ limit = ARMV8_PMU_MAX_GENERAL_COUNTERS;
/*
* The arm_pmu->cntr_mask considers the fixed counter(s) as well.
* Ignore those and return only the general-purpose counters.
*/
- return bitmap_weight(arm_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS);
+ return bitmap_weight(arm_pmu->cntr_mask, limit);
}
static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
diff --git a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
index f16b3b27e32e..b5bc18b7528d 100644
--- a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
@@ -609,7 +609,7 @@ static void run_pmregs_validity_test(uint64_t pmcr_n)
*/
static void run_error_test(uint64_t pmcr_n)
{
- pr_debug("Error test with pmcr_n %lu (larger than the host)\n", pmcr_n);
+ pr_debug("Error test with pmcr_n %lu (larger than the host allows)\n", pmcr_n);
test_create_vpmu_vm_with_pmcr_n(pmcr_n, true);
destroy_vpmu_vm();
--
2.48.1.502.g6dc24dfdaf-goog
* Re: [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU
2025-02-08 2:01 ` [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU Colton Lewis
@ 2025-02-10 7:25 ` Oliver Upton
2025-02-10 23:07 ` Colton Lewis
0 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2025-02-10 7:25 UTC (permalink / raw)
To: Colton Lewis
Cc: kvm, Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mark Rutland,
Paolo Bonzini, Shuah Khan, linux-arm-kernel, linux-kernel, kvmarm,
linux-perf-users, linux-kselftest
Hi Colton,
On Sat, Feb 08, 2025 at 02:01:09AM +0000, Colton Lewis wrote:
> For PMUv3, the register MDCR_EL2.HPMN partitions the PMU counters
> into two ranges where counters 0..HPMN-1 are accessible by EL1 and, if
> allowed, EL0, while counters HPMN..N are only accessible by EL2.
>
> Introduce a module parameter in the PMUv3 driver to set this
> register. The name reserved_host_counters reflects the intent to
> reserve some counters for the host so the guest may eventually be
> allowed direct access to a subset of PMU functionality for increased
> performance.
>
> Track HPMN and whether the pmu is partitioned in struct arm_pmu
> because KVM will need to know that to handle guests correctly.
>
> While FEAT_HPMN0 does allow HPMN to be set to 0, this patch
> specifically disallows that case because it's not useful given the
> intention to allow guests access to their own counters.
Quite the contrary.
FEAT_HPMN0 is useful if userspace wants to provide a vPMU that has a
fixed cycle counter w/o event counters. Certain OSes refuse to boot
without it...
> static inline u32 read_mdcr(void)
> {
> return read_sysreg(HDCR);
> }
>
> static inline void write_mdcr(u32 val)
> {
> write_sysreg(val, HDCR);
> }
>
Hmm... While this fixes the 32bit compilation issues, it opens a new can
of worms.
VHE is a 64bit only feature, so you're *guaranteed* that these accessors
will undef (running at EL1).
> +static void armv8pmu_partition(u8 hpmn)
> +{
> + u64 mdcr = armv8pmu_mdcr_read();
> +
> + mdcr &= ~ARMV8_PMU_MDCR_HPMN;
> + mdcr |= FIELD_PREP(ARMV8_PMU_MDCR_HPMN, hpmn);
> + /* Prevent guest counters counting at EL2 */
> + mdcr |= ARMV8_PMU_MDCR_HPMD;
> +
> + armv8pmu_mdcr_write(mdcr);
> +}
> +
After giving this a read, I don't think the host PMU driver should care
about MDCR_EL2 at all. The only time that 'guest' events are loaded into
the PMU is between vcpu_load() / vcpu_put(), at which point *KVM* has
reconfigured MDCR_EL2.
I'm not sure if there's much involvement with the host PMU driver beyond
it pinky-swearing to only use the specified counters. KVM is what
actually will program HPMN.
That'd alleviate the PMU driver from having VHE awareness or caring
about MDCR_EL2.
--
Thanks,
Oliver
* Re: [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access
2025-02-08 2:01 ` [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access Colton Lewis
@ 2025-02-10 7:37 ` Oliver Upton
2025-02-10 23:08 ` Colton Lewis
0 siblings, 1 reply; 9+ messages in thread
From: Oliver Upton @ 2025-02-10 7:37 UTC (permalink / raw)
To: Colton Lewis
Cc: kvm, Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mark Rutland,
Paolo Bonzini, Shuah Khan, linux-arm-kernel, linux-kernel, kvmarm,
linux-perf-users, linux-kselftest
On Sat, Feb 08, 2025 at 02:01:11AM +0000, Colton Lewis wrote:
> The ARM architecture specifies that when MDCR_EL2.HPMN is set, EL1 and
> EL0, which includes KVM guests, should read that value for PMCR.N.
>
> Signed-off-by: Colton Lewis <coltonlewis@google.com>
> ---
> arch/arm64/kvm/debug.c | 3 +--
> arch/arm64/kvm/pmu-emul.c | 8 +++++++-
> tools/testing/selftests/kvm/arm64/vpmu_counter_access.c | 2 +-
> 3 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 0e4c805e7e89..7c04db00bf6c 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -36,8 +36,7 @@ static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
> * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
> * to disable guest access to the profiling and trace buffers
> */
> - vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
> - *host_data_ptr(nr_event_counters));
> + vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, read_mdcr());
Please avoid unnecessary accesses to MDCR_EL2 at all costs. This is a
guaranteed trap to EL2 with nested virt.
> vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
> MDCR_EL2_TPMS |
> MDCR_EL2_TTRF |
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 6c5950b9ceac..052ce8c721fe 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -993,12 +993,18 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int irq)
> u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
> {
> struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
> + u8 limit;
> +
> + if (arm_pmu->partitioned)
> + limit = arm_pmu->hpmn - 1;
> + else
> + limit = ARMV8_PMU_MAX_GENERAL_COUNTERS;
>
> /*
> * The arm_pmu->cntr_mask considers the fixed counter(s) as well.
> * Ignore those and return only the general-purpose counters.
> */
> - return bitmap_weight(arm_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS);
> + return bitmap_weight(arm_pmu->cntr_mask, limit);
> }
This isn't necessary and is likely to regress the existing behavior.
When the architecture says the lower ELs should see PMCR_EL0.N have the
same value as MDCR_EL2.HPMN, what it really means is direct reads from
hardware will return the value.
So my expectation would be that a VM using the partitioned PMU
implementation would never reach any of the *emulated* PMU handlers, as
we would've disabled the associated traps.
The partitioned PMU will not replace the emulated vPMU implementation in
KVM, so it'd be good to refactor what we have today to make room for
your work. I think that'd be along the lines of:
- Shared code for event filter enforcement and handling of the vPMU
overflow IRQ.
- Emulated PMU implementation that provides trap handlers for all PMUv3
registers and backs into host perf
- Partitioned PMU implementation that doesn't rely on traps and instead
saves / restores a portion of the PMU that contains the guest
context.
These should be done in separate files, i.e. I do not want to see a
bunch of inline handling to cope with emulated v. partitioned PMUs.
> static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
> diff --git a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
> index f16b3b27e32e..b5bc18b7528d 100644
> --- a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
> +++ b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
> @@ -609,7 +609,7 @@ static void run_pmregs_validity_test(uint64_t pmcr_n)
> */
> static void run_error_test(uint64_t pmcr_n)
> {
> - pr_debug("Error test with pmcr_n %lu (larger than the host)\n", pmcr_n);
> + pr_debug("Error test with pmcr_n %lu (larger than the host allows)\n", pmcr_n);
NBD for an RFC, but in the future please do selftests changes in a
separate patch.
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU
2025-02-10 7:25 ` Oliver Upton
@ 2025-02-10 23:07 ` Colton Lewis
0 siblings, 0 replies; 9+ messages in thread
From: Colton Lewis @ 2025-02-10 23:07 UTC (permalink / raw)
To: Oliver Upton
Cc: kvm, linux, catalin.marinas, will, maz, joey.gouly,
suzuki.poulose, yuzenghui, mark.rutland, pbonzini, shuah,
linux-arm-kernel, linux-kernel, kvmarm, linux-perf-users,
linux-kselftest
Hi Oliver, thanks for the review.
Oliver Upton <oliver.upton@linux.dev> writes:
> Hi Colton,
> On Sat, Feb 08, 2025 at 02:01:09AM +0000, Colton Lewis wrote:
>> For PMUv3, the register MDCR_EL2.HPMN partitions the PMU counters
>> into two ranges where counters 0..HPMN-1 are accessible by EL1 and, if
>> allowed, EL0 while counters HPMN..N are only accessible by EL2.
>> Introduce a module parameter in the PMUv3 driver to set this
>> register. The name reserved_host_counters reflects the intent to
>> reserve some counters for the host so the guest may eventually be
>> allowed direct access to a subset of PMU functionality for increased
>> performance.
>> Track HPMN and whether the PMU is partitioned in struct arm_pmu
>> because KVM will need to know that to handle guests correctly.
>> While FEAT_HPMN0 does allow HPMN to be set to 0, this patch
>> specifically disallows that case because it's not useful given the
>> intention to allow guests access to their own counters.
> Quite the contrary.
> FEAT_HPMN0 is useful if userspace wants to provide a vPMU that has a
> fixed cycle counter w/o event counters. Certain OSes refuse to boot
> without it...
Cool. I'll make sure writing 0 is allowed if we have FEAT_HPMN0.
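For reference, the arithmetic involved looks roughly like this as a standalone sketch (not driver code; the field positions are per the architecture, MDCR_EL2.HPMN in bits [4:0] and HPMD at bit 17, but the helper names are made up):

```c
#include <assert.h>
#include <stdint.h>

#define MDCR_HPMN_MASK	0x1fULL		/* MDCR_EL2.HPMN, bits [4:0] */
#define MDCR_HPMD	(1ULL << 17)	/* MDCR_EL2.HPMD */

/* Mirrors the quoted armv8pmu_partition() logic on a plain value. */
static uint64_t mdcr_partition(uint64_t mdcr, uint8_t hpmn)
{
	mdcr &= ~MDCR_HPMN_MASK;
	mdcr |= (uint64_t)hpmn & MDCR_HPMN_MASK;
	mdcr |= MDCR_HPMD;	/* prevent guest counters counting at EL2 */
	return mdcr;
}

/* Counters 0..HPMN-1 form the guest-accessible partition. */
static uint32_t guest_counter_mask(uint8_t hpmn)
{
	return (1u << hpmn) - 1;
}
```

With FEAT_HPMN0, `guest_counter_mask(0)` is simply 0: no event counters visible to the lower ELs, which is exactly the cycle-counter-only vPMU case.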
>> static inline u32 read_mdcr(void)
>> {
>> return read_sysreg(HDCR);
>> }
>> static inline void write_mdcr(u32 val)
>> {
>> write_sysreg(val, HDCR);
>> }
> Hmm... While this fixes the 32bit compilation issues, it opens a new can
> of worms.
> VHE is a 64bit only feature, so you're *guaranteed* that these accessors
> will undef (running at EL1).
True. I mentioned in the cover letter they aren't intended to actually
run in the PMU driver because I guard for VHE there.
>> +static void armv8pmu_partition(u8 hpmn)
>> +{
>> + u64 mdcr = armv8pmu_mdcr_read();
>> +
>> + mdcr &= ~ARMV8_PMU_MDCR_HPMN;
>> + mdcr |= FIELD_PREP(ARMV8_PMU_MDCR_HPMN, hpmn);
>> + /* Prevent guest counters counting at EL2 */
>> + mdcr |= ARMV8_PMU_MDCR_HPMD;
>> +
>> + armv8pmu_mdcr_write(mdcr);
>> +}
>> +
> After giving this a read, I don't think the host PMU driver should care
> about MDCR_EL2 at all. The only time that 'guest' events are loaded into
> the PMU is between vcpu_load() / vcpu_put(), at which point *KVM* has
> reconfigured MDCR_EL2.
> I'm not sure if there's much involvement with the host PMU driver beyond
> it pinky-swearing to only use the specified counters. KVM is what
> actually will program HPMN.
> That'd alleviate the PMU driver from having VHE awareness or caring
> about MDCR_EL2.
That does seem like a cleaner solution. I can lift that handling into
KVM code, but I will still need the variables I introduced in struct
arm_pmu to be populated so the driver knows what counters to use.
Without looking, I'm concerned there might be an initialization ordering
issue: KVM is initialized first, so it would get the value for HPMN
first, but it couldn't store that value anywhere accessible to the
driver because the driver hasn't been initialized yet. The driver would
then come up without knowing HPMN and start using counters we wanted the
guest to have.
There is probably an easy way around this, but I'll need to do some
digging.
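As a toy model of the hazard (all names here are hypothetical, just to pin down the ordering): whichever side initializes second must be able to pick the boundary up from somewhere the first side could write, otherwise the driver claims everything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared partition info, e.g. hanging off struct arm_pmu. */
struct pmu_partition_info {
	bool valid;
	uint8_t hpmn;
};

static struct pmu_partition_info partition;

/* KVM side: derive HPMN from the module parameter. */
static void kvm_pmu_partition_init(uint8_t reserved_host_counters,
				   uint8_t nr_counters)
{
	partition.hpmn = nr_counters - reserved_host_counters;
	partition.valid = true;
}

/* Driver side: only claim host counters HPMN..N-1 once HPMN is known. */
static uint32_t driver_host_counter_mask(uint8_t nr_counters)
{
	uint32_t all = (1u << nr_counters) - 1;

	if (!partition.valid)
		return all;	/* probed before KVM: uses everything */
	return all & ~((1u << partition.hpmn) - 1);
}
```

If the driver probes before `kvm_pmu_partition_init()` runs, it sees the unpartitioned mask, which is exactly the "using counters we wanted the guest to have" failure described above.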
* Re: [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access
2025-02-10 7:37 ` Oliver Upton
@ 2025-02-10 23:08 ` Colton Lewis
0 siblings, 0 replies; 9+ messages in thread
From: Colton Lewis @ 2025-02-10 23:08 UTC (permalink / raw)
To: Oliver Upton
Cc: kvm, linux, catalin.marinas, will, maz, joey.gouly,
suzuki.poulose, yuzenghui, mark.rutland, pbonzini, shuah,
linux-arm-kernel, linux-kernel, kvmarm, linux-perf-users,
linux-kselftest
Hi Oliver, thanks for the review.
Oliver Upton <oliver.upton@linux.dev> writes:
> On Sat, Feb 08, 2025 at 02:01:11AM +0000, Colton Lewis wrote:
>> The ARM architecture specifies that when MDCR_EL2.HPMN is set, EL1 and
>> EL0, which includes KVM guests, should read that value for PMCR.N.
>> Signed-off-by: Colton Lewis <coltonlewis@google.com>
>> ---
>> arch/arm64/kvm/debug.c | 3 +--
>> arch/arm64/kvm/pmu-emul.c | 8 +++++++-
>> tools/testing/selftests/kvm/arm64/vpmu_counter_access.c | 2 +-
>> 3 files changed, 9 insertions(+), 4 deletions(-)
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 0e4c805e7e89..7c04db00bf6c 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -36,8 +36,7 @@ static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu
>> *vcpu)
>> * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
>> * to disable guest access to the profiling and trace buffers
>> */
>> - vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
>> - *host_data_ptr(nr_event_counters));
>> + vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, read_mdcr());
> Please avoid unnecessary accesses to MDCR_EL2 at all costs. This is a
> guaranteed trap to EL2 with nested virt.
I thought it was necessary when I wrote this, but I see it's not after
thinking about it for a while.
The intended value is accessible from vcpu->kvm->arch.arm_pmu.hpmn as
written, but could be somewhere else after addressing your suggestion to
lift all MDCR handling into KVM.
>> vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
>> MDCR_EL2_TPMS |
>> MDCR_EL2_TTRF |
>> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
>> index 6c5950b9ceac..052ce8c721fe 100644
>> --- a/arch/arm64/kvm/pmu-emul.c
>> +++ b/arch/arm64/kvm/pmu-emul.c
>> @@ -993,12 +993,18 @@ static bool pmu_irq_is_valid(struct kvm *kvm, int
>> irq)
>> u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
>> {
>> struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
>> + u8 limit;
>> +
>> + if (arm_pmu->partitioned)
>> + limit = arm_pmu->hpmn - 1;
>> + else
>> + limit = ARMV8_PMU_MAX_GENERAL_COUNTERS;
>> /*
>> * The arm_pmu->cntr_mask considers the fixed counter(s) as well.
>> * Ignore those and return only the general-purpose counters.
>> */
>> - return bitmap_weight(arm_pmu->cntr_mask,
>> ARMV8_PMU_MAX_GENERAL_COUNTERS);
>> + return bitmap_weight(arm_pmu->cntr_mask, limit);
>> }
> This isn't necessary and is likely to regress the existing behavior.
> When the architecture says the lower ELs should see PMCR_EL0.N have the
> same value as MDCR_EL2.HPMN, what it really means is direct reads from
> hardware will return the value.
> So my expectation would be that a VM using the partitioned PMU
> implementation would never reach any of the *emulated* PMU handlers, as
> we would've disabled the associated traps.
Understood. The change was here to be able to read PMCR_EL0.N from
inside a VM, since I haven't disabled the associated traps yet.
It shouldn't be in this patch.
> The partitioned PMU will not replace the emulated vPMU implementation in
> KVM, so it'd be good to refactor what we have today to make room for
> your work. I think that'd be along the lines of:
> - Shared code for event filter enforcement and handling of the vPMU
> overflow IRQ.
> - Emulated PMU implementation that provides trap handlers for all PMUv3
> registers and backs into host perf
> - Partitioned PMU implementation that doesn't rely on traps and instead
> saves / restores a portion of the PMU that contains the guest
> context.
> These should be done in separate files, i.e. I do not want to see a
> bunch of inline handling to cope with emulated v. partitioned PMUs.
Agreed.
>> static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
>> diff --git a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
>> b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
>> index f16b3b27e32e..b5bc18b7528d 100644
>> --- a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
>> +++ b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
>> @@ -609,7 +609,7 @@ static void run_pmregs_validity_test(uint64_t pmcr_n)
>> */
>> static void run_error_test(uint64_t pmcr_n)
>> {
>> - pr_debug("Error test with pmcr_n %lu (larger than the host)\n",
>> pmcr_n);
>> + pr_debug("Error test with pmcr_n %lu (larger than the host allows)\n",
>> pmcr_n);
> NBD for an RFC, but in the future please do selftests changes in a
> separate patch.
Ok, I'll do that in the future.
end of thread, other threads:[~2025-02-10 23:08 UTC | newest]
Thread overview: 9+ messages
2025-02-08 2:01 [RFC PATCH v2 0/4] PMU partitioning driver support Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 1/4] perf: arm_pmuv3: Generalize counter bitmasks Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 2/4] perf: arm_pmuv3: Introduce module param to partition the PMU Colton Lewis
2025-02-10 7:25 ` Oliver Upton
2025-02-10 23:07 ` Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 3/4] perf: arm_pmuv3: Keep out of guest counter partition Colton Lewis
2025-02-08 2:01 ` [RFC PATCH v2 4/4] KVM: arm64: Make guests see only counters they can access Colton Lewis
2025-02-10 7:37 ` Oliver Upton
2025-02-10 23:08 ` Colton Lewis