linux-arm-kernel.lists.infradead.org archive mirror

* [RFC PATCH 00/14] KVM: arm64: Support FEAT_PMUv3 on Apple hardware
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

One of the interesting features of some Apple M* parts is an IMPDEF trap
that routes EL1/EL0 accesses of the PMUv3 registers to EL2. This allows
a hypervisor to emulate an architectural PMUv3 on top of the IMPDEF PMU
hardware present in the CPU.

And if you squint, this _might_ look like a CPU erratum :-)
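
Concretely: with the trap enabled, a guest can poke the architectural
registers as usual and the hypervisor fills in the rest. An
illustrative guest-side snippet (not from this series):

static inline u64 guest_read_cycle_counter(void)
{
	u64 cval;

	/* traps to EL2 on M* parts, where KVM emulates the access */
	asm volatile("mrs %0, pmccntr_el0" : "=r" (cval));
	return cval;
}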

This series takes advantage of these IMPDEF traps to provide PMUv3 to
KVM guests. As a starting point, only expose the fixed CPU cycle counter
and no event counters. Conveniently, this is enough to get Windows
running as a KVM guest on Apple hardware.

I've tried to keep the deviation to a minimum by refactoring some of the
flows used for PMUv3, e.g. computing PMCEID from the arm_pmu bitmap
instead of reading hardware directly.

Sending this as an RFC as there are some obvious open Qs:

 - Does enabling PMUv3 meet the bar for allowing IMPDEF crap to slide
   into KVM?

   I certainly think the answer is 'yes', especially considering this
   enables Windows guests.

 - Do we want to support programmable event counters?

   I'm sitting on some extra patches to do this, which map a few
   architectural event IDs into the PERF_TYPE_HARDWARE space, allowing
   the perf driver to select the right hardware event ID; a rough
   sketch follows below.

   Deciding on a sensible number of counters in this case is a bit of a
   mess, as the M* PMCs aren't entirely fungible (some events only work
   on specific counters).
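
   As a rough sketch of that mapping (hypothetical helper, not part of
   this series; the constants are the existing PMUv3 and generic perf
   event definitions, with the result used as attr.config alongside
   PERF_TYPE_HARDWARE):

   static int kvm_map_pmuv3_to_hw_event(u64 eventsel)
   {
   	switch (eventsel) {
   	case ARMV8_PMUV3_PERFCTR_CPU_CYCLES:
   		return PERF_COUNT_HW_CPU_CYCLES;
   	case ARMV8_PMUV3_PERFCTR_INST_RETIRED:
   		return PERF_COUNT_HW_INSTRUCTIONS;
   	case ARMV8_PMUV3_PERFCTR_BR_RETIRED:
   		return PERF_COUNT_HW_BRANCH_INSTRUCTIONS;
   	case ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED:
   		return PERF_COUNT_HW_BRANCH_MISSES;
   	default:
   		return -EOPNOTSUPP;
   	}
   }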

Applies to 6.13-rc1. I've only enabled this for M2 as it is the hardware
I have, but it is extremely likely this feature works on other M* parts.

It is also very possible I've broken something for true PMUv3 hardware,
as I've only tested on the M2.

Oliver Upton (14):
  drivers/perf: apple_m1: Refactor event select/filter configuration
  drivers/perf: apple_m1: Support host/guest event filtering
  drivers/perf: apple_m1: Map generic branch events
  KVM: arm64: Compute PMCEID from arm_pmu's event bitmaps
  KVM: arm64: Always allow fixed cycle counter
  KVM: arm64: Use PERF_COUNT_HW_CPU_CYCLES for fixed cycle counter
  KVM: arm64: Use a cpucap to determine if system supports FEAT_PMUv3
  KVM: arm64: Drop kvm_arm_pmu_available static key
  KVM: arm64: Use guard() to cleanup usage of arm_pmus_lock
  KVM: arm64: Move PMUVer filtering into KVM code
  KVM: arm64: Compute synthetic sysreg ESR for Apple PMUv3 traps
  KVM: arm64: Advertise PMUv3 if IMPDEF traps are present
  KVM: arm64: Advertise 0 event counters for IMPDEF PMU
  arm64: Enable IMP DEF PMUv3 traps on Apple M2

 arch/arm64/include/asm/apple_m1_pmu.h   |   1 +
 arch/arm64/include/asm/cpufeature.h     |  28 +----
 arch/arm64/kernel/cpu_errata.c          |  38 ++++++
 arch/arm64/kernel/cpufeature.c          |  19 +++
 arch/arm64/kernel/image-vars.h          |   5 -
 arch/arm64/kvm/arm.c                    |   4 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h |   4 +-
 arch/arm64/kvm/hyp/vhe/switch.c         |  22 ++++
 arch/arm64/kvm/pmu-emul.c               | 159 +++++++++++++++---------
 arch/arm64/kvm/pmu.c                    |  10 +-
 arch/arm64/tools/cpucaps                |   2 +
 drivers/perf/apple_m1_cpu_pmu.c         |  68 ++++++----
 include/kvm/arm_pmu.h                   |  15 +--
 13 files changed, 248 insertions(+), 127 deletions(-)


base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
-- 
2.39.5

* [RFC PATCH 01/14] drivers/perf: apple_m1: Refactor event select/filter configuration
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Supporting guest mode events will necessitate programming two event
filters. Prepare by splitting up the programming of the event selector +
event filter into separate helpers.

Opportunistically replace RMW patterns with sysreg_clear_set_s().

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/include/asm/apple_m1_pmu.h |  1 +
 drivers/perf/apple_m1_cpu_pmu.c       | 52 ++++++++++++++++-----------
 2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/apple_m1_pmu.h b/arch/arm64/include/asm/apple_m1_pmu.h
index 99483b19b99f..02e05d05851f 100644
--- a/arch/arm64/include/asm/apple_m1_pmu.h
+++ b/arch/arm64/include/asm/apple_m1_pmu.h
@@ -37,6 +37,7 @@
 #define PMCR0_PMI_ENABLE_8_9	GENMASK(45, 44)
 
 #define SYS_IMP_APL_PMCR1_EL1	sys_reg(3, 1, 15, 1, 0)
+#define SYS_IMP_APL_PMCR1_EL12	sys_reg(3, 1, 15, 7, 2)
 #define PMCR1_COUNT_A64_EL0_0_7	GENMASK(15, 8)
 #define PMCR1_COUNT_A64_EL1_0_7	GENMASK(23, 16)
 #define PMCR1_COUNT_A64_EL0_8_9	GENMASK(41, 40)
diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index 1d4d01e1275e..ecc71f6808dd 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -325,11 +325,10 @@ static void m1_pmu_disable_counter_interrupt(unsigned int index)
 	__m1_pmu_enable_counter_interrupt(index, false);
 }
 
-static void m1_pmu_configure_counter(unsigned int index, u8 event,
-				     bool user, bool kernel)
+static void __m1_pmu_configure_event_filter(unsigned int index, bool user,
+					    bool kernel)
 {
-	u64 val, user_bit, kernel_bit;
-	int shift;
+	u64 clear, set, user_bit, kernel_bit;
 
 	switch (index) {
 	case 0 ... 7:
@@ -344,19 +343,24 @@ static void m1_pmu_configure_counter(unsigned int index, u8 event,
 		BUG();
 	}
 
-	val = read_sysreg_s(SYS_IMP_APL_PMCR1_EL1);
-
+	clear = set = 0;
 	if (user)
-		val |= user_bit;
+		set |= user_bit;
 	else
-		val &= ~user_bit;
+		clear |= user_bit;
 
 	if (kernel)
-		val |= kernel_bit;
+		set |= kernel_bit;
 	else
-		val &= ~kernel_bit;
+		clear |= kernel_bit;
 
-	write_sysreg_s(val, SYS_IMP_APL_PMCR1_EL1);
+	sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL1, clear, set);
+}
+
+static void __m1_pmu_configure_eventsel(unsigned int index, u8 event)
+{
+	u64 clear = 0, set = 0;
+	int shift;
 
 	/*
 	 * Counters 0 and 1 have fixed events. For anything else,
@@ -369,21 +373,29 @@ static void m1_pmu_configure_counter(unsigned int index, u8 event,
 		break;
 	case 2 ... 5:
 		shift = (index - 2) * 8;
-		val = read_sysreg_s(SYS_IMP_APL_PMESR0_EL1);
-		val &= ~((u64)0xff << shift);
-		val |= (u64)event << shift;
-		write_sysreg_s(val, SYS_IMP_APL_PMESR0_EL1);
+		clear |= (u64)0xff << shift;
+		set |= (u64)event << shift;
+		sysreg_clear_set_s(SYS_IMP_APL_PMESR0_EL1, clear, set);
 		break;
 	case 6 ... 9:
 		shift = (index - 6) * 8;
-		val = read_sysreg_s(SYS_IMP_APL_PMESR1_EL1);
-		val &= ~((u64)0xff << shift);
-		val |= (u64)event << shift;
-		write_sysreg_s(val, SYS_IMP_APL_PMESR1_EL1);
+		clear |= (u64)0xff << shift;
+		set |= (u64)event << shift;
+		sysreg_clear_set_s(SYS_IMP_APL_PMESR1_EL1, clear, set);
 		break;
 	}
 }
 
+static void m1_pmu_configure_counter(unsigned int index, unsigned long config_base)
+{
+	bool kernel = config_base & M1_PMU_CFG_COUNT_KERNEL;
+	bool user = config_base & M1_PMU_CFG_COUNT_USER;
+	u8 evt = config_base & M1_PMU_CFG_EVENT;
+
+	__m1_pmu_configure_event_filter(index, user, kernel);
+	__m1_pmu_configure_eventsel(index, evt);
+}
+
 /* arm_pmu backend */
 static void m1_pmu_enable_event(struct perf_event *event)
 {
@@ -398,7 +410,7 @@ static void m1_pmu_enable_event(struct perf_event *event)
 	m1_pmu_disable_counter(event->hw.idx);
 	isb();
 
-	m1_pmu_configure_counter(event->hw.idx, evt, user, kernel);
+	m1_pmu_configure_counter(event->hw.idx, event->hw.config_base);
 	m1_pmu_enable_counter(event->hw.idx);
 	m1_pmu_enable_counter_interrupt(event->hw.idx);
 	isb();
-- 
2.39.5

* [RFC PATCH 02/14] drivers/perf: apple_m1: Support host/guest event filtering
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

The PMU appears to have a separate register for filtering 'guest'
exception levels (i.e. EL1 and !ELIsInHost(EL0)) which has the same
layout as PMCR1_EL1. Conveniently, there exists a VHE register alias
(PMCR1_EL12) that can be used to configure it.

Support guest events by programming the EL12 register with the intended
guest kernel/userspace filters. Limit support for guest events to VHE
(i.e. kernel running at EL2), as this avoids involving KVM in context
switching PMU registers. VHE is the only supported mode on M* parts anyway,
so this isn't an actual feature limitation.
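
As an example of what this enables, a host perf user asking for
guest-only cycles boils down to an event attribute along these lines
(sketch using the existing perf_event_attr exclusion bits; the perf
tool's "cycles:G" modifier requests the same thing):

	struct perf_event_attr attr = {
		.type		= PERF_TYPE_HARDWARE,
		.config		= PERF_COUNT_HW_CPU_CYCLES,
		.size		= sizeof(attr),
		.exclude_host	= 1,	/* count guest EL1/EL0 only */
	};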

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 drivers/perf/apple_m1_cpu_pmu.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index ecc71f6808dd..2a34523500f8 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -120,6 +120,8 @@ enum m1_pmu_events {
 	 */
 	M1_PMU_CFG_COUNT_USER					= BIT(8),
 	M1_PMU_CFG_COUNT_KERNEL					= BIT(9),
+	M1_PMU_CFG_COUNT_HOST					= BIT(10),
+	M1_PMU_CFG_COUNT_GUEST					= BIT(11),
 };
 
 /*
@@ -326,7 +328,7 @@ static void m1_pmu_disable_counter_interrupt(unsigned int index)
 }
 
 static void __m1_pmu_configure_event_filter(unsigned int index, bool user,
-					    bool kernel)
+					    bool kernel, bool host)
 {
 	u64 clear, set, user_bit, kernel_bit;
 
@@ -354,7 +356,10 @@ static void __m1_pmu_configure_event_filter(unsigned int index, bool user,
 	else
 		clear |= kernel_bit;
 
-	sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL1, clear, set);
+	if (host)
+		sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL1, clear, set);
+	else if (is_kernel_in_hyp_mode())
+		sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL12, clear, set);
 }
 
 static void __m1_pmu_configure_eventsel(unsigned int index, u8 event)
@@ -389,10 +394,13 @@ static void __m1_pmu_configure_eventsel(unsigned int index, u8 event)
 static void m1_pmu_configure_counter(unsigned int index, unsigned long config_base)
 {
 	bool kernel = config_base & M1_PMU_CFG_COUNT_KERNEL;
+	bool guest = config_base & M1_PMU_CFG_COUNT_GUEST;
+	bool host = config_base & M1_PMU_CFG_COUNT_HOST;
 	bool user = config_base & M1_PMU_CFG_COUNT_USER;
 	u8 evt = config_base & M1_PMU_CFG_EVENT;
 
-	__m1_pmu_configure_event_filter(index, user, kernel);
+	__m1_pmu_configure_event_filter(index, user && host, kernel && host, true);
+	__m1_pmu_configure_event_filter(index, user && guest, kernel && guest, false);
 	__m1_pmu_configure_eventsel(index, evt);
 }
 
@@ -568,7 +576,7 @@ static int m1_pmu_set_event_filter(struct hw_perf_event *event,
 {
 	unsigned long config_base = 0;
 
-	if (!attr->exclude_guest) {
+	if (!attr->exclude_guest && !is_kernel_in_hyp_mode()) {
 		pr_debug("ARM performance counters do not support mode exclusion\n");
 		return -EOPNOTSUPP;
 	}
@@ -576,6 +584,10 @@ static int m1_pmu_set_event_filter(struct hw_perf_event *event,
 		config_base |= M1_PMU_CFG_COUNT_KERNEL;
 	if (!attr->exclude_user)
 		config_base |= M1_PMU_CFG_COUNT_USER;
+	if (!attr->exclude_host)
+		config_base |= M1_PMU_CFG_COUNT_HOST;
+	if (!attr->exclude_guest)
+		config_base |= M1_PMU_CFG_COUNT_GUEST;
 
 	event->config_base = config_base;
 
-- 
2.39.5

* [RFC PATCH 03/14] drivers/perf: apple_m1: Map generic branch events
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Map the generic perf events for branch prediction stats to the
corresponding hardware events.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 drivers/perf/apple_m1_cpu_pmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index 2a34523500f8..d6d4ff6da862 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -170,6 +170,8 @@ static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
 	PERF_MAP_ALL_UNSUPPORTED,
 	[PERF_COUNT_HW_CPU_CYCLES]		= M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE,
 	[PERF_COUNT_HW_INSTRUCTIONS]		= M1_PMU_PERFCTR_INST_ALL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= M1_PMU_PERFCTR_INST_BRANCH,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC,
 };
 
 /* sysfs definitions */
-- 
2.39.5

* [RFC PATCH 04/14] KVM: arm64: Compute PMCEID from arm_pmu's event bitmaps
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

The PMUv3 driver populates a couple of bitmaps with the values of
PMCEID{0,1}, from which the guest's PMCEID{0,1} can be derived. This
is particularly convenient when virtualizing PMUv3 on IMP DEF hardware,
as reading the nonexistent PMCEID registers leads to a rather unpleasant
UNDEF.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 47 ++++++++++++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 456102bc0b55..809d65b912e8 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -861,8 +861,42 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
 	return pmu;
 }
 
+static u64 __compute_pmceid(struct arm_pmu *pmu, bool pmceid1)
+{
+	u32 hi[2], lo[2];
+
+	bitmap_to_arr32(lo, pmu->pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
+	bitmap_to_arr32(hi, pmu->pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
+
+	return ((u64)hi[pmceid1] << 32) | lo[pmceid1];
+}
+
+static u64 compute_pmceid0(struct arm_pmu *pmu)
+{
+	u64 val = __compute_pmceid(pmu, 0);
+
+	/* always support CHAIN */
+	val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN);
+	return val;
+}
+
+static u64 compute_pmceid1(struct arm_pmu *pmu)
+{
+	u64 val = __compute_pmceid(pmu, 1);
+
+	/*
+	 * Don't advertise STALL_SLOT*, as PMMIR_EL0 is handled
+	 * as RAZ
+	 */
+	val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) |
+		 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) |
+		 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32));
+	return val;
+}
+
 u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 {
+	struct arm_pmu *cpu_pmu = vcpu->kvm->arch.arm_pmu;
 	unsigned long *bmap = vcpu->kvm->arch.pmu_filter;
 	u64 val, mask = 0;
 	int base, i, nr_events;
@@ -871,19 +905,10 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 		return 0;
 
 	if (!pmceid1) {
-		val = read_sysreg(pmceid0_el0);
-		/* always support CHAIN */
-		val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN);
+		val = compute_pmceid0(cpu_pmu);
 		base = 0;
 	} else {
-		val = read_sysreg(pmceid1_el0);
-		/*
-		 * Don't advertise STALL_SLOT*, as PMMIR_EL0 is handled
-		 * as RAZ
-		 */
-		val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) |
-			 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) |
-			 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32));
+		val = compute_pmceid1(cpu_pmu);
 		base = 32;
 	}
 
-- 
2.39.5

* [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
lot of sense allowing userspace to filter it. Only apply the PMU event
filter to *programmed* event counters.

While at it, use the generic CPU_CYCLES perf event to back the cycle
counter, potentially allowing non-PMUv3 drivers to map the event onto
the underlying implementation.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 809d65b912e8..3e7091e1a2e4 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -707,26 +707,27 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
 	evtreg = kvm_pmc_read_evtreg(pmc);
 
 	kvm_pmu_stop_counter(pmc);
-	if (pmc->idx == ARMV8_PMU_CYCLE_IDX)
+	if (pmc->idx == ARMV8_PMU_CYCLE_IDX) {
 		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
-	else
+	} else {
 		eventsel = evtreg & kvm_pmu_event_mask(vcpu->kvm);
 
-	/*
-	 * Neither SW increment nor chained events need to be backed
-	 * by a perf event.
-	 */
-	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR ||
-	    eventsel == ARMV8_PMUV3_PERFCTR_CHAIN)
-		return;
+		/*
+		 * If we have a filter in place and that the event isn't
+		 * allowed, do not install a perf event either.
+		 */
+		if (vcpu->kvm->arch.pmu_filter &&
+		    !test_bit(eventsel, vcpu->kvm->arch.pmu_filter))
+			return;
 
-	/*
-	 * If we have a filter in place and that the event isn't allowed, do
-	 * not install a perf event either.
-	 */
-	if (vcpu->kvm->arch.pmu_filter &&
-	    !test_bit(eventsel, vcpu->kvm->arch.pmu_filter))
-		return;
+		/*
+		 * Neither SW increment nor chained events need to be backed
+		 * by a perf event.
+		 */
+		if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR ||
+		    eventsel == ARMV8_PMUV3_PERFCTR_CHAIN)
+			return;
+	}
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
 	attr.type = arm_pmu->pmu.type;
@@ -877,6 +878,8 @@ static u64 compute_pmceid0(struct arm_pmu *pmu)
 
 	/* always support CHAIN */
 	val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN);
+	/* always support CPU_CYCLES */
+	val |= BIT(ARMV8_PMUV3_PERFCTR_CPU_CYCLES);
 	return val;
 }
 
-- 
2.39.5

* [RFC PATCH 06/14] KVM: arm64: Use PERF_COUNT_HW_CPU_CYCLES for fixed cycle counter
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

KVM will soon be able to emulate PMUv3 on non-PMUv3 hardware. Use the
generic event for CPU cycles to allow a non-PMUv3 driver to map the
event correctly on underlying hardware.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 3e7091e1a2e4..0b2ad60717e8 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -700,16 +700,27 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
 {
 	struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
 	struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu;
+	struct perf_event_attr attr = {};
 	struct perf_event *event;
-	struct perf_event_attr attr;
 	u64 eventsel, evtreg;
 
 	evtreg = kvm_pmc_read_evtreg(pmc);
 
 	kvm_pmu_stop_counter(pmc);
 	if (pmc->idx == ARMV8_PMU_CYCLE_IDX) {
-		eventsel = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
+		/*
+		 * Use the common event space for the cycle counter, allowing
+		 * the underlying PMU driver to map it onto hardware in the
+		 * unlikely case the host doesn't actually have PMUv3.
+		 */
+		attr.type = PERF_TYPE_HARDWARE;
+		eventsel = PERF_COUNT_HW_CPU_CYCLES;
 	} else {
+		/*
+		 * Otherwise, treat the event as a raw event for the selected
+		 * PMU.
+		 */
+		attr.type = arm_pmu->pmu.type;
 		eventsel = evtreg & kvm_pmu_event_mask(vcpu->kvm);
 
 		/*
@@ -729,8 +740,6 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
 			return;
 	}
 
-	memset(&attr, 0, sizeof(struct perf_event_attr));
-	attr.type = arm_pmu->pmu.type;
 	attr.size = sizeof(attr);
 	attr.pinned = 1;
 	attr.disabled = !kvm_pmu_counter_is_enabled(pmc);
-- 
2.39.5

* [RFC PATCH 07/14] KVM: arm64: Use a cpucap to determine if system supports FEAT_PMUv3
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

KVM is about to learn some new tricks to virtualize PMUv3 on IMPDEF
hardware. As part of that, we now need to differentiate host support
from guest support for PMUv3.

Add a cpucap to determine if an architectural PMUv3 is present to guard
host usage of PMUv3 controls.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/include/asm/cpufeature.h     |  5 +++++
 arch/arm64/kernel/cpufeature.c          | 19 +++++++++++++++++++
 arch/arm64/kvm/hyp/include/hyp/switch.h |  4 ++--
 arch/arm64/kvm/pmu.c                    | 10 +++++-----
 arch/arm64/tools/cpucaps                |  1 +
 include/kvm/arm_pmu.h                   |  2 +-
 6 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index b64e49bd9d10..3718840e9767 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -867,6 +867,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
 	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
 }
 
+static inline bool system_supports_pmuv3(void)
+{
+	return cpus_have_final_cap(ARM64_HAS_PMUV3);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6ce71f444ed8..fa73fbdd8617 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1889,6 +1889,19 @@ static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
 }
 #endif
 
+static bool has_pmuv3(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	u64 dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
+	unsigned int pmuver;
+
+	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
+						      ID_AA64DFR0_EL1_PMUVer_SHIFT);
+	if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
+		return false;
+
+	return pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP;
+}
+
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define KPTI_NG_TEMP_VA		(-(1UL << PMD_SHIFT))
 
@@ -2990,6 +3003,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, GCS, IMP)
 	},
 #endif
+	{
+		.desc = "PMUv3",
+		.capability = ARM64_HAS_PMUV3,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_pmuv3,
+	},
 	{},
 };
 
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 34f53707892d..995aca419b1e 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -244,7 +244,7 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
 	 * counter, which could make a PMXEVCNTR_EL0 access UNDEF at
 	 * EL1 instead of being trapped to EL2.
 	 */
-	if (kvm_arm_support_pmu_v3()) {
+	if (system_supports_pmuv3()) {
 		struct kvm_cpu_context *hctxt;
 
 		write_sysreg(0, pmselr_el0);
@@ -281,7 +281,7 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
 	write_sysreg(*host_data_ptr(host_debug_state.mdcr_el2), mdcr_el2);
 
 	write_sysreg(0, hstr_el2);
-	if (kvm_arm_support_pmu_v3()) {
+	if (system_supports_pmuv3()) {
 		struct kvm_cpu_context *hctxt;
 
 		hctxt = host_data_ptr(host_ctxt);
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 0b3adf3e17b4..6b48a3d16d0d 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -41,7 +41,7 @@ void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr)
 {
 	struct kvm_pmu_events *pmu = kvm_get_pmu_events();
 
-	if (!kvm_arm_support_pmu_v3() || !kvm_pmu_switch_needed(attr))
+	if (!system_supports_pmuv3() || !kvm_pmu_switch_needed(attr))
 		return;
 
 	if (!attr->exclude_host)
@@ -57,7 +57,7 @@ void kvm_clr_pmu_events(u64 clr)
 {
 	struct kvm_pmu_events *pmu = kvm_get_pmu_events();
 
-	if (!kvm_arm_support_pmu_v3())
+	if (!system_supports_pmuv3())
 		return;
 
 	pmu->events_host &= ~clr;
@@ -133,7 +133,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
 	struct kvm_pmu_events *pmu;
 	u64 events_guest, events_host;
 
-	if (!kvm_arm_support_pmu_v3() || !has_vhe())
+	if (!system_supports_pmuv3() || !has_vhe())
 		return;
 
 	preempt_disable();
@@ -154,7 +154,7 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
 	struct kvm_pmu_events *pmu;
 	u64 events_guest, events_host;
 
-	if (!kvm_arm_support_pmu_v3() || !has_vhe())
+	if (!system_supports_pmuv3() || !has_vhe())
 		return;
 
 	pmu = kvm_get_pmu_events();
@@ -180,7 +180,7 @@ bool kvm_set_pmuserenr(u64 val)
 	struct kvm_cpu_context *hctxt;
 	struct kvm_vcpu *vcpu;
 
-	if (!kvm_arm_support_pmu_v3() || !has_vhe())
+	if (!system_supports_pmuv3() || !has_vhe())
 		return false;
 
 	vcpu = kvm_get_running_vcpu();
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eb17f59e543c..b291eb73f5e0 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -45,6 +45,7 @@ HAS_LSE_ATOMICS
 HAS_MOPS
 HAS_NESTED_VIRT
 HAS_PAN
+HAS_PMUV3
 HAS_S1PIE
 HAS_S1POE
 HAS_RAS_EXTN
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index e61dd7dd2286..7ef9eb3cede5 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -87,7 +87,7 @@ void kvm_vcpu_pmu_resync_el0(void);
  */
 #define kvm_pmu_update_vcpu_events(vcpu)				\
 	do {								\
-		if (!has_vhe() && kvm_arm_support_pmu_v3())		\
+		if (!has_vhe() && system_supports_pmuv3())		\
 			vcpu->arch.pmu.events = *kvm_get_pmu_events();	\
 	} while (0)
 
-- 
2.39.5

* [RFC PATCH 08/14] KVM: arm64: Drop kvm_arm_pmu_available static key
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

With the PMUv3 cpucap, kvm_arm_pmu_available is no longer used in the
hot path of guest entry/exit. On top of that, guest support for PMUv3
may not correlate with host support for the feature, e.g. on IMPDEF
hardware.

Throw out the static key and just inspect the list of PMUs to determine
if PMUv3 is supported for KVM guests.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kernel/image-vars.h |  5 -----
 arch/arm64/kvm/arm.c           |  4 ++--
 arch/arm64/kvm/pmu-emul.c      | 11 ++++++-----
 include/kvm/arm_pmu.h          | 13 +------------
 4 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 8f5422ed1b75..5919320bc802 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -109,11 +109,6 @@ KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);
 KVM_NVHE_ALIAS(__start___kvm_ex_table);
 KVM_NVHE_ALIAS(__stop___kvm_ex_table);
 
-/* PMU available static key */
-#ifdef CONFIG_HW_PERF_EVENTS
-KVM_NVHE_ALIAS(kvm_arm_pmu_available);
-#endif
-
 /* Position-independent library routines */
 KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
 KVM_NVHE_ALIAS_HYP(copy_page, __pi_copy_page);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a102c3aebdbc..081e638c674f 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -391,7 +391,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = get_num_wrps();
 		break;
 	case KVM_CAP_ARM_PMU_V3:
-		r = kvm_arm_support_pmu_v3();
+		r = kvm_supports_guest_pmuv3();
 		break;
 	case KVM_CAP_ARM_INJECT_SERROR_ESR:
 		r = cpus_have_final_cap(ARM64_HAS_RAS_EXTN);
@@ -1397,7 +1397,7 @@ static unsigned long system_supported_vcpu_features(void)
 	if (!cpus_have_final_cap(ARM64_HAS_32BIT_EL1))
 		clear_bit(KVM_ARM_VCPU_EL1_32BIT, &features);
 
-	if (!kvm_arm_support_pmu_v3())
+	if (!kvm_supports_guest_pmuv3())
 		clear_bit(KVM_ARM_VCPU_PMU_V3, &features);
 
 	if (!system_supports_sve())
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 0b2ad60717e8..1ff343d43b61 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -17,14 +17,18 @@
 
 #define PERF_ATTR_CFG1_COUNTER_64BIT	BIT(0)
 
-DEFINE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
-
 static LIST_HEAD(arm_pmus);
 static DEFINE_MUTEX(arm_pmus_lock);
 
 static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc);
 static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc);
 
+bool kvm_supports_guest_pmuv3(void)
+{
+	guard(mutex)(&arm_pmus_lock);
+	return !list_empty(&arm_pmus);
+}
+
 static struct kvm_vcpu *kvm_pmc_to_vcpu(const struct kvm_pmc *pmc)
 {
 	return container_of(pmc, struct kvm_vcpu, arch.pmu.pmc[pmc->idx]);
@@ -824,9 +828,6 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
 	entry->arm_pmu = pmu;
 	list_add_tail(&entry->entry, &arm_pmus);
 
-	if (list_is_singular(&arm_pmus))
-		static_branch_enable(&kvm_arm_pmu_available);
-
 out_unlock:
 	mutex_unlock(&arm_pmus_lock);
 }
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 7ef9eb3cede5..d3dcf5438315 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -37,13 +37,7 @@ struct arm_pmu_entry {
 	struct arm_pmu *arm_pmu;
 };
 
-DECLARE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
-
-static __always_inline bool kvm_arm_support_pmu_v3(void)
-{
-	return static_branch_likely(&kvm_arm_pmu_available);
-}
-
+bool kvm_supports_guest_pmuv3(void);
 #define kvm_arm_pmu_irq_initialized(v)	((v)->arch.pmu.irq_num >= VGIC_NR_SGIS)
 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx);
 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val);
@@ -103,11 +97,6 @@ void kvm_pmu_nested_transition(struct kvm_vcpu *vcpu);
 struct kvm_pmu {
 };
 
-static inline bool kvm_arm_support_pmu_v3(void)
-{
-	return false;
-}
-
 #define kvm_arm_pmu_irq_initialized(v)	(false)
 static inline u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu,
 					    u64 select_idx)
-- 
2.39.5

* [RFC PATCH 09/14] KVM: arm64: Use guard() to cleanup usage of arm_pmus_lock
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Get rid of some goto label patterns by using guard() to drop the
arm_pmus_lock when returning from a function.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 1ff343d43b61..b9756a0d054a 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -819,26 +819,23 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
 	if (!pmuv3_implemented(kvm_arm_pmu_get_pmuver_limit()))
 		return;
 
-	mutex_lock(&arm_pmus_lock);
+	guard(mutex)(&arm_pmus_lock);
 
 	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
-		goto out_unlock;
+		return;
 
 	entry->arm_pmu = pmu;
 	list_add_tail(&entry->entry, &arm_pmus);
-
-out_unlock:
-	mutex_unlock(&arm_pmus_lock);
 }
 
 static struct arm_pmu *kvm_pmu_probe_armpmu(void)
 {
-	struct arm_pmu *tmp, *pmu = NULL;
 	struct arm_pmu_entry *entry;
+	struct arm_pmu *pmu;
 	int cpu;
 
-	mutex_lock(&arm_pmus_lock);
+	guard(mutex)(&arm_pmus_lock);
 
 	/*
 	 * It is safe to use a stale cpu to iterate the list of PMUs so long as
@@ -859,17 +856,13 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
 	 */
 	cpu = raw_smp_processor_id();
 	list_for_each_entry(entry, &arm_pmus, entry) {
-		tmp = entry->arm_pmu;
+		pmu = entry->arm_pmu;
 
-		if (cpumask_test_cpu(cpu, &tmp->supported_cpus)) {
-			pmu = tmp;
-			break;
-		}
+		if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
+			return pmu;
 	}
 
-	mutex_unlock(&arm_pmus_lock);
-
-	return pmu;
+	return NULL;
 }
 
 static u64 __compute_pmceid(struct arm_pmu *pmu, bool pmceid1)
-- 
2.39.5

* [RFC PATCH 10/14] KVM: arm64: Move PMUVer filtering into KVM code
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

The supported guest PMU version on a particular platform is ultimately a
KVM decision. Move PMUVer filtering into KVM code.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/include/asm/cpufeature.h | 23 -----------------------
 arch/arm64/kvm/pmu-emul.c           | 15 +++++++++------
 2 files changed, 9 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 3718840e9767..50ffb09132e9 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -525,29 +525,6 @@ cpuid_feature_extract_unsigned_field(u64 features, int field)
 	return cpuid_feature_extract_unsigned_field_width(features, field, 4);
 }
 
-/*
- * Fields that identify the version of the Performance Monitors Extension do
- * not follow the standard ID scheme. See ARM DDI 0487E.a page D13-2825,
- * "Alternative ID scheme used for the Performance Monitors Extension version".
- */
-static inline u64 __attribute_const__
-cpuid_feature_cap_perfmon_field(u64 features, int field, u64 cap)
-{
-	u64 val = cpuid_feature_extract_unsigned_field(features, field);
-	u64 mask = GENMASK_ULL(field + 3, field);
-
-	/* Treat IMPLEMENTATION DEFINED functionality as unimplemented */
-	if (val == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
-		val = 0;
-
-	if (val > cap) {
-		features &= ~mask;
-		features |= (cap << field) & mask;
-	}
-
-	return features;
-}
-
 static inline u64 arm64_ftr_mask(const struct arm64_ftr_bits *ftrp)
 {
 	return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index b9756a0d054a..8a565817a814 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -1255,13 +1255,16 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 
 u8 kvm_arm_pmu_get_pmuver_limit(void)
 {
-	u64 tmp;
+	unsigned int pmuver;
 
-	tmp = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
-	tmp = cpuid_feature_cap_perfmon_field(tmp,
-					      ID_AA64DFR0_EL1_PMUVer_SHIFT,
-					      ID_AA64DFR0_EL1_PMUVer_V3P5);
-	return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), tmp);
+	pmuver = SYS_FIELD_GET(ID_AA64DFR0_EL1, PMUVer,
+			       read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1));
+
+	/* Treat IMPLEMENTATION DEFINED functionality as unimplemented */
+	if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
+		return 0;
+
+	return min(pmuver, ID_AA64DFR0_EL1_PMUVer_V3P5);
 }
 
 /**
-- 
2.39.5

* [RFC PATCH 11/14] KVM: arm64: Compute synthetic sysreg ESR for Apple PMUv3 traps
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Apple M* CPUs provide an IMPDEF trap for PMUv3 sysregs, where ESR_EL2.EC
is a reserved value (0x3F) and a sysreg-like ISS is reported in
AFSR1_EL2.

Compute a synthetic ESR for these PMUv3 traps, giving the illusion of
something architectural to the rest of KVM.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/hyp/vhe/switch.c | 22 ++++++++++++++++++++++
 arch/arm64/tools/cpucaps        |  1 +
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 80581b1c3995..da2ccaefaf6f 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -418,6 +418,25 @@ static bool kvm_hyp_handle_sysreg_vhe(struct kvm_vcpu *vcpu, u64 *exit_code)
 	return kvm_hyp_handle_sysreg(vcpu, exit_code);
 }
 
+static bool kvm_hyp_handle_impdef(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	u64 iss;
+
+	if (!cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS))
+		return false;
+
+	/*
+	 * Compute a synthetic ESR for a sysreg trap. Conveniently, AFSR1_EL2
+	 * is populated with a correct ISS for a sysreg trap. These fruity
+	 * parts are 64bit only, so unconditionally set IL.
+	 */
+	iss = ESR_ELx_ISS(read_sysreg_s(SYS_AFSR1_EL2));
+	vcpu->arch.fault.esr_el2 = FIELD_PREP(ESR_ELx_EC_MASK, ESR_ELx_EC_SYS64) |
+				   FIELD_PREP(ESR_ELx_ISS_MASK, iss) |
+				   ESR_ELx_IL;
+	return false;
+}
+
 static const exit_handler_fn hyp_exit_handlers[] = {
 	[0 ... ESR_ELx_EC_MAX]		= NULL,
 	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15_32,
@@ -429,6 +448,9 @@ static const exit_handler_fn hyp_exit_handlers[] = {
 	[ESR_ELx_EC_WATCHPT_LOW]	= kvm_hyp_handle_watchpt_low,
 	[ESR_ELx_EC_ERET]		= kvm_hyp_handle_eret,
 	[ESR_ELx_EC_MOPS]		= kvm_hyp_handle_mops,
+
+	/* Apple shenanigans */
+	[0x3F]				= kvm_hyp_handle_impdef,
 };
 
 static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index b291eb73f5e0..a9c7111b831e 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -105,6 +105,7 @@ WORKAROUND_CAVIUM_TX2_219_TVM
 WORKAROUND_CLEAN_CACHE
 WORKAROUND_DEVICE_LOAD_ACQUIRE
 WORKAROUND_NVIDIA_CARMEL_CNP
+WORKAROUND_PMUV3_IMPDEF_TRAPS
 WORKAROUND_QCOM_FALKOR_E1003
 WORKAROUND_REPEAT_TLBI
 WORKAROUND_SPECULATIVE_AT
-- 
2.39.5

* [RFC PATCH 12/14] KVM: arm64: Advertise PMUv3 if IMPDEF traps are present
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Advertise a baseline PMUv3 implementation when running on hardware with
IMPDEF traps of the PMUv3 sysregs.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 8a565817a814..34d9c08af209 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -1260,7 +1260,17 @@ u8 kvm_arm_pmu_get_pmuver_limit(void)
 	pmuver = SYS_FIELD_GET(ID_AA64DFR0_EL1, PMUVer,
 			       read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1));
 
-	/* Treat IMPLEMENTATION DEFINED functionality as unimplemented */
+	/*
+	 * Spoof a barebones PMUv3 implementation if the system supports IMPDEF
+	 * traps of the PMUv3 sysregs
+	 */
+	if (cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS))
+		return ID_AA64DFR0_EL1_PMUVer_IMP;
+
+	/*
+	 * Otherwise, treat IMPLEMENTATION DEFINED functionality as
+	 * unimplemented
+	 */
 	if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
 		return 0;
 
-- 
2.39.5

* [RFC PATCH 13/14] KVM: arm64: Advertise 0 event counters for IMPDEF PMU
From: Oliver Upton @ 2024-12-03 19:32 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

The programmable event counters on Apple M* parts are quite different
from what's available in PMUv3, as the event counters aren't fungible
(some events only work on specific counters) and the event ID space
doesn't match the architecture. Punt on the problem for now and
advertise 0 programmable event counters, leaving the guest with just
the fixed cycle counter.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/pmu-emul.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 34d9c08af209..8ac4eee781c9 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -1044,6 +1044,9 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
 {
 	struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
 
+	if (cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS))
+		return 0;
+
 	/*
 	 * The arm_pmu->cntr_mask considers the fixed counter(s) as well.
 	 * Ignore those and return only the general-purpose counters.
-- 
2.39.5

* [RFC PATCH 14/14] arm64: Enable IMP DEF PMUv3 traps on Apple M2
From: Oliver Upton @ 2024-12-03 19:34 UTC
  To: kvmarm
  Cc: Marc Zyngier, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
	Mingwei Zhang, Colton Lewis, Raghavendra Rao Ananta,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	linux-kernel, Oliver Upton

Apple M2 CPUs support IMPDEF traps of the PMUv3 sysregs, allowing a
hypervisor to virtualize an architectural PMU for a VM. Flip the
appropriate bit in HACR_EL2 on supporting hardware.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index a78f247029ae..441ee4ffc770 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -194,6 +194,37 @@ has_neoverse_n1_erratum_1542419(const struct arm64_cpu_capabilities *entry,
 	return is_midr_in_range(midr, &range) && has_dic;
 }
 
+static const struct midr_range impdef_pmuv3_cpus[] = {
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD_PRO),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE_PRO),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD_MAX),
+	MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE_MAX),
+	{},
+};
+
+static bool has_impdef_pmuv3(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	u64 dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
+	unsigned int pmuver;
+
+	if (!is_kernel_in_hyp_mode())
+		return false;
+
+	pmuver = cpuid_feature_extract_unsigned_field(dfr0,
+						      ID_AA64DFR0_EL1_PMUVer_SHIFT);
+	if (pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
+		return false;
+
+	return is_midr_in_range_list(read_cpuid_id(), impdef_pmuv3_cpus);
+}
+
+static void cpu_enable_impdef_pmuv3_traps(const struct arm64_cpu_capabilities *__unused)
+{
+	sysreg_clear_set_s(SYS_HACR_EL2, 0, BIT(56));
+}
+
 #ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
 static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
 #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
@@ -786,6 +817,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		ERRATA_MIDR_RANGE_LIST(erratum_ac03_cpu_38_list),
 	},
 #endif
+	{
+		.desc = "Apple IMPDEF PMUv3 Traps",
+		.capability = ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS,
+		.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+		.matches = has_impdef_pmuv3,
+		.cpu_enable = cpu_enable_impdef_pmuv3_traps,
+	},
 	{
 	}
 };
-- 
2.39.5

* Re: [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Marc Zyngier @ 2024-12-03 21:32 UTC
  To: Oliver Upton
  Cc: kvmarm, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mingwei Zhang,
	Colton Lewis, Raghavendra Rao Ananta, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Tue, 03 Dec 2024 19:32:11 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> lot of sense allowing userspace to filter it. Only apply the PMU event
> filter to *programmed* event counters.

But that's a change in ABI, isn't it? We explicitly say in the
documentation that the cycle counter can be filtered by specifying
event 0x11.
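
For reference, that ABI from the VMM side is roughly (vcpu_fd being the
vcpu file descriptor; see Documentation/virt/kvm/devices/vcpu.rst):

	struct kvm_pmu_event_filter filter = {
		.base_event	= 0x11,	/* CPU_CYCLES */
		.nevents	= 1,
		.action		= KVM_PMU_EVENT_DENY,
	};
	struct kvm_device_attr attr = {
		.group	= KVM_ARM_VCPU_PMU_V3_CTRL,
		.attr	= KVM_ARM_VCPU_PMU_V3_FILTER,
		.addr	= (__u64)(unsigned long)&filter,
	};

	ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);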

More importantly, the current filtering works in terms of events, and
not in terms of counters.

Instead of changing the ABI, how about simply not supporting filtering
on such non-compliant HW? Surely that would simplify a few things.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Oliver Upton @ 2024-12-03 22:32 UTC
  To: Marc Zyngier
  Cc: kvmarm, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mingwei Zhang,
	Colton Lewis, Raghavendra Rao Ananta, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote:
> On Tue, 03 Dec 2024 19:32:11 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> > lot of sense allowing userspace to filter it. Only apply the PMU event
> > filter to *programmed* event counters.
> 
> But that's a change in ABI, isn't it? We explicitly say in the
> documentation that the cycle counter can be filtered by specifying
> event 0x11.

Yeah... A bit of a dirty shortcut I took because I don't like the ABI,
but distaste isn't enough to break it :)

> More importantly, the current filtering works in terms of events, and
> not in terms of counters.
> 
> Instead of changing the ABI, how about simply not supporting filtering
> on such non-compliant HW? Surely that would simplify a few things.

Yeah, that sounds reasonable. Especially if we allow programmable event
counters where the event ID space doesn't match the architecture.

-- 
Thanks,
Oliver

* Re: [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Marc Zyngier @ 2024-12-04 09:04 UTC
  To: Oliver Upton
  Cc: kvmarm, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mingwei Zhang,
	Colton Lewis, Raghavendra Rao Ananta, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Tue, 03 Dec 2024 22:32:38 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Tue, Dec 03, 2024 at 09:32:10PM +0000, Marc Zyngier wrote:
> > On Tue, 03 Dec 2024 19:32:11 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > 
> > > The fixed CPU cycle counter is mandatory for PMUv3, so it doesn't make a
> > > lot of sense allowing userspace to filter it. Only apply the PMU event
> > > filter to *programmed* event counters.
> > 
> > But that's a change in ABI, isn't it? We explicitly say in the
> > documentation that the cycle counter can be filtered by specifying
> > event 0x11.
> 
> Yeah... A bit of a dirty shortcut I took because I don't like the ABI,
> but distaste isn't enough to break it :)
> 
> > More importantly, the current filtering works in terms of events, and
> > not in terms of counters.
> > 
> > Instead of changing the ABI, how about simply not supporting filtering
> > on such non-compliant HW? Surely that would simplify a few things.
> 
> Yeah, that sounds reasonable. Especially if we allow programmable event
> counters where the event ID space doesn't match the architecture.

Another thing I have been wondering is if a slightly better approach
would be to move some of the handling to the PMU driver itself, and
let it emulate PMUv3 if it can. This would allow conversion of event
numbers in situ rather than polluting the PMUv3 code in KVM.

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Oliver Upton @ 2024-12-04 21:56 UTC
  To: Marc Zyngier
  Cc: kvmarm, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mingwei Zhang,
	Colton Lewis, Raghavendra Rao Ananta, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote:
> On Tue, 03 Dec 2024 22:32:38 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > > More importantly, the current filtering works in terms of events, and
> > > not in terms of counters.
> > > 
> > > Instead of changing the ABI, how about simply not supporting filtering
> > > on such non-compliant HW? Surely that would simplify a few things.
> > 
> > Yeah, that sounds reasonable. Especially if we allow programmable event
> > counters where the event ID space doesn't match the architecture.
> 
> Another thing I have been wondering is if a slightly better approach
> would be to move some of the handling to the PMU driver itself, and
> let it emulate PMUv3 if it can. This would allow conversion of event
> numbers in situ rather than polluting the PMUv3 code in KVM.

Sure, but I think the actual event fed into perf_event_create_kernel_counter()
should be the correct hardware event, not a PMUv3 event reinterpreted
behind the scenes. Otherwise, we'd need to devise an alternate config encoding
for PMUv3-like events since the event ID spaces overlap.

I'm thinking this could be a helper in the arm_pmu struct that takes a
PMUv3 event and spits out (in this case) an M1 event. The resulting KVM
code would be miniscule, like:

u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel)
{
	struct arm_pmu *pmu = kvm->arch.arm_pmu;

	if (!pmu->map_pmuv3_event)
		return eventsel;

	return pmu->map_pmuv3_event(eventsel);
}

static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
{

	[...]

	attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel);
	event = perf_event_create_kernel_counter(&attr, ...);
}

We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap
with the events it knows about and get PMCEID emulation for free.
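
Driver side, something like (hook name entirely hypothetical; the M1
event names are the ones the driver already uses):

static int m1_pmu_map_pmuv3_event(unsigned int eventsel)
{
	switch (eventsel) {
	case ARMV8_PMUV3_PERFCTR_CPU_CYCLES:
		return M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE;
	case ARMV8_PMUV3_PERFCTR_INST_RETIRED:
		return M1_PMU_PERFCTR_INST_ALL;
	case ARMV8_PMUV3_PERFCTR_BR_RETIRED:
		return M1_PMU_PERFCTR_INST_BRANCH;
	default:
		return -EOPNOTSUPP;
	}
}

static void m1_pmu_init_pmceid(struct arm_pmu *pmu)
{
	/* advertise exactly the events we know how to map */
	set_bit(ARMV8_PMUV3_PERFCTR_CPU_CYCLES, pmu->pmceid_bitmap);
	set_bit(ARMV8_PMUV3_PERFCTR_INST_RETIRED, pmu->pmceid_bitmap);
	set_bit(ARMV8_PMUV3_PERFCTR_BR_RETIRED, pmu->pmceid_bitmap);
}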

-- 
Thanks,
Oliver

* Re: [RFC PATCH 05/14] KVM: arm64: Always allow fixed cycle counter
From: Marc Zyngier @ 2024-12-10 09:49 UTC
  To: Oliver Upton
  Cc: kvmarm, Joey Gouly, Suzuki K Poulose, Zenghui Yu, Mingwei Zhang,
	Colton Lewis, Raghavendra Rao Ananta, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel

On Wed, 04 Dec 2024 21:56:58 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Wed, Dec 04, 2024 at 09:04:26AM +0000, Marc Zyngier wrote:
> > On Tue, 03 Dec 2024 22:32:38 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > > More importantly, the current filtering works in terms of events, and
> > > > not in terms of counters.
> > > > 
> > > > Instead of changing the ABI, how about simply not supporting filtering
> > > > on such non-compliant HW? Surely that would simplify a few things.
> > > 
> > > Yeah, that sounds reasonable. Especially if we allow programmable event
> > > counters where the event ID space doesn't match the architecture.
> > 
> > Another thing I have been wondering is if a slightly better approach
> > would be to move some of the handling to the PMU driver itself, and
> > let it emulate PMUv3 if it can. This would allow conversion of event
> > numbers in situ rather than polluting the PMUv3 code in KVM.
> 
> Sure, but I think the actual event fed into perf_event_create_kernel_counter()
> should be the correct hardware event, not a PMUv3 event reinterpreted
> behind the scenes. Otherwise, we'd need to devise an alternate config encoding
> for PMUv3-like events since the event ID spaces overlap.
> 
> I'm thinking this could be a helper in the arm_pmu struct that takes a
> PMUv3 event and spits out (in this case) an M1 event. The resulting KVM
> code would be miniscule, like:
> 
> u64 kvm_map_pmu_event(struct kvm *kvm, u64 eventsel)
> {
> 	struct arm_pmu *pmu = kvm->arch.arm_pmu;
> 
> 	if (!pmu->map_pmuv3_event)
> 		return eventsel;
> 
> 	return pmu->map_pmuv3_event(eventsel);
> }
> 
> static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
> {
> 
> 	[...]
> 
> 	attr.config = kvm_map_pmu_event(vcpu->kvm, eventsel);
> 	event = perf_event_create_kernel_counter(&attr, ...);
> }
> 
> We could even have the M1 PMU driver populate arm_pmu->pmceid_bitmap
> with the events it knows about and get PMCEID emulation for free.

CONFIG_PMUv3_COMPAT? ;-)

I think this is probably the least intrusive thing to do for KVM
(outside of the butt-ugly trap decoding thingy), and it squarely puts
the responsibility on the PMU driver to expose something that makes
sense in a given context.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.