* [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
@ 2023-08-20 9:01 Marc Zyngier
2023-08-22 6:48 ` Mark Rutland
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Marc Zyngier @ 2023-08-20 9:01 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: James Morse, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Huang Shijie, Leo Yan, Mark Rutland, Will Deacon
Huang Shijie reports that, when profiling a guest from the host
with a number of events that exceeds the number of available
counters, the reported counts are wildly inaccurate. Without
the counter oversubscription, the reported counts are correct.
Their investigation indicates that upon counter rotation (which
takes place on the back of a timer interrupt), we fail to
re-apply the guest EL0 enabling, leading to the counting of host
events instead of guest events.
In order to solve this, add yet another hook between the host PMU
driver and KVM, re-applying the guest EL0 configuration if the
right conditions apply (the host is VHE, we are in interrupt
context, and we interrupted a running vcpu). This triggers a new
vcpu request which will apply the correct configuration on guest
reentry.
With this, we have the correct counts, even when the counters are
oversubscribed.
Reported-by: Huang Shijie <shijie@os.amperecomputing.com>
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Tested_by: Huang Shijie <shijie@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.com
---
Notes:
V2: Fixed 32bit compilation
arch/arm/include/asm/arm_pmuv3.h | 2 ++
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/kvm/arm.c | 3 +++
arch/arm64/kvm/pmu.c | 18 ++++++++++++++++++
drivers/perf/arm_pmuv3.c | 2 ++
include/kvm/arm_pmu.h | 2 ++
6 files changed, 28 insertions(+)
diff --git a/arch/arm/include/asm/arm_pmuv3.h b/arch/arm/include/asm/arm_pmuv3.h
index f3cd04ff022d..72529f5e2bed 100644
--- a/arch/arm/include/asm/arm_pmuv3.h
+++ b/arch/arm/include/asm/arm_pmuv3.h
@@ -227,6 +227,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
return false;
}
+static inline void kvm_vcpu_pmu_resync_el0(void) {}
+
/* PMU Version in DFR Register */
#define ARMV8_PMU_DFR_VER_NI 0
#define ARMV8_PMU_DFR_VER_V3P4 0x5
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d3dd05bbfe23..553040e0e375 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -49,6 +49,7 @@
#define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6)
+#define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7)
#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
KVM_DIRTY_LOG_INITIALLY_SET)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 72dc53a75d1c..978b0411082f 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -803,6 +803,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
kvm_pmu_handle_pmcr(vcpu,
__vcpu_sys_reg(vcpu, PMCR_EL0));
+ if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
+ kvm_vcpu_pmu_restore_guest(vcpu);
+
if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
return kvm_vcpu_suspend(vcpu);
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 121f1a14c829..0eea225fd09a 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -236,3 +236,21 @@ bool kvm_set_pmuserenr(u64 val)
ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
return true;
}
+
+/*
+ * If we interrupted the guest to update the host PMU context, make
+ * sure we re-apply the guest EL0 state.
+ */
+void kvm_vcpu_pmu_resync_el0(void)
+{
+ struct kvm_vcpu *vcpu;
+
+ if (!has_vhe() || !in_interrupt())
+ return;
+
+ vcpu = kvm_get_running_vcpu();
+ if (!vcpu)
+ return;
+
+ kvm_make_request(KVM_REQ_RESYNC_PMU_EL0, vcpu);
+}
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 08b3a1bf0ef6..6a3d8176f54a 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
/* Enable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
+
+ kvm_vcpu_pmu_resync_el0();
}
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 847da6fc2713..3a8a70a60794 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -74,6 +74,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu);
struct kvm_pmu_events *kvm_get_pmu_events(void);
void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
+void kvm_vcpu_pmu_resync_el0(void);
#define kvm_vcpu_has_pmu(vcpu) \
(test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
@@ -171,6 +172,7 @@ static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
{
return 0;
}
+static inline void kvm_vcpu_pmu_resync_el0(void) {}
#endif
--
2.39.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-20 9:01 [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation Marc Zyngier
@ 2023-08-22 6:48 ` Mark Rutland
2023-08-22 9:59 ` Will Deacon
2023-08-22 12:39 ` Marc Zyngier
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Mark Rutland @ 2023-08-22 6:48 UTC (permalink / raw)
To: Marc Zyngier, Will Deacon
Cc: linux-arm-kernel, kvmarm, James Morse, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Huang Shijie, Leo Yan
On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
>
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
>
> In order to solve this, add yet another hook between the host PMU
> driver and KVM, re-applying the guest EL0 configuration if the
> right conditions apply (the host is VHE, we are in interrupt
> context, and we interrupted a running vcpu). This triggers a new
> vcpu request which will apply the correct configuration on guest
> reentry.
>
> With this, we have the correct counts, even when the counters are
> oversubscribed.
>
> Reported-by: Huang Shijie <shijie@os.amperecomputing.com>
> Suggested-by: Oliver Upton <oliver.upton@linux.dev>
> Tested_by: Huang Shijie <shijie@os.amperecomputing.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Link: https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.com
This looks sane to me so:
Acked-by: Mark Rutland <mark.rutland@arm.com>
It doesn't look like we have any PMU patches queued that would conflict, so if
Will's happy I reckon this should go through the KVM/arm64 tree.
Mark.
> ---
>
> Notes:
> V2: Fixed 32bit compilation
>
> arch/arm/include/asm/arm_pmuv3.h | 2 ++
> arch/arm64/include/asm/kvm_host.h | 1 +
> arch/arm64/kvm/arm.c | 3 +++
> arch/arm64/kvm/pmu.c | 18 ++++++++++++++++++
> drivers/perf/arm_pmuv3.c | 2 ++
> include/kvm/arm_pmu.h | 2 ++
> 6 files changed, 28 insertions(+)
>
> diff --git a/arch/arm/include/asm/arm_pmuv3.h b/arch/arm/include/asm/arm_pmuv3.h
> index f3cd04ff022d..72529f5e2bed 100644
> --- a/arch/arm/include/asm/arm_pmuv3.h
> +++ b/arch/arm/include/asm/arm_pmuv3.h
> @@ -227,6 +227,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
> return false;
> }
>
> +static inline void kvm_vcpu_pmu_resync_el0(void) {}
> +
> /* PMU Version in DFR Register */
> #define ARMV8_PMU_DFR_VER_NI 0
> #define ARMV8_PMU_DFR_VER_V3P4 0x5
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d3dd05bbfe23..553040e0e375 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -49,6 +49,7 @@
> #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
> #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
> #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6)
> +#define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7)
>
> #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
> KVM_DIRTY_LOG_INITIALLY_SET)
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 72dc53a75d1c..978b0411082f 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -803,6 +803,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> kvm_pmu_handle_pmcr(vcpu,
> __vcpu_sys_reg(vcpu, PMCR_EL0));
>
> + if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
> + kvm_vcpu_pmu_restore_guest(vcpu);
> +
> if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
> return kvm_vcpu_suspend(vcpu);
>
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index 121f1a14c829..0eea225fd09a 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -236,3 +236,21 @@ bool kvm_set_pmuserenr(u64 val)
> ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> return true;
> }
> +
> +/*
> + * If we interrupted the guest to update the host PMU context, make
> + * sure we re-apply the guest EL0 state.
> + */
> +void kvm_vcpu_pmu_resync_el0(void)
> +{
> + struct kvm_vcpu *vcpu;
> +
> + if (!has_vhe() || !in_interrupt())
> + return;
> +
> + vcpu = kvm_get_running_vcpu();
> + if (!vcpu)
> + return;
> +
> + kvm_make_request(KVM_REQ_RESYNC_PMU_EL0, vcpu);
> +}
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 08b3a1bf0ef6..6a3d8176f54a 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>
> /* Enable all counters */
> armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> +
> + kvm_vcpu_pmu_resync_el0();
> }
>
> static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 847da6fc2713..3a8a70a60794 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -74,6 +74,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu);
> struct kvm_pmu_events *kvm_get_pmu_events(void);
> void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_pmu_resync_el0(void);
>
> #define kvm_vcpu_has_pmu(vcpu) \
> (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
> @@ -171,6 +172,7 @@ static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
> {
> return 0;
> }
> +static inline void kvm_vcpu_pmu_resync_el0(void) {}
>
> #endif
>
> --
> 2.39.2
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-22 6:48 ` Mark Rutland
@ 2023-08-22 9:59 ` Will Deacon
0 siblings, 0 replies; 9+ messages in thread
From: Will Deacon @ 2023-08-22 9:59 UTC (permalink / raw)
To: Mark Rutland
Cc: Marc Zyngier, linux-arm-kernel, kvmarm, James Morse,
Suzuki K Poulose, Oliver Upton, Zenghui Yu, Huang Shijie, Leo Yan
On Tue, Aug 22, 2023 at 07:48:28AM +0100, Mark Rutland wrote:
> On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> > Huang Shijie reports that, when profiling a guest from the host
> > with a number of events that exceeds the number of available
> > counters, the reported counts are wildly inaccurate. Without
> > the counter oversubscription, the reported counts are correct.
> >
> > Their investigation indicates that upon counter rotation (which
> > takes place on the back of a timer interrupt), we fail to
> > re-apply the guest EL0 enabling, leading to the counting of host
> > events instead of guest events.
> >
> > In order to solve this, add yet another hook between the host PMU
> > driver and KVM, re-applying the guest EL0 configuration if the
> > right conditions apply (the host is VHE, we are in interrupt
> > context, and we interrupted a running vcpu). This triggers a new
> > vcpu request which will apply the correct configuration on guest
> > reentry.
> >
> > With this, we have the correct counts, even when the counters are
> > oversubscribed.
> >
> > Reported-by: Huang Shijie <shijie@os.amperecomputing.com>
> > Suggested-by: Oliver Upton <oliver.upton@linux.dev>
> > Tested_by: Huang Shijie <shijie@os.amperecomputing.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: Leo Yan <leo.yan@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Link: https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.com
>
> This looks sane to me so:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
>
> It doesn't look like we have any PMU patches queued that would conflict, so if
> Will's happy I reckon this should go through the KVM/arm64 tree.
Fine by me!
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-20 9:01 [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation Marc Zyngier
2023-08-22 6:48 ` Mark Rutland
@ 2023-08-22 12:39 ` Marc Zyngier
2023-08-22 13:45 ` Leo Yan
2023-08-23 7:15 ` Alexander Stein
3 siblings, 0 replies; 9+ messages in thread
From: Marc Zyngier @ 2023-08-22 12:39 UTC (permalink / raw)
To: kvmarm, Marc Zyngier, linux-arm-kernel
Cc: Will Deacon, Suzuki K Poulose, Oliver Upton, Mark Rutland,
Zenghui Yu, Huang Shijie, James Morse, Leo Yan
On Sun, 20 Aug 2023 10:01:08 +0100, Marc Zyngier wrote:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
>
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
>
> [...]
Applied to next, thanks!
[1/1] KVM: arm64: pmu: Resync EL0 state on counter rotation
commit: b1f778a223a2a8ad6262e5233cfc3483bcf6e213
Cheers,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-20 9:01 [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation Marc Zyngier
2023-08-22 6:48 ` Mark Rutland
2023-08-22 12:39 ` Marc Zyngier
@ 2023-08-22 13:45 ` Leo Yan
2023-08-22 13:49 ` Leo Yan
2023-08-23 7:15 ` Alexander Stein
3 siblings, 1 reply; 9+ messages in thread
From: Leo Yan @ 2023-08-22 13:45 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-arm-kernel, kvmarm, James Morse, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Huang Shijie, Mark Rutland, Will Deacon
Hi,
On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
>
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
>
> In order to solve this, add yet another hook between the host PMU
> driver and KVM, re-applying the guest EL0 configuration if the
> right conditions apply (the host is VHE, we are in interrupt
> context, and we interrupted a running vcpu). This triggers a new
> vcpu request which will apply the correct configuration on guest
> reentry.
>
> With this, we have the correct counts, even when the counters are
> oversubscribed.
I gave a test for this patch, It works well.
However, I do see this patch can introduce huge amount invoking
kvm_vcpu_pmu_restore_guest() when using 'perf record' command.
As I mentioned in the patch v2, we can call kvm_vcpu_pmu_resync_el0()
in the function kvm_set_pmu_events() rather than in armv8pmu_start().
With this change, the kernel only syncs PMU context when the host and
the guest have different traceing for EL0. Just paste the suggested
code for reference:
@@ -46,6 +48,8 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
pmu->events_host |= set;
if (!attr->exclude_guest)
pmu->events_guest |= set;
+
+ kvm_vcpu_pmu_resync_el0();
}
Below is the comparison result for counting resync, the result is for
counting how many times kvm_vcpu_pmu_restore_guest() is called for
'perf stat' and 'perf record' commands.
| perf stat(*) | perf record(**)
-----------------+------------------+-----------------
Patch v3: | 2506 | 47325
Proposed change: | 2514 | 2504
(*): sudo ./perf stat -a -e cycles:G,cycles:H -d -d -d sleep 10
(**): sudo ./perf record -a -e cycles:G,cycles:H -d -d -d sleep 10
Thanks,
Leo
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-22 13:45 ` Leo Yan
@ 2023-08-22 13:49 ` Leo Yan
0 siblings, 0 replies; 9+ messages in thread
From: Leo Yan @ 2023-08-22 13:49 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-arm-kernel, kvmarm, James Morse, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Huang Shijie, Mark Rutland, Will Deacon
On Tue, Aug 22, 2023 at 09:45:16PM +0800, Leo Yan wrote:
> Hi,
>
> On Sun, Aug 20, 2023 at 10:01:08AM +0100, Marc Zyngier wrote:
> > Huang Shijie reports that, when profiling a guest from the host
> > with a number of events that exceeds the number of available
> > counters, the reported counts are wildly inaccurate. Without
> > the counter oversubscription, the reported counts are correct.
> >
> > Their investigation indicates that upon counter rotation (which
> > takes place on the back of a timer interrupt), we fail to
> > re-apply the guest EL0 enabling, leading to the counting of host
> > events instead of guest events.
> >
> > In order to solve this, add yet another hook between the host PMU
> > driver and KVM, re-applying the guest EL0 configuration if the
> > right conditions apply (the host is VHE, we are in interrupt
> > context, and we interrupted a running vcpu). This triggers a new
> > vcpu request which will apply the correct configuration on guest
> > reentry.
> >
> > With this, we have the correct counts, even when the counters are
> > oversubscribed.
>
> I gave a test for this patch, It works well.
>
> However, I do see this patch can introduce huge amount invoking
> kvm_vcpu_pmu_restore_guest() when using 'perf record' command.
>
> As I mentioned in the patch v2, we can call kvm_vcpu_pmu_resync_el0()
> in the function kvm_set_pmu_events() rather than in armv8pmu_start().
> With this change, the kernel only syncs PMU context when the host and
> the guest have different traceing for EL0. Just paste the suggested
> code for reference:
>
> @@ -46,6 +48,8 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
> pmu->events_host |= set;
> if (!attr->exclude_guest)
> pmu->events_guest |= set;
> +
> + kvm_vcpu_pmu_resync_el0();
> }
>
> Below is the comparison result for counting resync, the result is for
> counting how many times kvm_vcpu_pmu_restore_guest() is called for
> 'perf stat' and 'perf record' commands.
>
> | perf stat(*) | perf record(**)
> -----------------+------------------+-----------------
> Patch v3: | 2506 | 47325
Here should be "Patch v2", sorry for typo.
> Proposed change: | 2514 | 2504
>
> (*): sudo ./perf stat -a -e cycles:G,cycles:H -d -d -d sleep 10
> (**): sudo ./perf record -a -e cycles:G,cycles:H -d -d -d sleep 10
>
> Thanks,
> Leo
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-20 9:01 [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation Marc Zyngier
` (2 preceding siblings ...)
2023-08-22 13:45 ` Leo Yan
@ 2023-08-23 7:15 ` Alexander Stein
2023-08-23 9:21 ` Marc Zyngier
3 siblings, 1 reply; 9+ messages in thread
From: Alexander Stein @ 2023-08-23 7:15 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: James Morse, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Huang Shijie, Leo Yan, Mark Rutland, Will Deacon, Marc Zyngier
Hi,
Am Sonntag, 20. August 2023, 11:01:08 CEST schrieb Marc Zyngier:
> Huang Shijie reports that, when profiling a guest from the host
> with a number of events that exceeds the number of available
> counters, the reported counts are wildly inaccurate. Without
> the counter oversubscription, the reported counts are correct.
>
> Their investigation indicates that upon counter rotation (which
> takes place on the back of a timer interrupt), we fail to
> re-apply the guest EL0 enabling, leading to the counting of host
> events instead of guest events.
>
> In order to solve this, add yet another hook between the host PMU
> driver and KVM, re-applying the guest EL0 configuration if the
> right conditions apply (the host is VHE, we are in interrupt
> context, and we interrupted a running vcpu). This triggers a new
> vcpu request which will apply the correct configuration on guest
> reentry.
>
> With this, we have the correct counts, even when the counters are
> oversubscribed.
>
> Reported-by: Huang Shijie <shijie@os.amperecomputing.com>
> Suggested-by: Oliver Upton <oliver.upton@linux.dev>
> Tested_by: Huang Shijie <shijie@os.amperecomputing.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Link:
> https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.c
> om ---
>
> Notes:
> V2: Fixed 32bit compilation
>
> arch/arm/include/asm/arm_pmuv3.h | 2 ++
> arch/arm64/include/asm/kvm_host.h | 1 +
> arch/arm64/kvm/arm.c | 3 +++
> arch/arm64/kvm/pmu.c | 18 ++++++++++++++++++
> drivers/perf/arm_pmuv3.c | 2 ++
> include/kvm/arm_pmu.h | 2 ++
> 6 files changed, 28 insertions(+)
>
> diff --git a/arch/arm/include/asm/arm_pmuv3.h
> b/arch/arm/include/asm/arm_pmuv3.h index f3cd04ff022d..72529f5e2bed 100644
> --- a/arch/arm/include/asm/arm_pmuv3.h
> +++ b/arch/arm/include/asm/arm_pmuv3.h
> @@ -227,6 +227,8 @@ static inline bool kvm_set_pmuserenr(u64 val)
> return false;
> }
>
> +static inline void kvm_vcpu_pmu_resync_el0(void) {}
> +
> /* PMU Version in DFR Register */
> #define ARMV8_PMU_DFR_VER_NI 0
> #define ARMV8_PMU_DFR_VER_V3P4 0x5
> diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h index d3dd05bbfe23..553040e0e375 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -49,6 +49,7 @@
> #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
> #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
> #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6)
> +#define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7)
>
> #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
> \ KVM_DIRTY_LOG_INITIALLY_SET)
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 72dc53a75d1c..978b0411082f 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -803,6 +803,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> kvm_pmu_handle_pmcr(vcpu,
> __vcpu_sys_reg(vcpu,
PMCR_EL0));
>
> + if (kvm_check_request(KVM_REQ_RESYNC_PMU_EL0, vcpu))
> + kvm_vcpu_pmu_restore_guest(vcpu);
> +
> if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
> return kvm_vcpu_suspend(vcpu);
>
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index 121f1a14c829..0eea225fd09a 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -236,3 +236,21 @@ bool kvm_set_pmuserenr(u64 val)
> ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> return true;
> }
> +
> +/*
> + * If we interrupted the guest to update the host PMU context, make
> + * sure we re-apply the guest EL0 state.
> + */
> +void kvm_vcpu_pmu_resync_el0(void)
> +{
> + struct kvm_vcpu *vcpu;
> +
> + if (!has_vhe() || !in_interrupt())
> + return;
> +
> + vcpu = kvm_get_running_vcpu();
> + if (!vcpu)
> + return;
> +
> + kvm_make_request(KVM_REQ_RESYNC_PMU_EL0, vcpu);
> +}
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 08b3a1bf0ef6..6a3d8176f54a 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
>
> /* Enable all counters */
> armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> +
> + kvm_vcpu_pmu_resync_el0();
This breaks if CONFIG_HW_PERF_EVENTS is enabled but CONFIG_KVM is not. As in
this case arch/arm64/kvm/pmu.c is never compiled, but the dummy inline in
include/kvm/arm_pmu.h for kvm_vcpu_pmu_resync_el0() only depends on
CONFIG_HW_PERF_EVENTS.
This results in this error:
aarch64-v8a-linux-gnu-ld: Unexpected GOT/PLT entries detected!
aarch64-v8a-linux-gnu-ld: Unexpected run-time procedure linkages detected!
aarch64-v8a-linux-gnu-ld: drivers/perf/arm_pmuv3.o: in function
`armv8pmu_start':
drivers/perf/arm_pmuv3.c:753: undefined reference to `kvm_vcpu_pmu_resync_el0'
Best regards,
Alexander
> }
>
> static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 847da6fc2713..3a8a70a60794 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -74,6 +74,7 @@ int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu);
> struct kvm_pmu_events *kvm_get_pmu_events(void);
> void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_pmu_resync_el0(void);
>
> #define kvm_vcpu_has_pmu(vcpu)
\
> (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
> @@ -171,6 +172,7 @@ static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
> {
> return 0;
> }
> +static inline void kvm_vcpu_pmu_resync_el0(void) {}
>
> #endif
--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-23 7:15 ` Alexander Stein
@ 2023-08-23 9:21 ` Marc Zyngier
2023-08-23 11:15 ` Alexander Stein
0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2023-08-23 9:21 UTC (permalink / raw)
To: Alexander Stein
Cc: linux-arm-kernel, kvmarm, James Morse, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Huang Shijie, Leo Yan, Mark Rutland,
Will Deacon
On Wed, 23 Aug 2023 08:15:38 +0100,
Alexander Stein <alexander.stein@ew.tq-group.com> wrote:
>
> Hi,
>
> > diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> > index 08b3a1bf0ef6..6a3d8176f54a 100644
> > --- a/drivers/perf/arm_pmuv3.c
> > +++ b/drivers/perf/arm_pmuv3.c
> > @@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
> >
> > /* Enable all counters */
> > armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> > +
> > + kvm_vcpu_pmu_resync_el0();
>
> This breaks if CONFIG_HW_PERF_EVENTS is enabled but CONFIG_KVM is not. As in
> this case arch/arm64/kvm/pmu.c is never compiled, but the dummy inline in
> include/kvm/arm_pmu.h for kvm_vcpu_pmu_resync_el0() only depends on
> CONFIG_HW_PERF_EVENTS.
> This results in this error:
>
> aarch64-v8a-linux-gnu-ld: Unexpected GOT/PLT entries detected!
> aarch64-v8a-linux-gnu-ld: Unexpected run-time procedure linkages detected!
> aarch64-v8a-linux-gnu-ld: drivers/perf/arm_pmuv3.o: in function
> `armv8pmu_start':
> drivers/perf/arm_pmuv3.c:753: undefined reference to `kvm_vcpu_pmu_resync_el0'
Huh, that's unexpected. Let's fix it with an adequately sized hammer.
Does the patch below work for you?
Thanks,
M.
From 2446f15d355ab7e766291eccf5b670a2060b4c3a Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@kernel.org>
Date: Wed, 23 Aug 2023 10:14:48 +0100
Subject: [PATCH] KVM: arm64: pmu: Guard PMU emulation definitions with
CONFIG_KVM
Most of the internal definitions for PMU emulation are guarded with
CONFIG_HW_PERF_EVENTS. However, this isn't enough, and leads to
these definitions leaking if CONFIG_KVM isn't enabled.
This leads to some compilation breakage in this exact configuration.
Fix it by falling back to the dummy stubs if either perf or KVM
isn't selected.
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
include/kvm/arm_pmu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 3a8a70a60794..31029f4f7be8 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -12,7 +12,7 @@
#define ARMV8_PMU_CYCLE_IDX (ARMV8_PMU_MAX_COUNTERS - 1)
-#ifdef CONFIG_HW_PERF_EVENTS
+#if IS_ENABLED(CONFIG_HW_PERF_EVENTS) && IS_ENABLED(CONFIG_KVM)
struct kvm_pmc {
u8 idx; /* index into the pmu->pmc array */
--
2.34.1
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation
2023-08-23 9:21 ` Marc Zyngier
@ 2023-08-23 11:15 ` Alexander Stein
0 siblings, 0 replies; 9+ messages in thread
From: Alexander Stein @ 2023-08-23 11:15 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-arm-kernel, kvmarm, James Morse, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Huang Shijie, Leo Yan, Mark Rutland,
Will Deacon
Hi Marc,
Am Mittwoch, 23. August 2023, 11:21:45 CEST schrieb Marc Zyngier:
> On Wed, 23 Aug 2023 08:15:38 +0100,
>
> Alexander Stein <alexander.stein@ew.tq-group.com> wrote:
> > Hi,
> >
> > > diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> > > index 08b3a1bf0ef6..6a3d8176f54a 100644
> > > --- a/drivers/perf/arm_pmuv3.c
> > > +++ b/drivers/perf/arm_pmuv3.c
> > > @@ -772,6 +772,8 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
> > >
> > > /* Enable all counters */
> > > armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> > >
> > > +
> > > + kvm_vcpu_pmu_resync_el0();
> >
> > This breaks if CONFIG_HW_PERF_EVENTS is enabled but CONFIG_KVM is not. As
> > in this case arch/arm64/kvm/pmu.c is never compiled, but the dummy inline
> > in include/kvm/arm_pmu.h for kvm_vcpu_pmu_resync_el0() only depends on
> > CONFIG_HW_PERF_EVENTS.
> > This results in this error:
> >
> > aarch64-v8a-linux-gnu-ld: Unexpected GOT/PLT entries detected!
> > aarch64-v8a-linux-gnu-ld: Unexpected run-time procedure linkages detected!
> > aarch64-v8a-linux-gnu-ld: drivers/perf/arm_pmuv3.o: in function
> > `armv8pmu_start':
> > drivers/perf/arm_pmuv3.c:753: undefined reference to
> > `kvm_vcpu_pmu_resync_el0'
> Huh, that's unexpected. Let's fix it with an adequately sized hammer.
> Does the patch below work for you?
Yes, the patch below does work for me. Thanks.
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Best regards,
Alexander
> Thanks,
>
> M.
>
> From 2446f15d355ab7e766291eccf5b670a2060b4c3a Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz@kernel.org>
> Date: Wed, 23 Aug 2023 10:14:48 +0100
> Subject: [PATCH] KVM: arm64: pmu: Guard PMU emulation definitions with
> CONFIG_KVM
>
> Most of the internal definitions for PMU emulation are guarded with
> CONFIG_HW_PERF_EVENTS. However, this isn't enough, and leads to
> these definitions leaking if CONFIG_KVM isn't enabled.
>
> This leads to some compilation breakage in this exact configuration.
> Fix it by falling back to the dummy stubs if either perf or KVM
> isn't selected.
>
> Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> include/kvm/arm_pmu.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 3a8a70a60794..31029f4f7be8 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -12,7 +12,7 @@
>
> #define ARMV8_PMU_CYCLE_IDX (ARMV8_PMU_MAX_COUNTERS - 1)
>
> -#ifdef CONFIG_HW_PERF_EVENTS
> +#if IS_ENABLED(CONFIG_HW_PERF_EVENTS) && IS_ENABLED(CONFIG_KVM)
>
> struct kvm_pmc {
> u8 idx; /* index into the pmu->pmc array */
--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-08-23 11:16 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-20 9:01 [PATCH v2] KVM: arm64: pmu: Resync EL0 state on counter rotation Marc Zyngier
2023-08-22 6:48 ` Mark Rutland
2023-08-22 9:59 ` Will Deacon
2023-08-22 12:39 ` Marc Zyngier
2023-08-22 13:45 ` Leo Yan
2023-08-22 13:49 ` Leo Yan
2023-08-23 7:15 ` Alexander Stein
2023-08-23 9:21 ` Marc Zyngier
2023-08-23 11:15 ` Alexander Stein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).