* [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
@ 2023-04-08 3:47 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 Reiji Watanabe
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-08 3:47 UTC (permalink / raw)
To: Marc Zyngier, Mark Rutland, Oliver Upton, Will Deacon,
Catalin Marinas, kvmarm
Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring, Reiji Watanabe
This series will fix bugs in KVM's handling of PMUSERENR_EL0.
With PMU access support from EL0 [1], the perf subsystem sets
the CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 direct
access to PMU counters. However, KVM appears to assume
that the register value is always zero for the host EL0, and has
the following two problems in handling the register.
[A] The host EL0 might lose direct access to PMU counters, as
KVM always clears PMUSERENR_EL0 before returning to userspace.
[B] With VHE, the guest EL0 access to PMU counters might be trapped
to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
indicates that the guest EL0 has access to the counters).
This is because, with VHE, KVM sets the ER, CR, SW and EN bits of
PMUSERENR_EL0 to 1 on vcpu_load() to ensure that PMU accesses
from the guest EL0 trap to EL2, but those bits might be cleared by
the perf subsystem after vcpu_load() (when PMU counters are
programmed for the vPMU emulation).
Patch-1 will fix [A], and Patch-2 will fix [B].
The series is based on v6.3-rc5.
v2:
- Save the PMUSERENR_EL0 for the host in the sysreg array of
kvm_host_data. [Marc]
- Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
is loaded, instead have KVM update the saved shadow register
value for the host. [Marc, Mark]
v1: https://lore.kernel.org/all/20230329002136.2463442-1-reijiw@google.com/
[1] https://github.com/torvalds/linux/commit/83a7a4d643d33a8b74a42229346b7ed7139fcef9
Reiji Watanabe (2):
KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
arch/arm64/include/asm/kvm_host.h | 5 +++++
arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++++---
arch/arm64/kvm/hyp/include/hyp/switch.h | 13 +++++++++++--
arch/arm64/kvm/pmu.c | 20 ++++++++++++++++++++
4 files changed, 54 insertions(+), 5 deletions(-)
base-commit: 7e364e56293bb98cae1b55fd835f5991c4e96e7d
--
2.40.0.577.gac1e443424-goog
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v2 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
2023-04-08 3:47 [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Reiji Watanabe
@ 2023-04-08 3:47 ` Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Reiji Watanabe
2023-04-08 9:04 ` [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Marc Zyngier
2 siblings, 0 replies; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-08 3:47 UTC (permalink / raw)
To: Marc Zyngier, Mark Rutland, Oliver Upton, Will Deacon,
Catalin Marinas, kvmarm
Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring, Reiji Watanabe
Restore the host's PMUSERENR_EL0 value, instead of clearing it,
before returning to userspace, as the host EL0 might have
direct access to PMU registers (some bits of PMUSERENR_EL0
might not be zero for the host EL0).
Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
arch/arm64/kvm/hyp/include/hyp/switch.h | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 07d37ff88a3f..6718731729fd 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -81,7 +81,12 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
* EL1 instead of being trapped to EL2.
*/
if (kvm_arm_support_pmu_v3()) {
+ struct kvm_cpu_context *hctxt;
+
write_sysreg(0, pmselr_el0);
+
+ hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+ ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
}
@@ -105,8 +110,12 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
write_sysreg(vcpu->arch.mdcr_el2_host, mdcr_el2);
write_sysreg(0, hstr_el2);
- if (kvm_arm_support_pmu_v3())
- write_sysreg(0, pmuserenr_el0);
+ if (kvm_arm_support_pmu_v3()) {
+ struct kvm_cpu_context *hctxt;
+
+ hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+ write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
+ }
if (cpus_have_final_cap(ARM64_SME)) {
sysreg_clear_set_s(SYS_HFGRTR_EL2, 0,
--
2.40.0.577.gac1e443424-goog
* [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-08 3:47 [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 Reiji Watanabe
@ 2023-04-08 3:47 ` Reiji Watanabe
2023-04-11 9:33 ` Mark Rutland
2023-04-08 9:04 ` [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Marc Zyngier
2 siblings, 1 reply; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-08 3:47 UTC (permalink / raw)
To: Marc Zyngier, Mark Rutland, Oliver Upton, Will Deacon,
Catalin Marinas, kvmarm
Cc: kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring, Reiji Watanabe
Currently, with VHE, KVM sets ER, CR, SW and EN bits of
PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores
the register value for the host on vcpu_load() and vcpu_put().
If those bits are cleared on a pCPU with a vCPU
loaded (armv8pmu_start() would do that when PMU counters are
programmed for the guest), PMU access from the guest EL0 might
be trapped to the guest EL1 directly, regardless of the current
PMUSERENR_EL0 value of the vCPU.
Fix this by not letting armv8pmu_start() overwrite PMUSERENR on
the pCPU on which a vCPU is loaded, and instead updating the
saved shadow register value for the host, so that the value can
be restored on vcpu_put() later.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Marc Zyngier <maz@kernel.org>
Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
arch/arm64/include/asm/kvm_host.h | 5 +++++
arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++++---
arch/arm64/kvm/pmu.c | 20 ++++++++++++++++++++
3 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index bcd774d74f34..22db2f885c17 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1028,9 +1028,14 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
#ifdef CONFIG_KVM
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u32 clr);
+bool kvm_set_pmuserenr(u64 val);
#else
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
+static inline bool kvm_set_pmuserenr(u64 val)
+{
+ return false;
+}
#endif
void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index dde06c0f97f3..0fffe4c56c28 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -741,9 +741,25 @@ static inline u32 armv8pmu_getreset_flags(void)
return value;
}
+static void update_pmuserenr(u64 val)
+{
+ lockdep_assert_irqs_disabled();
+
+ /*
+ * The current pmuserenr value might be the value for the guest.
+ * If that's the case, have KVM keep track of the register value
+ * for the host EL0 so that KVM can restore it before returning to
+ * the host EL0. Otherwise, update the register now.
+ */
+ if (kvm_set_pmuserenr(val))
+ return;
+
+ write_sysreg(val, pmuserenr_el0);
+}
+
static void armv8pmu_disable_user_access(void)
{
- write_sysreg(0, pmuserenr_el0);
+ update_pmuserenr(0);
}
static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
@@ -759,8 +775,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
armv8pmu_write_evcntr(i, 0);
}
- write_sysreg(0, pmuserenr_el0);
- write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
+ update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
}
static void armv8pmu_enable_event(struct perf_event *event)
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 7887133d15f0..40bb2cb13317 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -209,3 +209,23 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
kvm_vcpu_pmu_enable_el0(events_host);
kvm_vcpu_pmu_disable_el0(events_guest);
}
+
+/*
+ * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on
+ * the pCPU where the vCPU is loaded, since PMUSERENR_EL0 is switched to
+ * the value for the guest on vcpu_load(). The value for the host EL0
+ * will be restored on vcpu_put(), before returning to the host EL0.
+ *
+ * Return true if KVM takes care of the register. Otherwise return false.
+ */
+bool kvm_set_pmuserenr(u64 val)
+{
+ struct kvm_cpu_context *hctxt;
+
+ if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
+ return false;
+
+ hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+ ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
+ return true;
+}
--
2.40.0.577.gac1e443424-goog
* Re: [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
2023-04-08 3:47 [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Reiji Watanabe
@ 2023-04-08 9:04 ` Marc Zyngier
2023-04-11 11:24 ` Will Deacon
2 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2023-04-08 9:04 UTC (permalink / raw)
To: Reiji Watanabe, Mark Rutland, Will Deacon
Cc: Oliver Upton, Catalin Marinas, kvmarm, kvm, linux-arm-kernel,
James Morse, Alexandru Elisei, Zenghui Yu, Suzuki K Poulose,
Paolo Bonzini, Ricardo Koller, Jing Zhang, Raghavendra Rao Anata,
Shaoqin Huang, Rob Herring
On Sat, 08 Apr 2023 04:47:57 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
>
> This series will fix bugs in KVM's handling of PMUSERENR_EL0.
>
> With PMU access support from EL0 [1], the perf subsystem would
> set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> a direct access to PMU counters. However, KVM appears to assume
> that the register value is always zero for the host EL0, and has
> the following two problems in handling the register.
>
> [A] The host EL0 might lose the direct access to PMU counters, as
> KVM always clears PMUSERENR_EL0 before returning to userspace.
>
> [B] With VHE, the guest EL0 access to PMU counters might be trapped
> to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
> indicates that the guest EL0 has an access to the counters).
> This is because, with VHE, KVM sets ER, CR, SW and EN bits of
> PMUSERENR_EL0 to 1 on vcpu_load() to ensure to trap PMU access
> from the guest EL0 to EL2, but those bits might be cleared by
> the perf subsystem after vcpu_load() (when PMU counters are
> programmed for the vPMU emulation).
>
> Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> The series is based on v6.3-rc5.
>
> v2:
> - Save the PMUSERENR_EL0 for the host in the sysreg array of
> kvm_host_data. [Marc]
> - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
> is loaded, instead have KVM update the saved shadow register
> value for the host. [Marc, Mark]
This looks much better to me. If Mark is OK with it, I'm happy to take
it in 6.4.
Speaking of which, this will clash with the queued move of the PMUv3
code into drivers/perf, and probably break on 32bit. I can either take
a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
the PMUv3 driver on 32bit ARM")), or wait until -rc1.
Will, what do you prefer?
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-08 3:47 ` [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Reiji Watanabe
@ 2023-04-11 9:33 ` Mark Rutland
2023-04-12 5:14 ` Reiji Watanabe
0 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2023-04-11 9:33 UTC (permalink / raw)
To: Reiji Watanabe
Cc: Marc Zyngier, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Fri, Apr 07, 2023 at 08:47:59PM -0700, Reiji Watanabe wrote:
> Currently, with VHE, KVM sets ER, CR, SW and EN bits of
> PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores
> the register value for the host on vcpu_load() and vcpu_put().
> If the value of those bits are cleared on a pCPU with a vCPU
> loaded (armv8pmu_start() would do that when PMU counters are
> programmed for the guest), PMU access from the guest EL0 might
> be trapped to the guest EL1 directly regardless of the current
> PMUSERENR_EL0 value of the vCPU.
>
> Fix this by not letting armv8pmu_start() overwrite PMUSERENR on
> the pCPU on which a vCPU is loaded, and instead updating the
> saved shadow register value for the host, so that the value can
> be restored on vcpu_put() later.
I'm happy with the hook in the PMU code, but I think there's still a race
between an IPI and vcpu_{load,put}() where we can lose an update to
PMUSERERNR_EL0. I tried to point that out in my final question in:
https://lore.kernel.org/all/ZCwzV7ACl21VbLru@FVFF77S0Q05N.cambridge.arm.com/
... but it looks like that wasn't all that clear.
Consider vcpu_load():
void vcpu_load(struct kvm_vcpu *vcpu)
{
int cpu = get_cpu();
__this_cpu_write(kvm_running_vcpu, vcpu);
preempt_notifier_register(&vcpu->preempt_notifier);
kvm_arch_vcpu_load(vcpu, cpu);
put_cpu();
}
AFAICT that's called with IRQs enabled, and the {get,put}_cpu() calls will only
disable migration/preemption. After the write to kvm_running_vcpu, the code in
kvm_set_pmuserenr() will see that there is a running vcpu, and write to the
host context without updating the real PMUSERENR_EL0 register.
If we take an IPI and call kvm_set_pmuserenr() after the write to
kvm_running_vcpu but before vcpu_load() completes, the call to
kvm_set_pmuserenr() could update the host context (without updating the real
PMUSERENR_EL0 value) before __activate_traps_common() saves the host value
with:
ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
... which would discard the write made by kvm_set_pmuserenr().
Something similar can happen in vcpu_put(), where an IPI after __deactivate_traps_common()
but before kvm_running_vcpu is cleared would result in kvm_set_pmuserenr()
writing to the host context, but this value would never be written into HW.
Unless I'm missing something (e.g. if interrupts are actually masked during
those windows), I don't think this is a complete fix as-is.
I'm not sure if there is a smart fix for that.
Thanks,
Mark.
> Suggested-by: Mark Rutland <mark.rutland@arm.com>
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> ---
> arch/arm64/include/asm/kvm_host.h | 5 +++++
> arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++++---
> arch/arm64/kvm/pmu.c | 20 ++++++++++++++++++++
> 3 files changed, 43 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index bcd774d74f34..22db2f885c17 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -1028,9 +1028,14 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
> #ifdef CONFIG_KVM
> void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> void kvm_clr_pmu_events(u32 clr);
> +bool kvm_set_pmuserenr(u64 val);
> #else
> static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
> static inline void kvm_clr_pmu_events(u32 clr) {}
> +static inline bool kvm_set_pmuserenr(u64 val)
> +{
> + return false;
> +}
> #endif
>
> void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index dde06c0f97f3..0fffe4c56c28 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -741,9 +741,25 @@ static inline u32 armv8pmu_getreset_flags(void)
> return value;
> }
>
> +static void update_pmuserenr(u64 val)
> +{
> + lockdep_assert_irqs_disabled();
> +
> + /*
> + * The current pmuserenr value might be the value for the guest.
> + * If that's the case, have KVM keep track of the register value
> + * for the host EL0 so that KVM can restore it before returning to
> + * the host EL0. Otherwise, update the register now.
> + */
> + if (kvm_set_pmuserenr(val))
> + return;
> +
> + write_sysreg(val, pmuserenr_el0);
> +}
> +
> static void armv8pmu_disable_user_access(void)
> {
> - write_sysreg(0, pmuserenr_el0);
> + update_pmuserenr(0);
> }
>
> static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> @@ -759,8 +775,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> armv8pmu_write_evcntr(i, 0);
> }
>
> - write_sysreg(0, pmuserenr_el0);
> - write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
> + update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
> }
>
> static void armv8pmu_enable_event(struct perf_event *event)
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index 7887133d15f0..40bb2cb13317 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -209,3 +209,23 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
> kvm_vcpu_pmu_enable_el0(events_host);
> kvm_vcpu_pmu_disable_el0(events_guest);
> }
> +
> +/*
> + * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on
> + * the pCPU where vCPU is loaded, since PMUSERENR_EL0 is switched to
> + * the value for the guest on vcpu_load(). The value for the host EL0
> + * will be restored on vcpu_put(), before returning to the EL0.
> + *
> + * Return true if KVM takes care of the register. Otherwise return false.
> + */
> +bool kvm_set_pmuserenr(u64 val)
> +{
> + struct kvm_cpu_context *hctxt;
> +
> + if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
> + return false;
> +
> + hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> + ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> + return true;
> +}
> --
> 2.40.0.577.gac1e443424-goog
>
>
* Re: [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
2023-04-08 9:04 ` [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Marc Zyngier
@ 2023-04-11 11:24 ` Will Deacon
2023-04-12 10:29 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2023-04-11 11:24 UTC (permalink / raw)
To: Marc Zyngier
Cc: Reiji Watanabe, Mark Rutland, Oliver Upton, Catalin Marinas,
kvmarm, kvm, linux-arm-kernel, James Morse, Alexandru Elisei,
Zenghui Yu, Suzuki K Poulose, Paolo Bonzini, Ricardo Koller,
Jing Zhang, Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Sat, Apr 08, 2023 at 10:04:19AM +0100, Marc Zyngier wrote:
> On Sat, 08 Apr 2023 04:47:57 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> >
> > This series will fix bugs in KVM's handling of PMUSERENR_EL0.
> >
> > With PMU access support from EL0 [1], the perf subsystem would
> > set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> > a direct access to PMU counters. However, KVM appears to assume
> > that the register value is always zero for the host EL0, and has
> > the following two problems in handling the register.
> >
> > [A] The host EL0 might lose the direct access to PMU counters, as
> > KVM always clears PMUSERENR_EL0 before returning to userspace.
> >
> > [B] With VHE, the guest EL0 access to PMU counters might be trapped
> > to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
> > indicates that the guest EL0 has an access to the counters).
> > This is because, with VHE, KVM sets ER, CR, SW and EN bits of
> > PMUSERENR_EL0 to 1 on vcpu_load() to ensure to trap PMU access
> > from the guest EL0 to EL2, but those bits might be cleared by
> > the perf subsystem after vcpu_load() (when PMU counters are
> > programmed for the vPMU emulation).
> >
> > Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> > The series is based on v6.3-rc5.
> >
> > v2:
> > - Save the PMUSERENR_EL0 for the host in the sysreg array of
> > kvm_host_data. [Marc]
> > - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
> > is loaded, instead have KVM update the saved shadow register
> > value for the host. [Marc, Mark]
>
> This looks much better to me. If Mark is OK with it, I'm happy to take
> it in 6.4.
>
> Speaking of which, this will clash with the queued move of the PMUv3
> code into drivers/perf, and probably break on 32bit. I can either take
> a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
> the PMUv3 driver on 32bit ARM")), or wait until -rc1.
>
> Will, what do you prefer?
I'd be inclined to wait until -rc1, but for-next/perf is stable if you
decide to take it anyway.
Will
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-11 9:33 ` Mark Rutland
@ 2023-04-12 5:14 ` Reiji Watanabe
2023-04-12 9:20 ` Mark Rutland
0 siblings, 1 reply; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-12 5:14 UTC (permalink / raw)
To: Mark Rutland
Cc: Marc Zyngier, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Tue, Apr 11, 2023 at 10:33:50AM +0100, Mark Rutland wrote:
> On Fri, Apr 07, 2023 at 08:47:59PM -0700, Reiji Watanabe wrote:
> > Currently, with VHE, KVM sets ER, CR, SW and EN bits of
> > PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores
> > the register value for the host on vcpu_load() and vcpu_put().
> > If the value of those bits are cleared on a pCPU with a vCPU
> > loaded (armv8pmu_start() would do that when PMU counters are
> > programmed for the guest), PMU access from the guest EL0 might
> > be trapped to the guest EL1 directly regardless of the current
> > PMUSERENR_EL0 value of the vCPU.
> >
> > Fix this by not letting armv8pmu_start() overwrite PMUSERENR on
> > the pCPU on which a vCPU is loaded, and instead updating the
> > saved shadow register value for the host, so that the value can
> > be restored on vcpu_put() later.
>
> I'm happy with the hook in the PMU code, but I think there's still a race
> between an IPI and vcpu_{load,put}() where we can lose an update to
> PMUSERERNR_EL0. I tried to point that out in my final question in:
>
> https://lore.kernel.org/all/ZCwzV7ACl21VbLru@FVFF77S0Q05N.cambridge.arm.com/
>
> ... but it looks like that wasn't all that clear.
>
> Consider vcpu_load():
>
> void vcpu_load(struct kvm_vcpu *vcpu)
> {
> int cpu = get_cpu();
>
> __this_cpu_write(kvm_running_vcpu, vcpu);
> preempt_notifier_register(&vcpu->preempt_notifier);
> kvm_arch_vcpu_load(vcpu, cpu);
> put_cpu();
> }
>
> AFAICT that's called with IRQs enabled, and the {get,put}_cpu() calls will only
> disable migration/preemption. After the write to kvm_running_vcpu, the code in
> kvm_set_pmuserenr() will see that there is a running vcpu, and write to the
> host context without updating the real PMUSERENR_EL0 register.
>
> If we take an IPI and call kvm_set_pmuserenr() after the write to
> kvm_running_vcpu but before vcpu_load() completes, the call to
> kvm_set_pmuserenr() could update the host context (without updating the real
> PMUSERENR_EL0 value) before __activate_traps_common() saves the host value
> with:
>
> ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
>
> ... which would discard the write made by kvm_set_pmuserenr().
>
> Similar can happen in vcpu_put() where an IPI after __deactivate_traps_common()
> but before kvm_running_vcpu is cleared would result in kvm_set_pmuserenr()
> writing to the host context, but this value would never be written into HW.
>
> Unless I'm missing something (e.g. if interrupts are actually masked during
> those windows), I don't think this is a complete fix as-is.
>
> I'm not sure if there is a smart fix for that.
Thank you for the comment.
Uh, right, interrupts are not masked during those windows...
What I am currently considering is disabling
IRQs while manipulating the register, introducing a new flag
to indicate whether the PMUSERENR value for the guest EL0 is loaded,
and having kvm_set_pmuserenr() check the new flag.
The code would be something like below (the local_irq_save()/local_irq_restore()
calls need to be excluded for nVHE though).
What do you think ?
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -668,6 +668,8 @@ struct kvm_vcpu_arch {
/* Software step state is Active-pending */
#define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
+/* PMUSERENR for the guest EL0 is on physical CPU */
+#define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) + \
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 6718731729fd..57e4f480874a 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -82,12 +82,19 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
*/
if (kvm_arm_support_pmu_v3()) {
struct kvm_cpu_context *hctxt;
+ unsigned long flags;
write_sysreg(0, pmselr_el0);
hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+
+ local_irq_save(flags);
+
ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+ vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
+
+ local_irq_restore(flags);
}
vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
@@ -112,9 +119,16 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
write_sysreg(0, hstr_el2);
if (kvm_arm_support_pmu_v3()) {
struct kvm_cpu_context *hctxt;
+ unsigned long flags;
hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+
+ local_irq_save(flags);
+
write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
+ vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
+
+ local_irq_restore(flags);
}
if (cpus_have_final_cap(ARM64_SME)) {
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 40bb2cb13317..33cd8e1ecbd6 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -221,8 +221,13 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
bool kvm_set_pmuserenr(u64 val)
{
struct kvm_cpu_context *hctxt;
+ struct kvm_vcpu *vcpu;
- if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
+ if (!kvm_arm_support_pmu_v3() || !has_vhe())
+ return false;
+
+ vcpu = kvm_get_running_vcpu();
+ if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU))
return false;
hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
--
Thank you,
Reiji
>
> Thanks,
> Mark.
>
> > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > ---
> > arch/arm64/include/asm/kvm_host.h | 5 +++++
> > arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++++---
> > arch/arm64/kvm/pmu.c | 20 ++++++++++++++++++++
> > 3 files changed, 43 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index bcd774d74f34..22db2f885c17 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -1028,9 +1028,14 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
> > #ifdef CONFIG_KVM
> > void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > void kvm_clr_pmu_events(u32 clr);
> > +bool kvm_set_pmuserenr(u64 val);
> > #else
> > static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
> > static inline void kvm_clr_pmu_events(u32 clr) {}
> > +static inline bool kvm_set_pmuserenr(u64 val)
> > +{
> > + return false;
> > +}
> > #endif
> >
> > void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index dde06c0f97f3..0fffe4c56c28 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -741,9 +741,25 @@ static inline u32 armv8pmu_getreset_flags(void)
> > return value;
> > }
> >
> > +static void update_pmuserenr(u64 val)
> > +{
> > + lockdep_assert_irqs_disabled();
> > +
> > + /*
> > + * The current pmuserenr value might be the value for the guest.
> > + * If that's the case, have KVM keep track of the register value
> > + * for the host EL0 so that KVM can restore it before returning to
> > + * the host EL0. Otherwise, update the register now.
> > + */
> > + if (kvm_set_pmuserenr(val))
> > + return;
> > +
> > + write_sysreg(val, pmuserenr_el0);
> > +}
> > +
> > static void armv8pmu_disable_user_access(void)
> > {
> > - write_sysreg(0, pmuserenr_el0);
> > + update_pmuserenr(0);
> > }
> >
> > static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> > @@ -759,8 +775,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> > armv8pmu_write_evcntr(i, 0);
> > }
> >
> > - write_sysreg(0, pmuserenr_el0);
> > - write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
> > + update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
> > }
> >
> > static void armv8pmu_enable_event(struct perf_event *event)
> > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > index 7887133d15f0..40bb2cb13317 100644
> > --- a/arch/arm64/kvm/pmu.c
> > +++ b/arch/arm64/kvm/pmu.c
> > @@ -209,3 +209,23 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
> > kvm_vcpu_pmu_enable_el0(events_host);
> > kvm_vcpu_pmu_disable_el0(events_guest);
> > }
> > +
> > +/*
> > + * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on
> > + * the pCPU where the vCPU is loaded, since PMUSERENR_EL0 is switched to
> > + * the value for the guest on vcpu_load(). The value for the host EL0
> > + * will be restored on vcpu_put(), before returning to the host EL0.
> > + *
> > + * Return true if KVM takes care of the register. Otherwise return false.
> > + */
> > +bool kvm_set_pmuserenr(u64 val)
> > +{
> > + struct kvm_cpu_context *hctxt;
> > +
> > + if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
> > + return false;
> > +
> > + hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > + ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> > + return true;
> > +}
> > --
> > 2.40.0.577.gac1e443424-goog
> >
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-12 5:14 ` Reiji Watanabe
@ 2023-04-12 9:20 ` Mark Rutland
2023-04-12 10:22 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2023-04-12 9:20 UTC (permalink / raw)
To: Reiji Watanabe
Cc: Marc Zyngier, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Tue, Apr 11, 2023 at 10:14:10PM -0700, Reiji Watanabe wrote:
> Uh, right, interrupts are not masked during those windows...
>
> What I am currently considering is disabling
> IRQs while manipulating the register, introducing a new flag
> to indicate whether the PMUSERENR for the guest EL0 is loaded,
> and having kvm_set_pmuserenr() check the new flag.
>
> The code would be something like below (local_irq_save/local_irq_restore
> need to be excluded for NVHE, though).
>
> What do you think ?
I'm happy with that; it doesn't change the arm_pmu side of the interface and it
looks good from a functional perspective.
I'll have to leave it to Marc and Oliver to say whether they're happy with the
KVM side.
Thanks,
Mark.
>
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -668,6 +668,8 @@ struct kvm_vcpu_arch {
> /* Software step state is Active-pending */
> #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
>
> +/* PMUSERENR for the guest EL0 is on physical CPU */
> +#define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
>
> /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) + \
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 6718731729fd..57e4f480874a 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -82,12 +82,19 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
> */
> if (kvm_arm_support_pmu_v3()) {
> struct kvm_cpu_context *hctxt;
> + unsigned long flags;
>
> write_sysreg(0, pmselr_el0);
>
> hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +
> + local_irq_save(flags);
> +
> ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
> write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
> + vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
> +
> + local_irq_restore(flags);
> }
>
> vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
> @@ -112,9 +119,16 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
> write_sysreg(0, hstr_el2);
> if (kvm_arm_support_pmu_v3()) {
> struct kvm_cpu_context *hctxt;
> + unsigned long flags;
>
> hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +
> + local_irq_save(flags);
> +
> write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
> + vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
> +
> + local_irq_restore(flags);
> }
>
> if (cpus_have_final_cap(ARM64_SME)) {
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index 40bb2cb13317..33cd8e1ecbd6 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -221,8 +221,13 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
> bool kvm_set_pmuserenr(u64 val)
> {
> struct kvm_cpu_context *hctxt;
> + struct kvm_vcpu *vcpu;
>
> - if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
> + if (!kvm_arm_support_pmu_v3() || !has_vhe())
> + return false;
> +
> + vcpu = kvm_get_running_vcpu();
> + if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU))
> return false;
>
> hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> --
>
> Thank you,
> Reiji
>
>
> >
> > Thanks,
> > Mark.
> >
> > > Suggested-by: Mark Rutland <mark.rutland@arm.com>
> > > Suggested-by: Marc Zyngier <maz@kernel.org>
> > > Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
> > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > ---
> > > arch/arm64/include/asm/kvm_host.h | 5 +++++
> > > arch/arm64/kernel/perf_event.c | 21 ++++++++++++++++++---
> > > arch/arm64/kvm/pmu.c | 20 ++++++++++++++++++++
> > > 3 files changed, 43 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index bcd774d74f34..22db2f885c17 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -1028,9 +1028,14 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
> > > #ifdef CONFIG_KVM
> > > void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > > void kvm_clr_pmu_events(u32 clr);
> > > +bool kvm_set_pmuserenr(u64 val);
> > > #else
> > > static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
> > > static inline void kvm_clr_pmu_events(u32 clr) {}
> > > +static inline bool kvm_set_pmuserenr(u64 val)
> > > +{
> > > + return false;
> > > +}
> > > #endif
> > >
> > > void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
> > > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > > index dde06c0f97f3..0fffe4c56c28 100644
> > > --- a/arch/arm64/kernel/perf_event.c
> > > +++ b/arch/arm64/kernel/perf_event.c
> > > @@ -741,9 +741,25 @@ static inline u32 armv8pmu_getreset_flags(void)
> > > return value;
> > > }
> > >
> > > +static void update_pmuserenr(u64 val)
> > > +{
> > > + lockdep_assert_irqs_disabled();
> > > +
> > > + /*
> > > + * The current pmuserenr value might be the value for the guest.
> > > + * If that's the case, have KVM keep track of the register value
> > > + * for the host EL0 so that KVM can restore it before returning to
> > > + * the host EL0. Otherwise, update the register now.
> > > + */
> > > + if (kvm_set_pmuserenr(val))
> > > + return;
> > > +
> > > + write_sysreg(val, pmuserenr_el0);
> > > +}
> > > +
> > > static void armv8pmu_disable_user_access(void)
> > > {
> > > - write_sysreg(0, pmuserenr_el0);
> > > + update_pmuserenr(0);
> > > }
> > >
> > > static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> > > @@ -759,8 +775,7 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> > > armv8pmu_write_evcntr(i, 0);
> > > }
> > >
> > > - write_sysreg(0, pmuserenr_el0);
> > > - write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
> > > + update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
> > > }
> > >
> > > static void armv8pmu_enable_event(struct perf_event *event)
> > > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > > index 7887133d15f0..40bb2cb13317 100644
> > > --- a/arch/arm64/kvm/pmu.c
> > > +++ b/arch/arm64/kvm/pmu.c
> > > @@ -209,3 +209,23 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
> > > kvm_vcpu_pmu_enable_el0(events_host);
> > > kvm_vcpu_pmu_disable_el0(events_guest);
> > > }
> > > +
> > > +/*
> > > + * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on
> > > + * the pCPU where the vCPU is loaded, since PMUSERENR_EL0 is switched to
> > > + * the value for the guest on vcpu_load(). The value for the host EL0
> > > + * will be restored on vcpu_put(), before returning to the host EL0.
> > > + *
> > > + * Return true if KVM takes care of the register. Otherwise return false.
> > > + */
> > > +bool kvm_set_pmuserenr(u64 val)
> > > +{
> > > + struct kvm_cpu_context *hctxt;
> > > +
> > > + if (!kvm_arm_support_pmu_v3() || !has_vhe() || !kvm_get_running_vcpu())
> > > + return false;
> > > +
> > > + hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > > + ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> > > + return true;
> > > +}
> > > --
> > > 2.40.0.577.gac1e443424-goog
> > >
> > >
>
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-12 9:20 ` Mark Rutland
@ 2023-04-12 10:22 ` Marc Zyngier
2023-04-13 0:07 ` Reiji Watanabe
0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2023-04-12 10:22 UTC (permalink / raw)
To: Mark Rutland
Cc: Reiji Watanabe, Oliver Upton, Will Deacon, Catalin Marinas,
kvmarm, kvm, linux-arm-kernel, James Morse, Alexandru Elisei,
Zenghui Yu, Suzuki K Poulose, Paolo Bonzini, Ricardo Koller,
Jing Zhang, Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Wed, 12 Apr 2023 10:20:05 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Apr 11, 2023 at 10:14:10PM -0700, Reiji Watanabe wrote:
> > Uh, right, interrupts are not masked during those windows...
> >
> > What I am currently considering on this would be disabling
> > IRQs while manipulating the register, and introducing a new flag
> > to indicate whether the PMUSERENR for the guest EL0 is loaded,
> > and having kvm_set_pmuserenr() check the new flag.
> >
> > The code would be something like below (local_irq_save/local_irq_restore
> > needs to be excluded for NVHE though).
It shouldn't need to be excluded. It should be fairly harmless, unless
I'm missing something really obvious?
> >
> > What do you think ?
>
> I'm happy with that; it doesn't change the arm_pmu side of the interface and it
> looks good from a functional perspective.
>
> I'll have to leave it to Marc and Oliver to say whether they're happy with the
> KVM side.
This looks OK to me. My only ask would be to have a small comment in
the __{de}activate_traps_common() functions to say what this protects
against, because I'll page it out within minutes.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0
2023-04-11 11:24 ` Will Deacon
@ 2023-04-12 10:29 ` Marc Zyngier
0 siblings, 0 replies; 13+ messages in thread
From: Marc Zyngier @ 2023-04-12 10:29 UTC (permalink / raw)
To: Will Deacon
Cc: Reiji Watanabe, Mark Rutland, Oliver Upton, Catalin Marinas,
kvmarm, kvm, linux-arm-kernel, James Morse, Alexandru Elisei,
Zenghui Yu, Suzuki K Poulose, Paolo Bonzini, Ricardo Koller,
Jing Zhang, Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Tue, 11 Apr 2023 12:24:59 +0100,
Will Deacon <will@kernel.org> wrote:
>
> On Sat, Apr 08, 2023 at 10:04:19AM +0100, Marc Zyngier wrote:
> > On Sat, 08 Apr 2023 04:47:57 +0100,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > >
> > > This series will fix bugs in KVM's handling of PMUSERENR_EL0.
> > >
> > > With PMU access support from EL0 [1], the perf subsystem would
> > > set CR and ER bits of PMUSERENR_EL0 as needed to allow EL0 to have
> > > a direct access to PMU counters. However, KVM appears to assume
> > > that the register value is always zero for the host EL0, and has
> > > the following two problems in handling the register.
> > >
> > > [A] The host EL0 might lose the direct access to PMU counters, as
> > > KVM always clears PMUSERENR_EL0 before returning to userspace.
> > >
> > > [B] With VHE, the guest EL0 access to PMU counters might be trapped
> > > to EL1 instead of to EL2 (even when PMUSERENR_EL0 for the guest
> > > indicates that the guest EL0 has an access to the counters).
> > > This is because, with VHE, KVM sets ER, CR, SW and EN bits of
> > > PMUSERENR_EL0 to 1 on vcpu_load() to ensure that PMU accesses
> > > from the guest EL0 trap to EL2, but those bits might be cleared by
> > > the perf subsystem after vcpu_load() (when PMU counters are
> > > programmed for the vPMU emulation).
> > >
> > > Patch-1 will fix [A], and Patch-2 will fix [B] respectively.
> > > The series is based on v6.3-rc5.
> > >
> > > v2:
> > > - Save the PMUSERENR_EL0 for the host in the sysreg array of
> > > kvm_host_data. [Marc]
> > > - Don't let armv8pmu_start() overwrite PMUSERENR if the vCPU
> > > is loaded, instead have KVM update the saved shadow register
> > > value for the host. [Marc, Mark]
> >
> > This looks much better to me. If Mark is OK with it, I'm happy to take
> > it in 6.4.
> >
> > Speaking of which, this will clash with the queued move of the PMUv3
> > code into drivers/perf, and probably break on 32bit. I can either take
> > a branch shared with arm64 (009d6dc87a56 ("ARM: perf: Allow the use of
> > the PMUv3 driver on 32bit ARM")), or wait until -rc1.
> >
> > Will, what do you prefer?
>
> I'd be inclined to wait until -rc1, but for-next/perf is stable if you
> decide to take it anyway.
Given that Mark and Reiji are still working out some of the corner
cases, -rc1 feels like the right target.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-12 10:22 ` Marc Zyngier
@ 2023-04-13 0:07 ` Reiji Watanabe
2023-04-13 8:56 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-13 0:07 UTC (permalink / raw)
To: Marc Zyngier
Cc: Mark Rutland, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Wed, Apr 12, 2023 at 11:22:29AM +0100, Marc Zyngier wrote:
> On Wed, 12 Apr 2023 10:20:05 +0100,
> Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 10:14:10PM -0700, Reiji Watanabe wrote:
> > > Uh, right, interrupts are not masked during those windows...
> > >
> > > What I am currently considering on this would be disabling
> > > IRQs while manipulating the register, and introducing a new flag
> > > to indicate whether the PMUSERENR for the guest EL0 is loaded,
> > > and having kvm_set_pmuserenr() check the new flag.
> > >
> > > The code would be something like below (local_irq_save/local_irq_restore
> > > needs to be excluded for NVHE though).
>
> It shouldn't need to be excluded. It should be fairly harmless, unless
> I'm missing something really obvious?
The reason why I think local_irq_{save,restore} should be excluded
is that they use trace_hardirqs_{on,off}. (Since IRQs are already
masked here for NVHE, in practice they shouldn't be called with
the current KVM implementation, though.)
I'm looking at using "ifndef __KVM_NVHE_HYPERVISOR__" or other
ways to organize the code for this.
Since {__activate,__deactivate}_traps_common() are pretty lightweight
functions, I'm also considering disabling IRQs at their call sites
(i.e. activate_traps_vhe_load/deactivate_traps_vhe_put), instead of in
__{de}activate_traps_common() (Thanks for this suggestion, Oliver).
> > I'm happy with that; it doesn't change the arm_pmu side of the interface and it
> > looks good from a functional perspective.
> >
> > I'll have to leave it to Marc and Oliver to say whether they're happy with the
> > KVM side.
>
> This looks OK to me. My only ask would be to have a small comment in
> the __{de}activate_traps_common() functions to say what this protects
> against, because I'll page it out within minutes.
Thank you for the feedback!
Yes, I will add a comment for those.
Thank you,
Reiji
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-13 0:07 ` Reiji Watanabe
@ 2023-04-13 8:56 ` Marc Zyngier
2023-04-15 3:11 ` Reiji Watanabe
0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2023-04-13 8:56 UTC (permalink / raw)
To: Reiji Watanabe
Cc: Mark Rutland, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
On Thu, 13 Apr 2023 01:07:38 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
>
> On Wed, Apr 12, 2023 at 11:22:29AM +0100, Marc Zyngier wrote:
> > On Wed, 12 Apr 2023 10:20:05 +0100,
> > Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 10:14:10PM -0700, Reiji Watanabe wrote:
> > > > Uh, right, interrupts are not masked during those windows...
> > > >
> > > > What I am currently considering on this would be disabling
> > > > IRQs while manipulating the register, and introducing a new flag
> > > > to indicate whether the PMUSERENR for the guest EL0 is loaded,
> > > > and having kvm_set_pmuserenr() check the new flag.
> > > >
> > > > The code would be something like below (local_irq_save/local_irq_restore
> > > > needs to be excluded for NVHE though).
> >
> > It shouldn't need to be excluded. It should be fairly harmless, unless
> > I'm missing something really obvious?
>
> The reason why I think local_irq_{save,restore} should be excluded
> is that they use trace_hardirqs_{on,off}. (Since IRQs are already
> masked here for NVHE, in practice they shouldn't be called with
> the current KVM implementation, though.)
Gah. Indeed, we end up with a lot of unwanted crap, and absolutely no
way to locally override it.
> I'm looking at using "ifndef __KVM_NVHE_HYPERVISOR__" or other
> ways to organize the code for this.
I'd vote for something like the code below:
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 530347cdebe3..1796fadb26cc 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
# will explode instantly (Words of Marc Zyngier). So introduce a generic flag
# __DISABLE_TRACE_MMIO__ to disable MMIO tracing for nVHE KVM.
ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
-ccflags-y += -fno-stack-protector \
+ccflags-y += -fno-stack-protector -DNO_TRACE_IRQFLAGS \
-DDISABLE_BRANCH_PROFILING \
$(DISABLE_STACKLEAK_PLUGIN)
diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5ec0fa71399e..ab0ae58dd797 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -198,9 +198,10 @@ extern void warn_bogus_irq_restore(void);
/*
* The local_irq_*() APIs are equal to the raw_local_irq*()
- * if !TRACE_IRQFLAGS.
> + * if !TRACE_IRQFLAGS or if NO_TRACE_IRQFLAGS is locally
+ * set.
*/
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(NO_TRACE_IRQFLAGS)
#define local_irq_enable() \
do { \
> Since {__activate,__deactivate}_traps_common() are pretty lightweight
> functions, I'm also considering disabling IRQs in their call sites
> (i.e. activate_traps_vhe_load/deactivate_traps_vhe_put), instead of in
> __{de}activate_traps_common() (Thanks for this suggestion, Oliver).
That would work too.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
2023-04-13 8:56 ` Marc Zyngier
@ 2023-04-15 3:11 ` Reiji Watanabe
0 siblings, 0 replies; 13+ messages in thread
From: Reiji Watanabe @ 2023-04-15 3:11 UTC (permalink / raw)
To: Marc Zyngier
Cc: Mark Rutland, Oliver Upton, Will Deacon, Catalin Marinas, kvmarm,
kvm, linux-arm-kernel, James Morse, Alexandru Elisei, Zenghui Yu,
Suzuki K Poulose, Paolo Bonzini, Ricardo Koller, Jing Zhang,
Raghavendra Rao Anata, Shaoqin Huang, Rob Herring
> > > > > Uh, right, interrupts are not masked during those windows...
> > > > >
> > > > > What I am currently considering on this would be disabling
> > > > > IRQs while manipulating the register, and introducing a new flag
> > > > > to indicate whether the PMUSERENR for the guest EL0 is loaded,
> > > > > and having kvm_set_pmuserenr() check the new flag.
> > > > >
> > > > > The code would be something like below (local_irq_save/local_irq_restore
> > > > > needs to be excluded for NVHE though).
> > >
> > > It shouldn't need to be excluded. It should be fairly harmless, unless
> > > I'm missing something really obvious?
> >
> > The reason why I think local_irq_{save,restore} should be excluded
> > is that they use trace_hardirqs_{on,off}. (Since IRQs are already
> > masked here for NVHE, in practice they shouldn't be called with
> > the current KVM implementation, though.)
>
> Gah. Indeed, we end up with a lot of unwanted crap, and absolutely no
> way to locally override it.
>
> > I'm looking at using "ifndef __KVM_NVHE_HYPERVISOR__" or other
> > ways to organize the code for this.
>
> I'd vote for something like the code below:
Thank you for the suggestion.
Considering that we may have similar cases in the future,
I will implement it as you suggested in v3.
Thank you,
Reiji
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 530347cdebe3..1796fadb26cc 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -10,7 +10,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> # will explode instantly (Words of Marc Zyngier). So introduce a generic flag
> # __DISABLE_TRACE_MMIO__ to disable MMIO tracing for nVHE KVM.
> ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
> -ccflags-y += -fno-stack-protector \
> +ccflags-y += -fno-stack-protector -DNO_TRACE_IRQFLAGS \
> -DDISABLE_BRANCH_PROFILING \
> $(DISABLE_STACKLEAK_PLUGIN)
>
> diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
> index 5ec0fa71399e..ab0ae58dd797 100644
> --- a/include/linux/irqflags.h
> +++ b/include/linux/irqflags.h
> @@ -198,9 +198,10 @@ extern void warn_bogus_irq_restore(void);
>
> /*
> * The local_irq_*() APIs are equal to the raw_local_irq*()
> - * if !TRACE_IRQFLAGS.
> > + * if !TRACE_IRQFLAGS or if NO_TRACE_IRQFLAGS is locally
> + * set.
> */
> -#ifdef CONFIG_TRACE_IRQFLAGS
> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(NO_TRACE_IRQFLAGS)
>
> #define local_irq_enable() \
> do { \
>
>
> > Since {__activate,__deactivate}_traps_common() are pretty lightweight
> > functions, I'm also considering disabling IRQs in their call sites
> > (i.e. activate_traps_vhe_load/deactivate_traps_vhe_put), instead of in
> > __{de}activate_traps_common() (Thanks for this suggestion, Oliver).
>
> That would work too.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
end of thread, other threads:[~2023-04-15 3:12 UTC | newest]
Thread overview: 13+ messages
2023-04-08 3:47 [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 1/2] KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 Reiji Watanabe
2023-04-08 3:47 ` [PATCH v2 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Reiji Watanabe
2023-04-11 9:33 ` Mark Rutland
2023-04-12 5:14 ` Reiji Watanabe
2023-04-12 9:20 ` Mark Rutland
2023-04-12 10:22 ` Marc Zyngier
2023-04-13 0:07 ` Reiji Watanabe
2023-04-13 8:56 ` Marc Zyngier
2023-04-15 3:11 ` Reiji Watanabe
2023-04-08 9:04 ` [PATCH v2 0/2] KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 Marc Zyngier
2023-04-11 11:24 ` Will Deacon
2023-04-12 10:29 ` Marc Zyngier