* [PATCH v2 00/12] KVM: arm64: Add NV timer support
@ 2024-12-17 14:23 Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers Marc Zyngier
` (13 more replies)
0 siblings, 14 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Here's another version of the series initially posted at [1], which
implements support for timers in NV context.
From v1:
- Repainted EL0->EL1 when rambling about the timers
- Simplified access to EL1 counters from HYP context
- Update the status register when handled as an early trap
- Added some documentation about the default PPI numbers
The whole thing has been tested with 6.13-rc3 as part of the my NV
integration branch [2], and is functional enough to run an L3 guest
with kvmtool as the VMM and EDK2 as the firmware. YMMV.
[1] https://lore.kernel.org/r/20241202172134.384923-1-maz@kernel.org
[2] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-next
Marc Zyngier (12):
KVM: arm64: nv: Add handling of EL2-specific timer registers
KVM: arm64: nv: Sync nested timer state with FEAT_NV2
KVM: arm64: nv: Publish emulated timer interrupt state in the
in-memory state
KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers
KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in
use
KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor
context
KVM: arm64: Handle counter access early in non-HYP context
KVM: arm64: nv: Add trap routing for
CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT}
KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
KVM: arm64: nv: Sanitise CNTHCTL_EL2
KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity
KVM: arm64: nv: Document EL2 timer API
Documentation/virt/kvm/devices/vcpu.rst | 15 +-
arch/arm64/include/asm/cputype.h | 2 +
arch/arm64/include/asm/kvm_host.h | 2 +-
arch/arm64/include/asm/sysreg.h | 4 +
arch/arm64/kernel/cpu_errata.c | 8 ++
arch/arm64/kernel/image-vars.h | 3 +
arch/arm64/kvm/arch_timer.c | 179 +++++++++++++++++++++---
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/emulate-nested.c | 58 +++++++-
arch/arm64/kvm/hyp/include/hyp/switch.h | 39 ++++--
arch/arm64/kvm/hyp/nvhe/timer-sr.c | 16 ++-
arch/arm64/kvm/hyp/vhe/switch.c | 107 ++++++++++++++
arch/arm64/kvm/nested.c | 15 ++
arch/arm64/kvm/sys_regs.c | 146 ++++++++++++++++++-
arch/arm64/tools/cpucaps | 1 +
include/clocksource/arm_arch_timer.h | 6 +
include/kvm/arm_arch_timer.h | 23 +++
17 files changed, 580 insertions(+), 47 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-21 1:38 ` Oliver Upton
2024-12-17 14:23 ` [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Marc Zyngier
` (12 subsequent siblings)
13 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Add the required handling for EL2 and EL02 registers, as
well as EL1 registers used in the E2H context. This includes
handling the virtual timer accesses when CNTHCTL_EL2.EL1TVT
or CNTHCTL_EL2.EL1TVCT are set.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/include/asm/sysreg.h | 4 +
arch/arm64/kvm/sys_regs.c | 143 +++++++++++++++++++++++++++
include/clocksource/arm_arch_timer.h | 2 +
3 files changed, 149 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index b8303a83c0bff..2ed33737c7a99 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -477,6 +477,7 @@
#define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0)
#define SYS_CNTPCT_EL0 sys_reg(3, 3, 14, 0, 1)
+#define SYS_CNTVCT_EL0 sys_reg(3, 3, 14, 0, 2)
#define SYS_CNTPCTSS_EL0 sys_reg(3, 3, 14, 0, 5)
#define SYS_CNTVCTSS_EL0 sys_reg(3, 3, 14, 0, 6)
@@ -484,14 +485,17 @@
#define SYS_CNTP_CTL_EL0 sys_reg(3, 3, 14, 2, 1)
#define SYS_CNTP_CVAL_EL0 sys_reg(3, 3, 14, 2, 2)
+#define SYS_CNTV_TVAL_EL0 sys_reg(3, 3, 14, 3, 0)
#define SYS_CNTV_CTL_EL0 sys_reg(3, 3, 14, 3, 1)
#define SYS_CNTV_CVAL_EL0 sys_reg(3, 3, 14, 3, 2)
#define SYS_AARCH32_CNTP_TVAL sys_reg(0, 0, 14, 2, 0)
#define SYS_AARCH32_CNTP_CTL sys_reg(0, 0, 14, 2, 1)
#define SYS_AARCH32_CNTPCT sys_reg(0, 0, 0, 14, 0)
+#define SYS_AARCH32_CNTVCT sys_reg(0, 1, 0, 14, 0)
#define SYS_AARCH32_CNTP_CVAL sys_reg(0, 2, 0, 14, 0)
#define SYS_AARCH32_CNTPCTSS sys_reg(0, 8, 0, 14, 0)
+#define SYS_AARCH32_CNTVCTSS sys_reg(0, 9, 0, 14, 0)
#define __PMEV_op2(n) ((n) & 0x7)
#define __CNTR_CRm(n) (0x8 | (((n) >> 3) & 0x3))
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 83c6b4a07ef56..986e63d4f9faa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1412,26 +1412,146 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu,
switch (reg) {
case SYS_CNTP_TVAL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HPTIMER;
+ else
+ tmr = TIMER_PTIMER;
+ treg = TIMER_REG_TVAL;
+ break;
+
+ case SYS_CNTV_TVAL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HVTIMER;
+ else
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_TVAL;
+ break;
+
case SYS_AARCH32_CNTP_TVAL:
+ case SYS_CNTP_TVAL_EL02:
tmr = TIMER_PTIMER;
treg = TIMER_REG_TVAL;
break;
+
+ case SYS_CNTV_TVAL_EL02:
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_TVAL;
+ break;
+
+ case SYS_CNTHP_TVAL_EL2:
+ tmr = TIMER_HPTIMER;
+ treg = TIMER_REG_TVAL;
+ break;
+
+ case SYS_CNTHV_TVAL_EL2:
+ tmr = TIMER_HVTIMER;
+ treg = TIMER_REG_TVAL;
+ break;
+
case SYS_CNTP_CTL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HPTIMER;
+ else
+ tmr = TIMER_PTIMER;
+ treg = TIMER_REG_CTL;
+ break;
+
+ case SYS_CNTV_CTL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HVTIMER;
+ else
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CTL;
+ break;
+
case SYS_AARCH32_CNTP_CTL:
+ case SYS_CNTP_CTL_EL02:
tmr = TIMER_PTIMER;
treg = TIMER_REG_CTL;
break;
+
+ case SYS_CNTV_CTL_EL02:
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CTL;
+ break;
+
+ case SYS_CNTHP_CTL_EL2:
+ tmr = TIMER_HPTIMER;
+ treg = TIMER_REG_CTL;
+ break;
+
+ case SYS_CNTHV_CTL_EL2:
+ tmr = TIMER_HVTIMER;
+ treg = TIMER_REG_CTL;
+ break;
+
case SYS_CNTP_CVAL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HPTIMER;
+ else
+ tmr = TIMER_PTIMER;
+ treg = TIMER_REG_CVAL;
+ break;
+
+ case SYS_CNTV_CVAL_EL0:
+ if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu))
+ tmr = TIMER_HVTIMER;
+ else
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CVAL;
+ break;
+
case SYS_AARCH32_CNTP_CVAL:
+ case SYS_CNTP_CVAL_EL02:
tmr = TIMER_PTIMER;
treg = TIMER_REG_CVAL;
break;
+
+ case SYS_CNTV_CVAL_EL02:
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CVAL;
+ break;
+
+ case SYS_CNTHP_CVAL_EL2:
+ tmr = TIMER_HPTIMER;
+ treg = TIMER_REG_CVAL;
+ break;
+
+ case SYS_CNTHV_CVAL_EL2:
+ tmr = TIMER_HVTIMER;
+ treg = TIMER_REG_CVAL;
+ break;
+
case SYS_CNTPCT_EL0:
case SYS_CNTPCTSS_EL0:
+ if (is_hyp_ctxt(vcpu))
+ tmr = TIMER_HPTIMER;
+ else
+ tmr = TIMER_PTIMER;
+ treg = TIMER_REG_CNT;
+ break;
+
case SYS_AARCH32_CNTPCT:
+ case SYS_AARCH32_CNTPCTSS:
tmr = TIMER_PTIMER;
treg = TIMER_REG_CNT;
break;
+
+ case SYS_CNTVCT_EL0:
+ case SYS_CNTVCTSS_EL0:
+ if (is_hyp_ctxt(vcpu))
+ tmr = TIMER_HVTIMER;
+ else
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CNT;
+ break;
+
+ case SYS_AARCH32_CNTVCT:
+ case SYS_AARCH32_CNTVCTSS:
+ tmr = TIMER_VTIMER;
+ treg = TIMER_REG_CNT;
+ break;
+
default:
print_sys_reg_msg(p, "%s", "Unhandled trapped timer register");
return undef_access(vcpu, p, r);
@@ -2901,11 +3021,17 @@ static const struct sys_reg_desc sys_reg_descs[] = {
AMU_AMEVTYPER1_EL0(15),
{ SYS_DESC(SYS_CNTPCT_EL0), access_arch_timer },
+ { SYS_DESC(SYS_CNTVCT_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTPCTSS_EL0), access_arch_timer },
+ { SYS_DESC(SYS_CNTVCTSS_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTP_TVAL_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTP_CTL_EL0), access_arch_timer },
{ SYS_DESC(SYS_CNTP_CVAL_EL0), access_arch_timer },
+ { SYS_DESC(SYS_CNTV_TVAL_EL0), access_arch_timer },
+ { SYS_DESC(SYS_CNTV_CTL_EL0), access_arch_timer },
+ { SYS_DESC(SYS_CNTV_CVAL_EL0), access_arch_timer },
+
/* PMEVCNTRn_EL0 */
PMU_PMEVCNTR_EL0(0),
PMU_PMEVCNTR_EL0(1),
@@ -3057,9 +3183,24 @@ static const struct sys_reg_desc sys_reg_descs[] = {
EL2_REG_VNCR(CNTVOFF_EL2, reset_val, 0),
EL2_REG(CNTHCTL_EL2, access_rw, reset_val, 0),
+ { SYS_DESC(SYS_CNTHP_TVAL_EL2), access_arch_timer },
+ EL2_REG(CNTHP_CTL_EL2, access_arch_timer, reset_val, 0),
+ EL2_REG(CNTHP_CVAL_EL2, access_arch_timer, reset_val, 0),
+
+ { SYS_DESC(SYS_CNTHV_TVAL_EL2), access_arch_timer },
+ EL2_REG(CNTHV_CTL_EL2, access_arch_timer, reset_val, 0),
+ EL2_REG(CNTHV_CVAL_EL2, access_arch_timer, reset_val, 0),
{ SYS_DESC(SYS_CNTKCTL_EL12), access_cntkctl_el12 },
+ { SYS_DESC(SYS_CNTP_TVAL_EL02), access_arch_timer },
+ { SYS_DESC(SYS_CNTP_CTL_EL02), access_arch_timer },
+ { SYS_DESC(SYS_CNTP_CVAL_EL02), access_arch_timer },
+
+ { SYS_DESC(SYS_CNTV_TVAL_EL02), access_arch_timer },
+ { SYS_DESC(SYS_CNTV_CTL_EL02), access_arch_timer },
+ { SYS_DESC(SYS_CNTV_CVAL_EL02), access_arch_timer },
+
EL2_REG(SP_EL2, NULL, reset_unknown, 0),
};
@@ -3879,9 +4020,11 @@ static const struct sys_reg_desc cp15_64_regs[] = {
{ SYS_DESC(SYS_AARCH32_CNTPCT), access_arch_timer },
{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR1_EL1 },
{ Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */
+ { SYS_DESC(SYS_AARCH32_CNTVCT), access_arch_timer },
{ Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */
{ SYS_DESC(SYS_AARCH32_CNTP_CVAL), access_arch_timer },
{ SYS_DESC(SYS_AARCH32_CNTPCTSS), access_arch_timer },
+ { SYS_DESC(SYS_AARCH32_CNTVCTSS), access_arch_timer },
};
static bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n,
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index cbbc9a6dc5715..877dcbb2601ae 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -22,6 +22,8 @@
#define CNTHCTL_EVNTDIR (1 << 3)
#define CNTHCTL_EVNTI (0xF << 4)
#define CNTHCTL_ECV (1 << 12)
+#define CNTHCTL_EL1TVT (1 << 13)
+#define CNTHCTL_EL1TVCT (1 << 14)
enum arch_timer_reg {
ARCH_TIMER_REG_CTRL,
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2025-01-06 2:19 ` Wei-Lin Chang
2025-01-26 15:25 ` Volodymyr Babchuk
2024-12-17 14:23 ` [PATCH v2 03/12] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state Marc Zyngier
` (11 subsequent siblings)
13 siblings, 2 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Emulating the timers with FEAT_NV2 is a bit odd, as the timers
can be reconfigured behind our back without the hypervisor even
noticing. In the VHE case, that's an actual regression in the
architecture...
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
arch/arm64/kvm/arm.c | 3 +++
include/kvm/arm_arch_timer.h | 1 +
3 files changed, 48 insertions(+)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 1215df5904185..ee5f732fbbece 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
kvm_timer_blocking(vcpu);
}
+void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
+{
+ /*
+ * When NV2 is on, guest hypervisors have their EL1 timer register
+ * accesses redirected to the VNCR page. Any guest action taken on
+ * the timer is postponed until the next exit, leading to a very
+ * poor quality of emulation.
+ */
+ if (!is_hyp_ctxt(vcpu))
+ return;
+
+ if (!vcpu_el2_e2h_is_set(vcpu)) {
+ /*
+ * A non-VHE guest hypervisor doesn't have any direct access
+ * to its timers: the EL2 registers trap (and the HW is
+ * fully emulated), while the EL0 registers access memory
+ * despite the access being notionally direct. Boo.
+ *
+ * We update the hardware timer registers with the
+ * latest value written by the guest to the VNCR page
+ * and let the hardware take care of the rest.
+ */
+ write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
+ write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
+ write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
+ write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
+ } else {
+ /*
+ * For a VHE guest hypervisor, the EL2 state is directly
+ * stored in the host EL1 timers, while the emulated EL0
+ * state is stored in the VNCR page. The latter could have
+ * been updated behind our back, and we must reset the
+ * emulation of the timers.
+ */
+ struct timer_map map;
+ get_timer_map(vcpu, &map);
+
+ soft_timer_cancel(&map.emul_vtimer->hrtimer);
+ soft_timer_cancel(&map.emul_ptimer->hrtimer);
+ timer_emulate(map.emul_vtimer);
+ timer_emulate(map.emul_ptimer);
+ }
+}
+
/*
* With a userspace irqchip we have to check if the guest de-asserted the
* timer and if so, unmask the timer irq signal on the host interrupt
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a102c3aebdbc4..fa3089822f9f3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1228,6 +1228,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
kvm_timer_sync_user(vcpu);
+ if (vcpu_has_nv(vcpu))
+ kvm_timer_sync_nested(vcpu);
+
kvm_arch_vcpu_ctxsync_fp(vcpu);
/*
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index fd650a8789b91..6e3f6b7ff2b22 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -98,6 +98,7 @@ int __init kvm_timer_hyp_init(bool has_gic);
int kvm_timer_enable(struct kvm_vcpu *vcpu);
void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu);
void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
+void kvm_timer_sync_nested(struct kvm_vcpu *vcpu);
void kvm_timer_sync_user(struct kvm_vcpu *vcpu);
bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu);
void kvm_timer_update_run(struct kvm_vcpu *vcpu);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 03/12] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 04/12] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers Marc Zyngier
` (10 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
With FEAT_NV2, the EL0 timer state is entirely stored in memory,
meaning that the hypervisor can only provide a very poor emulation.
The only thing we can really do is to publish the interrupt state
in the guest view of CNT{P,V}_CTL_EL0, and defer everything else
to the next exit.
Only FEAT_ECV will allow us to fix it, at the cost of extra trapping.
Suggested-by: Chase Conklin <chase.conklin@arm.com>
Suggested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 21 +++++++++++++++++++++
arch/arm64/kvm/arm.c | 2 +-
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index ee5f732fbbece..8bff913ed1264 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -441,11 +441,30 @@ void kvm_timer_update_run(struct kvm_vcpu *vcpu)
regs->device_irq_level |= KVM_ARM_DEV_EL1_PTIMER;
}
+static void kvm_timer_update_status(struct arch_timer_context *ctx, bool level)
+{
+ /*
+ * Paper over NV2 brokenness by publishing the interrupt status
+ * bit. This still results in a poor quality of emulation (guest
+ * writes will have no effect until the next exit).
+ *
+ * But hey, it's fast, right?
+ */
+ if (is_hyp_ctxt(ctx->vcpu) &&
+ (ctx == vcpu_vtimer(ctx->vcpu) || ctx == vcpu_ptimer(ctx->vcpu))) {
+ unsigned long val = timer_get_ctl(ctx);
+ __assign_bit(__ffs(ARCH_TIMER_CTRL_IT_STAT), &val, level);
+ timer_set_ctl(ctx, val);
+ }
+}
+
static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
struct arch_timer_context *timer_ctx)
{
int ret;
+ kvm_timer_update_status(timer_ctx, new_level);
+
timer_ctx->irq.level = new_level;
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_irq(timer_ctx),
timer_ctx->irq.level);
@@ -471,6 +490,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
return;
}
+ kvm_timer_update_status(ctx, should_fire);
+
/*
* If the timer can fire now, we don't need to have a soft timer
* scheduled for the future. If the timer cannot fire at all,
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fa3089822f9f3..bda905022df40 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1228,7 +1228,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
kvm_timer_sync_user(vcpu);
- if (vcpu_has_nv(vcpu))
+ if (is_hyp_ctxt(vcpu))
kvm_timer_sync_nested(vcpu);
kvm_arch_vcpu_ctxsync_fp(vcpu);
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 04/12] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (2 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 03/12] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 05/12] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use Marc Zyngier
` (9 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Although FEAT_NV2 makes most things fast, it also makes it impossible
to correctly emulate the timers, as the sysreg accesses are redirected
to memory.
FEAT_ECV addresses this by giving a hypervisor the ability to trap
the EL02 sysregs as well as the virtual timer.
Add the required trap setting to make use of the feature, allowing
us to elide the ugly resync in the middle of the run loop.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 36 +++++++++++++++++++++++++---
include/clocksource/arm_arch_timer.h | 2 ++
2 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 8bff913ed1264..b6a06bda9e534 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -783,7 +783,7 @@ static void kvm_timer_vcpu_load_nested_switch(struct kvm_vcpu *vcpu,
static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
{
- bool tpt, tpc;
+ bool tvt, tpt, tvc, tpc, tvt02, tpt02;
u64 clr, set;
/*
@@ -798,7 +798,29 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
* within this function, reality kicks in and we start adding
* traps based on emulation requirements.
*/
- tpt = tpc = false;
+ tvt = tpt = tvc = tpc = false;
+ tvt02 = tpt02 = false;
+
+ /*
+ * NV2 badly breaks the timer semantics by redirecting accesses to
+ * the EL1 timer state to memory, so let's call ECV to the rescue if
+ * available: we trap all CNT{P,V}_{CTL,CVAL,TVAL}_EL0 accesses.
+ *
+ * The treatment slightly varies depending whether we run a nVHE or
+ * VHE guest: nVHE will use the _EL0 registers directly, while VHE
+ * will use the _EL02 accessors. This translates in different trap
+ * bits.
+ *
+ * None of the trapping is required when running in non-HYP context,
+ * unless required by the L1 hypervisor settings once we advertise
+ * ECV+NV in the guest, or that we need trapping for other reasons.
+ */
+ if (cpus_have_final_cap(ARM64_HAS_ECV) && is_hyp_ctxt(vcpu)) {
+ if (vcpu_el2_e2h_is_set(vcpu))
+ tvt02 = tpt02 = true;
+ else
+ tvt = tpt = true;
+ }
/*
* We have two possibility to deal with a physical offset:
@@ -838,6 +860,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr);
assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr);
+ assign_clear_set_bit(tvt, CNTHCTL_EL1TVT, clr, set);
+ assign_clear_set_bit(tvc, CNTHCTL_EL1TVCT, clr, set);
+ assign_clear_set_bit(tvt02, CNTHCTL_EL1NVVCT, clr, set);
+ assign_clear_set_bit(tpt02, CNTHCTL_EL1NVPCT, clr, set);
/* This only happens on VHE, so use the CNTHCTL_EL2 accessor. */
sysreg_clear_set(cnthctl_el2, clr, set);
@@ -933,8 +959,12 @@ void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
* accesses redirected to the VNCR page. Any guest action taken on
* the timer is postponed until the next exit, leading to a very
* poor quality of emulation.
+ *
+ * This is an unmitigated disaster, only papered over by FEAT_ECV,
+ * which allows trapping of the timer registers even with NV2.
+ * Still, this is still worse than FEAT_NV on its own. Meh.
*/
- if (!is_hyp_ctxt(vcpu))
+ if (cpus_have_final_cap(ARM64_HAS_ECV) || !is_hyp_ctxt(vcpu))
return;
if (!vcpu_el2_e2h_is_set(vcpu)) {
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 877dcbb2601ae..c62811fb41309 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -24,6 +24,8 @@
#define CNTHCTL_ECV (1 << 12)
#define CNTHCTL_EL1TVT (1 << 13)
#define CNTHCTL_EL1TVCT (1 << 14)
+#define CNTHCTL_EL1NVPCT (1 << 15)
+#define CNTHCTL_EL1NVVCT (1 << 16)
enum arch_timer_reg {
ARCH_TIMER_REG_CTRL,
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 05/12] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (3 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 04/12] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 06/12] KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context Marc Zyngier
` (8 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Although FEAT_ECV allows us to correctly emulate the timers, it also
reduces performances pretty badly.
Mitigate this by emulating the CTL/CVAL register reads in the
inner run loop, without returning to the general kernel.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 21 +-----
arch/arm64/kvm/hyp/include/hyp/switch.h | 5 ++
arch/arm64/kvm/hyp/vhe/switch.c | 99 +++++++++++++++++++++++++
include/kvm/arm_arch_timer.h | 15 ++++
4 files changed, 122 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index b6a06bda9e534..6f04f31c0a7f2 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -101,21 +101,6 @@ u64 timer_get_cval(struct arch_timer_context *ctxt)
}
}
-static u64 timer_get_offset(struct arch_timer_context *ctxt)
-{
- u64 offset = 0;
-
- if (!ctxt)
- return 0;
-
- if (ctxt->offset.vm_offset)
- offset += *ctxt->offset.vm_offset;
- if (ctxt->offset.vcpu_offset)
- offset += *ctxt->offset.vcpu_offset;
-
- return offset;
-}
-
static void timer_set_ctl(struct arch_timer_context *ctxt, u32 ctl)
{
struct kvm_vcpu *vcpu = ctxt->vcpu;
@@ -964,10 +949,10 @@ void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
* which allows trapping of the timer registers even with NV2.
* Still, this is still worse than FEAT_NV on its own. Meh.
*/
- if (cpus_have_final_cap(ARM64_HAS_ECV) || !is_hyp_ctxt(vcpu))
- return;
-
if (!vcpu_el2_e2h_is_set(vcpu)) {
+ if (cpus_have_final_cap(ARM64_HAS_ECV))
+ return;
+
/*
* A non-VHE guest hypervisor doesn't have any direct access
* to its timers: the EL2 registers trap (and the HW is
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 34f53707892df..30e572de28749 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -501,6 +501,11 @@ static inline bool handle_tx2_tvm(struct kvm_vcpu *vcpu)
return true;
}
+static inline u64 compute_counter_value(struct arch_timer_context *ctxt)
+{
+ return arch_timer_read_cntpct_el0() - timer_get_offset(ctxt);
+}
+
static bool kvm_hyp_handle_cntpct(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *ctxt;
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 80581b1c39959..51119d58ecff8 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -256,6 +256,102 @@ void kvm_vcpu_put_vhe(struct kvm_vcpu *vcpu)
host_data_ptr(host_ctxt)->__hyp_running_vcpu = NULL;
}
+static u64 compute_emulated_cntx_ctl_el0(struct kvm_vcpu *vcpu,
+ enum vcpu_sysreg reg)
+{
+ unsigned long ctl;
+ u64 cval, cnt;
+ bool stat;
+
+ switch (reg) {
+ case CNTP_CTL_EL0:
+ cval = __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0);
+ ctl = __vcpu_sys_reg(vcpu, CNTP_CTL_EL0);
+ cnt = compute_counter_value(vcpu_ptimer(vcpu));
+ break;
+ case CNTV_CTL_EL0:
+ cval = __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0);
+ ctl = __vcpu_sys_reg(vcpu, CNTV_CTL_EL0);
+ cnt = compute_counter_value(vcpu_vtimer(vcpu));
+ break;
+ default:
+ BUG();
+ }
+
+ stat = cval <= cnt;
+ __assign_bit(__ffs(ARCH_TIMER_CTRL_IT_STAT), &ctl, stat);
+
+ return ctl;
+}
+
+static bool kvm_hyp_handle_timer(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+ u64 esr, val;
+
+ /*
+ * Having FEAT_ECV allows for a better quality of timer emulation.
+ * However, this comes at a huge cost in terms of traps. Try and
+ * satisfy the reads from guest's hypervisor context without
+ * returning to the kernel if we can.
+ */
+ if (!is_hyp_ctxt(vcpu))
+ return false;
+
+ esr = kvm_vcpu_get_esr(vcpu);
+ if ((esr & ESR_ELx_SYS64_ISS_DIR_MASK) != ESR_ELx_SYS64_ISS_DIR_READ)
+ return false;
+
+ switch (esr_sys64_to_sysreg(esr)) {
+ case SYS_CNTP_CTL_EL02:
+ val = compute_emulated_cntx_ctl_el0(vcpu, CNTP_CTL_EL0);
+ break;
+ case SYS_CNTP_CTL_EL0:
+ if (vcpu_el2_e2h_is_set(vcpu))
+ val = read_sysreg_el0(SYS_CNTP_CTL);
+ else
+ val = compute_emulated_cntx_ctl_el0(vcpu, CNTP_CTL_EL0);
+ break;
+ case SYS_CNTP_CVAL_EL02:
+ val = __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0);
+ break;
+ case SYS_CNTP_CVAL_EL0:
+ if (vcpu_el2_e2h_is_set(vcpu)) {
+ val = read_sysreg_el0(SYS_CNTP_CVAL);
+
+ if (!has_cntpoff())
+ val -= timer_get_offset(vcpu_hptimer(vcpu));
+ } else {
+ val = __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0);
+ }
+ break;
+ case SYS_CNTV_CTL_EL02:
+ val = compute_emulated_cntx_ctl_el0(vcpu, CNTV_CTL_EL0);
+ break;
+ case SYS_CNTV_CTL_EL0:
+ if (vcpu_el2_e2h_is_set(vcpu))
+ val = read_sysreg_el0(SYS_CNTV_CTL);
+ else
+ val = compute_emulated_cntx_ctl_el0(vcpu, CNTV_CTL_EL0);
+ break;
+ case SYS_CNTV_CVAL_EL02:
+ val = __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0);
+ break;
+ case SYS_CNTV_CVAL_EL0:
+ if (vcpu_el2_e2h_is_set(vcpu))
+ val = read_sysreg_el0(SYS_CNTV_CVAL);
+ else
+ val = __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0);
+ break;
+ default:
+ return false;
+ }
+
+ vcpu_set_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu), val);
+ __kvm_skip_instr(vcpu);
+
+ return true;
+}
+
static bool kvm_hyp_handle_eret(struct kvm_vcpu *vcpu, u64 *exit_code)
{
u64 esr = kvm_vcpu_get_esr(vcpu);
@@ -409,6 +505,9 @@ static bool kvm_hyp_handle_sysreg_vhe(struct kvm_vcpu *vcpu, u64 *exit_code)
if (kvm_hyp_handle_tlbi_el2(vcpu, exit_code))
return true;
+ if (kvm_hyp_handle_timer(vcpu, exit_code))
+ return true;
+
if (kvm_hyp_handle_cpacr_el1(vcpu, exit_code))
return true;
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 6e3f6b7ff2b22..c1ba31fab6f52 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -156,4 +156,19 @@ static inline bool has_cntpoff(void)
return (has_vhe() && cpus_have_final_cap(ARM64_HAS_ECV_CNTPOFF));
}
+static inline u64 timer_get_offset(struct arch_timer_context *ctxt)
+{
+ u64 offset = 0;
+
+ if (!ctxt)
+ return 0;
+
+ if (ctxt->offset.vm_offset)
+ offset += *ctxt->offset.vm_offset;
+ if (ctxt->offset.vcpu_offset)
+ offset += *ctxt->offset.vcpu_offset;
+
+ return offset;
+}
+
#endif
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 06/12] KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (4 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 05/12] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 07/12] KVM: arm64: Handle counter access early in non-HYP context Marc Zyngier
` (7 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Similarly to handling the physical timer accesses early when FEAT_ECV
causes a trap, we try to handle the physical counter without returning
to the general sysreg handling.
More surprisingly, we introduce something similar for the virtual
counter. Although this isn't necessary yet, it will prove useful on
systems that have a broken CNTVOFF_EL2 implementation. Yes, they exist.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/hyp/vhe/switch.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 51119d58ecff8..ef344bcff09a1 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -324,6 +324,10 @@ static bool kvm_hyp_handle_timer(struct kvm_vcpu *vcpu, u64 *exit_code)
val = __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0);
}
break;
+ case SYS_CNTPCT_EL0:
+ case SYS_CNTPCTSS_EL0:
+ val = compute_counter_value(vcpu_hptimer(vcpu));
+ break;
case SYS_CNTV_CTL_EL02:
val = compute_emulated_cntx_ctl_el0(vcpu, CNTV_CTL_EL0);
break;
@@ -342,6 +346,10 @@ static bool kvm_hyp_handle_timer(struct kvm_vcpu *vcpu, u64 *exit_code)
else
val = __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0);
break;
+ case SYS_CNTVCT_EL0:
+ case SYS_CNTVCTSS_EL0:
+ val = compute_counter_value(vcpu_hvtimer(vcpu));
+ break;
default:
return false;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 07/12] KVM: arm64: Handle counter access early in non-HYP context
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (5 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 06/12] KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 08/12] KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT} Marc Zyngier
` (6 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
We already deal with CNTPCT_EL0 accesses in non-HYP context.
Let's add CNTVCT_EL0 as a good measure.
This is also an opportunity to simplify things and make it
plain that this code is only for non-HYP context handling.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/hyp/include/hyp/switch.h | 34 +++++++++++++++----------
1 file changed, 21 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 30e572de28749..719479b42b329 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -506,7 +506,7 @@ static inline u64 compute_counter_value(struct arch_timer_context *ctxt)
return arch_timer_read_cntpct_el0() - timer_get_offset(ctxt);
}
-static bool kvm_hyp_handle_cntpct(struct kvm_vcpu *vcpu)
+static bool kvm_handle_cntxct(struct kvm_vcpu *vcpu)
{
struct arch_timer_context *ctxt;
u32 sysreg;
@@ -516,18 +516,19 @@ static bool kvm_hyp_handle_cntpct(struct kvm_vcpu *vcpu)
* We only get here for 64bit guests, 32bit guests will hit
* the long and winding road all the way to the standard
* handling. Yes, it sucks to be irrelevant.
+ *
+ * Also, we only deal with non-hypervisor context here (either
+ * an EL1 guest, or a non-HYP context of an EL2 guest).
*/
+ if (is_hyp_ctxt(vcpu))
+ return false;
+
sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
switch (sysreg) {
case SYS_CNTPCT_EL0:
case SYS_CNTPCTSS_EL0:
if (vcpu_has_nv(vcpu)) {
- if (is_hyp_ctxt(vcpu)) {
- ctxt = vcpu_hptimer(vcpu);
- break;
- }
-
/* Check for guest hypervisor trapping */
val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2);
if (!vcpu_el2_e2h_is_set(vcpu))
@@ -539,16 +540,23 @@ static bool kvm_hyp_handle_cntpct(struct kvm_vcpu *vcpu)
ctxt = vcpu_ptimer(vcpu);
break;
+ case SYS_CNTVCT_EL0:
+ case SYS_CNTVCTSS_EL0:
+ if (vcpu_has_nv(vcpu)) {
+ /* Check for guest hypervisor trapping */
+ val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2);
+
+ if (val & CNTHCTL_EL1TVCT)
+ return false;
+ }
+
+ ctxt = vcpu_vtimer(vcpu);
+ break;
default:
return false;
}
- val = arch_timer_read_cntpct_el0();
-
- if (ctxt->offset.vm_offset)
- val -= *kern_hyp_va(ctxt->offset.vm_offset);
- if (ctxt->offset.vcpu_offset)
- val -= *kern_hyp_va(ctxt->offset.vcpu_offset);
+ val = compute_counter_value(ctxt);
vcpu_set_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu), val);
__kvm_skip_instr(vcpu);
@@ -593,7 +601,7 @@ static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
__vgic_v3_perform_cpuif_access(vcpu) == 1)
return true;
- if (kvm_hyp_handle_cntpct(vcpu))
+ if (kvm_handle_cntxct(vcpu))
return true;
return false;
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 08/12] KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT}
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (6 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 07/12] KVM: arm64: Handle counter access early in non-HYP context Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits Marc Zyngier
` (5 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
For completeness, fun, and cerebral meltdown, add the virtualisation
related traps to the counter and timers.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/emulate-nested.c | 58 +++++++++++++++++++++++++++++++--
1 file changed, 56 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index 1ffbfd1c3cf2e..03d21044c41f5 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -89,6 +89,9 @@ enum cgt_group_id {
CGT_HCRX_EnFPM,
CGT_HCRX_TCR2En,
+ CGT_CNTHCTL_EL1TVT,
+ CGT_CNTHCTL_EL1TVCT,
+
CGT_ICH_HCR_TC,
CGT_ICH_HCR_TALL0,
CGT_ICH_HCR_TALL1,
@@ -124,6 +127,8 @@ enum cgt_group_id {
__COMPLEX_CONDITIONS__,
CGT_CNTHCTL_EL1PCTEN = __COMPLEX_CONDITIONS__,
CGT_CNTHCTL_EL1PTEN,
+ CGT_CNTHCTL_EL1NVPCT,
+ CGT_CNTHCTL_EL1NVVCT,
CGT_CPTR_TTA,
CGT_MDCR_HPMN,
@@ -393,6 +398,18 @@ static const struct trap_bits coarse_trap_bits[] = {
.mask = HCRX_EL2_TCR2En,
.behaviour = BEHAVE_FORWARD_RW,
},
+ [CGT_CNTHCTL_EL1TVT] = {
+ .index = CNTHCTL_EL2,
+ .value = CNTHCTL_EL1TVT,
+ .mask = CNTHCTL_EL1TVT,
+ .behaviour = BEHAVE_FORWARD_RW,
+ },
+ [CGT_CNTHCTL_EL1TVCT] = {
+ .index = CNTHCTL_EL2,
+ .value = CNTHCTL_EL1TVCT,
+ .mask = CNTHCTL_EL1TVCT,
+ .behaviour = BEHAVE_FORWARD_READ,
+ },
[CGT_ICH_HCR_TC] = {
.index = ICH_HCR_EL2,
.value = ICH_HCR_TC,
@@ -487,6 +504,32 @@ static enum trap_behaviour check_cnthctl_el1pten(struct kvm_vcpu *vcpu)
return BEHAVE_FORWARD_RW;
}
+static bool is_nested_nv2_guest(struct kvm_vcpu *vcpu)
+{
+ u64 val;
+
+ val = __vcpu_sys_reg(vcpu, HCR_EL2);
+ return ((val & (HCR_E2H | HCR_TGE | HCR_NV2 | HCR_NV1 | HCR_NV)) == (HCR_E2H | HCR_NV2 | HCR_NV));
+}
+
+static enum trap_behaviour check_cnthctl_el1nvpct(struct kvm_vcpu *vcpu)
+{
+ if (!is_nested_nv2_guest(vcpu) ||
+ !(__vcpu_sys_reg(vcpu, CNTHCTL_EL2) & CNTHCTL_EL1NVPCT))
+ return BEHAVE_HANDLE_LOCALLY;
+
+ return BEHAVE_FORWARD_RW;
+}
+
+static enum trap_behaviour check_cnthctl_el1nvvct(struct kvm_vcpu *vcpu)
+{
+ if (!is_nested_nv2_guest(vcpu) ||
+ !(__vcpu_sys_reg(vcpu, CNTHCTL_EL2) & CNTHCTL_EL1NVVCT))
+ return BEHAVE_HANDLE_LOCALLY;
+
+ return BEHAVE_FORWARD_RW;
+}
+
static enum trap_behaviour check_cptr_tta(struct kvm_vcpu *vcpu)
{
u64 val = __vcpu_sys_reg(vcpu, CPTR_EL2);
@@ -534,6 +577,8 @@ static enum trap_behaviour check_mdcr_hpmn(struct kvm_vcpu *vcpu)
static const complex_condition_check ccc[] = {
CCC(CGT_CNTHCTL_EL1PCTEN, check_cnthctl_el1pcten),
CCC(CGT_CNTHCTL_EL1PTEN, check_cnthctl_el1pten),
+ CCC(CGT_CNTHCTL_EL1NVPCT, check_cnthctl_el1nvpct),
+ CCC(CGT_CNTHCTL_EL1NVVCT, check_cnthctl_el1nvvct),
CCC(CGT_CPTR_TTA, check_cptr_tta),
CCC(CGT_MDCR_HPMN, check_mdcr_hpmn),
};
@@ -850,11 +895,15 @@ static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
SYS_CNTHP_CVAL_EL2, CGT_HCR_NV),
SR_RANGE_TRAP(SYS_CNTHV_TVAL_EL2,
SYS_CNTHV_CVAL_EL2, CGT_HCR_NV),
- /* All _EL02, _EL12 registers */
+ /* All _EL02, _EL12 registers up to CNTKCTL_EL12*/
SR_RANGE_TRAP(sys_reg(3, 5, 0, 0, 0),
sys_reg(3, 5, 10, 15, 7), CGT_HCR_NV),
SR_RANGE_TRAP(sys_reg(3, 5, 12, 0, 0),
- sys_reg(3, 5, 14, 15, 7), CGT_HCR_NV),
+ sys_reg(3, 5, 14, 1, 0), CGT_HCR_NV),
+ SR_TRAP(SYS_CNTP_CTL_EL02, CGT_CNTHCTL_EL1NVPCT),
+ SR_TRAP(SYS_CNTP_CVAL_EL02, CGT_CNTHCTL_EL1NVPCT),
+ SR_TRAP(SYS_CNTV_CTL_EL02, CGT_CNTHCTL_EL1NVVCT),
+ SR_TRAP(SYS_CNTV_CVAL_EL02, CGT_CNTHCTL_EL1NVVCT),
SR_TRAP(OP_AT_S1E2R, CGT_HCR_NV),
SR_TRAP(OP_AT_S1E2W, CGT_HCR_NV),
SR_TRAP(OP_AT_S12E1R, CGT_HCR_NV),
@@ -1184,6 +1233,11 @@ static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
SR_TRAP(SYS_CNTP_CTL_EL0, CGT_CNTHCTL_EL1PTEN),
SR_TRAP(SYS_CNTPCT_EL0, CGT_CNTHCTL_EL1PCTEN),
SR_TRAP(SYS_CNTPCTSS_EL0, CGT_CNTHCTL_EL1PCTEN),
+ SR_TRAP(SYS_CNTV_TVAL_EL0, CGT_CNTHCTL_EL1TVT),
+ SR_TRAP(SYS_CNTV_CVAL_EL0, CGT_CNTHCTL_EL1TVT),
+ SR_TRAP(SYS_CNTV_CTL_EL0, CGT_CNTHCTL_EL1TVT),
+ SR_TRAP(SYS_CNTVCT_EL0, CGT_CNTHCTL_EL1TVCT),
+ SR_TRAP(SYS_CNTVCTSS_EL0, CGT_CNTHCTL_EL1TVCT),
SR_TRAP(SYS_FPMR, CGT_HCRX_EnFPM),
/*
* IMPDEF choice:
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (7 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 08/12] KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT} Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2025-01-06 2:33 ` Wei-Lin Chang
2024-12-17 14:23 ` [PATCH v2 10/12] KVM: arm64: nv: Sanitise CNTHCTL_EL2 Marc Zyngier
` (4 subsequent siblings)
13 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Allow a guest hypervisor to trap accesses to CNT{P,V}CT_EL02 by
propagating these trap bits to the host trap configuration.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/arch_timer.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 6f04f31c0a7f2..e5951e6eaf236 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -824,6 +824,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
* Apply the enable bits that the guest hypervisor has requested for
* its own guest. We can only add traps that wouldn't have been set
* above.
+ * Implementation choices: we do not support NV when E2H=0 in the
+ * guest, and we don't support configuration where E2H is writable
+ * by the guest (either FEAT_VHE or FEAT_E2H0 is implemented, but
+ * not both). This simplifies the handling of the EL1NV* bits.
*/
if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)) {
u64 val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2);
@@ -834,6 +838,9 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
tpt |= !(val & (CNTHCTL_EL1PCEN << 10));
tpc |= !(val & (CNTHCTL_EL1PCTEN << 10));
+
+ tpt02 |= (val & CNTHCTL_EL1NVPCT);
+ tvt02 |= (val & CNTHCTL_EL1NVVCT);
}
/*
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 10/12] KVM: arm64: nv: Sanitise CNTHCTL_EL2
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (8 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 11/12] KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity Marc Zyngier
` (3 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Inject some sanity in CNTHCTL_EL2, ensuring that we don't handle
more than we advertise to the guest.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 2 +-
arch/arm64/kvm/nested.c | 15 +++++++++++++++
include/clocksource/arm_arch_timer.h | 2 ++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e18e9244d17a4..cf571d41faa89 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -490,7 +490,6 @@ enum vcpu_sysreg {
VBAR_EL2, /* Vector Base Address Register (EL2) */
RVBAR_EL2, /* Reset Vector Base Address Register */
CONTEXTIDR_EL2, /* Context ID Register (EL2) */
- CNTHCTL_EL2, /* Counter-timer Hypervisor Control register */
SP_EL2, /* EL2 Stack Pointer */
CNTHP_CTL_EL2,
CNTHP_CVAL_EL2,
@@ -501,6 +500,7 @@ enum vcpu_sysreg {
MARKER(__SANITISED_REG_START__),
TCR2_EL2, /* Extended Translation Control Register (EL2) */
MDCR_EL2, /* Monitor Debug Configuration Register (EL2) */
+ CNTHCTL_EL2, /* Counter-timer Hypervisor Control register */
/* Any VNCR-capable reg goes after this point */
MARKER(__VNCR_START__),
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 9b36218b48def..9113c6025d6f3 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1271,6 +1271,21 @@ int kvm_init_nv_sysregs(struct kvm *kvm)
res0 |= MDCR_EL2_EnSTEPOP;
set_sysreg_masks(kvm, MDCR_EL2, res0, res1);
+ /* CNTHCTL_EL2 */
+ res0 = GENMASK(63, 20);
+ res1 = 0;
+ if (!kvm_has_feat(kvm, ID_AA64PFR0_EL1, RME, IMP))
+ res0 |= CNTHCTL_CNTPMASK | CNTHCTL_CNTVMASK;
+ if (!kvm_has_feat(kvm, ID_AA64MMFR0_EL1, ECV, CNTPOFF)) {
+ res0 |= CNTHCTL_ECV;
+ if (!kvm_has_feat(kvm, ID_AA64MMFR0_EL1, ECV, IMP))
+ res0 |= (CNTHCTL_EL1TVT | CNTHCTL_EL1TVCT |
+ CNTHCTL_EL1NVPCT | CNTHCTL_EL1NVVCT);
+ }
+ if (!kvm_has_feat(kvm, ID_AA64MMFR1_EL1, VH, IMP))
+ res0 |= GENMASK(11, 8);
+ set_sysreg_masks(kvm, CNTHCTL_EL2, res0, res1);
+
return 0;
}
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index c62811fb41309..ce6521ad04d12 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -26,6 +26,8 @@
#define CNTHCTL_EL1TVCT (1 << 14)
#define CNTHCTL_EL1NVPCT (1 << 15)
#define CNTHCTL_EL1NVVCT (1 << 16)
+#define CNTHCTL_CNTVMASK (1 << 18)
+#define CNTHCTL_CNTPMASK (1 << 19)
enum arch_timer_reg {
ARCH_TIMER_REG_CTRL,
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 11/12] KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (9 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 10/12] KVM: arm64: nv: Sanitise CNTHCTL_EL2 Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 12/12] KVM: arm64: nv: Document EL2 timer API Marc Zyngier
` (2 subsequent siblings)
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
It appears that on Qualcomm's x1e CPU, CNTVOFF_EL2 doesn't really
work, specially with HCR_EL2.E2H=1.
A non-zero offset results in a screaming virtual timer interrupt,
to the tune of a few 100k interrupts per second on a 4 vcpu VM.
This is also evidenced by this CPU's inability to correctly run
any of the timer selftests.
The only case this doesn't break is when this register is set to 0,
which breaks VM migration.
When HCR_EL2.E2H=0, the timer seems to behave normally, and does
not result in an interrupt storm.
As a workaround, use the fact that this CPU implements FEAT_ECV,
and trap all accesses to the virtual timer and counter, keeping
CNTVOFF_EL2 set to zero, and emulate accesses to CVAL/TVAL/CTL
and the counter itself, fixing up the timer to account for the
missing offset.
And if you think this is disgusting, you'd probably be right.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/include/asm/cputype.h | 2 ++
arch/arm64/kernel/cpu_errata.c | 8 +++++
arch/arm64/kernel/image-vars.h | 3 ++
arch/arm64/kvm/arch_timer.c | 58 ++++++++++++++++++++++++++++--
arch/arm64/kvm/hyp/nvhe/timer-sr.c | 16 ++++++---
arch/arm64/kvm/sys_regs.c | 3 +-
arch/arm64/tools/cpucaps | 1 +
include/kvm/arm_arch_timer.h | 7 ++++
8 files changed, 90 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 488f8e7513495..6f3f4142e214f 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -122,6 +122,7 @@
#define QCOM_CPU_PART_KRYO_3XX_SILVER 0x803
#define QCOM_CPU_PART_KRYO_4XX_GOLD 0x804
#define QCOM_CPU_PART_KRYO_4XX_SILVER 0x805
+#define QCOM_CPU_PART_ORYON_X1 0x001
#define NVIDIA_CPU_PART_DENVER 0x003
#define NVIDIA_CPU_PART_CARMEL 0x004
@@ -198,6 +199,7 @@
#define MIDR_QCOM_KRYO_3XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_3XX_SILVER)
#define MIDR_QCOM_KRYO_4XX_GOLD MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_GOLD)
#define MIDR_QCOM_KRYO_4XX_SILVER MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO_4XX_SILVER)
+#define MIDR_QCOM_ORYON_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_ORYON_X1)
#define MIDR_NVIDIA_DENVER MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_DENVER)
#define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
#define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index a78f247029aec..7ce5558628951 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -786,6 +786,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
ERRATA_MIDR_RANGE_LIST(erratum_ac03_cpu_38_list),
},
#endif
+ {
+ .desc = "Broken CNTVOFF_EL2",
+ .capability = ARM64_WORKAROUND_QCOM_ORYON_CNTVOFF,
+ ERRATA_MIDR_RANGE_LIST(((const struct midr_range[]) {
+ MIDR_ALL_VERSIONS(MIDR_QCOM_ORYON_X1),
+ {}
+ })),
+ },
{
}
};
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 8f5422ed1b758..ef3a69cc398e5 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -105,6 +105,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
KVM_NVHE_ALIAS(vgic_v2_cpuif_trap);
KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);
+/* Static key which is set if CNTVOFF_EL2 is unusable */
+KVM_NVHE_ALIAS(broken_cntvoff_key);
+
/* EL2 exception handling */
KVM_NVHE_ALIAS(__start___kvm_ex_table);
KVM_NVHE_ALIAS(__stop___kvm_ex_table);
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index e5951e6eaf236..d3d243366536c 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -30,6 +30,7 @@ static u32 host_vtimer_irq_flags;
static u32 host_ptimer_irq_flags;
static DEFINE_STATIC_KEY_FALSE(has_gic_active_state);
+DEFINE_STATIC_KEY_FALSE(broken_cntvoff_key);
static const u8 default_ppi[] = {
[TIMER_PTIMER] = 30,
@@ -519,7 +520,12 @@ static void timer_save_state(struct arch_timer_context *ctx)
case TIMER_VTIMER:
case TIMER_HVTIMER:
timer_set_ctl(ctx, read_sysreg_el0(SYS_CNTV_CTL));
- timer_set_cval(ctx, read_sysreg_el0(SYS_CNTV_CVAL));
+ cval = read_sysreg_el0(SYS_CNTV_CVAL);
+
+ if (has_broken_cntvoff())
+ cval -= timer_get_offset(ctx);
+
+ timer_set_cval(ctx, cval);
/* Disable the timer */
write_sysreg_el0(0, SYS_CNTV_CTL);
@@ -624,8 +630,15 @@ static void timer_restore_state(struct arch_timer_context *ctx)
case TIMER_VTIMER:
case TIMER_HVTIMER:
- set_cntvoff(timer_get_offset(ctx));
- write_sysreg_el0(timer_get_cval(ctx), SYS_CNTV_CVAL);
+ cval = timer_get_cval(ctx);
+ offset = timer_get_offset(ctx);
+ if (has_broken_cntvoff()) {
+ set_cntvoff(0);
+ cval += offset;
+ } else {
+ set_cntvoff(offset);
+ }
+ write_sysreg_el0(cval, SYS_CNTV_CVAL);
isb();
write_sysreg_el0(timer_get_ctl(ctx), SYS_CNTV_CTL);
break;
@@ -820,6 +833,13 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
if (!has_cntpoff() && timer_get_offset(map->direct_ptimer))
tpt = tpc = true;
+ /*
+ * For the poor sods that could not correctly substract one value
+ * from another, trap the full virtual timer and counter.
+ */
+ if (has_broken_cntvoff() && timer_get_offset(map->direct_vtimer))
+ tvt = tvc = true;
+
/*
* Apply the enable bits that the guest hypervisor has requested for
* its own guest. We can only add traps that wouldn't have been set
@@ -1450,6 +1470,37 @@ static int kvm_irq_init(struct arch_timer_kvm_info *info)
return 0;
}
+static void kvm_timer_handle_errata(void)
+{
+ u64 mmfr0, mmfr1, mmfr4;
+
+ /*
+ * CNTVOFF_EL2 is broken on some implementations. For those, we trap
+ * all virtual timer/counter accesses, requiring FEAT_ECV.
+ *
+ * However, a hypervisor supporting nesting is likely to mitigate the
+ * erratum at L0, and not require other levels to mitigate it (which
+ * would otherwise be a terrible performance sink due to trap
+ * amplification).
+ *
+ * Given that the affected HW implements both FEAT_VHE and FEAT_E2H0,
+ * and that NV is likely not to (because of limitations of the
+ * architecture), only enable the workaround when FEAT_VHE and
+ * FEAT_E2H0 are both detected. Time will tell if this actually holds.
+ */
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+ mmfr4 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR4_EL1);
+ if (SYS_FIELD_GET(ID_AA64MMFR1_EL1, VH, mmfr1) &&
+ !SYS_FIELD_GET(ID_AA64MMFR4_EL1, E2H0, mmfr4) &&
+ SYS_FIELD_GET(ID_AA64MMFR0_EL1, ECV, mmfr0) &&
+ (has_vhe() || has_hvhe()) &&
+ cpus_have_final_cap(ARM64_WORKAROUND_QCOM_ORYON_CNTVOFF)) {
+ static_branch_enable(&broken_cntvoff_key);
+ kvm_info("Broken CNTVOFF_EL2, trapping virtual timer\n");
+ }
+}
+
int __init kvm_timer_hyp_init(bool has_gic)
{
struct arch_timer_kvm_info *info;
@@ -1518,6 +1569,7 @@ int __init kvm_timer_hyp_init(bool has_gic)
goto out_free_vtimer_irq;
}
+ kvm_timer_handle_errata();
return 0;
out_free_ptimer_irq:
diff --git a/arch/arm64/kvm/hyp/nvhe/timer-sr.c b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
index 3aaab20ae5b47..ff176f4ce7deb 100644
--- a/arch/arm64/kvm/hyp/nvhe/timer-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/timer-sr.c
@@ -22,15 +22,16 @@ void __kvm_timer_set_cntvoff(u64 cntvoff)
*/
void __timer_disable_traps(struct kvm_vcpu *vcpu)
{
- u64 val, shift = 0;
+ u64 set, clr, shift = 0;
if (has_hvhe())
shift = 10;
/* Allow physical timer/counter access for the host */
- val = read_sysreg(cnthctl_el2);
- val |= (CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN) << shift;
- write_sysreg(val, cnthctl_el2);
+ set = (CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN) << shift;
+ clr = CNTHCTL_EL1TVT | CNTHCTL_EL1TVCT;
+
+ sysreg_clear_set(cnthctl_el2, clr, set);
}
/*
@@ -58,5 +59,12 @@ void __timer_enable_traps(struct kvm_vcpu *vcpu)
set <<= 10;
}
+ /*
+ * Trap the virtual counter/timer if we have a broken cntvoff
+ * implementation.
+ */
+ if (has_broken_cntvoff())
+ set |= CNTHCTL_EL1TVT | CNTHCTL_EL1TVCT;
+
sysreg_clear_set(cnthctl_el2, clr, set);
}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 986e63d4f9faa..d161d6c05707a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1721,7 +1721,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
if (!vcpu_has_ptrauth(vcpu))
val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) |
ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));
- if (!cpus_have_final_cap(ARM64_HAS_WFXT))
+ if (!cpus_have_final_cap(ARM64_HAS_WFXT) ||
+ has_broken_cntvoff())
val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
break;
case SYS_ID_AA64MMFR2_EL1:
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eb17f59e543c4..1e65f2fb45bd1 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -105,6 +105,7 @@ WORKAROUND_CLEAN_CACHE
WORKAROUND_DEVICE_LOAD_ACQUIRE
WORKAROUND_NVIDIA_CARMEL_CNP
WORKAROUND_QCOM_FALKOR_E1003
+WORKAROUND_QCOM_ORYON_CNTVOFF
WORKAROUND_REPEAT_TLBI
WORKAROUND_SPECULATIVE_AT
WORKAROUND_SPECULATIVE_SSBS
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index c1ba31fab6f52..681cf0c8b9df4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -151,6 +151,13 @@ void kvm_timer_cpu_down(void);
/* CNTKCTL_EL1 valid bits as of DDI0487J.a */
#define CNTKCTL_VALID_BITS (BIT(17) | GENMASK_ULL(9, 0))
+DECLARE_STATIC_KEY_FALSE(broken_cntvoff_key);
+
+static inline bool has_broken_cntvoff(void)
+{
+ return static_branch_unlikely(&broken_cntvoff_key);
+}
+
static inline bool has_cntpoff(void)
{
return (has_vhe() && cpus_have_final_cap(ARM64_HAS_ECV_CNTPOFF));
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH v2 12/12] KVM: arm64: nv: Document EL2 timer API
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (10 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 11/12] KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity Marc Zyngier
@ 2024-12-17 14:23 ` Marc Zyngier
2025-01-02 19:15 ` [PATCH v2 00/12] KVM: arm64: Add NV timer support Oliver Upton
2025-01-02 19:25 ` Marc Zyngier
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2024-12-17 14:23 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
Documentation/virt/kvm/devices/vcpu.rst | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 31f14ec4a65b6..d62ba86ee1668 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -142,8 +142,9 @@ the cpu field to the processor id.
:Architectures: ARM64
-2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
------------------------------------------------------------------------------
+2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER,
+ KVM_ARM_VCPU_TIMER_IRQ_HVTIMER, KVM_ARM_VCPU_TIMER_IRQ_HPTIMER,
+--------------------------------------------------------------------------------
:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
pointer to an int
@@ -159,10 +160,12 @@ A value describing the architected timer interrupt number when connected to an
in-kernel virtual GIC. These must be a PPI (16 <= intid < 32). Setting the
attribute overrides the default values (see below).
-============================= ==========================================
-KVM_ARM_VCPU_TIMER_IRQ_VTIMER The EL1 virtual timer intid (default: 27)
-KVM_ARM_VCPU_TIMER_IRQ_PTIMER The EL1 physical timer intid (default: 30)
-============================= ==========================================
+============================== ==========================================
+KVM_ARM_VCPU_TIMER_IRQ_VTIMER The EL1 virtual timer intid (default: 27)
+KVM_ARM_VCPU_TIMER_IRQ_PTIMER The EL1 physical timer intid (default: 30)
+KVM_ARM_VCPU_TIMER_IRQ_HVTIMER The EL2 virtual timer intid (default: 28)
+KVM_ARM_VCPU_TIMER_IRQ_HPTIMER The EL2 physical timer intid (default: 26)
+============================== ==========================================
Setting the same PPI for different timers will prevent the VCPUs from running.
Setting the interrupt number on a VCPU configures all VCPUs created at that
--
2.39.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers
2024-12-17 14:23 ` [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers Marc Zyngier
@ 2024-12-21 1:38 ` Oliver Upton
2024-12-21 9:57 ` Marc Zyngier
0 siblings, 1 reply; 27+ messages in thread
From: Oliver Upton @ 2024-12-21 1:38 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger
On Tue, Dec 17, 2024 at 02:23:09PM +0000, Marc Zyngier wrote:
> @@ -3879,9 +4020,11 @@ static const struct sys_reg_desc cp15_64_regs[] = {
> { SYS_DESC(SYS_AARCH32_CNTPCT), access_arch_timer },
> { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR1_EL1 },
> { Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */
> + { SYS_DESC(SYS_AARCH32_CNTVCT), access_arch_timer },
> { Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */
> { SYS_DESC(SYS_AARCH32_CNTP_CVAL), access_arch_timer },
> { SYS_DESC(SYS_AARCH32_CNTPCTSS), access_arch_timer },
> + { SYS_DESC(SYS_AARCH32_CNTVCTSS), access_arch_timer },
> };
Huh. You know, I had always thought we hid 32-bit EL0 from nested
guests, but I now realize that isn't the case. Of course, we don't have
the necessary trap reflection for exits that came out of a 32-bit EL0,
nor should we bother.
Of the 4 NV2 implementations I'm aware of (Neoverse-V1, Neoverse-V2,
AmpereOne, M2) only Neoverse-V1 supports 32-bit userspace. And even
then, a lot of deployments of V1 have a broken NV2 implementation.
What do you think about advertising a 64-bit only EL0 for nested VMs?
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers
2024-12-21 1:38 ` Oliver Upton
@ 2024-12-21 9:57 ` Marc Zyngier
2024-12-21 21:58 ` Oliver Upton
0 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2024-12-21 9:57 UTC (permalink / raw)
To: Oliver Upton
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger
On Sat, 21 Dec 2024 01:38:28 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Tue, Dec 17, 2024 at 02:23:09PM +0000, Marc Zyngier wrote:
> > @@ -3879,9 +4020,11 @@ static const struct sys_reg_desc cp15_64_regs[] = {
> > { SYS_DESC(SYS_AARCH32_CNTPCT), access_arch_timer },
> > { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR1_EL1 },
> > { Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */
> > + { SYS_DESC(SYS_AARCH32_CNTVCT), access_arch_timer },
> > { Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */
> > { SYS_DESC(SYS_AARCH32_CNTP_CVAL), access_arch_timer },
> > { SYS_DESC(SYS_AARCH32_CNTPCTSS), access_arch_timer },
> > + { SYS_DESC(SYS_AARCH32_CNTVCTSS), access_arch_timer },
> > };
>
> Huh. You know, I had always thought we hid 32-bit EL0 from nested
> guests, but I now realize that isn't the case. Of course, we don't have
> the necessary trap reflection for exits that came out of a 32-bit EL0,
> nor should we bother.
>
> Of the 4 NV2 implementations I'm aware of (Neoverse-V1, Neoverse-V2,
> AmpereOne, M2) only Neoverse-V1 supports 32-bit userspace. And even
> then, a lot of deployments of V1 have a broken NV2 implementation.
>
> What do you think about advertising a 64-bit only EL0 for nested VMs?
I'm completely OK with that.
Actually, we already nuke the guest if exiting from 32bit context, no
matter the EL (vcpu_mode_is_bad_32bit() is where this happens). But
we're missing the ID_AA64PFR0_EL1.EL0 sanitising, which is a bug. I'll
send a patch shortly.
Now, for this particular patch, I still think we should gracefully
handle access to the EL1 timer from a 32bit capable, non-NV guest.
Just in case we end-up with a CPU with a broken CNTVOFF_EL2 *and*
32bit capability.
In the end, it doesn't cost us much to support this case, and it helps
that we can verify that we handle all registers without exception.
Thoughts?
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers
2024-12-21 9:57 ` Marc Zyngier
@ 2024-12-21 21:58 ` Oliver Upton
0 siblings, 0 replies; 27+ messages in thread
From: Oliver Upton @ 2024-12-21 21:58 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger
On Sat, Dec 21, 2024 at 09:57:44AM +0000, Marc Zyngier wrote:
> On Sat, 21 Dec 2024 01:38:28 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> >
> > On Tue, Dec 17, 2024 at 02:23:09PM +0000, Marc Zyngier wrote:
> > > @@ -3879,9 +4020,11 @@ static const struct sys_reg_desc cp15_64_regs[] = {
> > > { SYS_DESC(SYS_AARCH32_CNTPCT), access_arch_timer },
> > > { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR1_EL1 },
> > > { Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */
> > > + { SYS_DESC(SYS_AARCH32_CNTVCT), access_arch_timer },
> > > { Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */
> > > { SYS_DESC(SYS_AARCH32_CNTP_CVAL), access_arch_timer },
> > > { SYS_DESC(SYS_AARCH32_CNTPCTSS), access_arch_timer },
> > > + { SYS_DESC(SYS_AARCH32_CNTVCTSS), access_arch_timer },
> > > };
> >
> > Huh. You know, I had always thought we hid 32-bit EL0 from nested
> > guests, but I now realize that isn't the case. Of course, we don't have
> > the necessary trap reflection for exits that came out of a 32-bit EL0,
> > nor should we bother.
> >
> > Of the 4 NV2 implementations I'm aware of (Neoverse-V1, Neoverse-V2,
> > AmpereOne, M2) only Neoverse-V1 supports 32-bit userspace. And even
> > then, a lot of deployments of V1 have a broken NV2 implementation.
> >
> > What do you think about advertising a 64-bit only EL0 for nested VMs?
>
> I'm completely OK with that.
>
> Actually, we already nuke the guest if exiting from 32bit context, no
> matter the EL (vcpu_mode_is_bad_32bit() is where this happens). But
> we're missing the ID_AA64PFR0_EL1.EL0 sanitising, which is a bug. I'll
> send a patch shortly.
>
> Now, for this particular patch, I still think we should gracefully
> handle access to the EL1 timer from a 32bit capable, non-NV guest.
> Just in case we end-up with a CPU with a broken CNTVOFF_EL2 *and*
> 32bit capability.
>
> In the end, it doesn't cost us much to support this case, and it helps
> that we can verify that we handle all registers without exception.
>
> Thoughts?
Absolutely. The only reason I made the comment is because the 32-bit
changes had nudged me into thinking about it. Happy with the patch as
is.
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 00/12] KVM: arm64: Add NV timer support
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (11 preceding siblings ...)
2024-12-17 14:23 ` [PATCH v2 12/12] KVM: arm64: nv: Document EL2 timer API Marc Zyngier
@ 2025-01-02 19:15 ` Oliver Upton
2025-01-02 19:25 ` Marc Zyngier
13 siblings, 0 replies; 27+ messages in thread
From: Oliver Upton @ 2025-01-02 19:15 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger
On Tue, Dec 17, 2024 at 02:23:08PM +0000, Marc Zyngier wrote:
> Here's another version of the series initially posted at [1], which
> implements support for timers in NV context.
>
> From v1:
>
> - Repainted EL0->EL1 when rambling about the timers
>
> - Simplified access to EL1 counters from HYP context
>
> - Update the status register when handled as an early trap
>
> - Added some documentation about the default PPI numbers
>
> The whole thing has been tested with 6.13-rc3 as part of the my NV
> integration branch [2], and is functional enough to run an L3 guest
> with kvmtool as the VMM and EDK2 as the firmware. YMMV.
>
> [1] https://lore.kernel.org/r/20241202172134.384923-1-maz@kernel.org
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-next
Time to let it rip!
Acked-by: Oliver Upton <oliver.upton@linux.dev>
--
Thanks,
Oliver
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 00/12] KVM: arm64: Add NV timer support
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
` (12 preceding siblings ...)
2025-01-02 19:15 ` [PATCH v2 00/12] KVM: arm64: Add NV timer support Oliver Upton
@ 2025-01-02 19:25 ` Marc Zyngier
13 siblings, 0 replies; 27+ messages in thread
From: Marc Zyngier @ 2025-01-02 19:25 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm, Marc Zyngier
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger
On Tue, 17 Dec 2024 14:23:08 +0000, Marc Zyngier wrote:
> Here's another version of the series initially posted at [1], which
> implements support for timers in NV context.
>
> From v1:
>
> - Repainted EL0->EL1 when rambling about the timers
>
> [...]
Applied to next, thanks!
[01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers
commit: b59dbb91f7636a89b54ab8fff756afe320ba6549
[02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
commit: 4bad3068cfa9fc38dd767441871e0edab821105b
[03/12] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state
commit: cc45963cbf6334d2b9078f06efef9864639cddd0
[04/12] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers
commit: 2cd2a77f9c32f1eaf599fb72cbcd0394938a8b58
[05/12] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use
commit: 338f8ea51944d02ea29eadb3d5fa9196e74a100d
[06/12] KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context
commit: 9b3b2f00291e1abd54bff345761a7fadd8df4daa
[07/12] KVM: arm64: Handle counter access early in non-HYP context
commit: b86fc215dc26d8e1bb274f0a7990b5deab740ac8
[08/12] KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT}
commit: c271269e3570766724820bcb76a144125dead272
[09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
commit: 479428cc3dc99bbe28954b62b053b22accbfd1fd
[10/12] KVM: arm64: nv: Sanitise CNTHCTL_EL2
commit: d1e37a50e1d781201768c89314532f6ab87e5a42
[11/12] KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity
commit: 0bc9a9e85fcf4ffb69846b961273fde4eb0d03ab
[12/12] KVM: arm64: nv: Document EL2 timer API
commit: affd1c83e090133a3d1750916c7911b20f8911c0
Cheers,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2024-12-17 14:23 ` [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Marc Zyngier
@ 2025-01-06 2:19 ` Wei-Lin Chang
2025-01-26 15:25 ` Volodymyr Babchuk
1 sibling, 0 replies; 27+ messages in thread
From: Wei-Lin Chang @ 2025-01-06 2:19 UTC (permalink / raw)
To: Marc Zyngier, kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger, r09922117
Hi Marc and other KVM ARM developers,
I am trying to learn about NV and I have a few questions wile reading
the code and comments:
On Tue, Dec 17, 2024 at 02:23:10PM +0000, Marc Zyngier wrote:
> Emulating the timers with FEAT_NV2 is a bit odd, as the timers
> can be reconfigured behind our back without the hypervisor even
> noticing. In the VHE case, that's an actual regression in the
> architecture...
I'm curious why you implied NV2 isn't a regression in the nVHE case? In
my understanding without NV2 and ECV, an nVHE guest hypervisor would
directly program CNT{P, V}_*_EL0 registers without traps, and with NV2,
those accesses would be turned into memory accesses to the VNCR page,
delaying the set up of the timer until the next exit.
So NV2 also makes it worse for an nVHE guest hypervisor. What have I
missed here?
>
> Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
> arch/arm64/kvm/arm.c | 3 +++
> include/kvm/arm_arch_timer.h | 1 +
> 3 files changed, 48 insertions(+)
>
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 1215df5904185..ee5f732fbbece 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
> kvm_timer_blocking(vcpu);
> }
>
> +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
> +{
> + /*
> + * When NV2 is on, guest hypervisors have their EL1 timer register
> + * accesses redirected to the VNCR page. Any guest action taken on
> + * the timer is postponed until the next exit, leading to a very
> + * poor quality of emulation.
> + */
> + if (!is_hyp_ctxt(vcpu))
> + return;
> +
> + if (!vcpu_el2_e2h_is_set(vcpu)) {
> + /*
> + * A non-VHE guest hypervisor doesn't have any direct access
> + * to its timers: the EL2 registers trap (and the HW is
> + * fully emulated), while the EL0 registers access memory
> + * despite the access being notionally direct. Boo.
Reading this part of the comment I understand it as for an nVHE guest
hypervisor, the timer map would look like:
map->emul_vtimer == vcpu_hvtimer(vcpu) /* EL2 HW is fully emulated */
map->emul_ptimer == vcpu_hptimer(vcpu) /* EL2 HW is fully emulated */
map->direct_vtimer == vcpu_vtimer(vcpu) /* EL0 notionally direct */
map->direct_ptimer == vcpu_ptimer(vcpu) /* EL0 notionally direct */
But looking at get_timer_map() I see we enter the first nested if
statement, regardless of whether the guest hypervisor is VHE or nVHE,
since it checks like the following:
if (vcpu_has_nv(vcpu)) {
if (is_hyp_ctxt(vcpu)) {
map->direct_vtimer == vcpu_hvtimer(vcpu);
map->direct_ptimer == vcpu_hptimer(vcpu);
map->emul_vtimer == vcpu_vtimer(vcpu);
map->emul_ptimer == vcpu_ptimer(vcpu);
} else {
[...]
}
} else if (...) {
[...]
} else {
[...]
}
Which contradicts what the above comment says.
What did I misunderstand?
Thank you so much, any guidance is much appreciated.
Wei-Lin Chang
> + *
> + * We update the hardware timer registers with the
> + * latest value written by the guest to the VNCR page
> + * and let the hardware take care of the rest.
> + */
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
> + } else {
> + /*
> + * For a VHE guest hypervisor, the EL2 state is directly
> + * stored in the host EL1 timers, while the emulated EL0
> + * state is stored in the VNCR page. The latter could have
> + * been updated behind our back, and we must reset the
> + * emulation of the timers.
> + */
> + struct timer_map map;
> + get_timer_map(vcpu, &map);
> +
> + soft_timer_cancel(&map.emul_vtimer->hrtimer);
> + soft_timer_cancel(&map.emul_ptimer->hrtimer);
> + timer_emulate(map.emul_vtimer);
> + timer_emulate(map.emul_ptimer);
> + }
> +}
> +
> /*
> * With a userspace irqchip we have to check if the guest de-asserted the
> * timer and if so, unmask the timer irq signal on the host interrupt
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index a102c3aebdbc4..fa3089822f9f3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1228,6 +1228,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
> kvm_timer_sync_user(vcpu);
>
> + if (vcpu_has_nv(vcpu))
> + kvm_timer_sync_nested(vcpu);
> +
> kvm_arch_vcpu_ctxsync_fp(vcpu);
>
> /*
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index fd650a8789b91..6e3f6b7ff2b22 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -98,6 +98,7 @@ int __init kvm_timer_hyp_init(bool has_gic);
> int kvm_timer_enable(struct kvm_vcpu *vcpu);
> void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu);
> void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
> +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu);
> void kvm_timer_sync_user(struct kvm_vcpu *vcpu);
> bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu);
> void kvm_timer_update_run(struct kvm_vcpu *vcpu);
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
2024-12-17 14:23 ` [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits Marc Zyngier
@ 2025-01-06 2:33 ` Wei-Lin Chang
2025-01-17 15:19 ` Marc Zyngier
0 siblings, 1 reply; 27+ messages in thread
From: Wei-Lin Chang @ 2025-01-06 2:33 UTC (permalink / raw)
To: Marc Zyngier, kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Bjorn Andersson, Christoffer Dall, Ganapatrao Kulkarni,
Chase Conklin, Eric Auger, r09922117
Hi Marc and other KVM ARM developers,
I have a question while learning about NV and reading the code:
On Tue, Dec 17, 2024 at 02:23:17PM +0000, Marc Zyngier wrote:
> Allow a guest hypervisor to trap accesses to CNT{P,V}CT_EL02 by
> propagating these trap bits to the host trap configuration.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/arch_timer.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 6f04f31c0a7f2..e5951e6eaf236 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -824,6 +824,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
> * Apply the enable bits that the guest hypervisor has requested for
> * its own guest. We can only add traps that wouldn't have been set
> * above.
> + * Implementation choices: we do not support NV when E2H=0 in the
> + * guest, and we don't support configuration where E2H is writable
> + * by the guest (either FEAT_VHE or FEAT_E2H0 is implemented, but
> + * not both). This simplifies the handling of the EL1NV* bits.
Previously I was not aware that KVM ARM has these constraints on guest's
view of NV and E2H, so I appreciate this comment very much. However,
after digging through the code I could not find anywhere where these
constraints are enforced, for example initially I thought I would find
ID_AA64MMFR2_EL1_NV being zeroed in limit_nv_id_regs(), or HCR_NV added
to the res0 mask of HCR_EL2, base on whether FEAT_VHE or FEAT_E2H0 is
available to the guest. But seems like in these places the code applies
constraints looking at the host's capabilities, not the guest's.
Do you mind providing some pointers for me to investigate the code mode?
Thank you so much, I'd appreciate any help.
Wei-Lin Chang
> */
> if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)) {
> u64 val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2);
> @@ -834,6 +838,9 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
>
> tpt |= !(val & (CNTHCTL_EL1PCEN << 10));
> tpc |= !(val & (CNTHCTL_EL1PCTEN << 10));
> +
> + tpt02 |= (val & CNTHCTL_EL1NVPCT);
> + tvt02 |= (val & CNTHCTL_EL1NVVCT);
> }
>
> /*
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
2025-01-06 2:33 ` Wei-Lin Chang
@ 2025-01-17 15:19 ` Marc Zyngier
2025-01-21 6:04 ` Wei-Lin Chang
0 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2025-01-17 15:19 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger
On Mon, 06 Jan 2025 02:33:39 +0000,
Wei-Lin Chang <r09922117@csie.ntu.edu.tw> wrote:
>
> Hi Marc and other KVM ARM developers,
>
> I have a question while learning about NV and reading the code:
>
> On Tue, Dec 17, 2024 at 02:23:17PM +0000, Marc Zyngier wrote:
> > Allow a guest hypervisor to trap accesses to CNT{P,V}CT_EL02 by
> > propagating these trap bits to the host trap configuration.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/kvm/arch_timer.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 6f04f31c0a7f2..e5951e6eaf236 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -824,6 +824,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
> > * Apply the enable bits that the guest hypervisor has requested for
> > * its own guest. We can only add traps that wouldn't have been set
> > * above.
> > + * Implementation choices: we do not support NV when E2H=0 in the
> > + * guest, and we don't support configuration where E2H is writable
> > + * by the guest (either FEAT_VHE or FEAT_E2H0 is implemented, but
> > + * not both). This simplifies the handling of the EL1NV* bits.
>
> Previously I was not aware that KVM ARM has these constraints on guest's
> view of NV and E2H, so I appreciate this comment very much. However,
> after digging through the code I could not find anywhere where these
> constraints are enforced, for example initially I thought I would find
> ID_AA64MMFR2_EL1_NV being zeroed in limit_nv_id_regs(), or HCR_NV added
> to the res0 mask of HCR_EL2, base on whether FEAT_VHE or FEAT_E2H0 is
> available to the guest. But seems like in these places the code applies
> constraints looking at the host's capabilities, not the guest's.
> Do you mind providing some pointers for me to investigate the code mode?
Where have you looked?
These constraints are enforced in the kvm-arm64/nv-e2h-select
branch[1], which is pulled in the kvm-arm64/nv-next branch[2].
The NV support is split in discrete series in order to make things
easier to review, but you need to have seen them all to somehow
connect the dots.
HTH,
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-e2h-select
[2] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-next
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits
2025-01-17 15:19 ` Marc Zyngier
@ 2025-01-21 6:04 ` Wei-Lin Chang
0 siblings, 0 replies; 27+ messages in thread
From: Wei-Lin Chang @ 2025-01-21 6:04 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, r09922117
On Fri, Jan 17, 2025 at 03:19:03PM +0000, Marc Zyngier wrote:
> On Mon, 06 Jan 2025 02:33:39 +0000,
> Wei-Lin Chang <r09922117@csie.ntu.edu.tw> wrote:
> >
> > Hi Marc and other KVM ARM developers,
> >
> > I have a question while learning about NV and reading the code:
> >
> > On Tue, Dec 17, 2024 at 02:23:17PM +0000, Marc Zyngier wrote:
> > > Allow a guest hypervisor to trap accesses to CNT{P,V}CT_EL02 by
> > > propagating these trap bits to the host trap configuration.
> > >
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > > arch/arm64/kvm/arch_timer.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > > index 6f04f31c0a7f2..e5951e6eaf236 100644
> > > --- a/arch/arm64/kvm/arch_timer.c
> > > +++ b/arch/arm64/kvm/arch_timer.c
> > > @@ -824,6 +824,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
> > > * Apply the enable bits that the guest hypervisor has requested for
> > > * its own guest. We can only add traps that wouldn't have been set
> > > * above.
> > > + * Implementation choices: we do not support NV when E2H=0 in the
> > > + * guest, and we don't support configuration where E2H is writable
> > > + * by the guest (either FEAT_VHE or FEAT_E2H0 is implemented, but
> > > + * not both). This simplifies the handling of the EL1NV* bits.
> >
> > Previously I was not aware that KVM ARM has these constraints on guest's
> > view of NV and E2H, so I appreciate this comment very much. However,
> > after digging through the code I could not find anywhere where these
> > constraints are enforced, for example initially I thought I would find
> > ID_AA64MMFR2_EL1_NV being zeroed in limit_nv_id_regs(), or HCR_NV added
> > to the res0 mask of HCR_EL2, base on whether FEAT_VHE or FEAT_E2H0 is
> > available to the guest. But seems like in these places the code applies
> > constraints looking at the host's capabilities, not the guest's.
> > Do you mind providing some pointers for me to investigate the code mode?
>
> Where have you looked?
>
> These constraints are enforced in the kvm-arm64/nv-e2h-select
> branch[1], which is pulled in the kvm-arm64/nv-next branch[2].
>
> The NV support is split in discrete series in order to make things
> easier to review, but you need to have seen them all to somehow
> connect the dots.
Ah, that makes sense. I was only looking at kvmarm/next, and wasn't
really sure what the other branches are up to.
Thanks for telling me, looks like I got a lot more (complex) code to
study. :)
Sincerely,
Wei-Lin Chang
>
> HTH,
>
> M.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-e2h-select
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-next
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2024-12-17 14:23 ` [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Marc Zyngier
2025-01-06 2:19 ` Wei-Lin Chang
@ 2025-01-26 15:25 ` Volodymyr Babchuk
2025-01-27 17:15 ` Marc Zyngier
1 sibling, 1 reply; 27+ messages in thread
From: Volodymyr Babchuk @ 2025-01-26 15:25 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, Dmytro Terletskyi
Hi Marc,
Thank you for these patches. We (myself and Dmytro Terletskyi) are
trying to use this series to launch up Xen on Amazon Graviton 4 platform.
Graviton 4 is built on Neoverse V2 cores and does **not** support
FEAT_ECV. Looks like we have found issue in this particular patch on
this particular setup.
Marc Zyngier <maz@kernel.org> writes:
> Emulating the timers with FEAT_NV2 is a bit odd, as the timers
> can be reconfigured behind our back without the hypervisor even
> noticing. In the VHE case, that's an actual regression in the
> architecture...
>
> Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
> arch/arm64/kvm/arm.c | 3 +++
> include/kvm/arm_arch_timer.h | 1 +
> 3 files changed, 48 insertions(+)
>
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 1215df5904185..ee5f732fbbece 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
> kvm_timer_blocking(vcpu);
> }
>
> +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
> +{
> + /*
> + * When NV2 is on, guest hypervisors have their EL1 timer register
> + * accesses redirected to the VNCR page. Any guest action taken on
> + * the timer is postponed until the next exit, leading to a very
> + * poor quality of emulation.
> + */
> + if (!is_hyp_ctxt(vcpu))
> + return;
> +
> + if (!vcpu_el2_e2h_is_set(vcpu)) {
> + /*
> + * A non-VHE guest hypervisor doesn't have any direct access
> + * to its timers: the EL2 registers trap (and the HW is
> + * fully emulated), while the EL0 registers access memory
> + * despite the access being notionally direct. Boo.
> + *
> + * We update the hardware timer registers with the
> + * latest value written by the guest to the VNCR page
> + * and let the hardware take care of the rest.
> + */
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
> + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
Here you are overwriting trapped/emulated state of EL2 vtimer with EL0
vtimer, which renders all writes to EL2 timer registers useless.
This is the behavior we observed:
1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
kvm_arm_timer_write_sysreg().
2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
3. timer_restore_state() updates real CNTP_CVAL_EL0 with value from
__vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
(so far so good)
4. kvm_timer_sync_nested() is called and it updates real CNTP_CVAL_EL0
with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), overwriting value that we got
from Xen.
The same stands for other hypervisor timer registers of course.
I am wondering, what is the correct fix for this issue?
Also, we are observing issues with timers in Dom0, which seems related
to this, but we didn't pinpoint exact problem yet.
--
WBR, Volodymyr
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2025-01-26 15:25 ` Volodymyr Babchuk
@ 2025-01-27 17:15 ` Marc Zyngier
2025-01-28 11:29 ` Volodymyr Babchuk
0 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2025-01-27 17:15 UTC (permalink / raw)
To: Volodymyr Babchuk
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, Dmytro Terletskyi,
Wei-Lin Chang
+ Wei-Lin Chang, who spotted something similar 3 weeks ago, that I
didn't manage to investigate in time.
On Sun, 26 Jan 2025 15:25:39 +0000,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Marc,
>
> Thank you for these patches. We (myself and Dmytro Terletskyi) are
> trying to use this series to launch up Xen on Amazon Graviton 4 platform.
> Graviton 4 is built on Neoverse V2 cores and does **not** support
> FEAT_ECV. Looks like we have found issue in this particular patch on
> this particular setup.
>
> Marc Zyngier <maz@kernel.org> writes:
>
> > Emulating the timers with FEAT_NV2 is a bit odd, as the timers
> > can be reconfigured behind our back without the hypervisor even
> > noticing. In the VHE case, that's an actual regression in the
> > architecture...
> >
> > Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
> > arch/arm64/kvm/arm.c | 3 +++
> > include/kvm/arm_arch_timer.h | 1 +
> > 3 files changed, 48 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 1215df5904185..ee5f732fbbece 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
> > kvm_timer_blocking(vcpu);
> > }
> >
> > +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
> > +{
> > + /*
> > + * When NV2 is on, guest hypervisors have their EL1 timer register
> > + * accesses redirected to the VNCR page. Any guest action taken on
> > + * the timer is postponed until the next exit, leading to a very
> > + * poor quality of emulation.
> > + */
> > + if (!is_hyp_ctxt(vcpu))
> > + return;
> > +
> > + if (!vcpu_el2_e2h_is_set(vcpu)) {
> > + /*
> > + * A non-VHE guest hypervisor doesn't have any direct access
> > + * to its timers: the EL2 registers trap (and the HW is
> > + * fully emulated), while the EL0 registers access memory
> > + * despite the access being notionally direct. Boo.
> > + *
> > + * We update the hardware timer registers with the
> > + * latest value written by the guest to the VNCR page
> > + * and let the hardware take care of the rest.
> > + */
> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
>
>
> Here you are overwriting trapped/emulated state of EL2 vtimer with EL0
> vtimer, which renders all writes to EL2 timer registers useless.
>
> This is the behavior we observed:
>
> 1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
> kvm_arm_timer_write_sysreg().
>
> 2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>
> 3. timer_restore_state() updates real CNTP_CVAL_EL0 with value from
> __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>
> (so far so good)
>
> 4. kvm_timer_sync_nested() is called and it updates real CNTP_CVAL_EL0
> with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), overwriting value that we got
> from Xen.
>
> The same stands for other hypervisor timer registers of course.
>
> I am wondering, what is the correct fix for this issue?
>
> Also, we are observing issues with timers in Dom0, which seems related
> to this, but we didn't pinpoint exact problem yet.
Thanks for the great debug above, much appreciated.
As Wei-Lin pointed out in their email[1], there is a copious amount of
nonsense here. This is due to leftovers from the mix of NV+NV2 that
KVM was initially trying to handle before switching to NV2 only.
The whole VHE vs nVHE makes no sense at all, and both should have the
same behaviour. The only difference is around what gets trapped, and
what doesn't.
Finally, this crap is masking a subtle bug in timer_emulate(), where
we return too early on updating the IRQ state, hence failing to
publish the interrupt state.
Could you please give the hack below a go with your setup and report
whether it solves this particular issue?
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 0e29958e20187..56f4905cdb859 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -471,10 +471,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
trace_kvm_timer_emulate(ctx, should_fire);
- if (should_fire != ctx->irq.level) {
+ if (should_fire != ctx->irq.level)
kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
- return;
- }
kvm_timer_update_status(ctx, should_fire);
@@ -976,31 +974,21 @@ void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
* which allows trapping of the timer registers even with NV2.
* Still, this is still worse than FEAT_NV on its own. Meh.
*/
- if (!vcpu_el2_e2h_is_set(vcpu)) {
- if (cpus_have_final_cap(ARM64_HAS_ECV))
- return;
-
- /*
- * A non-VHE guest hypervisor doesn't have any direct access
- * to its timers: the EL2 registers trap (and the HW is
- * fully emulated), while the EL0 registers access memory
- * despite the access being notionally direct. Boo.
- *
- * We update the hardware timer registers with the
- * latest value written by the guest to the VNCR page
- * and let the hardware take care of the rest.
- */
- write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
- write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
- write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
- write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
- } else {
+ if (!cpus_have_final_cap(ARM64_HAS_ECV)) {
/*
* For a VHE guest hypervisor, the EL2 state is directly
- * stored in the host EL1 timers, while the emulated EL0
+ * stored in the host EL1 timers, while the emulated EL1
* state is stored in the VNCR page. The latter could have
* been updated behind our back, and we must reset the
* emulation of the timers.
+ *
+ * A non-VHE guest hypervisor doesn't have any direct access
+ * to its timers: the EL2 registers trap despite being
+ * notionally direct (we use the EL1 HW, as for VHE), while
+ * the EL1 registers access memory.
+ *
+ * In both cases, process the emulated timers on each guest
+ * exit. Boo.
*/
struct timer_map map;
get_timer_map(vcpu, &map);
Thanks,
M.
[1] https://lore.kernel.org/r/fqiqfjzwpgbzdtouu2pwqlu7llhnf5lmy4hzv5vo6ph4v3vyls@jdcfy3fjjc5k
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2025-01-27 17:15 ` Marc Zyngier
@ 2025-01-28 11:29 ` Volodymyr Babchuk
2025-01-28 12:17 ` Marc Zyngier
0 siblings, 1 reply; 27+ messages in thread
From: Volodymyr Babchuk @ 2025-01-28 11:29 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, Dmytro Terletskyi,
Wei-Lin Chang
Hi Marc,
Marc Zyngier <maz@kernel.org> writes:
> + Wei-Lin Chang, who spotted something similar 3 weeks ago, that I
> didn't manage to investigate in time.
>
> On Sun, 26 Jan 2025 15:25:39 +0000,
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>>
>>
>> Hi Marc,
>>
>> Thank you for these patches. We (myself and Dmytro Terletskyi) are
>> trying to use this series to launch up Xen on Amazon Graviton 4 platform.
>> Graviton 4 is built on Neoverse V2 cores and does **not** support
>> FEAT_ECV. Looks like we have found issue in this particular patch on
>> this particular setup.
>>
>> Marc Zyngier <maz@kernel.org> writes:
>>
>> > Emulating the timers with FEAT_NV2 is a bit odd, as the timers
>> > can be reconfigured behind our back without the hypervisor even
>> > noticing. In the VHE case, that's an actual regression in the
>> > architecture...
>> >
>> > Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
>> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>> > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> > ---
>> > arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
>> > arch/arm64/kvm/arm.c | 3 +++
>> > include/kvm/arm_arch_timer.h | 1 +
>> > 3 files changed, 48 insertions(+)
>> >
>> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>> > index 1215df5904185..ee5f732fbbece 100644
>> > --- a/arch/arm64/kvm/arch_timer.c
>> > +++ b/arch/arm64/kvm/arch_timer.c
>> > @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>> > kvm_timer_blocking(vcpu);
>> > }
>> >
>> > +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
>> > +{
>> > + /*
>> > + * When NV2 is on, guest hypervisors have their EL1 timer register
>> > + * accesses redirected to the VNCR page. Any guest action taken on
>> > + * the timer is postponed until the next exit, leading to a very
>> > + * poor quality of emulation.
>> > + */
>> > + if (!is_hyp_ctxt(vcpu))
>> > + return;
>> > +
>> > + if (!vcpu_el2_e2h_is_set(vcpu)) {
>> > + /*
>> > + * A non-VHE guest hypervisor doesn't have any direct access
>> > + * to its timers: the EL2 registers trap (and the HW is
>> > + * fully emulated), while the EL0 registers access memory
>> > + * despite the access being notionally direct. Boo.
>> > + *
>> > + * We update the hardware timer registers with the
>> > + * latest value written by the guest to the VNCR page
>> > + * and let the hardware take care of the rest.
>> > + */
>> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
>> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
>> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
>> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
>>
>>
>> Here you are overwriting trapped/emulated state of EL2 vtimer with EL0
>> vtimer, which renders all writes to EL2 timer registers useless.
>>
>> This is the behavior we observed:
>>
>> 1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
>> kvm_arm_timer_write_sysreg().
>>
>> 2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>>
>> 3. timer_restore_state() updates real CNTP_CVAL_EL0 with value from
>> __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>>
>> (so far so good)
>>
>> 4. kvm_timer_sync_nested() is called and it updates real CNTP_CVAL_EL0
>> with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), overwriting value that we got
>> from Xen.
>>
>> The same stands for other hypervisor timer registers of course.
>>
>> I am wondering, what is the correct fix for this issue?
>>
>> Also, we are observing issues with timers in Dom0, which seems related
>> to this, but we didn't pinpoint exact problem yet.
>
> Thanks for the great debug above, much appreciated.
>
> As Wei-Lin pointed out in their email[1], there is a copious amount of
> nonsense here. This is due to leftovers from the mix of NV+NV2 that
> KVM was initially trying to handle before switching to NV2 only.
>
> The whole VHE vs nVHE makes no sense at all, and both should have the
> same behaviour. The only difference is around what gets trapped, and
> what doesn't.
>
> Finally, this crap is masking a subtle bug in timer_emulate(), where
> we return too early on updating the IRQ state, hence failing to
> publish the interrupt state.
>
> Could you please give the hack below a go with your setup and report
> whether it solves this particular issue?
Thanks! This is exactly what we needed. Your suggested changes fixed
both issues: in Xen and in Dom0.
[...]
--
WBR, Volodymyr
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2025-01-28 11:29 ` Volodymyr Babchuk
@ 2025-01-28 12:17 ` Marc Zyngier
2025-01-28 13:56 ` Volodymyr Babchuk
0 siblings, 1 reply; 27+ messages in thread
From: Marc Zyngier @ 2025-01-28 12:17 UTC (permalink / raw)
To: Volodymyr Babchuk
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, Dmytro Terletskyi,
Wei-Lin Chang
On Tue, 28 Jan 2025 11:29:18 +0000,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>
>
> Hi Marc,
>
>
> Marc Zyngier <maz@kernel.org> writes:
>
> > + Wei-Lin Chang, who spotted something similar 3 weeks ago, that I
> > didn't manage to investigate in time.
> >
> > On Sun, 26 Jan 2025 15:25:39 +0000,
> > Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
> >>
> >>
> >> Hi Marc,
> >>
> >> Thank you for these patches. We (myself and Dmytro Terletskyi) are
> >> trying to use this series to launch up Xen on Amazon Graviton 4 platform.
> >> Graviton 4 is built on Neoverse V2 cores and does **not** support
> >> FEAT_ECV. Looks like we have found issue in this particular patch on
> >> this particular setup.
> >>
> >> Marc Zyngier <maz@kernel.org> writes:
> >>
> >> > Emulating the timers with FEAT_NV2 is a bit odd, as the timers
> >> > can be reconfigured behind our back without the hypervisor even
> >> > noticing. In the VHE case, that's an actual regression in the
> >> > architecture...
> >> >
> >> > Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
> >> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> >> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> >> > ---
> >> > arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
> >> > arch/arm64/kvm/arm.c | 3 +++
> >> > include/kvm/arm_arch_timer.h | 1 +
> >> > 3 files changed, 48 insertions(+)
> >> >
> >> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> >> > index 1215df5904185..ee5f732fbbece 100644
> >> > --- a/arch/arm64/kvm/arch_timer.c
> >> > +++ b/arch/arm64/kvm/arch_timer.c
> >> > @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
> >> > kvm_timer_blocking(vcpu);
> >> > }
> >> >
> >> > +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
> >> > +{
> >> > + /*
> >> > + * When NV2 is on, guest hypervisors have their EL1 timer register
> >> > + * accesses redirected to the VNCR page. Any guest action taken on
> >> > + * the timer is postponed until the next exit, leading to a very
> >> > + * poor quality of emulation.
> >> > + */
> >> > + if (!is_hyp_ctxt(vcpu))
> >> > + return;
> >> > +
> >> > + if (!vcpu_el2_e2h_is_set(vcpu)) {
> >> > + /*
> >> > + * A non-VHE guest hypervisor doesn't have any direct access
> >> > + * to its timers: the EL2 registers trap (and the HW is
> >> > + * fully emulated), while the EL0 registers access memory
> >> > + * despite the access being notionally direct. Boo.
> >> > + *
> >> > + * We update the hardware timer registers with the
> >> > + * latest value written by the guest to the VNCR page
> >> > + * and let the hardware take care of the rest.
> >> > + */
> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
> >>
> >>
> >> Here you are overwriting trapped/emulated state of EL2 vtimer with EL0
> >> vtimer, which renders all writes to EL2 timer registers useless.
> >>
> >> This is the behavior we observed:
> >>
> >> 1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
> >> kvm_arm_timer_write_sysreg().
> >>
> >> 2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
> >>
> >> 3. timer_restore_state() updates real CNTP_CVAL_EL0 with value from
> >> __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
> >>
> >> (so far so good)
> >>
> >> 4. kvm_timer_sync_nested() is called and it updates real CNTP_CVAL_EL0
> >> with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), overwriting value that we got
> >> from Xen.
> >>
> >> The same stands for other hypervisor timer registers of course.
> >>
> >> I am wondering, what is the correct fix for this issue?
> >>
> >> Also, we are observing issues with timers in Dom0, which seems related
> >> to this, but we didn't pinpoint exact problem yet.
> >
> > Thanks for the great debug above, much appreciated.
> >
> > As Wei-Lin pointed out in their email[1], there is a copious amount of
> > nonsense here. This is due to leftovers from the mix of NV+NV2 that
> > KVM was initially trying to handle before switching to NV2 only.
> >
> > The whole VHE vs nVHE makes no sense at all, and both should have the
> > same behaviour. The only difference is around what gets trapped, and
> > what doesn't.
> >
> > Finally, this crap is masking a subtle bug in timer_emulate(), where
> > we return too early on updating the IRQ state, hence failing to
> > publish the interrupt state.
> >
> > Could you please give the hack below a go with your setup and report
> > whether it solves this particular issue?
>
> Thanks! This is exactly what we needed. Your suggested changes fixed
> both issues: in Xen and in Dom0.
Great, thanks for letting me know.
I'll shortly post the fixes on the list, and would appreciate it if
you could reply with a Tested-by: tag.
Thanks again,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
2025-01-28 12:17 ` Marc Zyngier
@ 2025-01-28 13:56 ` Volodymyr Babchuk
0 siblings, 0 replies; 27+ messages in thread
From: Volodymyr Babchuk @ 2025-01-28 13:56 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Bjorn Andersson, Christoffer Dall,
Ganapatrao Kulkarni, Chase Conklin, Eric Auger, Dmytro Terletskyi,
Wei-Lin Chang
Marc Zyngier <maz@kernel.org> writes:
> On Tue, 28 Jan 2025 11:29:18 +0000,
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>>
>>
>> Hi Marc,
>>
>>
>> Marc Zyngier <maz@kernel.org> writes:
>>
>> > + Wei-Lin Chang, who spotted something similar 3 weeks ago, that I
>> > didn't manage to investigate in time.
>> >
>> > On Sun, 26 Jan 2025 15:25:39 +0000,
>> > Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>> >>
>> >>
>> >> Hi Marc,
>> >>
>> >> Thank you for these patches. We (myself and Dmytro Terletskyi) are
>> >> trying to use this series to launch up Xen on Amazon Graviton 4 platform.
>> >> Graviton 4 is built on Neoverse V2 cores and does **not** support
>> >> FEAT_ECV. Looks like we have found issue in this particular patch on
>> >> this particular setup.
>> >>
>> >> Marc Zyngier <maz@kernel.org> writes:
>> >>
>> >> > Emulating the timers with FEAT_NV2 is a bit odd, as the timers
>> >> > can be reconfigured behind our back without the hypervisor even
>> >> > noticing. In the VHE case, that's an actual regression in the
>> >> > architecture...
>> >> >
>> >> > Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
>> >> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
>> >> > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> >> > ---
>> >> > arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++
>> >> > arch/arm64/kvm/arm.c | 3 +++
>> >> > include/kvm/arm_arch_timer.h | 1 +
>> >> > 3 files changed, 48 insertions(+)
>> >> >
>> >> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>> >> > index 1215df5904185..ee5f732fbbece 100644
>> >> > --- a/arch/arm64/kvm/arch_timer.c
>> >> > +++ b/arch/arm64/kvm/arch_timer.c
>> >> > @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>> >> > kvm_timer_blocking(vcpu);
>> >> > }
>> >> >
>> >> > +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
>> >> > +{
>> >> > + /*
>> >> > + * When NV2 is on, guest hypervisors have their EL1 timer register
>> >> > + * accesses redirected to the VNCR page. Any guest action taken on
>> >> > + * the timer is postponed until the next exit, leading to a very
>> >> > + * poor quality of emulation.
>> >> > + */
>> >> > + if (!is_hyp_ctxt(vcpu))
>> >> > + return;
>> >> > +
>> >> > + if (!vcpu_el2_e2h_is_set(vcpu)) {
>> >> > + /*
>> >> > + * A non-VHE guest hypervisor doesn't have any direct access
>> >> > + * to its timers: the EL2 registers trap (and the HW is
>> >> > + * fully emulated), while the EL0 registers access memory
>> >> > + * despite the access being notionally direct. Boo.
>> >> > + *
>> >> > + * We update the hardware timer registers with the
>> >> > + * latest value written by the guest to the VNCR page
>> >> > + * and let the hardware take care of the rest.
>> >> > + */
>> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL);
>> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
>> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL);
>> >> > + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
>> >>
>> >>
>> >> Here you are overwriting trapped/emulated state of EL2 vtimer with EL0
>> >> vtimer, which renders all writes to EL2 timer registers useless.
>> >>
>> >> This is the behavior we observed:
>> >>
>> >> 1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
>> >> kvm_arm_timer_write_sysreg().
>> >>
>> >> 2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>> >>
>> >> 3. timer_restore_state() updates real CNTP_CVAL_EL0 with value from
>> >> __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2)
>> >>
>> >> (so far so good)
>> >>
>> >> 4. kvm_timer_sync_nested() is called and it updates real CNTP_CVAL_EL0
>> >> with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), overwriting value that we got
>> >> from Xen.
>> >>
>> >> The same stands for other hypervisor timer registers of course.
>> >>
>> >> I am wondering, what is the correct fix for this issue?
>> >>
>> >> Also, we are observing issues with timers in Dom0, which seems related
>> >> to this, but we didn't pinpoint exact problem yet.
>> >
>> > Thanks for the great debug above, much appreciated.
>> >
>> > As Wei-Lin pointed out in their email[1], there is a copious amount of
>> > nonsense here. This is due to leftovers from the mix of NV+NV2 that
>> > KVM was initially trying to handle before switching to NV2 only.
>> >
>> > The whole VHE vs nVHE makes no sense at all, and both should have the
>> > same behaviour. The only difference is around what gets trapped, and
>> > what doesn't.
>> >
>> > Finally, this crap is masking a subtle bug in timer_emulate(), where
>> > we return too early on updating the IRQ state, hence failing to
>> > publish the interrupt state.
>> >
>> > Could you please give the hack below a go with your setup and report
>> > whether it solves this particular issue?
>>
>> Thanks! This is exactly what we needed. Your suggested changes fixed
>> both issues: in Xen and in Dom0.
>
> Great, thanks for letting me know.
>
> I'll shortly post the fixes on the list, and would appreciate it if
> you could reply with a Tested-by: tag.
>
Sure. I am not subscribed to the list. So could you please CC me and
Dmytro?
--
WBR, Volodymyr
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2025-01-28 13:57 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-17 14:23 [PATCH v2 00/12] KVM: arm64: Add NV timer support Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 01/12] KVM: arm64: nv: Add handling of EL2-specific timer registers Marc Zyngier
2024-12-21 1:38 ` Oliver Upton
2024-12-21 9:57 ` Marc Zyngier
2024-12-21 21:58 ` Oliver Upton
2024-12-17 14:23 ` [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Marc Zyngier
2025-01-06 2:19 ` Wei-Lin Chang
2025-01-26 15:25 ` Volodymyr Babchuk
2025-01-27 17:15 ` Marc Zyngier
2025-01-28 11:29 ` Volodymyr Babchuk
2025-01-28 12:17 ` Marc Zyngier
2025-01-28 13:56 ` Volodymyr Babchuk
2024-12-17 14:23 ` [PATCH v2 03/12] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 04/12] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 05/12] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV in use Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 06/12] KVM: arm64: nv: Accelerate EL0 counter accesses from hypervisor context Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 07/12] KVM: arm64: Handle counter access early in non-HYP context Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 08/12] KVM: arm64: nv: Add trap routing for CNTHCTL_EL2.EL1{NVPCT,NVVCT,TVT,TVCT} Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 09/12] KVM: arm64: nv: Propagate CNTHCTL_EL2.EL1NV{P,V}CT bits Marc Zyngier
2025-01-06 2:33 ` Wei-Lin Chang
2025-01-17 15:19 ` Marc Zyngier
2025-01-21 6:04 ` Wei-Lin Chang
2024-12-17 14:23 ` [PATCH v2 10/12] KVM: arm64: nv: Sanitise CNTHCTL_EL2 Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 11/12] KVM: arm64: Work around x1e's CNTVOFF_EL2 bogosity Marc Zyngier
2024-12-17 14:23 ` [PATCH v2 12/12] KVM: arm64: nv: Document EL2 timer API Marc Zyngier
2025-01-02 19:15 ` [PATCH v2 00/12] KVM: arm64: Add NV timer support Oliver Upton
2025-01-02 19:25 ` Marc Zyngier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).