linux-arm-kernel.lists.infradead.org archive mirror
From: Marc Zyngier <maz@kernel.org>
To: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: "kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Bjorn Andersson <andersson@kernel.org>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
	Chase Conklin <chase.conklin@arm.com>,
	Eric Auger <eauger@redhat.com>,
	Dmytro Terletskyi <Dmytro_Terletskyi@epam.com>,
	Wei-Lin Chang <r09922117@csie.ntu.edu.tw>
Subject: Re: [PATCH v2 02/12] KVM: arm64: nv: Sync nested timer state with FEAT_NV2
Date: Mon, 27 Jan 2025 17:15:57 +0000
Message-ID: <86h65kuqia.wl-maz@kernel.org>
In-Reply-To: <87frl51tse.fsf@epam.com>

+ Wei-Lin Chang, who spotted something similar 3 weeks ago, that I
didn't manage to investigate in time.

On Sun, 26 Jan 2025 15:25:39 +0000,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
> 
> 
> Hi Marc,
> 
> Thank you for these patches. We (myself and Dmytro Terletskyi) are
> trying to use this series to launch Xen on the Amazon Graviton 4
> platform. Graviton 4 is built on Neoverse V2 cores and does **not**
> support FEAT_ECV. It looks like we have found an issue in this
> particular patch on this particular setup.
> 
> Marc Zyngier <maz@kernel.org> writes:
> 
> > Emulating the timers with FEAT_NV2 is a bit odd, as the timers
> > can be reconfigured behind our back without the hypervisor even
> > noticing. In the VHE case, that's an actual regression in the
> > architecture...
> >
> > Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> >  arch/arm64/kvm/arch_timer.c  | 44 ++++++++++++++++++++++++++++++++++++
> >  arch/arm64/kvm/arm.c         |  3 +++
> >  include/kvm/arm_arch_timer.h |  1 +
> >  3 files changed, 48 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 1215df5904185..ee5f732fbbece 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -905,6 +905,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
> >  		kvm_timer_blocking(vcpu);
> >  }
> >  
> > +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
> > +{
> > +	/*
> > +	 * When NV2 is on, guest hypervisors have their EL1 timer register
> > +	 * accesses redirected to the VNCR page. Any guest action taken on
> > +	 * the timer is postponed until the next exit, leading to a very
> > +	 * poor quality of emulation.
> > +	 */
> > +	if (!is_hyp_ctxt(vcpu))
> > +		return;
> > +
> > +	if (!vcpu_el2_e2h_is_set(vcpu)) {
> > +		/*
> > +		 * A non-VHE guest hypervisor doesn't have any direct access
> > +		 * to its timers: the EL2 registers trap (and the HW is
> > +		 * fully emulated), while the EL0 registers access memory
> > +		 * despite the access being notionally direct. Boo.
> > +		 *
> > +		 * We update the hardware timer registers with the
> > +		 * latest value written by the guest to the VNCR page
> > +		 * and let the hardware take care of the rest.
> > +		 */
> > +		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0),  SYS_CNTV_CTL);
> > +		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
> > +		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0),  SYS_CNTP_CTL);
> > +		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
> 
> 
> Here you are overwriting the trapped/emulated state of the EL2 vtimer
> with the EL0 vtimer, which renders all writes to the EL2 timer
> registers useless.
> 
> This is the behavior we observed:
> 
>  1. Xen writes to CNTHP_CVAL_EL2, which is trapped and handled in
>     kvm_arm_timer_write_sysreg().
> 
>  2. timer_set_cval() updates __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2).
> 
>  3. timer_restore_state() updates the real CNTP_CVAL_EL0 with the
>     value from __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2).
> 
>  (so far so good)
> 
>  4. kvm_timer_sync_nested() is called and it updates the real
>     CNTP_CVAL_EL0 with __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0),
>     overwriting the value that we got from Xen.
> 
> The same stands for other hypervisor timer registers of course.
> 
> I am wondering, what is the correct fix for this issue?
> 
> Also, we are observing issues with timers in Dom0, which seem related
> to this, but we haven't pinpointed the exact problem yet.

Thanks for the great debug above, much appreciated.
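
For readers following along, the four steps reported above can be
modelled in a few lines of user-space C. This is only a toy sketch, not
kernel code: struct toy_vcpu and every toy_* function are hypothetical
names, and each field merely stands in for the piece of state named in
its comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the state involved; not the KVM structures. */
struct toy_vcpu {
	uint64_t vncr_cntp_cval;  /* models __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0) */
	uint64_t emul_cnthp_cval; /* models __vcpu_sys_reg(vcpu, CNTHP_CVAL_EL2) */
	uint64_t hw_cntp_cval;    /* models the real CNTP_CVAL_EL0 register */
};

/* Steps 1-2: the trapped CNTHP_CVAL_EL2 write lands in the emulated state. */
void toy_trap_write_cnthp_cval(struct toy_vcpu *v, uint64_t val)
{
	assert(v);
	v->emul_cnthp_cval = val;
}

/* Step 3: restoring the timer loads the hardware from the EL2 value. */
void toy_restore_state(struct toy_vcpu *v)
{
	assert(v);
	v->hw_cntp_cval = v->emul_cnthp_cval;
}

/* Step 4 (non-VHE path of the quoted patch): reload from the VNCR copy. */
void toy_sync_nested_old(struct toy_vcpu *v)
{
	assert(v);
	v->hw_cntp_cval = v->vncr_cntp_cval;
}

/* Runs steps 1-4 and returns what the modelled hardware register holds. */
uint64_t toy_run_sequence(uint64_t vncr_val, uint64_t xen_val, int do_old_sync)
{
	struct toy_vcpu v = { .vncr_cntp_cval = vncr_val };

	toy_trap_write_cnthp_cval(&v, xen_val);
	toy_restore_state(&v);
	if (do_old_sync)
		toy_sync_nested_old(&v);
	return v.hw_cntp_cval;
}
```

Calling toy_run_sequence(100, 5000, 1) returns the stale VNCR value
rather than the one written by the guest hypervisor, which is the
overwrite described in step 4; with do_old_sync set to 0 the guest's
value survives.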

As Wei-Lin pointed out in their email[1], there is a copious amount of
nonsense here. This is due to leftovers from the mix of NV+NV2 that
KVM was initially trying to handle before switching to NV2 only.

The whole VHE vs nVHE split makes no sense at all, and both should have
the same behaviour. The only difference is around what gets trapped,
and what doesn't.

Finally, this crap is masking a subtle bug in timer_emulate(), where
we return right after updating the IRQ level and never reach
kvm_timer_update_status(), hence failing to publish the interrupt
state.
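
The effect of that early return can be illustrated with a small
user-space model. This is a sketch under stated assumptions: the toy_*
names are hypothetical, published_status stands in for whatever
kvm_timer_update_status() makes visible to the guest, and the remainder
of timer_emulate() that the hunk does not show is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one emulated timer context. */
struct toy_timer {
	bool irq_level;        /* models ctx->irq.level */
	bool published_status; /* models the state published to the guest */
};

void toy_update_irq(struct toy_timer *t, bool level)
{
	t->irq_level = level;
}

void toy_update_status(struct toy_timer *t, bool level)
{
	t->published_status = level;
}

/* Before the change: a level change returns before the status update. */
void toy_emulate_old(struct toy_timer *t, bool should_fire)
{
	assert(t);
	if (should_fire != t->irq_level) {
		toy_update_irq(t, should_fire);
		return; /* status is never published on this path */
	}
	toy_update_status(t, should_fire);
}

/* After the change: the status is published whether or not the level moved. */
void toy_emulate_new(struct toy_timer *t, bool should_fire)
{
	assert(t);
	if (should_fire != t->irq_level)
		toy_update_irq(t, should_fire);
	toy_update_status(t, should_fire);
}

/* Fires an idle timer once and reports the status the guest would see. */
bool toy_status_after_fire(bool use_new)
{
	struct toy_timer t = { false, false };

	if (use_new)
		toy_emulate_new(&t, true);
	else
		toy_emulate_old(&t, true);
	return t.published_status;
}
```

In this model the old flow raises the IRQ level but leaves the
published status clear on the first firing, while the new flow updates
both.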

Could you please give the hack below a go with your setup and report
whether it solves this particular issue?

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 0e29958e20187..56f4905cdb859 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -471,10 +471,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
 
 	trace_kvm_timer_emulate(ctx, should_fire);
 
-	if (should_fire != ctx->irq.level) {
+	if (should_fire != ctx->irq.level)
 		kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
-		return;
-	}
 
 	kvm_timer_update_status(ctx, should_fire);
 
@@ -976,31 +974,21 @@ void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
 	 * which allows trapping of the timer registers even with NV2.
 	 * Still, this is still worse than FEAT_NV on its own. Meh.
 	 */
-	if (!vcpu_el2_e2h_is_set(vcpu)) {
-		if (cpus_have_final_cap(ARM64_HAS_ECV))
-			return;
-
-		/*
-		 * A non-VHE guest hypervisor doesn't have any direct access
-		 * to its timers: the EL2 registers trap (and the HW is
-		 * fully emulated), while the EL0 registers access memory
-		 * despite the access being notionally direct. Boo.
-		 *
-		 * We update the hardware timer registers with the
-		 * latest value written by the guest to the VNCR page
-		 * and let the hardware take care of the rest.
-		 */
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0),  SYS_CNTV_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0),  SYS_CNTP_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
-	} else {
+	if (!cpus_have_final_cap(ARM64_HAS_ECV)) {
 		/*
 		 * For a VHE guest hypervisor, the EL2 state is directly
-		 * stored in the host EL1 timers, while the emulated EL0
+		 * stored in the host EL1 timers, while the emulated EL1
 		 * state is stored in the VNCR page. The latter could have
 		 * been updated behind our back, and we must reset the
 		 * emulation of the timers.
+		 *
+		 * A non-VHE guest hypervisor doesn't have any direct access
+		 * to its timers: the EL2 registers trap despite being
+		 * notionally direct (we use the EL1 HW, as for VHE), while
+		 * the EL1 registers access memory.
+		 *
+		 * In both cases, process the emulated timers on each guest
+		 * exit. Boo.
 		 */
 		struct timer_map map;
 		get_timer_map(vcpu, &map);

Thanks,

	M.

[1] https://lore.kernel.org/r/fqiqfjzwpgbzdtouu2pwqlu7llhnf5lmy4hzv5vo6ph4v3vyls@jdcfy3fjjc5k

-- 
Without deviation from the norm, progress is not possible.


