Re: [PATCH v2 06/27] KVM: arm64: nv: Honor SError exception routing / masking

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oliver Upton <oliver.upton@linux.dev>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH v2 06/27] KVM: arm64: nv: Honor SError exception routing / masking
Date: Tue, 24 Jun 2025 04:44:40 -0700	[thread overview]
Message-ID: <aFqPqBqwUdwBc-Ub@linux.dev> (raw)
In-Reply-To: <86ecvdcqw4.wl-maz@kernel.org>

On Sat, Jun 21, 2025 at 11:47:55AM +0100, Marc Zyngier wrote:
> On Tue, 17 Jun 2025 00:02:47 +0100,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > To date KVM has used HCR_EL2.VSE to track the state of a pending SError
> > for the guest. With this bit set, hardware respects the EL1 exception
> > routing / masking rules and injects the vSError when appropriate.
> > 
> > This isn't correct for NV guests as hardware is oblivious to vEL2's
> > intentions for SErrors. Better yet, with FEAT_NV2 the guest can change
> > the routing behind our back as HCR_EL2 is redirected to memory. Cope
> > with this mess by:
> > 
> >  - Using a flag (instead of HCR_EL2.VSE) to track the pending SError
> >    state when SErrors are unconditionally masked for the current context
> > 
> >  - Resampling the routing / masking of a pending SError on every guest
> >    entry/exit
> > 
> >  - Emulating exception entry when SError routing implies a translation
> >    regime change
> > 
> > Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> > ---
> >  arch/arm64/include/asm/kvm_emulate.h | 20 +++++++++++++-
> >  arch/arm64/include/asm/kvm_host.h    | 20 +++++++++++---
> >  arch/arm64/include/asm/kvm_nested.h  |  2 ++
> >  arch/arm64/kvm/arm.c                 |  4 +++
> >  arch/arm64/kvm/emulate-nested.c      |  8 ++++++
> >  arch/arm64/kvm/guest.c               | 32 +++++++++++++----------
> >  arch/arm64/kvm/handle_exit.c         |  4 +--
> >  arch/arm64/kvm/hyp/exception.c       |  6 ++++-
> >  arch/arm64/kvm/inject_fault.c        | 39 ++++++++++++++++------------
> >  arch/arm64/kvm/mmu.c                 |  2 +-
> >  arch/arm64/kvm/nested.c              | 36 +++++++++++++++++++++++++
> >  11 files changed, 134 insertions(+), 39 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 1a0d51c74b42..45029dd5e9c7 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -45,7 +45,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
> >  void kvm_skip_instr32(struct kvm_vcpu *vcpu);
> >  
> >  void kvm_inject_undefined(struct kvm_vcpu *vcpu);
> > -void kvm_inject_vabt(struct kvm_vcpu *vcpu);
> > +int kvm_inject_serror_esr(struct kvm_vcpu *vcpu, u64 esr);
> >  int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
> >  void kvm_inject_size_fault(struct kvm_vcpu *vcpu);
> >  
> > @@ -59,12 +59,25 @@ static inline int kvm_inject_sea_iabt(struct kvm_vcpu *vcpu, u64 addr)
> >  	return kvm_inject_sea(vcpu, true, addr);
> >  }
> >  
> > +static inline int kvm_inject_serror(struct kvm_vcpu *vcpu)
> > +{
> > +	/*
> > +	 * ESR_ELx.ISV (later renamed to IDS) indicates whether or not
> > +	 * ESR_ELx.ISS contains IMPLEMENTATION DEFINED syndrome information.
> > +	 *
> > +	 * Set the bit when injecting an SError w/o an ESR to indicate ISS
> > +	 * does not follow the architected format.
> > +	 */
> > +	return kvm_inject_serror_esr(vcpu, ESR_ELx_ISV);
> > +}
> > +
> >  void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
> >  
> >  void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu);
> >  int kvm_inject_nested_sync(struct kvm_vcpu *vcpu, u64 esr_el2);
> >  int kvm_inject_nested_irq(struct kvm_vcpu *vcpu);
> >  int kvm_inject_nested_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
> > +int kvm_inject_nested_serror(struct kvm_vcpu *vcpu, u64 esr);
> >  
> >  static inline void kvm_inject_nested_sve_trap(struct kvm_vcpu *vcpu)
> >  {
> > @@ -205,6 +218,11 @@ static inline bool vcpu_el2_tge_is_set(const struct kvm_vcpu *vcpu)
> >  	return ctxt_sys_reg(&vcpu->arch.ctxt, HCR_EL2) & HCR_TGE;
> >  }
> >  
> > +static inline bool vcpu_el2_amo_is_set(const struct kvm_vcpu *vcpu)
> > +{
> > +	return ctxt_sys_reg(&vcpu->arch.ctxt, HCR_EL2) & HCR_AMO;
> > +}
> > +
> >  static inline bool is_hyp_ctxt(const struct kvm_vcpu *vcpu)
> >  {
> >  	bool e2h, tge;
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 5ccca509dff1..dd7405d676b3 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -817,7 +817,7 @@ struct kvm_vcpu_arch {
> >  	u8 iflags;
> >  
> >  	/* State flags for kernel bookkeeping, unused by the hypervisor code */
> > -	u8 sflags;
> > +	u16 sflags;
> >  
> >  	/*
> >  	 * Don't run the guest (internal implementation need).
> > @@ -953,9 +953,23 @@ struct kvm_vcpu_arch {
> >  		__vcpu_flags_preempt_enable();			\
> >  	} while (0)
> >  
> > +#define __vcpu_test_and_clear_flag(v, flagset, f, m)		\
> > +	({							\
> > +		typeof(v->arch.flagset) set;			\
> > +								\
> > +		__vcpu_flags_preempt_disable();			\
> > +		set = __vcpu_get_flag(v, flagset, f, m);	\
> > +		__vcpu_clear_flag(v, flagset, f, m);		\
> > +		__vcpu_flags_preempt_enable();			\
> 
> I have the feeling that you can drop the preemption manipulation
> here. as __vcpu_clear_flags() already does it.

Agreed, this was previously open-coded which is where I picked up the
explicit preemption guard.

[...]

> > +	if (!serror_pending)
> > +		return 0;
> > +
> > +	if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN) && has_esr)
> > +		return -EINVAL;
> > +
> > +	if (has_esr && (esr & ~ESR_ELx_ISS_MASK))
> > +		return -EINVAL;
> > +
> > +	if (has_esr)
> 
> We should probably consider whether the VM itself has RAS before
> populating an ESR, and return an error to userspace otherwise. Unless
> that's yet another can of worm that we'd rather stay closed?
> 
> I have the ugly feeling that it might be the latter...

Yeah, I'm rather hesitant to change things around here because we've
made it ABI at this point. This was an oversight when we added support
for writable ID registers.

The hardware sucks here because VSESR propagation happens unconditionally
when FEAT_RAS exists...

> > +void kvm_nested_sync_hwstate(struct kvm_vcpu *vcpu)
> > +{
> > +	unsigned long *hcr = vcpu_hcr(vcpu);
> > +
> > +	if (!vcpu_has_nv(vcpu))
> > +		return;
> > +
> > +	/*
> > +	 * We previously decided that an SError was deliverable to the guest.
> > +	 * Reap the pending state from HCR_EL2 and...
> > +	 */
> > +	if (unlikely(__test_and_clear_bit(__ffs(HCR_VSE), hcr)))
> > +		vcpu_set_flag(vcpu, NESTED_SERROR_PENDING);
> > +
> > +	/* Re-attempt SError injection in case the deliverability has changed */
> > +	if (unlikely(vcpu_test_and_clear_flag(vcpu, NESTED_SERROR_PENDING)))
> > +		kvm_inject_serror_esr(vcpu, vcpu_get_vsesr(vcpu));
> 
> Why do we need to re-attempt the injection, given that we already do
> it on flush?

This bit needs a clarifying comment, because it is actually
load-bearing. We need to make sure a pending SError (unmasked by AMO) is
treated as a wakeup condition for WFI emulation. e.g. we may block
indefinitely in the case of:

	sysreg_clear_set(hcr_el2, HCR_EL2_AMO, 0);
	isb();
	local_daif_mask();
	assert(read_sysreg(isr_el1) & ISR_EL1_A); /* Pending SError */
	sysreg_clear_set(hcr_el2, 0, HCR_EL2_AMO);
	isb();
	wfi();
	local_daif_restore();

For that to work we need to set the VSE line and can't check the vCPU
flag itself (SErrors masked by AMO are *not* wakeup conditions).

> Another thing that might be worth considering is how this pending
> state is preserved across save/restore. The GIC provides this state
> implicitly for IRQ/FIQ, but we don't have an external component
> driving SError. Do we need to do anything about this here?

I think that's working as intended, I updated KVM_GET_VCPU_EVENTS to
test the flag in addition to VSE. I'll stick a mention in the changelog
to make that a bit more obvious.

Thanks,
Oliver

next prev parent reply	other threads:[~2025-06-24 11:44 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-16 23:02 [PATCH v2 00/27] KVM: arm64: SCTLR2, DoubleFault2, and NV external abort fixes Oliver Upton
2025-06-16 23:02 ` [PATCH v2 01/27] arm64: Detect FEAT_SCTLR2 Oliver Upton
2025-06-16 23:02 ` [PATCH v2 02/27] arm64: Detect FEAT_DoubleFault2 Oliver Upton
2025-06-16 23:02 ` [PATCH v2 03/27] KVM: arm64: Add helper to identify a nested context Oliver Upton
2025-06-16 23:02 ` [PATCH v2 04/27] KVM: arm64: Treat vCPU with pending SError as runnable Oliver Upton
2025-06-16 23:02 ` [PATCH v2 05/27] KVM: arm64: nv: Respect exception routing rules for SEAs Oliver Upton
2025-06-21  9:51   ` Marc Zyngier
2025-06-16 23:02 ` [PATCH v2 06/27] KVM: arm64: nv: Honor SError exception routing / masking Oliver Upton
2025-06-21 10:47   ` Marc Zyngier
2025-06-24 11:44     ` Oliver Upton [this message]
2025-06-16 23:02 ` [PATCH v2 07/27] KVM: arm64: nv: Add FEAT_RAS vSError sys regs to table Oliver Upton
2025-06-16 23:02 ` [PATCH v2 08/27] KVM: arm64: nv: Use guest hypervisor's vSError state Oliver Upton
2025-06-21 11:09   ` Marc Zyngier
2025-06-16 23:02 ` [PATCH v2 09/27] KVM: arm64: nv: Advertise support for FEAT_RAS Oliver Upton
2025-06-16 23:02 ` [PATCH v2 10/27] KVM: arm64: nv: Describe trap behavior of SCTLR2_EL1 Oliver Upton
2025-06-16 23:02 ` [PATCH v2 11/27] KVM: arm64: Wire up SCTLR2_ELx sysreg descriptors Oliver Upton
2025-06-16 23:02 ` [PATCH v2 12/27] KVM: arm64: Context switch SCTLR2_ELx when advertised to the guest Oliver Upton
2025-06-16 23:02 ` [PATCH v2 13/27] KVM: arm64: Enable SCTLR2 " Oliver Upton
2025-06-16 23:02 ` [PATCH v2 14/27] KVM: arm64: Describe SCTLR2_ELx RESx masks Oliver Upton
2025-06-21 11:34   ` Marc Zyngier
2025-06-16 23:02 ` [PATCH v2 15/27] KVM: arm64: Factor out helper for selecting exception target EL Oliver Upton
2025-06-16 23:02 ` [PATCH v2 16/27] KVM: arm64: nv: Ensure Address size faults affect correct ESR Oliver Upton
2025-06-16 23:02 ` [PATCH v2 17/27] KVM: arm64: Route SEAs to the SError vector when EASE is set Oliver Upton
2025-06-21 11:54   ` Marc Zyngier
2025-06-24  8:12     ` Oliver Upton
2025-06-16 23:02 ` [PATCH v2 18/27] KVM: arm64: nv: Handle effects of HCRX_EL2.TMEA on SError injection Oliver Upton
2025-06-21 13:03   ` Marc Zyngier
2025-06-16 23:03 ` [PATCH v2 19/27] KVM: arm64: Take "masked" SEAs to EL2 when TMEA is set Oliver Upton
2025-06-22  8:39   ` Marc Zyngier
2025-06-16 23:03 ` [PATCH v2 20/27] KVM: arm64: nv: Enable vSErrors when HCRX_EL2.TMEA " Oliver Upton
2025-06-16 23:03 ` [PATCH v2 21/27] KVM: arm64: Advertise support for FEAT_SCTLR2 Oliver Upton
2025-06-16 23:03 ` [PATCH v2 22/27] KVM: arm64: Advertise support for FEAT_DoubleFault2 Oliver Upton
2025-06-16 23:03 ` [PATCH v2 23/27] KVM: arm64: Don't retire MMIO instruction w/ pending (emulated) SError Oliver Upton
2025-06-16 23:03 ` [PATCH v2 24/27] KVM: arm64: selftests: Add basic SError injection test Oliver Upton
2025-06-16 23:03 ` [PATCH v2 25/27] KVM: arm64: selftests: Test SEAs are taken to SError vector when EASE=1 Oliver Upton
2025-06-16 23:03 ` [PATCH v2 26/27] KVM: arm64: selftests: Add SCTLR2_EL1 to get-reg-list Oliver Upton
2025-06-16 23:03 ` [PATCH v2 27/27] KVM: arm64: selftests: Catch up set_id_regs with the kernel Oliver Upton
2025-06-22  9:25 ` [PATCH v2 00/27] KVM: arm64: SCTLR2, DoubleFault2, and NV external abort fixes Marc Zyngier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aFqPqBqwUdwBc-Ub@linux.dev \
    --to=oliver.upton@linux.dev \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.