Re: [PATCH 1/4] KVM: arm64: nv: Respect exception routing rules for SEAs


From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev, Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH 1/4] KVM: arm64: nv: Respect exception routing rules for SEAs
Date: Sat, 31 May 2025 17:23:44 +0100
Message-ID: <87r004eokv.wl-maz@kernel.org>
In-Reply-To: <20250530230623.650888-2-oliver.upton@linux.dev>

On Sat, 31 May 2025 00:06:20 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> Synchronous external aborts are taken to EL2 if ELIsInHost() or
> HCR_EL2.TEA=1. Rework the SEA injection plumbing to respect the imposed
> routing of the guest hypervisor and opportunistically rephrase things to
> make their function a bit more obvious.

nit: this is only true when FEAT_RAS is implemented, which isn't the
case so far when NV is enabled.

>
> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> ---
>  arch/arm64/include/asm/kvm_emulate.h | 15 +++++++++--
>  arch/arm64/kvm/emulate-nested.c      | 20 ++++++++++++++
>  arch/arm64/kvm/guest.c               |  8 ++++--
>  arch/arm64/kvm/inject_fault.c        | 40 +++++++++-------------------
>  arch/arm64/kvm/mmio.c                |  6 ++---
>  arch/arm64/kvm/mmu.c                 | 15 +++--------
>  6 files changed, 57 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index bd020fc28aa9..e0fa8a5a2530 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -46,15 +46,26 @@ void kvm_skip_instr32(struct kvm_vcpu *vcpu);
>  
>  void kvm_inject_undefined(struct kvm_vcpu *vcpu);
>  void kvm_inject_vabt(struct kvm_vcpu *vcpu);
> -void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
> -void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
> +void __kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
> +int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
>  void kvm_inject_size_fault(struct kvm_vcpu *vcpu);
>  
> +static inline int kvm_inject_sea_dabt(struct kvm_vcpu *vcpu, u64 addr)
> +{
> +	return kvm_inject_sea(vcpu, false, addr);
> +}
> +
> +static inline int kvm_inject_sea_iabt(struct kvm_vcpu *vcpu, u64 addr)
> +{
> +	return kvm_inject_sea(vcpu, true, addr);
> +}
> +
>  void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
>  
>  void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu);
>  int kvm_inject_nested_sync(struct kvm_vcpu *vcpu, u64 esr_el2);
>  int kvm_inject_nested_irq(struct kvm_vcpu *vcpu);
> +int kvm_inject_nested_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
>  
>  static inline void kvm_inject_nested_sve_trap(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
> index 3a384e9660b8..3d2f98fdca2f 100644
> --- a/arch/arm64/kvm/emulate-nested.c
> +++ b/arch/arm64/kvm/emulate-nested.c
> @@ -2816,3 +2816,23 @@ int kvm_inject_nested_irq(struct kvm_vcpu *vcpu)
>  	/* esr_el2 value doesn't matter for exits due to irqs. */
>  	return kvm_inject_nested(vcpu, 0, except_type_irq);
>  }
> +
> +int kvm_inject_nested_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr)
> +{
> +	u64 esr;
> +
> +	/*
> +	 * The destination EL is in the same translation regime as the origin;
> +	 * directly inject the SEA.
> +	 */
> +	if (is_hyp_ctxt(vcpu) || !(__vcpu_sys_reg(vcpu, HCR_EL2) & HCR_TEA)) {
> +		__kvm_inject_sea(vcpu, iabt, addr);
> +		return 1;
> +	}

I find this a bit confusing.

Here, we end up *not* injecting a nested exception, but instead
delivering it in context. I think it would be clearer to move this
condition into kvm_inject_sea(), and then make __kvm_inject_sea()
static.

I guess the confusion also stems from the fact that we tend to lump
two things together:

- exception taken from EL2&0 to EL2

- exception taken from EL1&0 to EL2

I would like to make sure that it is the second case we deal with in
emulate-nested.c, and the first one locally.

At which point, you can end up with something like:

static inline bool is_nested_ctxt(struct kvm_vcpu *vcpu)
{
	return vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu);
}

int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr)
{
	lockdep_assert_held(&vcpu->mutex);

	if (is_nested_ctxt(vcpu) && (__vcpu_sys_reg(vcpu, HCR_EL2) & HCR_TEA))
		return kvm_inject_nested_sea(vcpu, iabt, addr);

	__kvm_inject_sea(vcpu, iabt, addr);
	return 1;
}

I'll post a separate patch with the is_nested_ctxt() helper, as it
makes things more readable overall.
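
To make the resulting decision table easy to check outside the kernel,
here is a minimal user-space sketch of the same predicate. It is only a
model: struct model_vcpu, model_is_nested_ctxt() and model_route_sea()
are hypothetical stand-ins for the vcpu state, is_nested_ctxt() and the
branch in kvm_inject_sea(), and MODEL_HCR_TEA assumes TEA is bit 37 of
HCR_EL2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: HCR_EL2.TEA is bit 37, as in the kernel's HCR_TEA. */
#define MODEL_HCR_TEA	(UINT64_C(1) << 37)

/* Hypothetical stand-in for the vcpu state the routing check consults. */
struct model_vcpu {
	bool has_nv;		/* models vcpu_has_nv() */
	bool hyp_ctxt;		/* models is_hyp_ctxt() */
	uint64_t hcr_el2;	/* models the guest hypervisor's HCR_EL2 */
};

enum model_sea_target {
	MODEL_SEA_IN_CONTEXT,	/* delivered locally, __kvm_inject_sea() */
	MODEL_SEA_NESTED_EL2,	/* EL1&0 -> EL2, kvm_inject_nested_sea() */
};

/* True only for an NV guest currently running outside its EL2&0 regime. */
static bool model_is_nested_ctxt(const struct model_vcpu *vcpu)
{
	return vcpu->has_nv && !vcpu->hyp_ctxt;
}

/* Mirrors the single condition in the proposed kvm_inject_sea(). */
static enum model_sea_target model_route_sea(const struct model_vcpu *vcpu)
{
	if (model_is_nested_ctxt(vcpu) && (vcpu->hcr_el2 & MODEL_HCR_TEA))
		return MODEL_SEA_NESTED_EL2;

	return MODEL_SEA_IN_CONTEXT;
}
```

Only the "NV guest, not in hyp context, TEA set" combination reaches
the nested path; every other combination is delivered in context.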

> +
> +	esr = FIELD_PREP(ESR_ELx_EC_MASK,
> +			 iabt ? ESR_ELx_EC_IABT_LOW : ESR_ELx_EC_DABT_LOW);
> +	esr |= ESR_ELx_FSC_EXTABT | ESR_ELx_IL;
> +
> +	return kvm_inject_s2_fault(vcpu, esr);
> +}
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 2196979a24a3..dd5cce0006f3 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -839,6 +839,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
>  	bool serror_pending = events->exception.serror_pending;
>  	bool has_esr = events->exception.serror_has_esr;
>  	bool ext_dabt_pending = events->exception.ext_dabt_pending;
> +	int ret;
>  
>  	if (serror_pending && has_esr) {
>  		if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
> @@ -852,8 +853,11 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
>  		kvm_inject_vabt(vcpu);
>  	}
>  
> -	if (ext_dabt_pending)
> -		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +	if (ext_dabt_pending) {
> +		ret = kvm_inject_sea_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +		if (ret < 0)
> +			return ret;
> +	}
>  
>  	return 0;
>  }
> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index a640e839848e..3e61fa0a721b 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -155,36 +155,23 @@ static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, u32 addr)
>  	vcpu_write_sys_reg(vcpu, far, FAR_EL1);
>  }
>  
> -/**
> - * kvm_inject_dabt - inject a data abort into the guest
> - * @vcpu: The VCPU to receive the data abort
> - * @addr: The address to report in the DFAR
> - *
> - * It is assumed that this code is called from the VCPU thread and that the
> - * VCPU therefore is not currently executing guest code.
> - */
> -void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
> +void __kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr)
>  {
>  	if (vcpu_el1_is_32bit(vcpu))
> -		inject_abt32(vcpu, false, addr);
> +		inject_abt32(vcpu, iabt, addr);
>  	else
> -		inject_abt64(vcpu, false, addr);
> +		inject_abt64(vcpu, iabt, addr);
>  }
>  
> -/**
> - * kvm_inject_pabt - inject a prefetch abort into the guest
> - * @vcpu: The VCPU to receive the prefetch abort
> - * @addr: The address to report in the DFAR
> - *
> - * It is assumed that this code is called from the VCPU thread and that the
> - * VCPU therefore is not currently executing guest code.
> - */
> -void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
> +int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr)
>  {
> -	if (vcpu_el1_is_32bit(vcpu))
> -		inject_abt32(vcpu, true, addr);
> -	else
> -		inject_abt64(vcpu, true, addr);
> +	lockdep_assert_held(&vcpu->mutex);
> +
> +	if (vcpu_has_nv(vcpu))
> +		return kvm_inject_nested_sea(vcpu, iabt, addr);
> +
> +	__kvm_inject_sea(vcpu, iabt, addr);
> +	return 1;
>  }
>  
>  void kvm_inject_size_fault(struct kvm_vcpu *vcpu)
> @@ -194,10 +181,7 @@ void kvm_inject_size_fault(struct kvm_vcpu *vcpu)
>  	addr  = kvm_vcpu_get_fault_ipa(vcpu);
>  	addr |= kvm_vcpu_get_hfar(vcpu) & GENMASK(11, 0);
>  
> -	if (kvm_vcpu_trap_is_iabt(vcpu))
> -		kvm_inject_pabt(vcpu, addr);
> -	else
> -		kvm_inject_dabt(vcpu, addr);
> +	__kvm_inject_sea(vcpu, kvm_vcpu_trap_is_iabt(vcpu), addr);
>  
>  	/*
>  	 * If AArch64 or LPAE, set FSC to 0 to indicate an Address
> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> index ab365e839874..573a6ade2f4e 100644
> --- a/arch/arm64/kvm/mmio.c
> +++ b/arch/arm64/kvm/mmio.c
> @@ -169,10 +169,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  		trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
>  				    kvm_vcpu_get_hfar(vcpu), fault_ipa);
>  
> -		if (vcpu_is_protected(vcpu)) {
> -			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> -			return 1;
> -		}
> +		if (vcpu_is_protected(vcpu))
> +			return kvm_inject_sea_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>  
>  		if (test_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
>  			     &vcpu->kvm->arch.flags)) {
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index e445db2cb4a4..0e0d51d7ab85 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1833,11 +1833,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		if (fault_ipa >= BIT_ULL(VTCR_EL2_IPA(vcpu->arch.hw_mmu->vtcr))) {
>  			fault_ipa |= kvm_vcpu_get_hfar(vcpu) & GENMASK(11, 0);
>  
> -			if (is_iabt)
> -				kvm_inject_pabt(vcpu, fault_ipa);
> -			else
> -				kvm_inject_dabt(vcpu, fault_ipa);
> -			return 1;
> +			return kvm_inject_sea(vcpu, is_iabt, fault_ipa);
>  		}
>  	}
>  
> @@ -1909,8 +1905,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		}
>  
>  		if (kvm_vcpu_abt_iss1tw(vcpu)) {
> -			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> -			ret = 1;
> +			ret = kvm_inject_sea_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>  			goto out_unlock;
>  		}
>  
> @@ -1955,10 +1950,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  	if (ret == 0)
>  		ret = 1;
>  out:
> -	if (ret == -ENOEXEC) {
> -		kvm_inject_pabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> -		ret = 1;
> -	}
> +	if (ret == -ENOEXEC)
> +		ret = kvm_inject_sea_iabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>  out_unlock:
>  	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>  	return ret;

Other than my rambling above, this looks rather good. But there is a
bit more, "thanks" to FEAT_DoubleFault2:

- HCRX_EL2.TMEA also affects this patch, both on the SEA and SError
  paths (both can be routed to EL2 when masked).

- SCTLR2_EL{1,2}.EASE also influences the delivery of the SEA,
  upgrading it to a SError (yes, this is the routing from hell and
  ties directly into the following patches).
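
A rough user-space model of how these two controls could stack on top
of the TEA check, for illustration only. Every name here is
hypothetical, and the sketch makes simplifying assumptions: "masked" is
taken to mean PSTATE.A is set, the EASE bit consulted is the one
belonging to the target EL, and EL3 routing is ignored entirely. The
architecture spec, not this sketch, is the reference.

```c
#include <stdbool.h>

enum model_el { MODEL_EL1 = 1, MODEL_EL2 = 2 };
enum model_vector { MODEL_VEC_SYNC, MODEL_VEC_SERROR };

/* Hypothetical inputs to the SEA routing decision. */
struct model_sea_inputs {
	bool from_el2;	/* abort taken from the guest's EL2&0 regime */
	bool hcr_tea;	/* HCR_EL2.TEA */
	bool hcrx_tmea;	/* HCRX_EL2.TMEA */
	bool pstate_a;	/* assumption: "masked" means PSTATE.A == 1 */
	bool ease_el1;	/* SCTLR2_EL1.EASE */
	bool ease_el2;	/* SCTLR2_EL2.EASE */
};

struct model_sea_route {
	enum model_el target;
	enum model_vector vector;
};

static struct model_sea_route
model_route_sea_df2(const struct model_sea_inputs *in)
{
	struct model_sea_route route;
	bool to_el2, ease;

	/* Pick the target EL first: TEA, or TMEA while aborts are masked. */
	to_el2 = in->from_el2 || in->hcr_tea ||
		 (in->hcrx_tmea && in->pstate_a);
	route.target = to_el2 ? MODEL_EL2 : MODEL_EL1;

	/* Then let the target EL's EASE bit select the SError vector. */
	ease = to_el2 ? in->ease_el2 : in->ease_el1;
	route.vector = ease ? MODEL_VEC_SERROR : MODEL_VEC_SYNC;

	return route;
}
```

Splitting the decision into "which EL" and then "which vector" keeps
the two features independent, which is roughly how the bullets above
describe them.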

I was expecting to see FEAT_RAS being enabled at some point, but
that's not the case yet. Are you planning to do so?

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.

