All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oliver Upton <oupton@kernel.org>
To: Weiming Shi <bestswngs@gmail.com>
Cc: Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
	Steffen Eiden <seiden@linux.ibm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Bjorn Andersson <andersson@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Kristina Martsenko <kristina.martsenko@arm.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	Zhong Wang <wangzhong.c0ss4ck@bytedance.com>,
	Xuanqing Shi <shixuanqing.11@bytedance.com>
Subject: Re: [PATCH] KVM: arm64: nv: Translate vEL2 PSTATE to EL1 in kvm_hyp_handle_mops()
Date: Tue, 16 Jun 2026 13:14:39 -0700	[thread overview]
Message-ID: <ajGur_YEMoMLZPSF@kernel.org> (raw)
In-Reply-To: <20260616114943.81188-2-bestswngs@gmail.com>

Hi Weiming,

Thanks for the fix.

On Tue, Jun 16, 2026 at 07:49:44PM +0800, Weiming Shi wrote:
> When a nested virtualisation guest is running its virtual EL2 (vEL2),
> fixup_guest_exit() rewrites vcpu_cpsr() to the guest's virtual exception
> level: a hardware PSTATE.M of EL1{t,h} is presented as EL2{t,h}. The
> hardware, however, executes vEL2 at EL1.
> 
> kvm_hyp_handle_mops() runs on the fast guest re-entry path, where it
> clears the single-step bit and restores SPSR_EL2 directly from
> vcpu_cpsr():
> 
> 	*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
> 	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> 
> For a guest hypervisor this writes the vEL2 view (PSTATE.M == EL2h) into
> the hardware SPSR_EL2 without translating it back. The fast path re-enters
> the guest via __guest_enter()/ERET without going through
> __sysreg_restore_el2_return_state(), so neither to_hw_pstate() nor the
> "return to a less privileged mode" safety check there (which would set
> PSR_IL_BIT) is applied. The ERET therefore restores PSTATE.M = EL2h and
> re-enters the guest at the real EL2 with a guest-controlled ELR, escaping
> stage-2 and the guest/host boundary.
> 
> This is reachable on a kernel with FEAT_MOPS running a KVM nested guest
> (kvm-arm.mode=nested): KVM sets HCRX_EL2.MCE2, which the guest hypervisor
> cannot clear for its own context (is_nested_ctxt() is false), so a vEL2
> MOPS exception is taken to the host and dispatched to kvm_hyp_handle_mops()
> with VCPU_IN_HYP_CONTEXT set.
> 
> Translate EL2{t,h} back to EL1{t,h} before writing SPSR_EL2, mirroring
> kvm_hyp_handle_eret(). For non-nested guests vcpu_cpsr() never holds an
> EL2 mode, so the translation is a no-op and behaviour is unchanged.

The changelog is unnecessarily verbose, instead:

  kvm_hyp_handle_mops() resets the single-step state machine as part of
  rewinding state for a MOPS exception by modifying vcpu_cpsr() and
  writing the result directly into hardware.

  In the case of nested virtualization, vcpu_cpsr() is a synthetic value
  such that the rest of KVM can deal with vEL2 cleanly. That means the 
  value requires translation before being written into hardware, which is
  unfortunately missing from the MOPS handler.

  Fix it by directly modifying SPSR_EL2 and avoiding the synthetic state
  altogether, which will be resynchronized on the next 'full' exit back
  to KVM.

Also:

Cc: stable@vger.kernel.org

Definitely meets the bar :)

> Fixes: 2de451a329cf ("KVM: arm64: Add handler for MOPS exceptions")
> Assisted-by: Claude:claude-opus-4-8
> Reported-by: Zhong Wang <wangzhong.c0ss4ck@bytedance.com>
> Reported-by: Xuanqing Shi <shixuanqing.11@bytedance.com>
> Signed-off-by: Weiming Shi <bestswngs@gmail.com>
> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index e9b36a3b27bbc..a6b7963ddbf0b 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -448,6 +448,8 @@ static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
>  
>  static inline bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
> +	u64 spsr, mode;
> +
>  	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
>  	arm64_mops_reset_regs(vcpu_gp_regs(vcpu), vcpu->arch.fault.esr_el2);
>  	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> @@ -457,7 +459,26 @@ static inline bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)
>  	 * instruction.
>  	 */
>  	*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;
> -	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +
> +	/*
> +	 * For a guest hypervisor, vcpu_cpsr() holds the vEL2 view
> +	 * (PSTATE.M == EL2h) installed by fixup_guest_exit(), but vEL2
> +	 * runs at EL1. Translate it back before restoring SPSR_EL2, as in
> +	 * kvm_hyp_handle_eret().
> +	 */
> +	spsr = *vcpu_cpsr(vcpu);
> +	mode = spsr & (PSR_MODE_MASK | PSR_MODE32_BIT);
> +	switch (mode) {
> +	case PSR_MODE_EL2t:
> +		mode = PSR_MODE_EL1t;
> +		break;
> +	case PSR_MODE_EL2h:
> +		mode = PSR_MODE_EL1h;
> +		break;
> +	}
> +	spsr = (spsr & ~(PSR_MODE_MASK | PSR_MODE32_BIT)) | mode;
> +
> +	write_sysreg_el2(spsr, SYS_SPSR);

As I allude to in the modified changelog, I'd rather we just manipulate
the hardware value of SPSR_EL2 directly. We already do this in
kvm_hyp_handle_eret()

	spsr = read_sysreg_el2(SYS_SPSR);
	write_sysreg_el2(spsr & ~DBG_SPSR_SS, SYS_SPSR);

Thanks,
Oliver


      parent reply	other threads:[~2026-06-16 20:14 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 11:49 [PATCH] KVM: arm64: nv: Translate vEL2 PSTATE to EL1 in kvm_hyp_handle_mops() Weiming Shi
2026-06-16 12:03 ` Weiming Shi
2026-06-16 20:14 ` Oliver Upton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajGur_YEMoMLZPSF@kernel.org \
    --to=oupton@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=andersson@kernel.org \
    --cc=bestswngs@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kristina.martsenko@arm.com \
    --cc=kuba@kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=seiden@linux.ibm.com \
    --cc=shixuanqing.11@bytedance.com \
    --cc=suzuki.poulose@arm.com \
    --cc=wangzhong.c0ss4ck@bytedance.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.