Re: [PATCH RFC v3 09/27] KVM: arm64: Factor SVE guest exit handling out into a function

public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed

From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Joey Gouly <joey.gouly@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Will Deacon <will@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
	Dave Martin <Dave.Martin@arm.com>, Fuad Tabba <tabba@google.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC v3 09/27] KVM: arm64: Factor SVE guest exit handling out into a function
Date: Wed, 22 Jan 2025 11:51:00 +0000	[thread overview]
Message-ID: <86o6zzukwr.wl-maz@kernel.org> (raw)
In-Reply-To: <Z4pAMaEYvdLpmbg2@J2N7QTR9R3>

On Fri, 17 Jan 2025 11:34:09 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Fri, Dec 20, 2024 at 04:46:34PM +0000, Mark Brown wrote:
> > The SVE portion of kvm_vcpu_put() is quite large, especially given the
> > comments required.  When we add similar handling for SME the function
> > will get even larger, in order to keep things managable factor the SVE
> > portion out of the main kvm_vcpu_put().
> 
> While investigating some problems with SVE I spotted a latent bug in
> this area where I suspect the fix will conflict with / supersede this
> rework. Details below; IIUC the bug was introduced in commit:
> 
>   8c8010d69c132273 ("KVM: arm64: Save/restore SVE state for nVHE")
> 
> > Signed-off-by: Mark Brown <broonie@kernel.org>
> > ---
> >  arch/arm64/kvm/fpsimd.c | 67 +++++++++++++++++++++++++++----------------------
> >  1 file changed, 37 insertions(+), 30 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 09b65abaf9db60cc57dbc554ad2108a80c2dc46b..3c2e0b96877ac5b4f3b9d8dfa38975f11b74b60d 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -151,6 +151,41 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> >  	}
> >  }
> >  
> > +static void kvm_vcpu_put_sve(struct kvm_vcpu *vcpu)
> > +{
> > +	u64 zcr;
> > +
> > +	if (!vcpu_has_sve(vcpu))
> > +		return;
> > +
> > +	zcr = read_sysreg_el1(SYS_ZCR);
> > +
> > +	/*
> > +	 * If the vCPU is in the hyp context then ZCR_EL1 is loaded
> > +	 * with its vEL2 counterpart.
> > +	 */
> > +	__vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr;
> > +
> > +	/*
> > +	 * Restore the VL that was saved when bound to the CPU, which
> > +	 * is the maximum VL for the guest. Because the layout of the
> > +	 * data when saving the sve state depends on the VL, we need
> > +	 * to use a consistent (i.e., the maximum) VL.  Note that this
> > +	 * means that at guest exit ZCR_EL1 is not necessarily the
> > +	 * same as on guest entry.
> > +	 *
> > +	 * ZCR_EL2 holds the guest hypervisor's VL when running a
> > +	 * nested guest, which could be smaller than the max for the
> > +	 * vCPU. Similar to above, we first need to switch to a VL
> > +	 * consistent with the layout of the vCPU's SVE state. KVM
> > +	 * support for NV implies VHE, so using the ZCR_EL1 alias is
> > +	 * safe.
> > +	 */
> > +	if (!has_vhe() || (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)))
> > +		sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
> > +				       SYS_ZCR_EL1);
> > +}
> > +
> >  /*
> >   * Write back the vcpu FPSIMD regs if they are dirty, and invalidate the
> >   * cpu FPSIMD regs so that they can't be spuriously reused if this vcpu
> > @@ -179,38 +214,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >  	}
> 
> A little before this context, kvm_arch_vcpu_put_fp() calls
> local_irq_save(), which indicates that IRQs can be taken before this
> point, which is deeply suspicious.
> 
> If IRQs are enabled, then it's possible to take an IRQ and potentially
> run a softirq handler which uses kernel-mode NEON. That means
> kernel_neon_begin() will try to save the live FPSIMD/SVE/SME state via
> fpsimd_save_user_state(), using the live value of ZCR_ELx.LEN, which would not
> be correct per the comment.
> 
> Looking at kvm_arch_vcpu_ioctl_run(), the relevant logic is:
> 
> 	vcpu_load(vcpu); // calls kvm_arch_vcpu_load_fp()
> 
> 	while (ret > 0) {
> 		preempt_disable();
> 		local_irq_disable();
> 
> 		kvm_arch_vcpu_ctxflush_fp();
> 		ret = kvm_arm_vcpu_enter_exit(vcpu);
> 		kvm_arch_vcpu_ctxsync_fp(vcpu);
> 
> 		local_irq_enable();
> 		preempt_enable();
> 	}
> 
> 	vcpu_put(vcpu); // calls kvm_arch_vcpu_put_fp()
> 
> ... and the problem can occur at any point after the vCPU has run where IRQs
> are enabled, i.e, between local_irq_enable() and either local_irq_disable() or
> vcpu_put()'s call to kvm_arch_vcpu_put_fp().
> 
> Note that kernel_neon_begin() calls:
> 
> 	fpsimd_save_user_state();
> 	...
> 	fpsimd_flush_cpu_state();
> 
> ... and fpsimd_save_user_state() will see that the SVE VL is wrong:
> 
> 	if (WARN_ON(sve_get_vl() != vl)) {
> 		force_signal_inject(SIGKILL, SI_KERNEL, 0, 0);
> 		return;
> 	}
> 
> ... pending a SIGKILL for the VMM thread without saving the vCPU's state
> before unbinding the live state via fpsimd_flush_cpu_state(), which'll
> set TIF_FOREIGN_FPSTATE.
> 
> AFAICT it's possible to re-enter the vCPU after that happens, whereupon
> stale vCPU FPSIMD/SVE state will be restored. Upon return to userspace
> the SIGKILL will be delivered, killing the VMM.
> 
> As above, it looks like that's been broken since the nVHE SVE
> save/restore was introduced in commit:
> 
>   8c8010d69c132273 ("KVM: arm64: Save/restore SVE state for nVHE")
> 
> The TL;DR summary is that it's not sufficient for kvm_arch_vcpu_put_fp()
> to fix up ZCR_ELx. Either:
> 
> * That needs to be fixed up while IRQs are masked, e.g. by
>   saving/restoring the host and guest ZCR_EL1 and/or ZCR_ELx values in
>   kvm_arch_vcpu_ctxflush_fp() and kvm_arch_vcpu_ctxsync_fp()
> 
> * The lazy save logic in fpsimd_save_user_state() needs to handle KVM
>   explicitly, saving the guest's ZCR_EL1 and restoring the host's
>   ZCR_ELx.
> 
> I think we need to fix that before we extend this logic for SME.

So save/restore ZCR_ELx eagerly? If that's what it takes, let's do
that now.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

next prev parent reply	other threads:[~2025-01-22 11:52 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-20 16:46 [PATCH RFC v3 00/27] KVM: arm64: Implement support for SME in non-protected guests Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 01/27] arm64/fpsimd: Update FA64 and ZT0 enables when loading SME state Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 02/27] arm64/fpsimd: Decide to save ZT0 and streaming mode FFR at bind time Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 03/27] arm64/fpsimd: Check enable bit for FA64 when saving EFI state Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 04/27] arm64/fpsimd: Determine maximum virtualisable SME vector length Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 05/27] KVM: arm64: Introduce non-UNDEF FGT control Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 06/27] KVM: arm64: Pull ctxt_has_ helpers to start of sysreg-sr.h Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 07/27] KVM: arm64: Convert cpacr_clear_set() to a static inline Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 08/27] KVM: arm64: Move SVE state access macros after feature test macros Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 09/27] KVM: arm64: Factor SVE guest exit handling out into a function Mark Brown
2025-01-17 11:34   ` Mark Rutland
2025-01-17 12:37     ` Mark Brown
2025-01-22 11:51     ` Marc Zyngier [this message]
2025-01-22 11:56       ` Mark Rutland
2024-12-20 16:46 ` [PATCH RFC v3 10/27] KVM: arm64: Rename SVE finalization constants to be more general Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 11/27] KVM: arm64: Document the KVM ABI for SME Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 12/27] KVM: arm64: Define internal features " Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 13/27] KVM: arm64: Rename sve_state_reg_region Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 14/27] KVM: arm64: Store vector lengths in an array Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 15/27] KVM: arm64: Implement SME vector length configuration Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 16/27] KVM: arm64: Add definitions for SME control register Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 17/27] KVM: arm64: Support TPIDR2_EL0 Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 18/27] KVM: arm64: Support SMIDR_EL1 for guests Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 19/27] KVM: arm64: Support SME priority registers Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 20/27] KVM: arm64: Provide assembly for SME state restore Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 21/27] KVM: arm64: Support Z and P registers in streaming mode Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 22/27] KVM: arm64: Expose SME specific state to userspace Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 23/27] KVM: arm64: Context switch SME state for normal guests Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 24/27] KVM: arm64: Handle SME exceptions Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 25/27] KVM: arm64: Provide interface for configuring and enabling SME for guests Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 26/27] KVM: arm64: selftests: Add SME system registers to get-reg-list Mark Brown
2024-12-20 16:46 ` [PATCH RFC v3 27/27] KVM: arm64: selftests: Add SME to set_id_regs test Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86o6zzukwr.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=Dave.Martin@arm.com \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=shuah@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox