Linux-ARM-Kernel Archive on lore.kernel.org
From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, Steffen Eiden <seiden@linux.ibm.com>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oupton@kernel.org>,
	Zenghui Yu <yuzenghui@huawei.com>, Will Deacon <will@kernel.org>,
	Fuad Tabba <tabba@google.com>
Subject: Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
Date: Thu, 21 May 2026 07:35:41 +0100
Message-ID: <86mrxtw9qa.wl-maz@kernel.org>
In-Reply-To: <ag2w0G34NycT2456@J2N7QTR9R3.cambridge.arm.com>

On Wed, 20 May 2026 14:02:08 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> > 
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> > 
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
> > 
> > - the trap controls: the effective values are recomputed on each entry
> >   into the guest to take the EL into account and merge the L0 and L1
> >   configuration if in a nested context, or directly use the L0 configuration
> >   in non-nested context (see __activate_traps()).
> > 
> > - the VL settings: the effective values are also recomputed on each
> >   entry into the guest (see fpsimd_lazy_switch_to_guest()).
> >
> > Since we appear to cover all bases, use the vcpu flags indicating the
> > handling of a nested ERET or exception delivery to avoid the whole FP
> > save/restore shenanigans. SME will have to be similarly dealt with when
> > it eventually gets supported.
> > 
> > For an EL1 L3 guest where L1 and L2 have this optimisation, this
> > results in at least a 10% wall clock reduction when running an I/O
> > heavy workload, generating a high rate of nested exceptions.
> 
> There's one additional thing that's important, but I forgot to mention
> last time: in the window between kvm_arch_vcpu_put() and
> kvm_arch_vcpu_load(), it's possible to take an interrupt, and for a
> softirq handler to try to use kernel mode NEON.
> 
> Due to that, kvm_arch_vcpu_put() must leave the L1 guest's maximum VL
> configured in the host's ZCR_ELx, such that the guest's state can be
> saved.
> 
> That value is configured by fpsimd_lazy_switch_to_host(), so we just
> need to make sure that kvm_arch_vcpu_put() doesn't clobber it. I *think*
> that's fine today, but maybe that warrants a comment somewhere.

I have slapped this onto this patch:

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index aca98752a6e42..3f6b1e29cd6b9 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -117,7 +117,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 	unsigned long flags;
 
 	/*
-	 * See comment in kvm_arch_vcpu_load_fp().
+	 * See comment in kvm_arch_vcpu_load_fp(). Note that we also rely on
+	 * the guest's max VL to have been set by fpsimd_lazy_switch_to_host()
+	 * so that any intervening kernel-mode SIMD (NEON or otherwise)
+	 * operation sees the full guest state that needs saving.
 	 */
 	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
 	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
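
For illustration only, the invariant that comment relies on can be
modelled with a small userspace toy (this is not kernel code). Every
name below (toy_cpu, toy_vcpu, toy_lazy_switch_to_host(),
toy_vcpu_put_fp(), toy_softirq_kernel_simd()) is hypothetical, the VL is
a plain byte count rather than a ZCR_ELx encoding, and the clobber_vl
switch stands in for an assumed bug rather than anything in the tree:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state: the VL currently programmed, and whether the
 * guest's FP/SVE registers are still live on the CPU. */
struct toy_cpu {
	unsigned int zcr_vl;
	bool fp_live;
};

/* Toy vcpu: the guest's maximum VL, the smaller VL used while in L2,
 * and how many bytes per vector the last save captured. */
struct toy_vcpu {
	unsigned int max_vl;
	unsigned int l2_vl;
	unsigned int saved_vl;
};

/* Guest exit: program the guest's maximum VL for the host. */
static void toy_lazy_switch_to_host(struct toy_cpu *cpu,
				    const struct toy_vcpu *vcpu)
{
	cpu->zcr_vl = vcpu->max_vl;
}

/* Nested ERET/exception put: the FP state stays live on the CPU.
 * clobber_vl models a hypothetical bug that shrinks the VL here. */
static void toy_vcpu_put_fp(struct toy_cpu *cpu,
			    const struct toy_vcpu *vcpu, bool clobber_vl)
{
	if (clobber_vl)
		cpu->zcr_vl = vcpu->l2_vl;
}

/* Softirq using kernel-mode SIMD between put and load: the live guest
 * state has to be saved first, at whatever VL is programmed. */
static void toy_softirq_kernel_simd(struct toy_cpu *cpu,
				    struct toy_vcpu *vcpu)
{
	if (!cpu->fp_live)
		return;
	vcpu->saved_vl = cpu->zcr_vl;
	cpu->fp_live = false;
}

/* Returns true if a save in the put/load window captures the whole
 * guest state, i.e. it was done at the guest's maximum VL. */
static bool toy_nested_window_saves_all(unsigned int max_vl,
					unsigned int l2_vl,
					bool clobber_vl)
{
	struct toy_vcpu vcpu = { .max_vl = max_vl, .l2_vl = l2_vl };
	struct toy_cpu cpu = { .zcr_vl = l2_vl, .fp_live = true };

	assert(l2_vl <= max_vl);

	toy_lazy_switch_to_host(&cpu, &vcpu);
	toy_vcpu_put_fp(&cpu, &vcpu, clobber_vl);
	toy_softirq_kernel_simd(&cpu, &vcpu);

	return vcpu.saved_vl >= vcpu.max_vl;
}
```

With clobber_vl false the save happens at the maximum VL and nothing is
lost; with it true, the save is truncated to the L2 VL, which is the
failure the comment is meant to guard against.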

> Other than that, this all looks good to me:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


Thread overview: 8+ messages
2026-05-20  8:50 [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return Marc Zyngier
2026-05-20  8:50 ` [PATCH v2 1/2] KVM: arm64: nv: Track L2 to L1 exception emulation Marc Zyngier
2026-05-20  8:50 ` [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception Marc Zyngier
2026-05-20 11:02   ` Joey Gouly
2026-05-21  6:21     ` Marc Zyngier
2026-05-20 13:02   ` Mark Rutland
2026-05-21  6:35     ` Marc Zyngier [this message]
2026-05-21  7:07 ` [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return Marc Zyngier
