From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, Steffen Eiden <seiden@linux.ibm.com>,
Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oupton@kernel.org>,
Zenghui Yu <yuzenghui@huawei.com>, Will Deacon <will@kernel.org>,
Fuad Tabba <tabba@google.com>
Subject: Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
Date: Thu, 21 May 2026 07:35:41 +0100 [thread overview]
Message-ID: <86mrxtw9qa.wl-maz@kernel.org> (raw)
In-Reply-To: <ag2w0G34NycT2456@J2N7QTR9R3.cambridge.arm.com>
On Wed, 20 May 2026 14:02:08 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> >
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> >
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
> >
> > - the trap controls: the effective values are recomputed on each entry
> > into the guest to take the EL into account and merge the L0 and L1
> > configuration if in a nested context, or directly use the L0 configuration
> > in non-nested context (see __activate_traps()).
> >
> > - the VL settings: the effective values are are also recomputed on each
> > entry into the guest (see fpsimd_lazy_switch_to_guest()).
> >
> > Since we appear to cover all bases, use the vcpu flags indicating the
> > handling of a nested ERET or exception delivery to avoid the whole FP
> > save/restore shenanigans. SME will have to be similarly dealt with when
> > it eventually gets supported.
> >
> > For an EL1 L3 guest where L1 and L2 have this optimisation, this
> > results in at least a 10% wall clock reduction when running an I/O
> > heavy workload, generating a high rate of nested exceptions.
>
> There's on additional thing that's important, but I forgot to mention
> last time: in the window between kvm_arch_vcpu_put() and
> kvm_arch_vcpu_load(), it's possible to take an interrupt, and for a
> softirq handler to try to use kernel mode NEON.
>
> Due to that, kvm_arch_vcpu_put() must leave the L1 guest's maximum VL
> configured in the host's ZCR_ELx, such that the guest's state can be
> saved.
>
> That value is configured by fpsimd_lazy_switch_to_host(), so we just
> need to make sure that kvm_arch_vcpu_put() doesn't clobber it. I *think*
> that's fine today, but maybe that warrants a comment somewhere.
I have slapped this onto this patch:
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index aca98752a6e42..3f6b1e29cd6b9 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -117,7 +117,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
unsigned long flags;
/*
- * See comment in kvm_arch_vcpu_load_fp().
+ * See comment in kvm_arch_vcpu_load_fp(). Note that we also rely on
+ * the guest's max VL to have been set by fpsimd_lazy_switch_to_host()
+ * so that any intervening kernel-mode SIMD (NEON or otherwise)
+ * operation sees the full guest state that needs saving.
*/
if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> Other than that, this all looks good to me:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2026-05-21 6:35 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-20 8:50 [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return Marc Zyngier
2026-05-20 8:50 ` [PATCH v2 1/2] KVM: arm64: nv: Track L2 to L1 exception emulation Marc Zyngier
2026-05-20 8:50 ` [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception Marc Zyngier
2026-05-20 11:02 ` Joey Gouly
2026-05-21 6:21 ` Marc Zyngier
2026-05-20 13:02 ` Mark Rutland
2026-05-21 6:35 ` Marc Zyngier [this message]
2026-05-21 7:07 ` [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86mrxtw9qa.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=oupton@kernel.org \
--cc=seiden@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=tabba@google.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.