* [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return
@ 2026-05-20 8:50 Marc Zyngier
From: Marc Zyngier @ 2026-05-20 8:50 UTC
To: kvmarm, linux-arm-kernel, kvm
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba
This is the second version of this short series optimising away a lot
of unnecessary FPSIMD/SVE context switching with NV.
* From v1 [1]:
- New commit message on patch #2 (Mark)
- Additional comments and WARN_ON_ONCE() (Mark)
If nobody screams, I'll stick that into -next.
Thanks,
M.
[1] https://lore.kernel.org/r/20260512140755.3676306-1-maz@kernel.org
Marc Zyngier (2):
KVM: arm64: nv: Track L2 to L1 exception emulation
KVM: arm64: nv: Don't save/restore FP register during a nested ERET or
exception
arch/arm64/include/asm/kvm_host.h | 3 ++-
arch/arm64/kvm/emulate-nested.c | 4 ++++
arch/arm64/kvm/fpsimd.c | 23 +++++++++++++++++++++++
3 files changed, 29 insertions(+), 1 deletion(-)
--
2.47.3

* [PATCH v2 1/2] KVM: arm64: nv: Track L2 to L1 exception emulation
From: Marc Zyngier @ 2026-05-20 8:50 UTC
To: kvmarm, linux-arm-kernel, kvm
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba

While we currently track that we are emulating a nested ERET from L1
to L2, we currently don't track the reverse direction (an exception
going from L2 to L1).

Add a new vcpu state flag for this purpose, which will see some use
shortly.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h | 3 ++-
 arch/arm64/kvm/emulate-nested.c   | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65eead8362e0b..c79747d5f4dd1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1112,7 +1112,8 @@ struct kvm_vcpu_arch {
 #define IN_NESTED_ERET		__vcpu_single_flag(sflags, BIT(7))
 /* SError pending for nested guest */
 #define NESTED_SERROR_PENDING	__vcpu_single_flag(sflags, BIT(8))
-
+/* KVM is currently emulating an L2 to L1 exception */
+#define IN_NESTED_EXCEPTION	__vcpu_single_flag(sflags, BIT(9))

 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index dba7ced74ca5e..15c691a6266d5 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -2862,6 +2862,8 @@ static int kvm_inject_nested(struct kvm_vcpu *vcpu, u64 esr_el2,

 	preempt_disable();

+	vcpu_set_flag(vcpu, IN_NESTED_EXCEPTION);
+
 	/*
 	 * We may have an exception or PC update in the EL0/EL1 context.
 	 * Commit it before entering EL2.
@@ -2884,6 +2886,8 @@ static int kvm_inject_nested(struct kvm_vcpu *vcpu, u64 esr_el2,
 	__kvm_adjust_pc(vcpu);

 	kvm_arch_vcpu_load(vcpu, smp_processor_id());
+	vcpu_clear_flag(vcpu, IN_NESTED_EXCEPTION);
+
 	preempt_enable();

 	if (kvm_vcpu_has_pmu(vcpu))
-- 
2.47.3
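
Editorial aside, not part of the posted patch: the set/clear bracketing that patch 1 adds around the put/load window can be modelled in plain userspace C. This is only a sketch under stated assumptions. The toy_* names and struct toy_vcpu are hypothetical stand-ins, and the flags are reduced to raw bit masks rather than the kernel's __vcpu_single_flag() machinery; only the bit positions (7 and 9) are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; bit positions follow the diff above. */
#define TOY_BIT(n)               (1U << (n))
#define TOY_IN_NESTED_ERET       TOY_BIT(7)
#define TOY_IN_NESTED_EXCEPTION  TOY_BIT(9)

struct toy_vcpu {
	uint32_t sflags;	/* models the vcpu state flags word */
};

static void toy_set_flag(struct toy_vcpu *v, uint32_t f)
{
	v->sflags |= f;
}

static void toy_clear_flag(struct toy_vcpu *v, uint32_t f)
{
	v->sflags &= ~f;
}

static bool toy_get_flag(const struct toy_vcpu *v, uint32_t f)
{
	return (v->sflags & f) != 0;
}

/*
 * Models the shape of the exception injection path: the flag is set
 * before the put/mutate/load window and cleared right after it.
 * Returns whether code running inside the window could see the flag.
 */
static bool toy_inject_nested(struct toy_vcpu *v)
{
	bool seen_in_window;

	toy_set_flag(v, TOY_IN_NESTED_EXCEPTION);
	/* ... put old context, mutate in-memory state, load new context ... */
	seen_in_window = toy_get_flag(v, TOY_IN_NESTED_EXCEPTION);
	toy_clear_flag(v, TOY_IN_NESTED_EXCEPTION);

	return seen_in_window;
}
```

Calling toy_inject_nested() on a vcpu reports the flag as visible inside the window and leaves it clear afterwards, without disturbing any other bit in sflags. That is the property the FP load/put hooks in patch 2 rely on.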

* [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Marc Zyngier @ 2026-05-20 8:50 UTC
To: kvmarm, linux-arm-kernel, kvm
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba

When switching between L1 and L2, we save the old state using
kvm_arch_vcpu_put(), mutate the state in memory, then load the new
state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
and unbound, such that it can be lazily restored on a subsequent trap.

The FPSIMD/SVE state is shared by exception levels, and only a handful
of related control registers need to be changed when transitioning
between L1 and L2. The save/restore of the common state is needless
overhead, especially as trapping becomes exponentially more expensive
with nesting.

Avoid this overhead by leaving the common FPSIMD/SVE state live on the
CPU, and only switching the state that is distinct for L1 and L2:

- the trap controls: the effective values are recomputed on each entry
  into the guest to take the EL into account and merge the L0 and L1
  configuration if in a nested context, or directly use the L0 configuration
  in non-nested context (see __activate_traps()).

- the VL settings: the effective values are also recomputed on each
  entry into the guest (see fpsimd_lazy_switch_to_guest()).

Since we appear to cover all bases, use the vcpu flags indicating the
handling of a nested ERET or exception delivery to avoid the whole FP
save/restore shenanigans. SME will have to be similarly dealt with when
it eventually gets supported.

For an EL1 L3 guest where L1 and L2 have this optimisation, this
results in at least a 10% wall clock reduction when running an I/O
heavy workload, generating a high rate of nested exceptions.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/fpsimd.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 15e17aca1dec0..aca98752a6e42 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -28,6 +28,20 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
 	if (!system_supports_fpsimd())
 		return;

+	/*
+	 * Avoid needless save/restore of the guest's common
+	 * FPSIMD/SVE/SME regs during transitions between L1/L2.
+	 *
+	 * These transitions only happens in a non-preemptible context
+	 * where the host regs have already been saved and unbound. The
+	 * live registers are either free or owned by the guest.
+	 */
+	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
+	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
+		WARN_ON_ONCE(host_owns_fp_regs());
+		return;
+	}
+
 	/*
 	 * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such
 	 * that the host kernel is responsible for restoring this state upon
@@ -102,6 +116,15 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 {
 	unsigned long flags;

+	/*
+	 * See comment in kvm_arch_vcpu_load_fp().
+	 */
+	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
+	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
+		WARN_ON_ONCE(host_owns_fp_regs());
+		return;
+	}
+
 	local_irq_save(flags);

 	if (guest_owns_fp_regs()) {
-- 
2.47.3
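
Editorial aside, not part of the posted patch: the effect of the early return in the FP load/put hooks can be illustrated with a small userspace model. Everything below is a hypothetical sketch. The toy_* helpers, the "patched" switch and the save counter are assumptions made for illustration, and the model deliberately ignores trap controls, VL handling and host state. It only contrasts "save and unbind on every nested ERET" with "leave the shared registers live".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a vcpu; none of these fields exist in the kernel. */
struct toy_vcpu {
	bool patched;		/* true: hooks honour the nested-transition flag */
	bool in_nested_eret;	/* models IN_NESTED_ERET */
	bool guest_owns_fp;	/* live FP registers currently belong to the guest */
	unsigned int fp_saves;	/* number of full register saves performed */
};

static bool toy_skip_fp(const struct toy_vcpu *v)
{
	return v->patched && v->in_nested_eret;
}

/* Models the put hook: save and unbind live guest state unless skipped. */
static void toy_put_fp(struct toy_vcpu *v)
{
	if (toy_skip_fp(v))
		return;		/* shared state stays live on the CPU */

	if (v->guest_owns_fp) {
		v->fp_saves++;
		v->guest_owns_fp = false;
	}
}

/* Models the load hook: nothing is restored eagerly, first FP use traps. */
static void toy_load_fp(struct toy_vcpu *v)
{
	if (toy_skip_fp(v))
		return;

	v->guest_owns_fp = false;
}

/* One emulated L1 to L2 ERET: put, mutate in-memory state, load. */
static void toy_nested_eret(struct toy_vcpu *v)
{
	v->in_nested_eret = true;
	toy_put_fp(v);
	/* ... mutate the in-memory state ... */
	toy_load_fp(v);
	v->in_nested_eret = false;
}
```

In this model, an unpatched vcpu that owns the FP registers pays one full save per nested ERET and then has to trap to get its state back, while a patched vcpu performs no save and keeps ownership across the transition.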

* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Joey Gouly @ 2026-05-20 11:02 UTC
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba

Hi Marc,

On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> When switching between L1 and L2, we save the old state using
> kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> and unbound, such that it can be lazily restored on a subsequent trap.
>
> The FPSIMD/SVE state is shared by exception levels, and only a handful
> of related control registers need to be changed when transitioning
> between L1 and L2. The save/restore of the common state is needless
> overhead, especially as trapping becomes exponentially more expensive
> with nesting.
>
> Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> CPU, and only switching the state that is distinct for L1 and L2:

To make sure I understand this part:

L1 sets up L2's FP state live on the CPU
L1 erets
eret traps to L0/host
	preemption disabled
	kvm_arch_vcpu_put()
		kvm_arch_vcpu_put_fp() <-- actually saves the state of the live registers
	.. set elr etc ..
	kvm_arch_vcpu_load()
		kvm_arch_vcpu_load_fp() <-- doesn't actually restore state, but ensures
					    the CPTR trap will be set
.. returns to L2 (traps on first use of FP and state will be restored)

So this patch is (effectively) removing the put_fp()/load_fp(), because the FP
state is common/shared between L1 and L2, so whatever L1 put into that state
before the eret, L2 was going to see.

If my understanding is correct:
Reviewed-by: Joey Gouly <joey.gouly@arm.com>

Thanks,
Joey

>
> - the trap controls: the effective values are recomputed on each entry
>   into the guest to take the EL into account and merge the L0 and L1
>   configuration if in a nested context, or directly use the L0 configuration
>   in non-nested context (see __activate_traps()).
>
> - the VL settings: the effective values are also recomputed on each
>   entry into the guest (see fpsimd_lazy_switch_to_guest()).
>
> Since we appear to cover all bases, use the vcpu flags indicating the
> handling of a nested ERET or exception delivery to avoid the whole FP
> save/restore shenanigans. SME will have to be similarly dealt with when
> it eventually gets supported.
>
> For an EL1 L3 guest where L1 and L2 have this optimisation, this
> results in at least a 10% wall clock reduction when running an I/O
> heavy workload, generating a high rate of nested exceptions.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/kvm/fpsimd.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 15e17aca1dec0..aca98752a6e42 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -28,6 +28,20 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
>  	if (!system_supports_fpsimd())
>  		return;
>
> +	/*
> +	 * Avoid needless save/restore of the guest's common
> +	 * FPSIMD/SVE/SME regs during transitions between L1/L2.
> +	 *
> +	 * These transitions only happens in a non-preemptible context
> +	 * where the host regs have already been saved and unbound. The
> +	 * live registers are either free or owned by the guest.
> +	 */
> +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> +		WARN_ON_ONCE(host_owns_fp_regs());
> +		return;
> +	}
> +
>  	/*
>  	 * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such
>  	 * that the host kernel is responsible for restoring this state upon
> @@ -102,6 +116,15 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long flags;
>
> +	/*
> +	 * See comment in kvm_arch_vcpu_load_fp().
> +	 */
> +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> +		WARN_ON_ONCE(host_owns_fp_regs());
> +		return;
> +	}
> +
>  	local_irq_save(flags);
>
>  	if (guest_owns_fp_regs()) {
> --
> 2.47.3
>

* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Marc Zyngier @ 2026-05-21 6:21 UTC
To: Joey Gouly
Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba

On Wed, 20 May 2026 12:02:31 +0100, Joey Gouly <joey.gouly@arm.com> wrote:
>
> Hi Marc,
>
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> >
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> >
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
>
> To make sure I understand this part:
>
> L1 sets up L2's FP state live on the CPU
> L1 erets
> eret traps to L0/host
> 	preemption disabled
> 	kvm_arch_vcpu_put()
> 		kvm_arch_vcpu_put_fp() <-- actually saves the state of the live registers
> 	.. set elr etc ..
> 	kvm_arch_vcpu_load()
> 		kvm_arch_vcpu_load_fp() <-- doesn't actually restore state, but ensures
> 					    the CPTR trap will be set
> .. returns to L2 (traps on first use of FP and state will be restored)
>
> So this patch is (effectively) removing the put_fp()/load_fp(), because the FP
> state is common/shared between L1 and L2, so whatever L1 put into that state
> before the eret, L2 was going to see.

Yes, you got it right. The other path is an L2 to L1 exception, which
also requires L0 mediation and has a similar shape.

The most horrible thing is that because all these traps can happen at
an arbitrary depth, each individual trap usually results in the
combination of all of the above.

> If my understanding is correct:
> Reviewed-by: Joey Gouly <joey.gouly@arm.com>

Thanks!

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Mark Rutland @ 2026-05-20 13:02 UTC
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Joey Gouly,
Suzuki K Poulose, Oliver Upton, Zenghui Yu, Will Deacon, Fuad Tabba

On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> When switching between L1 and L2, we save the old state using
> kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> and unbound, such that it can be lazily restored on a subsequent trap.
>
> The FPSIMD/SVE state is shared by exception levels, and only a handful
> of related control registers need to be changed when transitioning
> between L1 and L2. The save/restore of the common state is needless
> overhead, especially as trapping becomes exponentially more expensive
> with nesting.
>
> Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> CPU, and only switching the state that is distinct for L1 and L2:
>
> - the trap controls: the effective values are recomputed on each entry
>   into the guest to take the EL into account and merge the L0 and L1
>   configuration if in a nested context, or directly use the L0 configuration
>   in non-nested context (see __activate_traps()).
>
> - the VL settings: the effective values are also recomputed on each
>   entry into the guest (see fpsimd_lazy_switch_to_guest()).
>
> Since we appear to cover all bases, use the vcpu flags indicating the
> handling of a nested ERET or exception delivery to avoid the whole FP
> save/restore shenanigans. SME will have to be similarly dealt with when
> it eventually gets supported.
>
> For an EL1 L3 guest where L1 and L2 have this optimisation, this
> results in at least a 10% wall clock reduction when running an I/O
> heavy workload, generating a high rate of nested exceptions.

There's one additional thing that's important, but I forgot to mention
last time: in the window between kvm_arch_vcpu_put() and
kvm_arch_vcpu_load(), it's possible to take an interrupt, and for a
softirq handler to try to use kernel mode NEON.

Due to that, kvm_arch_vcpu_put() must leave the L1 guest's maximum VL
configured in the host's ZCR_ELx, such that the guest's state can be
saved.

That value is configured by fpsimd_lazy_switch_to_host(), so we just
need to make sure that kvm_arch_vcpu_put() doesn't clobber it. I *think*
that's fine today, but maybe that warrants a comment somewhere.

Other than that, this all looks good to me:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/arm64/kvm/fpsimd.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 15e17aca1dec0..aca98752a6e42 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -28,6 +28,20 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
>  	if (!system_supports_fpsimd())
>  		return;
>
> +	/*
> +	 * Avoid needless save/restore of the guest's common
> +	 * FPSIMD/SVE/SME regs during transitions between L1/L2.
> +	 *
> +	 * These transitions only happens in a non-preemptible context
> +	 * where the host regs have already been saved and unbound. The
> +	 * live registers are either free or owned by the guest.
> +	 */
> +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> +		WARN_ON_ONCE(host_owns_fp_regs());
> +		return;
> +	}
> +
>  	/*
>  	 * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such
>  	 * that the host kernel is responsible for restoring this state upon
> @@ -102,6 +116,15 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long flags;
>
> +	/*
> +	 * See comment in kvm_arch_vcpu_load_fp().
> +	 */
> +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> +		WARN_ON_ONCE(host_owns_fp_regs());
> +		return;
> +	}
> +
>  	local_irq_save(flags);
>
>  	if (guest_owns_fp_regs()) {
> --
> 2.47.3
>

* Re: [PATCH v2 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
From: Marc Zyngier @ 2026-05-21 6:35 UTC
To: Mark Rutland
Cc: kvmarm, linux-arm-kernel, kvm, Steffen Eiden, Joey Gouly,
Suzuki K Poulose, Oliver Upton, Zenghui Yu, Will Deacon, Fuad Tabba

On Wed, 20 May 2026 14:02:08 +0100, Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, May 20, 2026 at 09:50:36AM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we save the old state using
> > kvm_arch_vcpu_put(), mutate the state in memory, then load the new
> > state using kvm_arch_vcpu_load(). Any live FPSIMD/SVE state is saved
> > and unbound, such that it can be lazily restored on a subsequent trap.
> >
> > The FPSIMD/SVE state is shared by exception levels, and only a handful
> > of related control registers need to be changed when transitioning
> > between L1 and L2. The save/restore of the common state is needless
> > overhead, especially as trapping becomes exponentially more expensive
> > with nesting.
> >
> > Avoid this overhead by leaving the common FPSIMD/SVE state live on the
> > CPU, and only switching the state that is distinct for L1 and L2:
> >
> > - the trap controls: the effective values are recomputed on each entry
> >   into the guest to take the EL into account and merge the L0 and L1
> >   configuration if in a nested context, or directly use the L0 configuration
> >   in non-nested context (see __activate_traps()).
> >
> > - the VL settings: the effective values are also recomputed on each
> >   entry into the guest (see fpsimd_lazy_switch_to_guest()).
> >
> > Since we appear to cover all bases, use the vcpu flags indicating the
> > handling of a nested ERET or exception delivery to avoid the whole FP
> > save/restore shenanigans. SME will have to be similarly dealt with when
> > it eventually gets supported.
> >
> > For an EL1 L3 guest where L1 and L2 have this optimisation, this
> > results in at least a 10% wall clock reduction when running an I/O
> > heavy workload, generating a high rate of nested exceptions.
>
> There's one additional thing that's important, but I forgot to mention
> last time: in the window between kvm_arch_vcpu_put() and
> kvm_arch_vcpu_load(), it's possible to take an interrupt, and for a
> softirq handler to try to use kernel mode NEON.
>
> Due to that, kvm_arch_vcpu_put() must leave the L1 guest's maximum VL
> configured in the host's ZCR_ELx, such that the guest's state can be
> saved.
>
> That value is configured by fpsimd_lazy_switch_to_host(), so we just
> need to make sure that kvm_arch_vcpu_put() doesn't clobber it. I *think*
> that's fine today, but maybe that warrants a comment somewhere.

I have slapped this onto this patch:

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index aca98752a6e42..3f6b1e29cd6b9 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -117,7 +117,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 	unsigned long flags;

 	/*
-	 * See comment in kvm_arch_vcpu_load_fp().
+	 * See comment in kvm_arch_vcpu_load_fp(). Note that we also rely on
+	 * the guest's max VL to have been set by fpsimd_lazy_switch_to_host()
+	 * so that any intervening kernel-mode SIMD (NEON or otherwise)
+	 * operation sees the full guest state that needs saving.
 	 */
 	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
 	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {

> Other than that, this all looks good to me:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v2 0/2] KVM: arm64: nv: Reduce FP/SVE overhead on exception/exception return
From: Marc Zyngier @ 2026-05-21 7:07 UTC
To: kvmarm, linux-arm-kernel, kvm, Marc Zyngier
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu, Mark Rutland, Will Deacon, Fuad Tabba

On Wed, 20 May 2026 09:50:34 +0100, Marc Zyngier wrote:
> This is the second version of this short series optimising away a lot
> of unnecessary FPSIMD/SVE context switching with NV.
>
> * From v1 [1]:
>
> - New commit message on patch #2 (Mark)
>
> [...]

Applied to next, thanks!

[1/2] KVM: arm64: nv: Track L2 to L1 exception emulation
      commit: 27ae400e6e888153ded1ad807a94a94e506dd2df
[2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
      commit: 435c466196148ae116f616e6cda97c33281defc2

Cheers,

	M.

-- 
Without deviation from the norm, progress is not possible.