Date: Wed, 13 May 2026 13:49:49 +0100
Message-ID: <86cxyzxymq.wl-maz@kernel.org>
From: Marc Zyngier
To: Mark Rutland
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, Steffen Eiden, Joey Gouly, Suzuki K Poulose,
	Oliver Upton, Zenghui Yu, Will Deacon, Fuad Tabba
Subject: Re: [PATCH 2/2] KVM: arm64: nv: Don't save/restore FP register during a nested ERET or exception
References: <20260512140755.3676306-1-maz@kernel.org> <20260512140755.3676306-3-maz@kernel.org>

Hi Mark,

Thanks for looking into this.

On Wed, 13 May 2026 13:28:56 +0100,
Mark Rutland wrote:
>
> On Tue, May 12, 2026 at 03:07:55PM +0100, Marc Zyngier wrote:
> > When switching between L1 and L2, we diligently use a non-preemptible
> > put/load sequence in order to make sure that the old state is saved,
> > while the new state is brought in. Crucially, this includes the FP
> > registers.
> >
> > However, this is a bit silly. The FP registers are completely shared
> > between the various ELs (just like the GPRs, really), and eagerly
> > save/restoring those in a non-preemptible section is just overhead.
> > Not to mention that the next access will end-up trapping, something
> > that becomes exponentially expensive as we nest deeper.
> >
> > The temptation is therefore to completely drop this save/restore thing.
> > Why is it valid to do so? By analogy, the hypervisor doesn't try to
> > poloce things between EL1 and EL0, or between EL2 and EL0. Why should
> > it do so between EL2 and EL1 (or EL2 and L2 EL0)?
> >
> > Once you admit that the FP (and by extension SVE) registers are EL-agnostic,
> > the things that matter are:
>
> s/poloce/police/ ?

That.

>
> The above is a bit flowery; it would be nice to remove the rhetorical
> questions and just state that (aside from some control registers) the
> FPSIMD/SVE/SME state is shared between exception levels and doesn't need
> to be saved/restored.
>
> How about:
>
>   When switching between L1 and L2, we save the old state using
>   kvm_arch_vcpu_put(), mutate the state in memory, then load the new
>   state using kvm_arch_vcpu_load().
>   Any live FPSIMD/SVE state is saved and unbound, such that it can be
>   lazily restored on a subsequent trap.
>
>   The FPSIMD/SVE state is shared by exception levels, and only a handful
>   of related control registers need to be changed when transitioning
>   between L1 and L2. The save/restore of the common state is needless
>   overhead, especially as trapping becomes exponentially more expensive
>   with nesting.
>
>   Avoid this overhead by leaving the common FPSIMD/SVE state live on the
>   CPU, and only switching the state that is distinct for L1 and L2:
>

Sold. Do you offer a CMAAS (Commit Message As A Service)? Asking for a
friend... ;-)

> > - the trap controls: the effective values are recomputed on each entry
> >   into the guest to take the EL into account and merge the L0 and L1
> >   configuration if in a nested context, or directly use the L0 configuration
> >   in non-nested context (see __activate_traps()).
> >
> > - the VL settings: the effective values are also recomputed on each
> >   entry into the guest (see fpsimd_lazy_switch_to_guest()).
>
> This is true for FPSIMD+SVE today. For SME, SMCR_ELx also contains other
> controls, and will need to be dealt with similarly. It might be worth
> noting that (and that ZCR_ELx could gain new controls in future).
>

Yeah. I tried not to worry too much about SME, but given that it is on
people's radar, I'll drop a comment here.

> > Since we appear to cover all bases, use the vcpu flags indicating the
> > handling of a nested ERET or exception delivery to avoid the whole FP
> > save/restore shenanigans.
> >
> > For an EL1 L3 guest where L1 and L2 have this optimisation, this
> > results in at least a 10% wall clock reduction when running an I/O
> > heavy workload, generating a high rate of nested exceptions.
> >
> > Signed-off-by: Marc Zyngier
> > ---
> >  arch/arm64/kvm/fpsimd.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> > index 15e17aca1dec0..73eda0f46b127 100644
> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -28,6 +28,10 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
> >  	if (!system_supports_fpsimd())
> >  		return;
> >
> > +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> > +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION))
> > +		return;
> > +
>
> I think we need a comment as to why this is safe, with some other detail
> from the commit message. It would also be good to have asserts here to
> catch if something goes wrong.
>
> How about:
>
> 	/*
> 	 * Avoid needless save/restore of the guest's common
> 	 * FPSIMD/SVE/SME regs during transitions between L1/L2.
> 	 *
> 	 * These transitions only happen in a non-preemptible context
> 	 * where the host regs have already been saved and unbound. The
> 	 * live registers are either free or owned by the guest.
> 	 */
> 	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> 	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> 		WARN_ON_ONCE(host_owns_fp_regs());
> 		return;
> 	}
>
> ... ?
>
> Note: I didn't add WARN_ON_ONCE(preemptible()), since
> kvm_arch_vcpu_load_fp() should *never* be called in a preemptible
> context.
>
> >  	/*
> >  	 * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such
> >  	 * that the host kernel is responsible for restoring this state upon
> > @@ -102,6 +106,10 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >  {
> >  	unsigned long flags;
> >
> > +	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> > +	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION))
> > +		return;
>
> Likewise here, but we can reduce the comment, e.g.
>
> 	/*
> 	 * See comment in kvm_arch_vcpu_load_fp().
> 	 */
> 	if (vcpu_get_flag(vcpu, IN_NESTED_ERET) ||
> 	    vcpu_get_flag(vcpu, IN_NESTED_EXCEPTION)) {
> 		WARN_ON_ONCE(host_owns_fp_regs());
> 		return;
> 	}

Yup, that all looks good to me. I'll repost that next week with these
changes.

Thanks again,

	M.

-- 
Without deviation from the norm, progress is not possible.