From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 765F21D8E1F for ; Mon, 7 Oct 2024 19:06:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728327996; cv=none; b=q5ZtOum+qecDEbDsV6JPzEoZTn8eYutc2/2tayDDFMPC8pgpcXMk0uBwGMqYV0mMEzTusPsHxxU6T6FDPNnf9W2q82uKPmOhsW6u9/DxwZrpt6coWtPuEv4Uky2YQX5bxBZxTK5Mu1w65e0tFkxnMlMOlWSFMbuVg/nARr7sZcU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728327996; c=relaxed/simple; bh=qPI+bYV597cxqfbIFGZeEgeF+Pa6ctBHeM3AM0HEjiE=; h=Date:Message-ID:From:To:Cc:Subject:In-Reply-To:References: MIME-Version:Content-Type; b=Y9imUuwnO9VM4/Xgk2fd7UC6b5ArrVwqzFKDR+jmwn8PmZOpCEAqcj96l0JZh14w+DVjICH5mLxafJxcP8SnFCDwneP8GCmbnqiWxmHP5FNx/J6afGkJgYY5gesuB/f7saJr1QV2GZOsoLOURdWiG6+EXDkPQ4vb/P+lad6d0ys= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Vzzw6vZL; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Vzzw6vZL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 132C9C4CEC6; Mon, 7 Oct 2024 19:06:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1728327996; bh=qPI+bYV597cxqfbIFGZeEgeF+Pa6ctBHeM3AM0HEjiE=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=Vzzw6vZLrjkt78LaGiknHLhIHSrIVybCaJA2aS9oWn8SWadEOxVRwsh9T0PK/Y1p5 mtQgap3Tl6iIlS2lsp+bd6fp7ksOdYr7Oo9s4tl8CETOqRNoVnphR1oQy5gAZ8ihhn 0JoB4nO8cnY5DX9pmD49bPOw9lIvAhpItXU9PyBAlJEzbvFfdegFQGHhCT7qBEm7JZ +weATAoFS+x0qASnra6m41nJEmEzvN38RwR9neykRLrgfXvWfn83j6TcdIzO8gBBIS IzjzmPYGDnPP2TqtVo+eQ1J3U+vsv789bGiA7X00QLmUIK6M1KPa92gzGx5wrARfhW YI+E3gZeBAKYA== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1sxt3v-00183u-Fr; Mon, 07 Oct 2024 20:06:33 +0100 Date: Mon, 07 Oct 2024 20:06:30 +0100 Message-ID: <86y12z68cp.wl-maz@kernel.org> From: Marc Zyngier To: Oliver Upton Cc: kvmarm@lists.linux.dev, Joey Gouly , Suzuki K Poulose , Zenghui Yu , Sean Christopherson Subject: Re: [PATCH v2 3/4] KVM: arm64: nv: Punt stage-2 recycling to a vCPU request In-Reply-To: <20241007164256.1795250-4-oliver.upton@linux.dev> References: <20241007164256.1795250-1-oliver.upton@linux.dev> <20241007164256.1795250-4-oliver.upton@linux.dev> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/29.4 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: oliver.upton@linux.dev, kvmarm@lists.linux.dev, joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, seanjc@google.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false On Mon, 07 Oct 2024 17:42:54 +0100, Oliver Upton wrote: > > Currently, when a nested MMU is repurposed for some other MMU context, > KVM unmaps everything during vcpu_load() while holding the MMU lock for > write. This is quite a performance bottleneck for large nested VMs, as > all vCPU scheduling will spin until the unmap completes. > > Start punting the MMU cleanup to a vCPU request, where it is then > possible to periodically release the MMU lock and CPU in the presence of > contention. > > Ensure that no vCPU winds up using a stale MMU by tracking the pending > unmap on the S2 MMU itself and requesting an unmap on every vCPU that > finds it. > > Signed-off-by: Oliver Upton > --- > arch/arm64/include/asm/kvm_host.h | 3 +++ > arch/arm64/include/asm/kvm_nested.h | 2 ++ > arch/arm64/kvm/arm.c | 2 ++ > arch/arm64/kvm/nested.c | 21 +++++++++++++++++++-- > 4 files changed, 26 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index 329619c6fa96..16b4b0003e7d 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -51,6 +51,7 @@ > #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) > #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) > #define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7) > +#define KVM_REQ_NESTED_S2_UNMAP KVM_ARCH_REQ(8) > > #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ > KVM_DIRTY_LOG_INITIALLY_SET) > @@ -216,6 +217,8 @@ struct kvm_s2_mmu { > * >0: Somebody is actively using this. > */ > atomic_t refcnt; > + > + bool pending_unmap; Nit: move this up next to nested_stage2_enabled, saving a grand total of 4 bytes... > }; > > struct kvm_arch_memory_slot { > diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h > index e74b90dcfac4..233e65522716 100644 > --- a/arch/arm64/include/asm/kvm_nested.h > +++ b/arch/arm64/include/asm/kvm_nested.h > @@ -78,6 +78,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid, > extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); > extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); > > +extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu); > + > struct kvm_s2_trans { > phys_addr_t output; > unsigned long block_size; > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > index a0d01c46e408..c7b659604bae 100644 > --- a/arch/arm64/kvm/arm.c > +++ b/arch/arm64/kvm/arm.c > @@ -1031,6 +1031,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) > > if (kvm_dirty_ring_check_request(vcpu)) > return 0; > + > + check_nested_vcpu_requests(vcpu); > } > > return 1; > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c > index 58d3b998793a..71b96eaf6632 100644 > --- a/arch/arm64/kvm/nested.c > +++ b/arch/arm64/kvm/nested.c > @@ -632,9 +632,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) > /* Set the scene for the next search */ > kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size; > > - /* Clear the old state */ > + /* Make sure we don't forget to do the laundry */ > if (kvm_s2_mmu_valid(s2_mmu)) > - kvm_stage2_unmap_range(s2_mmu, 0, kvm_phys_size(s2_mmu), false); > + s2_mmu->pending_unmap = true; > > /* > * The virtual VMID (modulo CnP) will be used as a key when matching > @@ -650,6 +650,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) > > out: > atomic_inc(&s2_mmu->refcnt); > + if (s2_mmu->pending_unmap) > + kvm_make_request(KVM_REQ_NESTED_S2_UNMAP, vcpu); > + This bit is awesomely subtle. Any vcpu incrementing the refcount on an MMU being laundered gets to partake in the cleaning process. Without this, the second guy would get to use the MMU before it has been cleaned-up. If you don't mind, I'd like to add a comment to that effect, so that our future selves remember what this is about. And move the pending_unmap field in the process. Thanks, M. -- Without deviation from the norm, progress is not possible.