[PATCH v6 11/11] KVM: arm64: Use TLBI range-based intructions for unmap

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Raghavendra Rao Ananta <rananta@google.com>
To: Oliver Upton <oliver.upton@linux.dev>,
	Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Anup Patel <anup@brainfault.org>,
	Atish Patra <atishp@atishpatra.org>,
	Jing Zhang <jingzhangos@google.com>,
	Colton Lewis <coltonlewis@google.com>,
	Raghavendra Rao Anata <rananta@google.com>,
	David Matlack <dmatlack@google.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-mips@vger.kernel.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Subject: [PATCH v6 11/11] KVM: arm64: Use TLBI range-based intructions for unmap
Date: Sat, 15 Jul 2023 00:54:05 +0000	[thread overview]
Message-ID: <20230715005405.3689586-12-rananta@google.com> (raw)
In-Reply-To: <20230715005405.3689586-1-rananta@google.com>

The current implementation of the stage-2 unmap walker traverses
the given range and, as a part of break-before-make, performs
TLB invalidations with a DSB for every PTE. A multitude of this
combination could cause a performance bottleneck on some systems.

Hence, if the system supports FEAT_TLBIRANGE, defer the TLB
invalidations until the entire walk is finished, and then
use range-based instructions to invalidate the TLBs in one go.
Condition deferred TLB invalidation on the system supporting FWB,
as the optimization is entirely pointless when the unmap walker
needs to perform CMOs.

Rename stage2_put_pte() to stage2_unmap_put_pte() as the function
now serves the stage-2 unmap walker specifically, rather than
acting generic.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 67 +++++++++++++++++++++++++++++++-----
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 5ef098af1736..cf88933a2ea0 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -831,16 +831,54 @@ static void stage2_make_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t n
 	smp_store_release(ctx->ptep, new);
 }
 
-static void stage2_put_pte(const struct kvm_pgtable_visit_ctx *ctx, struct kvm_s2_mmu *mmu,
-			   struct kvm_pgtable_mm_ops *mm_ops)
+struct stage2_unmap_data {
+	struct kvm_pgtable *pgt;
+	bool defer_tlb_flush_init;
+};
+
+static bool __stage2_unmap_defer_tlb_flush(struct kvm_pgtable *pgt)
+{
+	/*
+	 * If FEAT_TLBIRANGE is implemented, defer the individual
+	 * TLB invalidations until the entire walk is finished, and
+	 * then use the range-based TLBI instructions to do the
+	 * invalidations. Condition deferred TLB invalidation on the
+	 * system supporting FWB, as the optimization is entirely
+	 * pointless when the unmap walker needs to perform CMOs.
+	 */
+	return system_supports_tlb_range() && stage2_has_fwb(pgt);
+}
+
+static bool stage2_unmap_defer_tlb_flush(struct stage2_unmap_data *unmap_data)
+{
+	bool defer_tlb_flush = __stage2_unmap_defer_tlb_flush(unmap_data->pgt);
+
+	/*
+	 * Since __stage2_unmap_defer_tlb_flush() is based on alternative
+	 * patching and the TLBIs' operations behavior depend on this,
+	 * track if there's any change in the state during the unmap sequence.
+	 */
+	WARN_ON(unmap_data->defer_tlb_flush_init != defer_tlb_flush);
+	return defer_tlb_flush;
+}
+
+static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
+				struct kvm_s2_mmu *mmu,
+				struct kvm_pgtable_mm_ops *mm_ops)
 {
+	struct stage2_unmap_data *unmap_data = ctx->arg;
+
 	/*
-	 * Clear the existing PTE, and perform break-before-make with
-	 * TLB maintenance if it was valid.
+	 * Clear the existing PTE, and perform break-before-make if it was
+	 * valid. Depending on the system support, the TLB maintenance for
+	 * the same can be deferred until the entire unmap is completed.
 	 */
 	if (kvm_pte_valid(ctx->old)) {
 		kvm_clear_pte(ctx->ptep);
-		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level);
+
+		if (!stage2_unmap_defer_tlb_flush(unmap_data))
+			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
+					ctx->addr, ctx->level);
 	}
 
 	mm_ops->put_page(ctx->ptep);
@@ -1070,7 +1108,8 @@ int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size,
 static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 			       enum kvm_pgtable_walk_flags visit)
 {
-	struct kvm_pgtable *pgt = ctx->arg;
+	struct stage2_unmap_data *unmap_data = ctx->arg;
+	struct kvm_pgtable *pgt = unmap_data->pgt;
 	struct kvm_s2_mmu *mmu = pgt->mmu;
 	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
 	kvm_pte_t *childp = NULL;
@@ -1098,7 +1137,7 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	 * block entry and rely on the remaining portions being faulted
 	 * back lazily.
 	 */
-	stage2_put_pte(ctx, mmu, mm_ops);
+	stage2_unmap_put_pte(ctx, mmu, mm_ops);
 
 	if (need_flush && mm_ops->dcache_clean_inval_poc)
 		mm_ops->dcache_clean_inval_poc(kvm_pte_follow(ctx->old, mm_ops),
@@ -1112,13 +1151,23 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 
 int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
 {
+	int ret;
+	struct stage2_unmap_data unmap_data = {
+		.pgt = pgt,
+		.defer_tlb_flush_init = __stage2_unmap_defer_tlb_flush(pgt),
+	};
 	struct kvm_pgtable_walker walker = {
 		.cb	= stage2_unmap_walker,
-		.arg	= pgt,
+		.arg	= &unmap_data,
 		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
 	};
 
-	return kvm_pgtable_walk(pgt, addr, size, &walker);
+	ret = kvm_pgtable_walk(pgt, addr, size, &walker);
+	if (stage2_unmap_defer_tlb_flush(&unmap_data))
+		/* Perform the deferred TLB invalidations */
+		kvm_tlb_flush_vmid_range(pgt->mmu, addr, size);
+
+	return ret;
 }
 
 struct stage2_attr_data {
-- 
2.41.0.455.g037347b96a-goog

next prev parent reply	other threads:[~2023-07-15  0:54 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-15  0:53 [PATCH v6 00/11] KVM: arm64: Add support for FEAT_TLBIRANGE Raghavendra Rao Ananta
2023-07-15  0:53 ` [PATCH v6 01/11] KVM: Rename kvm_arch_flush_remote_tlb() to kvm_arch_flush_remote_tlbs() Raghavendra Rao Ananta
2023-07-17  8:03   ` Philippe Mathieu-Daudé
2023-07-17 11:06   ` Shaoqin Huang
2023-07-15  0:53 ` [PATCH v6 02/11] KVM: arm64: Use kvm_arch_flush_remote_tlbs() Raghavendra Rao Ananta
2023-07-17  8:12   ` Philippe Mathieu-Daudé
2023-07-17 16:28     ` Raghavendra Rao Ananta
2023-07-17 11:27   ` Shaoqin Huang
2023-07-15  0:53 ` [PATCH v6 03/11] KVM: Allow range-based TLB invalidation from common code Raghavendra Rao Ananta
2023-07-17 11:40   ` Shaoqin Huang
2023-07-17 16:37     ` Raghavendra Rao Ananta
2023-07-18  2:49       ` Shaoqin Huang
2023-07-18 16:31         ` Raghavendra Rao Ananta
2023-07-15  0:53 ` [PATCH v6 04/11] KVM: Move kvm_arch_flush_remote_tlbs_memslot() to " Raghavendra Rao Ananta
2023-07-18  2:47   ` Shaoqin Huang
2023-07-15  0:53 ` [PATCH v6 05/11] arm64: tlb: Refactor the core flush algorithm of __flush_tlb_range Raghavendra Rao Ananta
2023-07-18  6:21   ` Shaoqin Huang
2023-07-15  0:54 ` [PATCH v6 06/11] KVM: arm64: Implement __kvm_tlb_flush_vmid_range() Raghavendra Rao Ananta
2023-07-18  7:50   ` Shaoqin Huang
2023-07-18 16:39     ` Raghavendra Rao Ananta
2023-07-15  0:54 ` [PATCH v6 07/11] KVM: arm64: Define kvm_tlb_flush_vmid_range() Raghavendra Rao Ananta
2023-07-18  8:39   ` Shaoqin Huang
2023-07-15  0:54 ` [PATCH v6 08/11] KVM: arm64: Implement kvm_arch_flush_remote_tlbs_range() Raghavendra Rao Ananta
2023-07-18 11:15   ` Shaoqin Huang
2023-07-15  0:54 ` [PATCH v6 09/11] KVM: arm64: Flush only the memslot after write-protect Raghavendra Rao Ananta
2023-07-18 11:17   ` Shaoqin Huang
2023-07-15  0:54 ` [PATCH v6 10/11] KVM: arm64: Invalidate the table entries upon a range Raghavendra Rao Ananta
2023-07-18 11:17   ` Shaoqin Huang
2023-07-15  0:54 ` Raghavendra Rao Ananta [this message]
2023-07-15  2:02 ` [PATCH v6 00/11] KVM: arm64: Add support for FEAT_TLBIRANGE Raghavendra Rao Ananta

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:5ef098af173 dfblob:cf88933a2ea )
 OR (
bs:"[PATCH v6 11/11] KVM: arm64: Use TLBI range-based intructions for unmap" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230715005405.3689586-12-rananta@google.com \
    --to=rananta@google.com \
    --cc=anup@brainfault.org \
    --cc=atishp@atishpatra.org \
    --cc=chenhuacai@kernel.org \
    --cc=coltonlewis@google.com \
    --cc=dmatlack@google.com \
    --cc=james.morse@arm.com \
    --cc=jingzhangos@google.com \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).