From: Paul Mackerras
Subject: [PATCH 23/23] KVM: PPC: Book3S PR: Reduce number of shadow PTEs invalidated by MMU notifiers
Date: Tue, 6 Aug 2013 14:28:19 +1000
Message-ID: <20130806042819.GC19254@iris.ozlabs.ibm.com>
References: <20130806041259.GF19254@iris.ozlabs.ibm.com>
In-Reply-To: <20130806041259.GF19254@iris.ozlabs.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
To: Alexander Graf, Benjamin Herrenschmidt
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org

Currently, whenever any of the MMU notifier callbacks gets called, we
invalidate all the shadow PTEs.  This is inefficient because it means
that we typically then get a lot of DSIs and ISIs in the guest to fault
the shadow PTEs back in.  We do this even if the address range being
notified doesn't correspond to guest memory.

This commit adds code to scan the memslot array to find out what
range(s) of guest physical addresses correspond to the host virtual
address range being affected.  For each such range we flush only the
shadow PTEs for the range, on all CPUs.

Signed-off-by: Paul Mackerras
---
 arch/powerpc/kvm/book3s_pr.c | 40 ++++++++++++++++++++++++++++++++--------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 71f7cfe..2336d9c 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -150,16 +150,41 @@ int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu)
 }
 
 /************* MMU Notifiers *************/
+static void do_kvm_unmap_hva(struct kvm *kvm, unsigned long start,
+			     unsigned long end)
+{
+	long i;
+	struct kvm_vcpu *vcpu;
+	struct kvm_memslots *slots;
+	struct kvm_memory_slot *memslot;
+
+	slots = kvm_memslots(kvm);
+	kvm_for_each_memslot(memslot, slots) {
+		unsigned long hva_start, hva_end;
+		gfn_t gfn, gfn_end;
+
+		hva_start = max(start, memslot->userspace_addr);
+		hva_end = min(end, memslot->userspace_addr +
+				   (memslot->npages << PAGE_SHIFT));
+		if (hva_start >= hva_end)
+			continue;
+		/*
+		 * {gfn(page) | page intersects with [hva_start, hva_end)} =
+		 * {gfn, gfn+1, ..., gfn_end-1}.
+		 */
+		gfn = hva_to_gfn_memslot(hva_start, memslot);
+		gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot);
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			kvmppc_mmu_pte_pflush(vcpu, gfn << PAGE_SHIFT,
+					      gfn_end << PAGE_SHIFT);
+	}
+}
 
 int kvm_unmap_hva_pr(struct kvm *kvm, unsigned long hva)
 {
 	trace_kvm_unmap_hva(hva);
 
-	/*
-	 * Flush all shadow tlb entries everywhere. This is slow, but
-	 * we are 100% sure that we catch the to be unmapped page
-	 */
-	kvm_flush_remote_tlbs(kvm);
+	do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE);
 
 	return 0;
 }
@@ -167,8 +192,7 @@ int kvm_unmap_hva_pr(struct kvm *kvm, unsigned long hva)
 int kvm_unmap_hva_range_pr(struct kvm *kvm, unsigned long start,
 			   unsigned long end)
 {
-	/* kvm_unmap_hva flushes everything anyways */
-	kvm_unmap_hva(kvm, start);
+	do_kvm_unmap_hva(kvm, start, end);
 
 	return 0;
 }
@@ -188,7 +212,7 @@ int kvm_test_age_hva_pr(struct kvm *kvm, unsigned long hva)
 void kvm_set_spte_hva_pr(struct kvm *kvm, unsigned long hva, pte_t pte)
 {
 	/* The page will get remapped properly on its next fault */
-	kvm_unmap_hva(kvm, hva);
+	do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE);
 }
 
 /*****************************************/
-- 
1.8.3.1
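
For illustration only, not part of the patch: a minimal user-space sketch of the
per-memslot clipping and hva-to-gfn arithmetic that do_kvm_unmap_hva() performs
above.  The struct memslot, the hva_to_gfn() helper and the 4 KiB PAGE_SHIFT are
stand-ins invented for this sketch; they merely mirror the kvm_memory_slot fields
and hva_to_gfn_memslot() as used in the patch.

#include <stdio.h>

#define PAGE_SHIFT	12			/* assume 4 KiB pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Stand-in for the kvm_memory_slot fields the patch uses. */
struct memslot {
	unsigned long base_gfn;		/* first guest frame number */
	unsigned long npages;		/* slot size in pages */
	unsigned long userspace_addr;	/* host virtual base of the slot */
};

/* Mirrors hva_to_gfn_memslot(): host virtual address -> guest frame number. */
static unsigned long hva_to_gfn(unsigned long hva, const struct memslot *slot)
{
	return slot->base_gfn + ((hva - slot->userspace_addr) >> PAGE_SHIFT);
}

int main(void)
{
	struct memslot slot = {
		.base_gfn = 0x100,			/* guest RAM at 1 MiB */
		.npages = 256,				/* 1 MiB slot */
		.userspace_addr = 0x7f0000000000UL,
	};
	/* Notifier range: a single page in the middle of the slot. */
	unsigned long start = slot.userspace_addr + 16 * PAGE_SIZE;
	unsigned long end = start + PAGE_SIZE;

	/* Clip the notified hva range to the slot, as the patch does. */
	unsigned long slot_end = slot.userspace_addr + (slot.npages << PAGE_SHIFT);
	unsigned long hva_start = start > slot.userspace_addr ? start : slot.userspace_addr;
	unsigned long hva_end = end < slot_end ? end : slot_end;

	if (hva_start < hva_end) {
		unsigned long gfn = hva_to_gfn(hva_start, &slot);
		unsigned long gfn_end = hva_to_gfn(hva_end + PAGE_SIZE - 1, &slot);

		/* The patch flushes shadow PTEs only for this guest-physical range. */
		printf("flush gpa range 0x%lx - 0x%lx\n",
		       gfn << PAGE_SHIFT, gfn_end << PAGE_SHIFT);
	}
	return 0;
}

With these example numbers the sketch prints a single-page guest-physical range
(0x110000 - 0x111000), whereas the code being removed would have flushed every
shadow PTE for any notifier event, whether or not it overlapped guest memory.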