public inbox for kvm@vger.kernel.org
* [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
@ 2013-04-01  9:56 Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 1/6] KVM: MMU: retain more available bits on mmio spte Xiao Guangrong
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

Changelog in v2:
  - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
  - use kvm->memslots->generation as kvm global generation-number
  - fix comment and codestyle
  - init kvm generation close to mmio wrap-around value
  - keep kvm_mmu_zap_mmio_sptes

The current way holds the hot mmu-lock and walks all shadow pages, which
does not scale. This patchset introduces a very simple and scalable way
to quickly invalidate all mmio sptes - it does not need to walk any shadow
pages or hold any locks.

The idea is simple:
KVM maintains a global mmio invalid generation-number, which is stored in
kvm->memslots.generation, and every mmio spte stores the current global
generation-number in its available bits when it is created.

When KVM needs to zap all mmio sptes, it simply increases the global
generation-number. When a guest does an mmio access, KVM intercepts the
MMIO #PF, walks the shadow page table and gets the mmio spte. If the
generation-number on the spte does not equal the global generation-number,
it goes to the normal #PF handler to update the mmio spte.

Since 19 bits are used to store the generation-number in the mmio spte, we
zap all mmio sptes when the number wraps around.

Xiao Guangrong (6):
  KVM: MMU: retain more available bits on mmio spte
  KVM: MMU: store generation-number into mmio spte
  KVM: MMU: make return value of mmio page fault handler more readable
  KVM: MMU: fast invalid all mmio sptes
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: init kvm generation close to mmio wrap-around value

 arch/x86/include/asm/kvm_host.h |    3 +-
 arch/x86/kvm/mmu.c              |  134 +++++++++++++++++++++++++++++++--------
 arch/x86/kvm/mmu.h              |   17 +++++
 arch/x86/kvm/mmutrace.h         |   34 +++++++++-
 arch/x86/kvm/paging_tmpl.h      |   10 ++-
 arch/x86/kvm/vmx.c              |   12 +++-
 arch/x86/kvm/x86.c              |   11 +++-
 virt/kvm/kvm_main.c             |    6 ++
 8 files changed, 186 insertions(+), 41 deletions(-)

-- 
1.7.7.6


* [PATCH v2 1/6] KVM: MMU: retain more available bits on mmio spte
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 2/6] KVM: MMU: store generation-number into " Xiao Guangrong
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

Let the mmio spte use only bit 62 and bit 63 of the upper 32 bits, so that
bit 52 ~ bit 61 can be used for other purposes.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/vmx.c |    4 ++--
 arch/x86/kvm/x86.c |    8 +++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 03f5746..915ef56 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3983,10 +3983,10 @@ static void ept_set_mmio_spte_mask(void)
 	/*
 	 * EPT Misconfigurations can be generated if the value of bits 2:0
 	 * of an EPT paging-structure entry is 110b (write/execute).
-	 * Also, magic bits (0xffull << 49) is set to quickly identify mmio
+	 * Also, magic bits (0x3ull << 62) is set to quickly identify mmio
 	 * spte.
 	 */
-	kvm_mmu_set_mmio_spte_mask(0xffull << 49 | 0x6ull);
+	kvm_mmu_set_mmio_spte_mask((0x3ull << 62) | 0x6ull);
 }
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9ce99ed..12ad5b5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5194,7 +5194,13 @@ static void kvm_set_mmio_spte_mask(void)
 	 * Set the reserved bits and the present bit of an paging-structure
 	 * entry to generate page fault with PFER.RSV = 1.
 	 */
-	mask = ((1ull << (62 - maxphyaddr + 1)) - 1) << maxphyaddr;
+	 /* Mask the reserved physical address bits. */
+	mask = ((1ull << (51 - maxphyaddr + 1)) - 1) << maxphyaddr;
+
+	/* Bit 62 is always reserved for 32bit host. */
+	mask |= 0x3ull << 62;
+
+	/* Set the present bit. */
 	mask |= 1ull;
 
 #ifdef CONFIG_X86_64
-- 
1.7.7.6


* [PATCH v2 2/6] KVM: MMU: store generation-number into mmio spte
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 1/6] KVM: MMU: retain more available bits on mmio spte Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable Xiao Guangrong
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

Store the generation-number in bit3 ~ bit11 and bit52 ~ bit61; in total
19 bits can be used, which should be enough for nearly all common cases.

In this patch the generation-number is always 0; it will be changed in a
later patch.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c         |   58 +++++++++++++++++++++++++++++++++++--------
 arch/x86/kvm/mmutrace.h    |   10 ++++---
 arch/x86/kvm/paging_tmpl.h |    3 +-
 3 files changed, 55 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1ebca53..be4f733 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -197,15 +197,50 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 
-static void mark_mmio_spte(u64 *sptep, u64 gfn, unsigned access)
+/*
+ * spte bits of bit 3 ~ bit 11 are used as low 9 bits of generation number,
+ * the bits of bits 52 ~ bit 61 are used as high 10 bits of generation
+ * number.
+ */
+#define MMIO_SPTE_GEN_LOW_SHIFT		3
+#define MMIO_SPTE_GEN_HIGH_SHIFT	52
+
+#define MMIO_GEN_LOW_SHIFT		9
+#define MMIO_GEN_LOW_MASK		((1 << MMIO_GEN_LOW_SHIFT) - 1)
+#define MMIO_MAX_GEN			((1 << 19) - 1)
+
+static u64 generation_mmio_spte_mask(unsigned int gen)
 {
-	struct kvm_mmu_page *sp =  page_header(__pa(sptep));
+	u64 mask;
+
+	WARN_ON(gen > MMIO_MAX_GEN);
+
+	mask = (gen & MMIO_GEN_LOW_MASK) << MMIO_SPTE_GEN_LOW_SHIFT;
+	mask |= ((u64)gen >> MMIO_GEN_LOW_SHIFT) << MMIO_SPTE_GEN_HIGH_SHIFT;
+	return mask;
+}
+
+static unsigned int get_mmio_spte_generation(u64 spte)
+{
+	unsigned int gen;
+
+	spte &= ~shadow_mmio_mask;
+
+	gen = (spte >> MMIO_SPTE_GEN_LOW_SHIFT) & MMIO_GEN_LOW_MASK;
+	gen |= (spte >> MMIO_SPTE_GEN_HIGH_SHIFT) << MMIO_GEN_LOW_SHIFT;
+	return gen;
+}
+
+static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
+			   unsigned access)
+{
+	u64 mask = generation_mmio_spte_mask(0);
 
 	access &= ACC_WRITE_MASK | ACC_USER_MASK;
+	mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
 
-	sp->mmio_cached = true;
-	trace_mark_mmio_spte(sptep, gfn, access);
-	mmu_spte_set(sptep, shadow_mmio_mask | access | gfn << PAGE_SHIFT);
+	trace_mark_mmio_spte(sptep, gfn, access, 0);
+	mmu_spte_set(sptep, mask);
 }
 
 static bool is_mmio_spte(u64 spte)
@@ -223,10 +258,11 @@ static unsigned get_mmio_spte_access(u64 spte)
 	return (spte & ~shadow_mmio_mask) & ~PAGE_MASK;
 }
 
-static bool set_mmio_spte(u64 *sptep, gfn_t gfn, pfn_t pfn, unsigned access)
+static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
+			  pfn_t pfn, unsigned access)
 {
 	if (unlikely(is_noslot_pfn(pfn))) {
-		mark_mmio_spte(sptep, gfn, access);
+		mark_mmio_spte(kvm, sptep, gfn, access);
 		return true;
 	}
 
@@ -2335,7 +2371,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	u64 spte;
 	int ret = 0;
 
-	if (set_mmio_spte(sptep, gfn, pfn, pte_access))
+	if (set_mmio_spte(vcpu->kvm, sptep, gfn, pfn, pte_access))
 		return 0;
 
 	spte = PT_PRESENT_MASK;
@@ -3388,8 +3424,8 @@ static inline void protect_clean_gpte(unsigned *access, unsigned gpte)
 	*access &= mask;
 }
 
-static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
-			   int *nr_present)
+static bool sync_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
+			   unsigned access, int *nr_present)
 {
 	if (unlikely(is_mmio_spte(*sptep))) {
 		if (gfn != get_mmio_spte_gfn(*sptep)) {
@@ -3398,7 +3434,7 @@ static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
 		}
 
 		(*nr_present)++;
-		mark_mmio_spte(sptep, gfn, access);
+		mark_mmio_spte(kvm, sptep, gfn, access);
 		return true;
 	}
 
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index b8f6172..f5b62a7 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -197,23 +197,25 @@ DEFINE_EVENT(kvm_mmu_page_class, kvm_mmu_prepare_zap_page,
 
 TRACE_EVENT(
 	mark_mmio_spte,
-	TP_PROTO(u64 *sptep, gfn_t gfn, unsigned access),
-	TP_ARGS(sptep, gfn, access),
+	TP_PROTO(u64 *sptep, gfn_t gfn, unsigned access, unsigned int gen),
+	TP_ARGS(sptep, gfn, access, gen),
 
 	TP_STRUCT__entry(
 		__field(void *, sptep)
 		__field(gfn_t, gfn)
 		__field(unsigned, access)
+		__field(unsigned int, gen)
 	),
 
 	TP_fast_assign(
 		__entry->sptep = sptep;
 		__entry->gfn = gfn;
 		__entry->access = access;
+		__entry->gen = gen;
 	),
 
-	TP_printk("sptep:%p gfn %llx access %x", __entry->sptep, __entry->gfn,
-		  __entry->access)
+	TP_printk("sptep:%p gfn %llx access %x gen %x", __entry->sptep,
+		  __entry->gfn, __entry->access, __entry->gen)
 );
 
 TRACE_EVENT(
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index af143f0..bddc5f6 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -791,7 +791,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 		pte_access &= gpte_access(vcpu, gpte);
 		protect_clean_gpte(&pte_access, gpte);
 
-		if (sync_mmio_spte(&sp->spt[i], gfn, pte_access, &nr_present))
+		if (sync_mmio_spte(vcpu->kvm, &sp->spt[i], gfn, pte_access,
+		      &nr_present))
 			continue;
 
 		if (gfn != sp->gfns[i]) {
-- 
1.7.7.6


* [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 1/6] KVM: MMU: retain more available bits on mmio spte Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 2/6] KVM: MMU: store generation-number into " Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-24 13:34   ` Gleb Natapov
  2013-04-01  9:56 ` [PATCH v2 4/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

Define meaningful names instead of raw numeric return values

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c |   15 +++++----------
 arch/x86/kvm/mmu.h |   14 ++++++++++++++
 arch/x86/kvm/vmx.c |    4 ++--
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index be4f733..31c5586 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3182,17 +3182,12 @@ static u64 walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr)
 	return spte;
 }
 
-/*
- * If it is a real mmio page fault, return 1 and emulat the instruction
- * directly, return 0 to let CPU fault again on the address, -1 is
- * returned if bug is detected.
- */
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 {
 	u64 spte;
 
 	if (quickly_check_mmio_pf(vcpu, addr, direct))
-		return 1;
+		return RET_MMIO_PF_EMU;
 
 	spte = walk_shadow_page_get_mmio_spte(vcpu, addr);
 
@@ -3205,7 +3200,7 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 
 		trace_handle_mmio_page_fault(addr, gfn, access);
 		vcpu_cache_mmio_info(vcpu, addr, gfn, access);
-		return 1;
+		return RET_MMIO_PF_EMU;
 	}
 
 	/*
@@ -3213,13 +3208,13 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	 * it's a BUG if the gfn is not a mmio page.
 	 */
 	if (direct && !check_direct_spte_mmio_pf(spte))
-		return -1;
+		return RET_MMIO_PF_BUG;
 
 	/*
 	 * If the page table is zapped by other cpus, let CPU fault again on
 	 * the address.
 	 */
-	return 0;
+	return RET_MMIO_PF_RETRY;
 }
 EXPORT_SYMBOL_GPL(handle_mmio_page_fault_common);
 
@@ -3229,7 +3224,7 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr,
 	int ret;
 
 	ret = handle_mmio_page_fault_common(vcpu, addr, direct);
-	WARN_ON(ret < 0);
+	WARN_ON(ret == RET_MMIO_PF_BUG);
 	return ret;
 }
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2adcbc2..6b4ba1e 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -52,6 +52,20 @@
 
 int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
+
+/*
+ * Return values of handle_mmio_page_fault_common:
+ * RET_MMIO_PF_EMU: it is a real mmio page fault, emulate the instruction
+ *		    directly.
+ * RET_MMIO_PF_RETRY: let CPU fault again on the address.
+ * RET_MMIO_PF_BUG: bug is detected.
+ */
+enum {
+	RET_MMIO_PF_EMU = 1,
+	RET_MMIO_PF_RETRY = 0,
+	RET_MMIO_PF_BUG = -1
+};
+
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 915ef56..d0f2790 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5135,10 +5135,10 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 
 	ret = handle_mmio_page_fault_common(vcpu, gpa, true);
-	if (likely(ret == 1))
+	if (likely(ret == RET_MMIO_PF_EMU))
 		return x86_emulate_instruction(vcpu, gpa, 0, NULL, 0) ==
 					      EMULATE_DONE;
-	if (unlikely(!ret))
+	if (unlikely(ret == RET_MMIO_PF_RETRY))
 		return 1;
 
 	/* It is the real ept misconfig */
-- 
1.7.7.6


* [PATCH v2 4/6] KVM: MMU: fast invalid all mmio sptes
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
                   ` (2 preceding siblings ...)
  2013-04-01  9:56 ` [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 5/6] KVM: MMU: add tracepoint for check_mmio_spte Xiao Guangrong
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

This patch introduces a very simple and scalable way to invalidate all
mmio sptes - it does not need to walk any shadow pages or hold the mmu-lock.

KVM maintains a global mmio invalid generation-number, which is stored in
kvm->memslots.generation, and every mmio spte stores the current global
generation-number in its available bits when it is created.

When KVM needs to zap all mmio sptes, it simply increases the global
generation-number. When a guest does an mmio access, KVM intercepts the
MMIO #PF, walks the shadow page table and gets the mmio spte. If the
generation-number on the spte does not equal the global generation-number,
it goes to the normal #PF handler to update the mmio spte.

Since 19 bits are used to store the generation-number in the mmio spte, we
zap all mmio sptes when the number wraps around.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_host.h |    2 +-
 arch/x86/kvm/mmu.c              |   54 +++++++++++++++++++++++++++++++++------
 arch/x86/kvm/mmu.h              |    3 ++
 arch/x86/kvm/paging_tmpl.h      |    7 +++-
 arch/x86/kvm/vmx.c              |    4 +++
 arch/x86/kvm/x86.c              |    3 +-
 6 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b5a6462..6c1e642 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -767,7 +767,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
-void kvm_mmu_zap_mmio_sptes(struct kvm *kvm);
+void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 31c5586..1020152 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -205,9 +205,11 @@ EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 #define MMIO_SPTE_GEN_LOW_SHIFT		3
 #define MMIO_SPTE_GEN_HIGH_SHIFT	52
 
+#define MMIO_GEN_SHIFT			19
 #define MMIO_GEN_LOW_SHIFT		9
 #define MMIO_GEN_LOW_MASK		((1 << MMIO_GEN_LOW_SHIFT) - 1)
-#define MMIO_MAX_GEN			((1 << 19) - 1)
+#define MMIO_GEN_MASK			((1 << MMIO_GEN_SHIFT) - 1)
+#define MMIO_MAX_GEN			((1 << MMIO_GEN_SHIFT) - 1)
 
 static u64 generation_mmio_spte_mask(unsigned int gen)
 {
@@ -231,15 +233,21 @@ static unsigned int get_mmio_spte_generation(u64 spte)
 	return gen;
 }
 
+static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
+{
+	return kvm_memslots(kvm)->generation & MMIO_GEN_MASK;
+}
+
 static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
 			   unsigned access)
 {
-	u64 mask = generation_mmio_spte_mask(0);
+	unsigned int gen = kvm_current_mmio_generation(kvm);
+	u64 mask = generation_mmio_spte_mask(gen);
 
 	access &= ACC_WRITE_MASK | ACC_USER_MASK;
 	mask |= shadow_mmio_mask | access | gfn << PAGE_SHIFT;
 
-	trace_mark_mmio_spte(sptep, gfn, access, 0);
+	trace_mark_mmio_spte(sptep, gfn, access, gen);
 	mmu_spte_set(sptep, mask);
 }
 
@@ -269,6 +277,12 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
 	return false;
 }
 
+static bool check_mmio_spte(struct kvm *kvm, u64 spte)
+{
+	return get_mmio_spte_generation(spte) ==
+		      kvm_current_mmio_generation(kvm);
+}
+
 static inline u64 rsvd_bits(int s, int e)
 {
 	return ((1ULL << (e - s + 1)) - 1) << s;
@@ -3195,6 +3209,9 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 		gfn_t gfn = get_mmio_spte_gfn(spte);
 		unsigned access = get_mmio_spte_access(spte);
 
+		if (unlikely(!check_mmio_spte(vcpu->kvm, spte)))
+			return RET_MMIO_PF_INVALID;
+
 		if (direct)
 			addr = 0;
 
@@ -3236,8 +3253,12 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 
 	pgprintk("%s: gva %lx error %x\n", __func__, gva, error_code);
 
-	if (unlikely(error_code & PFERR_RSVD_MASK))
-		return handle_mmio_page_fault(vcpu, gva, error_code, true);
+	if (unlikely(error_code & PFERR_RSVD_MASK)) {
+		r = handle_mmio_page_fault(vcpu, gva, error_code, true);
+
+		if (likely(r != RET_MMIO_PF_INVALID))
+			return r;
+	}
 
 	r = mmu_topup_memory_caches(vcpu);
 	if (r)
@@ -3313,8 +3334,12 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	ASSERT(vcpu);
 	ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
 
-	if (unlikely(error_code & PFERR_RSVD_MASK))
-		return handle_mmio_page_fault(vcpu, gpa, error_code, true);
+	if (unlikely(error_code & PFERR_RSVD_MASK)) {
+		r = handle_mmio_page_fault(vcpu, gpa, error_code, true);
+
+		if (likely(r != RET_MMIO_PF_INVALID))
+			return r;
+	}
 
 	r = mmu_topup_memory_caches(vcpu);
 	if (r)
@@ -4231,7 +4256,7 @@ restart:
 	spin_unlock(&kvm->mmu_lock);
 }
 
-void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
+static void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
 {
 	struct kvm_mmu_page *sp, *node;
 	LIST_HEAD(invalid_list);
@@ -4249,6 +4274,19 @@ restart:
 	spin_unlock(&kvm->mmu_lock);
 }
 
+void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm)
+{
+	/*
+	 * The very rare case: if the generation-number is round,
+	 * zap all shadow pages.
+	 *
+	 * The max value is MMIO_MAX_GEN - 1 since it is not called
+	 * when mark memslot invalid.
+	 */
+	if (unlikely(kvm_current_mmio_generation(kvm) >= (MMIO_MAX_GEN - 1)))
+		kvm_mmu_zap_mmio_sptes(kvm);
+}
+
 static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct kvm *kvm;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 6b4ba1e..ffd40d1 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -57,11 +57,14 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
  * Return values of handle_mmio_page_fault_common:
  * RET_MMIO_PF_EMU: it is a real mmio page fault, emulate the instruction
  *		    directly.
+ * RET_MMIO_PF_INVALID: invalid spte is detected then let the real page
+ *			fault path update the mmio spte.
  * RET_MMIO_PF_RETRY: let CPU fault again on the address.
  * RET_MMIO_PF_BUG: bug is detected.
  */
 enum {
 	RET_MMIO_PF_EMU = 1,
+	RET_MMIO_PF_INVALID = 2,
 	RET_MMIO_PF_RETRY = 0,
 	RET_MMIO_PF_BUG = -1
 };
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index bddc5f6..29766ec 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -552,9 +552,12 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
 
 	pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
 
-	if (unlikely(error_code & PFERR_RSVD_MASK))
-		return handle_mmio_page_fault(vcpu, addr, error_code,
+	if (unlikely(error_code & PFERR_RSVD_MASK)) {
+		r = handle_mmio_page_fault(vcpu, addr, error_code,
 					      mmu_is_nested(vcpu));
+		if (likely(r != RET_MMIO_PF_INVALID))
+			return r;
+	};
 
 	r = mmu_topup_memory_caches(vcpu);
 	if (r)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d0f2790..9aa9a54 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5138,6 +5138,10 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 	if (likely(ret == RET_MMIO_PF_EMU))
 		return x86_emulate_instruction(vcpu, gpa, 0, NULL, 0) ==
 					      EMULATE_DONE;
+
+	if (unlikely(ret == RET_MMIO_PF_INVALID))
+		return kvm_mmu_page_fault(vcpu, gpa, 0, NULL, 0);
+
 	if (unlikely(ret == RET_MMIO_PF_RETRY))
 		return 1;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 12ad5b5..4be4733 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6994,8 +6994,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * If memory slot is created, or moved, we need to clear all
 	 * mmio sptes.
 	 */
-	if ((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE))
-		kvm_mmu_zap_mmio_sptes(kvm);
+	kvm_mmu_invalid_mmio_sptes(kvm);
 }
 
 void kvm_arch_flush_shadow_all(struct kvm *kvm)
-- 
1.7.7.6


* [PATCH v2 5/6] KVM: MMU: add tracepoint for check_mmio_spte
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
                   ` (3 preceding siblings ...)
  2013-04-01  9:56 ` [PATCH v2 4/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-01  9:56 ` [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value Xiao Guangrong
  2013-04-16  0:54 ` [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Marcelo Tosatti
  6 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

It is useful for debugging mmio spte invalidation.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c      |    9 +++++++--
 arch/x86/kvm/mmutrace.h |   24 ++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1020152..d314e21 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -279,8 +279,13 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
 
 static bool check_mmio_spte(struct kvm *kvm, u64 spte)
 {
-	return get_mmio_spte_generation(spte) ==
-		      kvm_current_mmio_generation(kvm);
+	unsigned int kvm_gen, spte_gen;
+
+	kvm_gen = kvm_current_mmio_generation(kvm);
+	spte_gen = get_mmio_spte_generation(spte);
+
+	trace_check_mmio_spte(spte, kvm_gen, spte_gen);
+	return kvm_gen == spte_gen;
 }
 
 static inline u64 rsvd_bits(int s, int e)
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h
index f5b62a7..dac44ab 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -276,6 +276,30 @@ TRACE_EVENT(
 		  __spte_satisfied(old_spte), __spte_satisfied(new_spte)
 	)
 );
+
+
+TRACE_EVENT(
+	check_mmio_spte,
+	TP_PROTO(u64 spte, unsigned int kvm_gen, unsigned int spte_gen),
+	TP_ARGS(spte, kvm_gen, spte_gen),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, kvm_gen)
+		__field(unsigned int, spte_gen)
+		__field(u64, spte)
+	),
+
+	TP_fast_assign(
+		__entry->kvm_gen = kvm_gen;
+		__entry->spte_gen = spte_gen;
+		__entry->spte = spte;
+	),
+
+	TP_printk("spte %llx kvm_gen %x spte-gen %x valid %d", __entry->spte,
+		  __entry->kvm_gen, __entry->spte_gen,
+		  __entry->kvm_gen == __entry->spte_gen
+	)
+);
 #endif /* _TRACE_KVMMMU_H */
 
 #undef TRACE_INCLUDE_PATH
-- 
1.7.7.6


* [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
                   ` (4 preceding siblings ...)
  2013-04-01  9:56 ` [PATCH v2 5/6] KVM: MMU: add tracepoint for check_mmio_spte Xiao Guangrong
@ 2013-04-01  9:56 ` Xiao Guangrong
  2013-04-24 12:59   ` Gleb Natapov
  2013-04-16  0:54 ` [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Marcelo Tosatti
  6 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-01  9:56 UTC (permalink / raw)
  To: mtosatti; +Cc: gleb, linux-kernel, kvm, Xiao Guangrong

This gives us a chance to actually trigger the mmio generation-number
wrap-around.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    8 ++++++++
 virt/kvm/kvm_main.c             |    6 ++++++
 3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6c1e642..4e1f7cb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -767,6 +767,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
+void kvm_arch_init_generation(struct kvm *kvm);
 void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d314e21..dcc059c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4279,6 +4279,14 @@ restart:
 	spin_unlock(&kvm->mmu_lock);
 }
 
+void kvm_arch_init_generation(struct kvm *kvm)
+{
+	mutex_lock(&kvm->slots_lock);
+	/* It is easier to trigger mmio generation-number wrap-around. */
+	kvm_memslots(kvm)->generation = MMIO_MAX_GEN - 13;
+	mutex_unlock(&kvm->slots_lock);
+}
+
 void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm)
 {
 	/*
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff71541..d21694a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -459,6 +459,10 @@ static void kvm_init_memslots_id(struct kvm *kvm)
 		slots->id_to_index[i] = slots->memslots[i].id = i;
 }
 
+void __attribute__((weak)) kvm_arch_init_generation(struct kvm *kvm)
+{
+}
+
 static struct kvm *kvm_create_vm(unsigned long type)
 {
 	int r, i;
@@ -505,6 +509,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	mutex_init(&kvm->slots_lock);
 	atomic_set(&kvm->users_count, 1);
 
+	kvm_arch_init_generation(kvm);
+
 	r = kvm_init_mmu_notifier(kvm);
 	if (r)
 		goto out_err;
-- 
1.7.7.6


* Re: [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
                   ` (5 preceding siblings ...)
  2013-04-01  9:56 ` [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value Xiao Guangrong
@ 2013-04-16  0:54 ` Marcelo Tosatti
  2013-04-16  3:09   ` Xiao Guangrong
  6 siblings, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2013-04-16  0:54 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: gleb, linux-kernel, kvm

On Mon, Apr 01, 2013 at 05:56:43PM +0800, Xiao Guangrong wrote:
> Changelog in v2:
>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>   - use kvm->memslots->generation as kvm global generation-number
>   - fix comment and codestyle
>   - init kvm generation close to mmio wrap-around value
>   - keep kvm_mmu_zap_mmio_sptes
> 
> The current way holds the hot mmu-lock and walks all shadow pages, which
> does not scale. This patchset introduces a very simple and scalable way
> to quickly invalidate all mmio sptes - it does not need to walk any
> shadow pages or hold any locks.
> 
> The idea is simple:
> KVM maintains a global mmio invalid generation-number, which is stored in
> kvm->memslots.generation, and every mmio spte stores the current global
> generation-number in its available bits when it is created.
> 
> When KVM needs to zap all mmio sptes, it simply increases the global
> generation-number. When a guest does an mmio access, KVM intercepts the
> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
> generation-number on the spte does not equal the global generation-number,
> it goes to the normal #PF handler to update the mmio spte.
> 
> Since 19 bits are used to store the generation-number in the mmio spte,
> we zap all mmio sptes when the number wraps around.

Hi Xiao,

Is it still necessary with generation numbers at 'struct shadow_page'
level (which covers the slot deletion case)?

That is, once kvm_mmu_zap_all is fixed to increase generation count and
nuke roots, can't that be used instead with similar effectiveness for
SLOT_CREATE/SLOT_MOVE cases?

> Xiao Guangrong (6):
>   KVM: MMU: retain more available bits on mmio spte
>   KVM: MMU: store generation-number into mmio spte
>   KVM: MMU: make return value of mmio page fault handler more readable
>   KVM: MMU: fast invalid all mmio sptes
>   KVM: MMU: add tracepoint for check_mmio_spte
>   KVM: MMU: init kvm generation close to mmio wrap-around value
> 
>  arch/x86/include/asm/kvm_host.h |    3 +-
>  arch/x86/kvm/mmu.c              |  134 +++++++++++++++++++++++++++++++--------
>  arch/x86/kvm/mmu.h              |   17 +++++
>  arch/x86/kvm/mmutrace.h         |   34 +++++++++-
>  arch/x86/kvm/paging_tmpl.h      |   10 ++-
>  arch/x86/kvm/vmx.c              |   12 +++-
>  arch/x86/kvm/x86.c              |   11 +++-
>  virt/kvm/kvm_main.c             |    6 ++
>  8 files changed, 186 insertions(+), 41 deletions(-)
> 
> -- 
> 1.7.7.6
> 


* Re: [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
  2013-04-16  0:54 ` [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Marcelo Tosatti
@ 2013-04-16  3:09   ` Xiao Guangrong
  0 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-16  3:09 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: gleb, linux-kernel, kvm


Hi Marcelo,

On 04/16/2013 08:54 AM, Marcelo Tosatti wrote:
> On Mon, Apr 01, 2013 at 05:56:43PM +0800, Xiao Guangrong wrote:
>> Changelog in v2:
>>   - rename kvm_mmu_invalid_mmio_spte to kvm_mmu_invalid_mmio_sptes
>>   - use kvm->memslots->generation as kvm global generation-number
>>   - fix comment and codestyle
>>   - init kvm generation close to mmio wrap-around value
>>   - keep kvm_mmu_zap_mmio_sptes
>>
>> The current way is to hold the hot mmu-lock and walk all shadow pages; this
>> does not scale. This patchset introduces a very simple and scalable way
>> to quickly invalidate all mmio sptes - it does not need to walk any shadow
>> pages or hold any locks.
>>
>> The idea is simple:
>> KVM maintains a global mmio invalid generation-number, which is stored in
>> kvm->memslots.generation, and every mmio spte stores the current global
>> generation-number in its available bits when it is created.
>>
>> When KVM needs to zap all mmio sptes, it simply increases the global
>> generation-number. When a guest does an mmio access, KVM intercepts the
>> MMIO #PF, walks the shadow page table and gets the mmio spte. If the
>> generation-number on the spte does not equal the global generation-number,
>> it goes to the normal #PF handler to update the mmio spte.
>>
>> Since 19 bits are used to store the generation-number in the mmio spte, we
>> zap all mmio sptes when the number wraps around.
> 
> Hi Xiao,
> 
> Is it still necessary with generation numbers at 'struct shadow_page'
> level (which covers the slot deletion case)?

Yes.

> 
> That is, once kvm_mmu_zap_all is fixed to increase generation count and
> nuke roots, can't that be used instead with similar effectiveness for
> SLOT_CREATE/SLOT_MOVE cases?

It seems not easy. :(

We cannot increase kvm's generation count for SLOT_CREATE, since any change
to kvm->generation_count would cause all vcpus to fault on _all_ memory regions.

We also cannot separately update a mmio-sp's generation count instead of
zapping it, since a sp can contain both mmio-sptes and normal sptes, and we
must zap the normal sptes on a mmio-sp as well.

Thanks!


* Re: [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value
  2013-04-01  9:56 ` [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value Xiao Guangrong
@ 2013-04-24 12:59   ` Gleb Natapov
  2013-04-25  9:23     ` Xiao Guangrong
  0 siblings, 1 reply; 13+ messages in thread
From: Gleb Natapov @ 2013-04-24 12:59 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: mtosatti, linux-kernel, kvm

On Mon, Apr 01, 2013 at 05:56:49PM +0800, Xiao Guangrong wrote:
> Then it has a chance to trigger mmio generation-number wrap-around
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  arch/x86/include/asm/kvm_host.h |    1 +
>  arch/x86/kvm/mmu.c              |    8 ++++++++
>  virt/kvm/kvm_main.c             |    6 ++++++
>  3 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6c1e642..4e1f7cb 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -767,6 +767,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>  				     struct kvm_memory_slot *slot,
>  				     gfn_t gfn_offset, unsigned long mask);
>  void kvm_mmu_zap_all(struct kvm *kvm);
> +void kvm_arch_init_generation(struct kvm *kvm);
>  void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm);
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index d314e21..dcc059c 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4279,6 +4279,14 @@ restart:
>  	spin_unlock(&kvm->mmu_lock);
>  }
>  
> +void kvm_arch_init_generation(struct kvm *kvm)
> +{
> +	mutex_lock(&kvm->slots_lock);
> +	/* It is easier to trigger mmio generation-number wrap-around. */
> +	kvm_memslots(kvm)->generation = MMIO_MAX_GEN - 13;
kvm_memslots(kvm)->generation should never overflow, since the
(read|write)_cached mechanism does not handle it. Initialising it to
anything but 0 makes overflow more likely.

You can hide mmio overflow trick in kvm_current_mmio_generation():

static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
{
	return (kvm_memslots(kvm)->generation + MMIO_MAX_GEN - 13) & MMIO_GEN_MASK;
}

--
			Gleb.


* Re: [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable
  2013-04-01  9:56 ` [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable Xiao Guangrong
@ 2013-04-24 13:34   ` Gleb Natapov
  2013-04-25  9:24     ` Xiao Guangrong
  0 siblings, 1 reply; 13+ messages in thread
From: Gleb Natapov @ 2013-04-24 13:34 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: mtosatti, linux-kernel, kvm

On Mon, Apr 01, 2013 at 05:56:46PM +0800, Xiao Guangrong wrote:
> Define some meaningful names instead of raw code
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  arch/x86/kvm/mmu.c |   15 +++++----------
>  arch/x86/kvm/mmu.h |   14 ++++++++++++++
>  arch/x86/kvm/vmx.c |    4 ++--
>  3 files changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index be4f733..31c5586 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3182,17 +3182,12 @@ static u64 walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr)
>  	return spte;
>  }
>  
> -/*
> - * If it is a real mmio page fault, return 1 and emulat the instruction
> - * directly, return 0 to let CPU fault again on the address, -1 is
> - * returned if bug is detected.
> - */
>  int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  {
>  	u64 spte;
>  
>  	if (quickly_check_mmio_pf(vcpu, addr, direct))
> -		return 1;
> +		return RET_MMIO_PF_EMU;
>  
>  	spte = walk_shadow_page_get_mmio_spte(vcpu, addr);
>  
> @@ -3205,7 +3200,7 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  
>  		trace_handle_mmio_page_fault(addr, gfn, access);
>  		vcpu_cache_mmio_info(vcpu, addr, gfn, access);
> -		return 1;
> +		return RET_MMIO_PF_EMU;
>  	}
>  
>  	/*
> @@ -3213,13 +3208,13 @@ int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  	 * it's a BUG if the gfn is not a mmio page.
>  	 */
>  	if (direct && !check_direct_spte_mmio_pf(spte))
> -		return -1;
> +		return RET_MMIO_PF_BUG;
>  
>  	/*
>  	 * If the page table is zapped by other cpus, let CPU fault again on
>  	 * the address.
>  	 */
> -	return 0;
> +	return RET_MMIO_PF_RETRY;
>  }
>  EXPORT_SYMBOL_GPL(handle_mmio_page_fault_common);
>  
> @@ -3229,7 +3224,7 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr,
>  	int ret;
>  
>  	ret = handle_mmio_page_fault_common(vcpu, addr, direct);
> -	WARN_ON(ret < 0);
> +	WARN_ON(ret == RET_MMIO_PF_BUG);
>  	return ret;
>  }
>  
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 2adcbc2..6b4ba1e 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -52,6 +52,20 @@
>  
>  int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
> +
> +/*
> + * Return values of handle_mmio_page_fault_common:
> + * RET_MMIO_PF_EMU: it is a real mmio page fault, emulate the instruction
> + *		    directly.
> + * RET_MMIO_PF_RETRY: let CPU fault again on the address.
> + * RET_MMIO_PF_BUG: bug is detected.
> + */
> +enum {
> +	RET_MMIO_PF_EMU = 1,
Make it RET_MMIO_PF_EMULATE please.

> +	RET_MMIO_PF_RETRY = 0,
> +	RET_MMIO_PF_BUG = -1
> +};
> +
>  int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
>  int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
>  
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 915ef56..d0f2790 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5135,10 +5135,10 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
>  	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
>  
>  	ret = handle_mmio_page_fault_common(vcpu, gpa, true);
> -	if (likely(ret == 1))
> +	if (likely(ret == RET_MMIO_PF_EMU))
>  		return x86_emulate_instruction(vcpu, gpa, 0, NULL, 0) ==
>  					      EMULATE_DONE;
> -	if (unlikely(!ret))
> +	if (unlikely(ret == RET_MMIO_PF_RETRY))
>  		return 1;
>  
>  	/* It is the real ept misconfig */
> -- 
> 1.7.7.6

--
			Gleb.


* Re: [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value
  2013-04-24 12:59   ` Gleb Natapov
@ 2013-04-25  9:23     ` Xiao Guangrong
  0 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-25  9:23 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: mtosatti, linux-kernel, kvm

On 04/24/2013 08:59 PM, Gleb Natapov wrote:
> On Mon, Apr 01, 2013 at 05:56:49PM +0800, Xiao Guangrong wrote:
>> Then it has a chance to trigger mmio generation-number wrap-around
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h |    1 +
>>  arch/x86/kvm/mmu.c              |    8 ++++++++
>>  virt/kvm/kvm_main.c             |    6 ++++++
>>  3 files changed, 15 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 6c1e642..4e1f7cb 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -767,6 +767,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>>  				     struct kvm_memory_slot *slot,
>>  				     gfn_t gfn_offset, unsigned long mask);
>>  void kvm_mmu_zap_all(struct kvm *kvm);
>> +void kvm_arch_init_generation(struct kvm *kvm);
>>  void kvm_mmu_invalid_mmio_sptes(struct kvm *kvm);
>>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index d314e21..dcc059c 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -4279,6 +4279,14 @@ restart:
>>  	spin_unlock(&kvm->mmu_lock);
>>  }
>>  
>> +void kvm_arch_init_generation(struct kvm *kvm)
>> +{
>> +	mutex_lock(&kvm->slots_lock);
>> +	/* It is easier to trigger mmio generation-number wrap-around. */
>> +	kvm_memslots(kvm)->generation = MMIO_MAX_GEN - 13;
> kvm_memslots(kvm)->generation should never overflow, since the
> (read|write)_cached mechanism does not handle it. Initialising it to
> anything but 0 makes overflow more likely.
> 
> You can hide mmio overflow trick in kvm_current_mmio_generation():
> 
> static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
> {
> 	return (kvm_memslots(kvm)->generation + MMIO_MAX_GEN - 13) & MMIO_GEN_MASK;
> }

Very smart idea. Thank you, Gleb!


* Re: [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable
  2013-04-24 13:34   ` Gleb Natapov
@ 2013-04-25  9:24     ` Xiao Guangrong
  0 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2013-04-25  9:24 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: mtosatti, linux-kernel, kvm

On 04/24/2013 09:34 PM, Gleb Natapov wrote:

>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>> index 2adcbc2..6b4ba1e 100644
>> --- a/arch/x86/kvm/mmu.h
>> +++ b/arch/x86/kvm/mmu.h
>> @@ -52,6 +52,20 @@
>>  
>>  int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
>>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
>> +
>> +/*
>> + * Return values of handle_mmio_page_fault_common:
>> + * RET_MMIO_PF_EMU: it is a real mmio page fault, emulate the instruction
>> + *		    directly.
>> + * RET_MMIO_PF_RETRY: let CPU fault again on the address.
>> + * RET_MMIO_PF_BUG: bug is detected.
>> + */
>> +enum {
>> +	RET_MMIO_PF_EMU = 1,
> Make it RET_MMIO_PF_EMULATE please.

Sounds good to me, will do.

Thanks!


end of thread, other threads:[~2013-04-25  9:24 UTC | newest]

Thread overview: 13+ messages
2013-04-01  9:56 [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 1/6] KVM: MMU: retain more available bits on mmio spte Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 2/6] KVM: MMU: store generation-number into " Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 3/6] KVM: MMU: make return value of mmio page fault handler more readable Xiao Guangrong
2013-04-24 13:34   ` Gleb Natapov
2013-04-25  9:24     ` Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 4/6] KVM: MMU: fast invalid all mmio sptes Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 5/6] KVM: MMU: add tracepoint for check_mmio_spte Xiao Guangrong
2013-04-01  9:56 ` [PATCH v2 6/6] KVM: MMU: init kvm generation close to mmio wrap-around value Xiao Guangrong
2013-04-24 12:59   ` Gleb Natapov
2013-04-25  9:23     ` Xiao Guangrong
2013-04-16  0:54 ` [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes Marcelo Tosatti
2013-04-16  3:09   ` Xiao Guangrong
