public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/22] Nested Paging support for Nested SVM v2
@ 2010-04-27 10:38 Joerg Roedel
  2010-04-27 10:38 ` [PATCH 01/22] KVM: MMU: Check for root_level instead of long mode Joerg Roedel
                   ` (22 more replies)
  0 siblings, 23 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel

Hi,

this is the second, reworked version of my nested paging for nested SVM
patchset. Changes to the previous version include:

	* Renamed mmu.tdp_enabled to mmu.direct_map
	* Introduced two helper functions to read physical memory
	  locations of the currently running guest level
	* Fixed a couple of bugs where KVM needs to read L2
	  physical memory, among other fixes

This patchset was tested with KVM and Windows 7 XP-mode. I also tested nested
KVM with Linux and Windows 7 as L2 guests, tried different page-size
combinations, and ran stress tests for a couple of hours. None of these tests
showed any introduced regressions.
Please review this second version of the patch-set. I appreciate your feedback.

Thanks,

	Joerg

Diffstat:

 arch/x86/include/asm/kvm_host.h |   26 ++++++
 arch/x86/kvm/emulate.c          |   22 +++---
 arch/x86/kvm/mmu.c              |  160 ++++++++++++++++++++++++++++++--------
 arch/x86/kvm/mmu.h              |    2 +
 arch/x86/kvm/paging_tmpl.h      |   97 +++++++++++++++++++-----
 arch/x86/kvm/svm.c              |  120 ++++++++++++++++++++++++-----
 arch/x86/kvm/vmx.c              |    3 +
 arch/x86/kvm/x86.c              |   77 +++++++++++++++++--
 include/linux/kvm_host.h        |    5 +
 9 files changed, 423 insertions(+), 89 deletions(-)

Shortlog:

Joerg Roedel (22):
      KVM: MMU: Check for root_level instead of long mode
      KVM: MMU: Make tdp_enabled a mmu-context parameter
      KVM: MMU: Make set_cr3 a function pointer in kvm_mmu
      KVM: X86: Introduce a tdp_set_cr3 function
      KVM: MMU: Introduce get_cr3 function pointer
      KVM: MMU: Introduce inject_page_fault function pointer
      KVM: SVM: Implement MMU helper functions for Nested Nested Paging
      KVM: MMU: Change init_kvm_softmmu to take a context as parameter
      KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu
      KVM: MMU: Introduce generic walk_addr function
      KVM: MMU: Add infrastructure for two-level page walker
      KVM: MMU: Implement nested gva_to_gpa functions
      KVM: X86: Add kvm_read_guest_page_tdp function
      KVM: MMU: Make walk_addr_generic capable for two-level walking
      KVM: MMU: Introduce kvm_read_guest_page_x86()
      KVM: MMU: Track page fault data in struct vcpu
      KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa
      KVM: X86: Propagate fetch faults
      KVM: MMU: Introduce init_kvm_nested_mmu()
      KVM: SVM: Initialize Nested Nested MMU context on VMRUN
      KVM: SVM: Report Nested Paging support to userspace
      KVM: SVM: Expect two more candidates for exit_int_info



^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH 01/22] KVM: MMU: Check for root_level instead of long mode
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

The walk_addr function checks for !is_long_mode in its 64-bit
version. But what is really meant here is a check for PAE
paging. Change the condition to check explicitly for PAE paging
so that it also works with nested nested paging.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/paging_tmpl.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index d0cc07e..b07cec6 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -128,7 +128,7 @@ walk:
 	walker->level = vcpu->arch.mmu.root_level;
 	pte = vcpu->arch.cr3;
 #if PTTYPE == 64
-	if (!is_long_mode(vcpu)) {
+	if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
 		pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
 		trace_kvm_mmu_paging_element(pte, walker->level);
 		if (!is_present_gpte(pte))
@@ -194,7 +194,7 @@ walk:
 				(PTTYPE == 64 || is_pse(vcpu))) ||
 		    ((walker->level == PT_PDPE_LEVEL) &&
 				is_large_pte(pte) &&
-				is_long_mode(vcpu))) {
+				vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL)) {
 			int lvl = walker->level;
 
 			walker->gfn = gpte_to_gfn_lvl(pte, lvl);
-- 
1.7.0.4




* [PATCH 02/22] KVM: MMU: Make tdp_enabled a mmu-context parameter
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch changes the tdp_enabled flag from its global
meaning to a per-mmu-context flag and renames it to direct_map
there. This is necessary for Nested SVM with emulation of
Nested Paging, where we need an extra MMU context to shadow
the Nested Nested Page Table.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    8 +++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3f0007b..44eda9b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -246,6 +246,7 @@ struct kvm_mmu {
 	int root_level;
 	int shadow_root_level;
 	union kvm_mmu_page_role base_role;
+	bool direct_map;
 
 	u64 *pae_root;
 	u64 rsvd_bits_mask[2][4];
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ddfa865..618d25f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1794,7 +1794,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		spte |= shadow_user_mask;
 	if (level > PT_PAGE_TABLE_LEVEL)
 		spte |= PT_PAGE_SIZE_MASK;
-	if (tdp_enabled)
+	if (vcpu->arch.mmu.direct_map)
 		spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
 			kvm_is_mmio_pfn(pfn));
 
@@ -2059,7 +2059,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		hpa_t root = vcpu->arch.mmu.root_hpa;
 
 		ASSERT(!VALID_PAGE(root));
-		if (tdp_enabled)
+		if (vcpu->arch.mmu.direct_map)
 			direct = 1;
 		if (mmu_check_root(vcpu, root_gfn))
 			return 1;
@@ -2072,7 +2072,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 	direct = !is_paging(vcpu);
-	if (tdp_enabled)
+	if (vcpu->arch.mmu.direct_map)
 		direct = 1;
 	for (i = 0; i < 4; ++i) {
 		hpa_t root = vcpu->arch.mmu.pae_root[i];
@@ -2385,6 +2385,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->invlpg = nonpaging_invlpg;
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
 	context->root_hpa = INVALID_PAGE;
+	vcpu->arch.mmu.direct_map = true;
 
 	if (!is_paging(vcpu)) {
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2423,6 +2424,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 		r = paging32_init_context(vcpu);
 
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
+	vcpu->arch.mmu.direct_map        = false;
 
 	return r;
 }
-- 
1.7.0.4




* [PATCH 03/22] KVM: MMU: Make set_cr3 a function pointer in kvm_mmu
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This is necessary to implement Nested Nested Paging. As a
side effect this allows some cleanups in the SVM nested
paging code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 44eda9b..04834b0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -233,6 +233,7 @@ struct kvm_pio_request {
  */
 struct kvm_mmu {
 	void (*new_cr3)(struct kvm_vcpu *vcpu);
+	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
 	int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
 	void (*free)(struct kvm_vcpu *vcpu);
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 618d25f..6eedcdd 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2386,6 +2386,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
 	context->root_hpa = INVALID_PAGE;
 	vcpu->arch.mmu.direct_map = true;
+	vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
 
 	if (!is_paging(vcpu)) {
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2425,6 +2426,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
 	vcpu->arch.mmu.direct_map        = false;
+	vcpu->arch.mmu.set_cr3           = kvm_x86_ops->set_cr3;
 
 	return r;
 }
@@ -2470,7 +2472,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	if (r)
 		goto out;
 	/* set_cr3() should ensure TLB has been flushed */
-	kvm_x86_ops->set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
+	vcpu->arch.mmu.set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
 out:
 	return r;
 }
-- 
1.7.0.4




* [PATCH 04/22] KVM: X86: Introduce a tdp_set_cr3 function
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces a special set_tdp_cr3 function pointer
in kvm_x86_ops which is only used for tdp-enabled mmu
contexts. This allows removing some hacks from the SVM code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    2 +-
 arch/x86/kvm/svm.c              |   23 ++++++++++++++---------
 arch/x86/kvm/vmx.c              |    3 +++
 4 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 04834b0..5c74269 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -531,6 +531,7 @@ struct kvm_x86_ops {
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 	int (*get_lpage_level)(void);
 	bool (*rdtscp_supported)(void);
+	void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
 
 	void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6eedcdd..f1fac9a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2386,7 +2386,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
 	context->root_hpa = INVALID_PAGE;
 	vcpu->arch.mmu.direct_map = true;
-	vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
+	vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_tdp_cr3;
 
 	if (!is_paging(vcpu)) {
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 889f660..4aae4be 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2982,9 +2982,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 	gs_selector = kvm_read_gs();
 	ldt_selector = kvm_read_ldt();
 	svm->vmcb->save.cr2 = vcpu->arch.cr2;
-	/* required for live migration with NPT */
-	if (npt_enabled)
-		svm->vmcb->save.cr3 = vcpu->arch.cr3;
 
 	clgi();
 
@@ -3093,16 +3090,22 @@ static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long root)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	if (npt_enabled) {
-		svm->vmcb->control.nested_cr3 = root;
-		force_new_asid(vcpu);
-		return;
-	}
-
 	svm->vmcb->save.cr3 = root;
 	force_new_asid(vcpu);
 }
 
+static void set_tdp_cr3(struct kvm_vcpu *vcpu, unsigned long root)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm->vmcb->control.nested_cr3 = root;
+
+	/* Also sync guest cr3 here in case we live migrate */
+	svm->vmcb->save.cr3 = vcpu->arch.cr3;
+
+	force_new_asid(vcpu);
+}
+
 static int is_disabled(void)
 {
 	u64 vm_cr;
@@ -3314,6 +3317,8 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.rdtscp_supported = svm_rdtscp_supported,
 
 	.set_supported_cpuid = svm_set_supported_cpuid,
+
+	.set_tdp_cr3 = set_tdp_cr3,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 875b785..d6377f0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4191,6 +4191,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.get_mt_mask = vmx_get_mt_mask,
 
 	.exit_reasons_str = vmx_exit_reasons_str,
+
 	.get_lpage_level = vmx_get_lpage_level,
 
 	.cpuid_update = vmx_cpuid_update,
@@ -4198,6 +4199,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.rdtscp_supported = vmx_rdtscp_supported,
 
 	.set_supported_cpuid = vmx_set_supported_cpuid,
+
+	.set_tdp_cr3 = vmx_set_cr3,
 };
 
 static int __init vmx_init(void)
-- 
1.7.0.4




* [PATCH 05/22] KVM: MMU: Introduce get_cr3 function pointer
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This function pointer in the MMU context is required to
implement Nested Nested Paging.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    9 ++++++++-
 arch/x86/kvm/paging_tmpl.h      |    4 ++--
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5c74269..47d1755 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -234,6 +234,7 @@ struct kvm_pio_request {
 struct kvm_mmu {
 	void (*new_cr3)(struct kvm_vcpu *vcpu);
 	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
+	unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
 	int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
 	void (*free)(struct kvm_vcpu *vcpu);
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f1fac9a..a25a72e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2053,7 +2053,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
 	int direct = 0;
 	u64 pdptr;
 
-	root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;
+	root_gfn = vcpu->arch.mmu.get_cr3(vcpu) >> PAGE_SHIFT;
 
 	if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
 		hpa_t root = vcpu->arch.mmu.root_hpa;
@@ -2236,6 +2236,11 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
 	mmu_free_roots(vcpu);
 }
 
+static unsigned long get_cr3(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.cr3;
+}
+
 static void inject_page_fault(struct kvm_vcpu *vcpu,
 			      u64 addr,
 			      u32 err_code)
@@ -2387,6 +2392,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->root_hpa = INVALID_PAGE;
 	vcpu->arch.mmu.direct_map = true;
 	vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_tdp_cr3;
+	vcpu->arch.mmu.get_cr3 = get_cr3;
 
 	if (!is_paging(vcpu)) {
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2427,6 +2433,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
 	vcpu->arch.mmu.direct_map        = false;
 	vcpu->arch.mmu.set_cr3           = kvm_x86_ops->set_cr3;
+	vcpu->arch.mmu.get_cr3           = get_cr3;
 
 	return r;
 }
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index b07cec6..802c513 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -126,7 +126,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 				     fetch_fault);
 walk:
 	walker->level = vcpu->arch.mmu.root_level;
-	pte = vcpu->arch.cr3;
+	pte = vcpu->arch.mmu.get_cr3(vcpu);
 #if PTTYPE == 64
 	if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
 		pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
@@ -137,7 +137,7 @@ walk:
 	}
 #endif
 	ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
-	       (vcpu->arch.cr3 & CR3_NONPAE_RESERVED_BITS) == 0);
+	       (vcpu->arch.mmu.get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
 
 	pt_access = ACC_ALL;
 
-- 
1.7.0.4




* [PATCH 06/22] KVM: MMU: Introduce inject_page_fault function pointer
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces an inject_page_fault function pointer
into struct kvm_mmu, which will be used to inject page
faults. This becomes necessary later when Nested Nested
Paging is implemented.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    3 +++
 arch/x86/kvm/mmu.c              |    4 +++-
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 47d1755..f1d46e6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -236,6 +236,9 @@ struct kvm_mmu {
 	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
 	unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
 	int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
+	void (*inject_page_fault)(struct kvm_vcpu *vcpu,
+				  unsigned long addr,
+				  u32 error_code);
 	void (*free)(struct kvm_vcpu *vcpu);
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
 			    u32 *error);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a25a72e..114c522 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2245,7 +2245,7 @@ static void inject_page_fault(struct kvm_vcpu *vcpu,
 			      u64 addr,
 			      u32 err_code)
 {
-	kvm_inject_page_fault(vcpu, addr, err_code);
+	vcpu->arch.mmu.inject_page_fault(vcpu, addr, err_code);
 }
 
 static void paging_free(struct kvm_vcpu *vcpu)
@@ -2393,6 +2393,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu.direct_map = true;
 	vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_tdp_cr3;
 	vcpu->arch.mmu.get_cr3 = get_cr3;
+	vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
 
 	if (!is_paging(vcpu)) {
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2434,6 +2435,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu.direct_map        = false;
 	vcpu->arch.mmu.set_cr3           = kvm_x86_ops->set_cr3;
 	vcpu->arch.mmu.get_cr3           = get_cr3;
+	vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
 
 	return r;
 }
-- 
1.7.0.4




* [PATCH 07/22] KVM: SVM: Implement MMU helper functions for Nested Nested Paging
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds the helper functions which will be used in
the mmu context for handling nested nested page faults.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |   33 +++++++++++++++++++++++++++++++++
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4aae4be..e31f601 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -92,6 +92,9 @@ struct nested_state {
 	u32 intercept_exceptions;
 	u64 intercept;
 
+	/* Nested Paging related state */
+	u64 nested_cr3;
+
 };
 
 #define MSRPM_OFFSETS	16
@@ -1490,6 +1493,36 @@ static int vmmcall_interception(struct vcpu_svm *svm)
 	return 1;
 }
 
+static unsigned long nested_svm_get_tdp_cr3(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	return svm->nested.nested_cr3;
+}
+
+static void nested_svm_set_tdp_cr3(struct kvm_vcpu *vcpu,
+				   unsigned long root)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm->vmcb->control.nested_cr3 = root;
+	force_new_asid(vcpu);
+}
+
+static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu,
+				       unsigned long addr,
+				       u32 error_code)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm->vmcb->control.exit_code = SVM_EXIT_NPF;
+	svm->vmcb->control.exit_code_hi = 0;
+	svm->vmcb->control.exit_info_1 = error_code;
+	svm->vmcb->control.exit_info_2 = addr;
+
+	nested_svm_vmexit(svm);
+}
+
 static int nested_svm_check_permissions(struct vcpu_svm *svm)
 {
 	if (!(svm->vcpu.arch.efer & EFER_SVME)
-- 
1.7.0.4




* [PATCH 08/22] KVM: MMU: Change init_kvm_softmmu to take a context as parameter
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

Some logic of this function is required to build the Nested
Nested Paging context. So factor the required logic out into
a separate function and export it.
Also make the whole init path suitable for more than one mmu
context.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/mmu.c |   60 ++++++++++++++++++++++++++++++---------------------
 arch/x86/kvm/mmu.h |    1 +
 2 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 114c522..ef9d284 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2207,10 +2207,9 @@ static void nonpaging_free(struct kvm_vcpu *vcpu)
 	mmu_free_roots(vcpu);
 }
 
-static int nonpaging_init_context(struct kvm_vcpu *vcpu)
+static int nonpaging_init_context(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu *context)
 {
-	struct kvm_mmu *context = &vcpu->arch.mmu;
-
 	context->new_cr3 = nonpaging_new_cr3;
 	context->page_fault = nonpaging_page_fault;
 	context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2269,9 +2268,10 @@ static bool is_rsvd_bits_set(struct kvm_vcpu *vcpu, u64 gpte, int level)
 #include "paging_tmpl.h"
 #undef PTTYPE
 
-static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, int level)
+static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu *context,
+				  int level)
 {
-	struct kvm_mmu *context = &vcpu->arch.mmu;
 	int maxphyaddr = cpuid_maxphyaddr(vcpu);
 	u64 exb_bit_rsvd = 0;
 
@@ -2330,9 +2330,11 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, int level)
 	}
 }
 
-static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
+static int paging64_init_context_common(struct kvm_vcpu *vcpu,
+					struct kvm_mmu *context,
+					int level)
 {
-	struct kvm_mmu *context = &vcpu->arch.mmu;
+	reset_rsvds_bits_mask(vcpu, context, level);
 
 	ASSERT(is_pae(vcpu));
 	context->new_cr3 = paging_new_cr3;
@@ -2348,17 +2350,17 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
 	return 0;
 }
 
-static int paging64_init_context(struct kvm_vcpu *vcpu)
+static int paging64_init_context(struct kvm_vcpu *vcpu,
+				 struct kvm_mmu *context)
 {
-	reset_rsvds_bits_mask(vcpu, PT64_ROOT_LEVEL);
-	return paging64_init_context_common(vcpu, PT64_ROOT_LEVEL);
+	return paging64_init_context_common(vcpu, context, PT64_ROOT_LEVEL);
 }
 
-static int paging32_init_context(struct kvm_vcpu *vcpu)
+static int paging32_init_context(struct kvm_vcpu *vcpu,
+				 struct kvm_mmu *context)
 {
-	struct kvm_mmu *context = &vcpu->arch.mmu;
+	reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
 
-	reset_rsvds_bits_mask(vcpu, PT32_ROOT_LEVEL);
 	context->new_cr3 = paging_new_cr3;
 	context->page_fault = paging32_page_fault;
 	context->gva_to_gpa = paging32_gva_to_gpa;
@@ -2372,10 +2374,10 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static int paging32E_init_context(struct kvm_vcpu *vcpu)
+static int paging32E_init_context(struct kvm_vcpu *vcpu,
+				  struct kvm_mmu *context)
 {
-	reset_rsvds_bits_mask(vcpu, PT32E_ROOT_LEVEL);
-	return paging64_init_context_common(vcpu, PT32E_ROOT_LEVEL);
+	return paging64_init_context_common(vcpu, context, PT32E_ROOT_LEVEL);
 }
 
 static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
@@ -2399,15 +2401,15 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
 		context->root_level = 0;
 	} else if (is_long_mode(vcpu)) {
-		reset_rsvds_bits_mask(vcpu, PT64_ROOT_LEVEL);
+		reset_rsvds_bits_mask(vcpu, context, PT64_ROOT_LEVEL);
 		context->gva_to_gpa = paging64_gva_to_gpa;
 		context->root_level = PT64_ROOT_LEVEL;
 	} else if (is_pae(vcpu)) {
-		reset_rsvds_bits_mask(vcpu, PT32E_ROOT_LEVEL);
+		reset_rsvds_bits_mask(vcpu, context, PT32E_ROOT_LEVEL);
 		context->gva_to_gpa = paging64_gva_to_gpa;
 		context->root_level = PT32E_ROOT_LEVEL;
 	} else {
-		reset_rsvds_bits_mask(vcpu, PT32_ROOT_LEVEL);
+		reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
 		context->gva_to_gpa = paging32_gva_to_gpa;
 		context->root_level = PT32_ROOT_LEVEL;
 	}
@@ -2415,24 +2417,32 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
+int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 {
 	int r;
-
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
 
 	if (!is_paging(vcpu))
-		r = nonpaging_init_context(vcpu);
+		r = nonpaging_init_context(vcpu, context);
 	else if (is_long_mode(vcpu))
-		r = paging64_init_context(vcpu);
+		r = paging64_init_context(vcpu, context);
 	else if (is_pae(vcpu))
-		r = paging32E_init_context(vcpu);
+		r = paging32E_init_context(vcpu, context);
 	else
-		r = paging32_init_context(vcpu);
+		r = paging32_init_context(vcpu, context);
 
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
 	vcpu->arch.mmu.direct_map        = false;
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
+
+static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
+{
+	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
+
 	vcpu->arch.mmu.set_cr3           = kvm_x86_ops->set_cr3;
 	vcpu->arch.mmu.get_cr3           = get_cr3;
 	vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index be66759..64f619b 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -49,6 +49,7 @@
 #define PFERR_FETCH_MASK (1U << 4)
 
 int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
+int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
-- 
1.7.0.4




* [PATCH 09/22] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch changes the is_rsvd_bits_set() function prototype
to take only a kvm_mmu context instead of a full vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/mmu.c         |    4 ++--
 arch/x86/kvm/paging_tmpl.h |    3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ef9d284..615c78f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2252,12 +2252,12 @@ static void paging_free(struct kvm_vcpu *vcpu)
 	nonpaging_free(vcpu);
 }
 
-static bool is_rsvd_bits_set(struct kvm_vcpu *vcpu, u64 gpte, int level)
+static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
 {
 	int bit7;
 
 	bit7 = (gpte >> 7) & 1;
-	return (gpte & vcpu->arch.mmu.rsvd_bits_mask[bit7][level-1]) != 0;
+	return (gpte & mmu->rsvd_bits_mask[bit7][level-1]) != 0;
 }
 
 #define PTTYPE 64
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 802c513..1975ab6 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -158,7 +158,8 @@ walk:
 		if (!is_present_gpte(pte))
 			goto not_present;
 
-		rsvd_fault = is_rsvd_bits_set(vcpu, pte, walker->level);
+		rsvd_fault = is_rsvd_bits_set(&vcpu->arch.mmu, pte,
+					      walker->level);
 		if (rsvd_fault)
 			goto access_error;
 
-- 
1.7.0.4




* [PATCH 10/22] KVM: MMU: Introduce generic walk_addr function
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (8 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 09/22] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This is the first patch in the series towards a generic
walk_addr implementation that will eventually be able to
walk two-dimensional page tables. In this first step the
walk_addr function is renamed to walk_addr_generic, which
takes an mmu context as an additional parameter.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/paging_tmpl.h |   30 ++++++++++++++++++++----------
 1 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 1975ab6..b1086fe 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -112,9 +112,10 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte)
 /*
  * Fetch a guest pte for a guest virtual address
  */
-static int FNAME(walk_addr)(struct guest_walker *walker,
-			    struct kvm_vcpu *vcpu, gva_t addr,
-			    int write_fault, int user_fault, int fetch_fault)
+static int FNAME(walk_addr_generic)(struct guest_walker *walker,
+				    struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+				    gva_t addr, int write_fault,
+				    int user_fault, int fetch_fault)
 {
 	pt_element_t pte;
 	gfn_t table_gfn;
@@ -125,10 +126,12 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 	trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
 				     fetch_fault);
 walk:
-	walker->level = vcpu->arch.mmu.root_level;
-	pte = vcpu->arch.mmu.get_cr3(vcpu);
+
+	walker->level = mmu->root_level;
+	pte =           mmu->get_cr3(vcpu);
+
 #if PTTYPE == 64
-	if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
+	if (walker->level == PT32E_ROOT_LEVEL) {
 		pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
 		trace_kvm_mmu_paging_element(pte, walker->level);
 		if (!is_present_gpte(pte))
@@ -137,7 +140,7 @@ walk:
 	}
 #endif
 	ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
-	       (vcpu->arch.mmu.get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
+	       (mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
 
 	pt_access = ACC_ALL;
 
@@ -158,8 +161,7 @@ walk:
 		if (!is_present_gpte(pte))
 			goto not_present;
 
-		rsvd_fault = is_rsvd_bits_set(&vcpu->arch.mmu, pte,
-					      walker->level);
+		rsvd_fault = is_rsvd_bits_set(mmu, pte, walker->level);
 		if (rsvd_fault)
 			goto access_error;
 
@@ -195,7 +197,7 @@ walk:
 				(PTTYPE == 64 || is_pse(vcpu))) ||
 		    ((walker->level == PT_PDPE_LEVEL) &&
 				is_large_pte(pte) &&
-				vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL)) {
+				mmu->root_level == PT64_ROOT_LEVEL)) {
 			int lvl = walker->level;
 
 			walker->gfn = gpte_to_gfn_lvl(pte, lvl);
@@ -253,6 +255,14 @@ err:
 	return 0;
 }
 
+static int FNAME(walk_addr)(struct guest_walker *walker,
+			    struct kvm_vcpu *vcpu, gva_t addr,
+			    int write_fault, int user_fault, int fetch_fault)
+{
+	return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.mmu, addr,
+					write_fault, user_fault, fetch_fault);
+}
+
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 			      u64 *spte, const void *pte)
 {
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (9 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 10/22] KVM: MMU: Introduce generic walk_addr function Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 12:34   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces an mmu callback to translate gpa
addresses in the walk_addr code. It is later used to
translate l2_gpa addresses into l1_gpa addresses.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    7 +++++++
 arch/x86/kvm/paging_tmpl.h      |    1 +
 include/linux/kvm_host.h        |    5 +++++
 4 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f1d46e6..ccdbd2f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -242,6 +242,7 @@ struct kvm_mmu {
 	void (*free)(struct kvm_vcpu *vcpu);
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
 			    u32 *error);
+	gpa_t (*translate_gpa)(struct kvm_vcpu *vcpu, gpa_t gpa, u32 *error);
 	void (*prefetch_page)(struct kvm_vcpu *vcpu,
 			      struct kvm_mmu_page *page);
 	int (*sync_page)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 615c78f..4e0bfdb 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2131,6 +2131,11 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	spin_unlock(&vcpu->kvm->mmu_lock);
 }
 
+static gpa_t translate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 *error)
+{
+	return gpa;
+}
+
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
 				  u32 access, u32 *error)
 {
@@ -2387,6 +2392,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->new_cr3 = nonpaging_new_cr3;
 	context->page_fault = tdp_page_fault;
 	context->free = nonpaging_free;
+	context->translate_gpa = translate_gpa;
 	context->prefetch_page = nonpaging_prefetch_page;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = nonpaging_invlpg;
@@ -2434,6 +2440,7 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
 	vcpu->arch.mmu.direct_map        = false;
+	vcpu->arch.mmu.translate_gpa     = translate_gpa;
 
 	return r;
 }
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index b1086fe..101849a 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -122,6 +122,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	unsigned index, pt_access, pte_access;
 	gpa_t pte_gpa;
 	int rsvd_fault = 0;
+	u32 error;
 
 	trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
 				     fetch_fault);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ce027d5..70025d1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -520,6 +520,11 @@ static inline gpa_t gfn_to_gpa(gfn_t gfn)
 	return (gpa_t)gfn << PAGE_SHIFT;
 }
 
+static inline gfn_t gpa_to_gfn(gpa_t gpa)
+{
+	return (gfn_t)gpa >> PAGE_SHIFT;
+}
+
 static inline hpa_t pfn_to_hpa(pfn_t pfn)
 {
 	return (hpa_t)pfn << PAGE_SHIFT;
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (10 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 12:37   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function Joerg Roedel
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds the functions to do a nested l2_gva to
l1_gpa page table walk.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    4 ++++
 arch/x86/kvm/mmu.c              |    8 ++++++++
 arch/x86/kvm/paging_tmpl.h      |   31 +++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ccdbd2f..3cbfb51 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -287,6 +287,10 @@ struct kvm_vcpu_arch {
 	bool tpr_access_reporting;
 
 	struct kvm_mmu mmu;
+
+	/* Used for two dimensional paging emulation */
+	struct kvm_mmu nested_mmu;
+
 	/* only needed in kvm_pv_mmu_op() path, but it's hot so
 	 * put it here to avoid allocation */
 	struct kvm_pv_mmu_op_buffer mmu_op_buffer;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4e0bfdb..8bd40b5 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2144,6 +2144,14 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
 	return vaddr;
 }
 
+static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr,
+					 u32 access, u32 *error)
+{
+	if (error)
+		*error = 0;
+	return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, error);
+}
+
 static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 				u32 error_code)
 {
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 101849a..7819a6f 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -264,6 +264,16 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
 					write_fault, user_fault, fetch_fault);
 }
 
+static int FNAME(walk_addr_nested)(struct guest_walker *walker,
+				   struct kvm_vcpu *vcpu, gva_t addr,
+				   int write_fault, int user_fault,
+				   int fetch_fault)
+{
+	return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.nested_mmu,
+					addr, write_fault, user_fault,
+					fetch_fault);
+}
+
 static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 			      u64 *spte, const void *pte)
 {
@@ -544,6 +554,27 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr, u32 access,
 	return gpa;
 }
 
+static gpa_t FNAME(gva_to_gpa_nested)(struct kvm_vcpu *vcpu, gva_t vaddr,
+				      u32 access, u32 *error)
+{
+	struct guest_walker walker;
+	gpa_t gpa = UNMAPPED_GVA;
+	int r;
+
+	r = FNAME(walk_addr_nested)(&walker, vcpu, vaddr,
+				    !!(access & PFERR_WRITE_MASK),
+				    !!(access & PFERR_USER_MASK),
+				    !!(access & PFERR_FETCH_MASK));
+
+	if (r) {
+		gpa = gfn_to_gpa(walker.gfn);
+		gpa |= vaddr & ~PAGE_MASK;
+	} else if (error)
+		*error = walker.error_code;
+
+	return gpa;
+}
+
 static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
 				 struct kvm_mmu_page *sp)
 {
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (11 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 12:42   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 14/22] KVM: MMU: Make walk_addr_generic capable for two-level walking Joerg Roedel
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds a function which can read either from the
guest's physical memory or from the guest's guest physical
memory. This will be used in the two-dimensional page table
walker.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    3 +++
 arch/x86/kvm/x86.c              |   24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3cbfb51..7851bbc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -635,6 +635,9 @@ void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
 void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
 			   u32 error_code);
+int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+			    gfn_t gfn, void *data, int offset, int len,
+			    u32 *error);
 bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
 
 int kvm_pic_set_irq(void *opaque, int irq, int level);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b2ce1d..558d995 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -356,6 +356,30 @@ bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
 EXPORT_SYMBOL_GPL(kvm_require_cpl);
 
 /*
+ * This function will be used to read from the physical memory of the currently
+ * running guest. The difference to kvm_read_guest_page is that this function
+ * can read from guest physical or from the guest's guest physical memory.
+ */
+int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+			    gfn_t gfn, void *data, int offset, int len,
+			    u32 *error)
+{
+	gfn_t real_gfn;
+	gpa_t gpa;
+
+	*error   = 0;
+	gpa      = gfn << PAGE_SHIFT;
+	real_gfn = mmu->translate_gpa(vcpu, gpa, error);
+	if (real_gfn == UNMAPPED_GVA)
+		return -EFAULT;
+
+	real_gfn >>= PAGE_SHIFT;
+
+	return kvm_read_guest_page(vcpu->kvm, real_gfn, data, offset, len);
+}
+EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
+
+/*
  * Load the pae pdptrs.  Return true is they are all valid.
  */
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 14/22] KVM: MMU: Make walk_addr_generic capable for two-level walking
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (12 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86() Joerg Roedel
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch uses kvm_read_guest_page_tdp to make the
walk_addr_generic functions suitable for two-level page
table walking.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/paging_tmpl.h |   34 ++++++++++++++++++++++++----------
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 7819a6f..8b27026 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -121,8 +121,9 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	gfn_t table_gfn;
 	unsigned index, pt_access, pte_access;
 	gpa_t pte_gpa;
+	int offset;
 	int rsvd_fault = 0;
-	u32 error;
+	u32 error = 0;
 
 	trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
 				     fetch_fault);
@@ -149,13 +150,16 @@ walk:
 		index = PT_INDEX(addr, walker->level);
 
 		table_gfn = gpte_to_gfn(pte);
-		pte_gpa = gfn_to_gpa(table_gfn);
-		pte_gpa += index * sizeof(pt_element_t);
+		offset    = index * sizeof(pt_element_t);
+		pte_gpa   = gfn_to_gpa(table_gfn) + offset;
 		walker->table_gfn[walker->level - 1] = table_gfn;
 		walker->pte_gpa[walker->level - 1] = pte_gpa;
 
-		if (kvm_read_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte)))
-			goto not_present;
+		if (kvm_read_guest_page_tdp(vcpu, mmu, table_gfn, &pte, offset,
+					    sizeof(pte), &error)) {
+			walker->error_code = error;
+			return 0;
+		}
 
 		trace_kvm_mmu_paging_element(pte, walker->level);
 
@@ -200,15 +204,25 @@ walk:
 				is_large_pte(pte) &&
 				mmu->root_level == PT64_ROOT_LEVEL)) {
 			int lvl = walker->level;
+			gpa_t real_gpa;
+			gfn_t gfn;
 
-			walker->gfn = gpte_to_gfn_lvl(pte, lvl);
-			walker->gfn += (addr & PT_LVL_OFFSET_MASK(lvl))
-					>> PAGE_SHIFT;
+			gfn = gpte_to_gfn_lvl(pte, lvl);
+			gfn += (addr & PT_LVL_OFFSET_MASK(lvl)) >> PAGE_SHIFT;
 
 			if (PTTYPE == 32 &&
 			    walker->level == PT_DIRECTORY_LEVEL &&
 			    is_cpuid_PSE36())
-				walker->gfn += pse36_gfn_delta(pte);
+				gfn += pse36_gfn_delta(pte);
+
+			real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn),
+						      &error);
+			if (real_gpa == UNMAPPED_GVA) {
+				walker->error_code = error;
+				return 0;
+			}
+
+			walker->gfn = real_gpa >> PAGE_SHIFT;
 
 			break;
 		}
@@ -237,7 +251,7 @@ walk:
 	return 1;
 
 not_present:
-	walker->error_code = 0;
+	walker->error_code = error;
 	goto err;
 
 access_error:
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (13 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 14/22] KVM: MMU: Make walk_addr_generic capable for two-level walking Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 12:52   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces the kvm_read_guest_page_x86 function
which reads from the physical memory of the guest. If the
guest is itself running in guest-mode with nested paging
enabled, it will read from the guest's guest physical
memory instead.
The patch also changes the code to use this function where
it is necessary.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    7 +++++++
 arch/x86/kvm/x86.c              |   29 +++++++++++++++++++++++++----
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7851bbc..d9dfc8c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -254,6 +254,13 @@ struct kvm_mmu {
 	union kvm_mmu_page_role base_role;
 	bool direct_map;
 
+	/*
+	 * If true the mmu runs in two-level mode.
+	 * vcpu->arch.nested_mmu needs to contain meaningful values in
+	 * this case.
+	 */
+	bool nested;
+
 	u64 *pae_root;
 	u64 rsvd_bits_mask[2][4];
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 558d995..317ad26 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -379,6 +379,20 @@ int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
 
+int kvm_read_guest_page_x86(struct kvm_vcpu *vcpu, gfn_t gfn,
+			    void *data, int offset, int len, u32 *error)
+{
+	struct kvm_mmu *mmu;
+
+	if (vcpu->arch.mmu.nested)
+		mmu = &vcpu->arch.nested_mmu;
+	else
+		mmu = &vcpu->arch.mmu;
+
+	return kvm_read_guest_page_tdp(vcpu, mmu, gfn, data, offset, len,
+				       error);
+}
+
 /*
  * Load the pae pdptrs.  Return true is they are all valid.
  */
@@ -386,12 +400,13 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
 	unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
-	int i;
+	int i, error;
 	int ret;
 	u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
 
-	ret = kvm_read_guest_page(vcpu->kvm, pdpt_gfn, pdpte,
-				  offset * sizeof(u64), sizeof(pdpte));
+	ret = kvm_read_guest_page_x86(vcpu, pdpt_gfn, pdpte,
+				      offset * sizeof(u64),
+				      sizeof(pdpte), &error);
 	if (ret < 0) {
 		ret = 0;
 		goto out;
@@ -420,6 +435,9 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
 {
 	u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
 	bool changed = true;
+	int offset;
+	u32 error;
+	gfn_t gfn;
 	int r;
 
 	if (is_long_mode(vcpu) || !is_pae(vcpu))
@@ -429,7 +447,10 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
 		      (unsigned long *)&vcpu->arch.regs_avail))
 		return true;
 
-	r = kvm_read_guest(vcpu->kvm, vcpu->arch.cr3 & ~31u, pdpte, sizeof(pdpte));
+	gfn = (vcpu->arch.cr3 & ~31u) >> PAGE_SHIFT;
+	offset = (vcpu->arch.cr3 & ~31u) & (PAGE_SIZE - 1);
+	r = kvm_read_guest_page_x86(vcpu, gfn, pdpte, offset, sizeof(pdpte),
+				    &error);
 	if (r < 0)
 		goto out;
 	changed = memcmp(pdpte, vcpu->arch.pdptrs, sizeof(pdpte)) != 0;
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (14 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86() Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 12:58   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 17/22] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces two fields in vcpu_arch for x86:

	* fault_address
	* fault_error_code

This will be used to correctly propagate page faults back
into the guest when we could have either an ordinary page
fault or a nested page fault. In the case of a nested page
fault the fault address is different from the original
address that should be walked, so we need to keep track of
the real fault address.
We could also remove the current way the error_code is
passed down to the fault handler, but this change is too
invasive and outside the scope of this patch set. It will
be changed and tested separately.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    3 +++
 arch/x86/kvm/paging_tmpl.h      |    4 ++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d9dfc8c..8426870 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -298,6 +298,9 @@ struct kvm_vcpu_arch {
 	/* Used for two dimensional paging emulation */
 	struct kvm_mmu nested_mmu;
 
+	unsigned long fault_address;
+	int fault_error_code;
+
 	/* only needed in kvm_pv_mmu_op() path, but it's hot so
 	 * put it here to avoid allocation */
 	struct kvm_pv_mmu_op_buffer mmu_op_buffer;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 8b27026..df92ce2 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -266,6 +266,10 @@ err:
 		walker->error_code |= PFERR_FETCH_MASK;
 	if (rsvd_fault)
 		walker->error_code |= PFERR_RSVD_MASK;
+
+	vcpu->arch.fault_address    = addr;
+	vcpu->arch.fault_error_code = walker->error_code;
+
 	trace_kvm_mmu_walker_error(walker->error_code);
 	return 0;
 }
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 17/22] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (15 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 18/22] KVM: X86: Propagate fetch faults Joerg Roedel
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch implements logic to make sure that either a
page-fault/page-fault-vmexit or a nested-page-fault-vmexit
is propagated back to the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/emulate.c          |   20 ++++++++++----------
 arch/x86/kvm/mmu.h              |    1 +
 arch/x86/kvm/x86.c              |   20 ++++++++++++++++++--
 4 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8426870..d024e27 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -645,6 +645,7 @@ void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
 void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
 			   u32 error_code);
+void kvm_propagate_fault(struct kvm_vcpu *vcpu);
 int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			    gfn_t gfn, void *data, int offset, int len,
 			    u32 *error);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5ac0bb4..171e1c7 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1336,7 +1336,7 @@ static int read_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 	addr = dt.address + index * 8;
 	ret = ops->read_std(addr, desc, sizeof *desc, ctxt->vcpu,  &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT)
-		kvm_inject_page_fault(ctxt->vcpu, addr, err);
+		kvm_propagate_fault(ctxt->vcpu);
 
        return ret;
 }
@@ -1362,7 +1362,7 @@ static int write_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 	addr = dt.address + index * 8;
 	ret = ops->write_std(addr, desc, sizeof *desc, ctxt->vcpu, &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT)
-		kvm_inject_page_fault(ctxt->vcpu, addr, err);
+		kvm_propagate_fault(ctxt->vcpu);
 
 	return ret;
 }
@@ -2165,7 +2165,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
 			    &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2175,7 +2175,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
 			     &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2183,7 +2183,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
 			    &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2196,7 +2196,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
 				     ctxt->vcpu, &err);
 		if (ret == X86EMUL_PROPAGATE_FAULT) {
 			/* FIXME: need to provide precise fault address */
-			kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+			kvm_propagate_fault(ctxt->vcpu);
 			return ret;
 		}
 	}
@@ -2304,7 +2304,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
 			    &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2314,7 +2314,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
 			     &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2322,7 +2322,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
 			    &err);
 	if (ret == X86EMUL_PROPAGATE_FAULT) {
 		/* FIXME: need to provide precise fault address */
-		kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+		kvm_propagate_fault(ctxt->vcpu);
 		return ret;
 	}
 
@@ -2335,7 +2335,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
 				     ctxt->vcpu, &err);
 		if (ret == X86EMUL_PROPAGATE_FAULT) {
 			/* FIXME: need to provide precise fault address */
-			kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+			kvm_propagate_fault(ctxt->vcpu);
 			return ret;
 		}
 	}
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 64f619b..b42b27e 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -47,6 +47,7 @@
 #define PFERR_USER_MASK (1U << 2)
 #define PFERR_RSVD_MASK (1U << 3)
 #define PFERR_FETCH_MASK (1U << 4)
+#define PFERR_NESTED_MASK (1U << 31)
 
 int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 317ad26..4d3a698 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -324,6 +324,22 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 	kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
 }
 
+void kvm_propagate_fault(struct kvm_vcpu *vcpu)
+{
+	unsigned long address;
+	u32 nested, error;
+
+	address = vcpu->arch.fault_address;
+	error   = vcpu->arch.fault_error_code;
+	nested  = error &  PFERR_NESTED_MASK;
+	error   = error & ~PFERR_NESTED_MASK;
+
+	if (vcpu->arch.mmu.nested && !nested)
+		vcpu->arch.nested_mmu.inject_page_fault(vcpu, address, error);
+	else
+		vcpu->arch.mmu.inject_page_fault(vcpu, address, error);
+}
+
 void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.nmi_pending = 1;
@@ -3338,7 +3354,7 @@ static int emulator_read_emulated(unsigned long addr,
 	gpa = kvm_mmu_gva_to_gpa_read(vcpu, addr, &error_code);
 
 	if (gpa == UNMAPPED_GVA) {
-		kvm_inject_page_fault(vcpu, addr, error_code);
+		kvm_propagate_fault(vcpu);
 		return X86EMUL_PROPAGATE_FAULT;
 	}
 
@@ -3392,7 +3408,7 @@ static int emulator_write_emulated_onepage(unsigned long addr,
 	gpa = kvm_mmu_gva_to_gpa_write(vcpu, addr, &error_code);
 
 	if (gpa == UNMAPPED_GVA) {
-		kvm_inject_page_fault(vcpu, addr, error_code);
+		kvm_propagate_fault(vcpu);
 		return X86EMUL_PROPAGATE_FAULT;
 	}
 
-- 
1.7.0.4



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 18/22] KVM: X86: Propagate fetch faults
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (16 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 17/22] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 19/22] KVM: MMU: Introduce init_kvm_nested_mmu() Joerg Roedel
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

KVM currently ignores fetch faults in the instruction
emulator. With nested-npt such faults can occur. This
patch adds the code to handle them.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/emulate.c |    2 +-
 arch/x86/kvm/x86.c     |    4 ++++
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 171e1c7..bb90307 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1260,7 +1260,7 @@ done_prefixes:
 	}
 
 done:
-	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
+	return rc;
 }
 
 static int pio_in_emulated(struct x86_emulate_ctxt *ctxt,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4d3a698..d159319 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3856,6 +3856,10 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 			? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
 
 		r = x86_decode_insn(&vcpu->arch.emulate_ctxt, &emulate_ops);
+		if (r == X86EMUL_PROPAGATE_FAULT) {
+			kvm_propagate_fault(vcpu);
+			return EMULATE_DONE;
+		}
 		trace_kvm_emulate_insn_start(vcpu);
 
 		/* Only allow emulation of specific instructions on #UD
-- 
1.7.0.4




* [PATCH 19/22] KVM: MMU: Introduce init_kvm_nested_mmu()
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (17 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 18/22] KVM: X86: Propagate fetch faults Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch introduces the init_kvm_nested_mmu() function
which is used to re-initialize the nested mmu when the l2
guest changes its paging mode.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/mmu.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8bd40b5..af89e71 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2136,6 +2136,25 @@ static gpa_t translate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 *error)
 	return gpa;
 }
 
+static gpa_t translate_nested_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 *error)
+{
+	gpa_t t_gpa;
+	u32 access;
+	u32 err;
+
+	BUG_ON(!vcpu->arch.mmu.nested);
+
+	/* NPT walks are treated as user writes */
+	access = PFERR_WRITE_MASK | PFERR_USER_MASK;
+	t_gpa  = vcpu->arch.nested_mmu.gva_to_gpa(vcpu, gpa, access, &err);
+	if (t_gpa == UNMAPPED_GVA) {
+		vcpu->arch.fault_address    = gpa;
+		vcpu->arch.fault_error_code = err | PFERR_NESTED_MASK;
+	}
+
+	return t_gpa;
+}
+
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
 				  u32 access, u32 *error)
 {
@@ -2465,11 +2484,45 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 	return r;
 }
 
+static int init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
+{
+	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
+	struct kvm_mmu *h_context = &vcpu->arch.mmu;
+
+	g_context->get_cr3           = get_cr3;
+	g_context->translate_gpa     = translate_nested_gpa;
+	g_context->inject_page_fault = kvm_inject_page_fault;
+
+	/*
+	 * Note that arch.mmu.gva_to_gpa translates l2_gva to l1_gpa. The
+	 * translation of l2_gpa to l1_gpa addresses is done using the
+	 * arch.nested_mmu.gva_to_gpa function. Basically the gva_to_gpa
+	 * functions between mmu and nested_mmu are swapped.
+	 */
+	if (!is_paging(vcpu)) {
+		g_context->root_level = 0;
+		h_context->gva_to_gpa = nonpaging_gva_to_gpa_nested;
+	} else if (is_long_mode(vcpu)) {
+		g_context->root_level = PT64_ROOT_LEVEL;
+		h_context->gva_to_gpa = paging64_gva_to_gpa_nested;
+	} else if (is_pae(vcpu)) {
+		g_context->root_level = PT32E_ROOT_LEVEL;
+		h_context->gva_to_gpa = paging64_gva_to_gpa_nested;
+	} else {
+		g_context->root_level = PT32_ROOT_LEVEL;
+		h_context->gva_to_gpa = paging32_gva_to_gpa_nested;
+	}
+
+	return 0;
+}
+
 static int init_kvm_mmu(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.update_pte.pfn = bad_pfn;
 
-	if (tdp_enabled)
+	if (vcpu->arch.mmu.nested)
+		return init_kvm_nested_mmu(vcpu);
+	else if (tdp_enabled)
 		return init_kvm_tdp_mmu(vcpu);
 	else
 		return init_kvm_softmmu(vcpu);
-- 
1.7.0.4




* [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (18 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 19/22] KVM: MMU: Introduce init_kvm_nested_mmu() Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 13:01   ` Avi Kivity
  2010-04-27 10:38 ` [PATCH 21/22] KVM: SVM: Report Nested Paging support to userspace Joerg Roedel
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds code to initialize the Nested Nested Paging
MMU context when the L1 guest executes a VMRUN instruction
and has nested paging enabled in its VMCB.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/mmu.c |    1 +
 arch/x86/kvm/svm.c |   56 ++++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index af89e71..e5dc853 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2569,6 +2569,7 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
 	mmu_free_roots(vcpu);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_unload);
 
 static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *sp,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e31f601..266b1d4 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -94,7 +94,6 @@ struct nested_state {
 
 	/* Nested Paging related state */
 	u64 nested_cr3;
-
 };
 
 #define MSRPM_OFFSETS	16
@@ -283,6 +282,15 @@ static inline void flush_guest_tlb(struct kvm_vcpu *vcpu)
 	force_new_asid(vcpu);
 }
 
+static int get_npt_level(void)
+{
+#ifdef CONFIG_X86_64
+	return PT64_ROOT_LEVEL;
+#else
+	return PT32E_ROOT_LEVEL;
+#endif
+}
+
 static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	if (!npt_enabled && !(efer & EFER_LMA))
@@ -1523,6 +1531,27 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu,
 	nested_svm_vmexit(svm);
 }
 
+static int nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
+{
+	int r;
+
+	r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
+
+	vcpu->arch.mmu.set_cr3           = nested_svm_set_tdp_cr3;
+	vcpu->arch.mmu.get_cr3           = nested_svm_get_tdp_cr3;
+	vcpu->arch.mmu.inject_page_fault = nested_svm_inject_npf_exit;
+	vcpu->arch.mmu.shadow_root_level = get_npt_level();
+	vcpu->arch.nested_mmu.gva_to_gpa = vcpu->arch.mmu.gva_to_gpa;
+	vcpu->arch.mmu.nested            = true;
+
+	return r;
+}
+
+static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.mmu.nested = false;
+}
+
 static int nested_svm_check_permissions(struct vcpu_svm *svm)
 {
 	if (!(svm->vcpu.arch.efer & EFER_SVME)
@@ -1889,6 +1918,8 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
 	kvm_clear_exception_queue(&svm->vcpu);
 	kvm_clear_interrupt_queue(&svm->vcpu);
 
+	svm->nested.nested_cr3 = 0;
+
 	/* Restore selected save entries */
 	svm->vmcb->save.es = hsave->save.es;
 	svm->vmcb->save.cs = hsave->save.cs;
@@ -1915,6 +1946,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	nested_svm_unmap(page);
 
+	nested_svm_uninit_mmu_context(&svm->vcpu);
 	kvm_mmu_reset_context(&svm->vcpu);
 	kvm_mmu_load(&svm->vcpu);
 
@@ -1968,6 +2000,13 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (!nested_vmcb)
 		return false;
 
+	/* Do check if nested paging is allowed for the guest */
+	if (nested_vmcb->control.nested_ctl && !npt_enabled) {
+		nested_vmcb->control.exit_code = SVM_EXIT_ERR;
+		nested_svm_unmap(page);
+		return false;
+	}
+
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip - 3, vmcb_gpa,
 			       nested_vmcb->save.rip,
 			       nested_vmcb->control.int_ctl,
@@ -2012,6 +2051,12 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	else
 		svm->vcpu.arch.hflags &= ~HF_HIF_MASK;
 
+	if (nested_vmcb->control.nested_ctl) {
+		kvm_mmu_unload(&svm->vcpu);
+		svm->nested.nested_cr3 = nested_vmcb->control.nested_cr3;
+		nested_svm_init_mmu_context(&svm->vcpu);
+	}
+
 	/* Load the nested guest state */
 	svm->vmcb->save.es = nested_vmcb->save.es;
 	svm->vmcb->save.cs = nested_vmcb->save.cs;
@@ -3171,15 +3216,6 @@ static bool svm_cpu_has_accelerated_tpr(void)
 	return false;
 }
 
-static int get_npt_level(void)
-{
-#ifdef CONFIG_X86_64
-	return PT64_ROOT_LEVEL;
-#else
-	return PT32E_ROOT_LEVEL;
-#endif
-}
-
 static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 {
 	return 0;
-- 
1.7.0.4




* [PATCH 21/22] KVM: SVM: Report Nested Paging support to userspace
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (19 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 10:38 ` [PATCH 22/22] KVM: SVM: Expect two more candidates for exit_int_info Joerg Roedel
  2010-04-27 13:03 ` [PATCH 0/22] Nested Paging support for Nested SVM v2 Avi Kivity
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch implements reporting of nested paging support
to userspace.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 266b1d4..6d6b300 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3233,7 +3233,12 @@ static void svm_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
 		entry->ebx = 8; /* Lets support 8 ASIDs in case we add proper
 				   ASID emulation to nested SVM */
 		entry->ecx = 0; /* Reserved */
-		entry->edx = 0; /* Do not support any additional features */
+		entry->edx = 0; /* Per default do not support any
+				   additional features */
+
+		/* Support NPT for the guest if enabled */
+		if (npt_enabled)
+			entry->edx = SVM_FEATURE_NPT;
 
 		break;
 	}
-- 
1.7.0.4




* [PATCH 22/22] KVM: SVM: Expect two more candidates for exit_int_info
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (20 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 21/22] KVM: SVM: Report Nested Paging support to userspace Joerg Roedel
@ 2010-04-27 10:38 ` Joerg Roedel
  2010-04-27 13:03 ` [PATCH 0/22] Nested Paging support for Nested SVM v2 Avi Kivity
  22 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 10:38 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds the INTR and NMI intercepts to the list of
intercepts expected with exit_int_info set. While this
cannot happen on bare metal, it is architecturally legal
and may happen with KVM's SVM emulation.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6d6b300..482ef5d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2764,7 +2764,8 @@ static int handle_exit(struct kvm_vcpu *vcpu)
 
 	if (is_external_interrupt(svm->vmcb->control.exit_int_info) &&
 	    exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR &&
-	    exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH)
+	    exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH &&
+	    exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI)
 		printk(KERN_ERR "%s: unexpected exit_ini_info 0x%x "
 		       "exit_code 0x%x\n",
 		       __func__, svm->vmcb->control.exit_int_info,
-- 
1.7.0.4




* Re: [PATCH 02/22] KVM: MMU: Make tdp_enabled a mmu-context parameter
  2010-04-27 10:38 ` [PATCH 02/22] KVM: MMU: Make tdp_enabled a mmu-context parameter Joerg Roedel
@ 2010-04-27 12:06   ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:06 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch changes the tdp_enabled flag from its global
> meaning to the mmu-context and renames it to direct_map
> there. This is necessary for Nested SVM with emulation of
> Nested Paging where we need an extra MMU context to shadow
> the Nested Nested Page Table.
>
> @@ -2423,6 +2424,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
>   		r = paging32_init_context(vcpu);
>
>   	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
> +	vcpu->arch.mmu.direct_map        = false;
>
>   	return r;
>   }
>    

We could also set direct_map = true for real mode, would probably 
simplify some later logic (can be cleaned up later, no need to update now).

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-27 10:38 ` [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
@ 2010-04-27 12:34   ` Avi Kivity
  2010-04-28 10:52     ` Joerg Roedel
  2010-04-28 11:03     ` Joerg Roedel
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:34 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch introduces a mmu-callback to translate gpa
> addresses in the walk_addr code. This is later used to
> translate l2_gpa addresses into l1_gpa addresses.
>
> +static inline gfn_t gpa_to_gfn(gpa_t gpa)
> +{
> +	return (gfn_t)gpa>>  PAGE_SHIFT;
> +}
> +
>    

This overflows on 32-bit, since gpa_t is u64 and gfn_t is ulong.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions
  2010-04-27 10:38 ` [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
@ 2010-04-27 12:37   ` Avi Kivity
  2010-04-28 14:20     ` Joerg Roedel
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:37 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch adds the functions to do a nested l2_gva to
> l1_gpa page table walk.
>
> Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h |    4 ++++
>   arch/x86/kvm/mmu.c              |    8 ++++++++
>   arch/x86/kvm/paging_tmpl.h      |   31 +++++++++++++++++++++++++++++++
>   3 files changed, 43 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index ccdbd2f..3cbfb51 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -287,6 +287,10 @@ struct kvm_vcpu_arch {
>   	bool tpr_access_reporting;
>
>   	struct kvm_mmu mmu;
> +
> +	/* Used for two dimensional paging emulation */
> +	struct kvm_mmu nested_mmu;
> +
>   	/* only needed in kvm_pv_mmu_op() path, but it's hot so
>   	 * put it here to avoid allocation */
>   	struct kvm_pv_mmu_op_buffer mmu_op_buffer;
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 4e0bfdb..8bd40b5 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2144,6 +2144,14 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
>   	return vaddr;
>   }
>
> +static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr,
> +					 u32 access, u32 *error)
> +{
> +	if (error)
> +		*error = 0;
>    

Why allow error == NULL?

>
> +static gpa_t FNAME(gva_to_gpa_nested)(struct kvm_vcpu *vcpu, gva_t vaddr,
> +				      u32 access, u32 *error)
> +{
> +	struct guest_walker walker;
> +	gpa_t gpa = UNMAPPED_GVA;
> +	int r;
> +
> +	r = FNAME(walk_addr_nested)(&walker, vcpu, vaddr,
> +				    !!(access&  PFERR_WRITE_MASK),
> +				    !!(access&  PFERR_USER_MASK),
> +				    !!(access&  PFERR_FETCH_MASK));
>    

Those !! are unneeded.


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function
  2010-04-27 10:38 ` [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function Joerg Roedel
@ 2010-04-27 12:42   ` Avi Kivity
  2010-04-27 13:10     ` Joerg Roedel
  2010-04-27 13:40     ` Avi Kivity
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:42 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch adds a function which can read from the guests
> physical memory or from the guest's guest physical memory.
> This will be used in the two-dimensional page table walker.
>
> Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h |    3 +++
>   arch/x86/kvm/x86.c              |   24 ++++++++++++++++++++++++
>   2 files changed, 27 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3cbfb51..7851bbc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -635,6 +635,9 @@ void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
>   void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
>   void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
>   			   u32 error_code);
> +int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> +			    gfn_t gfn, void *data, int offset, int len,
> +			    u32 *error);
>   bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
>
>   int kvm_pic_set_irq(void *opaque, int irq, int level);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6b2ce1d..558d995 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -356,6 +356,30 @@ bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
>   EXPORT_SYMBOL_GPL(kvm_require_cpl);
>
>   /*
> + * This function will be used to read from the physical memory of the currently
> + * running guest. The difference to kvm_read_guest_page ist that this function
> + * can read from guest physical or from the guest's guest physical memory.
> + */
>    

s/ist/is/

> +int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> +			    gfn_t gfn, void *data, int offset, int len,
> +			    u32 *error)
>    

Naming: I see 'tdp' as a host property, and this is valid whether tdp is 
enabled or not.  Suggest calling it kvm_read_nested_guest_page().

Following mmu.txt, the parameter would be ngfn, not gfn.

> +{
> +	gfn_t real_gfn;
> +	gpa_t gpa;
> +
> +	*error   = 0;
> +	gpa      = gfn<<  PAGE_SHIFT;
> +	real_gfn = mmu->translate_gpa(vcpu, gpa, error);
>    

Overflow: sizeof(gpa) > sizeof(gfn).

> +	if (real_gfn == UNMAPPED_GVA)
> +		return -EFAULT;
> +
> +	real_gfn>>= PAGE_SHIFT;
>    

gpa_to_gfn().

> +
> +	return kvm_read_guest_page(vcpu->kvm, real_gfn, data, offset, len);
> +}
> +EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
> +
>    

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 10:38 ` [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86() Joerg Roedel
@ 2010-04-27 12:52   ` Avi Kivity
  2010-04-27 13:20     ` Joerg Roedel
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:52 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch introduces the kvm_read_guest_page_x86 function
> which reads from the physical memory of the guest. If the
> guest is running in guest-mode itself with nested paging
> enabled it will read from the guest's guest physical memory
> instead.
> The patch also changes changes the code to use this function
> where it is necessary.
>
>
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 7851bbc..d9dfc8c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -254,6 +254,13 @@ struct kvm_mmu {
>   	union kvm_mmu_page_role base_role;
>   	bool direct_map;
>
> +	/*
> +	 * If true the mmu runs in two-level mode.
> +	 * vcpu->arch.nested_mmu needs to contain meaningful values in
> +	 * this case.
> +	 */
> +	bool nested;
> +
>    

struct mmu_context *active_mmu? (in vcpu->arch)

>   	u64 *pae_root;
>   	u64 rsvd_bits_mask[2][4];
>   };
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 558d995..317ad26 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -379,6 +379,20 @@ int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>   }
>   EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
>
> +int kvm_read_guest_page_x86(struct kvm_vcpu *vcpu, gfn_t gfn,
> +			    void *data, int offset, int len, u32 *error)
> +{
> +	struct kvm_mmu *mmu;
> +
> +	if (vcpu->arch.mmu.nested)
> +		mmu =&vcpu->arch.nested_mmu;
> +	else
> +		mmu =&vcpu->arch.mmu;
> +
> +	return kvm_read_guest_page_tdp(vcpu, mmu, gfn, data, offset, len,
> +				       error);
> +}
>    

This is really not x86 specific (though the implementation certainly 
is).  s390 will have exactly the same need when it gets nested virt.  I 
think this can be folded into 
kvm_read_guest_page_tdp()/kvm_read_nested_guest_page().


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 10:38 ` [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
@ 2010-04-27 12:58   ` Avi Kivity
  2010-04-27 13:28     ` Joerg Roedel
  2010-05-03 16:32     ` Joerg Roedel
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 12:58 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch introduces two fields in vcpu_arch for x86:
>
> 	* fault_address
> 	* fault_error_code
>
> This will be used to correctly propagate page faults back
> into the guest when we could have either an ordinary page
> fault or a nested page fault. In the case of a nested page
> fault the fault-address is different from the original
> address that should be walked. So we need to keep track
> about the real fault-address.
> We could also remove the current path of the error_code to
> the fault. But this change is too invasive and outside the
> scope of this patch set. It will be changed and tested
> seperatly.
>
> Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h |    3 +++
>   arch/x86/kvm/paging_tmpl.h      |    4 ++++
>   2 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d9dfc8c..8426870 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -298,6 +298,9 @@ struct kvm_vcpu_arch {
>   	/* Used for two dimensional paging emulation */
>   	struct kvm_mmu nested_mmu;
>
> +	unsigned long fault_address;
>    

Probably a problem on i386.  How does npt handle faults when the guest 
is using pae paging and the host (in our case the guest...) isn't?  I 
see it uses exit_info_2 for the address, which is a u64.

So we probably need to upgrade gva_t to a u64.  Please send this as a 
separate patch, and test on i386 hosts.

> +	int fault_error_code;
>    

unsigned.

Maybe put the two in a struct, easier to pass around.


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN
  2010-04-27 10:38 ` [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
@ 2010-04-27 13:01   ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 13:01 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> This patch adds code to initialize the Nested Nested Paging
> MMU context when the L1 guest executes a VMRUN instruction
> and has nested paging enabled in its VMCB.
>   				  struct kvm_mmu_page *sp,
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index e31f601..266b1d4 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -94,7 +94,6 @@ struct nested_state {
>
>   	/* Nested Paging related state */
>   	u64 nested_cr3;
> -
>   };
>    

Better not to introduce in the first place (I resisted commenting on it, 
but now I broke down).


-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 0/22] Nested Paging support for Nested SVM v2
  2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
                   ` (21 preceding siblings ...)
  2010-04-27 10:38 ` [PATCH 22/22] KVM: SVM: Expect two more candidates for exit_int_info Joerg Roedel
@ 2010-04-27 13:03 ` Avi Kivity
  22 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 13:03 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> Hi,
>
> this is the second and reworked version of my nested paging for nested svm
> patchset. Changes to the previous version include:
>
> 	* Renamed mmu.tdp_enabled to mmu.direct_map
> 	* Introduced two helper functions to read physical memory
> 	  locations of the current running guest level
> 	* Fixed a couple of bugs were KVM needs to read l2
> 	  physical memory and others
>
> This patchset is tested with KVM and Windows 7 XPmode. I also tested in KVM
> with Linux and Windows 7 as l2 guests. I also tested different pagesize
> combinations and did a stress tests for a couple of hours. All these tests
> showed no introduced regressions.
> Please review this second version of the patch-set. I appreciate your feedback.
>    

Ok, apart from the i386 overflow nasties this looks really good!  I 
look forward to merging v3.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function
  2010-04-27 12:42   ` Avi Kivity
@ 2010-04-27 13:10     ` Joerg Roedel
  2010-04-27 13:40     ` Avi Kivity
  1 sibling, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 13:10 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:42:27PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch adds a function which can read from the guests
> >physical memory or from the guest's guest physical memory.
> >This will be used in the two-dimensional page table walker.
> >
> >Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> >---
> >  arch/x86/include/asm/kvm_host.h |    3 +++
> >  arch/x86/kvm/x86.c              |   24 ++++++++++++++++++++++++
> >  2 files changed, 27 insertions(+), 0 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index 3cbfb51..7851bbc 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -635,6 +635,9 @@ void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
> >  void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
> >  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
> >  			   u32 error_code);
> >+int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> >+			    gfn_t gfn, void *data, int offset, int len,
> >+			    u32 *error);
> >  bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
> >
> >  int kvm_pic_set_irq(void *opaque, int irq, int level);
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 6b2ce1d..558d995 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -356,6 +356,30 @@ bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
> >  EXPORT_SYMBOL_GPL(kvm_require_cpl);
> >
> >  /*
> >+ * This function will be used to read from the physical memory of the currently
> >+ * running guest. The difference to kvm_read_guest_page ist that this function
> >+ * can read from guest physical or from the guest's guest physical memory.
> >+ */
> 
> s/ist/is/

This is a common typo I make ;) I'll fix it.

> >+int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> >+			    gfn_t gfn, void *data, int offset, int len,
> >+			    u32 *error)
> 
> Naming: I see 'tdp' as a host property, and this is valid whether
> tdp is enabled or not.  Suggest calling it
> kvm_read_nested_guest_page().
> 
> Following mmu.txt, the parameter would be ngfn, not gfn.
> 
> >+{
> >+	gfn_t real_gfn;
> >+	gpa_t gpa;
> >+
> >+	*error   = 0;
> >+	gpa      = gfn<<  PAGE_SHIFT;
> >+	real_gfn = mmu->translate_gpa(vcpu, gpa, error);
> 
> Overflow: sizeof(gpa) > sizeof(gfn).
> 
> >+	if (real_gfn == UNMAPPED_GVA)
> >+		return -EFAULT;
> >+
> >+	real_gfn>>= PAGE_SHIFT;
> 
> gpa_to_gfn().

I'll fix the overflow errors and send an updated patch.

Thanks,
	Joerg



* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 12:52   ` Avi Kivity
@ 2010-04-27 13:20     ` Joerg Roedel
  2010-04-27 13:35       ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 13:20 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:52:37PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch introduces the kvm_read_guest_page_x86 function
> >which reads from the physical memory of the guest. If the
> >guest is running in guest-mode itself with nested paging
> >enabled it will read from the guest's guest physical memory
> >instead.
> >The patch also changes changes the code to use this function
> >where it is necessary.
> >
> >
> >
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index 7851bbc..d9dfc8c 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -254,6 +254,13 @@ struct kvm_mmu {
> >  	union kvm_mmu_page_role base_role;
> >  	bool direct_map;
> >
> >+	/*
> >+	 * If true the mmu runs in two-level mode.
> >+	 * vcpu->arch.nested_mmu needs to contain meaningful values in
> >+	 * this case.
> >+	 */
> >+	bool nested;
> >+
> 
> struct mmu_context *active_mmu? (in vcpu->arch)

Hmm, difficult since both mmus are active in the npt-npt case. The
arch.mmu struct contains mostly the l1 paging state initialized for
shadow paging and different set_cr3/get_cr3/inject_page_fault functions.
This keeps the changes to the mmu small and optimizes for the common
case (a nested npt fault).
The arch.nested_mmu contains the l2 paging mode and is only used for
nested gva_to_gpa translations (that's why it is only partially
initialized).

> >  	u64 *pae_root;
> >  	u64 rsvd_bits_mask[2][4];
> >  };
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 558d995..317ad26 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -379,6 +379,20 @@ int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
> >
> >+int kvm_read_guest_page_x86(struct kvm_vcpu *vcpu, gfn_t gfn,
> >+			    void *data, int offset, int len, u32 *error)
> >+{
> >+	struct kvm_mmu *mmu;
> >+
> >+	if (vcpu->arch.mmu.nested)
> >+		mmu =&vcpu->arch.nested_mmu;
> >+	else
> >+		mmu =&vcpu->arch.mmu;
> >+
> >+	return kvm_read_guest_page_tdp(vcpu, mmu, gfn, data, offset, len,
> >+				       error);
> >+}
> 
> This is really not x86 specific (though the implementation certainly
> is).  s390 will have exactly the same need when it gets nested virt.
> I think this can be folded into
> kvm_read_guest_page_tdp()/kvm_read_nested_guest_page().

For the generic walk_addr I need a version of that function that takes
an mmu_context parameter. That's the reason I made two functions.
The function (or at least its semantics) is useful for !x86 too, that's
right. But it currently can't be made generic because the MMU
implementation is architecture specific. Do you suggest giving it a
more generic name so we can move it later?

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 12:58   ` Avi Kivity
@ 2010-04-27 13:28     ` Joerg Roedel
  2010-04-27 13:37       ` Avi Kivity
  2010-05-03 16:32     ` Joerg Roedel
  1 sibling, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 13:28 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch introduces two fields in vcpu_arch for x86:
> >
> >	* fault_address
> >	* fault_error_code
> >
> >This will be used to correctly propagate page faults back
> >into the guest when we could have either an ordinary page
> >fault or a nested page fault. In the case of a nested page
> >fault the fault-address is different from the original
> >address that should be walked. So we need to keep track
> >about the real fault-address.
> >We could also remove the current path of the error_code to
> >the fault. But this change is too invasive and outside the
> >scope of this patch set. It will be changed and tested
> >seperatly.
> >
> >Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> >---
> >  arch/x86/include/asm/kvm_host.h |    3 +++
> >  arch/x86/kvm/paging_tmpl.h      |    4 ++++
> >  2 files changed, 7 insertions(+), 0 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index d9dfc8c..8426870 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -298,6 +298,9 @@ struct kvm_vcpu_arch {
> >  	/* Used for two dimensional paging emulation */
> >  	struct kvm_mmu nested_mmu;
> >
> >+	unsigned long fault_address;
> 
> Probably a problem on i386.  How does npt handle faults when the
> guest is using pae paging and the host (in our case the guest...)
> isn't?  I see it uses exit_info_2 for the address, which is a u64.

This shouldn't be an issue. If we run on a 32-bit host with nested
paging, the guest can't have more than 4GB of addressable memory because
of the page table limitations (the nested page table is always in host
format).

> So we probably need to upgrade gva_t to a u64.  Please send this as
> a separate patch, and test on i386 hosts.
> 
> >+	int fault_error_code;
> 
> unsigned.

Right. I'll fix it.

> Maybe put the two in a struct, easier to pass around.

Yeah, makes it easier to read.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 13:20     ` Joerg Roedel
@ 2010-04-27 13:35       ` Avi Kivity
  2010-04-27 15:40         ` Joerg Roedel
  2010-04-28 15:31         ` Joerg Roedel
  0 siblings, 2 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 13:35 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 04:20 PM, Joerg Roedel wrote:
> On Tue, Apr 27, 2010 at 03:52:37PM +0300, Avi Kivity wrote:
>    
>> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
>>      
>>> This patch introduces the kvm_read_guest_page_x86 function
>>> which reads from the physical memory of the guest. If the
>>> guest is running in guest-mode itself with nested paging
>>> enabled it will read from the guest's guest physical memory
>>> instead.
>>> The patch also changes the code to use this function
>>> where it is necessary.
>>>
>>>
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index 7851bbc..d9dfc8c 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -254,6 +254,13 @@ struct kvm_mmu {
>>>   	union kvm_mmu_page_role base_role;
>>>   	bool direct_map;
>>>
>>> +	/*
>>> +	 * If true the mmu runs in two-level mode.
>>> +	 * vcpu->arch.nested_mmu needs to contain meaningful values in
>>> +	 * this case.
>>> +	 */
>>> +	bool nested;
>>> +
>>>        
>> struct mmu_context *active_mmu? (in vcpu->arch)
>>      
> Hmm, difficult, since both mmus are active in the npt-npt case. The
> arch.mmu struct contains mostly the l1 paging state initialized for
> shadow paging and different set_cr3/get_cr3/inject_page_fault functions.
> This keeps the changes to the mmu small and optimizes for the common
> case (a nested npt fault).
>    

Well, it reduces the changes to the mmu, but it makes a 'struct kvm_mmu' 
incoherent since its meaning depends on whether it is nested or not.  
For someone reading the code, it is hard to see when to use ->nested_mmu 
or ->mmu.

Perhaps have

    struct kvm_mmu base_mmu;
    struct kvm_mmu nested_mmu;
    struct kvm_mmu *mmu;

You could have one patch that mindlessly changes mmu. to mmu->.  The 
impact of the patchset increases, but I think the result is more readable.

It will be a pain adapting the patchset, but easier than updating 
mmu.txt to reflect the current situation.

> The arch.nested_mmu contains the l2 paging mode and is only used for
> nested gva_to_gpa translations (that's the reason it is only partially
> initialized).
>
>    
>>>   	u64 *pae_root;
>>>   	u64 rsvd_bits_mask[2][4];
>>>   };
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 558d995..317ad26 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -379,6 +379,20 @@ int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>>>   }
>>>   EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
>>>
>>> +int kvm_read_guest_page_x86(struct kvm_vcpu *vcpu, gfn_t gfn,
>>> +			    void *data, int offset, int len, u32 *error)
>>> +{
>>> +	struct kvm_mmu *mmu;
>>> +
>>> +	if (vcpu->arch.mmu.nested)
>>> +		mmu =&vcpu->arch.nested_mmu;
>>> +	else
>>> +		mmu =&vcpu->arch.mmu;
>>> +
>>> +	return kvm_read_guest_page_tdp(vcpu, mmu, gfn, data, offset, len,
>>> +				       error);
>>> +}
>>>        
>> This is really not x86 specific (though the implementation certainly
>> is).  s390 will have exactly the same need when it gets nested virt.
>> I think this can be folded into
>> kvm_read_guest_page_tdp()/kvm_read_nested_guest_page().
>>      
> For the generic walk_addr I need a version of that function that takes
> an mmu_context parameter. That's the reason I made two functions.
> The function (or at least its semantics) is useful for !x86 too, that's
> right. But it currently can't be made generic because the MMU
> implementation is architecture specific. Do you suggest giving it a
> more generic name so we can move it later?
>    

kvm_read_guest_page_mmu() for the internal one, 
kvm_read_nested_guest_page() for the generic one?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 13:28     ` Joerg Roedel
@ 2010-04-27 13:37       ` Avi Kivity
  2010-04-27 13:57         ` Joerg Roedel
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 13:37 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 04:28 PM, Joerg Roedel wrote:
>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index d9dfc8c..8426870 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -298,6 +298,9 @@ struct kvm_vcpu_arch {
>>>   	/* Used for two dimensional paging emulation */
>>>   	struct kvm_mmu nested_mmu;
>>>
>>> +	unsigned long fault_address;
>>>        
>> Probably a problem on i386.  How does npt handle faults when the
>> guest is using pae paging and the host (in our case the guest...)
>> isn't?  I see it uses exit_info_2 for the address, which is a u64.
>>      
> This shouldn't be an issue. If we run on 32bit host with nested paging
> the guest can't have more than 4gb of addressable memory because of the
> page table limitations (nested page table is always in host format).
>    

But the nested guest can use pae paging and generate a #NPF with 
exit_info_2 > 4GB.  So we need to keep the full fault address; if we 
truncate, the guest might actually resolve the fault and let the nested 
guest continue.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function
  2010-04-27 12:42   ` Avi Kivity
  2010-04-27 13:10     ` Joerg Roedel
@ 2010-04-27 13:40     ` Avi Kivity
  1 sibling, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 13:40 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 03:42 PM, Avi Kivity wrote:
>
>> +{
>> +    gfn_t real_gfn;
>> +    gpa_t gpa;
>> +
>> +    *error   = 0;
>> +    gpa      = gfn<<  PAGE_SHIFT;

This overflows too :)

Use gfn_to_gpa().

>> +    real_gfn = mmu->translate_gpa(vcpu, gpa, error);
>
> Overflow: sizeof(gpa) > sizeof(gfn).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 13:37       ` Avi Kivity
@ 2010-04-27 13:57         ` Joerg Roedel
  2010-04-27 16:02           ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 13:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 04:37:42PM +0300, Avi Kivity wrote:
> On 04/27/2010 04:28 PM, Joerg Roedel wrote:
> >
> >>>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >>>index d9dfc8c..8426870 100644
> >>>--- a/arch/x86/include/asm/kvm_host.h
> >>>+++ b/arch/x86/include/asm/kvm_host.h
> >>>@@ -298,6 +298,9 @@ struct kvm_vcpu_arch {
> >>>  	/* Used for two dimensional paging emulation */
> >>>  	struct kvm_mmu nested_mmu;
> >>>
> >>>+	unsigned long fault_address;
> >>Probably a problem on i386.  How does npt handle faults when the
> >>guest is using pae paging and the host (in our case the guest...)
> >>isn't?  I see it uses exit_info_2 for the address, which is a u64.
> >This shouldn't be an issue. If we run on 32bit host with nested paging
> >the guest can't have more than 4gb of addressable memory because of the
> >page table limitations (nested page table is always in host format).
> 
> But the nested guest can use pae paging and generate a #NPF with
> exit_info_2 > 4GB.  So we need to keep the full fault address; if we
> truncate, the guest might actually resolve the fault and let the
> nested guest continue.

This could only be a malicious guest, because it can't have memory above
4GB. But a guest could certainly set up its page tables to point there,
that's true. So I'll change it to u64.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 13:35       ` Avi Kivity
@ 2010-04-27 15:40         ` Joerg Roedel
  2010-04-27 16:09           ` Avi Kivity
  2010-04-28 15:31         ` Joerg Roedel
  1 sibling, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 15:40 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 04:35:06PM +0300, Avi Kivity wrote:
> On 04/27/2010 04:20 PM, Joerg Roedel wrote:

> >Hmm, difficult since both mmu's are active in the npt-npt case. The
> >arch.mmu struct contains mostly the l1 paging state initialized for
> >shadow paging and different set_cr3/get_cr3/inject_page_fault functions.
> >This keeps the changes to the mmu small and optimize for the common case
> >(a nested npt fault).
> 
> Well, it reduces the changes to the mmu, but it makes a 'struct
> kvm_mmu' incoherent since its meaning depends on whether it is
> nested or not.  For someone reading the code, it is hard to see when
> to use ->nested_mmu or ->mmu.
> 
> Perhaps have
> 
>    struct kvm_mmu base_mmu;
>    struct kvm_mmu nested_mmu;
>    struct kvm_mmu *mmu;
> 
> You could have one patch that mindlessly changes mmu. to mmu->.  The
> impact of the patchset increases, but I think the result is more
> readable.
> 
> It will be a pain adapting the patchset, but easier than updating
> mmu.txt to reflect the current situation.

Okay, I finally read most of mmu.txt :)
I agree that the implemented scheme for using arch.mmu and
arch.nested_mmu is non-obvious. The most complicated thing is that
.gva_to_gpa functions are exchanged between mmu and nested_mmu. This
means that nested_mmu.gva_to_gpa basically walks a page table in a mode
described by arch.mmu while mmu.gva_to_gpa walks the full
two-dimensional page table.
But I also don't yet see how a *mmu pointer would help here. From my
current understanding of the idea the only place to use it would be the
init_kvm_mmu path. But maybe we could also do this to make things more
clear:

	* Rename nested_mmu to l2_mmu and use it to save the current
	  paging mode of the l2 guest
	* Do not switch the .gva_to_gpa function pointers between
	  mmu contexts and provide a wrapper function for all callers of
	  arch.mmu.gva_to_gpa which uses the right one for the current
	  context. The arch.nested flag is the indicator for which one
	  to use.

Btw. the purpose of the nested flag is to prevent resetting arch.mmu
when the l2 guest changes its paging mode; we could remove this flag
with a pointer solution.
Currently it's a bit unclear when to use mmu or nested_mmu. With a
pointer it would be unclear to the code reader when to use the pointer
and when to select the mmu_contexts directly.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 13:57         ` Joerg Roedel
@ 2010-04-27 16:02           ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 16:02 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 04:57 PM, Joerg Roedel wrote:
>
>> But the nested guest can use pae paging and generate a #NPF with
>> exit_info_2>  4GB.  So we need to keep the full fault address; if we
>> truncate, the guest might actually resolve the fault and let the
>> nested guest continue.
>>      
> This could only be a malicious guest because it can't have memory above
> 4gb. But a guest could certainly setup its page tables to point there,
> thats true. So I change it to u64.
>    

It doesn't need to be malicious, for example it could set up an mmio PCI 
BAR outside the 4GB space.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 15:40         ` Joerg Roedel
@ 2010-04-27 16:09           ` Avi Kivity
  2010-04-27 16:27             ` Joerg Roedel
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-04-27 16:09 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/27/2010 06:40 PM, Joerg Roedel wrote:
>
> Currently its a bit unclear when to use mmu or nested_mmu. With a
> pointer it would be unclear to the code reader when to use the pointer
> and when to select the mmu_contexts directly.
>    

I think in most cases you'd want full translation, thus the pointer.  
This should be the default.  In specific cases you'd want just the 
non-nested guest translation.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 16:09           ` Avi Kivity
@ 2010-04-27 16:27             ` Joerg Roedel
  0 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-27 16:27 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 07:09:37PM +0300, Avi Kivity wrote:
> On 04/27/2010 06:40 PM, Joerg Roedel wrote:
> >
> >Currently its a bit unclear when to use mmu or nested_mmu. With a
> >pointer it would be unclear to the code reader when to use the pointer
> >and when to select the mmu_contexts directly.
> 
> I think in most cases you'd want full translation, thus the pointer.
> This should be the default.  In specific cases you'd want just the
> non-nested guest translation.

Hmm, for most cases == all gva_to_gpa cases. The page fault path can't
use the pointer. I'll try out how this works. It shouldn't be too
complicated.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-27 12:34   ` Avi Kivity
@ 2010-04-28 10:52     ` Joerg Roedel
  2010-04-28 11:24       ` Avi Kivity
  2010-04-28 11:03     ` Joerg Roedel
  1 sibling, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-28 10:52 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:34:10PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch introduces a mmu-callback to translate gpa
> >addresses in the walk_addr code. This is later used to
> >translate l2_gpa addresses into l1_gpa addresses.
> >
> >+static inline gfn_t gpa_to_gfn(gpa_t gpa)
> >+{
> >+	return (gfn_t)gpa>>  PAGE_SHIFT;
> >+}
> >+
> 
> This overflows on 32-bit, since gpa_t is u64 and gfn_t is ulong.

Hm, this is a problem outside of this patchset too (for 32bit hosts).
The best solution is probably to convert gfn_t to u64 too.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-27 12:34   ` Avi Kivity
  2010-04-28 10:52     ` Joerg Roedel
@ 2010-04-28 11:03     ` Joerg Roedel
  2010-04-28 11:09       ` Avi Kivity
  1 sibling, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-04-28 11:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:34:10PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch introduces a mmu-callback to translate gpa
> >addresses in the walk_addr code. This is later used to
> >translate l2_gpa addresses into l1_gpa addresses.
> >
> >+static inline gfn_t gpa_to_gfn(gpa_t gpa)
> >+{
> >+	return (gfn_t)gpa>>  PAGE_SHIFT;
> >+}
> >+
> 
> This overflows on 32-bit, since gpa_t is u64 and gfn_t is ulong.

Thinking again about it, on 32 bit the physical address width is only 36
bits. So there shouldn't be an overflow, no?

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-28 11:03     ` Joerg Roedel
@ 2010-04-28 11:09       ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-28 11:09 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/28/2010 02:03 PM, Joerg Roedel wrote:
> On Tue, Apr 27, 2010 at 03:34:10PM +0300, Avi Kivity wrote:
>    
>> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
>>      
>>> This patch introduces a mmu-callback to translate gpa
>>> addresses in the walk_addr code. This is later used to
>>> translate l2_gpa addresses into l1_gpa addresses.
>>>
>>> +static inline gfn_t gpa_to_gfn(gpa_t gpa)
>>> +{
>>> +	return (gfn_t)gpa>>   PAGE_SHIFT;
>>> +}
>>> +
>>>        
>> This overflows on 32-bit, since gpa_t is u64 and gfn_t is ulong.
>>      
> Thinking again about it, on 32 bit the physical address width is only 36
> bits. So there shouldn't be an overflow, no?
>    

It's limited by MAXPHYADDR (at least on Intel) even on 32-bits.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker
  2010-04-28 10:52     ` Joerg Roedel
@ 2010-04-28 11:24       ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-04-28 11:24 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 04/28/2010 01:52 PM, Joerg Roedel wrote:
> On Tue, Apr 27, 2010 at 03:34:10PM +0300, Avi Kivity wrote:
>    
>> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
>>      
>>> This patch introduces a mmu-callback to translate gpa
>>> addresses in the walk_addr code. This is later used to
>>> translate l2_gpa addresses into l1_gpa addresses.
>>>
>>> +static inline gfn_t gpa_to_gfn(gpa_t gpa)
>>> +{
>>> +	return (gfn_t)gpa>>   PAGE_SHIFT;
>>> +}
>>> +
>>>        
>> This overflows on 32-bit, since gpa_t is u64 and gfn_t is ulong.
>>      
> Hm, this is a problem outside of this patchset too (for 32bit hosts).
> The best solution is probably to convert gfn_t to u64 too.
>    

If you cast like

    (gfn_t)(gpa >> PAGE_SHIFT)

you avoid the overflow for MAXPHYADDR <= 44.  However, I agree that
converting gfn_t to u64 is best; the minor performance degradation is in
no way comparable to the corruption that results from a miscast.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions
  2010-04-27 12:37   ` Avi Kivity
@ 2010-04-28 14:20     ` Joerg Roedel
  0 siblings, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-28 14:20 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:37:40PM +0300, Avi Kivity wrote:
> On 04/27/2010 01:38 PM, Joerg Roedel wrote:
> >This patch adds the functions to do a nested l2_gva to
> >l1_gpa page table walk.
> >
> >Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> >---
> >  arch/x86/include/asm/kvm_host.h |    4 ++++
> >  arch/x86/kvm/mmu.c              |    8 ++++++++
> >  arch/x86/kvm/paging_tmpl.h      |   31 +++++++++++++++++++++++++++++++
> >  3 files changed, 43 insertions(+), 0 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >index ccdbd2f..3cbfb51 100644
> >--- a/arch/x86/include/asm/kvm_host.h
> >+++ b/arch/x86/include/asm/kvm_host.h
> >@@ -287,6 +287,10 @@ struct kvm_vcpu_arch {
> >  	bool tpr_access_reporting;
> >
> >  	struct kvm_mmu mmu;
> >+
> >+	/* Used for two dimensional paging emulation */
> >+	struct kvm_mmu nested_mmu;
> >+
> >  	/* only needed in kvm_pv_mmu_op() path, but it's hot so
> >  	 * put it here to avoid allocation */
> >  	struct kvm_pv_mmu_op_buffer mmu_op_buffer;
> >diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >index 4e0bfdb..8bd40b5 100644
> >--- a/arch/x86/kvm/mmu.c
> >+++ b/arch/x86/kvm/mmu.c
> >@@ -2144,6 +2144,14 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
> >  	return vaddr;
> >  }
> >
> >+static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr,
> >+					 u32 access, u32 *error)
> >+{
> >+	if (error)
> >+		*error = 0;
> 
> Why allow error == NULL?

In the emulator there are some functions passing NULL for the error
code. In general it doesn't matter much because once we track this
information in the vcpu struct these error parameters can be removed. I
just left this outside the scope of this patchset and will do that
cleanup later.

> 
> >
> >+static gpa_t FNAME(gva_to_gpa_nested)(struct kvm_vcpu *vcpu, gva_t vaddr,
> >+				      u32 access, u32 *error)
> >+{
> >+	struct guest_walker walker;
> >+	gpa_t gpa = UNMAPPED_GVA;
> >+	int r;
> >+
> >+	r = FNAME(walk_addr_nested)(&walker, vcpu, vaddr,
> >+				    !!(access&  PFERR_WRITE_MASK),
> >+				    !!(access&  PFERR_USER_MASK),
> >+				    !!(access&  PFERR_FETCH_MASK));
> 
> Those !! are unneeded.

Ok, I'll remove them.

	Joerg


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86()
  2010-04-27 13:35       ` Avi Kivity
  2010-04-27 15:40         ` Joerg Roedel
@ 2010-04-28 15:31         ` Joerg Roedel
  1 sibling, 0 replies; 57+ messages in thread
From: Joerg Roedel @ 2010-04-28 15:31 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 04:35:06PM +0300, Avi Kivity wrote:
> Perhaps have
> 
>    struct kvm_mmu base_mmu;
>    struct kvm_mmu nested_mmu;
>    struct kvm_mmu *mmu;

Okay, the pointer is only used in the gva_to_gpa path which is the
minority of mmu context usages. I will introduce a

	struct kvm_mmu *walk_mmu;

and use it in the page walking code.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-04-27 12:58   ` Avi Kivity
  2010-04-27 13:28     ` Joerg Roedel
@ 2010-05-03 16:32     ` Joerg Roedel
  2010-05-04  7:53       ` Avi Kivity
  1 sibling, 1 reply; 57+ messages in thread
From: Joerg Roedel @ 2010-05-03 16:32 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel

On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
> So we probably need to upgrade gva_t to a u64.  Please send this as
> a separate patch, and test on i386 hosts.

Are there _any_ regular tests of KVM on i386 hosts? For me this is
terribly broken (even after I fixed the issue which gave me a
VMEXIT_INVALID at the first vmrun).

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-03 16:32     ` Joerg Roedel
@ 2010-05-04  7:53       ` Avi Kivity
  2010-05-04  9:11         ` Roedel, Joerg
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-05-04  7:53 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 05/03/2010 07:32 PM, Joerg Roedel wrote:
> On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
>    
>> So we probably need to upgrade gva_t to a u64.  Please send this as
>> a separate patch, and test on i386 hosts.
>>      
> Are there _any_ regular tests of KVM on i386 hosts? For me this is
> terribly broken (also after I fixed the issue which gave me a
> VMEXIT_INVALID at the first vmrun).
>
>    

No, apart from the poor users.  I'll try to set something up using nsvm.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  7:53       ` Avi Kivity
@ 2010-05-04  9:11         ` Roedel, Joerg
  2010-05-04  9:20           ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Roedel, Joerg @ 2010-05-04  9:11 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, May 04, 2010 at 03:53:57AM -0400, Avi Kivity wrote:
> On 05/03/2010 07:32 PM, Joerg Roedel wrote:
> > On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
> >    
> >> So we probably need to upgrade gva_t to a u64.  Please send this as
> >> a separate patch, and test on i386 hosts.
> >>      
> > Are there _any_ regular tests of KVM on i386 hosts? For me this is
> > terribly broken (also after I fixed the issue which gave me a
> > VMEXIT_INVALID at the first vmrun).
> >
> >    
> 
> No, apart from the poor users.  I'll try to set something up using nsvm.

Ok. I will post an initial fix for the VMEXIT_INVALID bug soon. Apart
from that I get a lockdep warning when I try to start a guest. The guest
actually boots if it is single-vcpu. SMP guests don't even boot through
the BIOS for me.

	Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  9:11         ` Roedel, Joerg
@ 2010-05-04  9:20           ` Avi Kivity
  2010-05-04  9:37             ` Roedel, Joerg
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-05-04  9:20 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 05/04/2010 12:11 PM, Roedel, Joerg wrote:
> On Tue, May 04, 2010 at 03:53:57AM -0400, Avi Kivity wrote:
>    
>> On 05/03/2010 07:32 PM, Joerg Roedel wrote:
>>      
>>> On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
>>>
>>>        
>>>> So we probably need to upgrade gva_t to a u64.  Please send this as
>>>> a separate patch, and test on i386 hosts.
>>>>
>>>>          
>>> Are there _any_ regular tests of KVM on i386 hosts? For me this is
>>> terribly broken (also after I fixed the issue which gave me a
>>> VMEXIT_INVALID at the first vmrun).
>>>
>>>
>>>        
>> No, apart from the poor users.  I'll try to set something up using nsvm.
>>      
> Ok. I will post an initial fix for the VMEXIT_INVALID bug soon. Apart
> from that I get a lockdep warning when I try to start a guest. The guest
> actually boots if it is single-vcpu. SMP guests don't even boot through
> the BIOS for me.
>
>    

Strange.  i386 vs x86_64 shouldn't have that much effect!

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  9:20           ` Avi Kivity
@ 2010-05-04  9:37             ` Roedel, Joerg
  2010-05-04  9:45               ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Roedel, Joerg @ 2010-05-04  9:37 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, May 04, 2010 at 05:20:02AM -0400, Avi Kivity wrote:
> On 05/04/2010 12:11 PM, Roedel, Joerg wrote:
> > On Tue, May 04, 2010 at 03:53:57AM -0400, Avi Kivity wrote:
> >    
> >> On 05/03/2010 07:32 PM, Joerg Roedel wrote:
> >>      
> >>> On Tue, Apr 27, 2010 at 03:58:36PM +0300, Avi Kivity wrote:
> >>>
> >>>        
> >>>> So we probably need to upgrade gva_t to a u64.  Please send this as
> >>>> a separate patch, and test on i386 hosts.
> >>>>
> >>>>          
> >>> Are there _any_ regular tests of KVM on i386 hosts? For me this is
> >>> terribly broken (also after I fixed the issue which gave me a
> >>> VMEXIT_INVALID at the first vmrun).
> >>>
> >>>
> >>>        
> >> No, apart from the poor users.  I'll try to set something up using nsvm.
> >>      
> > Ok. I will post an initial fix for the VMEXIT_INVALID bug soon. Apart
> > from that I get a lockdep warning when I try to start a guest. The guest
> > actually boots if it is single-vcpu. SMP guests don't even boot through
> > the BIOS for me.
> >
> >    
> 
> Strange.  i386 vs x86_64 shouldn't have that much effect!

This is the lockdep warning I get when I start booting a Linux kernel.
It is with the nested-npt patchset but the warning occurs without it too
(slightly different backtraces then).

[60390.953424] =======================================================
[60390.954324] [ INFO: possible circular locking dependency detected ]
[60390.954324] 2.6.34-rc5 #7
[60390.954324] -------------------------------------------------------
[60390.954324] qemu-system-x86/2506 is trying to acquire lock:
[60390.954324]  (&mm->mmap_sem){++++++}, at: [<c10ab0f4>] might_fault+0x4c/0x86
[60390.954324] 
[60390.954324] but task is already holding lock:
[60390.954324]  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<f8ec1b50>] spin_lock+0xd/0xf [kvm]
[60390.954324] 
[60390.954324] which lock already depends on the new lock.
[60390.954324] 
[60390.954324] 
[60390.954324] the existing dependency chain (in reverse order) is:
[60390.954324] 
[60390.954324] -> #1 (&(&kvm->mmu_lock)->rlock){+.+...}:
[60390.954324]        [<c10575ad>] __lock_acquire+0x9fa/0xb6c
[60390.954324]        [<c10577b8>] lock_acquire+0x99/0xb8
[60390.954324]        [<c15afa2b>] _raw_spin_lock+0x20/0x2f
[60390.954324]        [<f8eafe19>] spin_lock+0xd/0xf [kvm]
[60390.954324]        [<f8eb104e>] kvm_mmu_notifier_invalidate_range_start+0x2f/0x71 [kvm]
[60390.954324]        [<c10bc994>] __mmu_notifier_invalidate_range_start+0x31/0x57
[60390.954324]        [<c10b1de3>] mprotect_fixup+0x153/0x3d5
[60390.954324]        [<c10b21ca>] sys_mprotect+0x165/0x1db
[60390.954324]        [<c10028cc>] sysenter_do_call+0x12/0x32
[60390.954324] 
[60390.954324] -> #0 (&mm->mmap_sem){++++++}:
[60390.954324]        [<c10574af>] __lock_acquire+0x8fc/0xb6c
[60390.954324]        [<c10577b8>] lock_acquire+0x99/0xb8
[60390.954324]        [<c10ab111>] might_fault+0x69/0x86
[60390.954324]        [<c11d5987>] _copy_from_user+0x36/0x119
[60390.954324]        [<f8eafcd9>] copy_from_user+0xd/0xf [kvm]
[60390.954324]        [<f8eb0ac0>] kvm_read_guest_page+0x24/0x33 [kvm]
[60390.954324]        [<f8ebb362>] kvm_read_guest_page_mmu+0x55/0x63 [kvm]
[60390.954324]        [<f8ebb397>] kvm_read_nested_guest_page+0x27/0x2e [kvm]
[60390.954324]        [<f8ebb3da>] load_pdptrs+0x3c/0x9e [kvm]
[60390.954324]        [<f84747ac>] svm_cache_reg+0x25/0x2b [kvm_amd]
[60390.954324]        [<f8ec7894>] kvm_mmu_load+0xf1/0x1fa [kvm]
[60390.954324]        [<f8ebbdfc>] kvm_arch_vcpu_ioctl_run+0x252/0x9c7 [kvm]
[60390.954324]        [<f8eb1fb5>] kvm_vcpu_ioctl+0xee/0x432 [kvm]
[60390.954324]        [<c10cf8e9>] vfs_ioctl+0x2c/0x96
[60390.954324]        [<c10cfe88>] do_vfs_ioctl+0x491/0x4cf
[60390.954324]        [<c10cff0c>] sys_ioctl+0x46/0x66
[60390.954324]        [<c10028cc>] sysenter_do_call+0x12/0x32
[60390.954324] 
[60390.954324] other info that might help us debug this:
[60390.954324] 
[60390.954324] 3 locks held by qemu-system-x86/2506:
[60390.954324]  #0:  (&vcpu->mutex){+.+.+.}, at: [<f8eb1185>] vcpu_load+0x16/0x32 [kvm]
[60390.954324]  #1:  (&kvm->srcu){.+.+.+}, at: [<f8eb952c>] srcu_read_lock+0x0/0x33 [kvm]
[60390.954324]  #2:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<f8ec1b50>] spin_lock+0xd/0xf [kvm]
[60390.954324] 
[60390.954324] stack backtrace:
[60390.954324] Pid: 2506, comm: qemu-system-x86 Not tainted 2.6.34-rc5 #7
[60390.954324] Call Trace:
[60390.954324]  [<c15adf46>] ? printk+0x14/0x16
[60390.954324]  [<c1056877>] print_circular_bug+0x8a/0x96
[60390.954324]  [<c10574af>] __lock_acquire+0x8fc/0xb6c
[60390.954324]  [<f8ec1b50>] ? spin_lock+0xd/0xf [kvm]
[60390.954324]  [<c10ab0f4>] ? might_fault+0x4c/0x86
[60390.954324]  [<c10577b8>] lock_acquire+0x99/0xb8
[60390.954324]  [<c10ab0f4>] ? might_fault+0x4c/0x86
[60390.954324]  [<c10ab111>] might_fault+0x69/0x86
[60390.954324]  [<c10ab0f4>] ? might_fault+0x4c/0x86
[60390.954324]  [<c11d5987>] _copy_from_user+0x36/0x119
[60390.954324]  [<f8eafcd9>] copy_from_user+0xd/0xf [kvm]
[60390.954324]  [<f8eb0ac0>] kvm_read_guest_page+0x24/0x33 [kvm]
[60390.954324]  [<f8ebb362>] kvm_read_guest_page_mmu+0x55/0x63 [kvm]
[60390.954324]  [<f8ebb397>] kvm_read_nested_guest_page+0x27/0x2e [kvm]
[60390.954324]  [<f8ebb3da>] load_pdptrs+0x3c/0x9e [kvm]
[60390.954324]  [<f8ec1b50>] ? spin_lock+0xd/0xf [kvm]
[60390.954324]  [<c15afa32>] ? _raw_spin_lock+0x27/0x2f
[60390.954324]  [<f84747ac>] svm_cache_reg+0x25/0x2b [kvm_amd]
[60390.954324]  [<f84747ac>] ? svm_cache_reg+0x25/0x2b [kvm_amd]
[60390.954324]  [<f8ec7894>] kvm_mmu_load+0xf1/0x1fa [kvm]
[60390.954324]  [<f8ebbdfc>] kvm_arch_vcpu_ioctl_run+0x252/0x9c7 [kvm]
[60390.954324]  [<f8eb1fb5>] kvm_vcpu_ioctl+0xee/0x432 [kvm]
[60390.954324]  [<c1057710>] ? __lock_acquire+0xb5d/0xb6c
[60390.954324]  [<c107a300>] ? __rcu_process_callbacks+0x6/0x244
[60390.954324]  [<c119eb09>] ? file_has_perm+0x84/0x8d
[60390.954324]  [<c10cf8e9>] vfs_ioctl+0x2c/0x96
[60390.954324]  [<f8eb1ec7>] ? kvm_vcpu_ioctl+0x0/0x432 [kvm]
[60390.954324]  [<c10cfe88>] do_vfs_ioctl+0x491/0x4cf
[60390.954324]  [<c119ece0>] ? selinux_file_ioctl+0x43/0x46
[60390.954324]  [<c10cff0c>] sys_ioctl+0x46/0x66
[60390.954324]  [<c10028cc>] sysenter_do_call+0x12/0x32

What makes me wonder about this is that the two traces of the locks seem to
belong to different threads.

HTH, Joerg



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  9:37             ` Roedel, Joerg
@ 2010-05-04  9:45               ` Avi Kivity
  2010-05-04  9:50                 ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-05-04  9:45 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 05/04/2010 12:37 PM, Roedel, Joerg wrote:
>
> This is the lockdep warning I get when I start booting a Linux kernel.
> It is with the nested-npt patchset but the warning occurs without it too
> (slightly different backtraces then).
>
> [60390.953424] =======================================================
> [60390.954324] [ INFO: possible circular locking dependency detected ]
> [60390.954324] 2.6.34-rc5 #7
> [60390.954324] -------------------------------------------------------
> [60390.954324] qemu-system-x86/2506 is trying to acquire lock:
> [60390.954324]  (&mm->mmap_sem){++++++}, at: [<c10ab0f4>] might_fault+0x4c/0x86
> [60390.954324]
> [60390.954324] but task is already holding lock:
> [60390.954324]  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<f8ec1b50>] spin_lock+0xd/0xf [kvm]
> [60390.954324]
> [60390.954324] which lock already depends on the new lock.
> [60390.954324]
> [60390.954324]
> [60390.954324] the existing dependency chain (in reverse order) is:
> [60390.954324]
> [60390.954324] ->  #1 (&(&kvm->mmu_lock)->rlock){+.+...}:
> [60390.954324]        [<c10575ad>] __lock_acquire+0x9fa/0xb6c
> [60390.954324]        [<c10577b8>] lock_acquire+0x99/0xb8
> [60390.954324]        [<c15afa2b>] _raw_spin_lock+0x20/0x2f
> [60390.954324]        [<f8eafe19>] spin_lock+0xd/0xf [kvm]
> [60390.954324]        [<f8eb104e>] kvm_mmu_notifier_invalidate_range_start+0x2f/0x71 [kvm]
> [60390.954324]        [<c10bc994>] __mmu_notifier_invalidate_range_start+0x31/0x57
> [60390.954324]        [<c10b1de3>] mprotect_fixup+0x153/0x3d5
> [60390.954324]        [<c10b21ca>] sys_mprotect+0x165/0x1db
> [60390.954324]        [<c10028cc>] sysenter_do_call+0x12/0x32
>    

Unrelated.  This can take the lock and free it.  It only shows up 
because we do memory ops inside the mmu_lock, which is deeply forbidden 
(anything which touches user memory, including kmalloc(), can trigger 
mmu notifiers and recursive locking).

> [60390.954324]
> [60390.954324] ->  #0 (&mm->mmap_sem){++++++}:
> [60390.954324]        [<c10574af>] __lock_acquire+0x8fc/0xb6c
> [60390.954324]        [<c10577b8>] lock_acquire+0x99/0xb8
> [60390.954324]        [<c10ab111>] might_fault+0x69/0x86
> [60390.954324]        [<c11d5987>] _copy_from_user+0x36/0x119
> [60390.954324]        [<f8eafcd9>] copy_from_user+0xd/0xf [kvm]
> [60390.954324]        [<f8eb0ac0>] kvm_read_guest_page+0x24/0x33 [kvm]
> [60390.954324]        [<f8ebb362>] kvm_read_guest_page_mmu+0x55/0x63 [kvm]
> [60390.954324]        [<f8ebb397>] kvm_read_nested_guest_page+0x27/0x2e [kvm]
> [60390.954324]        [<f8ebb3da>] load_pdptrs+0x3c/0x9e [kvm]
> [60390.954324]        [<f84747ac>] svm_cache_reg+0x25/0x2b [kvm_amd]
> [60390.954324]        [<f8ec7894>] kvm_mmu_load+0xf1/0x1fa [kvm]
> [60390.954324]        [<f8ebbdfc>] kvm_arch_vcpu_ioctl_run+0x252/0x9c7 [kvm]
> [60390.954324]        [<f8eb1fb5>] kvm_vcpu_ioctl+0xee/0x432 [kvm]
> [60390.954324]        [<c10cf8e9>] vfs_ioctl+0x2c/0x96
> [60390.954324]        [<c10cfe88>] do_vfs_ioctl+0x491/0x4cf
> [60390.954324]        [<c10cff0c>] sys_ioctl+0x46/0x66
> [60390.954324]        [<c10028cc>] sysenter_do_call+0x12/0x32
>    


Just a silly bug.  kvm_pdptr_read() can cause a guest memory read on 
svm, in this case with the mmu lock taken.  I'll post something to fix it.

> What makes me wonder about this is that the two traces of the locks seem to
> belong to different threads.
>    

Ever increasing complexity...

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  9:45               ` Avi Kivity
@ 2010-05-04  9:50                 ` Avi Kivity
  2010-05-04 12:00                   ` Roedel, Joerg
  0 siblings, 1 reply; 57+ messages in thread
From: Avi Kivity @ 2010-05-04  9:50 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 05/04/2010 12:45 PM, Avi Kivity wrote:
>
>
> Just a silly bug.  kvm_pdptr_read() can cause a guest memory read on 
> svm, in this case with the mmu lock taken.  I'll post something to fix 
> it.

I guess this was not reported because most svm machines have npt, and 
this requires npt=0 to trigger.  Nonpae paging disables npt, so you were 
hit.  Interestingly, nsvm makes it more likely to appear, since npt on 
i386+pae will need the pdptrs.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04  9:50                 ` Avi Kivity
@ 2010-05-04 12:00                   ` Roedel, Joerg
  2010-05-04 12:04                     ` Avi Kivity
  0 siblings, 1 reply; 57+ messages in thread
From: Roedel, Joerg @ 2010-05-04 12:00 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, May 04, 2010 at 05:50:50AM -0400, Avi Kivity wrote:
> On 05/04/2010 12:45 PM, Avi Kivity wrote:
> >
> >
> > Just a silly bug.  kvm_pdptr_read() can cause a guest memory read on 
> > svm, in this case with the mmu lock taken.  I'll post something to fix 
> > it.
> 
> I guess this was not reported because most svm machines have npt, and 
> this requires npt=0 to trigger.  Nonpae paging disables npt, so you were 
> hit.  Interestingly, nsvm makes it more likely to appear, since npt on 
> i386+pae will need the pdptrs.

Hmm, actually it happened on 32 bit with npt enabled. I think this
can trigger when mmu_alloc_roots is called for a PAE guest because it
accidentally tries to read the root_gfn from the guest before it figures
out that it runs with tdp and can omit the gfn read from the guest.
I need to touch this code for nested-npt anyway and will look into a way
of improving this.

	Joerg




* Re: [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu
  2010-05-04 12:00                   ` Roedel, Joerg
@ 2010-05-04 12:04                     ` Avi Kivity
  0 siblings, 0 replies; 57+ messages in thread
From: Avi Kivity @ 2010-05-04 12:04 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 05/04/2010 03:00 PM, Roedel, Joerg wrote:
> On Tue, May 04, 2010 at 05:50:50AM -0400, Avi Kivity wrote:
>    
>> On 05/04/2010 12:45 PM, Avi Kivity wrote:
>>      
>>>
>>> Just a silly bug.  kvm_pdptr_read() can cause a guest memory read on
>>> svm, in this case with the mmu lock taken.  I'll post something to fix
>>> it.
>>>        
>> I guess this was not reported because most svm machines have npt, and
>> this requires npt=0 to trigger.  Nonpae paging disables npt, so you were
>> hit.  Interestingly, nsvm makes it more likely to appear, since npt on
>> i386+pae will need the pdptrs.
>>      
> Hmm, actually it happened on 32 bit with npt enabled. I think this
> can trigger when mmu_alloc_roots is called for a PAE guest because it
> accidentally tries to read the root_gfn from the guest before it figures
> out that it runs with tdp and can omit the gfn read from the guest.
>    

Yes.  I had a patchset which moved the 'direct' calculation earlier and 
skipped reading root_gfn if it was direct, but it was broken.  If you like 
I can resurrect it, but it may interfere with your work.

-- 
error compiling committee.c: too many arguments to function



end of thread, other threads:[~2010-05-04 12:04 UTC | newest]

Thread overview: 57+ messages
-- links below jump to the message on this page --
2010-04-27 10:38 [PATCH 0/22] Nested Paging support for Nested SVM v2 Joerg Roedel
2010-04-27 10:38 ` [PATCH 01/22] KVM: MMU: Check for root_level instead of long mode Joerg Roedel
2010-04-27 10:38 ` [PATCH 02/22] KVM: MMU: Make tdp_enabled a mmu-context parameter Joerg Roedel
2010-04-27 12:06   ` Avi Kivity
2010-04-27 10:38 ` [PATCH 03/22] KVM: MMU: Make set_cr3 a function pointer in kvm_mmu Joerg Roedel
2010-04-27 10:38 ` [PATCH 04/22] KVM: X86: Introduce a tdp_set_cr3 function Joerg Roedel
2010-04-27 10:38 ` [PATCH 05/22] KVM: MMU: Introduce get_cr3 function pointer Joerg Roedel
2010-04-27 10:38 ` [PATCH 06/22] KVM: MMU: Introduce inject_page_fault " Joerg Roedel
2010-04-27 10:38 ` [PATCH 07/22] KVM: SVM: Implement MMU helper functions for Nested Nested Paging Joerg Roedel
2010-04-27 10:38 ` [PATCH 08/22] KVM: MMU: Change init_kvm_softmmu to take a context as parameter Joerg Roedel
2010-04-27 10:38 ` [PATCH 09/22] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu Joerg Roedel
2010-04-27 10:38 ` [PATCH 10/22] KVM: MMU: Introduce generic walk_addr function Joerg Roedel
2010-04-27 10:38 ` [PATCH 11/22] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
2010-04-27 12:34   ` Avi Kivity
2010-04-28 10:52     ` Joerg Roedel
2010-04-28 11:24       ` Avi Kivity
2010-04-28 11:03     ` Joerg Roedel
2010-04-28 11:09       ` Avi Kivity
2010-04-27 10:38 ` [PATCH 12/22] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
2010-04-27 12:37   ` Avi Kivity
2010-04-28 14:20     ` Joerg Roedel
2010-04-27 10:38 ` [PATCH 13/22] KVM: X86: Add kvm_read_guest_page_tdp function Joerg Roedel
2010-04-27 12:42   ` Avi Kivity
2010-04-27 13:10     ` Joerg Roedel
2010-04-27 13:40     ` Avi Kivity
2010-04-27 10:38 ` [PATCH 14/22] KVM: MMU: Make walk_addr_generic capable for two-level walking Joerg Roedel
2010-04-27 10:38 ` [PATCH 15/22] KVM: MMU: Introduce kvm_read_guest_page_x86() Joerg Roedel
2010-04-27 12:52   ` Avi Kivity
2010-04-27 13:20     ` Joerg Roedel
2010-04-27 13:35       ` Avi Kivity
2010-04-27 15:40         ` Joerg Roedel
2010-04-27 16:09           ` Avi Kivity
2010-04-27 16:27             ` Joerg Roedel
2010-04-28 15:31         ` Joerg Roedel
2010-04-27 10:38 ` [PATCH 16/22] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
2010-04-27 12:58   ` Avi Kivity
2010-04-27 13:28     ` Joerg Roedel
2010-04-27 13:37       ` Avi Kivity
2010-04-27 13:57         ` Joerg Roedel
2010-04-27 16:02           ` Avi Kivity
2010-05-03 16:32     ` Joerg Roedel
2010-05-04  7:53       ` Avi Kivity
2010-05-04  9:11         ` Roedel, Joerg
2010-05-04  9:20           ` Avi Kivity
2010-05-04  9:37             ` Roedel, Joerg
2010-05-04  9:45               ` Avi Kivity
2010-05-04  9:50                 ` Avi Kivity
2010-05-04 12:00                   ` Roedel, Joerg
2010-05-04 12:04                     ` Avi Kivity
2010-04-27 10:38 ` [PATCH 17/22] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
2010-04-27 10:38 ` [PATCH 18/22] KVM: X86: Propagate fetch faults Joerg Roedel
2010-04-27 10:38 ` [PATCH 19/22] KVM: MMU: Introduce init_kvm_nested_mmu() Joerg Roedel
2010-04-27 10:38 ` [PATCH 20/22] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
2010-04-27 13:01   ` Avi Kivity
2010-04-27 10:38 ` [PATCH 21/22] KVM: SVM: Report Nested Paging support to userspace Joerg Roedel
2010-04-27 10:38 ` [PATCH 22/22] KVM: SVM: Expect two more candiates for exit_int_info Joerg Roedel
2010-04-27 13:03 ` [PATCH 0/22] Nested Paging support for Nested SVM v2 Avi Kivity
