* [PATCH 01/29] KVM: MMU: Check for root_level instead of long mode
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 02/29] KVM: MMU: Make tdp_enabled a mmu-context parameter Joerg Roedel
` (28 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
The walk_addr function checks for !is_long_mode in its 64
bit version. But what is meant here is a check for pae
paging. Change the condition to really check for pae paging
so that it also works with nested nested paging.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/paging_tmpl.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index debe770..e4ad3dc 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -132,7 +132,7 @@ walk:
walker->level = vcpu->arch.mmu.root_level;
pte = vcpu->arch.cr3;
#if PTTYPE == 64
- if (!is_long_mode(vcpu)) {
+ if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
trace_kvm_mmu_paging_element(pte, walker->level);
if (!is_present_gpte(pte)) {
@@ -205,7 +205,7 @@ walk:
(PTTYPE == 64 || is_pse(vcpu))) ||
((walker->level == PT_PDPE_LEVEL) &&
is_large_pte(pte) &&
- is_long_mode(vcpu))) {
+ vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL)) {
int lvl = walker->level;
walker->gfn = gpte_to_gfn_lvl(pte, lvl);
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 02/29] KVM: MMU: Make tdp_enabled a mmu-context parameter
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
2010-09-10 15:30 ` [PATCH 01/29] KVM: MMU: Check for root_level instead of long mode Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 03/29] KVM: MMU: Make set_cr3 a function pointer in kvm_mmu Joerg Roedel
` (27 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch changes the tdp_enabled flag from its global
meaning to the mmu-context and renames it to direct_map
there. This is necessary for Nested SVM with emulation of
Nested Paging where we need an extra MMU context to shadow
the Nested Nested Page Table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 22 ++++++++++++++--------
2 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9b30285..53cdf39 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -249,6 +249,7 @@ struct kvm_mmu {
int root_level;
int shadow_root_level;
union kvm_mmu_page_role base_role;
+ bool direct_map;
u64 *pae_root;
u64 rsvd_bits_mask[2][4];
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b2136f9..5c28e97 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1448,7 +1448,8 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
if (role.direct)
role.cr4_pae = 0;
role.access = access;
- if (!tdp_enabled && vcpu->arch.mmu.root_level <= PT32_ROOT_LEVEL) {
+ if (!vcpu->arch.mmu.direct_map
+ && vcpu->arch.mmu.root_level <= PT32_ROOT_LEVEL) {
quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level));
quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
role.quadrant = quadrant;
@@ -1973,7 +1974,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
spte |= shadow_user_mask;
if (level > PT_PAGE_TABLE_LEVEL)
spte |= PT_PAGE_SIZE_MASK;
- if (tdp_enabled)
+ if (vcpu->arch.mmu.direct_map)
spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
kvm_is_mmio_pfn(pfn));
@@ -1983,8 +1984,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
spte |= (u64)pfn << PAGE_SHIFT;
if ((pte_access & ACC_WRITE_MASK)
- || (!tdp_enabled && write_fault && !is_write_protection(vcpu)
- && !user_fault)) {
+ || (!vcpu->arch.mmu.direct_map && write_fault
+ && !is_write_protection(vcpu) && !user_fault)) {
if (level > PT_PAGE_TABLE_LEVEL &&
has_wrprotected_page(vcpu->kvm, gfn, level)) {
@@ -1995,7 +1996,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
spte |= PT_WRITABLE_MASK;
- if (!tdp_enabled && !(pte_access & ACC_WRITE_MASK))
+ if (!vcpu->arch.mmu.direct_map
+ && !(pte_access & ACC_WRITE_MASK))
spte &= ~PT_USER_MASK;
/*
@@ -2371,7 +2373,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
ASSERT(!VALID_PAGE(root));
if (mmu_check_root(vcpu, root_gfn))
return 1;
- if (tdp_enabled) {
+ if (vcpu->arch.mmu.direct_map) {
direct = 1;
root_gfn = 0;
}
@@ -2406,7 +2408,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
return 1;
} else if (vcpu->arch.mmu.root_level == 0)
root_gfn = 0;
- if (tdp_enabled) {
+ if (vcpu->arch.mmu.direct_map) {
direct = 1;
root_gfn = i << 30;
}
@@ -2544,6 +2546,7 @@ static int nonpaging_init_context(struct kvm_vcpu *vcpu)
context->root_level = 0;
context->shadow_root_level = PT32E_ROOT_LEVEL;
context->root_hpa = INVALID_PAGE;
+ context->direct_map = true;
return 0;
}
@@ -2663,6 +2666,7 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
context->root_level = level;
context->shadow_root_level = level;
context->root_hpa = INVALID_PAGE;
+ context->direct_map = false;
return 0;
}
@@ -2687,6 +2691,7 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
context->root_level = PT32_ROOT_LEVEL;
context->shadow_root_level = PT32E_ROOT_LEVEL;
context->root_hpa = INVALID_PAGE;
+ context->direct_map = false;
return 0;
}
@@ -2708,6 +2713,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->invlpg = nonpaging_invlpg;
context->shadow_root_level = kvm_x86_ops->get_tdp_level();
context->root_hpa = INVALID_PAGE;
+ context->direct_map = true;
if (!is_paging(vcpu)) {
context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -3060,7 +3066,7 @@ int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
gpa_t gpa;
int r;
- if (tdp_enabled)
+ if (vcpu->arch.mmu.direct_map)
return 0;
gpa = kvm_mmu_gva_to_gpa_read(vcpu, gva, NULL);
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 03/29] KVM: MMU: Make set_cr3 a function pointer in kvm_mmu
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
2010-09-10 15:30 ` [PATCH 01/29] KVM: MMU: Check for root_level instead of long mode Joerg Roedel
2010-09-10 15:30 ` [PATCH 02/29] KVM: MMU: Make tdp_enabled a mmu-context parameter Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 04/29] KVM: X86: Introduce a tdp_set_cr3 function Joerg Roedel
` (26 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This is necessary to implement Nested Nested Paging. As a
side effect this allows some cleanups in the SVM nested
paging code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 6 ++++--
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 53cdf39..43c8db0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -236,6 +236,7 @@ struct kvm_pio_request {
*/
struct kvm_mmu {
void (*new_cr3)(struct kvm_vcpu *vcpu);
+ void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
void (*free)(struct kvm_vcpu *vcpu);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 5c28e97..c8acb96 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2714,6 +2714,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->shadow_root_level = kvm_x86_ops->get_tdp_level();
context->root_hpa = INVALID_PAGE;
context->direct_map = true;
+ context->set_cr3 = kvm_x86_ops->set_cr3;
if (!is_paging(vcpu)) {
context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2752,7 +2753,8 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
r = paging32_init_context(vcpu);
vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
- vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+ vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+ vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
return r;
}
@@ -2796,7 +2798,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
if (r)
goto out;
/* set_cr3() should ensure TLB has been flushed */
- kvm_x86_ops->set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
+ vcpu->arch.mmu.set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
out:
return r;
}
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 04/29] KVM: X86: Introduce a tdp_set_cr3 function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (2 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 03/29] KVM: MMU: Make set_cr3 a function pointer in kvm_mmu Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 05/29] KVM: MMU: Introduce get_cr3 function pointer Joerg Roedel
` (25 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces a special set_tdp_cr3 function pointer
in kvm_x86_ops which is only used for tpd enabled mmu
contexts. This allows to remove some hacks from svm code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/mmu.c | 2 +-
arch/x86/kvm/svm.c | 23 ++++++++++++++---------
arch/x86/kvm/vmx.c | 2 ++
4 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 43c8db0..aeeea9c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -526,6 +526,8 @@ struct kvm_x86_ops {
bool (*rdtscp_supported)(void);
void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment);
+ void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
+
void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
bool (*has_wbinvd_exit)(void);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c8acb96..a55f8d5 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2714,7 +2714,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->shadow_root_level = kvm_x86_ops->get_tdp_level();
context->root_hpa = INVALID_PAGE;
context->direct_map = true;
- context->set_cr3 = kvm_x86_ops->set_cr3;
+ context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
if (!is_paging(vcpu)) {
context->gva_to_gpa = nonpaging_gva_to_gpa;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6808f64..094df31 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3216,9 +3216,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
gs_selector = kvm_read_gs();
ldt_selector = kvm_read_ldt();
svm->vmcb->save.cr2 = vcpu->arch.cr2;
- /* required for live migration with NPT */
- if (npt_enabled)
- svm->vmcb->save.cr3 = vcpu->arch.cr3;
clgi();
@@ -3335,16 +3332,22 @@ static void svm_set_cr3(struct kvm_vcpu *vcpu, unsigned long root)
{
struct vcpu_svm *svm = to_svm(vcpu);
- if (npt_enabled) {
- svm->vmcb->control.nested_cr3 = root;
- force_new_asid(vcpu);
- return;
- }
-
svm->vmcb->save.cr3 = root;
force_new_asid(vcpu);
}
+static void set_tdp_cr3(struct kvm_vcpu *vcpu, unsigned long root)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ svm->vmcb->control.nested_cr3 = root;
+
+ /* Also sync guest cr3 here in case we live migrate */
+ svm->vmcb->save.cr3 = vcpu->arch.cr3;
+
+ force_new_asid(vcpu);
+}
+
static int is_disabled(void)
{
u64 vm_cr;
@@ -3571,6 +3574,8 @@ static struct kvm_x86_ops svm_x86_ops = {
.write_tsc_offset = svm_write_tsc_offset,
.adjust_tsc_offset = svm_adjust_tsc_offset,
+
+ .set_tdp_cr3 = set_tdp_cr3,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 676555c..0e62d8a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4347,6 +4347,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
.write_tsc_offset = vmx_write_tsc_offset,
.adjust_tsc_offset = vmx_adjust_tsc_offset,
+
+ .set_tdp_cr3 = vmx_set_cr3,
};
static int __init vmx_init(void)
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 05/29] KVM: MMU: Introduce get_cr3 function pointer
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (3 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 04/29] KVM: X86: Introduce a tdp_set_cr3 function Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 06/29] KVM: MMU: Introduce inject_page_fault " Joerg Roedel
` (24 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This function pointer in the MMU context is required to
implement Nested Nested Paging.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 9 ++++++++-
arch/x86/kvm/paging_tmpl.h | 4 ++--
3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index aeeea9c..ab708ee 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -237,6 +237,7 @@ struct kvm_pio_request {
struct kvm_mmu {
void (*new_cr3)(struct kvm_vcpu *vcpu);
void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
+ unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
void (*free)(struct kvm_vcpu *vcpu);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a55f8d5..e4a7de4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2365,7 +2365,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
int direct = 0;
u64 pdptr;
- root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;
+ root_gfn = vcpu->arch.mmu.get_cr3(vcpu) >> PAGE_SHIFT;
if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
hpa_t root = vcpu->arch.mmu.root_hpa;
@@ -2562,6 +2562,11 @@ static void paging_new_cr3(struct kvm_vcpu *vcpu)
mmu_free_roots(vcpu);
}
+static unsigned long get_cr3(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.cr3;
+}
+
static void inject_page_fault(struct kvm_vcpu *vcpu,
u64 addr,
u32 err_code)
@@ -2715,6 +2720,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->root_hpa = INVALID_PAGE;
context->direct_map = true;
context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
+ context->get_cr3 = get_cr3;
if (!is_paging(vcpu)) {
context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2755,6 +2761,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
+ vcpu->arch.mmu.get_cr3 = get_cr3;
return r;
}
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index e4ad3dc..13d0c06 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -130,7 +130,7 @@ walk:
present = true;
eperm = rsvd_fault = false;
walker->level = vcpu->arch.mmu.root_level;
- pte = vcpu->arch.cr3;
+ pte = vcpu->arch.mmu.get_cr3(vcpu);
#if PTTYPE == 64
if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
@@ -143,7 +143,7 @@ walk:
}
#endif
ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
- (vcpu->arch.cr3 & CR3_NONPAE_RESERVED_BITS) == 0);
+ (vcpu->arch.mmu.get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
pt_access = ACC_ALL;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 06/29] KVM: MMU: Introduce inject_page_fault function pointer
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (4 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 05/29] KVM: MMU: Introduce get_cr3 function pointer Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 07/29] KVM: MMU: Introduce kvm_init_shadow_mmu helper function Joerg Roedel
` (23 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces an inject_page_fault function pointer
into struct kvm_mmu which will be used to inject a page
fault. This will be used later when Nested Nested Paging is
implemented.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/mmu.c | 4 +++-
2 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ab708ee..3fefcd8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -239,6 +239,9 @@ struct kvm_mmu {
void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
+ void (*inject_page_fault)(struct kvm_vcpu *vcpu,
+ unsigned long addr,
+ u32 error_code);
void (*free)(struct kvm_vcpu *vcpu);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
u32 *error);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e4a7de4..a751dfc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2571,7 +2571,7 @@ static void inject_page_fault(struct kvm_vcpu *vcpu,
u64 addr,
u32 err_code)
{
- kvm_inject_page_fault(vcpu, addr, err_code);
+ vcpu->arch.mmu.inject_page_fault(vcpu, addr, err_code);
}
static void paging_free(struct kvm_vcpu *vcpu)
@@ -2721,6 +2721,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->direct_map = true;
context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
context->get_cr3 = get_cr3;
+ context->inject_page_fault = kvm_inject_page_fault;
if (!is_paging(vcpu)) {
context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2762,6 +2763,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
vcpu->arch.mmu.get_cr3 = get_cr3;
+ vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
return r;
}
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 07/29] KVM: MMU: Introduce kvm_init_shadow_mmu helper function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (5 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 06/29] KVM: MMU: Introduce inject_page_fault " Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 08/29] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu Joerg Roedel
` (22 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
Some logic of the init_kvm_softmmu function is required to
build the Nested Nested Paging context. So factor the
required logic into a seperate function and export it.
Also make the whole init path suitable for more than one mmu
context.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/mmu.c | 60 ++++++++++++++++++++++++++++++---------------------
arch/x86/kvm/mmu.h | 1 +
2 files changed, 36 insertions(+), 25 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a751dfc..9e48a77 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2532,10 +2532,9 @@ static void nonpaging_free(struct kvm_vcpu *vcpu)
mmu_free_roots(vcpu);
}
-static int nonpaging_init_context(struct kvm_vcpu *vcpu)
+static int nonpaging_init_context(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context)
{
- struct kvm_mmu *context = &vcpu->arch.mmu;
-
context->new_cr3 = nonpaging_new_cr3;
context->page_fault = nonpaging_page_fault;
context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -2595,9 +2594,10 @@ static bool is_rsvd_bits_set(struct kvm_vcpu *vcpu, u64 gpte, int level)
#include "paging_tmpl.h"
#undef PTTYPE
-static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, int level)
+static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context,
+ int level)
{
- struct kvm_mmu *context = &vcpu->arch.mmu;
int maxphyaddr = cpuid_maxphyaddr(vcpu);
u64 exb_bit_rsvd = 0;
@@ -2656,9 +2656,11 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, int level)
}
}
-static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
+static int paging64_init_context_common(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context,
+ int level)
{
- struct kvm_mmu *context = &vcpu->arch.mmu;
+ reset_rsvds_bits_mask(vcpu, context, level);
ASSERT(is_pae(vcpu));
context->new_cr3 = paging_new_cr3;
@@ -2675,17 +2677,17 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu, int level)
return 0;
}
-static int paging64_init_context(struct kvm_vcpu *vcpu)
+static int paging64_init_context(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context)
{
- reset_rsvds_bits_mask(vcpu, PT64_ROOT_LEVEL);
- return paging64_init_context_common(vcpu, PT64_ROOT_LEVEL);
+ return paging64_init_context_common(vcpu, context, PT64_ROOT_LEVEL);
}
-static int paging32_init_context(struct kvm_vcpu *vcpu)
+static int paging32_init_context(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context)
{
- struct kvm_mmu *context = &vcpu->arch.mmu;
+ reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
- reset_rsvds_bits_mask(vcpu, PT32_ROOT_LEVEL);
context->new_cr3 = paging_new_cr3;
context->page_fault = paging32_page_fault;
context->gva_to_gpa = paging32_gva_to_gpa;
@@ -2700,10 +2702,10 @@ static int paging32_init_context(struct kvm_vcpu *vcpu)
return 0;
}
-static int paging32E_init_context(struct kvm_vcpu *vcpu)
+static int paging32E_init_context(struct kvm_vcpu *vcpu,
+ struct kvm_mmu *context)
{
- reset_rsvds_bits_mask(vcpu, PT32E_ROOT_LEVEL);
- return paging64_init_context_common(vcpu, PT32E_ROOT_LEVEL);
+ return paging64_init_context_common(vcpu, context, PT32E_ROOT_LEVEL);
}
static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
@@ -2727,15 +2729,15 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->gva_to_gpa = nonpaging_gva_to_gpa;
context->root_level = 0;
} else if (is_long_mode(vcpu)) {
- reset_rsvds_bits_mask(vcpu, PT64_ROOT_LEVEL);
+ reset_rsvds_bits_mask(vcpu, context, PT64_ROOT_LEVEL);
context->gva_to_gpa = paging64_gva_to_gpa;
context->root_level = PT64_ROOT_LEVEL;
} else if (is_pae(vcpu)) {
- reset_rsvds_bits_mask(vcpu, PT32E_ROOT_LEVEL);
+ reset_rsvds_bits_mask(vcpu, context, PT32E_ROOT_LEVEL);
context->gva_to_gpa = paging64_gva_to_gpa;
context->root_level = PT32E_ROOT_LEVEL;
} else {
- reset_rsvds_bits_mask(vcpu, PT32_ROOT_LEVEL);
+ reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
context->gva_to_gpa = paging32_gva_to_gpa;
context->root_level = PT32_ROOT_LEVEL;
}
@@ -2743,24 +2745,32 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
return 0;
}
-static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
+int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
{
int r;
-
ASSERT(vcpu);
ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
if (!is_paging(vcpu))
- r = nonpaging_init_context(vcpu);
+ r = nonpaging_init_context(vcpu, context);
else if (is_long_mode(vcpu))
- r = paging64_init_context(vcpu);
+ r = paging64_init_context(vcpu, context);
else if (is_pae(vcpu))
- r = paging32E_init_context(vcpu);
+ r = paging32E_init_context(vcpu, context);
else
- r = paging32_init_context(vcpu);
+ r = paging32_init_context(vcpu, context);
vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+
+ return r;
+}
+EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
+
+static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
+{
+ int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
+
vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
vcpu->arch.mmu.get_cr3 = get_cr3;
vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index f05a03d..7086ca8 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -49,6 +49,7 @@
#define PFERR_FETCH_MASK (1U << 4)
int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
+int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
{
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 08/29] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (6 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 07/29] KVM: MMU: Introduce kvm_init_shadow_mmu helper function Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 09/29] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
` (21 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch changes is_rsvd_bits_set() function prototype to
take only a kvm_mmu context instead of a full vcpu.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/mmu.c | 6 +++---
arch/x86/kvm/paging_tmpl.h | 7 ++++---
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9e48a77..86f7557 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2578,12 +2578,12 @@ static void paging_free(struct kvm_vcpu *vcpu)
nonpaging_free(vcpu);
}
-static bool is_rsvd_bits_set(struct kvm_vcpu *vcpu, u64 gpte, int level)
+static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
{
int bit7;
bit7 = (gpte >> 7) & 1;
- return (gpte & vcpu->arch.mmu.rsvd_bits_mask[bit7][level-1]) != 0;
+ return (gpte & mmu->rsvd_bits_mask[bit7][level-1]) != 0;
}
#define PTTYPE 64
@@ -2859,7 +2859,7 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
return;
}
- if (is_rsvd_bits_set(vcpu, *(u64 *)new, PT_PAGE_TABLE_LEVEL))
+ if (is_rsvd_bits_set(&vcpu->arch.mmu, *(u64 *)new, PT_PAGE_TABLE_LEVEL))
return;
++vcpu->kvm->stat.mmu_pte_updated;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 13d0c06..68ee1b7 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -168,7 +168,7 @@ walk:
break;
}
- if (is_rsvd_bits_set(vcpu, pte, walker->level)) {
+ if (is_rsvd_bits_set(&vcpu->arch.mmu, pte, walker->level)) {
rsvd_fault = true;
break;
}
@@ -327,6 +327,7 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
u64 *sptep)
{
struct kvm_mmu_page *sp;
+ struct kvm_mmu *mmu = &vcpu->arch.mmu;
pt_element_t *gptep = gw->prefetch_ptes;
u64 *spte;
int i;
@@ -358,7 +359,7 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
gpte = gptep[i];
if (!is_present_gpte(gpte) ||
- is_rsvd_bits_set(vcpu, gpte, PT_PAGE_TABLE_LEVEL)) {
+ is_rsvd_bits_set(mmu, gpte, PT_PAGE_TABLE_LEVEL)) {
if (!sp->unsync)
__set_spte(spte, shadow_notrap_nonpresent_pte);
continue;
@@ -713,7 +714,7 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
return -EINVAL;
gfn = gpte_to_gfn(gpte);
- if (is_rsvd_bits_set(vcpu, gpte, PT_PAGE_TABLE_LEVEL)
+ if (is_rsvd_bits_set(&vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL)
|| gfn != sp->gfns[i] || !is_present_gpte(gpte)
|| !(gpte & PT_ACCESSED_MASK)) {
u64 nonpresent;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 09/29] KVM: MMU: Track page fault data in struct vcpu
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (7 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 08/29] KVM: MMU: Let is_rsvd_bits_set take mmu context instead of vcpu Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 10/29] KVM: MMU: Introduce generic walk_addr function Joerg Roedel
` (20 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces a struct with two new fields in
vcpu_arch for x86:
* fault.address
* fault.error_code
This will be used to correctly propagate page faults back
into the guest when we could have either an ordinary page
fault or a nested page fault. In the case of a nested page
fault the fault-address is different from the original
address that should be walked. So we need to keep track
about the real fault-address.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_emulate.h | 1 -
arch/x86/include/asm/kvm_host.h | 17 ++++++++++++-----
arch/x86/kvm/emulate.c | 30 ++++++++++++++----------------
arch/x86/kvm/mmu.c | 6 ++----
arch/x86/kvm/paging_tmpl.h | 6 +++++-
arch/x86/kvm/x86.c | 9 +++++----
6 files changed, 38 insertions(+), 31 deletions(-)
diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 1bf1140..5187dd8 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -229,7 +229,6 @@ struct x86_emulate_ctxt {
int exception; /* exception that happens during emulation or -1 */
u32 error_code; /* error code for exception */
bool error_code_valid;
- unsigned long cr2; /* faulted address in case of #PF */
/* decode cache */
struct decode_cache decode;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3fefcd8..235023e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -239,9 +239,7 @@ struct kvm_mmu {
void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
int (*page_fault)(struct kvm_vcpu *vcpu, gva_t gva, u32 err);
- void (*inject_page_fault)(struct kvm_vcpu *vcpu,
- unsigned long addr,
- u32 error_code);
+ void (*inject_page_fault)(struct kvm_vcpu *vcpu);
void (*free)(struct kvm_vcpu *vcpu);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
u32 *error);
@@ -288,6 +286,16 @@ struct kvm_vcpu_arch {
bool tpr_access_reporting;
struct kvm_mmu mmu;
+
+ /*
+ * This struct is filled with the necessary information to propagate a
+ * page fault into the guest
+ */
+ struct {
+ u64 address;
+ unsigned error_code;
+ } fault;
+
/* only needed in kvm_pv_mmu_op() path, but it's hot so
* put it here to avoid allocation */
struct kvm_pv_mmu_op_buffer mmu_op_buffer;
@@ -624,8 +632,7 @@ void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
-void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
- u32 error_code);
+void kvm_inject_page_fault(struct kvm_vcpu *vcpu);
bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
int kvm_pic_set_irq(void *opaque, int irq, int level);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 27d2c22..2b08b78 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -487,11 +487,9 @@ static void emulate_gp(struct x86_emulate_ctxt *ctxt, int err)
emulate_exception(ctxt, GP_VECTOR, err, true);
}
-static void emulate_pf(struct x86_emulate_ctxt *ctxt, unsigned long addr,
- int err)
+static void emulate_pf(struct x86_emulate_ctxt *ctxt)
{
- ctxt->cr2 = addr;
- emulate_exception(ctxt, PF_VECTOR, err, true);
+ emulate_exception(ctxt, PF_VECTOR, 0, true);
}
static void emulate_ud(struct x86_emulate_ctxt *ctxt)
@@ -834,7 +832,7 @@ static int read_emulated(struct x86_emulate_ctxt *ctxt,
rc = ops->read_emulated(addr, mc->data + mc->end, n, &err,
ctxt->vcpu);
if (rc == X86EMUL_PROPAGATE_FAULT)
- emulate_pf(ctxt, addr, err);
+ emulate_pf(ctxt);
if (rc != X86EMUL_CONTINUE)
return rc;
mc->end += n;
@@ -921,7 +919,7 @@ static int read_segment_descriptor(struct x86_emulate_ctxt *ctxt,
addr = dt.address + index * 8;
ret = ops->read_std(addr, desc, sizeof *desc, ctxt->vcpu, &err);
if (ret == X86EMUL_PROPAGATE_FAULT)
- emulate_pf(ctxt, addr, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -947,7 +945,7 @@ static int write_segment_descriptor(struct x86_emulate_ctxt *ctxt,
addr = dt.address + index * 8;
ret = ops->write_std(addr, desc, sizeof *desc, ctxt->vcpu, &err);
if (ret == X86EMUL_PROPAGATE_FAULT)
- emulate_pf(ctxt, addr, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -1117,7 +1115,7 @@ static inline int writeback(struct x86_emulate_ctxt *ctxt,
&err,
ctxt->vcpu);
if (rc == X86EMUL_PROPAGATE_FAULT)
- emulate_pf(ctxt, c->dst.addr.mem, err);
+ emulate_pf(ctxt);
if (rc != X86EMUL_CONTINUE)
return rc;
break;
@@ -1939,7 +1937,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, old_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -1949,7 +1947,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, old_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -1957,7 +1955,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, new_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -1970,7 +1968,7 @@ static int task_switch_16(struct x86_emulate_ctxt *ctxt,
ctxt->vcpu, &err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, new_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
}
@@ -2081,7 +2079,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, old_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -2091,7 +2089,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, old_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -2099,7 +2097,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
&err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, new_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
@@ -2112,7 +2110,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt,
ctxt->vcpu, &err);
if (ret == X86EMUL_PROPAGATE_FAULT) {
/* FIXME: need to provide precise fault address */
- emulate_pf(ctxt, new_tss_base, err);
+ emulate_pf(ctxt);
return ret;
}
}
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 86f7557..9936727 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2566,11 +2566,9 @@ static unsigned long get_cr3(struct kvm_vcpu *vcpu)
return vcpu->arch.cr3;
}
-static void inject_page_fault(struct kvm_vcpu *vcpu,
- u64 addr,
- u32 err_code)
+static void inject_page_fault(struct kvm_vcpu *vcpu)
{
- vcpu->arch.mmu.inject_page_fault(vcpu, addr, err_code);
+ vcpu->arch.mmu.inject_page_fault(vcpu);
}
static void paging_free(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 68ee1b7..d07f48a 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -258,6 +258,10 @@ error:
walker->error_code |= PFERR_FETCH_MASK;
if (rsvd_fault)
walker->error_code |= PFERR_RSVD_MASK;
+
+ vcpu->arch.fault.address = addr;
+ vcpu->arch.fault.error_code = walker->error_code;
+
trace_kvm_mmu_walker_error(walker->error_code);
return 0;
}
@@ -521,7 +525,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
*/
if (!r) {
pgprintk("%s: guest page fault\n", __func__);
- inject_page_fault(vcpu, addr, walker.error_code);
+ inject_page_fault(vcpu);
vcpu->arch.last_pt_write_count = 0; /* reset fork detector */
return 0;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f47db25..57c97a4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -329,11 +329,12 @@ void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr)
}
EXPORT_SYMBOL_GPL(kvm_requeue_exception);
-void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr,
- u32 error_code)
+void kvm_inject_page_fault(struct kvm_vcpu *vcpu)
{
+ unsigned error_code = vcpu->arch.fault.error_code;
+
++vcpu->stat.pf_guest;
- vcpu->arch.cr2 = addr;
+ vcpu->arch.cr2 = vcpu->arch.fault.address;
kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
}
@@ -4066,7 +4067,7 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
{
struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
if (ctxt->exception == PF_VECTOR)
- kvm_inject_page_fault(vcpu, ctxt->cr2, ctxt->error_code);
+ kvm_inject_page_fault(vcpu);
else if (ctxt->error_code_valid)
kvm_queue_exception_e(vcpu, ctxt->exception, ctxt->error_code);
else
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 10/29] KVM: MMU: Introduce generic walk_addr function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (8 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 09/29] KVM: MMU: Track page fault data in struct vcpu Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 11/29] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
` (19 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This is the first patch in the series towards a generic
walk_addr implementation which could walk two-dimensional
page tables in the end. In this first step the walk_addr
function is renamed into walk_addr_generic which takes a
mmu context as an additional parameter.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/paging_tmpl.h | 26 ++++++++++++++++++--------
1 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index d07f48a..a704a81 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -114,9 +114,10 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte)
/*
* Fetch a guest pte for a guest virtual address
*/
-static int FNAME(walk_addr)(struct guest_walker *walker,
- struct kvm_vcpu *vcpu, gva_t addr,
- int write_fault, int user_fault, int fetch_fault)
+static int FNAME(walk_addr_generic)(struct guest_walker *walker,
+ struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+ gva_t addr, int write_fault,
+ int user_fault, int fetch_fault)
{
pt_element_t pte;
gfn_t table_gfn;
@@ -129,10 +130,11 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
walk:
present = true;
eperm = rsvd_fault = false;
- walker->level = vcpu->arch.mmu.root_level;
- pte = vcpu->arch.mmu.get_cr3(vcpu);
+ walker->level = mmu->root_level;
+ pte = mmu->get_cr3(vcpu);
+
#if PTTYPE == 64
- if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
+ if (walker->level == PT32E_ROOT_LEVEL) {
pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
trace_kvm_mmu_paging_element(pte, walker->level);
if (!is_present_gpte(pte)) {
@@ -143,7 +145,7 @@ walk:
}
#endif
ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
- (vcpu->arch.mmu.get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
+ (mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
pt_access = ACC_ALL;
@@ -205,7 +207,7 @@ walk:
(PTTYPE == 64 || is_pse(vcpu))) ||
((walker->level == PT_PDPE_LEVEL) &&
is_large_pte(pte) &&
- vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL)) {
+ mmu->root_level == PT64_ROOT_LEVEL)) {
int lvl = walker->level;
walker->gfn = gpte_to_gfn_lvl(pte, lvl);
@@ -266,6 +268,14 @@ error:
return 0;
}
+static int FNAME(walk_addr)(struct guest_walker *walker,
+ struct kvm_vcpu *vcpu, gva_t addr,
+ int write_fault, int user_fault, int fetch_fault)
+{
+ return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.mmu, addr,
+ write_fault, user_fault, fetch_fault);
+}
+
static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
u64 *spte, const void *pte)
{
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 11/29] KVM: MMU: Add infrastructure for two-level page walker
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (9 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 10/29] KVM: MMU: Introduce generic walk_addr function Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 12/29] KVM: X86: Introduce pointer to mmu context used for gva_to_gpa Joerg Roedel
` (18 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces a mmu-callback to translate gpa
addresses in the walk_addr code. This is later used to
translate l2_gpa addresses into l1_gpa addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/x86.c | 6 ++++++
include/linux/kvm_host.h | 5 +++++
3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 235023e..7f95260 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -243,6 +243,7 @@ struct kvm_mmu {
void (*free)(struct kvm_vcpu *vcpu);
gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t gva, u32 access,
u32 *error);
+ gpa_t (*translate_gpa)(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access);
void (*prefetch_page)(struct kvm_vcpu *vcpu,
struct kvm_mmu_page *page);
int (*sync_page)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 57c97a4..d29ae19 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3434,6 +3434,11 @@ void kvm_get_segment(struct kvm_vcpu *vcpu,
kvm_x86_ops->get_segment(vcpu, var, seg);
}
+static gpa_t translate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access)
+{
+ return gpa;
+}
+
gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
@@ -5645,6 +5650,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
vcpu->arch.emulate_ctxt.ops = &emulate_ops;
vcpu->arch.mmu.root_hpa = INVALID_PAGE;
+ vcpu->arch.mmu.translate_gpa = translate_gpa;
if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_bsp(vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
else
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f2ecdd5..917e68f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -534,6 +534,11 @@ static inline gpa_t gfn_to_gpa(gfn_t gfn)
return (gpa_t)gfn << PAGE_SHIFT;
}
+static inline gfn_t gpa_to_gfn(gpa_t gpa)
+{
+ return (gfn_t)(gpa >> PAGE_SHIFT);
+}
+
static inline hpa_t pfn_to_hpa(pfn_t pfn)
{
return (hpa_t)pfn << PAGE_SHIFT;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 12/29] KVM: X86: Introduce pointer to mmu context used for gva_to_gpa
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (10 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 11/29] KVM: MMU: Add infrastructure for two-level page walker Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 13/29] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
` (17 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces the walk_mmu pointer which points to
the mmu-context currently used for gva_to_gpa translations.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 13 +++++++++++++
arch/x86/kvm/mmu.c | 10 +++++-----
arch/x86/kvm/x86.c | 17 ++++++++++-------
3 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7f95260..91c7d35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -286,9 +286,22 @@ struct kvm_vcpu_arch {
u64 ia32_misc_enable_msr;
bool tpr_access_reporting;
+ /*
+ * Paging state of the vcpu
+ *
+ * If the vcpu runs in guest mode with two level paging this still saves
+ * the paging mode of the l1 guest. This context is always used to
+ * handle faults.
+ */
struct kvm_mmu mmu;
/*
+ * Pointer to the mmu context currently used for
+ * gva_to_gpa translations.
+ */
+ struct kvm_mmu *walk_mmu;
+
+ /*
* This struct is filled with the necessary information to propagate a
* page fault into the guest
*/
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9936727..cb06ada 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,7 +2708,7 @@ static int paging32E_init_context(struct kvm_vcpu *vcpu,
static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
{
- struct kvm_mmu *context = &vcpu->arch.mmu;
+ struct kvm_mmu *context = vcpu->arch.walk_mmu;
context->new_cr3 = nonpaging_new_cr3;
context->page_fault = tdp_page_fault;
@@ -2767,11 +2767,11 @@ EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
{
- int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
+ int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu);
- vcpu->arch.mmu.set_cr3 = kvm_x86_ops->set_cr3;
- vcpu->arch.mmu.get_cr3 = get_cr3;
- vcpu->arch.mmu.inject_page_fault = kvm_inject_page_fault;
+ vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3;
+ vcpu->arch.walk_mmu->get_cr3 = get_cr3;
+ vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
return r;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d29ae19..1054519 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3442,27 +3442,27 @@ static gpa_t translate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access)
gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
- return vcpu->arch.mmu.gva_to_gpa(vcpu, gva, access, error);
+ return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, error);
}
gpa_t kvm_mmu_gva_to_gpa_fetch(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
access |= PFERR_FETCH_MASK;
- return vcpu->arch.mmu.gva_to_gpa(vcpu, gva, access, error);
+ return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, error);
}
gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
access |= PFERR_WRITE_MASK;
- return vcpu->arch.mmu.gva_to_gpa(vcpu, gva, access, error);
+ return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, error);
}
/* uses this to access any guest's mapped memory without checking CPL */
gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
- return vcpu->arch.mmu.gva_to_gpa(vcpu, gva, 0, error);
+ return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, 0, error);
}
static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
@@ -3473,7 +3473,8 @@ static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
int r = X86EMUL_CONTINUE;
while (bytes) {
- gpa_t gpa = vcpu->arch.mmu.gva_to_gpa(vcpu, addr, access, error);
+ gpa_t gpa = vcpu->arch.walk_mmu->gva_to_gpa(vcpu, addr, access,
+ error);
unsigned offset = addr & (PAGE_SIZE-1);
unsigned toread = min(bytes, (unsigned)PAGE_SIZE - offset);
int ret;
@@ -3528,8 +3529,9 @@ static int kvm_write_guest_virt_system(gva_t addr, void *val,
int r = X86EMUL_CONTINUE;
while (bytes) {
- gpa_t gpa = vcpu->arch.mmu.gva_to_gpa(vcpu, addr,
- PFERR_WRITE_MASK, error);
+ gpa_t gpa = vcpu->arch.walk_mmu->gva_to_gpa(vcpu, addr,
+ PFERR_WRITE_MASK,
+ error);
unsigned offset = addr & (PAGE_SIZE-1);
unsigned towrite = min(bytes, (unsigned)PAGE_SIZE - offset);
int ret;
@@ -5649,6 +5651,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
kvm = vcpu->kvm;
vcpu->arch.emulate_ctxt.ops = &emulate_ops;
+ vcpu->arch.walk_mmu = &vcpu->arch.mmu;
vcpu->arch.mmu.root_hpa = INVALID_PAGE;
vcpu->arch.mmu.translate_gpa = translate_gpa;
if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_bsp(vcpu))
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 13/29] KVM: MMU: Implement nested gva_to_gpa functions
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (11 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 12/29] KVM: X86: Introduce pointer to mmu context used for gva_to_gpa Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 14/29] KVM: X86: Add kvm_read_guest_page_mmu function Joerg Roedel
` (16 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch adds the functions to do a nested l2_gva to
l1_gpa page table walk.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 10 ++++++++++
arch/x86/kvm/mmu.c | 8 ++++++++
arch/x86/kvm/paging_tmpl.h | 31 +++++++++++++++++++++++++++++++
arch/x86/kvm/x86.h | 5 +++++
4 files changed, 54 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 91c7d35..10a5ddd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -296,6 +296,16 @@ struct kvm_vcpu_arch {
struct kvm_mmu mmu;
/*
+ * Paging state of an L2 guest (used for nested npt)
+ *
+ * This context will save all necessary information to walk page tables
+ * of the an L2 guest. This context is only initialized for page table
+ * walking and not for faulting since we never handle l2 page faults on
+ * the host.
+ */
+ struct kvm_mmu nested_mmu;
+
+ /*
* Pointer to the mmu context currently used for
* gva_to_gpa translations.
*/
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cb06ada..1e215e8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2466,6 +2466,14 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
return vaddr;
}
+static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr,
+ u32 access, u32 *error)
+{
+ if (error)
+ *error = 0;
+ return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access);
+}
+
static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
u32 error_code)
{
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index a704a81..eefe363 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -276,6 +276,16 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
write_fault, user_fault, fetch_fault);
}
+static int FNAME(walk_addr_nested)(struct guest_walker *walker,
+ struct kvm_vcpu *vcpu, gva_t addr,
+ int write_fault, int user_fault,
+ int fetch_fault)
+{
+ return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.nested_mmu,
+ addr, write_fault, user_fault,
+ fetch_fault);
+}
+
static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
u64 *spte, const void *pte)
{
@@ -660,6 +670,27 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr, u32 access,
return gpa;
}
+static gpa_t FNAME(gva_to_gpa_nested)(struct kvm_vcpu *vcpu, gva_t vaddr,
+ u32 access, u32 *error)
+{
+ struct guest_walker walker;
+ gpa_t gpa = UNMAPPED_GVA;
+ int r;
+
+ r = FNAME(walk_addr_nested)(&walker, vcpu, vaddr,
+ access & PFERR_WRITE_MASK,
+ access & PFERR_USER_MASK,
+ access & PFERR_FETCH_MASK);
+
+ if (r) {
+ gpa = gfn_to_gpa(walker.gfn);
+ gpa |= vaddr & ~PAGE_MASK;
+ } else if (error)
+ *error = walker.error_code;
+
+ return gpa;
+}
+
static void FNAME(prefetch_page)(struct kvm_vcpu *vcpu,
struct kvm_mmu_page *sp)
{
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 2d6385e..bf4dc2f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -50,6 +50,11 @@ static inline int is_long_mode(struct kvm_vcpu *vcpu)
#endif
}
+static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.walk_mmu == &vcpu->arch.nested_mmu;
+}
+
static inline int is_pae(struct kvm_vcpu *vcpu)
{
return kvm_read_cr4_bits(vcpu, X86_CR4_PAE);
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 14/29] KVM: X86: Add kvm_read_guest_page_mmu function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (12 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 13/29] KVM: MMU: Implement nested gva_to_gpa functions Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 15/29] KVM: MMU: Make walk_addr_generic capable for two-level walking Joerg Roedel
` (15 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch adds a function which can read from the guests
physical memory or from the guest's guest physical memory.
This will be used in the two-dimensional page table walker.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/kvm/x86.c | 23 +++++++++++++++++++++++
2 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 10a5ddd..5d9e0bb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -657,6 +657,9 @@ void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_inject_page_fault(struct kvm_vcpu *vcpu);
+int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+ gfn_t gfn, void *data, int offset, int len,
+ u32 access);
bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
int kvm_pic_set_irq(void *opaque, int irq, int level);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1054519..34bc840 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -370,6 +370,29 @@ bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
EXPORT_SYMBOL_GPL(kvm_require_cpl);
/*
+ * This function will be used to read from the physical memory of the currently
+ * running guest. The difference to kvm_read_guest_page is that this function
+ * can read from guest physical or from the guest's guest physical memory.
+ */
+int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+ gfn_t ngfn, void *data, int offset, int len,
+ u32 access)
+{
+ gfn_t real_gfn;
+ gpa_t ngpa;
+
+ ngpa = gfn_to_gpa(ngfn);
+ real_gfn = mmu->translate_gpa(vcpu, ngpa, access);
+ if (real_gfn == UNMAPPED_GVA)
+ return -EFAULT;
+
+ real_gfn = gpa_to_gfn(real_gfn);
+
+ return kvm_read_guest_page(vcpu->kvm, real_gfn, data, offset, len);
+}
+EXPORT_SYMBOL_GPL(kvm_read_guest_page_mmu);
+
+/*
* Load the pae pdptrs. Return true is they are all valid.
*/
int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 15/29] KVM: MMU: Make walk_addr_generic capable for two-level walking
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (13 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 14/29] KVM: X86: Add kvm_read_guest_page_mmu function Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 16/29] KVM: MMU: Introduce kvm_read_nested_guest_page() Joerg Roedel
` (14 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch uses kvm_read_guest_page_tdp to make the
walk_addr_generic functions suitable for two-level page
table walking.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/paging_tmpl.h | 30 +++++++++++++++++++++++-------
1 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index eefe363..f4e09d3 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -124,6 +124,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
unsigned index, pt_access, uninitialized_var(pte_access);
gpa_t pte_gpa;
bool eperm, present, rsvd_fault;
+ int offset;
+ u32 access = 0;
trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
fetch_fault);
@@ -153,12 +155,14 @@ walk:
index = PT_INDEX(addr, walker->level);
table_gfn = gpte_to_gfn(pte);
- pte_gpa = gfn_to_gpa(table_gfn);
- pte_gpa += index * sizeof(pt_element_t);
+ offset = index * sizeof(pt_element_t);
+ pte_gpa = gfn_to_gpa(table_gfn) + offset;
walker->table_gfn[walker->level - 1] = table_gfn;
walker->pte_gpa[walker->level - 1] = pte_gpa;
- if (kvm_read_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte))) {
+ if (kvm_read_guest_page_mmu(vcpu, mmu, table_gfn, &pte,
+ offset, sizeof(pte),
+ PFERR_USER_MASK|PFERR_WRITE_MASK)) {
present = false;
break;
}
@@ -209,15 +213,27 @@ walk:
is_large_pte(pte) &&
mmu->root_level == PT64_ROOT_LEVEL)) {
int lvl = walker->level;
+ gpa_t real_gpa;
+ gfn_t gfn;
- walker->gfn = gpte_to_gfn_lvl(pte, lvl);
- walker->gfn += (addr & PT_LVL_OFFSET_MASK(lvl))
- >> PAGE_SHIFT;
+ gfn = gpte_to_gfn_lvl(pte, lvl);
+ gfn += (addr & PT_LVL_OFFSET_MASK(lvl)) >> PAGE_SHIFT;
if (PTTYPE == 32 &&
walker->level == PT_DIRECTORY_LEVEL &&
is_cpuid_PSE36())
- walker->gfn += pse36_gfn_delta(pte);
+ gfn += pse36_gfn_delta(pte);
+
+ access |= write_fault ? PFERR_WRITE_MASK : 0;
+ access |= fetch_fault ? PFERR_FETCH_MASK : 0;
+ access |= user_fault ? PFERR_USER_MASK : 0;
+
+ real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn),
+ access);
+ if (real_gpa == UNMAPPED_GVA)
+ return 0;
+
+ walker->gfn = real_gpa >> PAGE_SHIFT;
break;
}
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 16/29] KVM: MMU: Introduce kvm_read_nested_guest_page()
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (14 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 15/29] KVM: MMU: Make walk_addr_generic capable for two-level walking Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 17/29] KVM: MMU: Introduce init_kvm_nested_mmu() Joerg Roedel
` (13 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces the kvm_read_guest_page_x86 function
which reads from the physical memory of the guest. If the
guest is running in guest-mode itself with nested paging
enabled it will read from the guest's guest physical memory
instead.
The patch also changes changes the code to use this function
where it is necessary.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/x86.c | 19 ++++++++++++++++---
1 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34bc840..540255d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -392,6 +392,13 @@ int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
}
EXPORT_SYMBOL_GPL(kvm_read_guest_page_mmu);
+int kvm_read_nested_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
+ void *data, int offset, int len, u32 access)
+{
+ return kvm_read_guest_page_mmu(vcpu, vcpu->arch.walk_mmu, gfn,
+ data, offset, len, access);
+}
+
/*
* Load the pae pdptrs. Return true is they are all valid.
*/
@@ -403,8 +410,9 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
int ret;
u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
- ret = kvm_read_guest_page(vcpu->kvm, pdpt_gfn, pdpte,
- offset * sizeof(u64), sizeof(pdpte));
+ ret = kvm_read_nested_guest_page(vcpu, pdpt_gfn, pdpte,
+ offset * sizeof(u64), sizeof(pdpte),
+ PFERR_USER_MASK|PFERR_WRITE_MASK);
if (ret < 0) {
ret = 0;
goto out;
@@ -433,6 +441,8 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
{
u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
bool changed = true;
+ int offset;
+ gfn_t gfn;
int r;
if (is_long_mode(vcpu) || !is_pae(vcpu))
@@ -442,7 +452,10 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
(unsigned long *)&vcpu->arch.regs_avail))
return true;
- r = kvm_read_guest(vcpu->kvm, vcpu->arch.cr3 & ~31u, pdpte, sizeof(pdpte));
+ gfn = (vcpu->arch.cr3 & ~31u) >> PAGE_SHIFT;
+ offset = (vcpu->arch.cr3 & ~31u) & (PAGE_SIZE - 1);
+ r = kvm_read_nested_guest_page(vcpu, gfn, pdpte, offset, sizeof(pdpte),
+ PFERR_USER_MASK | PFERR_WRITE_MASK);
if (r < 0)
goto out;
changed = memcmp(pdpte, vcpu->arch.pdptrs, sizeof(pdpte)) != 0;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 17/29] KVM: MMU: Introduce init_kvm_nested_mmu()
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (15 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 16/29] KVM: MMU: Introduce kvm_read_nested_guest_page() Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 18/29] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
` (12 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch introduces the init_kvm_nested_mmu() function
which is used to re-initialize the nested mmu when the l2
guest changes its paging mode.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 37 ++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/mmu.h | 1 +
arch/x86/kvm/x86.c | 17 +++++++++++++++++
4 files changed, 55 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5d9e0bb..87734cc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -811,3 +811,4 @@ void kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
bool kvm_is_linear_rip(struct kvm_vcpu *vcpu, unsigned long linear_rip);
#endif /* _ASM_X86_KVM_HOST_H */
+
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1e215e8..a26f13b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2784,11 +2784,46 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
return r;
}
+static int init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
+{
+ struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
+
+ g_context->get_cr3 = get_cr3;
+ g_context->inject_page_fault = kvm_inject_page_fault;
+
+ /*
+ * Note that arch.mmu.gva_to_gpa translates l2_gva to l1_gpa. The
+ * translation of l2_gpa to l1_gpa addresses is done using the
+ * arch.nested_mmu.gva_to_gpa function. Basically the gva_to_gpa
+ * functions between mmu and nested_mmu are swapped.
+ */
+ if (!is_paging(vcpu)) {
+ g_context->root_level = 0;
+ g_context->gva_to_gpa = nonpaging_gva_to_gpa_nested;
+ } else if (is_long_mode(vcpu)) {
+ reset_rsvds_bits_mask(vcpu, g_context, PT64_ROOT_LEVEL);
+ g_context->root_level = PT64_ROOT_LEVEL;
+ g_context->gva_to_gpa = paging64_gva_to_gpa_nested;
+ } else if (is_pae(vcpu)) {
+ reset_rsvds_bits_mask(vcpu, g_context, PT32E_ROOT_LEVEL);
+ g_context->root_level = PT32E_ROOT_LEVEL;
+ g_context->gva_to_gpa = paging64_gva_to_gpa_nested;
+ } else {
+ reset_rsvds_bits_mask(vcpu, g_context, PT32_ROOT_LEVEL);
+ g_context->root_level = PT32_ROOT_LEVEL;
+ g_context->gva_to_gpa = paging32_gva_to_gpa_nested;
+ }
+
+ return 0;
+}
+
static int init_kvm_mmu(struct kvm_vcpu *vcpu)
{
vcpu->arch.update_pte.pfn = bad_pfn;
- if (tdp_enabled)
+ if (mmu_is_nested(vcpu))
+ return init_kvm_nested_mmu(vcpu);
+ else if (tdp_enabled)
return init_kvm_tdp_mmu(vcpu);
else
return init_kvm_softmmu(vcpu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 7086ca8..513abbb 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -47,6 +47,7 @@
#define PFERR_USER_MASK (1U << 2)
#define PFERR_RSVD_MASK (1U << 3)
#define PFERR_FETCH_MASK (1U << 4)
+#define PFERR_NESTED_MASK (1U << 31)
int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 540255d..307599d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3475,6 +3475,22 @@ static gpa_t translate_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access)
return gpa;
}
+static gpa_t translate_nested_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access)
+{
+ gpa_t t_gpa;
+ u32 error;
+
+ BUG_ON(!mmu_is_nested(vcpu));
+
+ /* NPT walks are always user-walks */
+ access |= PFERR_USER_MASK;
+ t_gpa = vcpu->arch.mmu.gva_to_gpa(vcpu, gpa, access, &error);
+ if (t_gpa == UNMAPPED_GVA)
+ vcpu->arch.fault.error_code |= PFERR_NESTED_MASK;
+
+ return t_gpa;
+}
+
gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
{
u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
@@ -5690,6 +5706,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
vcpu->arch.walk_mmu = &vcpu->arch.mmu;
vcpu->arch.mmu.root_hpa = INVALID_PAGE;
vcpu->arch.mmu.translate_gpa = translate_gpa;
+ vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_bsp(vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
else
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 18/29] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (16 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 17/29] KVM: MMU: Introduce init_kvm_nested_mmu() Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-12 8:50 ` Avi Kivity
2010-09-10 15:30 ` [PATCH 19/29] KVM: X86: Propagate fetch faults Joerg Roedel
` (11 subsequent siblings)
29 siblings, 1 reply; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch implements logic to make sure that either a
page-fault/page-fault-vmexit or a nested-page-fault-vmexit
is propagated back to the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/x86.c | 18 +++++++++++++++++-
2 files changed, 18 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87734cc..7d3adb8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -660,6 +660,7 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu);
int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
gfn_t gfn, void *data, int offset, int len,
u32 access);
+void kvm_propagate_fault(struct kvm_vcpu *vcpu);
bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
int kvm_pic_set_irq(void *opaque, int irq, int level);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 307599d..2816e5a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -338,6 +338,22 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu)
kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
}
+void kvm_propagate_fault(struct kvm_vcpu *vcpu)
+{
+ u32 nested, error;
+
+ error = vcpu->arch.fault.error_code;
+ nested = error & PFERR_NESTED_MASK;
+ error = error & ~PFERR_NESTED_MASK;
+
+ vcpu->arch.fault.error_code = error;
+
+ if (mmu_is_nested(vcpu) && !nested)
+ vcpu->arch.nested_mmu.inject_page_fault(vcpu);
+ else
+ vcpu->arch.mmu.inject_page_fault(vcpu);
+}
+
void kvm_inject_nmi(struct kvm_vcpu *vcpu)
{
vcpu->arch.nmi_pending = 1;
@@ -4126,7 +4142,7 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
{
struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
if (ctxt->exception == PF_VECTOR)
- kvm_inject_page_fault(vcpu);
+ kvm_propagate_fault(vcpu);
else if (ctxt->error_code_valid)
kvm_queue_exception_e(vcpu, ctxt->exception, ctxt->error_code);
else
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* Re: [PATCH 18/29] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa
2010-09-10 15:30 ` [PATCH 18/29] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
@ 2010-09-12 8:50 ` Avi Kivity
0 siblings, 0 replies; 33+ messages in thread
From: Avi Kivity @ 2010-09-12 8:50 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel
On 09/10/2010 06:30 PM, Joerg Roedel wrote:
> This patch implements logic to make sure that either a
> page-fault/page-fault-vmexit or a nested-page-fault-vmexit
> is propagated back to the guest.
>
> @@ -338,6 +338,22 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu)
> kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
> }
>
> +void kvm_propagate_fault(struct kvm_vcpu *vcpu)
> +{
> + u32 nested, error;
> +
> + error = vcpu->arch.fault.error_code;
> + nested = error& PFERR_NESTED_MASK;
> + error = error& ~PFERR_NESTED_MASK;
> +
> + vcpu->arch.fault.error_code = error;
> +
> + if (mmu_is_nested(vcpu)&& !nested)
> + vcpu->arch.nested_mmu.inject_page_fault(vcpu);
> + else
> + vcpu->arch.mmu.inject_page_fault(vcpu);
> +}
> +
Tacking a non-architectural bit on top of the error code is not very
nice. Can you move it to a separate vcpu->arch.fault.nested?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 19/29] KVM: X86: Propagate fetch faults
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (17 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 18/29] KVM: MMU: Propagate the right fault back to the guest after gva_to_gpa Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 20/29] KVM: MMU: Add kvm_mmu parameter to load_pdptrs function Joerg Roedel
` (10 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
KVM currently ignores fetch faults in the instruction
emulator. With nested-npt we could have such faults. This
patch adds the code to handle these.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/emulate.c | 3 +++
arch/x86/kvm/x86.c | 4 ++++
2 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2b08b78..aead72e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1198,6 +1198,9 @@ static int emulate_popf(struct x86_emulate_ctxt *ctxt,
*(unsigned long *)dest =
(ctxt->eflags & ~change_mask) | (val & change_mask);
+ if (rc == X86EMUL_PROPAGATE_FAULT)
+ emulate_pf(ctxt);
+
return rc;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2816e5a..16c044b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4233,6 +4233,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
vcpu->arch.emulate_ctxt.perm_ok = false;
r = x86_decode_insn(&vcpu->arch.emulate_ctxt);
+ if (r == X86EMUL_PROPAGATE_FAULT)
+ goto done;
+
trace_kvm_emulate_insn_start(vcpu);
/* Only allow emulation of specific instructions on #UD
@@ -4291,6 +4294,7 @@ restart:
return handle_emulation_failure(vcpu);
}
+done:
if (vcpu->arch.emulate_ctxt.exception >= 0) {
inject_emulated_exception(vcpu);
r = EMULATE_DONE;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 20/29] KVM: MMU: Add kvm_mmu parameter to load_pdptrs function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (18 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 19/29] KVM: X86: Propagate fetch faults Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 21/29] KVM: MMU: Introduce kvm_pdptr_read_mmu Joerg Roedel
` (9 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This function need to be able to load the pdptrs from any
mmu context currently in use. So change this function to
take an kvm_mmu parameter to fit these needs.
As a side effect this patch also moves the cached pdptrs
from vcpu_arch into the kvm_mmu struct.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 5 +++--
arch/x86/kvm/kvm_cache_regs.h | 2 +-
arch/x86/kvm/svm.c | 2 +-
arch/x86/kvm/vmx.c | 16 ++++++++--------
arch/x86/kvm/x86.c | 26 ++++++++++++++------------
5 files changed, 27 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7d3adb8..33946ed 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -257,6 +257,8 @@ struct kvm_mmu {
u64 *pae_root;
u64 rsvd_bits_mask[2][4];
+
+ u64 pdptrs[4]; /* pae */
};
struct kvm_vcpu_arch {
@@ -276,7 +278,6 @@ struct kvm_vcpu_arch {
unsigned long cr4_guest_owned_bits;
unsigned long cr8;
u32 hflags;
- u64 pdptrs[4]; /* pae */
u64 efer;
u64 apic_base;
struct kvm_lapic *apic; /* kernel irqchip context */
@@ -592,7 +593,7 @@ void kvm_mmu_zap_all(struct kvm *kvm);
unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
-int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
+int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3);
int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
const void *val, int bytes);
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 6491ac8..a37abe2 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -42,7 +42,7 @@ static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
(unsigned long *)&vcpu->arch.regs_avail))
kvm_x86_ops->cache_reg(vcpu, VCPU_EXREG_PDPTR);
- return vcpu->arch.pdptrs[index];
+ return vcpu->arch.walk_mmu->pdptrs[index];
}
static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 094df31..a98ac52 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1010,7 +1010,7 @@ static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
switch (reg) {
case VCPU_EXREG_PDPTR:
BUG_ON(!npt_enabled);
- load_pdptrs(vcpu, vcpu->arch.cr3);
+ load_pdptrs(vcpu, vcpu->arch.walk_mmu, vcpu->arch.cr3);
break;
default:
BUG();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0e62d8a..0a70194 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1848,20 +1848,20 @@ static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
return;
if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
- vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
- vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
- vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
- vmcs_write64(GUEST_PDPTR3, vcpu->arch.pdptrs[3]);
+ vmcs_write64(GUEST_PDPTR0, vcpu->arch.mmu.pdptrs[0]);
+ vmcs_write64(GUEST_PDPTR1, vcpu->arch.mmu.pdptrs[1]);
+ vmcs_write64(GUEST_PDPTR2, vcpu->arch.mmu.pdptrs[2]);
+ vmcs_write64(GUEST_PDPTR3, vcpu->arch.mmu.pdptrs[3]);
}
}
static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
{
if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
- vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
- vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
- vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
- vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
+ vcpu->arch.mmu.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
+ vcpu->arch.mmu.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
+ vcpu->arch.mmu.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
+ vcpu->arch.mmu.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
}
__set_bit(VCPU_EXREG_PDPTR,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 16c044b..16f49c7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -418,17 +418,17 @@ int kvm_read_nested_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
/*
* Load the pae pdptrs. Return true is they are all valid.
*/
-int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
+int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3)
{
gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
int i;
int ret;
- u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
+ u64 pdpte[ARRAY_SIZE(mmu->pdptrs)];
- ret = kvm_read_nested_guest_page(vcpu, pdpt_gfn, pdpte,
- offset * sizeof(u64), sizeof(pdpte),
- PFERR_USER_MASK|PFERR_WRITE_MASK);
+ ret = kvm_read_guest_page_mmu(vcpu, mmu, pdpt_gfn, pdpte,
+ offset * sizeof(u64), sizeof(pdpte),
+ PFERR_USER_MASK|PFERR_WRITE_MASK);
if (ret < 0) {
ret = 0;
goto out;
@@ -442,7 +442,7 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
}
ret = 1;
- memcpy(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs));
+ memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
__set_bit(VCPU_EXREG_PDPTR,
(unsigned long *)&vcpu->arch.regs_avail);
__set_bit(VCPU_EXREG_PDPTR,
@@ -455,7 +455,7 @@ EXPORT_SYMBOL_GPL(load_pdptrs);
static bool pdptrs_changed(struct kvm_vcpu *vcpu)
{
- u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
+ u64 pdpte[ARRAY_SIZE(vcpu->arch.walk_mmu->pdptrs)];
bool changed = true;
int offset;
gfn_t gfn;
@@ -474,7 +474,7 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
PFERR_USER_MASK | PFERR_WRITE_MASK);
if (r < 0)
goto out;
- changed = memcmp(pdpte, vcpu->arch.pdptrs, sizeof(pdpte)) != 0;
+ changed = memcmp(pdpte, vcpu->arch.walk_mmu->pdptrs, sizeof(pdpte)) != 0;
out:
return changed;
@@ -513,7 +513,8 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
return 1;
} else
#endif
- if (is_pae(vcpu) && !load_pdptrs(vcpu, vcpu->arch.cr3))
+ if (is_pae(vcpu) && !load_pdptrs(vcpu, vcpu->arch.walk_mmu,
+ vcpu->arch.cr3))
return 1;
}
@@ -602,7 +603,7 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
return 1;
} else if (is_paging(vcpu) && (cr4 & X86_CR4_PAE)
&& ((cr4 ^ old_cr4) & pdptr_bits)
- && !load_pdptrs(vcpu, vcpu->arch.cr3))
+ && !load_pdptrs(vcpu, vcpu->arch.walk_mmu, vcpu->arch.cr3))
return 1;
if (cr4 & X86_CR4_VMXE)
@@ -635,7 +636,8 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
if (is_pae(vcpu)) {
if (cr3 & CR3_PAE_RESERVED_BITS)
return 1;
- if (is_paging(vcpu) && !load_pdptrs(vcpu, cr3))
+ if (is_paging(vcpu) &&
+ !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
return 1;
}
/*
@@ -5408,7 +5410,7 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
mmu_reset_needed |= kvm_read_cr4(vcpu) != sregs->cr4;
kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
if (!is_long_mode(vcpu) && is_pae(vcpu)) {
- load_pdptrs(vcpu, vcpu->arch.cr3);
+ load_pdptrs(vcpu, vcpu->arch.walk_mmu, vcpu->arch.cr3);
mmu_reset_needed = 1;
}
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 21/29] KVM: MMU: Introduce kvm_pdptr_read_mmu
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (19 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 20/29] KVM: MMU: Add kvm_mmu parameter to load_pdptrs function Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:30 ` [PATCH 22/29] KVM: MMU: Refactor mmu_alloc_roots function Joerg Roedel
` (8 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This function is implemented to load the pdptr pointers of
the currently running guest (l1 or l2 guest). Therefore it
takes care about the current paging mode and can read pdptrs
out of l2 guest physical memory.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/kvm_cache_regs.h | 7 +++++++
arch/x86/kvm/mmu.c | 2 +-
arch/x86/kvm/paging_tmpl.h | 2 +-
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index a37abe2..975bb45 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -45,6 +45,13 @@ static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
return vcpu->arch.walk_mmu->pdptrs[index];
}
+static inline u64 kvm_pdptr_read_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, int index)
+{
+ load_pdptrs(vcpu, mmu, mmu->get_cr3(vcpu));
+
+ return mmu->pdptrs[index];
+}
+
static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
{
ulong tmask = mask & KVM_POSSIBLE_CR0_GUEST_BITS;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a26f13b..a25173a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2398,7 +2398,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
ASSERT(!VALID_PAGE(root));
if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
- pdptr = kvm_pdptr_read(vcpu, i);
+ pdptr = kvm_pdptr_read_mmu(vcpu, &vcpu->arch.mmu, i);
if (!is_present_gpte(pdptr)) {
vcpu->arch.mmu.pae_root[i] = 0;
continue;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index f4e09d3..a28f09b 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -137,7 +137,7 @@ walk:
#if PTTYPE == 64
if (walker->level == PT32E_ROOT_LEVEL) {
- pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
+ pte = kvm_pdptr_read_mmu(vcpu, mmu, (addr >> 30) & 3);
trace_kvm_mmu_paging_element(pte, walker->level);
if (!is_present_gpte(pte)) {
present = false;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 22/29] KVM: MMU: Refactor mmu_alloc_roots function
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (20 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 21/29] KVM: MMU: Introduce kvm_pdptr_read_mmu Joerg Roedel
@ 2010-09-10 15:30 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 23/29] KVM: MMU: Allow long mode shadows for legacy page tables Joerg Roedel
` (7 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:30 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch factors out the direct-mapping paths of the
mmu_alloc_roots function into a seperate function. This
makes it a lot easier to avoid all the unnecessary checks
done in the shadow path which may break when running direct.
In fact, this patch already fixes a problem when running PAE
guests on a PAE shadow page table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/mmu.c | 82 ++++++++++++++++++++++++++++++++++++++--------------
1 files changed, 60 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a25173a..9cd5a71 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2357,42 +2357,77 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn)
return ret;
}
-static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
+static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
+{
+ struct kvm_mmu_page *sp;
+ int i;
+
+ if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+ spin_lock(&vcpu->kvm->mmu_lock);
+ kvm_mmu_free_some_pages(vcpu);
+ sp = kvm_mmu_get_page(vcpu, 0, 0, PT64_ROOT_LEVEL,
+ 1, ACC_ALL, NULL);
+ ++sp->root_count;
+ spin_unlock(&vcpu->kvm->mmu_lock);
+ vcpu->arch.mmu.root_hpa = __pa(sp->spt);
+ } else if (vcpu->arch.mmu.shadow_root_level == PT32E_ROOT_LEVEL) {
+ for (i = 0; i < 4; ++i) {
+ hpa_t root = vcpu->arch.mmu.pae_root[i];
+
+ ASSERT(!VALID_PAGE(root));
+ spin_lock(&vcpu->kvm->mmu_lock);
+ kvm_mmu_free_some_pages(vcpu);
+ sp = kvm_mmu_get_page(vcpu, i << 30, i << 30,
+ PT32_ROOT_LEVEL, 1, ACC_ALL,
+ NULL);
+ root = __pa(sp->spt);
+ ++sp->root_count;
+ spin_unlock(&vcpu->kvm->mmu_lock);
+ vcpu->arch.mmu.pae_root[i] = root | PT_PRESENT_MASK;
+ vcpu->arch.mmu.root_hpa = __pa(vcpu->arch.mmu.pae_root);
+ }
+ } else
+ BUG();
+
+ return 0;
+}
+
+static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
{
int i;
gfn_t root_gfn;
struct kvm_mmu_page *sp;
- int direct = 0;
u64 pdptr;
root_gfn = vcpu->arch.mmu.get_cr3(vcpu) >> PAGE_SHIFT;
- if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+ if (mmu_check_root(vcpu, root_gfn))
+ return 1;
+
+ /*
+ * Do we shadow a long mode page table? If so we need to
+ * write-protect the guests page table root.
+ */
+ if (vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL) {
hpa_t root = vcpu->arch.mmu.root_hpa;
ASSERT(!VALID_PAGE(root));
- if (mmu_check_root(vcpu, root_gfn))
- return 1;
- if (vcpu->arch.mmu.direct_map) {
- direct = 1;
- root_gfn = 0;
- }
+
spin_lock(&vcpu->kvm->mmu_lock);
kvm_mmu_free_some_pages(vcpu);
- sp = kvm_mmu_get_page(vcpu, root_gfn, 0,
- PT64_ROOT_LEVEL, direct,
- ACC_ALL, NULL);
+ sp = kvm_mmu_get_page(vcpu, root_gfn, 0, PT64_ROOT_LEVEL,
+ 0, ACC_ALL, NULL);
root = __pa(sp->spt);
++sp->root_count;
spin_unlock(&vcpu->kvm->mmu_lock);
vcpu->arch.mmu.root_hpa = root;
return 0;
}
- direct = !is_paging(vcpu);
-
- if (mmu_check_root(vcpu, root_gfn))
- return 1;
+ /*
+ * We shadow a 32 bit page table. This may be a legacy 2-level
+ * or a PAE 3-level page table.
+ */
for (i = 0; i < 4; ++i) {
hpa_t root = vcpu->arch.mmu.pae_root[i];
@@ -2406,16 +2441,11 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
root_gfn = pdptr >> PAGE_SHIFT;
if (mmu_check_root(vcpu, root_gfn))
return 1;
- } else if (vcpu->arch.mmu.root_level == 0)
- root_gfn = 0;
- if (vcpu->arch.mmu.direct_map) {
- direct = 1;
- root_gfn = i << 30;
}
spin_lock(&vcpu->kvm->mmu_lock);
kvm_mmu_free_some_pages(vcpu);
sp = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
- PT32_ROOT_LEVEL, direct,
+ PT32_ROOT_LEVEL, 0,
ACC_ALL, NULL);
root = __pa(sp->spt);
++sp->root_count;
@@ -2427,6 +2457,14 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
return 0;
}
+static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
+{
+ if (vcpu->arch.mmu.direct_map)
+ return mmu_alloc_direct_roots(vcpu);
+ else
+ return mmu_alloc_shadow_roots(vcpu);
+}
+
static void mmu_sync_roots(struct kvm_vcpu *vcpu)
{
int i;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 23/29] KVM: MMU: Allow long mode shadows for legacy page tables
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (21 preceding siblings ...)
2010-09-10 15:30 ` [PATCH 22/29] KVM: MMU: Refactor mmu_alloc_roots function Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 24/29] KVM: MMU: Track NX state in struct kvm_mmu Joerg Roedel
` (6 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
Currently the KVM softmmu implementation can not shadow a 32
bit legacy or PAE page table with a long mode page table.
This is a required feature for nested paging emulation
because the nested page table must alway be in host format.
So this patch implements the missing pieces to allow long
mode page tables for page table types.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 60 +++++++++++++++++++++++++++++++++-----
2 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 33946ed..8647578 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -256,6 +256,7 @@ struct kvm_mmu {
bool direct_map;
u64 *pae_root;
+ u64 *lm_root;
u64 rsvd_bits_mask[2][4];
u64 pdptrs[4]; /* pae */
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9cd5a71..dd76765 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1504,6 +1504,12 @@ static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
iterator->addr = addr;
iterator->shadow_addr = vcpu->arch.mmu.root_hpa;
iterator->level = vcpu->arch.mmu.shadow_root_level;
+
+ if (iterator->level == PT64_ROOT_LEVEL &&
+ vcpu->arch.mmu.root_level < PT64_ROOT_LEVEL &&
+ !vcpu->arch.mmu.direct_map)
+ --iterator->level;
+
if (iterator->level == PT32E_ROOT_LEVEL) {
iterator->shadow_addr
= vcpu->arch.mmu.pae_root[(addr >> 30) & 3];
@@ -2314,7 +2320,9 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
return;
spin_lock(&vcpu->kvm->mmu_lock);
- if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+ if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL &&
+ (vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL ||
+ vcpu->arch.mmu.direct_map)) {
hpa_t root = vcpu->arch.mmu.root_hpa;
sp = page_header(root);
@@ -2394,10 +2402,10 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
{
- int i;
- gfn_t root_gfn;
struct kvm_mmu_page *sp;
- u64 pdptr;
+ u64 pdptr, pm_mask;
+ gfn_t root_gfn;
+ int i;
root_gfn = vcpu->arch.mmu.get_cr3(vcpu) >> PAGE_SHIFT;
@@ -2426,8 +2434,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
/*
* We shadow a 32 bit page table. This may be a legacy 2-level
- * or a PAE 3-level page table.
+ * or a PAE 3-level page table. In either case we need to be aware that
+ * the shadow page table may be a PAE or a long mode page table.
*/
+ pm_mask = PT_PRESENT_MASK;
+ if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL)
+ pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
+
for (i = 0; i < 4; ++i) {
hpa_t root = vcpu->arch.mmu.pae_root[i];
@@ -2451,9 +2464,35 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
++sp->root_count;
spin_unlock(&vcpu->kvm->mmu_lock);
- vcpu->arch.mmu.pae_root[i] = root | PT_PRESENT_MASK;
+ vcpu->arch.mmu.pae_root[i] = root | pm_mask;
+ vcpu->arch.mmu.root_hpa = __pa(vcpu->arch.mmu.pae_root);
}
- vcpu->arch.mmu.root_hpa = __pa(vcpu->arch.mmu.pae_root);
+
+ /*
+ * If we shadow a 32 bit page table with a long mode page
+ * table we enter this path.
+ */
+ if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+ if (vcpu->arch.mmu.lm_root == NULL) {
+ /*
+ * The additional page necessary for this is only
+ * allocated on demand.
+ */
+
+ u64 *lm_root;
+
+ lm_root = (void*)get_zeroed_page(GFP_KERNEL);
+ if (lm_root == NULL)
+ return 1;
+
+ lm_root[0] = __pa(vcpu->arch.mmu.pae_root) | pm_mask;
+
+ vcpu->arch.mmu.lm_root = lm_root;
+ }
+
+ vcpu->arch.mmu.root_hpa = __pa(vcpu->arch.mmu.lm_root);
+ }
+
return 0;
}
@@ -2470,9 +2509,12 @@ static void mmu_sync_roots(struct kvm_vcpu *vcpu)
int i;
struct kvm_mmu_page *sp;
+ if (vcpu->arch.mmu.direct_map)
+ return;
+
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
return;
- if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+ if (vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL) {
hpa_t root = vcpu->arch.mmu.root_hpa;
sp = page_header(root);
mmu_sync_children(vcpu, sp);
@@ -3253,6 +3295,8 @@ EXPORT_SYMBOL_GPL(kvm_disable_tdp);
static void free_mmu_pages(struct kvm_vcpu *vcpu)
{
free_page((unsigned long)vcpu->arch.mmu.pae_root);
+ if (vcpu->arch.mmu.lm_root != NULL)
+ free_page((unsigned long)vcpu->arch.mmu.lm_root);
}
static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 24/29] KVM: MMU: Track NX state in struct kvm_mmu
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (22 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 23/29] KVM: MMU: Allow long mode shadows for legacy page tables Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-12 8:06 ` Avi Kivity
2010-09-10 15:31 ` [PATCH 25/29] KVM: SVM: Implement MMU helper functions for Nested Nested Paging Joerg Roedel
` (5 subsequent siblings)
29 siblings, 1 reply; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
With Nested Paging emulation the NX state between the two
MMU contexts may differ. To make sure that always the right
fault error code is recorded this patch moves the NX state
into struct kvm_mmu so that the code can distinguish between
L1 and L2 NX state.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/mmu.c | 16 +++++++++++++++-
arch/x86/kvm/paging_tmpl.h | 4 ++--
3 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8647578..fca8470 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -259,6 +259,8 @@ struct kvm_mmu {
u64 *lm_root;
u64 rsvd_bits_mask[2][4];
+ bool nx;
+
u64 pdptrs[4]; /* pae */
};
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index dd76765..95cbeed 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2634,6 +2634,7 @@ static int nonpaging_init_context(struct kvm_vcpu *vcpu,
context->shadow_root_level = PT32E_ROOT_LEVEL;
context->root_hpa = INVALID_PAGE;
context->direct_map = true;
+ context->nx = false;
return 0;
}
@@ -2687,7 +2688,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
int maxphyaddr = cpuid_maxphyaddr(vcpu);
u64 exb_bit_rsvd = 0;
- if (!is_nx(vcpu))
+ if (!context->nx)
exb_bit_rsvd = rsvd_bits(63, 63);
switch (level) {
case PT32_ROOT_LEVEL:
@@ -2746,6 +2747,8 @@ static int paging64_init_context_common(struct kvm_vcpu *vcpu,
struct kvm_mmu *context,
int level)
{
+ context->nx = is_nx(vcpu);
+
reset_rsvds_bits_mask(vcpu, context, level);
ASSERT(is_pae(vcpu));
@@ -2772,6 +2775,8 @@ static int paging64_init_context(struct kvm_vcpu *vcpu,
static int paging32_init_context(struct kvm_vcpu *vcpu,
struct kvm_mmu *context)
{
+ context->nx = false;
+
reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
context->new_cr3 = paging_new_cr3;
@@ -2810,19 +2815,24 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
context->get_cr3 = get_cr3;
context->inject_page_fault = kvm_inject_page_fault;
+ context->nx = is_nx(vcpu);
if (!is_paging(vcpu)) {
+ context->nx = false;
context->gva_to_gpa = nonpaging_gva_to_gpa;
context->root_level = 0;
} else if (is_long_mode(vcpu)) {
+ context->nx = is_nx(vcpu);
reset_rsvds_bits_mask(vcpu, context, PT64_ROOT_LEVEL);
context->gva_to_gpa = paging64_gva_to_gpa;
context->root_level = PT64_ROOT_LEVEL;
} else if (is_pae(vcpu)) {
+ context->nx = is_nx(vcpu);
reset_rsvds_bits_mask(vcpu, context, PT32E_ROOT_LEVEL);
context->gva_to_gpa = paging64_gva_to_gpa;
context->root_level = PT32E_ROOT_LEVEL;
} else {
+ context->nx = false;
reset_rsvds_bits_mask(vcpu, context, PT32_ROOT_LEVEL);
context->gva_to_gpa = paging32_gva_to_gpa;
context->root_level = PT32_ROOT_LEVEL;
@@ -2878,17 +2888,21 @@ static int init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
* functions between mmu and nested_mmu are swapped.
*/
if (!is_paging(vcpu)) {
+ g_context->nx = false;
g_context->root_level = 0;
g_context->gva_to_gpa = nonpaging_gva_to_gpa_nested;
} else if (is_long_mode(vcpu)) {
+ g_context->nx = is_nx(vcpu);
reset_rsvds_bits_mask(vcpu, g_context, PT64_ROOT_LEVEL);
g_context->root_level = PT64_ROOT_LEVEL;
g_context->gva_to_gpa = paging64_gva_to_gpa_nested;
} else if (is_pae(vcpu)) {
+ g_context->nx = is_nx(vcpu);
reset_rsvds_bits_mask(vcpu, g_context, PT32E_ROOT_LEVEL);
g_context->root_level = PT32E_ROOT_LEVEL;
g_context->gva_to_gpa = paging64_gva_to_gpa_nested;
} else {
+ g_context->nx = false;
reset_rsvds_bits_mask(vcpu, g_context, PT32_ROOT_LEVEL);
g_context->root_level = PT32_ROOT_LEVEL;
g_context->gva_to_gpa = paging32_gva_to_gpa_nested;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index a28f09b..2bdd843 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -105,7 +105,7 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte)
access = (gpte & (PT_WRITABLE_MASK | PT_USER_MASK)) | ACC_EXEC_MASK;
#if PTTYPE == 64
- if (is_nx(vcpu))
+ if (vcpu->arch.mmu.nx)
access &= ~(gpte >> PT64_NX_SHIFT);
#endif
return access;
@@ -272,7 +272,7 @@ error:
walker->error_code |= PFERR_WRITE_MASK;
if (user_fault)
walker->error_code |= PFERR_USER_MASK;
- if (fetch_fault && is_nx(vcpu))
+ if (fetch_fault && mmu->nx)
walker->error_code |= PFERR_FETCH_MASK;
if (rsvd_fault)
walker->error_code |= PFERR_RSVD_MASK;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* Re: [PATCH 24/29] KVM: MMU: Track NX state in struct kvm_mmu
2010-09-10 15:31 ` [PATCH 24/29] KVM: MMU: Track NX state in struct kvm_mmu Joerg Roedel
@ 2010-09-12 8:06 ` Avi Kivity
0 siblings, 0 replies; 33+ messages in thread
From: Avi Kivity @ 2010-09-12 8:06 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel
On 09/10/2010 06:31 PM, Joerg Roedel wrote:
> With Nested Paging emulation the NX state between the two
> MMU contexts may differ. To make sure that always the right
> fault error code is recorded this patch moves the NX state
> into struct kvm_mmu so that the code can distinguish between
> L1 and L2 NX state.
>
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -259,6 +259,8 @@ struct kvm_mmu {
> u64 *lm_root;
> u64 rsvd_bits_mask[2][4];
>
> + bool nx;
> +
> u64 pdptrs[4]; /* pae */
> };
This is redundant with mmu->base_role.nx. Please post a follow on patch
to use that instead.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 25/29] KVM: SVM: Implement MMU helper functions for Nested Nested Paging
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (23 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 24/29] KVM: MMU: Track NX state in struct kvm_mmu Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 26/29] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
` (4 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch adds the helper functions which will be used in
the mmu context for handling nested nested page faults.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/svm.c | 30 ++++++++++++++++++++++++++++++
1 files changed, 30 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a98ac52..a483aa9 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -104,6 +104,8 @@ struct nested_state {
u32 intercept_exceptions;
u64 intercept;
+ /* Nested Paging related state */
+ u64 nested_cr3;
};
#define MSRPM_OFFSETS 16
@@ -1600,6 +1602,34 @@ static int vmmcall_interception(struct vcpu_svm *svm)
return 1;
}
+static unsigned long nested_svm_get_tdp_cr3(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ return svm->nested.nested_cr3;
+}
+
+static void nested_svm_set_tdp_cr3(struct kvm_vcpu *vcpu,
+ unsigned long root)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ svm->vmcb->control.nested_cr3 = root;
+ force_new_asid(vcpu);
+}
+
+static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ svm->vmcb->control.exit_code = SVM_EXIT_NPF;
+ svm->vmcb->control.exit_code_hi = 0;
+ svm->vmcb->control.exit_info_1 = vcpu->arch.fault.error_code;
+ svm->vmcb->control.exit_info_2 = vcpu->arch.fault.address;
+
+ nested_svm_vmexit(svm);
+}
+
static int nested_svm_check_permissions(struct vcpu_svm *svm)
{
if (!(svm->vcpu.arch.efer & EFER_SVME)
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 26/29] KVM: SVM: Initialize Nested Nested MMU context on VMRUN
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (24 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 25/29] KVM: SVM: Implement MMU helper functions for Nested Nested Paging Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 27/29] KVM: SVM: Expect two more candiates for exit_int_info Joerg Roedel
` (3 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch adds code to initialize the Nested Nested Paging
MMU context when the L1 guest executes a VMRUN instruction
and has nested paging enabled in its VMCB.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/mmu.c | 1 +
arch/x86/kvm/svm.c | 50 +++++++++++++++++++++++++++++++++++++++++---------
2 files changed, 42 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 95cbeed..6e248d8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2962,6 +2962,7 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu)
{
mmu_free_roots(vcpu);
}
+EXPORT_SYMBOL_GPL(kvm_mmu_unload);
static void mmu_pte_write_zap_pte(struct kvm_vcpu *vcpu,
struct kvm_mmu_page *sp,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a483aa9..9df60c3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -294,6 +294,15 @@ static inline void flush_guest_tlb(struct kvm_vcpu *vcpu)
force_new_asid(vcpu);
}
+static int get_npt_level(void)
+{
+#ifdef CONFIG_X86_64
+ return PT64_ROOT_LEVEL;
+#else
+ return PT32E_ROOT_LEVEL;
+#endif
+}
+
static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
{
vcpu->arch.efer = efer;
@@ -1630,6 +1639,26 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu)
nested_svm_vmexit(svm);
}
+static int nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
+{
+ int r;
+
+ r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
+
+ vcpu->arch.mmu.set_cr3 = nested_svm_set_tdp_cr3;
+ vcpu->arch.mmu.get_cr3 = nested_svm_get_tdp_cr3;
+ vcpu->arch.mmu.inject_page_fault = nested_svm_inject_npf_exit;
+ vcpu->arch.mmu.shadow_root_level = get_npt_level();
+ vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
+
+ return r;
+}
+
+static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.walk_mmu = &vcpu->arch.mmu;
+}
+
static int nested_svm_check_permissions(struct vcpu_svm *svm)
{
if (!(svm->vcpu.arch.efer & EFER_SVME)
@@ -1998,6 +2027,8 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
kvm_clear_exception_queue(&svm->vcpu);
kvm_clear_interrupt_queue(&svm->vcpu);
+ svm->nested.nested_cr3 = 0;
+
/* Restore selected save entries */
svm->vmcb->save.es = hsave->save.es;
svm->vmcb->save.cs = hsave->save.cs;
@@ -2024,6 +2055,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
nested_svm_unmap(page);
+ nested_svm_uninit_mmu_context(&svm->vcpu);
kvm_mmu_reset_context(&svm->vcpu);
kvm_mmu_load(&svm->vcpu);
@@ -2071,6 +2103,9 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
if (vmcb->control.asid == 0)
return false;
+ if (vmcb->control.nested_ctl && !npt_enabled)
+ return false;
+
return true;
}
@@ -2143,6 +2178,12 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
else
svm->vcpu.arch.hflags &= ~HF_HIF_MASK;
+ if (nested_vmcb->control.nested_ctl) {
+ kvm_mmu_unload(&svm->vcpu);
+ svm->nested.nested_cr3 = nested_vmcb->control.nested_cr3;
+ nested_svm_init_mmu_context(&svm->vcpu);
+ }
+
/* Load the nested guest state */
svm->vmcb->save.es = nested_vmcb->save.es;
svm->vmcb->save.cs = nested_vmcb->save.cs;
@@ -3410,15 +3451,6 @@ static bool svm_cpu_has_accelerated_tpr(void)
return false;
}
-static int get_npt_level(void)
-{
-#ifdef CONFIG_X86_64
- return PT64_ROOT_LEVEL;
-#else
- return PT32E_ROOT_LEVEL;
-#endif
-}
-
static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
return 0;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 27/29] KVM: SVM: Expect two more candiates for exit_int_info
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (25 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 26/29] KVM: SVM: Initialize Nested Nested MMU context on VMRUN Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 28/29] KVM: SVM: Report Nested Paging support to userspace Joerg Roedel
` (2 subsequent siblings)
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch adds INTR and NMI intercepts to the list of
expected intercepts with an exit_int_info set. While this
can't happen on bare metal it is architectural legal and may
happen with KVMs SVM emulation.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/svm.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9df60c3..ede95e0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2991,7 +2991,8 @@ static int handle_exit(struct kvm_vcpu *vcpu)
if (is_external_interrupt(svm->vmcb->control.exit_int_info) &&
exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR &&
- exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH)
+ exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH &&
+ exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI)
printk(KERN_ERR "%s: unexpected exit_ini_info 0x%x "
"exit_code 0x%x\n",
__func__, svm->vmcb->control.exit_int_info,
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 28/29] KVM: SVM: Report Nested Paging support to userspace
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (26 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 27/29] KVM: SVM: Expect two more candiates for exit_int_info Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-10 15:31 ` [PATCH 29/29] KVM: X86: Report SVM bit to userspace only when supported Joerg Roedel
2010-09-12 8:56 ` [PATCH 0/29] Nested Paging Virtualization for KVM v4 Avi Kivity
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel
This patch implements the reporting of the nested paging
feature support to userspace.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/svm.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ede95e0..678602e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3476,6 +3476,10 @@ static void svm_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
if (svm_has(SVM_FEATURE_NRIP))
entry->edx |= SVM_FEATURE_NRIP;
+ /* Support NPT for the guest if enabled */
+ if (npt_enabled)
+ entry->edx |= SVM_FEATURE_NPT;
+
break;
}
}
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH 29/29] KVM: X86: Report SVM bit to userspace only when supported
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (27 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 28/29] KVM: SVM: Report Nested Paging support to userspace Joerg Roedel
@ 2010-09-10 15:31 ` Joerg Roedel
2010-09-12 8:56 ` [PATCH 0/29] Nested Paging Virtualization for KVM v4 Avi Kivity
29 siblings, 0 replies; 33+ messages in thread
From: Joerg Roedel @ 2010-09-10 15:31 UTC (permalink / raw)
To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel, stable
This patch fixes a bug in KVM where it _always_ reports the
support of the SVM feature to userspace. But KVM only
supports SVM on AMD hardware and only when it is enabled in
the kernel module. This patch fixes the wrong reporting.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/kvm/svm.c | 4 ++++
arch/x86/kvm/x86.c | 2 +-
2 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 678602e..eeb08d6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3464,6 +3464,10 @@ static void svm_cpuid_update(struct kvm_vcpu *vcpu)
static void svm_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
{
switch (func) {
+ case 0x80000001:
+ if (nested)
+ entry->ecx |= (1 << 2); /* Set SVM bit */
+ break;
case 0x8000000A:
entry->eax = 1; /* SVM revision 1 */
entry->ebx = 8; /* Lets support 8 ASIDs in case we add proper
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 16f49c7..f7bab03 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2195,7 +2195,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
0 /* Reserved, AES */ | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX);
/* cpuid 0x80000001.ecx */
const u32 kvm_supported_word6_x86_features =
- F(LAHF_LM) | F(CMP_LEGACY) | F(SVM) | 0 /* ExtApicSpace */ |
+ F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
F(3DNOWPREFETCH) | 0 /* OSVW */ | 0 /* IBS */ | F(SSE5) |
0 /* SKINIT */ | 0 /* WDT */;
--
1.7.0.4
^ permalink raw reply related [flat|nested] 33+ messages in thread* Re: [PATCH 0/29] Nested Paging Virtualization for KVM v4
2010-09-10 15:30 [PATCH 0/29] Nested Paging Virtualization for KVM v4 Joerg Roedel
` (28 preceding siblings ...)
2010-09-10 15:31 ` [PATCH 29/29] KVM: X86: Report SVM bit to userspace only when supported Joerg Roedel
@ 2010-09-12 8:56 ` Avi Kivity
29 siblings, 0 replies; 33+ messages in thread
From: Avi Kivity @ 2010-09-12 8:56 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel
On 09/10/2010 06:30 PM, Joerg Roedel wrote:
> Hi Avi, Marcelo,
>
> as promised here is -v4 of my patch-set for virtualizing nested paging in KVM.
> I addresses all of your review comments in this version and fixed a misbehavior
> in the nested page-walker code where it would have reported the wrong
> error-code on an emulated nested page fault.
>
> Here is the complete list of changes:
>
> * Fixed the bug in the gpa_to_gfn function
> * Made sure that the right fault values are kept in the two-dimensional
> page-table walker
> * Fixed the return code of x86_decode_insn so that a page-fault within
> that function can be handled
> * Set vcpu->arch.mmu.direct_map to true for nonpaging mode too
> * Changed code so that the right access-mode is used on nested
> page-table walks
> * Made the NX mode a capability of the MMU context to distinguish
> between l1-nx and l2-nx (1 additional patch)
> * Fixed the bug that KVM always reports the SVM flag as supported to
> userspace, it should only be reported on AMD hardware when nesting is
> enabled
>
> As the patch-set before this one was tested with the same set of combinations
> too. I found no regressions to current avi/master. This patch-set applies on
> current avi/master plus the three fixes I sent last week.
>
> As with the last version of the patch-set I also pushed this one to a tree on
> korg. Find it in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-kvm.git npt-virt-v4
>
> Please review these patches and/or apply them :-) As usual I appreciate your
> feedback.
All applied, thanks. I'll go play with my new toy now.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread