* [PATCH 0/3] Cache PDPTRs under ept/npt
@ 2009-06-01 13:22 Avi Kivity
2009-06-01 13:22 ` [PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 Avi Kivity
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-01 13:22 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti, Joerg Roedel, Sheng Yang
Currently the EPT code reloads the PDPTRs after every exit, even though
they are usually not needed. Moreover, the PDPTRs are reloaded from memory,
even though they are supposed to be kept in processor registers, independent
of memory. The NPT case is similar.
This patchset makes the PDPTRs cacheable registers (like RSP and RIP on vmx),
so they are only stored in the VMCS if they are dirtied, and copied from the
VMCS to memory on demand. As SVM doesn't virtualize the PDPTRs, we simply
load them from memory on demand (instead of on every exit).
It reduces ept vmexit costs (inb on some pic port) from 3231 cycles to 2750
cycles, a 15% reduction.
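The avail/dirty bookkeeping the series introduces can be sketched in miniature in plain C. Everything below (the struct, names, and bit index) is an illustrative stand-in for the scheme described above, not the actual KVM code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXREG_PDPTR 0  /* illustrative bit index, not KVM's VCPU_EXREG_PDPTR */

struct demo_vcpu {
    uint64_t pdptrs[4];       /* cached copy of the guest PDPTRs */
    uint64_t vmcs_pdptrs[4];  /* stand-in for the GUEST_PDPTR0..3 VMCS fields */
    unsigned long regs_avail; /* bit set: cache holds a valid copy */
    unsigned long regs_dirty; /* bit set: cache is newer than the "VMCS" */
};

/* On vmexit, just invalidate the cache instead of reading the VMCS. */
static void demo_vmexit(struct demo_vcpu *v)
{
    v->regs_avail &= ~(1UL << EXREG_PDPTR);
    v->regs_dirty &= ~(1UL << EXREG_PDPTR);
}

/* Reading a PDPTR pulls from the "VMCS" only on first use after an exit. */
static uint64_t demo_pdptr_read(struct demo_vcpu *v, int index)
{
    if (!(v->regs_avail & (1UL << EXREG_PDPTR))) {
        memcpy(v->pdptrs, v->vmcs_pdptrs, sizeof(v->pdptrs));
        v->regs_avail |= 1UL << EXREG_PDPTR;
    }
    return v->pdptrs[index];
}

/* Before vmentry, write back only if the cached copy was dirtied. */
static void demo_vmentry(struct demo_vcpu *v)
{
    if (v->regs_dirty & (1UL << EXREG_PDPTR)) {
        memcpy(v->vmcs_pdptrs, v->pdptrs, sizeof(v->pdptrs));
        v->regs_dirty &= ~(1UL << EXREG_PDPTR);
    }
}
```

The point of the design is that a guest which never touches its page tables between exits pays neither the four VMCS reads after each exit nor the four writes before each entry.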
Please review.
Avi Kivity (3):
KVM: VMX: Avoid duplicate ept tlb flush when setting cr3
KVM: VMX: Simplify pdptr and cr3 management
KVM: Cache pdptrs
arch/x86/include/asm/kvm_host.h | 4 +++
arch/x86/kvm/kvm_cache_regs.h | 10 +++++++++
arch/x86/kvm/mmu.c | 7 ++++-
arch/x86/kvm/paging_tmpl.h | 2 +-
arch/x86/kvm/svm.c | 24 ++++++++++++++++-----
arch/x86/kvm/vmx.c | 42 +++++++++++++++++++++++++++++---------
arch/x86/kvm/x86.c | 8 +++++++
7 files changed, 78 insertions(+), 19 deletions(-)
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3
2009-06-01 13:22 [PATCH 0/3] Cache PDPTRs under ept/npt Avi Kivity
@ 2009-06-01 13:22 ` Avi Kivity
2009-06-01 13:22 ` [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management Avi Kivity
2009-06-01 13:22 ` [PATCH 3/3] KVM: Cache pdptrs Avi Kivity
2 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-01 13:22 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti, Joerg Roedel, Sheng Yang
vmx_set_cr3() will call vmx_tlb_flush(), which will flush the ept context.
So there is no need to call ept_sync_context() explicitly.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/vmx.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 25f1239..5607de8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1642,7 +1642,6 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
if (enable_ept) {
eptp = construct_eptp(cr3);
vmcs_write64(EPT_POINTER, eptp);
- ept_sync_context(eptp);
ept_load_pdptrs(vcpu);
guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
VMX_EPT_IDENTITY_PAGETABLE_ADDR;
--
1.6.0.6
* [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-01 13:22 [PATCH 0/3] Cache PDPTRs under ept/npt Avi Kivity
2009-06-01 13:22 ` [PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 Avi Kivity
@ 2009-06-01 13:22 ` Avi Kivity
2009-06-02 9:22 ` Sheng Yang
2009-06-01 13:22 ` [PATCH 3/3] KVM: Cache pdptrs Avi Kivity
2 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-06-01 13:22 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti, Joerg Roedel, Sheng Yang
Instead of reading the PDPTRs from memory after every exit (which is slow
and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
memory to the VMCS before entry, and from the VMCS to memory after exit.
Do the same for cr3.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/vmx.c | 21 +++++++++++++++------
1 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5607de8..1783606 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1538,10 +1538,6 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
{
if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
- if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
- printk(KERN_ERR "EPT: Fail to load pdptrs!\n");
- return;
- }
vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
@@ -1549,6 +1545,16 @@ static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
}
}
+static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
+{
+ if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
+ vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
+ vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
+ vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
+ vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
+ }
+}
+
static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
@@ -1642,7 +1648,6 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
if (enable_ept) {
eptp = construct_eptp(cr3);
vmcs_write64(EPT_POINTER, eptp);
- ept_load_pdptrs(vcpu);
guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
VMX_EPT_IDENTITY_PAGETABLE_ADDR;
}
@@ -3199,7 +3204,7 @@ static int vmx_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
* to sync with guest real CR3. */
if (enable_ept && is_paging(vcpu)) {
vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
- ept_load_pdptrs(vcpu);
+ ept_save_pdptrs(vcpu);
}
if (unlikely(vmx->fail)) {
@@ -3376,6 +3381,10 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
+ if (enable_ept && is_paging(vcpu)) {
+ vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
+ ept_load_pdptrs(vcpu);
+ }
/* Record the guest's net vcpu time for enforced NMI injections. */
if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
vmx->entry_time = ktime_get();
--
1.6.0.6
* [PATCH 3/3] KVM: Cache pdptrs
2009-06-01 13:22 [PATCH 0/3] Cache PDPTRs under ept/npt Avi Kivity
2009-06-01 13:22 ` [PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 Avi Kivity
2009-06-01 13:22 ` [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management Avi Kivity
@ 2009-06-01 13:22 ` Avi Kivity
2009-06-02 9:04 ` Joerg Roedel
2 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-06-01 13:22 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti, Joerg Roedel, Sheng Yang
Instead of reloading the pdptrs on every entry and exit (vmcs writes on vmx,
guest memory access on svm), extract them on demand.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/kvm_host.h | 4 ++++
arch/x86/kvm/kvm_cache_regs.h | 10 ++++++++++
arch/x86/kvm/mmu.c | 7 +++++--
arch/x86/kvm/paging_tmpl.h | 2 +-
arch/x86/kvm/svm.c | 24 ++++++++++++++++++------
arch/x86/kvm/vmx.c | 22 ++++++++++++++++++----
arch/x86/kvm/x86.c | 8 ++++++++
7 files changed, 64 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d2b082d..1951d39 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -120,6 +120,10 @@ enum kvm_reg {
NR_VCPU_REGS
};
+enum kvm_reg_ex {
+ VCPU_EXREG_PDPTR = NR_VCPU_REGS,
+};
+
enum {
VCPU_SREG_ES,
VCPU_SREG_CS,
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 1ff819d..24fc742 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -29,4 +29,14 @@ static inline void kvm_rip_write(struct kvm_vcpu *vcpu, unsigned long val)
kvm_register_write(vcpu, VCPU_REGS_RIP, val);
}
+
+static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
+{
+ if (!test_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_avail))
+ kvm_x86_ops->cache_reg(vcpu, VCPU_EXREG_PDPTR);
+
+ return vcpu->arch.pdptrs[index];
+}
+
#endif
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7030b5f..809cce0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -18,6 +18,7 @@
*/
#include "mmu.h"
+#include "kvm_cache_regs.h"
#include <linux/kvm_host.h>
#include <linux/types.h>
@@ -1930,6 +1931,7 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
gfn_t root_gfn;
struct kvm_mmu_page *sp;
int direct = 0;
+ u64 pdptr;
root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;
@@ -1957,11 +1959,12 @@ static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
ASSERT(!VALID_PAGE(root));
if (vcpu->arch.mmu.root_level == PT32E_ROOT_LEVEL) {
- if (!is_present_pte(vcpu->arch.pdptrs[i])) {
+ pdptr = kvm_pdptr_read(vcpu, i);
+ if (!is_present_pte(pdptr)) {
vcpu->arch.mmu.pae_root[i] = 0;
continue;
}
- root_gfn = vcpu->arch.pdptrs[i] >> PAGE_SHIFT;
+ root_gfn = pdptr >> PAGE_SHIFT;
} else if (vcpu->arch.mmu.root_level == 0)
root_gfn = 0;
if (mmu_check_root(vcpu, root_gfn))
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 67785f6..4cb1dbf 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -131,7 +131,7 @@ walk:
pte = vcpu->arch.cr3;
#if PTTYPE == 64
if (!is_long_mode(vcpu)) {
- pte = vcpu->arch.pdptrs[(addr >> 30) & 3];
+ pte = kvm_pdptr_read(vcpu, (addr >> 30) & 3);
if (!is_present_pte(pte))
goto not_present;
--walker->level;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3ac45e3..37397f6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -779,6 +779,18 @@ static void svm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
to_svm(vcpu)->vmcb->save.rflags = rflags;
}
+static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
+{
+ switch (reg) {
+ case VCPU_EXREG_PDPTR:
+ BUG_ON(!npt_enabled);
+ load_pdptrs(vcpu, vcpu->arch.cr3);
+ break;
+ default:
+ BUG();
+ }
+}
+
static void svm_set_vintr(struct vcpu_svm *svm)
{
svm->vmcb->control.intercept |= 1ULL << INTERCEPT_VINTR;
@@ -2286,12 +2298,6 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
}
vcpu->arch.cr0 = svm->vmcb->save.cr0;
vcpu->arch.cr3 = svm->vmcb->save.cr3;
- if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
- if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
- }
if (mmu_reload) {
kvm_mmu_reset_context(vcpu);
kvm_mmu_load(vcpu);
@@ -2642,6 +2648,11 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
svm->next_rip = 0;
+ if (npt_enabled) {
+ vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
+ vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
+ }
+
svm_complete_interrupts(svm);
}
@@ -2750,6 +2761,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.set_gdt = svm_set_gdt,
.get_dr = svm_get_dr,
.set_dr = svm_set_dr,
+ .cache_reg = svm_cache_reg,
.get_rflags = svm_get_rflags,
.set_rflags = svm_set_rflags,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1783606..fd05fd2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -159,6 +159,8 @@ static struct kvm_vmx_segment_field {
VMX_SEGMENT_FIELD(LDTR),
};
+static void ept_save_pdptrs(struct kvm_vcpu *vcpu);
+
/*
* Keep MSR_K6_STAR at the end, as setup_msrs() will try to optimize it
* away by decrementing the array size.
@@ -1038,6 +1040,10 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
case VCPU_REGS_RIP:
vcpu->arch.regs[VCPU_REGS_RIP] = vmcs_readl(GUEST_RIP);
break;
+ case VCPU_EXREG_PDPTR:
+ if (enable_ept)
+ ept_save_pdptrs(vcpu);
+ break;
default:
break;
}
@@ -1537,6 +1543,10 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
{
+ if (!test_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_dirty))
+ return;
+
if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
@@ -1553,6 +1563,11 @@ static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
}
+
+ __set_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_avail);
+ __set_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_dirty);
}
static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
@@ -3202,10 +3217,8 @@ static int vmx_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
/* Access CR3 don't cause VMExit in paging mode, so we need
* to sync with guest real CR3. */
- if (enable_ept && is_paging(vcpu)) {
+ if (enable_ept && is_paging(vcpu))
vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
- ept_save_pdptrs(vcpu);
- }
if (unlikely(vmx->fail)) {
kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
@@ -3506,7 +3519,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
#endif
);
- vcpu->arch.regs_avail = ~((1 << VCPU_REGS_RIP) | (1 << VCPU_REGS_RSP));
+ vcpu->arch.regs_avail = ~((1 << VCPU_REGS_RIP) | (1 << VCPU_REGS_RSP)
+ | (1 << VCPU_EXREG_PDPTR));
vcpu->arch.regs_dirty = 0;
get_debugreg(vcpu->arch.dr6, 6);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6d44dd5..d5d1ba8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -246,6 +246,10 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
ret = 1;
memcpy(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs));
+ __set_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_avail);
+ __set_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_dirty);
out:
return ret;
@@ -261,6 +265,10 @@ static bool pdptrs_changed(struct kvm_vcpu *vcpu)
if (is_long_mode(vcpu) || !is_pae(vcpu))
return false;
+ if (!test_bit(VCPU_EXREG_PDPTR,
+ (unsigned long *)&vcpu->arch.regs_avail))
+ return true;
+
r = kvm_read_guest(vcpu->kvm, vcpu->arch.cr3 & ~31u, pdpte, sizeof(pdpte));
if (r < 0)
goto out;
--
1.6.0.6
* Re: [PATCH 3/3] KVM: Cache pdptrs
2009-06-01 13:22 ` [PATCH 3/3] KVM: Cache pdptrs Avi Kivity
@ 2009-06-02 9:04 ` Joerg Roedel
2009-06-02 9:09 ` Avi Kivity
0 siblings, 1 reply; 17+ messages in thread
From: Joerg Roedel @ 2009-06-02 9:04 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Sheng Yang
On Mon, Jun 01, 2009 at 04:22:03PM +0300, Avi Kivity wrote:
> +static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
> +{
> + switch (reg) {
> + case VCPU_EXREG_PDPTR:
> + BUG_ON(!npt_enabled);
> + load_pdptrs(vcpu, vcpu->arch.cr3);
> + break;
> + default:
> + BUG();
> + }
> +}
Don't we need to check the return value of load_pdptrs() here and inject
a #GP if it fails?
> +
> static void svm_set_vintr(struct vcpu_svm *svm)
> {
> svm->vmcb->control.intercept |= 1ULL << INTERCEPT_VINTR;
> @@ -2286,12 +2298,6 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
> }
> vcpu->arch.cr0 = svm->vmcb->save.cr0;
> vcpu->arch.cr3 = svm->vmcb->save.cr3;
> - if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
> - if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
> - kvm_inject_gp(vcpu, 0);
> - return 1;
> - }
> - }
... as done here.
Joerg
--
| Advanced Micro Devices GmbH
Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
System |
Research | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
| Registergericht München, HRB Nr. 43632
* Re: [PATCH 3/3] KVM: Cache pdptrs
2009-06-02 9:04 ` Joerg Roedel
@ 2009-06-02 9:09 ` Avi Kivity
2009-06-02 9:30 ` Joerg Roedel
2009-06-02 11:50 ` Avi Kivity
0 siblings, 2 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 9:09 UTC (permalink / raw)
To: Joerg Roedel; +Cc: kvm, Marcelo Tosatti, Sheng Yang
Joerg Roedel wrote:
> On Mon, Jun 01, 2009 at 04:22:03PM +0300, Avi Kivity wrote:
>
>> +static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>> +{
>> + switch (reg) {
>> + case VCPU_EXREG_PDPTR:
>> + BUG_ON(!npt_enabled);
>> + load_pdptrs(vcpu, vcpu->arch.cr3);
>> + break;
>> + default:
>> + BUG();
>> + }
>> +}
>>
>
> Don't we need to check the return value of load_pdptrs() here and inject
> a #GP if it fails?
>
We're after some random exit; the guest won't be expecting a #GP on some
random instruction.
The only options are to ignore it or raise a triple fault.
>> +
>> static void svm_set_vintr(struct vcpu_svm *svm)
>> {
>> svm->vmcb->control.intercept |= 1ULL << INTERCEPT_VINTR;
>> @@ -2286,12 +2298,6 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>> }
>> vcpu->arch.cr0 = svm->vmcb->save.cr0;
>> vcpu->arch.cr3 = svm->vmcb->save.cr3;
>> - if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
>> - if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
>> - kvm_inject_gp(vcpu, 0);
>> - return 1;
>> - }
>> - }
>>
>
> ... as done here.
That's a bug... luckily no guests trash their PDPTs after loading CR3.
I guess I should fix it in a separate patch to avoid mixing a bugfix with a
feature.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-01 13:22 ` [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management Avi Kivity
@ 2009-06-02 9:22 ` Sheng Yang
2009-06-02 9:26 ` Avi Kivity
0 siblings, 1 reply; 17+ messages in thread
From: Sheng Yang @ 2009-06-02 9:22 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
> Instead of reading the PDPTRs from memory after every exit (which is slow
> and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
> memory to the VMCS before entry, and from the VMCS to memory after exit.
> Do the same for cr3.
>
Thanks for fixing!
After reviewing my original code, I found a potential bug. SDM 3B has this:
23.3.4 Saving Non-Register State
...
If the logical processor supports the 1-setting of the “enable EPT” VM-
execution control, values are saved into the four (4) PDPTE fields as follows:
— If the “enable EPT” VM-execution control is 1 and the logical processor was
using PAE paging at the time of the VM exit, the PDPTE values currently in use
are saved:
• The values saved into bits 11:9 of each of the fields is undefined.
• If the value saved into one of the fields has bit 0 (present) clear, the
value saved into bits 63:1 of that field is undefined. That value need not
correspond to the value that was loaded by VM entry or to any value that
might have been loaded in VMX non-root operation.
• If the value saved into one of the fields has bit 0 (present) set, the value
saved into bits 63:12 of the field is a guest-physical address.
— If the “enable EPT” VM-execution control is 0 or the logical processor was
not using PAE paging at the time of the VM exit, the values saved are
undefined.
But dropping the ept_load_pdptrs() on exit and adding it in cr0 handling
results in a Windows PAE guest hang on boot. I am checking it now. Any thoughts?...
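The rule in the quoted SDM text can be expressed directly: only bit 0 can always be trusted in a saved PDPTE, and only for a present entry do bits 63:12 hold a guest-physical address. The helpers below are illustrative only (KVM has no functions by these names):

```c
#include <stdint.h>

/* Per the quoted SDM text: bit 0 is the present bit; for a present
 * entry, bits 63:12 hold a guest-physical address, while bits 11:9
 * (and every bit of a non-present entry) are undefined. */
static int pdpte_present(uint64_t pdpte)
{
    return pdpte & 1;
}

static uint64_t pdpte_gpa(uint64_t pdpte)
{
    return pdpte & ~0xfffULL; /* keep bits 63:12, drop the attribute bits */
}
```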
--
regards
Yang, Sheng
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> arch/x86/kvm/vmx.c | 21 +++++++++++++++------
> 1 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5607de8..1783606 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1538,10 +1538,6 @@ static void vmx_decache_cr4_guest_bits(struct
> kvm_vcpu *vcpu) static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
> {
> if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
> - if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
> - printk(KERN_ERR "EPT: Fail to load pdptrs!\n");
> - return;
> - }
> vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
> vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
> vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
> @@ -1549,6 +1545,16 @@ static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
> }
> }
>
> +static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
> +{
> + if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
> + vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
> + vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
> + vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
> + vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
> + }
> +}
> +
> static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
>
> static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
> @@ -1642,7 +1648,6 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu,
> unsigned long cr3) if (enable_ept) {
> eptp = construct_eptp(cr3);
> vmcs_write64(EPT_POINTER, eptp);
> - ept_load_pdptrs(vcpu);
> guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
> VMX_EPT_IDENTITY_PAGETABLE_ADDR;
> }
> @@ -3199,7 +3204,7 @@ static int vmx_handle_exit(struct kvm_run *kvm_run,
> struct kvm_vcpu *vcpu) * to sync with guest real CR3. */
> if (enable_ept && is_paging(vcpu)) {
> vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
> - ept_load_pdptrs(vcpu);
> + ept_save_pdptrs(vcpu);
> }
>
> if (unlikely(vmx->fail)) {
> @@ -3376,6 +3381,10 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu,
> struct kvm_run *kvm_run) {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> + if (enable_ept && is_paging(vcpu)) {
> + vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
> + ept_load_pdptrs(vcpu);
> + }
> /* Record the guest's net vcpu time for enforced NMI injections. */
> if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
> vmx->entry_time = ktime_get();
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 9:22 ` Sheng Yang
@ 2009-06-02 9:26 ` Avi Kivity
2009-06-02 9:30 ` Sheng Yang
0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 9:26 UTC (permalink / raw)
To: Sheng Yang; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
Sheng Yang wrote:
> On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
>
>> Instead of reading the PDPTRs from memory after every exit (which is slow
>> and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
>> memory to the VMCS before entry, and from the VMCS to memory after exit.
>> Do the same for cr3.
>>
>>
>
> Thanks for fixing!
>
> After reviewing my original code, I found a potential bug. SDM 3B has this:
>
> 23.3.4 Saving Non-Register State
> ...
> If the logical processor supports the 1-setting of the “enable EPT” VM-
> execution control, values are saved into the four (4) PDPTE fields as follows:
> — If the “enable EPT” VM-execution control is 1 and the logical processor was
> using PAE paging at the time of the VM exit, the PDPTE values currently in use
> are saved:
> • The values saved into bits 11:9 of each of the fields is undefined.
> • If the value saved into one of the fields has bit 0 (present) clear, the
> value saved into bits 63:1 of that field is undefined. That value need not
> correspond to the value that was loaded by VM entry or to any value that
> might have been loaded in VMX non-root operation.
> • If the value saved into one of the fields has bit 0 (present) set, the value
> saved into bits 63:12 of the field is a guest-physical address.
> — If the “enable EPT” VM-execution control is 0 or the logical processor was
> not using PAE paging at the time of the VM exit, the values saved are
> undefined.
>
> But dropping the ept_load_pdptrs() on exit and adding it in cr0 handling
> results in a Windows PAE guest hang on boot. I am checking it now. Any thoughts?...
>
You mean with the new code? What version of Windows exactly?
I'll check it out, though EPTs are a little hard to find here.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 9:26 ` Avi Kivity
@ 2009-06-02 9:30 ` Sheng Yang
2009-06-02 9:46 ` Avi Kivity
0 siblings, 1 reply; 17+ messages in thread
From: Sheng Yang @ 2009-06-02 9:30 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
On Tuesday 02 June 2009 17:26:27 Avi Kivity wrote:
> Sheng Yang wrote:
> > On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
> >> Instead of reading the PDPTRs from memory after every exit (which is
> >> slow and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs
> >> from memory to the VMCS before entry, and from the VMCS to memory after
> >> exit. Do the same for cr3.
> >
> > Thanks for fixing!
> >
> > After reviewing my original code, I found a potential bug. SDM 3B has
> > this:
> >
> > 23.3.4 Saving Non-Register State
> > ...
> > If the logical processor supports the 1-setting of the “enable EPT” VM-
> > execution control, values are saved into the four (4) PDPTE fields as
> > follows: — If the “enable EPT” VM-execution control is 1 and the logical
> > processor was using PAE paging at the time of the VM exit, the PDPTE
> > values currently in use are saved:
> > • The values saved into bits 11:9 of each of the fields is undefined.
> > • If the value saved into one of the fields has bit 0 (present) clear,
> > the value saved into bits 63:1 of that field is undefined. That value
> > need not correspond to the value that was loaded by VM entry or to any
> > value that might have been loaded in VMX non-root operation.
> > • If the value saved into one of the fields has bit 0 (present) set, the
> > value saved into bits 63:12 of the field is a guest-physical address.
> > — If the “enable EPT” VM-execution control is 0 or the logical processor
> > was not using PAE paging at the time of the VM exit, the values saved are
> > undefined.
> >
> > But dropping the ept_load_pdptrs() on exit and adding it in cr0 handling
> > results in a Windows PAE guest hang on boot. I am checking it now. Any
> > thoughts?...
>
> You mean with the new code? What version of Windows exactly?
>
> I'll check it out, though EPTs are a little hard to find here.
No, no, not with the new code. Since the CPU saves the PDPTRs on VM exit
when EPT is enabled with PAE, it should not be necessary to load them
explicitly. So I eliminated the ept_load_pdptrs() from the exit handler and
put it in the CR0 handling. I just tried to optimize the PDPTR loading
according to the spec, but did not get the desired result...
So I am trying to find the reason for the failure...
--
regards
Yang, Sheng
* Re: [PATCH 3/3] KVM: Cache pdptrs
2009-06-02 9:09 ` Avi Kivity
@ 2009-06-02 9:30 ` Joerg Roedel
2009-06-02 9:44 ` Avi Kivity
2009-06-02 11:50 ` Avi Kivity
1 sibling, 1 reply; 17+ messages in thread
From: Joerg Roedel @ 2009-06-02 9:30 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Sheng Yang
On Tue, Jun 02, 2009 at 12:09:17PM +0300, Avi Kivity wrote:
> Joerg Roedel wrote:
>> On Mon, Jun 01, 2009 at 04:22:03PM +0300, Avi Kivity wrote:
>>
>>> +static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>>> +{
>>> + switch (reg) {
>>> + case VCPU_EXREG_PDPTR:
>>> + BUG_ON(!npt_enabled);
>>> + load_pdptrs(vcpu, vcpu->arch.cr3);
>>> + break;
>>> + default:
>>> + BUG();
>>> + }
>>> +}
>>>
>>
>> Don't we need to check the return value of load_pdptrs() here and inject
>> a #GP if it fails?
>>
>
> We're after some random exit; the guest won't be expecting a #GP on some
> random instruction.
>
> The only options are to ignore it or raise a triple fault.
That's no different from PAE with NPT anyway. With NPT the hardware
does not load all four pdptrs at cr3 switch time, only when they
are used.
Joerg
--
| Advanced Micro Devices GmbH
Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
System |
Research | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
| Registergericht München, HRB Nr. 43632
* Re: [PATCH 3/3] KVM: Cache pdptrs
2009-06-02 9:30 ` Joerg Roedel
@ 2009-06-02 9:44 ` Avi Kivity
0 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 9:44 UTC (permalink / raw)
To: Joerg Roedel; +Cc: kvm, Marcelo Tosatti, Sheng Yang
Joerg Roedel wrote:
> On Tue, Jun 02, 2009 at 12:09:17PM +0300, Avi Kivity wrote:
>
>> Joerg Roedel wrote:
>>
>>> On Mon, Jun 01, 2009 at 04:22:03PM +0300, Avi Kivity wrote:
>>>
>>>
>>>> +static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>>>> +{
>>>> + switch (reg) {
>>>> + case VCPU_EXREG_PDPTR:
>>>> + BUG_ON(!npt_enabled);
>>>> + load_pdptrs(vcpu, vcpu->arch.cr3);
>>>> + break;
>>>> + default:
>>>> + BUG();
>>>> + }
>>>> +}
>>>>
>>>>
>>> Don't we need to check the return value of load_pdptrs() here and inject
>>> a #GP if it fails?
>>>
>>>
>> We're after some random exit; the guest won't be expecting a #GP on some
>> random instruction.
>>
>> The only options are to ignore it or raise a triple fault.
>>
>
> That's no different from PAE with NPT anyway. With NPT the hardware
> does not load all four pdptrs at cr3 switch time, only when they
> are used.
>
That will at least cause a page fault on a related instruction. But we
can't #GP on a random exit.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 9:30 ` Sheng Yang
@ 2009-06-02 9:46 ` Avi Kivity
2009-06-02 9:56 ` Sheng Yang
0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 9:46 UTC (permalink / raw)
To: Sheng Yang; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
Sheng Yang wrote:
> No, no, not with the new code. Since the CPU saves the PDPTRs on VM exit
> when EPT is enabled with PAE, it should not be necessary to load them
> explicitly. So I eliminated the ept_load_pdptrs() from the exit handler and
> put it in the CR0 handling. I just tried to optimize the PDPTR loading
> according to the spec, but did not get the desired result...
>
> So I am trying to find the reason for the failure...
>
so from your point of view, the patches are okay?
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 9:46 ` Avi Kivity
@ 2009-06-02 9:56 ` Sheng Yang
2009-06-02 10:16 ` Avi Kivity
0 siblings, 1 reply; 17+ messages in thread
From: Sheng Yang @ 2009-06-02 9:56 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
> Sheng Yang wrote:
> > No, no, not with the new code. Since the CPU saves the PDPTRs on VM exit
> > when EPT is enabled with PAE, it should not be necessary to load them
> > explicitly. So I eliminated the ept_load_pdptrs() from the exit handler
> > and put it in the CR0 handling. I just tried to optimize the PDPTR
> > loading according to the spec, but did not get the desired result...
> >
> > So I am trying to find the reason for the failure...
>
> so from your point of view, the patches are okay?
Yes. But we should be able to optimize it more by avoiding most of the PDPTR
loading, according to the spec.
--
regards
Yang, Sheng
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 9:56 ` Sheng Yang
@ 2009-06-02 10:16 ` Avi Kivity
2009-06-02 11:31 ` Sheng Yang
0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 10:16 UTC (permalink / raw)
To: Sheng Yang; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
Sheng Yang wrote:
> On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
>
>> Sheng Yang wrote:
>>
> >>> No, no, not with the new code. Since the CPU saves the PDPTRs on VM
> >>> exit when EPT is enabled with PAE, it should not be necessary to load
> >>> them explicitly. So I eliminated the ept_load_pdptrs() from the exit
> >>> handler and put it in the CR0 handling. I just tried to optimize the
> >>> PDPTR loading according to the spec, but did not get the desired
> >>> result...
> >>>
> >>> So I am trying to find the reason for the failure...
>>>
>> so from your point of view, the patches are okay?
>>
>
> Yes. But we should be able to optimize it more by avoiding most of the
> PDPTR loading, according to the spec.
>
Patch 3 makes reloading a very rare event. It only happens on pae mode
changes or if userspace sets cr3.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 10:16 ` Avi Kivity
@ 2009-06-02 11:31 ` Sheng Yang
2009-06-02 11:44 ` Avi Kivity
0 siblings, 1 reply; 17+ messages in thread
From: Sheng Yang @ 2009-06-02 11:31 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
On Tuesday 02 June 2009 18:16:14 Avi Kivity wrote:
> Sheng Yang wrote:
> > On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
> >> Sheng Yang wrote:
> >>> No, no, not with the new code. Since the CPU can load the PDPTRs
> >>> itself on VM exit when EPT is enabled with PAE, it should not be
> >>> necessary to load them explicitly. So I eliminated the
> >>> ept_load_pdptr() in the exit handler and put it in the CR0 handling.
> >>> I tried to optimize the PDPTR loading according to the spec, but did
> >>> not get the desired result...
> >>>
> >>> So I am trying to find the failure reason...
> >>
> >> so from your point of view, the patches are okay?
> >
> > Yes. But we could optimize it further by avoiding most of the PDPTR
> > loads, according to the spec.
>
> Patch 3 makes reloading a very rare event. It only happens on pae mode
> changes or if userspace sets cr3.
Yes, you are right. Quite beautiful. :)
Sorry for misunderstanding the meaning of "on demand"; I hadn't looked at
the patch carefully...
Looks fine now. Thanks!
BTW: one extra blank line above function kvm_pdptr_read() in patch 3. ;)
--
regards
Yang, Sheng
* Re: [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management
2009-06-02 11:31 ` Sheng Yang
@ 2009-06-02 11:44 ` Avi Kivity
0 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 11:44 UTC (permalink / raw)
To: Sheng Yang; +Cc: kvm, Marcelo Tosatti, Joerg Roedel
Sheng Yang wrote:
> Yes, you are right. Quite beautiful. :)
>
Thanks :)
It's basically a continuation of RIP/RSP caching. I plan to continue it
for cr0/cr3.
> Sorry for misunderstanding the meaning of "on demand"; I hadn't looked at
> the patch carefully...
>
> Looks fine now. Thanks!
>
> BTW: one extra blank line above function kvm_pdptr_read() in patch 3. ;)
>
Will fix.
--
error compiling committee.c: too many arguments to function
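The RIP/RSP caching that Avi mentions continuing for cr0/cr3 pairs the on-demand read with deferred writes: register writes only touch the in-memory cache and set a dirty bit, and dirty values are flushed to the VMCS once, just before VM entry, no matter how many times the register was modified in between. A hypothetical, simplified C model of that write side (invented names, not the kernel code):

```c
#include <stdint.h>

enum cached_reg { REG_RSP, REG_RIP, REG_CR3, NR_CACHED_REGS };

/* Hypothetical register cache with dirty tracking; vmcs_writes
 * counts the "expensive" VMCS updates the scheme defers. */
struct reg_cache {
	uint64_t regs[NR_CACHED_REGS]; /* cached values */
	uint32_t dirty;                /* bitmap of modified registers */
	uint64_t vmcs[NR_CACHED_REGS]; /* stand-in for VMCS fields */
	int      vmcs_writes;          /* expensive VMCS writes */
};

/* Writes only update the cache and mark the register dirty. */
static void reg_write(struct reg_cache *c, enum cached_reg r, uint64_t val)
{
	c->regs[r] = val;
	c->dirty |= 1u << r;
}

/* Flush only the dirty registers to the VMCS before VM entry. */
static void sync_before_vmentry(struct reg_cache *c)
{
	for (int r = 0; r < NR_CACHED_REGS; r++) {
		if (c->dirty & (1u << r)) {
			c->vmcs[r] = c->regs[r];
			c->vmcs_writes++;
		}
	}
	c->dirty = 0;
}
```

The payoff is the same as on the read side: repeated emulator writes to, say, RIP cost one VMCS write per VM entry instead of one per write.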
* Re: [PATCH 3/3] KVM: Cache pdptrs
2009-06-02 9:09 ` Avi Kivity
2009-06-02 9:30 ` Joerg Roedel
@ 2009-06-02 11:50 ` Avi Kivity
1 sibling, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2009-06-02 11:50 UTC (permalink / raw)
To: Joerg Roedel; +Cc: kvm, Marcelo Tosatti, Sheng Yang
Avi Kivity wrote:
>>> +
>>> static void svm_set_vintr(struct vcpu_svm *svm)
>>> {
>>> svm->vmcb->control.intercept |= 1ULL << INTERCEPT_VINTR;
>>> @@ -2286,12 +2298,6 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>>> }
>>> vcpu->arch.cr0 = svm->vmcb->save.cr0;
>>> vcpu->arch.cr3 = svm->vmcb->save.cr3;
>>> - if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
>>> - if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
>>> - kvm_inject_gp(vcpu, 0);
>>> - return 1;
>>> - }
>>> - }
>>>
>>
>> ... as done here.
>
> That's a bug... luckily no guests trash their PDPTs after loading CR3.
>
> I guess I should fix in a separate patch to avoid mixing a bugfix with
> a feature.
>
That could cause a regression if we catch a guest preparing a new
pdpte. So I'll leave the code as is.
--
error compiling committee.c: too many arguments to function
end of thread, other threads:[~2009-06-02 11:50 UTC | newest]
Thread overview: 17+ messages
2009-06-01 13:22 [PATCH 0/3] Cache PDPTRs under ept/npt Avi Kivity
2009-06-01 13:22 ` [PATCH 1/3] KVM: VMX: Avoid duplicate ept tlb flush when setting cr3 Avi Kivity
2009-06-01 13:22 ` [PATCH 2/3] KVM: VMX: Simplify pdptr and cr3 management Avi Kivity
2009-06-02 9:22 ` Sheng Yang
2009-06-02 9:26 ` Avi Kivity
2009-06-02 9:30 ` Sheng Yang
2009-06-02 9:46 ` Avi Kivity
2009-06-02 9:56 ` Sheng Yang
2009-06-02 10:16 ` Avi Kivity
2009-06-02 11:31 ` Sheng Yang
2009-06-02 11:44 ` Avi Kivity
2009-06-01 13:22 ` [PATCH 3/3] KVM: Cache pdptrs Avi Kivity
2009-06-02 9:04 ` Joerg Roedel
2009-06-02 9:09 ` Avi Kivity
2009-06-02 9:30 ` Joerg Roedel
2009-06-02 9:44 ` Avi Kivity
2009-06-02 11:50 ` Avi Kivity