* [PATCH 0/3] Lazy fpu for svm/npt
@ 2010-01-07 12:15 Avi Kivity
2010-01-07 12:15 ` [PATCH 1/3] KVM: SVM: Fix SVM_CR0_SELECTIVE_MASK Avi Kivity
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Avi Kivity @ 2010-01-07 12:15 UTC (permalink / raw)
To: Marcelo Tosatti, Joerg Roedel; +Cc: kvm
This patchset (on top of the previous cr0 patchset) brings lazy fpu to npt.
For the cases where guest and host cr0 match (the majority) it will disable
intercepts for cr0.ts once the guest fpu is loaded, so the guest can do its
own lazy fpu without trapping.
Avi Kivity (3):
KVM: SVM: Fix SVM_CR0_SELECTIVE_MASK
KVM: SVM: Initialize fpu_active in init_vmcb()
KVM: SVM: Lazy fpu for npt
arch/x86/include/asm/svm.h | 2 +-
arch/x86/kvm/svm.c | 73 +++++++++++++++++++++----------------------
2 files changed, 37 insertions(+), 38 deletions(-)
* [PATCH 1/3] KVM: SVM: Fix SVM_CR0_SELECTIVE_MASK
2010-01-07 12:15 [PATCH 0/3] Lazy fpu for svm/npt Avi Kivity
@ 2010-01-07 12:15 ` Avi Kivity
2010-01-07 12:15 ` [PATCH 2/3] KVM: SVM: Initialize fpu_active in init_vmcb() Avi Kivity
2010-01-07 12:15 ` [PATCH 3/3] KVM: SVM: Lazy fpu for npt Avi Kivity
2 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2010-01-07 12:15 UTC (permalink / raw)
To: Marcelo Tosatti, Joerg Roedel; +Cc: kvm
Instead of selecting TS and MP as the comments say, the macro included TS and
PE. Luckily the macro is unused now, but fix it anyway to save a few hours of
debugging for anyone who attempts to use it.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/include/asm/svm.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 1fecb7e..38638cd 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -313,7 +313,7 @@ struct __attribute__ ((__packed__)) vmcb {
#define SVM_EXIT_ERR -1
-#define SVM_CR0_SELECTIVE_MASK (1 << 3 | 1) /* TS and MP */
+#define SVM_CR0_SELECTIVE_MASK (X86_CR0_TS | X86_CR0_MP)
#define SVM_VMLOAD ".byte 0x0f, 0x01, 0xda"
#define SVM_VMRUN ".byte 0x0f, 0x01, 0xd8"
--
1.6.5.3
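
As a quick check of the claim in the commit message above, the following sketch compares the old and new macro bodies against the architectural CR0 bit positions. The `CR0_*`, `OLD_SELECTIVE_MASK`, and `NEW_SELECTIVE_MASK` names and the `mask_covers_mp()` helper are illustrative stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* CR0 bit positions as defined by the x86 architecture. */
#define CR0_PE (UINT64_C(1) << 0) /* Protection Enable */
#define CR0_MP (UINT64_C(1) << 1) /* Monitor Coprocessor */
#define CR0_EM (UINT64_C(1) << 2) /* Emulation */
#define CR0_TS (UINT64_C(1) << 3) /* Task Switched */

/* Old and new macro bodies, renamed so they can be compared side by side. */
#define OLD_SELECTIVE_MASK (1 << 3 | 1)
#define NEW_SELECTIVE_MASK (CR0_TS | CR0_MP)

/* The old literal sets bits 3 and 0, which is TS and PE rather than TS and MP. */
static_assert(OLD_SELECTIVE_MASK == (CR0_TS | CR0_PE), "old mask is TS|PE");
static_assert(NEW_SELECTIVE_MASK == 0xa, "new mask is TS|MP");

/* Hypothetical helper: returns nonzero when a mask covers the MP bit. */
static int mask_covers_mp(uint64_t mask)
{
	return (mask & CR0_MP) != 0;
}
```

Calling `mask_covers_mp()` on the old value shows that MP was never part of the mask, which is the mismatch with the comment that the patch removes.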
* [PATCH 2/3] KVM: SVM: Initialize fpu_active in init_vmcb()
2010-01-07 12:15 [PATCH 0/3] Lazy fpu for svm/npt Avi Kivity
2010-01-07 12:15 ` [PATCH 1/3] KVM: SVM: Fix SVM_CR0_SELECTIVE_MASK Avi Kivity
@ 2010-01-07 12:15 ` Avi Kivity
2010-01-07 12:15 ` [PATCH 3/3] KVM: SVM: Lazy fpu for npt Avi Kivity
2 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2010-01-07 12:15 UTC (permalink / raw)
To: Marcelo Tosatti, Joerg Roedel; +Cc: kvm
init_vmcb() sets up the intercepts as if the fpu is active, so initialize it
there. This avoids an INIT from setting up intercepts inconsistent with
fpu_active.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/svm.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2a3890f..f4418e2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -540,6 +540,8 @@ static void init_vmcb(struct vcpu_svm *svm)
struct vmcb_control_area *control = &svm->vmcb->control;
struct vmcb_save_area *save = &svm->vmcb->save;
+ svm->vcpu.fpu_active = 1;
+
control->intercept_cr_read = INTERCEPT_CR0_MASK |
INTERCEPT_CR3_MASK |
INTERCEPT_CR4_MASK;
@@ -730,7 +732,6 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
init_vmcb(svm);
fx_init(&svm->vcpu);
- svm->vcpu.fpu_active = 1;
svm->vcpu.arch.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_bsp(&svm->vcpu))
svm->vcpu.arch.apic_base |= MSR_IA32_APICBASE_BSP;
--
1.6.5.3
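
The ordering problem described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct vcpu_model` and the `model_*` functions are hypothetical, and the model reduces "the intercepts" to a single #NM flag that is expected to be set exactly when the fpu is inactive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state involved; not the kernel's structures. */
struct vcpu_model {
	bool fpu_active;     /* guest fpu state is loaded */
	bool nm_intercepted; /* #NM intercept, expected only while the fpu is inactive */
};

/* Mirrors the patched init_vmcb(): flag and intercepts are reset together. */
static void model_init_vmcb(struct vcpu_model *v)
{
	v->fpu_active = true;
	v->nm_intercepted = false;
}

/* Mirrors the pre-patch behavior: intercepts are reset, the flag is left alone. */
static void model_init_vmcb_old(struct vcpu_model *v)
{
	v->nm_intercepted = false;
}

/* Simplified fpu deactivation: drop the fpu and start trapping #NM. */
static void model_fpu_deactivate(struct vcpu_model *v)
{
	v->fpu_active = false;
	v->nm_intercepted = true;
}

/* The invariant the patch protects. */
static bool model_consistent(const struct vcpu_model *v)
{
	return v->fpu_active != v->nm_intercepted;
}
```

In this model, a reset that follows a deactivation leaves the state inconsistent with the old initializer and consistent with the new one.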
* [PATCH 3/3] KVM: SVM: Lazy fpu for npt
2010-01-07 12:15 [PATCH 0/3] Lazy fpu for svm/npt Avi Kivity
2010-01-07 12:15 ` [PATCH 1/3] KVM: SVM: Fix SVM_CR0_SELECTIVE_MASK Avi Kivity
2010-01-07 12:15 ` [PATCH 2/3] KVM: SVM: Initialize fpu_active in init_vmcb() Avi Kivity
@ 2010-01-07 12:15 ` Avi Kivity
2010-01-07 16:52 ` Joerg Roedel
2 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2010-01-07 12:15 UTC (permalink / raw)
To: Marcelo Tosatti, Joerg Roedel; +Cc: kvm
If two conditions apply:
- no bits outside TS and EM differ between the host and guest cr0
- the fpu is active
then we can activate the selective cr0 write intercept and drop the
unconditional cr0 read and write intercept, and allow the guest to run
with the host fpu state. This reduces the heavyweight context switch
when npt is enabled.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/svm.c | 70 +++++++++++++++++++++++++--------------------------
1 files changed, 34 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f4418e2..7f3d890 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -571,6 +571,7 @@ static void init_vmcb(struct vcpu_svm *svm)
control->intercept = (1ULL << INTERCEPT_INTR) |
(1ULL << INTERCEPT_NMI) |
(1ULL << INTERCEPT_SMI) |
+ (1ULL << INTERCEPT_SELECTIVE_CR0) |
(1ULL << INTERCEPT_CPUID) |
(1ULL << INTERCEPT_INVD) |
(1ULL << INTERCEPT_HLT) |
@@ -643,10 +644,8 @@ static void init_vmcb(struct vcpu_svm *svm)
control->intercept &= ~((1ULL << INTERCEPT_TASK_SWITCH) |
(1ULL << INTERCEPT_INVLPG));
control->intercept_exceptions &= ~(1 << PF_VECTOR);
- control->intercept_cr_read &= ~(INTERCEPT_CR0_MASK|
- INTERCEPT_CR3_MASK);
- control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
- INTERCEPT_CR3_MASK);
+ control->intercept_cr_read &= ~INTERCEPT_CR3_MASK;
+ control->intercept_cr_write &= ~INTERCEPT_CR3_MASK;
save->g_pat = 0x0007040600070406ULL;
save->cr3 = 0;
save->cr4 = 0;
@@ -965,6 +964,27 @@ static void svm_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
{
}
+static void update_cr0_intercept(struct vcpu_svm *svm)
+{
+ ulong gcr0 = svm->vcpu.arch.cr0;
+ u64 *hcr0 = &svm->vmcb->save.cr0;
+
+ if (!svm->vcpu.fpu_active)
+ *hcr0 |= SVM_CR0_SELECTIVE_MASK;
+ else
+ *hcr0 = (*hcr0 & ~SVM_CR0_SELECTIVE_MASK)
+ | (gcr0 & SVM_CR0_SELECTIVE_MASK);
+
+
+ if (gcr0 == *hcr0 && svm->vcpu.fpu_active) {
+ svm->vmcb->control.intercept_cr_read &= ~INTERCEPT_CR0_MASK;
+ svm->vmcb->control.intercept_cr_write &= ~INTERCEPT_CR0_MASK;
+ } else {
+ svm->vmcb->control.intercept_cr_read |= INTERCEPT_CR0_MASK;
+ svm->vmcb->control.intercept_cr_write |= INTERCEPT_CR0_MASK;
+ }
+}
+
static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -982,12 +1002,11 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
}
}
#endif
- if (npt_enabled)
- goto set;
-
vcpu->arch.cr0 = cr0;
- cr0 |= X86_CR0_PG | X86_CR0_WP;
-set:
+
+ if (!npt_enabled)
+ cr0 |= X86_CR0_PG | X86_CR0_WP;
+
/*
* re-enable caching here because the QEMU bios
* does not do it - this results in some delay at
@@ -995,6 +1014,7 @@ set:
*/
cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
svm->vmcb->save.cr0 = cr0;
+ update_cr0_intercept(svm);
}
static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
@@ -1240,11 +1260,8 @@ static int ud_interception(struct vcpu_svm *svm)
static int nm_interception(struct vcpu_svm *svm)
{
svm->vmcb->control.intercept_exceptions &= ~(1 << NM_VECTOR);
- if (!kvm_read_cr0_bits(&svm->vcpu, X86_CR0_TS))
- svm->vmcb->save.cr0 &= ~X86_CR0_TS;
- else
- svm->vmcb->save.cr0 |= X86_CR0_TS;
svm->vcpu.fpu_active = 1;
+ update_cr0_intercept(svm);
return 1;
}
@@ -2297,7 +2314,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_READ_CR3] = emulate_on_interception,
[SVM_EXIT_READ_CR4] = emulate_on_interception,
[SVM_EXIT_READ_CR8] = emulate_on_interception,
- /* for now: */
+ [SVM_EXIT_CR0_SEL_WRITE] = emulate_on_interception,
[SVM_EXIT_WRITE_CR0] = emulate_on_interception,
[SVM_EXIT_WRITE_CR3] = emulate_on_interception,
[SVM_EXIT_WRITE_CR4] = emulate_on_interception,
@@ -2383,21 +2400,10 @@ static int handle_exit(struct kvm_vcpu *vcpu)
svm_complete_interrupts(svm);
- if (npt_enabled) {
- int mmu_reload = 0;
- if ((kvm_read_cr0_bits(vcpu, X86_CR0_PG) ^ svm->vmcb->save.cr0)
- & X86_CR0_PG) {
- svm_set_cr0(vcpu, svm->vmcb->save.cr0);
- mmu_reload = 1;
- }
+ if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR0_MASK))
vcpu->arch.cr0 = svm->vmcb->save.cr0;
+ if (npt_enabled)
vcpu->arch.cr3 = svm->vmcb->save.cr3;
- if (mmu_reload) {
- kvm_mmu_reset_context(vcpu);
- kvm_mmu_load(vcpu);
- }
- }
-
if (svm->vmcb->control.exit_code == SVM_EXIT_ERR) {
kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
@@ -2580,8 +2586,6 @@ static void svm_flush_tlb(struct kvm_vcpu *vcpu)
static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
{
- if (npt_enabled)
- vcpu->fpu_active = 1;
}
static inline void sync_cr8_to_lapic(struct kvm_vcpu *vcpu)
@@ -2920,14 +2924,8 @@ static void svm_fpu_deactivate(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- if (npt_enabled) {
- /* hack: npt requires active fpu at this time */
- vcpu->fpu_active = 1;
- return;
- }
-
+ update_cr0_intercept(svm);
svm->vmcb->control.intercept_exceptions |= 1 << NM_VECTOR;
- svm->vmcb->save.cr0 |= X86_CR0_TS;
}
static struct kvm_x86_ops svm_x86_ops = {
--
1.6.5.3
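
The decision made by update_cr0_intercept() in the patch above can be exercised outside the kernel with a small model. This is a sketch, not the kernel code: `struct cr0_model` and `model_update_cr0_intercept()` are hypothetical names, and the separate read and write intercept masks are collapsed into one boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_MP (UINT64_C(1) << 1)
#define CR0_TS (UINT64_C(1) << 3)
#define SELECTIVE_MASK (CR0_TS | CR0_MP)

static_assert(SELECTIVE_MASK == 0xa, "mask covers TS and MP only");

/* Hypothetical stand-in for the vcpu and vmcb fields the patch touches. */
struct cr0_model {
	uint64_t guest_cr0;  /* cr0 as the guest sees it (vcpu.arch.cr0) */
	uint64_t hw_cr0;     /* cr0 loaded by VMRUN (vmcb->save.cr0) */
	bool fpu_active;     /* guest fpu state is loaded */
	bool intercept_cr0;  /* unconditional cr0 read/write intercepts */
};

static void model_update_cr0_intercept(struct cr0_model *m)
{
	if (!m->fpu_active)
		/* Force TS and MP so the first guest fpu use raises #NM. */
		m->hw_cr0 |= SELECTIVE_MASK;
	else
		/* Let the guest's own TS and MP values take effect. */
		m->hw_cr0 = (m->hw_cr0 & ~SELECTIVE_MASK)
			  | (m->guest_cr0 & SELECTIVE_MASK);

	/* Drop the unconditional intercepts only when both views of cr0 agree. */
	m->intercept_cr0 = !(m->guest_cr0 == m->hw_cr0 && m->fpu_active);
}
```

With matching cr0 values and an active fpu the model clears the intercept. Deactivating the fpu, or forcing PG and WP into the hardware value as the non-npt path does, keeps it set.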
* Re: [PATCH 3/3] KVM: SVM: Lazy fpu for npt
2010-01-07 12:15 ` [PATCH 3/3] KVM: SVM: Lazy fpu for npt Avi Kivity
@ 2010-01-07 16:52 ` Joerg Roedel
2010-01-07 16:55 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Joerg Roedel @ 2010-01-07 16:52 UTC (permalink / raw)
To: Avi Kivity; +Cc: Marcelo Tosatti, Joerg Roedel, kvm
On Thu, Jan 07, 2010 at 02:15:44PM +0200, Avi Kivity wrote:
> If two conditions apply:
> - no bits outside TS and EM differ between the host and guest cr0
> - the fpu is active
>
> then we can activate the selective cr0 write intercept and drop the
> unconditional cr0 read and write intercept, and allow the guest to run
> with the host fpu state. This reduces the heavyweight context switch
> when npt is enabled.
> - if (npt_enabled) {
> - int mmu_reload = 0;
> - if ((kvm_read_cr0_bits(vcpu, X86_CR0_PG) ^ svm->vmcb->save.cr0)
> - & X86_CR0_PG) {
> - svm_set_cr0(vcpu, svm->vmcb->save.cr0);
> - mmu_reload = 1;
> - }
> + if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR0_MASK))
> vcpu->arch.cr0 = svm->vmcb->save.cr0;
> + if (npt_enabled)
> vcpu->arch.cr3 = svm->vmcb->save.cr3;
> - if (mmu_reload) {
> - kvm_mmu_reset_context(vcpu);
> - kvm_mmu_load(vcpu);
> - }
> - }
> -
Hmm, I think removing this hack is a separate issue. Should it be a
separate patch which enables cr0 intercept for npt and removes these
lines? It makes this change more clear in the logs.
Joerg
* Re: [PATCH 3/3] KVM: SVM: Lazy fpu for npt
2010-01-07 16:52 ` Joerg Roedel
@ 2010-01-07 16:55 ` Avi Kivity
0 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2010-01-07 16:55 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Marcelo Tosatti, Joerg Roedel, kvm
On 01/07/2010 06:52 PM, Joerg Roedel wrote:
> On Thu, Jan 07, 2010 at 02:15:44PM +0200, Avi Kivity wrote:
>
>> If two conditions apply:
>> - no bits outside TS and EM differ between the host and guest cr0
>> - the fpu is active
>>
>> then we can activate the selective cr0 write intercept and drop the
>> unconditional cr0 read and write intercept, and allow the guest to run
>> with the host fpu state. This reduces the heavyweight context switch
>> when npt is enabled.
>>
>
>
>> - if (npt_enabled) {
>> - int mmu_reload = 0;
>> - if ((kvm_read_cr0_bits(vcpu, X86_CR0_PG) ^ svm->vmcb->save.cr0)
>> - & X86_CR0_PG) {
>> - svm_set_cr0(vcpu, svm->vmcb->save.cr0);
>> - mmu_reload = 1;
>> - }
>> + if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR0_MASK))
>> vcpu->arch.cr0 = svm->vmcb->save.cr0;
>> + if (npt_enabled)
>> vcpu->arch.cr3 = svm->vmcb->save.cr3;
>> - if (mmu_reload) {
>> - kvm_mmu_reset_context(vcpu);
>> - kvm_mmu_load(vcpu);
>> - }
>> - }
>> -
>>
> Hmm, I think removing this hack is a separate issue. Should it be a
> separate patch which enables cr0 intercept for npt and removes these
> lines? It makes this change more clear in the logs.
>
Enabling cr0 intercept without enabling selective cr0 intercept will be
a massive performance regression, so performance-wise these two are tied
together. But I agree that it would make the patch easier to read; I'll try
to split it up.
--
error compiling committee.c: too many arguments to function