kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Important fixes for KVM-AMD
@ 2010-05-05 14:04 Joerg Roedel
  2010-05-05 14:04 ` [PATCH 1/5] KVM: X86: Fix stupid bug in exception reinjection path Joerg Roedel
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel

Hi Avi, Marcelo,

here is a set of patches which fix problems in kvm-amd. Patch 1 fixes a
stupid problem with the event-reinjection introduced by me in my
previous patchset.  Patch 2 was a helper to find the bug patch 3 fixes.
I kept it in the patchset because it may be helpful in the future to
debug other problems too. Patch 3 is the most important fix because it
makes kvm-amd on 32 bit hosts work again.  Without this patch the first
vmrum fails with exit-reason VMEXIT_INVALID. Patch 4 fixes the Xen 4.0
shipped with SLES11 in nested svm. The last patch in this series fixes a
potential l2-guest breakout scenario because it may be possible for the
l2-guest to issue hypercalls directly to the host if the l1-hypervisor
does not intercept VMMCALL.

Thanks,

	Joerg

Diffstat:

 arch/x86/include/asm/msr-index.h |    2 +
 arch/x86/kvm/svm.c               |  108 ++++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c               |    2 +-
 3 files changed, 106 insertions(+), 6 deletions(-)

Shortlog:

Joerg Roedel (5):
      KVM: X86: Fix stupid bug in exception reinjection path
      KVM: SVM: Dump vmcb contents on failed vmrun
      KVM: SVM: Fix wrong intercept masks on 32 bit
      KVM: SVM: Allow EFER.LMSLE to be set with nested svm
      KVM: SVM: Don't allow nested guest to VMMCALL into host

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] KVM: X86: Fix stupid bug in exception reinjection path
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
@ 2010-05-05 14:04 ` Joerg Roedel
  2010-05-05 14:04 ` [PATCH 2/5] KVM: SVM: Dump vmcb contents on failed vmrun Joerg Roedel
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

The patch merged recently which allowed to mark an exception
as reinjected has a bug as it always marks the exception as
reinjected. This breaks nested-svm shadow-on-shadow
implementation.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/x86.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b2ce1d..c83528e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -277,7 +277,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 		vcpu->arch.exception.has_error_code = has_error;
 		vcpu->arch.exception.nr = nr;
 		vcpu->arch.exception.error_code = error_code;
-		vcpu->arch.exception.reinject = true;
+		vcpu->arch.exception.reinject = reinject;
 		return;
 	}
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] KVM: SVM: Dump vmcb contents on failed vmrun
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
  2010-05-05 14:04 ` [PATCH 1/5] KVM: X86: Fix stupid bug in exception reinjection path Joerg Roedel
@ 2010-05-05 14:04 ` Joerg Roedel
  2010-05-05 14:04 ` [PATCH 3/5] KVM: SVM: Fix wrong intercept masks on 32 bit Joerg Roedel
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch adds a function to dump the vmcb into the kernel
log and calls it after a failed vmrun to ease debugging.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |   95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 889f660..0201b06 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2637,6 +2637,99 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
 	[SVM_EXIT_NPF]				= pf_interception,
 };
 
+void dump_vmcb(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb_control_area *control = &svm->vmcb->control;
+	struct vmcb_save_area *save = &svm->vmcb->save;
+
+	pr_err("VMCB Control Area:\n");
+	pr_err("cr_read:            %04x\n", control->intercept_cr_read);
+	pr_err("cr_write:           %04x\n", control->intercept_cr_write);
+	pr_err("dr_read:            %04x\n", control->intercept_dr_read);
+	pr_err("dr_write:           %04x\n", control->intercept_dr_write);
+	pr_err("exceptions:         %08x\n", control->intercept_exceptions);
+	pr_err("intercepts:         %016llx\n", control->intercept);
+	pr_err("pause filter count: %d\n", control->pause_filter_count);
+	pr_err("iopm_base_pa:       %016llx\n", control->iopm_base_pa);
+	pr_err("msrpm_base_pa:      %016llx\n", control->msrpm_base_pa);
+	pr_err("tsc_offset:         %016llx\n", control->tsc_offset);
+	pr_err("asid:               %d\n", control->asid);
+	pr_err("tlb_ctl:            %d\n", control->tlb_ctl);
+	pr_err("int_ctl:            %08x\n", control->int_ctl);
+	pr_err("int_vector:         %08x\n", control->int_vector);
+	pr_err("int_state:          %08x\n", control->int_state);
+	pr_err("exit_code:          %08x\n", control->exit_code);
+	pr_err("exit_info1:         %016llx\n", control->exit_info_1);
+	pr_err("exit_info2:         %016llx\n", control->exit_info_2);
+	pr_err("exit_int_info:      %08x\n", control->exit_int_info);
+	pr_err("exit_int_info_err:  %08x\n", control->exit_int_info_err);
+	pr_err("nested_ctl:         %lld\n", control->nested_ctl);
+	pr_err("nested_cr3:         %016llx\n", control->nested_cr3);
+	pr_err("event_inj:          %08x\n", control->event_inj);
+	pr_err("event_inj_err:      %08x\n", control->event_inj_err);
+	pr_err("lbr_ctl:            %lld\n", control->lbr_ctl);
+	pr_err("next_rip:           %016llx\n", control->next_rip);
+	pr_err("VMCB State Save Area:\n");
+	pr_err("es:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->es.selector, save->es.attrib,
+		save->es.limit, save->es.base);
+	pr_err("cs:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->cs.selector, save->cs.attrib,
+		save->cs.limit, save->cs.base);
+	pr_err("ss:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->ss.selector, save->ss.attrib,
+		save->ss.limit, save->ss.base);
+	pr_err("ds:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->ds.selector, save->ds.attrib,
+		save->ds.limit, save->ds.base);
+	pr_err("fs:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->fs.selector, save->fs.attrib,
+		save->fs.limit, save->fs.base);
+	pr_err("gs:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->gs.selector, save->gs.attrib,
+		save->gs.limit, save->gs.base);
+	pr_err("gdtr: s: %04x a: %04x l: %08x b: %016llx\n",
+		save->gdtr.selector, save->gdtr.attrib,
+		save->gdtr.limit, save->gdtr.base);
+	pr_err("ldtr: s: %04x a: %04x l: %08x b: %016llx\n",
+		save->ldtr.selector, save->ldtr.attrib,
+		save->ldtr.limit, save->ldtr.base);
+	pr_err("idtr: s: %04x a: %04x l: %08x b: %016llx\n",
+		save->idtr.selector, save->idtr.attrib,
+		save->idtr.limit, save->idtr.base);
+	pr_err("tr:   s: %04x a: %04x l: %08x b: %016llx\n",
+		save->tr.selector, save->tr.attrib,
+		save->tr.limit, save->tr.base);
+	pr_err("cpl:            %d                efer:         %016llx\n",
+		save->cpl, save->efer);
+	pr_err("cr0:            %016llx cr2:          %016llx\n",
+		save->cr0, save->cr2);
+	pr_err("cr3:            %016llx cr4:          %016llx\n",
+		save->cr3, save->cr4);
+	pr_err("dr6:            %016llx dr7:          %016llx\n",
+		save->dr6, save->dr7);
+	pr_err("rip:            %016llx rflags:       %016llx\n",
+		save->rip, save->rflags);
+	pr_err("rsp:            %016llx rax:          %016llx\n",
+		save->rsp, save->rax);
+	pr_err("star:           %016llx lstar:        %016llx\n",
+		save->star, save->lstar);
+	pr_err("cstar:          %016llx sfmask:       %016llx\n",
+		save->cstar, save->sfmask);
+	pr_err("kernel_gs_base: %016llx sysenter_cs:  %016llx\n",
+		save->kernel_gs_base, save->sysenter_cs);
+	pr_err("sysenter_esp:   %016llx sysenter_eip: %016llx\n",
+		save->sysenter_esp, save->sysenter_eip);
+	pr_err("gpat:           %016llx dbgctl:       %016llx\n",
+		save->g_pat, save->dbgctl);
+	pr_err("br_from:        %016llx br_to:        %016llx\n",
+		save->br_from, save->br_to);
+	pr_err("excp_from:      %016llx excp_to:      %016llx\n",
+		save->last_excp_from, save->last_excp_to);
+
+}
+
 static int handle_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -2681,6 +2774,8 @@ static int handle_exit(struct kvm_vcpu *vcpu)
 		kvm_run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		kvm_run->fail_entry.hardware_entry_failure_reason
 			= svm->vmcb->control.exit_code;
+		pr_err("KVM: FAILED VMRUN WITH VMCB:\n");
+		dump_vmcb(vcpu);
 		return 0;
 	}
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] KVM: SVM: Fix wrong intercept masks on 32 bit
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
  2010-05-05 14:04 ` [PATCH 1/5] KVM: X86: Fix stupid bug in exception reinjection path Joerg Roedel
  2010-05-05 14:04 ` [PATCH 2/5] KVM: SVM: Dump vmcb contents on failed vmrun Joerg Roedel
@ 2010-05-05 14:04 ` Joerg Roedel
  2010-05-05 14:04 ` [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm Joerg Roedel
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Joerg Roedel, Jan Kiszka, Gleb Natapov, stable

This patch makes KVM on 32 bit SVM working again by
correcting the masks used for iret interception. With the
wrong masks the upper 32 bits of the intercepts are masked
out which leaves vmrun unintercepted. This is not legal on
svm and the vmrun fails.
Bug was introduced by commits 95ba827313 and 3cfc3092.

Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0201b06..74f7b9d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2290,7 +2290,7 @@ static int cpuid_interception(struct vcpu_svm *svm)
 static int iret_interception(struct vcpu_svm *svm)
 {
 	++svm->vcpu.stat.nmi_window_exits;
-	svm->vmcb->control.intercept &= ~(1UL << INTERCEPT_IRET);
+	svm->vmcb->control.intercept &= ~(1ULL << INTERCEPT_IRET);
 	svm->vcpu.arch.hflags |= HF_IRET_MASK;
 	return 1;
 }
@@ -2824,7 +2824,7 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu)
 
 	svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
 	vcpu->arch.hflags |= HF_NMI_MASK;
-	svm->vmcb->control.intercept |= (1UL << INTERCEPT_IRET);
+	svm->vmcb->control.intercept |= (1ULL << INTERCEPT_IRET);
 	++vcpu->stat.nmi_injections;
 }
 
@@ -2891,10 +2891,10 @@ static void svm_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 
 	if (masked) {
 		svm->vcpu.arch.hflags |= HF_NMI_MASK;
-		svm->vmcb->control.intercept |= (1UL << INTERCEPT_IRET);
+		svm->vmcb->control.intercept |= (1ULL << INTERCEPT_IRET);
 	} else {
 		svm->vcpu.arch.hflags &= ~HF_NMI_MASK;
-		svm->vmcb->control.intercept &= ~(1UL << INTERCEPT_IRET);
+		svm->vmcb->control.intercept &= ~(1ULL << INTERCEPT_IRET);
 	}
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
                   ` (2 preceding siblings ...)
  2010-05-05 14:04 ` [PATCH 3/5] KVM: SVM: Fix wrong intercept masks on 32 bit Joerg Roedel
@ 2010-05-05 14:04 ` Joerg Roedel
  2010-05-05 14:46   ` Avi Kivity
  2010-05-05 14:04 ` [PATCH 5/5] KVM: SVM: Don't allow nested guest to VMMCALL into host Joerg Roedel
  2010-05-06  8:51 ` [PATCH 0/5] Important fixes for KVM-AMD Avi Kivity
  5 siblings, 1 reply; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch enables setting of efer bit 13 which is allowed
in all SVM capable processors. This is necessary for the
SLES11 version of Xen 4.0 to boot with nested svm.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/msr-index.h |    2 ++
 arch/x86/kvm/svm.c               |    2 +-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index bc473ac..352767d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -20,6 +20,7 @@
 #define _EFER_LMA		10 /* Long mode active (read-only) */
 #define _EFER_NX		11 /* No execute enable */
 #define _EFER_SVME		12 /* Enable virtualization */
+#define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -27,6 +28,7 @@
 #define EFER_LMA		(1<<_EFER_LMA)
 #define EFER_NX			(1<<_EFER_NX)
 #define EFER_SVME		(1<<_EFER_SVME)
+#define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
 
 /* Intel MSRs. Some also available on other CPUs */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 74f7b9d..bc087c7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -610,7 +610,7 @@ static __init int svm_hardware_setup(void)
 
 	if (nested) {
 		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
-		kvm_enable_efer_bits(EFER_SVME);
+		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
 	}
 
 	for_each_possible_cpu(cpu) {
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] KVM: SVM: Don't allow nested guest to VMMCALL into host
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
                   ` (3 preceding siblings ...)
  2010-05-05 14:04 ` [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm Joerg Roedel
@ 2010-05-05 14:04 ` Joerg Roedel
  2010-05-06  8:51 ` [PATCH 0/5] Important fixes for KVM-AMD Avi Kivity
  5 siblings, 0 replies; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 14:04 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Joerg Roedel

This patch disables the possibility for a l2-guest to do a
VMMCALL directly into the host. This would happen if the
l1-hypervisor doesn't intercept VMMCALL and the l2-guest
executes this instruction.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index bc087c7..2e9b57a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2036,6 +2036,9 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 		svm->vmcb->control.intercept_cr_write &= ~INTERCEPT_CR8_MASK;
 	}
 
+	/* We don't want to see VMMCALLs from a nested guest */
+	svm->vmcb->control.intercept &= ~(1ULL << INTERCEPT_VMMCALL);
+
 	/*
 	 * We don't want a nested guest to be more powerful than the guest, so
 	 * all intercepts are ORed
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 14:04 ` [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm Joerg Roedel
@ 2010-05-05 14:46   ` Avi Kivity
  2010-05-05 15:04     ` Joerg Roedel
  2010-05-05 20:57     ` Andre Przywara
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-05-05 14:46 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 05/05/2010 05:04 PM, Joerg Roedel wrote:
> This patch enables setting of efer bit 13 which is allowed
> in all SVM capable processors. This is necessary for the
> SLES11 version of Xen 4.0 to boot with nested svm.
>    

Interesting, why does it require it?

Obviously it isn't needed since it manages to run on Intel without it.

>   /* Intel MSRs. Some also available on other CPUs */
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 74f7b9d..bc087c7 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -610,7 +610,7 @@ static __init int svm_hardware_setup(void)
>
>   	if (nested) {
>   		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
> -		kvm_enable_efer_bits(EFER_SVME);
> +		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
>   	}
>
>   	for_each_possible_cpu(cpu) {
>    

What if the host doesn't have it?

Why enable it only for the nested case?  It's not svm specific (it's 
useful for running non-hvm Xen in non-nested mode).

Isn't there a cpuid bit for it?  If so, it should be exposed to 
userspace, and the feature should depend on it.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 14:46   ` Avi Kivity
@ 2010-05-05 15:04     ` Joerg Roedel
  2010-05-05 15:06       ` Avi Kivity
  2010-05-05 20:57     ` Andre Przywara
  1 sibling, 1 reply; 15+ messages in thread
From: Joerg Roedel @ 2010-05-05 15:04 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Joerg Roedel, Marcelo Tosatti, kvm, linux-kernel

On Wed, May 05, 2010 at 05:46:59PM +0300, Avi Kivity wrote:
> On 05/05/2010 05:04 PM, Joerg Roedel wrote:
>> This patch enables setting of efer bit 13 which is allowed
>> in all SVM capable processors. This is necessary for the
>> SLES11 version of Xen 4.0 to boot with nested svm.
>>    
>
> Interesting, why does it require it?

I don't know. I traced the Xen crash down and found that is gets a #GP
because it tries to set this bit.

> Obviously it isn't needed since it manages to run on Intel without it.

I have heard inofficial statements that they set this bit to provide the
functionality to their guests. And Xen sets this bit together with the
SVM bit.

>>   /* Intel MSRs. Some also available on other CPUs */
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 74f7b9d..bc087c7 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -610,7 +610,7 @@ static __init int svm_hardware_setup(void)
>>
>>   	if (nested) {
>>   		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
>> -		kvm_enable_efer_bits(EFER_SVME);
>> +		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
>>   	}
>>
>>   	for_each_possible_cpu(cpu) {
>>    
>
> What if the host doesn't have it?

It is present in all SVM capable AMD processors.

> Why enable it only for the nested case?  It's not svm specific (it's  
> useful for running non-hvm Xen in non-nested mode).

Because there is no cpuid bit for this feature. You can roughly check
for it using the svm cpuid bit.


	Joerg


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 15:04     ` Joerg Roedel
@ 2010-05-05 15:06       ` Avi Kivity
  2010-05-05 15:14         ` Avi Kivity
  2010-05-05 15:16         ` Roedel, Joerg
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-05-05 15:06 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Joerg Roedel, Marcelo Tosatti, kvm, linux-kernel

On 05/05/2010 06:04 PM, Joerg Roedel wrote:
> Because there is no cpuid bit for this feature.

That is sad.

> You can roughly check
> for it using the svm cpuid bit.
>    

Doesn't it kill cross-vendor migration?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 15:06       ` Avi Kivity
@ 2010-05-05 15:14         ` Avi Kivity
  2010-05-05 15:16         ` Roedel, Joerg
  1 sibling, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-05-05 15:14 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Joerg Roedel, Marcelo Tosatti, kvm, linux-kernel

On 05/05/2010 06:06 PM, Avi Kivity wrote:
>
>> You can roughly check
>> for it using the svm cpuid bit.
>
> Doesn't it kill cross-vendor migration?
>

Oh, the fact that it's in if (nested) protects against that (at least 
until Intel implements EFER.SVME.  So the patch is good.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 15:06       ` Avi Kivity
  2010-05-05 15:14         ` Avi Kivity
@ 2010-05-05 15:16         ` Roedel, Joerg
  1 sibling, 0 replies; 15+ messages in thread
From: Roedel, Joerg @ 2010-05-05 15:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Joerg Roedel, Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Wed, May 05, 2010 at 11:06:59AM -0400, Avi Kivity wrote:
> On 05/05/2010 06:04 PM, Joerg Roedel wrote:
> > You can roughly check
> > for it using the svm cpuid bit.
> 
> Doesn't it kill cross-vendor migration?

Enabling Nested SVM kills it anyway, so this is not an issue. AFAIK the
feature is not present on Intel CPUs.

	Joerg

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 14:46   ` Avi Kivity
  2010-05-05 15:04     ` Joerg Roedel
@ 2010-05-05 20:57     ` Andre Przywara
  2010-05-06  9:38       ` Roedel, Joerg
  1 sibling, 1 reply; 15+ messages in thread
From: Andre Przywara @ 2010-05-05 20:57 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Roedel, Joerg, Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

Avi Kivity wrote:
> On 05/05/2010 05:04 PM, Joerg Roedel wrote:
>> This patch enables setting of efer bit 13 which is allowed
>> in all SVM capable processors. This is necessary for the
>> SLES11 version of Xen 4.0 to boot with nested svm.
>>    
> 
> Interesting, why does it require it?
It does not. Best is you check the patch, which has just been posted 
yesterday on xen-devel for upstream inclusion:
http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00096.html
Basically it _tries_ to set the bit via safe_wrmsr() to detect whether 
its legal (in replacement of the missing CPUID check). Afterwards it 
resets it. If it available, it allows _guests_ to enable it. AFAIK the 
only user of this feature is VMware with binary translation running 
64bit guests, as they rely on segment limits to protect their hypervisor 
code from being read from the guest.
If I understood this correctly, there is a bug somewhere, maybe even in 
KVM's nested SVM implementation. Xen is fine with this bit-set provoking 
a #GP. I haven't had time yet to further investigate this, though.

> Obviously it isn't needed since it manages to run on Intel without it.
VMware's binary translation relies on VMX for running 64bit guests, on 
AMD you don't need SVM if you had a K8RevE (dual core 90nm) with this 
feature. In fact you should not be able to run VMware with 64bit guests 
inside a KVM guest on an Intel box (without nested VMX, that is).
On AMD you can either use nested SVM or this feature.

Regards,
Andre.

> 
>>   /* Intel MSRs. Some also available on other CPUs */
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 74f7b9d..bc087c7 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -610,7 +610,7 @@ static __init int svm_hardware_setup(void)
>>
>>   	if (nested) {
>>   		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
>> -		kvm_enable_efer_bits(EFER_SVME);
>> +		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
>>   	}
>>
>>   	for_each_possible_cpu(cpu) {
>>    
> 
> What if the host doesn't have it?
> 
> Why enable it only for the nested case?  It's not svm specific (it's 
> useful for running non-hvm Xen in non-nested mode).
> 
> Isn't there a cpuid bit for it?  If so, it should be exposed to 
> userspace, and the feature should depend on it.
> 

-- 
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] Important fixes for KVM-AMD
  2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
                   ` (4 preceding siblings ...)
  2010-05-05 14:04 ` [PATCH 5/5] KVM: SVM: Don't allow nested guest to VMMCALL into host Joerg Roedel
@ 2010-05-06  8:51 ` Avi Kivity
  5 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-05-06  8:51 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel

On 05/05/2010 05:04 PM, Joerg Roedel wrote:
> Hi Avi, Marcelo,
>
> here is a set of patches which fix problems in kvm-amd. Patch 1 fixes a
> stupid problem with the event-reinjection introduced by me in my
> previous patchset.  Patch 2 was a helper to find the bug patch 3 fixes.
> I kept it in the patchset because it may be helpful in the future to
> debug other problems too. Patch 3 is the most important fix because it
> makes kvm-amd on 32 bit hosts work again.  Without this patch the first
> vmrum fails with exit-reason VMEXIT_INVALID. Patch 4 fixes the Xen 4.0
> shipped with SLES11 in nested svm. The last patch in this series fixes a
> potential l2-guest breakout scenario because it may be possible for the
> l2-guest to issue hypercalls directly to the host if the l1-hypervisor
> does not intercept VMMCALL.
>    

All applied, thanks.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-05 20:57     ` Andre Przywara
@ 2010-05-06  9:38       ` Roedel, Joerg
  2010-05-06 11:42         ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Roedel, Joerg @ 2010-05-06  9:38 UTC (permalink / raw)
  To: Przywara, Andre
  Cc: Avi Kivity, Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Wed, May 05, 2010 at 04:57:00PM -0400, Przywara, Andre wrote:

> If I understood this correctly, there is a bug somewhere, maybe even in 
> KVM's nested SVM implementation. Xen is fine with this bit-set provoking 
> a #GP. I haven't had time yet to further investigate this, though.

Ok, I looked at this again and reproduced the traces I already deleted
and fetched the Xen crash message and found something I missed before.
The relevant part of the KVM trace is:

 qemu-system-x86-7364  [012]   790.715351: kvm_exit: reason msr rip 0xffff82c4801b5c93
 qemu-system-x86-7364  [012]   790.715352: kvm_msr: msr_write c0000080 = 0x3d01
 qemu-system-x86-7364  [012]   790.715354: kvm_inj_exception: #GP (0x0)

And the Xen-Crash message is:

(XEN) Xen call trace:
(XEN)    [<ffff82c4801b5c95>] svm_cpu_up+0x135/0x200
(XEN)    [<ffff82c4801b5d9c>] start_svm+0x3c/0xe0
(XEN)    [<ffff82c4801948b2>] identify_cpu+0xd2/0x240
(XEN)    [<ffff82c480252c6b>] __start_xen+0x1dbb/0x3660
(XEN)    [<ffff82c4801000b5>] __high_start+0xa1/0xa3
(XEN)    
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=0000]
(XEN) ****************************************

The MSR write happens on rip 0xffff82c4801b5c93 while the #GP is
injected at rip ffff82c4801b5c95 (== right after the wrmsr instruction).
So yes, there is another bug in KVM here. The problem is that the
set_efer function does not report write errors to ist caller and injects
the #GP directly. The svm:wrmsr_interception recognizes a success and
advances the rip.
The attached patch fixes this.

>From e0d69cf7a396d35ae9aa4778e87f82c243bfa0ae Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joerg.roedel@amd.com>
Date: Thu, 6 May 2010 11:07:46 +0200
Subject: [PATCH] KVM: X86: Inject #GP with the right rip on efer writes

This patch fixes a bug in the KVM efer-msr write path. If a
guest writes to a reserved efer bit the set_efer function
injects the #GP directly. The architecture dependent wrmsr
function does not see this, assumes success and advances the
rip. This results in a #GP in the guest with the wrong rip.
This patch fixes this by reporting efer write errors back to
the architectural wrmsr function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/x86.c |   31 ++++++++++++-------------------
 1 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c83528e..5bd7b30 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -683,37 +683,29 @@ static u32 emulated_msrs[] = {
 	MSR_IA32_MISC_ENABLE,
 };
 
-static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
+static int set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
-	if (efer & efer_reserved_bits) {
-		kvm_inject_gp(vcpu, 0);
-		return;
-	}
+	if (efer & efer_reserved_bits)
+		return 1;
 
 	if (is_paging(vcpu)
-	    && (vcpu->arch.efer & EFER_LME) != (efer & EFER_LME)) {
-		kvm_inject_gp(vcpu, 0);
-		return;
-	}
+	    && (vcpu->arch.efer & EFER_LME) != (efer & EFER_LME))
+		return 1;
 
 	if (efer & EFER_FFXSR) {
 		struct kvm_cpuid_entry2 *feat;
 
 		feat = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
-		if (!feat || !(feat->edx & bit(X86_FEATURE_FXSR_OPT))) {
-			kvm_inject_gp(vcpu, 0);
-			return;
-		}
+		if (!feat || !(feat->edx & bit(X86_FEATURE_FXSR_OPT)))
+			return 1;
 	}
 
 	if (efer & EFER_SVME) {
 		struct kvm_cpuid_entry2 *feat;
 
 		feat = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
-		if (!feat || !(feat->ecx & bit(X86_FEATURE_SVM))) {
-			kvm_inject_gp(vcpu, 0);
-			return;
-		}
+		if (!feat || !(feat->ecx & bit(X86_FEATURE_SVM)))
+			return 1;
 	}
 
 	kvm_x86_ops->set_efer(vcpu, efer);
@@ -725,6 +717,8 @@ static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
 
 	vcpu->arch.mmu.base_role.nxe = (efer & EFER_NX) && !tdp_enabled;
 	kvm_mmu_reset_context(vcpu);
+
+	return 0;
 }
 
 void kvm_enable_efer_bits(u64 mask)
@@ -1145,8 +1139,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
 	switch (msr) {
 	case MSR_EFER:
-		set_efer(vcpu, data);
-		break;
+		return set_efer(vcpu, data);
 	case MSR_K7_HWCR:
 		data &= ~(u64)0x40;	/* ignore flush filter disable */
 		data &= ~(u64)0x100;	/* ignore ignne emulation enable */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm
  2010-05-06  9:38       ` Roedel, Joerg
@ 2010-05-06 11:42         ` Avi Kivity
  0 siblings, 0 replies; 15+ messages in thread
From: Avi Kivity @ 2010-05-06 11:42 UTC (permalink / raw)
  To: Roedel, Joerg
  Cc: Przywara, Andre, Marcelo Tosatti, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org

On 05/06/2010 12:38 PM, Roedel, Joerg wrote:
> Subject: [PATCH] KVM: X86: Inject #GP with the right rip on efer writes
> This patch fixes a bug in the KVM efer-msr write path. If a
> guest writes to a reserved efer bit the set_efer function
> injects the #GP directly. The architecture dependent wrmsr
> function does not see this, assumes success and advances the
> rip. This results in a #GP in the guest with the wrong rip.
> This patch fixes this by reporting efer write errors back to
> the architectural wrmsr function.
>    

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2010-05-06 11:42 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-05 14:04 [PATCH 0/5] Important fixes for KVM-AMD Joerg Roedel
2010-05-05 14:04 ` [PATCH 1/5] KVM: X86: Fix stupid bug in exception reinjection path Joerg Roedel
2010-05-05 14:04 ` [PATCH 2/5] KVM: SVM: Dump vmcb contents on failed vmrun Joerg Roedel
2010-05-05 14:04 ` [PATCH 3/5] KVM: SVM: Fix wrong intercept masks on 32 bit Joerg Roedel
2010-05-05 14:04 ` [PATCH 4/5] KVM: SVM: Allow EFER.LMSLE to be set with nested svm Joerg Roedel
2010-05-05 14:46   ` Avi Kivity
2010-05-05 15:04     ` Joerg Roedel
2010-05-05 15:06       ` Avi Kivity
2010-05-05 15:14         ` Avi Kivity
2010-05-05 15:16         ` Roedel, Joerg
2010-05-05 20:57     ` Andre Przywara
2010-05-06  9:38       ` Roedel, Joerg
2010-05-06 11:42         ` Avi Kivity
2010-05-05 14:04 ` [PATCH 5/5] KVM: SVM: Don't allow nested guest to VMMCALL into host Joerg Roedel
2010-05-06  8:51 ` [PATCH 0/5] Important fixes for KVM-AMD Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).