From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [PATCH 10/11] VMX: work around lacking VNMI support
Date: Wed, 24 Sep 2008 14:56:40 +0200
Message-ID: <48DA3908.2000204@siemens.com>
References: <48D74CE6.5060008@siemens.com> <200809231742.03316.sheng.yang@intel.com> <20080923094544.GE3072@minantech.com> <200809231750.49882.sheng.yang@intel.com> <48DA3532.9040306@siemens.com> <20080924125057.GF3072@minantech.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: "Yang, Sheng" , kvm-devel , Avi Kivity
To: Gleb Natapov
Return-path:
Received: from gecko.sbs.de ([194.138.37.40]:17922 "EHLO gecko.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751437AbYIXM5A (ORCPT ); Wed, 24 Sep 2008 08:57:00 -0400
In-Reply-To: <20080924125057.GF3072@minantech.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Gleb Natapov wrote:
> On Wed, Sep 24, 2008 at 02:40:18PM +0200, Jan Kiszka wrote:
>> Yang, Sheng wrote:
>>> On Tuesday 23 September 2008 17:45:44 Gleb Natapov wrote:
>>>> On Tue, Sep 23, 2008 at 05:42:02PM +0800, Yang, Sheng wrote:
>>>>>>> That is exactly what I am using. Run it with SMP hal and do
>>>>>>> hibernate.
>>>>>> Oh... Finally found how to enable that hibernate option....
>>>>>>
>>>>>> And this hibernate works on my virtual_nmi supported box, with smp hal
>>>>>> and 2 cpus.
>>>>> However, for this hibernate won't success if there is no NMI support,
>>>>> maybe we can say it's not a "regression"...
>>>> I am not saying it's a regression, but it would be nice to have it
>>>> working :)
>>>>
>>> Yeah, of course. :)
>> OK, I've a 2003 server up and running now, I'm able to reproduce
>>
>> kvm_handle_exit: unexpected, valid vectoring info and exit reason is 0x9
>>
>> but not via hibernate (it suspends and powers off normally, but then
>> hangs after resume), rather by manually injecting an NMI on CPU0.
>>
> I found out today that on regular windows 2003 the problem does not
> exist (on hibernate at least). The image I have was used to run WLK
> tests (windows logo kit) and this kit changes something in windows
> kernel to do additional stuff after hibernation and that is where we
> crash.

Ahh!

>
>> After Windows' graphical installation phase I had a hanging guest. At
>> the same time I got
>>
>> kvm_handle_exit: unexpected, valid vectoring info and exit reason is 0x9
>> kvm_handle_exit: Breaking out of NMI-blocked state on VCPU 0 after 1 s
>> timeout
>>
>> in the kernel log as well. Something is borken. Will retest the
>> installation with vanilla KVM. Anyone any ideas on the task switch
>> thing? Just a false positive or an indication for the real problem in
>> that domain?
>>
> Nothing is broken IMO. The IDT entry for NMI is set up as task gate so
> we get a task switch exit after NMI injection.

Yes, that's what I see here now as well.

>
> We should do something like this:
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 046a91b..860e66d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2826,10 +2826,20 @@ static int handle_task_switch(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>  	unsigned long exit_qualification;
>  	u16 tss_selector;
>  	int reason;
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>
>  	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>
>  	reason = (u32)exit_qualification >> 30;
> +
> +	if (reason == TASK_SWITCH_GATE && vmx->vcpu.arch.nmi_injected &&
> +	    (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
> +	    (vmx->idt_vectoring_info & VECTORING_INFO_TYPE_MASK) == INTR_TYPE_NMI_INTR) {
> +		vcpu->arch.nmi_injected = false;
> +		vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
> +			      GUEST_INTR_STATE_NMI);
> +		printk(KERN_DEBUG"NMI cause task switch. No need to reinject\n");
> +	}

OK, I just think we are not supposed to set GUEST_INTR_STATE_NMI without
cpu_has_virtual_nmis(). Otherwise looks reasonable.
Have you tested this? Does it make your 2003 power-off?

>  	tss_selector = exit_qualification;
>
>  	return kvm_task_switch(vcpu, tss_selector, reason);
> @@ -3002,7 +3012,8 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
>
>  	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
>  			(exit_reason != EXIT_REASON_EXCEPTION_NMI &&
> -			exit_reason != EXIT_REASON_EPT_VIOLATION))
> +			exit_reason != EXIT_REASON_EPT_VIOLATION &&
> +			exit_reason != EXIT_REASON_TASK_SWITCH))
>  		printk(KERN_WARNING "%s: unexpected, valid vectoring info and "
>  		       "exit reason is 0x%x\n", __func__, exit_reason);

Dumping the vectoring info here as well would have accelerated the
debugging. I think we should add this.

>  	if (exit_reason < kvm_vmx_max_exit_handlers
>
> --
> 			Gleb.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
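
For reference, the condition discussed in this mail can be modelled in
user space. This is only a rough sketch, not the patch itself. The mask
values restate the VMCS field layouts as documented in the Intel SDM
instead of including the kernel headers. may_set_nmi_blocking() and
format_unexpected_exit() are hypothetical helpers that only illustrate
the two review remarks (gate the blocking bit on virtual NMI support,
and dump the vectoring info in the warning). They do not exist in KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Field layouts restated from the Intel SDM (assumption: not taken from
 * the kernel headers, so double-check against arch/x86/kvm/vmx.h). */
#define VECTORING_INFO_VALID_MASK 0x80000000u /* bit 31: info is valid */
#define VECTORING_INFO_TYPE_MASK  0x00000700u /* bits 10:8: event type */
#define INTR_TYPE_NMI_INTR        (2u << 8)   /* event type 2 = NMI */
#define TASK_SWITCH_GATE          3           /* switch via IDT task gate */

/* Bits 31:30 of the exit qualification give the task switch source. */
static int task_switch_reason(uint64_t exit_qualification)
{
	return (int)(((uint32_t)exit_qualification >> 30) & 3);
}

/* Bits 15:0 of the exit qualification hold the new TSS selector. */
static uint16_t task_switch_selector(uint64_t exit_qualification)
{
	return (uint16_t)exit_qualification;
}

/* True if the task switch exit was triggered by delivering an injected
 * NMI through a task gate, i.e. the NMI needs no reinjection. */
static bool nmi_caused_task_switch(uint64_t exit_qualification,
				   uint32_t idt_vectoring_info,
				   bool nmi_injected)
{
	return task_switch_reason(exit_qualification) == TASK_SWITCH_GATE &&
	       nmi_injected &&
	       (idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
	       (idt_vectoring_info & VECTORING_INFO_TYPE_MASK) ==
		       INTR_TYPE_NMI_INTR;
}

/* Hypothetical: only touch the NMI blocking bit when the CPU supports
 * virtual NMIs, as suggested in the review comment. */
static bool may_set_nmi_blocking(bool nmi_task_switch, bool has_virtual_nmis)
{
	return nmi_task_switch && has_virtual_nmis;
}

/* Hypothetical: warning text that also dumps the vectoring info. */
static int format_unexpected_exit(char *buf, size_t len,
				  uint32_t vectoring_info,
				  uint32_t exit_reason)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"unexpected, valid vectoring info (0x%x) and "
			"exit reason is 0x%x",
			vectoring_info, exit_reason);
}
```

With an exit qualification of 0xC0000020 and vectoring info 0x80000202
(valid, type NMI, vector 2), nmi_caused_task_switch() returns true as
long as an NMI was marked as injected.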