From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [PATCH 10/11] VMX: work around lacking VNMI support
Date: Tue, 23 Sep 2008 17:15:01 +0200
Message-ID: <48D907F5.2000401@siemens.com>
References: <48D74CE6.5060008@siemens.com> <48D8AF84.3020707@siemens.com> <20080923090021.GB3072@minantech.com> <200809231708.09617.sheng.yang@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Gleb Natapov, kvm-devel, Avi Kivity
To: "Yang, Sheng"
Return-path:
Received: from gecko.sbs.de ([194.138.37.40]:16721 "EHLO gecko.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752040AbYIWPPV (ORCPT ); Tue, 23 Sep 2008 11:15:21 -0400
In-Reply-To: <200809231708.09617.sheng.yang@intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Yang, Sheng wrote:
> On Tuesday 23 September 2008 17:00:21 Gleb Natapov wrote:
>> On Tue, Sep 23, 2008 at 10:57:40AM +0200, Jan Kiszka wrote:
>>> Gleb Natapov wrote:
>>>> On Tue, Sep 23, 2008 at 10:46:38AM +0200, Jan Kiszka wrote:
>>>>> Gleb Natapov wrote:
>>>>>> On Mon, Sep 22, 2008 at 09:59:07AM +0200, Jan Kiszka wrote:
>>>>>>> @@ -2356,6 +2384,19 @@ static void vmx_inject_nmi(struct kvm_vc
>>>>>>>  {
>>>>>>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>>>>>>
>>>>>>> +	if (!cpu_has_virtual_nmis()) {
>>>>>>> +		/*
>>>>>>> +		 * Tracking the NMI-blocked state in software is built upon
>>>>>>> +		 * finding the next open IRQ window. This, in turn, depends on
>>>>>>> +		 * well-behaving guests: They have to keep IRQs disabled at
>>>>>>> +		 * least as long as the NMI handler runs. Otherwise we may
>>>>>>> +		 * cause NMI nesting, maybe breaking the guest. But as this is
>>>>>>> +		 * highly unlikely, we can live with the residual risk.
>>>>>>> +		 */
>>>>>>> +		vmx->soft_vnmi_blocked = 1;
>>>>>>> +		vmx->vnmi_blocked_time = 0;
>>>>>>> +	}
>>>>>>> +
>>>>>> We still get here with vmx->soft_vnmi_blocked = 1. Trying to find out
>>>>>> how.
>>>>> We should only come along here with vnmi blocked on reinjection (after
>>>>> a fault on calling the handler).
>>>> I see that nmi_injected is never cleared and it is check before calling
>>>> vmx_inject_nmi();
>>> That should happen in vmx_complete_interrupts, but only if the exit
>>> takes place after the NMI has been successfully delivered to the guest
>>> (which is not the case if invoking the handler raises an exception). So
>>> far for the theory...
>> Okey, I have this one in dmesg:
>> kvm_handle_exit: unexpected, valid vectoring info and exit reason is 0x9
>>
> Oh... Another task switch issue...

Maybe that pending vector is #2, the NMI that is supposed to trigger the
task switch?

>
> I think it's may not be a issue import by this patchset? Seems need more
> debug...
>
> The patchset is OK for me, except I don't know when we would need that timeout
> one (buggy guest?...), and we may also root cause this issue or ensure that
> it's not a regression.

The timeout is indeed for buggy guests:

	disable_irqs();
	spin_endlessly();

Linux, e.g., needs more than one watchdog NMI over this code to detect
that there is a lock-up. With soft-VNMIs plus their timeouts, this
detection will take longer than on real hardware, but it will still
work. And one second is long enough to practically avoid breaking into a
running NMI handler (unless the guest is totally screwed and spins
inside that handler).

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
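
For illustration, the soft-VNMI bookkeeping discussed above could be
modelled roughly as follows. This is a user-space sketch under stated
assumptions, not the patch code: only the field names soft_vnmi_blocked
and vnmi_blocked_time appear in the quoted hunk, and the one-second limit
comes from the mail. The struct, the helper names, and the parameters
(irq_window_open, guest_ns, nmi_pending) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; only the two field names come from the quoted hunk. */
struct soft_vnmi {
	bool soft_vnmi_blocked;     /* software "NMI blocked" flag */
	int64_t vnmi_blocked_time;  /* guest time spent while blocked, in ns */
};

/* "One second", as stated in the mail; the constant name is an assumption. */
#define SOFT_VNMI_TIMEOUT_NS 1000000000LL

/* Mirrors the quoted hunk: injecting an NMI starts the blocked phase. */
static void soft_vnmi_inject(struct soft_vnmi *s)
{
	s->soft_vnmi_blocked = true;
	s->vnmi_blocked_time = 0;
}

/*
 * Called after each guest exit (assumed hook point).
 * Returns true if a further NMI may be injected.
 */
static bool soft_vnmi_update(struct soft_vnmi *s, bool irq_window_open,
			     int64_t guest_ns, bool nmi_pending)
{
	if (!s->soft_vnmi_blocked)
		return true;

	s->vnmi_blocked_time += guest_ns;

	if (irq_window_open) {
		/* Well-behaved guest: the NMI handler is assumed to be done. */
		s->soft_vnmi_blocked = false;
	} else if (nmi_pending &&
		   s->vnmi_blocked_time > SOFT_VNMI_TIMEOUT_NS) {
		/* Buggy guest spinning with IRQs off: give up and unblock. */
		s->soft_vnmi_blocked = false;
	}
	return !s->soft_vnmi_blocked;
}
```

In this model a guest that spins with IRQs disabled still receives its
next watchdog NMI once more than one second of guest time has
accumulated, which matches the "takes longer, but still works" behaviour
described above.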