From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [PATCH] KVM: VMX: Fix race between pending IRQ and NMI
Date: Fri, 21 Nov 2008 11:04:46 +0100
Message-ID: <492687BE.9030307@siemens.com>
References: <491858C8.2040401@siemens.com> <49201213.1080305@redhat.com> <49203513.2080800@web.de> <4920392F.9020303@redhat.com> <49203EAF.3000800@web.de> <49244F20.3030803@redhat.com> <49248514.9020605@redhat.com> <49256649.6060801@siemens.com> <49256D38.4090908@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Cc: Jan Kiszka <jan.kiszka@web.de>, kvm-devel <kvm@vger.kernel.org>,
	"Xu, Jiajun" <jiajun.xu@intel.com>,
	"Yang, Sheng" <sheng.yang@intel.com>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from gecko.sbs.de ([194.138.37.40]:15034 "EHLO gecko.sbs.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752892AbYKUKFw (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 21 Nov 2008 05:05:52 -0500
In-Reply-To: <49256D38.4090908@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Avi Kivity wrote:
> Jan Kiszka wrote:
>>> The nmi handler could change the tpr to mask the preempted interrupt;
>>> but the code would not notice that.  Once the interrupt was injected the
>>> guest would see an interrupt at a higher priority than it has programmed
>>> the hardware to allow.
>>>     
>>
>> I consider this a bit far-fetched. What sane NMI handler would fiddle with
>> the APIC? It would be fairly tricky to properly synchronize this with
>> the rest of the OS.
>>   
> 
> Sure, this is not a realistic guest.
> 
>>> Basically, once we commit to an interrupt via kvm_cpu_get_interrupt(),
>>> we must inject it before any instruction gets executed.
>>>
>>> I don't think any real guest would notice, though.
>>>
>>>     
>>
>> Well, I have no problems with your approach (when also applied on the
>> user space irqchip path) of keeping the order *if* we can ensure that
>> only the first instruction of the IRQ handler is executed and we will
>> then inject the NMI. Otherwise this opens a prio inversion between IRQs
>> and NMIs. The point is that, unless I'm overlooking some detail right
>> now, your approach will inject the pending NMI only once the guest
>> /happens/ to exit the VM, right? If yes, then it's a no-go IMHO, also
>> for keeping this property with the queue approach.
>>   
> 
> enable_nmi_window() should cause an exit once the interrupt has been
> injected (likely before the first interrupt handler instruction was
> executed, but after the stack frame was created).  So the nmi will not
> be delayed.

Right now, you only call enable_nmi_window() if that window is currently
closed - and that's not the common case I'm worried about.

> 
> But I think I see a bigger issue - if we inject a regular interrupt
> while another is pending, then we will encounter this problem.  Looks
> like we have to enable the interrupt window after injecting an interrupt
> if there are still pending interrupts.

Yeah, probably. I'm just wondering now if we can set
exit-on-interrupt-window while the vcpu state is interruptible (i.e.
_before_ the injection). There is some entry check like this for NMIs,
but maybe not for interrupts. Need to check.
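
To make the ordering concrete, here is a toy model in C of what I
understand the proposal to be. This is not KVM code: toy_vcpu,
toy_prepare_entry() and all field names are made up for illustration,
and it says nothing about which VM-entry checks the hardware applies
(that is exactly the part I still need to check). It only shows the
policy: at most one event per entry, an already committed IRQ goes
first, and a window exit is requested whenever something is left over,
regardless of whether the window is currently open or closed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, none is a KVM symbol. */
enum toy_event { TOY_NONE, TOY_IRQ, TOY_NMI };

struct toy_vcpu {
	unsigned int pending_irqs;  /* IRQs still queued in the (toy) irqchip */
	bool irq_committed;         /* an IRQ was already dequeued for injection */
	bool nmi_pending;
	bool request_irq_window;    /* ask for an exit once IRQs can be taken */
	bool request_nmi_window;    /* ask for an exit once an NMI can be taken */
};

/*
 * Pick at most one event to inject on this entry, then request a window
 * exit for whatever is still pending so it does not have to wait until
 * the guest happens to exit for some unrelated reason.
 */
static enum toy_event toy_prepare_entry(struct toy_vcpu *v)
{
	enum toy_event ev = TOY_NONE;

	v->request_irq_window = false;
	v->request_nmi_window = false;

	if (v->irq_committed) {
		/* Committed IRQ must go in before any guest instruction runs. */
		v->irq_committed = false;
		ev = TOY_IRQ;
	} else if (v->nmi_pending) {
		v->nmi_pending = false;
		ev = TOY_NMI;
	} else if (v->pending_irqs > 0) {
		v->pending_irqs--;
		ev = TOY_IRQ;
	}

	/* A committed IRQ can never survive the selection above. */
	assert(!v->irq_committed);

	if (v->nmi_pending)
		v->request_nmi_window = true;
	if (v->pending_irqs > 0)
		v->request_irq_window = true;

	return ev;
}
```

With one committed IRQ, one queued IRQ and a pending NMI, successive
calls yield IRQ, NMI, IRQ, and the NMI window is requested right on the
first entry instead of only when the window happens to be closed.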

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2 ES-OS
Corporate Competence Center Embedded Linux