From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [PATCH] KVM: VMX: Fix race between pending IRQ and NMI
Date: Thu, 20 Nov 2008 14:29:45 +0100
Message-ID: <49256649.6060801@siemens.com>
References: <491858C8.2040401@siemens.com> <49201213.1080305@redhat.com> <49203513.2080800@web.de> <4920392F.9020303@redhat.com> <49203EAF.3000800@web.de> <49244F20.3030803@redhat.com> <49248514.9020605@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Cc: Jan Kiszka, kvm-devel, "Xu, Jiajun", "Yang, Sheng"
To: Avi Kivity
Return-path:
Received: from gecko.sbs.de ([194.138.37.40]:16742 "EHLO gecko.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750716AbYKTNat (ORCPT ); Thu, 20 Nov 2008 08:30:49 -0500
In-Reply-To: <49248514.9020605@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> Avi Kivity wrote:
>> Jan Kiszka wrote:
>>> Jiajun kindly provided me a RHEL kernel and initrd (2.6.18-53-el5) which
>>> I ran for a while (or booted a few times) to trigger the hang. Basically
>>> you need high IRQ load (preferably via LAPIC, to exploit that un-acked
>>> IRQs will block low-prio IRQs as well) + high NMI load (e.g. via NMI
>>> watchdog).
>>>
>>
>> I was able to reproduce it easily by zapping the mmu every second.
>>
>> Attached is a patch the fixes it for me. Basically it avoids the nmi
>> path if an interrupt is being injected. This is closer to my event
>> queue plan, and also is similar to what the code does today with
>> exceptions (avoid ->inject_pending_irq() if an exception is pending).
>>
>
> Oh, and I think this is more correct than the previous approach of
> letting the nmi preempt the interrupt.
>
> The nmi handler could change the tpr to mask the preempted interrupt;
> but the code would not notice that. Once the interrupt was injected the
> guest would see an interrupt at a higher priority than it has programmed
> the hardware to allow.

I consider this a bit far-fetched. What sane NMI handler would fiddle
with the APIC? It would be fairly tricky to properly synchronize this
with the rest of the OS.

>
> Basically, once we commit to an interrupt via kvm_cpu_get_interrupt(),
> we must inject it before the any instruction gets executed.
>
> I don't think any real guest would notice, though.
>

Well, I have no problems with your approach (when also applied on the
user space irqchip path) of keeping the order *if* we can ensure that
only the first instruction of the IRQ handler is executed before we
inject the NMI. Otherwise this opens a priority inversion between IRQs
and NMIs.

The point is that, unless I'm overlooking some detail right now, your
approach will inject the pending NMI only once the guest /happens/ to
exit the VM, right? If so, that's a no-go IMHO, and equally so if the
queue approach keeps this property.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2 ES-OS
Corporate Competence Center Embedded Linux
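
PS: To make the ordering concern above concrete, here is a toy model in
plain C. It is only a sketch and not KVM code: struct vcpu_model,
pick_event() and the nmi_window_req flag are hypothetical names invented
for illustration. The flag stands in for whatever mechanism forces a VM
exit right after the committed IRQ has been delivered, so that a waiting
NMI is not left pending until the guest happens to exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name in this block is hypothetical. */
struct vcpu_model {
    bool irq_committed;  /* an IRQ was already dequeued for injection */
    bool nmi_pending;    /* an NMI arrived and still waits for delivery */
    bool nmi_window_req; /* ask for an early exit so the NMI can follow */
};

enum event { EV_NONE, EV_IRQ, EV_NMI };

/* Decide which event to inject on the next (modelled) VM entry. */
enum event pick_event(struct vcpu_model *v)
{
    assert(v != NULL);

    if (v->irq_committed) {
        /* Keep the order: the committed IRQ is injected first. */
        v->irq_committed = false;
        /*
         * If an NMI is waiting, request an exit as early as possible
         * instead of relying on the guest to exit by chance.
         */
        v->nmi_window_req = v->nmi_pending;
        return EV_IRQ;
    }

    if (v->nmi_pending) {
        /* Nothing is committed ahead of it: deliver the NMI now. */
        v->nmi_pending = false;
        v->nmi_window_req = false;
        return EV_NMI;
    }

    return EV_NONE;
}
```

With both irq_committed and nmi_pending set, the first call returns
EV_IRQ and sets nmi_window_req, and the second call returns EV_NMI.
Without the early-exit request, the model would leave the NMI pending
for an unbounded time, which is the inversion described above.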