From mboxrd@z Thu Jan 1 00:00:00 1970 From: Avi Kivity Subject: Re: [RFC] KVM: Fix simultaneous NMIs Date: Thu, 15 Sep 2011 20:02:32 +0300 Message-ID: <4E722FA8.2030006@redhat.com> References: <1316097911-16424-1-git-send-email-avi@redhat.com> <4E722140.4070702@siemens.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Marcelo Tosatti , "kvm@vger.kernel.org" To: Jan Kiszka Return-path: Received: from mx1.redhat.com ([209.132.183.28]:31445 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751441Ab1IORCk (ORCPT ); Thu, 15 Sep 2011 13:02:40 -0400 In-Reply-To: <4E722140.4070702@siemens.com> Sender: kvm-owner@vger.kernel.org List-ID: On 09/15/2011 07:01 PM, Jan Kiszka wrote: > On 2011-09-15 16:45, Avi Kivity wrote: > > If simultaneous NMIs happen, we're supposed to queue the second > > and next (collapsing them), but currently we sometimes collapse > > the second into the first. > > Can you describe the race in a few more details here ("sometimes" sounds > like "I don't know when" :) )? In this case it was "I'm in a hurry". > > > > void kvm_inject_nmi(struct kvm_vcpu *vcpu) > > { > > + atomic_inc(&vcpu->arch.nmi_pending); > > kvm_make_request(KVM_REQ_EVENT, vcpu); > > - vcpu->arch.nmi_pending = 1; > > Does the reordering matter? I think so. Suppose the vcpu enters just after kvm_make_request(); it sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because it wasn't set set. Then comes a kick, the guest is reentered with nmi_pending set but KVM_REQ_EVENT clear and sails through the check and enters the guest. The NMI is delayed until the next KVM_REQ_EVENT. > Do we need barriers? Yes. > > > @@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu) > > } > > > > /* try to inject new event if pending */ > > - if (vcpu->arch.nmi_pending) { > > + if (atomic_read(&vcpu->arch.nmi_pending)) { > > if (kvm_x86_ops->nmi_allowed(vcpu)) { > > - vcpu->arch.nmi_pending = false; > > + atomic_dec(&vcpu->arch.nmi_pending); > > Here we lost NMIs in the past by overwriting nmi_pending while another > one was already queued, right? One place, yes. The other is kvm_inject_nmi() - if the first nmi didn't get picked up by the vcpu by the time the second nmi arrives, we lose the second nmi. > > if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) { > > inject_pending_event(vcpu); > > > > /* enable NMI/IRQ window open exits if needed */ > > - if (nmi_pending) > > + if (atomic_read(&vcpu->arch.nmi_pending) > > + && nmi_in_progress(vcpu)) > > Is nmi_pending&& !nmi_in_progress possible at all? Yes, due to NMI-blocked-by-STI. A really touchy area. > Is it rather a BUG > condition? No. > If not, what will happen next? The NMI window will open and we'll inject the NMI. But I think we have a bug here - we should only kvm_collapse_nmis() if an NMI handler was indeed running, yet we do it unconditionally. > > > > +static inline void kvm_collapse_pending_nmis(struct kvm_vcpu *vcpu) > > +{ > > + /* Collapse all NMIs queued while an NMI handler was running to one */ > > + if (atomic_read(&vcpu->arch.nmi_pending)) > > + atomic_set(&vcpu->arch.nmi_pending, 1); > > Is it OK that NMIs injected after the collapse will increment this to> > 1 again? Or is that impossible? > It's possible and okay. We're now completing execution of IRET. Doing atomic_set() after atomic_inc() means the NMI happened before IRET completed, and vice versa. Since these events are asynchronous, we're free to choose one or the other (a self-IPI-NMI just before the IRET must be swallowed, and a self-IPI-NMI just after the IRET would only be executed after the next time around the handler). -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.