From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Subject: Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing Date: Tue, 28 May 2013 18:00:57 +0300 Message-ID: <20130528150057.GA6891@redhat.com> References: <20130526130031.GS4725@redhat.com> <51A48D53.7070204@redhat.com> <20130528125613.GB3326@redhat.com> <51A4B5CA.9070109@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: kvm@vger.kernel.org, Jan Kiszka To: Paolo Bonzini Return-path: Received: from mx1.redhat.com ([209.132.183.28]:17791 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934300Ab3E1PBG (ORCPT ); Tue, 28 May 2013 11:01:06 -0400 Content-Disposition: inline In-Reply-To: <51A4B5CA.9070109@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On Tue, May 28, 2013 at 03:48:58PM +0200, Paolo Bonzini wrote: > Il 28/05/2013 14:56, Gleb Natapov ha scritto: > >> > else > >> > vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; > >> > } > >> > - if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) && > >> > + /* > >> > + * Note that we may get another INIT+SIPI sequence right here; process > >> > + * the INIT first. Assumes that there are only KVM_APIC_INIT/SIPI. > >> > + */ > >> > + if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI && > >> > vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { > > Because pending_events can be INIT/SIPI at this point and it should be > > interpreted as: do SIPI and ignore INIT (atomically). > > My patch does "do another INIT (which will have no effect) and do SIPI > after that INIT", which is different but has almost the same effect. > If pending_events is INIT/SIPI, it ignores the SIPI for now and lets > the next iteration of kvm_apic_accept_events do both. The difference > would be that in a carefully-timed sequence of interrupts > You assume that the next processing will actually happen, but this is not necessary the case. > INIT-INIT-SIPI-INIT-SIPI > > your version would do many SIPIs, while mine would do just one. > > Hmm... there is a reference to this in 25.2 "Other causes of VM exits": > "If a logical processor is in the wait-for-SIPI state, INIT signals are > blocked. They do not cause VM exits in this case." It is not for the > physical processor, but it makes sense to have the same thing. Is this > the reason why you did the cmpxchg at the end? > No, but if helps us model proper behaviour for nested guest +1 to it :) But until we handle INIT in nested virt it is not something that dictates the solution. > But then, there's another way to mask INITs in the wait-for-SIPI > state. Considering that KVM_MP_STATE_INIT_RECEIVED is really a > wait-for-SIPI, you can do: > Haven't checked it for races (especially races between multiple CPUS sending INIT), but looks more complicated to me. > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index e1adbb4..36bc308 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -720,8 +720,12 @@ out: > break; > > case APIC_DM_INIT: > - if (!trig_mode || level) { > + if ((!trig_mode || level) && > + vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) { > result = 1; > + > + /* check mp_state before writing apic->pending_events */ > + smp_mb(); > /* assumes that there are only KVM_APIC_INIT/SIPI */ > apic->pending_events = (1UL << KVM_APIC_INIT); > /* make sure pending_events is visible before sending > @@ -1865,13 +1869,17 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu) > if (!kvm_vcpu_has_lapic(vcpu)) > return; > > - if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) { > + if (test_bit(KVM_APIC_INIT, &apic->pending_events)) { > kvm_lapic_reset(vcpu); > kvm_vcpu_reset(vcpu); > if (kvm_vcpu_is_bsp(apic->vcpu)) > vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; > else > vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED; > + > + /* write mp_state before toggling KVM_APIC_INIT */ > + smp_mb__before_clear_bit(); > + clear_bit(KVM_APIC_INIT, &apic->pending_events); > } > if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) && > vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) { > > I don't have a particular preference. > > Paolo -- Gleb.