From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gleb Natapov
Subject: Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing
Date: Sun, 2 Jun 2013 16:14:42 +0300
Message-ID: <20130602131442.GF24773@redhat.com>
References: <20130530123454.GA4845@redhat.com> <51A74CE1.1000700@redhat.com>
 <20130530131054.GB5495@redhat.com> <51A752D7.1020809@redhat.com>
 <20130530133534.GC5495@redhat.com> <51A75F07.805@redhat.com>
 <20130531043643.GA26250@redhat.com> <51A863E0.9080202@redhat.com>
 <20130531091835.GA467@redhat.com> <51A871DA.7070905@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org, Jan Kiszka
To: Paolo Bonzini
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:47540 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751491Ab3FBNOr
 (ORCPT ); Sun, 2 Jun 2013 09:14:47 -0400
Content-Disposition: inline
In-Reply-To: <51A871DA.7070905@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Fri, May 31, 2013 at 11:48:10AM +0200, Paolo Bonzini wrote:
> On 31/05/2013 11:18, Gleb Natapov wrote:
> > On Fri, May 31, 2013 at 10:48:32AM +0200, Paolo Bonzini wrote:
> >> On 31/05/2013 06:36, Gleb Natapov wrote:
> >>> In my commit message there are two INITs in a row:
> >>> vcpu0:                            vcpu1:
> >>> set INIT
> >>>                                test_and_clear_bit(KVM_APIC_INIT)
> >>>                                   process INIT
> >>> set INIT
> >>> set SIPI
> >>>                                test_and_clear_bit(KVM_APIC_SIPI)
> >>>                                   process SIPI
> >>>
> >>> Two INITs before SIPI are essential to trigger the bug
> >>
> >> I see now.  Let's draw pending_events as well:
> >>
> >> event sent    event processed    pending_events
> >> INIT                             INIT
> >>               INIT               0
> >> INIT                             INIT
> >> SIPI                             INIT|SIPI
> >>               SIPI               INIT
> >>               INIT               0
> >>
> >> Events are reordered, there is indeed a bug if the second INIT comes at
> >> just the right time.
> >> With your patch:
> >>
> >> event sent    event processed        pending_events
> >> INIT                                 INIT
> >>               INIT                   0
> >> INIT                                 INIT
> >> SIPI                                 INIT|SIPI
> >>               SIPI, failed cmpxchg   INIT|SIPI
> >>               INIT                   SIPI
> >>               SIPI                   SIPI
> >
> > This is incorrect. cmpxchg will fail only if another INIT comes after SIPI.
> > Why would it fail?
>
> You're right.
>
> Can you show what is the case in my patch where you have coalescing?  I
You've said it in some of your emails. Quoting:
"
INIT-INIT-SIPI-INIT-SIPI your version would do many SIPIs, while mine would
do just one.
"

> still prefer it because it is a smaller change, it keeps the "clear a
> bit before processing" idea that you find almost everywhere.  Changing
> it to "clear a bit after processing" is a bigger and more surprising
> change, though both are indeed tricky.
>
There is nothing "surprising" in it for me. Really, it is so subjective
that arguing about it is a waste of everybody's time and energy. So if we
want to continue having fun arguing about it, let's move to some real patch
problems/benefits.

So what I didn't like from the start about pending_events is that it
introduces two locked instructions on each interrupt injection path. Your
patch makes it worse by changing one of those locked instructions to
cmpxchg, while mine actually removes one.
But I think we can do even better and get rid of both of them for the
common case, and do only one locked instruction while there are events
pending. That is the slow path, so it is less important:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9d75193..3e0e85a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1850,11 +1850,14 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	unsigned int sipi_vector;
+	unsigned long pe;
 
-	if (!kvm_vcpu_has_lapic(vcpu))
+	if (!kvm_vcpu_has_lapic(vcpu) || !apic->pending_events)
 		return;
 
-	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
+	pe = xchg(&apic->pending_events, 0);
+
+	if (test_bit(KVM_APIC_INIT, &pe)) {
 		kvm_lapic_reset(vcpu);
 		kvm_vcpu_reset(vcpu);
 		if (kvm_vcpu_is_bsp(apic->vcpu))
@@ -1862,7 +1865,7 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 		else
 			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
 	}
-	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
+	if (test_bit(KVM_APIC_SIPI, &pe) &&
 	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
 		/* evaluate pending_events before reading the vector */
 		smp_rmb();

--
			Gleb.