From: Oleg Nesterov
Subject: Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions
Date: Thu, 12 Feb 2015 15:18:19 +0100
Message-ID: <20150212141819.GA11633@redhat.com>
In-Reply-To: <54DBE27C.8050105@goop.org>
References: <1423234148-13886-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com>
 <54D7D19B.1000103@goop.org>
 <54D87F1E.9060307@linux.vnet.ibm.com>
 <20150209120227.GT21418@twins.programming.kicks-ass.net>
 <54D9CFC7.5020007@linux.vnet.ibm.com>
 <20150210132634.GA30380@redhat.com>
 <54DAADEE.6070506@goop.org>
 <20150211172434.GA28689@redhat.com>
 <54DBE27C.8050105@goop.org>
To: Jeremy Fitzhardinge
Cc: the arch/x86 maintainers, KVM list, Peter Zijlstra, virtualization,
 Paul Gortmaker, Peter Anvin, Davidlohr Bueso, Andrey Ryabinin,
 Raghavendra K T, Christian Borntraeger, Ingo Molnar, Sasha Levin,
 Paul McKenney, Rik van Riel, Konrad Rzeszutek Wilk, Andi Kleen,
 xen-devel@lists.xenproject.org, Dave Jones, Thomas Gleixner, Waiman Long,
 Linux Kernel Mailing List, Paolo Bonzini
List-Id: virtualization@lists.linuxfoundation.org

On 02/11, Jeremy Fitzhardinge wrote:
>
> On 02/11/2015 09:24 AM, Oleg Nesterov wrote:
> > I agree, and I have to admit I am not sure I fully understand why
> > unlock uses the locked add. Except we need a barrier to avoid the race
> > with the enter_slowpath() users, of course. Perhaps this is the only
> > reason?
>
> Right now it needs to be a locked operation to prevent read-reordering.
> x86 memory ordering rules state that all writes are seen in a globally
> consistent order, and are globally ordered wrt reads *on the same
> addresses*, but reads to different addresses can be reordered wrt to writes.
>
> So, if the unlocking add were not a locked operation:
>
>         __add(&lock->tickets.head, TICKET_LOCK_INC); /* not locked */
>
>         if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
>                 __ticket_unlock_slowpath(lock, prev);
>
> Then the read of lock->tickets.tail can be reordered before the unlock,
> which introduces a race:

Yes, yes, thanks, but this is what I meant. We need a barrier. Even if
"Every store is a release" as Linus mentioned.

> This *might* be OK, but I think it's on dubious ground:
>
>         __add(&lock->tickets.head, TICKET_LOCK_INC); /* not locked */
>
>         /* read overlaps write, and so is ordered */
>         if (unlikely(lock->head_tail & (TICKET_SLOWPATH_FLAG << TICKET_SHIFT)))
>                 __ticket_unlock_slowpath(lock, prev);
>
> because I think Intel and AMD differed in interpretation about how
> overlapping but different-sized reads & writes are ordered (or it simply
> isn't architecturally defined).

Can't comment, I simply do not know how the hardware works.

> If the slowpath flag is moved to head, then it would always have to be
> locked anyway, because it needs to be atomic against other CPU's RMW
> operations setting the flag.

Yes, this is true. But again, if we want to avoid the read-after-unlock,
we need to update this lock and read SLOWPATH atomically, so it seems
that we can't avoid the locked insn.

Oleg.
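
[Editor's note: below is a minimal standalone C11 sketch of the two unlock
orderings discussed in this message. It is not the kernel's arch/x86
ticket-lock code; the struct layout, the constant values, and the
__ticket_unlock_slowpath() stub are assumptions made purely to illustrate
why a non-locked add lets the read of tickets.tail pass the store to
tickets.head, while a LOCK-prefixed RMW does not.]

/*
 * Standalone sketch (NOT kernel code): type names, constants and the
 * slowpath stub below are assumed for illustration only.
 */
#include <stdatomic.h>
#include <stdint.h>

#define TICKET_LOCK_INC      2   /* assumed ticket increment */
#define TICKET_SLOWPATH_FLAG 1   /* assumed flag bit carried in tail */

typedef struct {
        struct {
                _Atomic uint16_t head;   /* bumped by unlock */
                _Atomic uint16_t tail;   /* carries the slowpath flag */
        } tickets;
} sketch_spinlock_t;

static void __ticket_unlock_slowpath(sketch_spinlock_t *lock)
{
        /* stub: would kick the waiter that set TICKET_SLOWPATH_FLAG */
        (void)lock;
}

/*
 * Variant A: non-locked add. The store to head is a plain MOV, so it can
 * still sit in the store buffer when the load of tail executes; the flag
 * check is effectively reordered before the unlock -- the race above.
 */
static void unlock_with_plain_add(sketch_spinlock_t *lock)
{
        uint16_t head = atomic_load_explicit(&lock->tickets.head,
                                             memory_order_relaxed);
        atomic_store_explicit(&lock->tickets.head, head + TICKET_LOCK_INC,
                              memory_order_release);

        if (atomic_load_explicit(&lock->tickets.tail,
                                 memory_order_relaxed) & TICKET_SLOWPATH_FLAG)
                __ticket_unlock_slowpath(lock);
}

/*
 * Variant B: locked add. A seq_cst RMW compiles to a LOCK-prefixed XADD
 * on x86, which is a full barrier, so the load of tail cannot be
 * satisfied before the head update is globally visible.
 */
static void unlock_with_locked_add(sketch_spinlock_t *lock)
{
        atomic_fetch_add_explicit(&lock->tickets.head, TICKET_LOCK_INC,
                                  memory_order_seq_cst);

        if (atomic_load_explicit(&lock->tickets.tail,
                                 memory_order_relaxed) & TICKET_SLOWPATH_FLAG)
                __ticket_unlock_slowpath(lock);
}

The overlapping-access variant Jeremy mentions (reading the whole head_tail
word so that the read overlaps the write) is intentionally left out of the
sketch, since, as he notes, the ordering of overlapping but different-sized
accesses is exactly the part that is on dubious ground.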