Date: Fri, 20 May 2016 13:58:19 +0200
From: Peter Zijlstra
To: Davidlohr Bueso
Cc: manfred@colorfullife.com, Waiman.Long@hpe.com, mingo@kernel.org,
	torvalds@linux-foundation.org, ggherdovich@suse.com,
	mgorman@techsingularity.net, linux-kernel@vger.kernel.org,
	Paul McKenney, Will Deacon
Subject: Re: sem_lock() vs qspinlocks
Message-ID: <20160520115819.GF3193@twins.programming.kicks-ass.net>
In-Reply-To: <20160520053926.GC31084@linux-uzut.site>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 19, 2016 at 10:39:26PM -0700, Davidlohr Bueso wrote:
> As such, the following restores the behavior of the ticket locks and 'fixes'
> (or hides?) the bug in sems. Naturally incorrect approach:
>
> @@ -290,7 +290,8 @@ static void sem_wait_array(struct sem_array *sma)
>
> 	for (i = 0; i < sma->sem_nsems; i++) {
> 		sem = sma->sem_base + i;
> -		spin_unlock_wait(&sem->lock);
> +		while (atomic_read(&sem->lock))
> +			cpu_relax();
> 	}
> 	ipc_smp_acquire__after_spin_is_unlocked();
> }

The actual bug is clear_pending_set_locked() not having acquire
semantics. And the above 'fixes' things because it will observe the old
pending bit or the locked bit, so it doesn't matter if the store
flipping them is delayed.
The comment in queued_spin_lock_slowpath() above the smp_cond_acquire()
states that that acquire is sufficient, but this is incorrect in the
face of spin_is_locked()/spin_unlock_wait() usage only looking at the
lock byte.

The problem is that the clear_pending_set_locked() is an unordered
store, therefore this store can be delayed until no later than
spin_unlock() (which orders against it due to the address dependency).

This opens numerous races; for example:

	ipc_lock_object(&sma->sem_perm);
	sem_wait_array(sma);

		false -> spin_is_locked(&sma->sem_perm.lock)

is entirely possible, because sem_wait_array() consists of pure reads,
so the store can pass all that, even on x86.

The below 'hack' seems to solve the problem.

_However_ this also means the atomic_cmpxchg_relaxed() in the locked:
branch is equally wrong -- although not visible on x86. And note that
atomic_cmpxchg_acquire() would not in fact be sufficient either, since
the acquire is on the LOAD not the STORE of the LL/SC.

I need a break of sorts, because after twisting my head around the sem
code and then the qspinlock code I'm wrecked. I'll try and make a
proper patch if people can indeed confirm my thinking here.

---
 kernel/locking/qspinlock.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ce2f75e32ae1..348e172e774f 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -366,6 +366,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * *,1,0 -> *,0,1
 	 */
 	clear_pending_set_locked(lock);
+	smp_mb();
 	return;

	/*