From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755222AbcETQFA (ORCPT ); Fri, 20 May 2016 12:05:00 -0400 Received: from merlin.infradead.org ([205.233.59.134]:40328 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751314AbcETQE6 (ORCPT ); Fri, 20 May 2016 12:04:58 -0400 Date: Fri, 20 May 2016 18:04:36 +0200 From: Peter Zijlstra To: Boqun Feng Cc: Davidlohr Bueso , manfred@colorfullife.com, Waiman.Long@hpe.com, mingo@kernel.org, torvalds@linux-foundation.org, ggherdovich@suse.com, mgorman@techsingularity.net, linux-kernel@vger.kernel.org, Paul McKenney , Will Deacon Subject: Re: sem_lock() vs qspinlocks Message-ID: <20160520160436.GQ3205@twins.programming.kicks-ass.net> References: <20160520053926.GC31084@linux-uzut.site> <20160520115819.GF3193@twins.programming.kicks-ass.net> <20160520140533.GA20726@insomnia> <20160520152149.GH3193@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160520152149.GH3193@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 20, 2016 at 05:21:49PM +0200, Peter Zijlstra wrote: > Let me write a patch.. OK, something like the below then.. lemme go build that and verify that too fixes things. --- Subject: locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait() Similar to commits: 51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()") d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers") qspinlock suffers from the fact that the _Q_LOCKED_VAL store is unordered inside the ACQUIRE of the lock. And while this is not a problem for the regular mutual exclusive critical section usage of spinlocks, it breaks creative locking like: spin_lock(A) spin_lock(B) spin_unlock_wait(B) if (!spin_is_locked(A)) do_something() do_something() In that both CPUs can end up running do_something at the same time, because our _Q_LOCKED_VAL store can drop past the spin_unlock_wait() spin_is_locked() loads (even on x86!!). To avoid making the normal case slower, add smp_mb()s to the less used spin_unlock_wait() / spin_is_locked() side of things to avoid this problem. Reported-by: Davidlohr Bueso Reported-by: Giovanni Gherdovich Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/qspinlock.h | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h index 35a52a880b2f..6bd05700d8c9 100644 --- a/include/asm-generic/qspinlock.h +++ b/include/asm-generic/qspinlock.h @@ -28,7 +28,30 @@ */ static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { - return atomic_read(&lock->val); + /* + * queued_spin_lock_slowpath() can ACQUIRE the lock before + * issuing the unordered store that sets _Q_LOCKED_VAL. + * + * See both smp_cond_acquire() sites for more detail. + * + * This however means that in code like: + * + * spin_lock(A) spin_lock(B) + * spin_unlock_wait(B) spin_is_locked(A) + * do_something() do_something() + * + * Both CPUs can end up running do_something() because the store + * setting _Q_LOCKED_VAL will pass through the loads in + * spin_unlock_wait() and/or spin_is_locked(). + * + * Avoid this by issuing a full memory barrier between the spin_lock() + * and the loads in spin_unlock_wait() and spin_is_locked(). + * + * Note that regular mutual exclusion doesn't care about this + * delayed store. + */ + smp_mb(); + return atomic_read(&lock->val) & _Q_LOCKED_MASK; } /** @@ -108,6 +131,8 @@ static __always_inline void queued_spin_unlock(struct qspinlock *lock) */ static inline void queued_spin_unlock_wait(struct qspinlock *lock) { + /* See queued_spin_is_locked() */ + smp_mb(); while (atomic_read(&lock->val) & _Q_LOCKED_MASK) cpu_relax(); }