Date: Wed, 31 Aug 2016 17:40:21 +0100
From: Will Deacon
To: Peter Zijlstra
Cc: Manfred Spraul, benh@kernel.crashing.org, paulmck@linux.vnet.ibm.com,
    Ingo Molnar, Boqun Feng, Andrew Morton, LKML, 1vier1@web.de,
    Davidlohr Bueso
Subject: Re: [PATCH 1/4] spinlock: Document memory barrier rules
Message-ID: <20160831164020.GG29505@arm.com>
In-Reply-To: <20160831154049.GY10121@twins.programming.kicks-ass.net>

On Wed, Aug 31, 2016 at 05:40:49PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 31, 2016 at 06:59:07AM +0200, Manfred Spraul wrote:
>
> > The barrier must ensure that taking the spinlock (as observed by another cpu
> > with spin_unlock_wait()) and a following read are ordered.
> >
> > start condition: sma->complex_mode = false;
> >
> > CPU 1:
> >     spin_lock(&sem->lock); /* sem_nsems instances */
> >     smp_mb__after_spin_lock();
> >     if (!smp_load_acquire(&sma->complex_mode)) {
> >         /* fast path successful! */
> >         return sops->sem_num;
> >     }
> >     /* slow path, not relevant */
> >
> > CPU 2: (holding sma->sem_perm.lock)
> >
> >     smp_store_mb(sma->complex_mode, true);
> >
> >     for (i = 0; i < sma->sem_nsems; i++) {
> >         spin_unlock_wait(&sma->sem_base[i].lock);
> >     }

I'm struggling with this example. We have these locks:

  &sem->lock
  &sma->sem_base[0...sma->sem_nsems].lock
  &sma->sem_perm.lock

a condition variable:

  sma->complex_mode

and a new barrier:

  smp_mb__after_spin_lock()

For simplicity, we can make sma->sem_nsems == 1, and have &sma->sem_base[0]
be &sem->lock in the example above. &sma->sem_perm.lock seems to be
irrelevant.

The litmus test then looks a bit like:

CPUm:

LOCK(x)
smp_mb();
RyAcq=0


CPUn:

Wy=1
smp_mb();
UNLOCK_WAIT(x)


which I think can be simplified to:


LOCK(x)
Ry=0

Wy=1
smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
LOCK(x)   // spin_unlock_wait behaves like lock; unlock
UNLOCK(x)

[I've removed a bunch of barriers here, that I don't think are necessary
 for the guarantees you're after]

and the question is "Can both CPUs proceed?".

Looking at the above, then I don't think that they can. Whilst CPUm can
indeed speculate the Ry=0 before successfully taking the lock, if CPUn
observes CPUm's read, then it must also observe the lock being held wrt
the spin_lock API. That is because a successful LOCK operation by CPUn
would force CPUm to replay its LL/SC loop and therefore discard its
speculation of y.

What am I missing? The code snippet seems to have too many barriers to me!

Will
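
For anyone who wants to poke at the simplified litmus test, here is a rough
userspace sketch using C11 atomics and pthreads. It is only a model under
stated assumptions: the names (x, y, n_done, violations, run_litmus) are
made up for illustration, the lock is a plain test-and-set spinlock rather
than the kernel's implementation, spin_unlock_wait() is modelled as
lock; unlock as described above, and the C11 memory model is not the
kernel's. CPUm is also made to release the lock so the run terminates. A
clean stress run is therefore not a proof, merely a sanity check of the
reasoning.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* All names here are hypothetical; this is a userspace model only. */
static atomic_flag x = ATOMIC_FLAG_INIT; /* the lock "x" */
static atomic_int y;                     /* stands in for complex_mode */
static atomic_int n_done;                /* CPUn got past LOCK/UNLOCK */
static atomic_int violations;            /* count of "both proceeded" */

static void lock_x(void)
{
    /* Acquire-ordered test-and-set spin loop. */
    while (atomic_flag_test_and_set_explicit(&x, memory_order_acquire))
        ;
}

static void unlock_x(void)
{
    atomic_flag_clear_explicit(&x, memory_order_release);
}

static void *cpu_m(void *arg)
{
    (void)arg;
    lock_x();                                               /* LOCK(x) */
    if (atomic_load_explicit(&y, memory_order_relaxed) == 0) { /* Ry=0 */
        /* Fast path taken: CPUn must not have proceeded yet. */
        if (atomic_load_explicit(&n_done, memory_order_relaxed))
            atomic_fetch_add(&violations, 1);
    }
    unlock_x(); /* not in the litmus test; added so CPUn can finish */
    return NULL;
}

static void *cpu_n(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);     /* Wy=1 */
    atomic_thread_fence(memory_order_seq_cst);              /* smp_mb() */
    lock_x();                                               /* LOCK(x) */
    unlock_x();                                             /* UNLOCK(x) */
    atomic_store_explicit(&n_done, 1, memory_order_relaxed);
    return NULL;
}

/* Runs the two-thread test repeatedly; returns the violation count. */
int run_litmus(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        pthread_t m, n;
        int rc;

        atomic_store(&y, 0);
        atomic_store(&n_done, 0);

        rc = pthread_create(&m, NULL, cpu_m, NULL);
        assert(rc == 0);
        rc = pthread_create(&n, NULL, cpu_n, NULL);
        assert(rc == 0);
        (void)rc;

        pthread_join(m, NULL);
        pthread_join(n, NULL);
    }
    return atomic_load(&violations);
}
```

Calling run_litmus() with some iteration count should return 0 in this
model: if CPUm takes x first, CPUn's LOCK cannot succeed until CPUm
unlocks, and if CPUn takes x first, its release/acquire pairing on x makes
Wy=1 visible to CPUm before CPUm reads y.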