From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 31 Aug 2016 17:40:49 +0200
From: Peter Zijlstra
To: Manfred Spraul
Cc: benh@kernel.crashing.org, paulmck@linux.vnet.ibm.com, Ingo Molnar,
	Boqun Feng, Andrew Morton, LKML, 1vier1@web.de, Davidlohr Bueso,
	Will Deacon
Subject: Re: [PATCH 1/4] spinlock: Document memory barrier rules
Message-ID: <20160831154049.GY10121@twins.programming.kicks-ass.net>
References: <1472385376-8801-1-git-send-email-manfred@colorfullife.com>
	<1472385376-8801-2-git-send-email-manfred@colorfullife.com>
	<20160829104815.GI10153@twins.programming.kicks-ass.net>
	<968e4c62-4486-a6aa-8fdf-67ff9b05a330@colorfullife.com>
	<20160829134424.GS10153@twins.programming.kicks-ass.net>
	<4859166f-ff39-e998-638b-6bf6912422a3@colorfullife.com>
In-Reply-To: <4859166f-ff39-e998-638b-6bf6912422a3@colorfullife.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 31, 2016 at 06:59:07AM +0200, Manfred Spraul wrote:
> The barrier must ensure that taking the spinlock (as observed by
> another cpu with spin_unlock_wait()) and a following read are ordered.
>
> start condition: sma->complex_mode = false;
>
> CPU 1:
>	spin_lock(&sem->lock); /* sem_nsems instances */
>	smp_mb__after_spin_lock();
>	if (!smp_load_acquire(&sma->complex_mode)) {
>		/* fast path successful! */
>		return sops->sem_num;
>	}
>	/* slow path, not relevant */
>
> CPU 2: (holding sma->sem_perm.lock)
>
>	smp_store_mb(sma->complex_mode, true);
>
>	for (i = 0; i < sma->sem_nsems; i++) {
>		spin_unlock_wait(&sma->sem_base[i].lock);
>	}
>
> It must not happen that both CPUs proceed:
> Either CPU1 proceeds, then CPU2 must spin in spin_unlock_wait(),
> or CPU2 proceeds, then CPU1 must enter the slow path.
>
> What about this?
>
> /*
>  * spin_lock() provides ACQUIRE semantics regarding reading the lock.
>  * There are no guarantees that the store of the lock is visible before
>  * any read or write operation within the protected area is performed.
>  * If the store of the lock must happen first, this function is required.
>  */
> #define spin_lock_store_acquire()

So I think the fundamental problem is with our atomic_*_acquire()
primitives, where we've specified that the ACQUIRE only pertains to the
LOAD of the RmW. The spinlock implementations suffer this problem mostly
because of that (not 100% accurate, but close enough).

One solution would be to simply use smp_mb__after_atomic(). The
'problem' with that is that __atomic_op_acquire() defaults to using it,
so the archs that use __atomic_op_acquire() would get a double smp_mb()
(arm64 and powerpc do not use __atomic_op_acquire()).

I'm not sure we want to introduce a new primitive for this, specific to
spinlocks.

Will, any opinions?