From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S941294AbcHJTUp (ORCPT );
	Wed, 10 Aug 2016 15:20:45 -0400
Received: from mx2.suse.de ([195.135.220.15]:44987 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933587AbcHJTSb (ORCPT );
	Wed, 10 Aug 2016 15:18:31 -0400
Date: Wed, 10 Aug 2016 12:17:57 -0700
From: Davidlohr Bueso 
To: Manfred Spraul 
Cc: Benjamin Herrenschmidt , Michael Ellerman ,
	Andrew Morton , Linux Kernel Mailing List ,
	Susanne Spraul <1vier1@web.de>, "Paul E. McKenney" ,
	Peter Zijlstra 
Subject: Re: spin_lock implicit/explicit memory barrier
Message-ID: <20160810191757.GA4952@linux-80c1.suse>
References: <1470787537.3015.83.camel@kernel.crashing.org>
	<4bd34301-0c63-66ae-71b1-6fd68c9fecdd@colorfullife.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <4bd34301-0c63-66ae-71b1-6fd68c9fecdd@colorfullife.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 10 Aug 2016, Manfred Spraul wrote:

>On 08/10/2016 02:05 AM, Benjamin Herrenschmidt wrote:
>>On Tue, 2016-08-09 at 20:52 +0200, Manfred Spraul wrote:
>>>Hi Benjamin, Hi Michael,
>>>
>>>regarding commit 51d7d5205d33 ("powerpc: Add smp_mb() to
>>>arch_spin_is_locked()"):
>>>
>>>For the ipc/sem code, I would like to replace the spin_is_locked() with
>>>a smp_load_acquire(), see:
>>>
>>>http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/tree/ipc/sem.c#n367
>>>
>>>http://www.ozlabs.org/~akpm/mmots/broken-out/ipc-semc-fix-complex_count-vs-simple-op-race.patch
>>>
>>>To my understanding, I must now add a smp_mb(), otherwise it would be
>>>broken on PowerPC:
>>>
>>>The approach that the memory barrier is added into spin_is_locked()
>>>doesn't work because the code doesn't use spin_is_locked().
>>>
>>>Correct?
>>Right, otherwise you aren't properly ordered. The current powerpc locks provide
>>good protection between what's inside vs. what's outside the lock but not vs.
>>the lock *value* itself, so if, like you do in the sem code, you use the lock
>>value as something that is relevant in terms of ordering, you probably need
>>an explicit full barrier.

But the problem here is with spin_unlock_wait() (for ll/sc spin_lock) not
seeing the store that makes the lock visibly taken, so both threads end up
exiting out of sem_lock(); a similar scenario to the spin_is_locked commit
mentioned above, which is crossing of locks. Now that spin_unlock_wait()
always implies at least a load-acquire barrier (for both ticket and
qspinlocks, the latter still x86 only), we wait on the full critical region.

So this patch takes this locking scheme:

	CPU0				CPU1
	spin_lock(l)			spin_lock(L)
	spin_unlock_wait(L)		if (spin_is_locked(l))
	foo()				   foo()

... and converts it now to:

	CPU0				CPU1
	complex_mode = true		spin_lock(l)
	smp_mb() <--- do we want a smp_mb() here?
	spin_unlock_wait(l)		if (!smp_load_acquire(complex_mode))
	foo()				   foo()

We should not be doing an smp_mb() right after a spin_lock(); it makes no
sense. The spinlock machinery should guarantee us the barriers in the
unorthodox locking cases, such as this.

Thanks,
Davidlohr