From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 12:13:16 +0530 Message-ID: <56DFC604.6070407@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: Sender: owner-linux-mm@kvack.org To: Christoph Lameter Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, Peter Zijlstra , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" List-Id: linux-arch.vger.kernel.org +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. 
AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178

So there also we have the same race, because the outer spin lock is not taken for slab_unlock() -> __bit_spin_unlock() -> __clear_bit().

Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make the line writable, and it would also elide this problem, but I think there is a fundamental problem here in slub: it pairs atomic and non-atomic ops - for performance reasons - and that doesn't work on all arches and/or configurations.

> You need a true atomic op or you need to take the "spinlock" in all
> cases where you modify the bit.

No we don't in __bit_spin_lock(), and we already do in bit_spin_lock().

> If you take the lock in __bit_spin_unlock
> then the race cannot happen.

Of course it won't, but that means we penalize all non-atomic callers of the API with a superfluous spinlock which is not required in the first place given the definition of the API.

>> Are you convinced now !
>
> Yes, please fix your arch specific code.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 11:13:49 +0100 Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56DFC604.6070407@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. 
> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
> with a __clear_bit(). Latter can obviously clobber former if they are not a single
> instruction each unlike x86 or they use llock/scond kind of instructions where the
> interim store from other core is detected and causes a retry of whole llock/scond
> sequence.

Yes, test_and_set_bit() + __clear_bit() is broken.

> > If you take the lock in __bit_spin_unlock
> > then the race cannot happen.
>
> Of course it won't but that means we penalize all non atomic callers of the API
> with a superfluous spinlock which is not required in the first place given the
> definition of the API.

Quite. _However_, your arch is still broken, but not by your fault. It's the asm-generic code that is wrong.

The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong.

---
Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()

__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.

Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.

If an arch is able to do better, it must provide an implementation of
__clear_bit_unlock() itself.

Reported-by: Vineet Gupta
Signed-off-by: Peter Zijlstra (Intel)
---
 include/asm-generic/bitops/lock.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e94806..8ef0ccbf8167 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do { \
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 11:31:43 +0100 Message-ID: <20160309103143.GF25010@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: stable-owner@vger.kernel.org To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. 
While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > FWIW, we could probably undo this if we unified all the spinlock based atomic ops implementations (there's a whole bunch doing essentially the same), and special cased __clear_bit_unlock() for that. Collapsing them is probably a good idea anyway, just a fair bit of non-trivial work to figure out all the differences and if they matter etc.. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 16:30:31 +0530 Message-ID: <56E0024F.4070401@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.47.9]:44037 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753133AbcCILBA (ORCPT ); Wed, 9 Mar 2016 06:01:00 -0500 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. 
Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. 
slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... 
> #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 16:42:43 +0530 Message-ID: <56E0052B.3080304@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.47.9]:44230 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753156AbcCILNI (ORCPT ); Wed, 9 Mar 2016 06:13:08 -0500 In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. 
>> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 12:40:54 +0100 Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56E0024F.4070401@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. 
Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below. > > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. 
slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Cc: stable@vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. 
*/ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 17:23:26 +0530 Message-ID: <56E00EB6.4000201@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.60.111]:48017 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752678AbcCILxv (ORCPT ); Wed, 9 Mar 2016 06:53:51 -0500 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). 
If all
>>> + * the bits in the word are protected by this lock some archs can use weaker
>>> + * ops to safely unlock.
>>> + *
>>> + * See for example x86's implementation.
>>> */
>>
>> To be able to override/use-generic don't we need #ifndef ....
>
> I did not follow through the maze, I think the few archs implementing
> this simply do not include this file at all.
>
> I'll let the first person that cares about this worry about that :-)

Ok - that'd be me :-) although I really don't see much gain in case of ARC LLSC.

For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND at least in terms of cache coherency transactions !

>
> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
>
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
>
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
>
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
>
> Specifically, this came up as a result of hackbench livelock'ing in
> slab_lock() on ARC with SMP + SLUB + !LLSC.
>
> The issue was incorrect pairing of atomic ops.
>
> slab_lock() -> bit_spin_lock() -> test_and_set_bit()
> slab_unlock() -> __bit_spin_unlock() -> __clear_bit()
>
> The non serializing __clear_bit() was getting "lost"
>
> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set
> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here
> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock)
>
> Fixes ARC STAR 9000817404 (and probably more).
>
> Cc: stable@vger.kernel.org
> Reported-by: Vineet Gupta
> Tested-by: Vineet Gupta
> Signed-off-by: Peter Zijlstra (Intel)

LGTM. Thx a bunch Peter !
-Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. 
> */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 13:22:17 +0100 Message-ID: <20160309122217.GK6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> <56E00EB6.4000201@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <56E00EB6.4000201@synopsys.com> Sender: owner-linux-mm@kvack.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 05:23:26PM +0530, Vineet Gupta wrote: > > I did not follow through the maze, I think the few archs implementing > > this simply do not include this file at all. > > > > I'll let the first person that cares about this worry about that :-) > > Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. > > For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of > cache coherency transactions ! The win would be in not having to ever retry the SCOND. Although in this case, the contending CPU would be doing reads -- which I assume will not cause a SCOND to fail, so it might indeed not make any difference. 
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 18:52:45 +0530 Message-ID: <56E023A5.2000105@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: owner-linux-mm@kvack.org To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org

On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
>> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
>> with a __clear_bit(). Latter can obviously clobber former if they are not a single
>> instruction each unlike x86 or they use llock/scond kind of instructions where the
>> interim store from other core is detected and causes a retry of whole llock/scond
>> sequence.
>
> Yes, test_and_set_bit() + __clear_bit() is broken.

But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation).

So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable.
And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something !

| /*
|  * bit-based spin_unlock()
|  * non-atomic version, which can be used eg. if the bit lock itself is
|  * protecting the rest of the flags in the word.
|  */
| static inline void __bit_spin_unlock(int bitnum, unsigned long *addr)

From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 15:51:19 +0100 Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <56E023A5.2000105@synopsys.com> Sender: linux-kernel-owner@vger.kernel.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org

On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote:
> On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
> >> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
> >> with a __clear_bit().
Latter can obviously clobber former if they are not a single
> >> instruction each unlike x86 or they use llock/scond kind of instructions where the
> >> interim store from other core is detected and causes a retry of whole llock/scond
> >> sequence.
> >
> > Yes, test_and_set_bit() + __clear_bit() is broken.
>
> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
> (ignoring the performance thing for discussion sake, which is a side effect of
> this implementation).

The short answer is: Per definition. They are defined to work together,
which is what makes __clear_bit_unlock() such a special function.

> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed
> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing
> atomic lock with non-atomic unlock ? I'm missing something !

x86 (and others) do in fact use non-atomic instructions for
spin_unlock(). But as this is all arch specific, we can make these
assumptions. It's just that generic code cannot rely on it.

So let me try and explain.

The problem as identified is:

	CPU0					CPU1

	bit_spin_lock()				__bit_spin_unlock()
	1:
	  /* fetch_or, r1 holds the old value */
	  spin_lock
	  load	r1, addr
						load	r1, addr
						bclr	r2, r1, 1
						store	r2, addr
	  or	r2, r1, 1
	  store	r2, addr	/* lost the store from CPU1 */
	  spin_unlock

	  and	r1, 1
	  bnz	2 /* it was set, go wait */
	  ret

	2:
	  load	r1, addr
	  and	r1, 1
	  bnz	2 /* wait until its not set */

	  b	1 /* try again */

For LL/SC we replace:

	spin_lock
	load	r1, addr

	...

	store	r2, addr
	spin_unlock

With the (obvious):

	1:
	  load-locked	r1, addr

	  ...

	  store-cond	r2, addr
	  bnz	1 /* or whatever branch instruction is required to retry */

In this case the failure cannot happen, because the store from CPU1
would have invalidated the lock from CPU0 and caused the
store-cond to fail and retry the loop, observing the new value.
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Thu, 10 Mar 2016 11:21:21 +0530 Message-ID: <56E10B59.1060700@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from us01smtprelay-2.synopsys.com ([198.182.47.9]:35939 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751571AbcCJFvr (ORCPT ); Thu, 10 Mar 2016 00:51:47 -0500 In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The short answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something !
> > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. It's just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Especially after this patch from you which tightens __bit_spin_unlock() even more for the general case. Thing is, if the API exists, the majority of people would use the more conservative version w/o understanding all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_unlock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again. A picture is worth a thousand words !
Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Thu, 10 Mar 2016 10:10:58 +0100 Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56E10B59.1060700@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: > >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > >> (ignoring the performance thing for discussion sake, which is a side effect of > >> this implementation). > > > > The short answer is: Per definition. They are defined to work together, > > which is what makes __clear_bit_unlock() such a special function. > > > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this > >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > >> cases, then isn't this true in general for lot more cases in kernel, i.e.
pairing > >> atomic lock with non-atomic unlock ? I'm missing something ! > > > > x86 (and others) do in fact use non-atomic instructions for > > spin_unlock(). But as this is all arch specific, we can make these > > assumptions. It's just that generic code cannot rely on it. > > OK despite being obvious now, I was not seeing the similarity between spin_*lock() > and bit_spin*lock() :-( > > ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for > LLSC case) would be correctly paired with bit_spin_lock(). > > But then why would anyone need bit_spin_unlock() at all. Especially after this > patch from you which tightens __bit_spin_unlock() even more for the general case. > > Thing is, if the API exists, the majority of people would use the more > conservative version w/o understanding all these nuances. Can we pursue the path of > moving bit_spin_unlock() over to __bit_spin_unlock(): first changing the backend > only and if proven stable replacing the call-sites themselves. So the thing is, __bit_spin_unlock() is not safe if other bits in that word can have concurrent modifications. Only if the bitlock locks the whole word (or something else ensures no other bits will change) can you use __bit_spin_unlock() to clear the lock bit.
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Mon, 14 Mar 2016 13:35:20 +0530 Message-ID: <56E670C0.7080901@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Sender: owner-linux-mm@kvack.org To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better, it must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops.
> > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non-serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bombadil.infradead.org ([198.137.202.9]:52678 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751687AbcCIKOE (ORCPT ); Wed, 9 Mar 2016 05:14:04 -0500 Date: Wed, 9 Mar 2016 11:13:49 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56DFC604.6070407@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J.
Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" Message-ID: <20160309101349.aOpRr4oybu4ltMSFiQw9XqBH84qMcY8w8cFQsAWMTWI@z> On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. > There is clearly a problem in slub code that it is pairing a test_and_set_bit() > with a __clear_bit(). Latter can obviously clobber former if they are not a single > instruction each unlike x86 or they use llock/scond kind of instructions where the > interim store from other core is detected and causes a retry of whole llock/scond > sequence. Yes, test_and_set_bit() + __clear_bit() is broken. > > If you take the lock in __bit_spin_unlock > > then the race cannot happen. > > Of course it won't but that means we penalize all non atomic callers of the API > with a superfluous spinlock which is not required in the first place given the > definition of API. Quite. _However_, your arch is still broken, but not your fault. It's the generic-asm code that is wrong. The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong.
--- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better, it must provide an implementation of __clear_bit_unlock() itself. Reported-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation.
*/ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from casper.infradead.org ([85.118.1.10]:56916 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932120AbcCIKbv (ORCPT ); Wed, 9 Mar 2016 05:31:51 -0500 Date: Wed, 9 Mar 2016 11:31:43 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309103143.GF25010@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" Message-ID: <20160309103143.cj2iolen9CHBz_1ODY_Rp9LqozAQNytbEEjwj8h1aVY@z> On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. 
> > If an arch is able to do better, it must provide an implementation of > __clear_bit_unlock() itself. > FWIW, we could probably undo this if we unified all the spinlock based atomic ops implementations (there's a whole bunch doing essentially the same), and special cased __clear_bit_unlock() for that. Collapsing them is probably a good idea anyway, just a fair bit of non-trivial work to figure out all the differences and if they matter etc.. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from casper.infradead.org ([85.118.1.10]:58630 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932179AbcCILlA (ORCPT ); Wed, 9 Mar 2016 06:41:00 -0500 Date: Wed, 9 Mar 2016 12:40:54 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E0024F.4070401@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160309114054.L_YfBvp6OSwiWiNhuKRaIAcCZPAMz4pwlYR9jJ1mSaQ@z> On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below.
> > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better, it must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non-serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more).
Cc: stable@vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. 
*/ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from casper.infradead.org ([85.118.1.10]:59643 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751487AbcCIMWX (ORCPT ); Wed, 9 Mar 2016 07:22:23 -0500 Date: Wed, 9 Mar 2016 13:22:17 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309122217.GK6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> <56E00EB6.4000201@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E00EB6.4000201@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter Message-ID: <20160309122217.cGLxJh_H_xhZPdsL_ZfL_wji4coNl2bCpnMY4wJ-5cg@z> On Wed, Mar 09, 2016 at 05:23:26PM +0530, Vineet Gupta wrote: > > I did not follow through the maze, I think the few archs implementing > > this simply do not include this file at all. > > > > I'll let the first person that cares about this worry about that :-) > > Ok - that'd be me :-) although I really don't see much gain in case of ARC LLSC. > > For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND at least in terms of > cache coherency transactions !
The win would be in not having to ever retry the SCOND. Although in this case, the contending CPU would be doing reads -- which I assume will not cause a SCOND to fail, so it might indeed not make any difference. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtprelay.synopsys.com ([198.182.47.9]:46072 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932654AbcCINX0 (ORCPT ); Wed, 9 Mar 2016 08:23:26 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E023A5.2000105@synopsys.com> Date: Wed, 9 Mar 2016 18:52:45 +0530 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160309132245.QJQb-UIEZ9Bcui_aPsuAeHGYfDXuser-xfLc77mSICE@z> On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? 
How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bombadil.infradead.org ([198.137.202.9]:53714 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932740AbcCIOva (ORCPT ); Wed, 9 Mar 2016 09:51:30 -0500 Date: Wed, 9 Mar 2016 15:51:19 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E023A5.2000105@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. 
Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160309145119.dVgqlbtQrMfLY5dtugh6mSWjCcTFrFxolH31jqKtu5c@z> On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: > >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() > >> with a __clear_bit(). Latter can obviously clobber former if they are not a single > >> instruction each unlike x86 or they use llock/scond kind of instructions where the > >> interim store from other core is detected and causes a retry of whole llock/scond > >> sequence. > > > > Yes, test_and_set_bit() + __clear_bit() is broken. > > But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > (ignoring the performance thing for discussion sake, which is a side effect of > this implementation). The sort answer is: Per definition. They are defined to work together, which is what makes __clear_bit_unlock() such a special function. > So despite the comment below in bit_spinlock.h I don't quite comprehend how this > is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > atomic lock with non-atomic unlock ? I'm missing something ! x86 (and others) do in fact use non-atomic instructions for spin_unlock(). But as this is all arch specific, we can make these assumptions. Its just that generic code cannot rely on it. So let me try and explain. 
The problem as identified is: CPU0 CPU1 bit_spin_lock() __bit_spin_unlock() 1: /* fetch_or, r1 holds the old value */ spin_lock load r1, addr load r1, addr bclr r2, r1, 1 store r2, addr or r2, r1, 1 store r2, addr /* lost the store from CPU1 */ spin_unlock and r1, 1 bnz 2 /* it was set, go wait */ ret 2: load r1, addr and r1, 1 bnz 2 /* wait until its not set */ b 1 /* try again */ For LL/SC we replace: spin_lock load r1, addr ... store r2, addr spin_unlock With the (obvious): 1: load-locked r1, addr ... store-cond r2, addr bnz 1 /* or whatever branch instruction is required to retry */ In this case the failure cannot happen, because the store from CPU1 would have invalidated the lock from CPU0 and caused the store-cond to fail and retry the loop, observing the new value. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bombadil.infradead.org ([198.137.202.9]:58873 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935170AbcCJJLK (ORCPT ); Thu, 10 Mar 2016 04:11:10 -0500 Date: Thu, 10 Mar 2016 10:10:58 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E10B59.1060700@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. 
Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim 
Message-ID: <20160310091058.gUdSrWAItpxbDO7ELYqsUYBE40hAxH1rilx3E9wQ1I0@z>

On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote:
> On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote:
> >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
> >> (ignoring the performance thing for discussion sake, which is a side effect of
> >> this implementation).
> >
> > The short answer is: per definition. They are defined to work together,
> > which is what makes __clear_bit_unlock() such a special function.
> >
> >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
> >> is allowable. And if, say by deduction, this is fine for LLSC or lock-prefixed
> >> cases, then isn't this true in general for a lot more cases in the kernel, i.e.
> >> pairing an atomic lock with a non-atomic unlock ? I'm missing something !
> >
> > x86 (and others) do in fact use non-atomic instructions for
> > spin_unlock(). But as this is all arch specific, we can make these
> > assumptions. It's just that generic code cannot rely on it.
>
> OK, despite being obvious now, I was not seeing the similarity between
> spin_*lock() and bit_spin_*lock() :-(
>
> ARC also uses a standard ST for spin_unlock(), so by analogy __bit_spin_unlock()
> (for the LLSC case) would be correctly paired with bit_spin_lock().
>
> But then why would anyone need bit_spin_unlock() at all? Especially after this
> patch from you which tightens __bit_spin_unlock() even more for the general case.
>
> Thing is, if the API exists, the majority of people would use the more
> conservative version w/o understanding all these nuances. Can we pursue the path
> of moving bit_spin_unlock() over to __bit_spin_unlock(): first changing the
> backend only, and if proven stable, replacing the call-sites themselves.
So the thing is, __bit_spin_unlock() is not safe if other bits in that
word can have concurrent modifications. Only if the bitlock locks the
whole word (or something else ensures no other bits will change) can you
use __bit_spin_unlock() to clear the lock bit.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from smtprelay.synopsys.com ([198.182.60.111]:55884 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933865AbcCNIFq (ORCPT ); Mon, 14 Mar 2016 04:05:46 -0400
Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net>
From: Vineet Gupta 
Message-ID: <56E670C0.7080901@synopsys.com>
Date: Mon, 14 Mar 2016 13:35:20 +0530
MIME-Version: 1.0
In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Sender: linux-arch-owner@vger.kernel.org
List-ID: 
To: Peter Zijlstra 
Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter 
Message-ID: <20160314080520.L4UVj8nHRxd2tpxpdbnDz_1yqcDfqNnmXWpRr8S6UlM@z>

On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote:
> ---
> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock()
>
> __clear_bit_unlock() is a special little snowflake. While it carries the
> non-atomic '__' prefix, it is specifically documented to pair with
> test_and_set_bit() and therefore should be 'somewhat' atomic.
>
> Therefore the generic implementation of __clear_bit_unlock() cannot use
> the fully non-atomic __clear_bit() as a default.
>
> If an arch is able to do better, it must provide an implementation of
> __clear_bit_unlock() itself.
>
> Specifically, this came up as a result of hackbench livelock'ing in
> slab_lock() on ARC with SMP + SLUB + !LLSC.
>
> The issue was incorrect pairing of atomic ops.
>
> slab_lock() -> bit_spin_lock() -> test_and_set_bit()
> slab_unlock() -> __bit_spin_unlock() -> __clear_bit()
>
> The non-serializing __clear_bit() was getting "lost":
>
> 80543b8e:  ld_s  r2,[r13,0]   <--- (A) Finds PG_locked is set
> 80543b90:  or    r3,r2,1      <--- (B) other core unlocks right here
> 80543b94:  st_s  r3,[r13,0]   <--- (C) sets PG_locked (overwrites unlock)
>
> Fixes ARC STAR 9000817404 (and probably more).
>
> Cc: stable@vger.kernel.org
> Reported-by: Vineet Gupta 
> Tested-by: Vineet Gupta 
> Signed-off-by: Peter Zijlstra (Intel) 

Peter, I don't see this in linux-next yet. I'm hoping you will send it
Linus' way for 4.6-rc1.

Thx,
-Vineet
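For reference, the pairing the patch enforces can be sketched in portable C11 atomics (hypothetical names, a sketch rather than the kernel implementation): lock via an atomic test-and-set with acquire ordering, unlock via an atomic clear with release ordering. Because the unlock is a real atomic RMW, a concurrent update to another bit in the same word can never be clobbered.

```c
#include <stdatomic.h>

#define LOCK_BIT 1u

/* Sketch only (hypothetical names, not kernel code). */
static void demo_bit_spin_lock(atomic_uint *word)
{
    /* spin until the previous value had the lock bit clear */
    while (atomic_fetch_or_explicit(word, LOCK_BIT,
                                    memory_order_acquire) & LOCK_BIT)
        ;
}

static void demo_bit_spin_unlock(atomic_uint *word)
{
    /* atomic clear with release ordering -- the "somewhat atomic"
     * behaviour __clear_bit_unlock() is documented to need; a plain
     * non-atomic clear here is exactly the bug discussed above */
    atomic_fetch_and_explicit(word, ~LOCK_BIT, memory_order_release);
}
```

On LL/SC or lock-prefixed architectures the fetch_and compiles to the retry loop described earlier in the thread, which is why the CPU1 store can no longer be lost.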