From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 12:13:16 +0530 Message-ID: <56DFC604.6070407@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: Sender: owner-linux-mm@kvack.org To: Christoph Lameter Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, Peter Zijlstra , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" List-Id: linux-arch.vger.kernel.org +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178 So there also we have the same race because the outer spin lock is not taken for slab_unlock() -> __bit_spin_lock() -> __clear_bit. Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make line writable and that would also elide this problem, but I think there is a fundamental problem here in slub which is pairing atomic and non atomic ops - for performance reasons. It doesn't work on all arches and/or configurations. > You need a true atomic op or you need to take the "spinlock" in all > cases where you modify the bit. No we don't in __bit_spin_lock and we already do in bit_spin_lock. > If you take the lock in __bit_spin_unlock > then the race cannot happen. Of course it won't but that means we penalize all non atomic callers of the API with a superfluous spinlock which is not require din first place given the definition of API. >> Are you convinced now ! > > Yes, please fix your arch specific code. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 11:13:49 +0100 Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56DFC604.6070407@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. > There is clearly a problem in slub code that it is pairing a test_and_set_bit() > with a __clear_bit(). Latter can obviously clobber former if they are not a single > instruction each unlike x86 or they use llock/scond kind of instructions where the > interim store from other core is detected and causes a retry of whole llock/scond > sequence. Yes, test_and_set_bit() + __clear_bit() is broken. > > If you take the lock in __bit_spin_unlock > > then the race cannot happen. > > Of course it won't but that means we penalize all non atomic callers of the API > with a superfluous spinlock which is not require din first place given the > definition of API. Quite. _However_, your arch is still broken, but not by your fault. Its the generic-asm code that is wrong. The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong. --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Reported-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 11:31:43 +0100 Message-ID: <20160309103143.GF25010@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: stable-owner@vger.kernel.org To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > FWIW, we could probably undo this if we unified all the spinlock based atomic ops implementations (there's a whole bunch doing essentially the same), and special cased __clear_bit_unlock() for that. Collapsing them is probably a good idea anyway, just a fair bit of non-trivial work to figure out all the differences and if they matter etc.. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 16:30:31 +0530 Message-ID: <56E0024F.4070401@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.47.9]:44037 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753133AbcCILBA (ORCPT ); Wed, 9 Mar 2016 06:01:00 -0500 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 16:42:43 +0530 Message-ID: <56E0052B.3080304@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.47.9]:44230 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753156AbcCILNI (ORCPT ); Wed, 9 Mar 2016 06:13:08 -0500 In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. >> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 12:40:54 +0100 Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56E0024F.4070401@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below. > > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Cc: stable@vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 17:23:26 +0530 Message-ID: <56E00EB6.4000201@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from smtprelay.synopsys.com ([198.182.60.111]:48017 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752678AbcCILxv (ORCPT ); Wed, 9 Mar 2016 06:53:51 -0500 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all >>> + * the bits in the word are protected by this lock some archs can use weaker >>> + * ops to safely unlock. >>> + * >>> + * See for example x86's implementation. >>> */ >> >> To be able to override/use-generic don't we need #ifndef .... > > I did not follow through the maze, I think the few archs implementing > this simply do not include this file at all. > > I'll let the first person that cares about this worry about that :-) Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of cache coherency transactions ! > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) LGTM. Thx a bunch Peter ! -Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 13:22:17 +0100 Message-ID: <20160309122217.GK6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> <56E00EB6.4000201@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <56E00EB6.4000201@synopsys.com> Sender: owner-linux-mm@kvack.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 05:23:26PM +0530, Vineet Gupta wrote: > > I did not follow through the maze, I think the few archs implementing > > this simply do not include this file at all. > > > > I'll let the first person that cares about this worry about that :-) > > Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. > > For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of > cache coherency transactions ! The win would be in not having to ever retry the SCOND. Although in this case, the contending CPU would be doing reads -- which I assume will not cause a SCOND to fail, so it might indeed not make any difference. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 18:52:45 +0530 Message-ID: <56E023A5.2000105@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: owner-linux-mm@kvack.org To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Wed, 9 Mar 2016 15:51:19 +0100 Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <56E023A5.2000105@synopsys.com> Sender: linux-kernel-owner@vger.kernel.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim List-Id: linux-arch.vger.kernel.org On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: > >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() > >> with a __clear_bit(). Latter can obviously clobber former if they are not a single > >> instruction each unlike x86 or they use llock/scond kind of instructions where the > >> interim store from other core is detected and causes a retry of whole llock/scond > >> sequence. > > > > Yes, test_and_set_bit() + __clear_bit() is broken. > > But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > (ignoring the performance thing for discussion sake, which is a side effect of > this implementation). The sort answer is: Per definition. They are defined to work together, which is what makes __clear_bit_unlock() such a special function. > So despite the comment below in bit_spinlock.h I don't quite comprehend how this > is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > atomic lock with non-atomic unlock ? I'm missing something ! x86 (and others) do in fact use non-atomic instructions for spin_unlock(). But as this is all arch specific, we can make these assumptions. Its just that generic code cannot rely on it. So let me try and explain. The problem as identified is: CPU0 CPU1 bit_spin_lock() __bit_spin_unlock() 1: /* fetch_or, r1 holds the old value */ spin_lock load r1, addr load r1, addr bclr r2, r1, 1 store r2, addr or r2, r1, 1 store r2, addr /* lost the store from CPU1 */ spin_unlock and r1, 1 bnz 2 /* it was set, go wait */ ret 2: load r1, addr and r1, 1 bnz 2 /* wait until its not set */ b 1 /* try again */ For LL/SC we replace: spin_lock load r1, addr ... store r2, addr spin_unlock With the (obvious): 1: load-locked r1, addr ... store-cond r2, addr bnz 1 /* or whatever branch instruction is required to retry */ In this case the failure cannot happen, because the store from CPU1 would have invalidated the lock from CPU0 and caused the store-cond to fail and retry the loop, observing the new value. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Thu, 10 Mar 2016 11:21:21 +0530 Message-ID: <56E10B59.1060700@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: Received: from us01smtprelay-2.synopsys.com ([198.182.47.9]:35939 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751571AbcCJFvr (ORCPT ); Thu, 10 Mar 2016 00:51:47 -0500 In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The sort answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something ! > > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. Its just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Specially after this patch from you which tightens __bit_spin_lock() even more for the general case. Thing is if the API exists majority of people would would use the more conservative version w/o understand all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again, A picture is worth thousand words ! Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Thu, 10 Mar 2016 10:10:58 +0100 Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <56E10B59.1060700@synopsys.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-snps-arc" Errors-To: linux-snps-arc-bounces+gla-linux-snps-arc=m.gmane.org@lists.infradead.org To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: > >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > >> (ignoring the performance thing for discussion sake, which is a side effect of > >> this implementation). > > > > The sort answer is: Per definition. They are defined to work together, > > which is what makes __clear_bit_unlock() such a special function. > > > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this > >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > >> atomic lock with non-atomic unlock ? I'm missing something ! > > > > x86 (and others) do in fact use non-atomic instructions for > > spin_unlock(). But as this is all arch specific, we can make these > > assumptions. Its just that generic code cannot rely on it. > > OK despite being obvious now, I was not seeing the similarity between spin_*lock() > and bit_spin*lock() :-( > > ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for > LLSC case) would be correctly paired with bit_spin_lock(). > > But then why would anyone need bit_spin_unlock() at all. Specially after this > patch from you which tightens __bit_spin_lock() even more for the general case. > > Thing is if the API exists majority of people would would use the more > conservative version w/o understand all these nuances. Can we pursue the path of > moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend > only and if proven stable replacing the call-sites themselves. So the thing is, __bit_spin_unlock() is not safe if other bits in that word can have concurrent modifications. Only if the bitlock locks the whole word (or something else ensures no other bits will change) can you use __bit_spin_unlock() to clear the lock bit. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet Gupta Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Mon, 14 Mar 2016 13:35:20 +0530 Message-ID: <56E670C0.7080901@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Sender: owner-linux-mm@kvack.org To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter List-Id: linux-arch.vger.kernel.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtprelay.synopsys.com ([198.182.60.111]:43086 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751063AbcCIGnk (ORCPT ); Wed, 9 Mar 2016 01:43:40 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> From: Vineet Gupta Message-ID: <56DFC604.6070407@synopsys.com> Date: Wed, 9 Mar 2016 12:13:16 +0530 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Christoph Lameter Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, Peter Zijlstra , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" Message-ID: <20160309064316.rePmuTTHGXGjE71cqJYXy7j9ocdkvkk_7Ou5kH7n9-s@z> +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178 So there also we have the same race because the outer spin lock is not taken for slab_unlock() -> __bit_spin_lock() -> __clear_bit. Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make line writable and that would also elide this problem, but I think there is a fundamental problem here in slub which is pairing atomic and non atomic ops - for performance reasons. It doesn't work on all arches and/or configurations. > You need a true atomic op or you need to take the "spinlock" in all > cases where you modify the bit. No we don't in __bit_spin_lock and we already do in bit_spin_lock. > If you take the lock in __bit_spin_unlock > then the race cannot happen. Of course it won't but that means we penalize all non atomic callers of the API with a superfluous spinlock which is not require din first place given the definition of API. >> Are you convinced now ! > > Yes, please fix your arch specific code. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bombadil.infradead.org ([198.137.202.9]:52678 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751687AbcCIKOE (ORCPT ); Wed, 9 Mar 2016 05:14:04 -0500 Date: Wed, 9 Mar 2016 11:13:49 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56DFC604.6070407@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" Message-ID: <20160309101349.aOpRr4oybu4ltMSFiQw9XqBH84qMcY8w8cFQsAWMTWI@z> On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. > There is clearly a problem in slub code that it is pairing a test_and_set_bit() > with a __clear_bit(). Latter can obviously clobber former if they are not a single > instruction each unlike x86 or they use llock/scond kind of instructions where the > interim store from other core is detected and causes a retry of whole llock/scond > sequence. Yes, test_and_set_bit() + __clear_bit() is broken. > > If you take the lock in __bit_spin_unlock > > then the race cannot happen. > > Of course it won't but that means we penalize all non atomic callers of the API > with a superfluous spinlock which is not require din first place given the > definition of API. Quite. _However_, your arch is still broken, but not by your fault. Its the generic-asm code that is wrong. The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong. --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Reported-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from casper.infradead.org ([85.118.1.10]:58630 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932179AbcCILlA (ORCPT ); Wed, 9 Mar 2016 06:41:00 -0500 Date: Wed, 9 Mar 2016 12:40:54 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E0024F.4070401@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160309114054.L_YfBvp6OSwiWiNhuKRaIAcCZPAMz4pwlYR9jJ1mSaQ@z> On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below. > > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Cc: stable@vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from casper.infradead.org ([85.118.1.10]:59643 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751487AbcCIMWX (ORCPT ); Wed, 9 Mar 2016 07:22:23 -0500 Date: Wed, 9 Mar 2016 13:22:17 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309122217.GK6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> <56E00EB6.4000201@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E00EB6.4000201@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter Message-ID: <20160309122217.cGLxJh_H_xhZPdsL_ZfL_wji4coNl2bCpnMY4wJ-5cg@z> On Wed, Mar 09, 2016 at 05:23:26PM +0530, Vineet Gupta wrote: > > I did not follow through the maze, I think the few archs implementing > > this simply do not include this file at all. > > > > I'll let the first person that cares about this worry about that :-) > > Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. > > For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of > cache coherency transactions ! The win would be in not having to ever retry the SCOND. Although in this case, the contending CPU would be doing reads -- which I assume will not cause a SCOND to fail, so it might indeed not make any difference. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtprelay.synopsys.com ([198.182.47.9]:46072 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932654AbcCINX0 (ORCPT ); Wed, 9 Mar 2016 08:23:26 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E023A5.2000105@synopsys.com> Date: Wed, 9 Mar 2016 18:52:45 +0530 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160309132245.QJQb-UIEZ9Bcui_aPsuAeHGYfDXuser-xfLc77mSICE@z> On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from bombadil.infradead.org ([198.137.202.9]:58873 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935170AbcCJJLK (ORCPT ); Thu, 10 Mar 2016 04:11:10 -0500 Date: Thu, 10 Mar 2016 10:10:58 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E10B59.1060700@synopsys.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim Message-ID: <20160310091058.gUdSrWAItpxbDO7ELYqsUYBE40hAxH1rilx3E9wQ1I0@z> On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: > >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > >> (ignoring the performance thing for discussion sake, which is a side effect of > >> this implementation). > > > > The sort answer is: Per definition. They are defined to work together, > > which is what makes __clear_bit_unlock() such a special function. > > > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this > >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > >> atomic lock with non-atomic unlock ? I'm missing something ! > > > > x86 (and others) do in fact use non-atomic instructions for > > spin_unlock(). But as this is all arch specific, we can make these > > assumptions. Its just that generic code cannot rely on it. > > OK despite being obvious now, I was not seeing the similarity between spin_*lock() > and bit_spin*lock() :-( > > ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for > LLSC case) would be correctly paired with bit_spin_lock(). > > But then why would anyone need bit_spin_unlock() at all. Specially after this > patch from you which tightens __bit_spin_lock() even more for the general case. > > Thing is if the API exists majority of people would would use the more > conservative version w/o understand all these nuances. Can we pursue the path of > moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend > only and if proven stable replacing the call-sites themselves. So the thing is, __bit_spin_unlock() is not safe if other bits in that word can have concurrent modifications. Only if the bitlock locks the whole word (or something else ensures no other bits will change) can you use __bit_spin_unlock() to clear the lock bit. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtprelay.synopsys.com ([198.182.60.111]:55884 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933865AbcCNIFq (ORCPT ); Mon, 14 Mar 2016 04:05:46 -0400 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E670C0.7080901@synopsys.com> Date: Mon, 14 Mar 2016 13:35:20 +0530 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter Message-ID: <20160314080520.L4UVj8nHRxd2tpxpdbnDz_1yqcDfqNnmXWpRr8S6UlM@z> On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Tue, 8 Mar 2016 20:00:57 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic List-ID: Message-ID: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> To: linux-snps-arc@lists.infradead.org We observed livelocks on ARC SMP setup when running hackbench with SLUB. This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops suh as test_and_set_bit() The spinlock itself is implemented using Atomic [EX]change instruction which is always available. The race happened when both cores tried to slab_lock() the same page. c1 c0 ----------- ----------- slab_lock slab_lock slab_unlock Not observing the unlock This in turn happened because slab_unlock() doesn't serialize properly (doesn't use atomic clear) with a concurrent running slab_lock()->test_and_set_bit() Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrew Morton Cc: Noam Camus Cc: Cc: Cc: Cc: Signed-off-by: Vineet Gupta --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c index d8fbd4a6ed59..b7d345a508dc 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) static __always_inline void slab_unlock(struct page *page) { VM_BUG_ON_PAGE(PageTail(page), page); - __bit_spin_unlock(PG_locked, &page->flags); + bit_spin_unlock(PG_locked, &page->flags); } static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) -- 2.5.0 From mboxrd@z Thu Jan 1 00:00:00 1970 From: cl@linux.com (Christoph Lameter) Date: Tue, 8 Mar 2016 09:00:57 -0600 (CST) Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> List-ID: Message-ID: To: linux-snps-arc@lists.infradead.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() This is intentional because of the increased latency of atomic instructions. Why would the unlock need to be atomic? This patch will cause regressions. Guess this is an architecture specific issue of modified cachelines not becoming visible to other processors? From mboxrd@z Thu Jan 1 00:00:00 1970 From: vbabka@suse.cz (Vlastimil Babka) Date: Tue, 8 Mar 2016 16:32:29 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> List-ID: Message-ID: <56DEF08D.607@suse.cz> To: linux-snps-arc@lists.infradead.org On 03/08/2016 03:30 PM, Vineet Gupta wrote: > We observed livelocks on ARC SMP setup when running hackbench with SLUB. > This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus > kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops > suh as test_and_set_bit() Sounds like this architecture should then redefine __clear_bit_unlock and perhaps other non-atomic __X_bit() variants to be atomic, and not defer this requirement to places that use the API? > The spinlock itself is implemented using Atomic [EX]change instruction > which is always available. > > The race happened when both cores tried to slab_lock() the same page. > > c1 c0 > ----------- ----------- > slab_lock > slab_lock > slab_unlock > Not observing the unlock > > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() > > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Andrew Morton > Cc: Noam Camus > Cc: > Cc: > Cc: > Cc: > Signed-off-by: Vineet Gupta > --- > mm/slub.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/slub.c b/mm/slub.c > index d8fbd4a6ed59..b7d345a508dc 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) > static __always_inline void slab_unlock(struct page *page) > { > VM_BUG_ON_PAGE(PageTail(page), page); > - __bit_spin_unlock(PG_locked, &page->flags); > + bit_spin_unlock(PG_locked, &page->flags); > } > > static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) > From mboxrd@z Thu Jan 1 00:00:00 1970 From: vgupta@synopsys.com (Vineet Gupta) Date: Tue, 8 Mar 2016 21:16:27 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> List-ID: Message-ID: <56DEF3D3.6080008@synopsys.com> To: linux-snps-arc@lists.infradead.org On Tuesday 08 March 2016 08:30 PM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> This in turn happened because slab_unlock() doesn't serialize properly >> (doesn't use atomic clear) with a concurrent running >> slab_lock()->test_and_set_bit() > > This is intentional because of the increased latency of atomic > instructions. Why would the unlock need to be atomic? This patch will > cause regressions. > > Guess this is an architecture specific issue of modified > cachelines not becoming visible to other processors? Absolutely not - we verified with the hardware coherency tracing that there was no foul play there. And I would dare not point finger at code which was last updated in 2011 w/o being absolutely sure. Let me explain this in bit more detail. Like I mentioned in commitlog, this config of ARC doesn't have exclusive load/stores (LLOCK/SCOND) so atomic ops are implemented using a "central" spin lock. The spin lock itself is implemented using EX instruction (atomic R-W) Generated code for slab_lock() - essentially bit_spin_lock() is below (I've removed generated code for CONFIG_PREEMPT for simplicity) 80543b0c : 80543b0c: push_s blink ... 80543b3a: mov_s r15,0x809de168 <-- @smp_bitops_lock 80543b40: mov_s r17,1 80543b46: mov_s r16,0 # spin lock() inside test_and_set_bit() - see arc bitops.h (!LLSC code) 80543b78: clri r4 80543b7c: dmb 3 80543b80: mov_s r2,r17 80543b82: ex r2,[r15] 80543b86: breq r2,1,80543b82 80543b8a: dmb 3 # set the bit 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) # spin_unlock 80543b96: dmb 3 80543b9a: mov_s r3,r16 80543b9c: ex r3,[r15] 80543ba0: dmb 3 80543ba4: seti r4 # check the old bit 80543ba8: bbit0 r2,0,80543bb8 <--- bit was set, branch not taken 80543bac: b_s 80543b68 <--- enter the test_bit() loop 80543b68: ld_s r2,[r13,0] <-- (C) reads the bit, set by SELF 80543b6a: bbit1 r2,0,80543b68 spins infinitely ... Now using hardware coherency tracing (and using the cycle timestamps) we verified (A) and (B) Thing is with exclusive load/store this race can't just happen since the intervening ST will cause the ST in (C) to NOT commit and the LD/ST will be retried. And there will be very few production systems which are SMP but lack exclusive load/stores. Are you convinced now ! -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 From: cl@linux.com (Christoph Lameter) Date: Tue, 8 Mar 2016 14:40:52 -0600 (CST) Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56DEF3D3.6080008@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> List-ID: Message-ID: To: linux-snps-arc@lists.infradead.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > # set the bit > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Duh. Guess you need to take the spinlock also in the arch specific implementation of __bit_spin_unlock(). This is certainly not the only case in which we use the __ op to unlock. You need a true atomic op or you need to take the "spinlock" in all cases where you modify the bit. If you take the lock in __bit_spin_unlock then the race cannot happen. > Are you convinced now ! Yes, please fix your arch specific code. From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Wed, 9 Mar 2016 11:31:43 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> List-ID: Message-ID: <20160309103143.GF25010@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Wed, Mar 09, 2016@11:13:49AM +0100, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > FWIW, we could probably undo this if we unified all the spinlock based atomic ops implementations (there's a whole bunch doing essentially the same), and special cased __clear_bit_unlock() for that. Collapsing them is probably a good idea anyway, just a fair bit of non-trivial work to figure out all the differences and if they matter etc.. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Wed, 9 Mar 2016 16:30:31 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E0024F.4070401@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Wed, 9 Mar 2016 16:42:43 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E0052B.3080304@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016@11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. >> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Wed, 9 Mar 2016 12:13:16 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> List-ID: Message-ID: <56DFC604.6070407@synopsys.com> To: linux-snps-arc@lists.infradead.org +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178 So there also we have the same race because the outer spin lock is not taken for slab_unlock() -> __bit_spin_lock() -> __clear_bit. Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make line writable and that would also elide this problem, but I think there is a fundamental problem here in slub which is pairing atomic and non atomic ops - for performance reasons. It doesn't work on all arches and/or configurations. > You need a true atomic op or you need to take the "spinlock" in all > cases where you modify the bit. No we don't in __bit_spin_lock and we already do in bit_spin_lock. > If you take the lock in __bit_spin_unlock > then the race cannot happen. Of course it won't but that means we penalize all non atomic callers of the API with a superfluous spinlock which is not require din first place given the definition of API. >> Are you convinced now ! > > Yes, please fix your arch specific code. From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Wed, 9 Mar 2016 11:13:49 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56DFC604.6070407@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> List-ID: Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Wed, Mar 09, 2016@12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. > There is clearly a problem in slub code that it is pairing a test_and_set_bit() > with a __clear_bit(). Latter can obviously clobber former if they are not a single > instruction each unlike x86 or they use llock/scond kind of instructions where the > interim store from other core is detected and causes a retry of whole llock/scond > sequence. Yes, test_and_set_bit() + __clear_bit() is broken. > > If you take the lock in __bit_spin_unlock > > then the race cannot happen. > > Of course it won't but that means we penalize all non atomic callers of the API > with a superfluous spinlock which is not require din first place given the > definition of API. Quite. _However_, your arch is still broken, but not by your fault. Its the generic-asm code that is wrong. The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong. --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Reported-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Wed, 9 Mar 2016 12:40:54 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56E0024F.4070401@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> List-ID: Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Wed, Mar 09, 2016@04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below. > > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Cc: stable at vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Wed, 9 Mar 2016 17:23:26 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E00EB6.4000201@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016@04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all >>> + * the bits in the word are protected by this lock some archs can use weaker >>> + * ops to safely unlock. >>> + * >>> + * See for example x86's implementation. >>> */ >> >> To be able to override/use-generic don't we need #ifndef .... > > I did not follow through the maze, I think the few archs implementing > this simply do not include this file at all. > > I'll let the first person that cares about this worry about that :-) Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of cache coherency transactions ! > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable at vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) LGTM. Thx a bunch Peter ! -Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Wed, 9 Mar 2016 15:51:19 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56E023A5.2000105@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> List-ID: Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Wed, Mar 09, 2016@06:52:45PM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: > >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() > >> with a __clear_bit(). Latter can obviously clobber former if they are not a single > >> instruction each unlike x86 or they use llock/scond kind of instructions where the > >> interim store from other core is detected and causes a retry of whole llock/scond > >> sequence. > > > > Yes, test_and_set_bit() + __clear_bit() is broken. > > But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > (ignoring the performance thing for discussion sake, which is a side effect of > this implementation). The sort answer is: Per definition. They are defined to work together, which is what makes __clear_bit_unlock() such a special function. > So despite the comment below in bit_spinlock.h I don't quite comprehend how this > is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > atomic lock with non-atomic unlock ? I'm missing something ! x86 (and others) do in fact use non-atomic instructions for spin_unlock(). But as this is all arch specific, we can make these assumptions. Its just that generic code cannot rely on it. So let me try and explain. The problem as identified is: CPU0 CPU1 bit_spin_lock() __bit_spin_unlock() 1: /* fetch_or, r1 holds the old value */ spin_lock load r1, addr load r1, addr bclr r2, r1, 1 store r2, addr or r2, r1, 1 store r2, addr /* lost the store from CPU1 */ spin_unlock and r1, 1 bnz 2 /* it was set, go wait */ ret 2: load r1, addr and r1, 1 bnz 2 /* wait until its not set */ b 1 /* try again */ For LL/SC we replace: spin_lock load r1, addr ... store r2, addr spin_unlock With the (obvious): 1: load-locked r1, addr ... store-cond r2, addr bnz 1 /* or whatever branch instruction is required to retry */ In this case the failure cannot happen, because the store from CPU1 would have invalidated the lock from CPU0 and caused the store-cond to fail and retry the loop, observing the new value. From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Wed, 9 Mar 2016 13:22:17 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56E00EB6.4000201@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> <56E00EB6.4000201@synopsys.com> List-ID: Message-ID: <20160309122217.GK6356@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Wed, Mar 09, 2016@05:23:26PM +0530, Vineet Gupta wrote: > > I did not follow through the maze, I think the few archs implementing > > this simply do not include this file at all. > > > > I'll let the first person that cares about this worry about that :-) > > Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. > > For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of > cache coherency transactions ! The win would be in not having to ever retry the SCOND. Although in this case, the contending CPU would be doing reads -- which I assume will not cause a SCOND to fail, so it might indeed not make any difference. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Wed, 9 Mar 2016 18:52:45 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E023A5.2000105@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Thu, 10 Mar 2016 11:21:21 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E10B59.1060700@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The sort answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something ! > > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. Its just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Specially after this patch from you which tightens __bit_spin_lock() even more for the general case. Thing is if the API exists majority of people would would use the more conservative version w/o understand all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again, A picture is worth thousand words ! Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 From: peterz@infradead.org (Peter Zijlstra) Date: Thu, 10 Mar 2016 10:10:58 +0100 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56E10B59.1060700@synopsys.com> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> List-ID: Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> To: linux-snps-arc@lists.infradead.org On Thu, Mar 10, 2016@11:21:21AM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: > >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > >> (ignoring the performance thing for discussion sake, which is a side effect of > >> this implementation). > > > > The sort answer is: Per definition. They are defined to work together, > > which is what makes __clear_bit_unlock() such a special function. > > > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this > >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > >> atomic lock with non-atomic unlock ? I'm missing something ! > > > > x86 (and others) do in fact use non-atomic instructions for > > spin_unlock(). But as this is all arch specific, we can make these > > assumptions. Its just that generic code cannot rely on it. > > OK despite being obvious now, I was not seeing the similarity between spin_*lock() > and bit_spin*lock() :-( > > ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for > LLSC case) would be correctly paired with bit_spin_lock(). > > But then why would anyone need bit_spin_unlock() at all. Specially after this > patch from you which tightens __bit_spin_lock() even more for the general case. > > Thing is if the API exists majority of people would would use the more > conservative version w/o understand all these nuances. Can we pursue the path of > moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend > only and if proven stable replacing the call-sites themselves. So the thing is, __bit_spin_unlock() is not safe if other bits in that word can have concurrent modifications. Only if the bitlock locks the whole word (or something else ensures no other bits will change) can you use __bit_spin_unlock() to clear the lock bit. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vineet.Gupta1@synopsys.com (Vineet Gupta) Date: Mon, 14 Mar 2016 13:35:20 +0530 Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> List-ID: Message-ID: <56E670C0.7080901@synopsys.com> To: linux-snps-arc@lists.infradead.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable at vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f49.google.com (mail-pa0-f49.google.com [209.85.220.49]) by kanga.kvack.org (Postfix) with ESMTP id 9C3C76B0256 for ; Tue, 8 Mar 2016 09:31:22 -0500 (EST) Received: by mail-pa0-f49.google.com with SMTP id bj10so13803268pad.2 for ; Tue, 08 Mar 2016 06:31:22 -0800 (PST) Received: from smtprelay.synopsys.com (smtprelay.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id hq1si5111768pac.56.2016.03.08.06.31.21 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 08 Mar 2016 06:31:21 -0800 (PST) From: Vineet Gupta Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Tue, 8 Mar 2016 20:00:57 +0530 Message-ID: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain Sender: owner-linux-mm@kvack.org List-ID: To: linux-mm@kvack.org Cc: Vineet Gupta , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org We observed livelocks on ARC SMP setup when running hackbench with SLUB. This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops suh as test_and_set_bit() The spinlock itself is implemented using Atomic [EX]change instruction which is always available. The race happened when both cores tried to slab_lock() the same page. c1 c0 ----------- ----------- slab_lock slab_lock slab_unlock Not observing the unlock This in turn happened because slab_unlock() doesn't serialize properly (doesn't use atomic clear) with a concurrent running slab_lock()->test_and_set_bit() Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrew Morton Cc: Noam Camus Cc: Cc: Cc: Cc: Signed-off-by: Vineet Gupta --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c index d8fbd4a6ed59..b7d345a508dc 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) static __always_inline void slab_unlock(struct page *page) { VM_BUG_ON_PAGE(PageTail(page), page); - __bit_spin_unlock(PG_locked, &page->flags); + bit_spin_unlock(PG_locked, &page->flags); } static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) -- 2.5.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f171.google.com (mail-ig0-f171.google.com [209.85.213.171]) by kanga.kvack.org (Postfix) with ESMTP id BB6116B0254 for ; Tue, 8 Mar 2016 10:01:03 -0500 (EST) Received: by mail-ig0-f171.google.com with SMTP id ig19so19534902igb.1 for ; Tue, 08 Mar 2016 07:01:03 -0800 (PST) Received: from resqmta-ch2-11v.sys.comcast.net (resqmta-ch2-11v.sys.comcast.net. [2001:558:fe21:29:69:252:207:43]) by mx.google.com with ESMTPS id i198si5142298ioi.75.2016.03.08.07.00.59 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 08 Mar 2016 07:00:59 -0800 (PST) Date: Tue, 8 Mar 2016 09:00:57 -0600 (CST) From: Christoph Lameter Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Message-ID: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Content-Type: text/plain; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() This is intentional because of the increased latency of atomic instructions. Why would the unlock need to be atomic? This patch will cause regressions. Guess this is an architecture specific issue of modified cachelines not becoming visible to other processors? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f49.google.com (mail-wm0-f49.google.com [74.125.82.49]) by kanga.kvack.org (Postfix) with ESMTP id 998896B0005 for ; Tue, 8 Mar 2016 10:32:34 -0500 (EST) Received: by mail-wm0-f49.google.com with SMTP id p65so154776516wmp.1 for ; Tue, 08 Mar 2016 07:32:34 -0800 (PST) Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id kv8si4474175wjb.17.2016.03.08.07.32.32 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 08 Mar 2016 07:32:33 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> From: Vlastimil Babka Message-ID: <56DEF08D.607@suse.cz> Date: Tue, 8 Mar 2016 16:32:29 +0100 MIME-Version: 1.0 In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta , linux-mm@kvack.org Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org On 03/08/2016 03:30 PM, Vineet Gupta wrote: > We observed livelocks on ARC SMP setup when running hackbench with SLUB. > This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus > kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops > suh as test_and_set_bit() Sounds like this architecture should then redefine __clear_bit_unlock and perhaps other non-atomic __X_bit() variants to be atomic, and not defer this requirement to places that use the API? > The spinlock itself is implemented using Atomic [EX]change instruction > which is always available. > > The race happened when both cores tried to slab_lock() the same page. > > c1 c0 > ----------- ----------- > slab_lock > slab_lock > slab_unlock > Not observing the unlock > > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() > > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Andrew Morton > Cc: Noam Camus > Cc: > Cc: > Cc: > Cc: > Signed-off-by: Vineet Gupta > --- > mm/slub.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/slub.c b/mm/slub.c > index d8fbd4a6ed59..b7d345a508dc 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) > static __always_inline void slab_unlock(struct page *page) > { > VM_BUG_ON_PAGE(PageTail(page), page); > - __bit_spin_unlock(PG_locked, &page->flags); > + bit_spin_unlock(PG_locked, &page->flags); > } > > static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com [209.85.192.178]) by kanga.kvack.org (Postfix) with ESMTP id A91A76B0005 for ; Tue, 8 Mar 2016 10:46:35 -0500 (EST) Received: by mail-pf0-f178.google.com with SMTP id 124so15554084pfg.0 for ; Tue, 08 Mar 2016 07:46:35 -0800 (PST) Received: from mail-pa0-x243.google.com (mail-pa0-x243.google.com. [2607:f8b0:400e:c03::243]) by mx.google.com with ESMTPS id h75si5430688pfh.96.2016.03.08.07.46.34 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 08 Mar 2016 07:46:34 -0800 (PST) Received: by mail-pa0-x243.google.com with SMTP id 1so1385830pal.3 for ; Tue, 08 Mar 2016 07:46:34 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> From: Vineet Gupta Message-ID: <56DEF3D3.6080008@synopsys.com> Date: Tue, 8 Mar 2016 21:16:27 +0530 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Christoph Lameter Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org On Tuesday 08 March 2016 08:30 PM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> This in turn happened because slab_unlock() doesn't serialize properly >> (doesn't use atomic clear) with a concurrent running >> slab_lock()->test_and_set_bit() > > This is intentional because of the increased latency of atomic > instructions. Why would the unlock need to be atomic? This patch will > cause regressions. > > Guess this is an architecture specific issue of modified > cachelines not becoming visible to other processors? Absolutely not - we verified with the hardware coherency tracing that there was no foul play there. And I would dare not point finger at code which was last updated in 2011 w/o being absolutely sure. Let me explain this in bit more detail. Like I mentioned in commitlog, this config of ARC doesn't have exclusive load/stores (LLOCK/SCOND) so atomic ops are implemented using a "central" spin lock. The spin lock itself is implemented using EX instruction (atomic R-W) Generated code for slab_lock() - essentially bit_spin_lock() is below (I've removed generated code for CONFIG_PREEMPT for simplicity) 80543b0c : 80543b0c: push_s blink ... 80543b3a: mov_s r15,0x809de168 <-- @smp_bitops_lock 80543b40: mov_s r17,1 80543b46: mov_s r16,0 # spin lock() inside test_and_set_bit() - see arc bitops.h (!LLSC code) 80543b78: clri r4 80543b7c: dmb 3 80543b80: mov_s r2,r17 80543b82: ex r2,[r15] 80543b86: breq r2,1,80543b82 80543b8a: dmb 3 # set the bit 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) # spin_unlock 80543b96: dmb 3 80543b9a: mov_s r3,r16 80543b9c: ex r3,[r15] 80543ba0: dmb 3 80543ba4: seti r4 # check the old bit 80543ba8: bbit0 r2,0,80543bb8 <--- bit was set, branch not taken 80543bac: b_s 80543b68 <--- enter the test_bit() loop 80543b68: ld_s r2,[r13,0] <-- (C) reads the bit, set by SELF 80543b6a: bbit1 r2,0,80543b68 spins infinitely ... Now using hardware coherency tracing (and using the cycle timestamps) we verified (A) and (B) Thing is with exclusive load/store this race can't just happen since the intervening ST will cause the ST in (C) to NOT commit and the LD/ST will be retried. And there will be very few production systems which are SMP but lack exclusive load/stores. Are you convinced now ! -Vineet -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io0-f181.google.com (mail-io0-f181.google.com [209.85.223.181]) by kanga.kvack.org (Postfix) with ESMTP id C857F6B0253 for ; Tue, 8 Mar 2016 15:40:54 -0500 (EST) Received: by mail-io0-f181.google.com with SMTP id m184so41325179iof.1 for ; Tue, 08 Mar 2016 12:40:54 -0800 (PST) Received: from resqmta-ch2-03v.sys.comcast.net (resqmta-ch2-03v.sys.comcast.net. [2001:558:fe21:29:69:252:207:35]) by mx.google.com with ESMTPS id 64si6828511ioz.0.2016.03.08.12.40.54 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 08 Mar 2016 12:40:54 -0800 (PST) Date: Tue, 8 Mar 2016 14:40:52 -0600 (CST) From: Christoph Lameter Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56DEF3D3.6080008@synopsys.com> Message-ID: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> Content-Type: text/plain; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > # set the bit > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Duh. Guess you need to take the spinlock also in the arch specific implementation of __bit_spin_unlock(). This is certainly not the only case in which we use the __ op to unlock. You need a true atomic op or you need to take the "spinlock" in all cases where you modify the bit. If you take the lock in __bit_spin_unlock then the race cannot happen. > Are you convinced now ! Yes, please fix your arch specific code. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f42.google.com (mail-wm0-f42.google.com [74.125.82.42]) by kanga.kvack.org (Postfix) with ESMTP id 3E47B6B0253 for ; Wed, 9 Mar 2016 05:31:51 -0500 (EST) Received: by mail-wm0-f42.google.com with SMTP id l68so171239631wml.0 for ; Wed, 09 Mar 2016 02:31:51 -0800 (PST) Received: from casper.infradead.org (casper.infradead.org. [2001:770:15f::2]) by mx.google.com with ESMTPS id w66si20448814wmd.51.2016.03.09.02.31.49 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 02:31:50 -0800 (PST) Date: Wed, 9 Mar 2016 11:31:43 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309103143.GF25010@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > FWIW, we could probably undo this if we unified all the spinlock based atomic ops implementations (there's a whole bunch doing essentially the same), and special cased __clear_bit_unlock() for that. Collapsing them is probably a good idea anyway, just a fair bit of non-trivial work to figure out all the differences and if they matter etc.. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com [209.85.192.178]) by kanga.kvack.org (Postfix) with ESMTP id EB7BE6B0005 for ; Wed, 9 Mar 2016 05:14:00 -0500 (EST) Received: by mail-pf0-f178.google.com with SMTP id x188so36724268pfb.2 for ; Wed, 09 Mar 2016 02:14:00 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id y64si11540505pfi.87.2016.03.09.02.14.00 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 02:14:00 -0800 (PST) Date: Wed, 9 Mar 2016 11:13:49 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309101349.GJ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56DFC604.6070407@synopsys.com> Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" On Wed, Mar 09, 2016 at 12:13:16PM +0530, Vineet Gupta wrote: > +CC linux-arch, parisc folks, PeterZ > > On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > > On Tue, 8 Mar 2016, Vineet Gupta wrote: > > > >> # set the bit > >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > > > Duh. Guess you need to take the spinlock also in the arch specific > > implementation of __bit_spin_unlock(). This is certainly not the only case > > in which we use the __ op to unlock. > > __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is > - so I don't think we need a spinlock there. Agreed. The double underscore prefixed instructions are not required to be atomic in any way shape or form. > There is clearly a problem in slub code that it is pairing a test_and_set_bit() > with a __clear_bit(). Latter can obviously clobber former if they are not a single > instruction each unlike x86 or they use llock/scond kind of instructions where the > interim store from other core is detected and causes a retry of whole llock/scond > sequence. Yes, test_and_set_bit() + __clear_bit() is broken. > > If you take the lock in __bit_spin_unlock > > then the race cannot happen. > > Of course it won't but that means we penalize all non atomic callers of the API > with a superfluous spinlock which is not require din first place given the > definition of API. Quite. _However_, your arch is still broken, but not by your fault. Its the generic-asm code that is wrong. The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which defaults to __clear_bit(). Which is wrong. --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Reported-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f45.google.com (mail-pa0-f45.google.com [209.85.220.45]) by kanga.kvack.org (Postfix) with ESMTP id 7F3876B0005 for ; Wed, 9 Mar 2016 06:01:01 -0500 (EST) Received: by mail-pa0-f45.google.com with SMTP id tt10so36778389pab.3 for ; Wed, 09 Mar 2016 03:01:01 -0800 (PST) Received: from smtprelay.synopsys.com (smtprelay4.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id c67si11824851pfj.47.2016.03.09.03.01.00 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 03:01:00 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E0024F.4070401@synopsys.com> Date: Wed, 9 Mar 2016 16:30:31 +0530 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f48.google.com (mail-pa0-f48.google.com [209.85.220.48]) by kanga.kvack.org (Postfix) with ESMTP id 341406B0255 for ; Wed, 9 Mar 2016 06:13:08 -0500 (EST) Received: by mail-pa0-f48.google.com with SMTP id td3so10475557pab.2 for ; Wed, 09 Mar 2016 03:13:08 -0800 (PST) Received: from smtprelay.synopsys.com (smtprelay4.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id 83si11886331pfl.78.2016.03.09.03.13.07 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 03:13:07 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E0052B.3080304@synopsys.com> Date: Wed, 9 Mar 2016 16:42:43 +0530 MIME-Version: 1.0 In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Peter Zijlstra Cc: Christoph Lameter , linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, linux-parisc@vger.kernel, "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. >> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f46.google.com (mail-pa0-f46.google.com [209.85.220.46]) by kanga.kvack.org (Postfix) with ESMTP id DC3886B0005 for ; Wed, 9 Mar 2016 06:53:51 -0500 (EST) Received: by mail-pa0-f46.google.com with SMTP id fl4so37758813pad.0 for ; Wed, 09 Mar 2016 03:53:51 -0800 (PST) Received: from smtprelay.synopsys.com (smtprelay.synopsys.com. [198.182.60.111]) by mx.google.com with ESMTPS id ju17si5019218pab.5.2016.03.09.03.53.50 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 03:53:51 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E00EB6.4000201@synopsys.com> Date: Wed, 9 Mar 2016 17:23:26 +0530 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , linux-snps-arc@lists.infradead.org, Christoph Lameter On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all >>> + * the bits in the word are protected by this lock some archs can use weaker >>> + * ops to safely unlock. >>> + * >>> + * See for example x86's implementation. >>> */ >> >> To be able to override/use-generic don't we need #ifndef .... > > I did not follow through the maze, I think the few archs implementing > this simply do not include this file at all. > > I'll let the first person that cares about this worry about that :-) Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of cache coherency transactions ! > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) LGTM. Thx a bunch Peter ! -Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f43.google.com (mail-wm0-f43.google.com [74.125.82.43]) by kanga.kvack.org (Postfix) with ESMTP id A87766B0005 for ; Wed, 9 Mar 2016 06:41:00 -0500 (EST) Received: by mail-wm0-f43.google.com with SMTP id n186so173948967wmn.1 for ; Wed, 09 Mar 2016 03:41:00 -0800 (PST) Received: from casper.infradead.org (casper.infradead.org. [2001:770:15f::2]) by mx.google.com with ESMTPS id 5si26630652wmw.30.2016.03.09.03.40.59 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 03:40:59 -0800 (PST) Date: Wed, 9 Mar 2016 12:40:54 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E0024F.4070401@synopsys.com> Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: > FWIW, could we add some background to commit log, specifically what prompted this. > Something like below... Sure.. find below. > > +++ b/include/asm-generic/bitops/lock.h > > @@ -29,16 +29,16 @@ do { \ > > * @nr: the bit to set > > * @addr: the address to start counting from > > * > > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > > + * the bits in the word are protected by this lock some archs can use weaker > > + * ops to safely unlock. > > + * > > + * See for example x86's implementation. > > */ > > To be able to override/use-generic don't we need #ifndef .... I did not follow through the maze, I think the few archs implementing this simply do not include this file at all. I'll let the first person that cares about this worry about that :-) --- Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Cc: stable@vger.kernel.org Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e94806..8ef0ccbf8167 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) by kanga.kvack.org (Postfix) with ESMTP id 552106B0005 for ; Wed, 9 Mar 2016 09:51:28 -0500 (EST) Received: by mail-pa0-f52.google.com with SMTP id td3so14778598pab.2 for ; Wed, 09 Mar 2016 06:51:28 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id t13si1503757pas.225.2016.03.09.06.51.27 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 06:51:27 -0800 (PST) Date: Wed, 9 Mar 2016 15:51:19 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E023A5.2000105@synopsys.com> Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: > >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() > >> with a __clear_bit(). Latter can obviously clobber former if they are not a single > >> instruction each unlike x86 or they use llock/scond kind of instructions where the > >> interim store from other core is detected and causes a retry of whole llock/scond > >> sequence. > > > > Yes, test_and_set_bit() + __clear_bit() is broken. > > But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > (ignoring the performance thing for discussion sake, which is a side effect of > this implementation). The sort answer is: Per definition. They are defined to work together, which is what makes __clear_bit_unlock() such a special function. > So despite the comment below in bit_spinlock.h I don't quite comprehend how this > is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > atomic lock with non-atomic unlock ? I'm missing something ! x86 (and others) do in fact use non-atomic instructions for spin_unlock(). But as this is all arch specific, we can make these assumptions. Its just that generic code cannot rely on it. So let me try and explain. The problem as identified is: CPU0 CPU1 bit_spin_lock() __bit_spin_unlock() 1: /* fetch_or, r1 holds the old value */ spin_lock load r1, addr load r1, addr bclr r2, r1, 1 store r2, addr or r2, r1, 1 store r2, addr /* lost the store from CPU1 */ spin_unlock and r1, 1 bnz 2 /* it was set, go wait */ ret 2: load r1, addr and r1, 1 bnz 2 /* wait until its not set */ b 1 /* try again */ For LL/SC we replace: spin_lock load r1, addr ... store r2, addr spin_unlock With the (obvious): 1: load-locked r1, addr ... store-cond r2, addr bnz 1 /* or whatever branch instruction is required to retry */ In this case the failure cannot happen, because the store from CPU1 would have invalidated the lock from CPU0 and caused the store-cond to fail and retry the loop, observing the new value. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f42.google.com (mail-pa0-f42.google.com [209.85.220.42]) by kanga.kvack.org (Postfix) with ESMTP id 36CB16B0005 for ; Thu, 10 Mar 2016 00:51:47 -0500 (EST) Received: by mail-pa0-f42.google.com with SMTP id td3so32143015pab.2 for ; Wed, 09 Mar 2016 21:51:47 -0800 (PST) Received: from smtprelay.synopsys.com (smtprelay.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id n80si3581930pfj.17.2016.03.09.21.51.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 09 Mar 2016 21:51:46 -0800 (PST) Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> From: Vineet Gupta Message-ID: <56E10B59.1060700@synopsys.com> Date: Thu, 10 Mar 2016 11:21:21 +0530 MIME-Version: 1.0 In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Peter Zijlstra Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The sort answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something ! > > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. Its just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Specially after this patch from you which tightens __bit_spin_lock() even more for the general case. Thing is if the API exists majority of people would would use the more conservative version w/o understand all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again, A picture is worth thousand words ! Thx, -Vineet -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf0-f173.google.com (mail-pf0-f173.google.com [209.85.192.173]) by kanga.kvack.org (Postfix) with ESMTP id F33D86B0005 for ; Thu, 10 Mar 2016 04:11:06 -0500 (EST) Received: by mail-pf0-f173.google.com with SMTP id u190so34776782pfb.3 for ; Thu, 10 Mar 2016 01:11:06 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id z25si4705202pfa.170.2016.03.10.01.11.06 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 10 Mar 2016 01:11:06 -0800 (PST) Date: Thu, 10 Mar 2016 10:10:58 +0100 From: Peter Zijlstra Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Message-ID: <20160310091058.GQ6344@twins.programming.kicks-ass.net> References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> <56E10B59.1060700@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56E10B59.1060700@synopsys.com> Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: "linux-arch@vger.kernel.org" , linux-parisc@vger.kernel, Andrew Morton , Helge Deller , linux-kernel@vger.kernel.org, stable@vger.kernel.org, "James E.J. Bottomley" , Pekka Enberg , linux-mm@kvack.org, Noam Camus , David Rientjes , Christoph Lameter , linux-snps-arc@lists.infradead.org, Joonsoo Kim On Thu, Mar 10, 2016 at 11:21:21AM +0530, Vineet Gupta wrote: > On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: > >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so > >> (ignoring the performance thing for discussion sake, which is a side effect of > >> this implementation). > > > > The sort answer is: Per definition. They are defined to work together, > > which is what makes __clear_bit_unlock() such a special function. > > > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this > >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed > >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing > >> atomic lock with non-atomic unlock ? I'm missing something ! > > > > x86 (and others) do in fact use non-atomic instructions for > > spin_unlock(). But as this is all arch specific, we can make these > > assumptions. Its just that generic code cannot rely on it. > > OK despite being obvious now, I was not seeing the similarity between spin_*lock() > and bit_spin*lock() :-( > > ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for > LLSC case) would be correctly paired with bit_spin_lock(). > > But then why would anyone need bit_spin_unlock() at all. Specially after this > patch from you which tightens __bit_spin_lock() even more for the general case. > > Thing is if the API exists majority of people would would use the more > conservative version w/o understand all these nuances. Can we pursue the path of > moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend > only and if proven stable replacing the call-sites themselves. So the thing is, __bit_spin_unlock() is not safe if other bits in that word can have concurrent modifications. Only if the bitlock locks the whole word (or something else ensures no other bits will change) can you use __bit_spin_unlock() to clear the lock bit. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752822AbcCHObb (ORCPT ); Tue, 8 Mar 2016 09:31:31 -0500 Received: from smtprelay4.synopsys.com ([198.182.47.9]:48671 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751516AbcCHObW (ORCPT ); Tue, 8 Mar 2016 09:31:22 -0500 From: Vineet Gupta To: CC: Vineet Gupta , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , , , Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Tue, 8 Mar 2016 20:00:57 +0530 Message-ID: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> X-Mailer: git-send-email 2.5.0 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org We observed livelocks on ARC SMP setup when running hackbench with SLUB. This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops suh as test_and_set_bit() The spinlock itself is implemented using Atomic [EX]change instruction which is always available. The race happened when both cores tried to slab_lock() the same page. c1 c0 ----------- ----------- slab_lock slab_lock slab_unlock Not observing the unlock This in turn happened because slab_unlock() doesn't serialize properly (doesn't use atomic clear) with a concurrent running slab_lock()->test_and_set_bit() Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrew Morton Cc: Noam Camus Cc: Cc: Cc: Cc: Signed-off-by: Vineet Gupta --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c index d8fbd4a6ed59..b7d345a508dc 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) static __always_inline void slab_unlock(struct page *page) { VM_BUG_ON_PAGE(PageTail(page), page); - __bit_spin_unlock(PG_locked, &page->flags); + bit_spin_unlock(PG_locked, &page->flags); } static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) -- 2.5.0 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932238AbcCHPBN (ORCPT ); Tue, 8 Mar 2016 10:01:13 -0500 Received: from resqmta-ch2-09v.sys.comcast.net ([69.252.207.41]:47794 "EHLO resqmta-ch2-09v.sys.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754138AbcCHPBA (ORCPT ); Tue, 8 Mar 2016 10:01:00 -0500 Date: Tue, 8 Mar 2016 09:00:57 -0600 (CST) From: Christoph Lameter X-X-Sender: cl@east.gentwo.org To: Vineet Gupta cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Message-ID: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() This is intentional because of the increased latency of atomic instructions. Why would the unlock need to be atomic? This patch will cause regressions. Guess this is an architecture specific issue of modified cachelines not becoming visible to other processors? From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932435AbcCHPcm (ORCPT ); Tue, 8 Mar 2016 10:32:42 -0500 Received: from mx2.suse.de ([195.135.220.15]:36751 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932320AbcCHPce (ORCPT ); Tue, 8 Mar 2016 10:32:34 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Vineet Gupta , linux-mm@kvack.org References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org From: Vlastimil Babka Message-ID: <56DEF08D.607@suse.cz> Date: Tue, 8 Mar 2016 16:32:29 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/08/2016 03:30 PM, Vineet Gupta wrote: > We observed livelocks on ARC SMP setup when running hackbench with SLUB. > This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus > kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops > suh as test_and_set_bit() Sounds like this architecture should then redefine __clear_bit_unlock and perhaps other non-atomic __X_bit() variants to be atomic, and not defer this requirement to places that use the API? > The spinlock itself is implemented using Atomic [EX]change instruction > which is always available. > > The race happened when both cores tried to slab_lock() the same page. > > c1 c0 > ----------- ----------- > slab_lock > slab_lock > slab_unlock > Not observing the unlock > > This in turn happened because slab_unlock() doesn't serialize properly > (doesn't use atomic clear) with a concurrent running > slab_lock()->test_and_set_bit() > > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Andrew Morton > Cc: Noam Camus > Cc: > Cc: > Cc: > Cc: > Signed-off-by: Vineet Gupta > --- > mm/slub.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/slub.c b/mm/slub.c > index d8fbd4a6ed59..b7d345a508dc 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) > static __always_inline void slab_unlock(struct page *page) > { > VM_BUG_ON_PAGE(PageTail(page), page); > - __bit_spin_unlock(PG_locked, &page->flags); > + bit_spin_unlock(PG_locked, &page->flags); > } > > static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753247AbcCHPqp (ORCPT ); Tue, 8 Mar 2016 10:46:45 -0500 Received: from mail-pa0-f67.google.com ([209.85.220.67]:33425 "EHLO mail-pa0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932259AbcCHPqf (ORCPT ); Tue, 8 Mar 2016 10:46:35 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Christoph Lameter References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> Cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org From: Vineet Gupta Organization: Synopsys Message-ID: <56DEF3D3.6080008@synopsys.com> Date: Tue, 8 Mar 2016 21:16:27 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tuesday 08 March 2016 08:30 PM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> This in turn happened because slab_unlock() doesn't serialize properly >> (doesn't use atomic clear) with a concurrent running >> slab_lock()->test_and_set_bit() > > This is intentional because of the increased latency of atomic > instructions. Why would the unlock need to be atomic? This patch will > cause regressions. > > Guess this is an architecture specific issue of modified > cachelines not becoming visible to other processors? Absolutely not - we verified with the hardware coherency tracing that there was no foul play there. And I would dare not point finger at code which was last updated in 2011 w/o being absolutely sure. Let me explain this in bit more detail. Like I mentioned in commitlog, this config of ARC doesn't have exclusive load/stores (LLOCK/SCOND) so atomic ops are implemented using a "central" spin lock. The spin lock itself is implemented using EX instruction (atomic R-W) Generated code for slab_lock() - essentially bit_spin_lock() is below (I've removed generated code for CONFIG_PREEMPT for simplicity) 80543b0c : 80543b0c: push_s blink ... 80543b3a: mov_s r15,0x809de168 <-- @smp_bitops_lock 80543b40: mov_s r17,1 80543b46: mov_s r16,0 # spin lock() inside test_and_set_bit() - see arc bitops.h (!LLSC code) 80543b78: clri r4 80543b7c: dmb 3 80543b80: mov_s r2,r17 80543b82: ex r2,[r15] 80543b86: breq r2,1,80543b82 80543b8a: dmb 3 # set the bit 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) # spin_unlock 80543b96: dmb 3 80543b9a: mov_s r3,r16 80543b9c: ex r3,[r15] 80543ba0: dmb 3 80543ba4: seti r4 # check the old bit 80543ba8: bbit0 r2,0,80543bb8 <--- bit was set, branch not taken 80543bac: b_s 80543b68 <--- enter the test_bit() loop 80543b68: ld_s r2,[r13,0] <-- (C) reads the bit, set by SELF 80543b6a: bbit1 r2,0,80543b68 spins infinitely ... Now using hardware coherency tracing (and using the cycle timestamps) we verified (A) and (B) Thing is with exclusive load/store this race can't just happen since the intervening ST will cause the ST in (C) to NOT commit and the LD/ST will be retried. And there will be very few production systems which are SMP but lack exclusive load/stores. Are you convinced now ! -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751678AbcCHUlE (ORCPT ); Tue, 8 Mar 2016 15:41:04 -0500 Received: from resqmta-ch2-05v.sys.comcast.net ([69.252.207.37]:45150 "EHLO resqmta-ch2-05v.sys.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751582AbcCHUkz (ORCPT ); Tue, 8 Mar 2016 15:40:55 -0500 Date: Tue, 8 Mar 2016 14:40:52 -0600 (CST) From: Christoph Lameter X-X-Sender: cl@east.gentwo.org To: Vineet Gupta cc: linux-mm@kvack.org, Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic In-Reply-To: <56DEF3D3.6080008@synopsys.com> Message-ID: References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 8 Mar 2016, Vineet Gupta wrote: > # set the bit > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Duh. Guess you need to take the spinlock also in the arch specific implementation of __bit_spin_unlock(). This is certainly not the only case in which we use the __ op to unlock. You need a true atomic op or you need to take the "spinlock" in all cases where you modify the bit. If you take the lock in __bit_spin_unlock then the race cannot happen. > Are you convinced now ! Yes, please fix your arch specific code. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752475AbcCIGnr (ORCPT ); Wed, 9 Mar 2016 01:43:47 -0500 Received: from smtprelay.synopsys.com ([198.182.60.111]:43086 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751063AbcCIGnk (ORCPT ); Wed, 9 Mar 2016 01:43:40 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Christoph Lameter References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> CC: , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , , , , , "Peter Zijlstra" , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" From: Vineet Gupta Message-ID: <56DFC604.6070407@synopsys.com> Date: Wed, 9 Mar 2016 12:13:16 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178 So there also we have the same race because the outer spin lock is not taken for slab_unlock() -> __bit_spin_lock() -> __clear_bit. Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make line writable and that would also elide this problem, but I think there is a fundamental problem here in slub which is pairing atomic and non atomic ops - for performance reasons. It doesn't work on all arches and/or configurations. > You need a true atomic op or you need to take the "spinlock" in all > cases where you modify the bit. No we don't in __bit_spin_lock and we already do in bit_spin_lock. > If you take the lock in __bit_spin_unlock > then the race cannot happen. Of course it won't but that means we penalize all non atomic callers of the API with a superfluous spinlock which is not require din first place given the definition of API. >> Are you convinced now ! > > Yes, please fix your arch specific code. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753362AbcCILBH (ORCPT ); Wed, 9 Mar 2016 06:01:07 -0500 Received: from smtprelay.synopsys.com ([198.182.47.9]:44037 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753133AbcCILBA (ORCPT ); Wed, 9 Mar 2016 06:01:00 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim Newsgroups: gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch,gmane.linux.kernel,gmane.linux.kernel.stable,gmane.linux.kernel.mm From: Vineet Gupta Message-ID: <56E0024F.4070401@synopsys.com> Date: Wed, 9 Mar 2016 16:30:31 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932254AbcCILNV (ORCPT ); Wed, 9 Mar 2016 06:13:21 -0500 Received: from smtprelay.synopsys.com ([198.182.47.9]:44230 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753156AbcCILNI (ORCPT ); Wed, 9 Mar 2016 06:13:08 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> CC: Christoph Lameter , , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , "Noam Camus" , , , , , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" Newsgroups: gmane.linux.kernel.stable,gmane.linux.kernel.mm,gmane.linux.kernel,gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch From: Vineet Gupta Message-ID: <56E0052B.3080304@synopsys.com> Date: Wed, 9 Mar 2016 16:42:43 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. >> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932287AbcCILyA (ORCPT ); Wed, 9 Mar 2016 06:54:00 -0500 Received: from smtprelay.synopsys.com ([198.182.60.111]:48017 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752678AbcCILxv (ORCPT ); Wed, 9 Mar 2016 06:53:51 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , , Christoph Lameter Newsgroups: gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch,gmane.linux.kernel,gmane.linux.kernel.stable,gmane.linux.kernel.mm From: Vineet Gupta Message-ID: <56E00EB6.4000201@synopsys.com> Date: Wed, 9 Mar 2016 17:23:26 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all >>> + * the bits in the word are protected by this lock some archs can use weaker >>> + * ops to safely unlock. >>> + * >>> + * See for example x86's implementation. >>> */ >> >> To be able to override/use-generic don't we need #ifndef .... > > I did not follow through the maze, I think the few archs implementing > this simply do not include this file at all. > > I'll let the first person that cares about this worry about that :-) Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of cache coherency transactions ! > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) LGTM. Thx a bunch Peter ! -Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932697AbcCINXs (ORCPT ); Wed, 9 Mar 2016 08:23:48 -0500 Received: from smtprelay.synopsys.com ([198.182.47.9]:46072 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932654AbcCINX0 (ORCPT ); Wed, 9 Mar 2016 08:23:26 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim Newsgroups: gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch,gmane.linux.kernel,gmane.linux.kernel.stable,gmane.linux.kernel.mm From: Vineet Gupta Message-ID: <56E023A5.2000105@synopsys.com> Date: Wed, 9 Mar 2016 18:52:45 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753292AbcCJFvs (ORCPT ); Thu, 10 Mar 2016 00:51:48 -0500 Received: from us01smtprelay-2.synopsys.com ([198.182.47.9]:35939 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751571AbcCJFvr (ORCPT ); Thu, 10 Mar 2016 00:51:47 -0500 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim Newsgroups: gmane.linux.kernel,gmane.linux.kernel.cross-arch,gmane.linux.kernel.stable,gmane.linux.kernel.mm,gmane.linux.kernel.arc From: Vineet Gupta Message-ID: <56E10B59.1060700@synopsys.com> Date: Thu, 10 Mar 2016 11:21:21 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The sort answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something ! > > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. Its just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Specially after this patch from you which tightens __bit_spin_lock() even more for the general case. Thing is if the API exists majority of people would would use the more conservative version w/o understand all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again, A picture is worth thousand words ! Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964863AbcCNIGL (ORCPT ); Mon, 14 Mar 2016 04:06:11 -0400 Received: from smtprelay.synopsys.com ([198.182.60.111]:55884 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933865AbcCNIFq (ORCPT ); Mon, 14 Mar 2016 04:05:46 -0400 Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , , Christoph Lameter Newsgroups: gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch,gmane.linux.kernel,gmane.linux.kernel.stable,gmane.linux.kernel.mm From: Vineet Gupta Message-ID: <56E670C0.7080901@synopsys.com> Date: Mon, 14 Mar 2016 13:35:20 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.12.197.157] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754961AbcCULRy (ORCPT ); Mon, 21 Mar 2016 07:17:54 -0400 Received: from terminus.zytor.com ([198.137.202.10]:49582 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754760AbcCULRl (ORCPT ); Mon, 21 Mar 2016 07:17:41 -0400 Date: Mon, 21 Mar 2016 04:16:33 -0700 From: tip-bot for Peter Zijlstra Message-ID: Cc: rientjes@google.com, jejb@parisc-linux.org, mingo@kernel.org, deller@gmx.de, cl@linux.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Vineet.Gupta1@synopsys.com, peterz@infradead.org, iamjoonsoo.kim@lge.com, hpa@zytor.com, tglx@linutronix.de, penberg@kernel.org, torvalds@linux-foundation.org, paulmck@linux.vnet.ibm.com, noamc@ezchip.com Reply-To: penberg@kernel.org, torvalds@linux-foundation.org, paulmck@linux.vnet.ibm.com, noamc@ezchip.com, mingo@kernel.org, rientjes@google.com, jejb@parisc-linux.org, deller@gmx.de, linux-kernel@vger.kernel.org, cl@linux.com, akpm@linux-foundation.org, tglx@linutronix.de, Vineet.Gupta1@synopsys.com, peterz@infradead.org, hpa@zytor.com, iamjoonsoo.kim@lge.com In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> References: <20160309114054.GJ6356@twins.programming.kicks-ass.net> To: linux-tip-commits@vger.kernel.org Subject: [tip:locking/urgent] bitops: Do not default to __clear_bit() for __clear_bit_unlock() Git-Commit-ID: f75d48644c56a31731d17fa693c8175328957e1d X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: f75d48644c56a31731d17fa693c8175328957e1d Gitweb: http://git.kernel.org/tip/f75d48644c56a31731d17fa693c8175328957e1d Author: Peter Zijlstra AuthorDate: Wed, 9 Mar 2016 12:40:54 +0100 Committer: Ingo Molnar CommitDate: Mon, 21 Mar 2016 10:50:48 +0100 bitops: Do not default to __clear_bit() for __clear_bit_unlock() __clear_bit_unlock() is a special little snowflake. While it carries the non-atomic '__' prefix, it is specifically documented to pair with test_and_set_bit() and therefore should be 'somewhat' atomic. Therefore the generic implementation of __clear_bit_unlock() cannot use the fully non-atomic __clear_bit() as a default. If an arch is able to do better; is must provide an implementation of __clear_bit_unlock() itself. Specifically, this came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 (and probably more). Reported-by: Vineet Gupta Tested-by: Vineet Gupta Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Christoph Lameter Cc: David Rientjes Cc: Helge Deller Cc: James E.J. Bottomley Cc: Joonsoo Kim Cc: Linus Torvalds Cc: Noam Camus Cc: Paul E. McKenney Cc: Pekka Enberg Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20160309114054.GJ6356@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar --- include/asm-generic/bitops/lock.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h index c30266e..8ef0ccb 100644 --- a/include/asm-generic/bitops/lock.h +++ b/include/asm-generic/bitops/lock.h @@ -29,16 +29,16 @@ do { \ * @nr: the bit to set * @addr: the address to start counting from * - * This operation is like clear_bit_unlock, however it is not atomic. - * It does provide release barrier semantics so it can be used to unlock - * a bit lock, however it would only be used if no other CPU can modify - * any bits in the memory until the lock is released (a good example is - * if the bit lock itself protects access to the other bits in the word). + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all + * the bits in the word are protected by this lock some archs can use weaker + * ops to safely unlock. + * + * See for example x86's implementation. */ #define __clear_bit_unlock(nr, addr) \ do { \ - smp_mb(); \ - __clear_bit(nr, addr); \ + smp_mb__before_atomic(); \ + clear_bit(nr, addr); \ } while (0) #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , , Christoph Lameter From: Vineet Gupta Message-ID: <56E670C0.7080901@synopsys.com> Date: Mon, 14 Mar 2016 13:35:20 +0530 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) Peter, I don't see this in linux-next yet. I'm hoping you will send it Linus' way for 4.6-rc1. Thx, -Vineet -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E023A5.2000105@synopsys.com> <20160309145119.GN6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim From: Vineet Gupta Message-ID: <56E10B59.1060700@synopsys.com> Date: Thu, 10 Mar 2016 11:21:21 +0530 MIME-Version: 1.0 In-Reply-To: <20160309145119.GN6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 08:21 PM, Peter Zijlstra wrote: >> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so >> (ignoring the performance thing for discussion sake, which is a side effect of >> this implementation). > > The sort answer is: Per definition. They are defined to work together, > which is what makes __clear_bit_unlock() such a special function. > >> So despite the comment below in bit_spinlock.h I don't quite comprehend how this >> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed >> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing >> atomic lock with non-atomic unlock ? I'm missing something ! > > x86 (and others) do in fact use non-atomic instructions for > spin_unlock(). But as this is all arch specific, we can make these > assumptions. Its just that generic code cannot rely on it. OK despite being obvious now, I was not seeing the similarity between spin_*lock() and bit_spin*lock() :-( ARC also uses standard ST for spin_unlock() so by analogy __bit_spin_unlock() (for LLSC case) would be correctly paired with bit_spin_lock(). But then why would anyone need bit_spin_unlock() at all. Specially after this patch from you which tightens __bit_spin_lock() even more for the general case. Thing is if the API exists majority of people would would use the more conservative version w/o understand all these nuances. Can we pursue the path of moving bit_spin_unlock() over to __bit_spin_lock(): first changing the backend only and if proven stable replacing the call-sites themselves. > > So let me try and explain. > > > The problem as identified is: > > CPU0 CPU1 > > bit_spin_lock() __bit_spin_unlock() > 1: > /* fetch_or, r1 holds the old value */ > spin_lock > load r1, addr > load r1, addr > bclr r2, r1, 1 > store r2, addr > or r2, r1, 1 > store r2, addr /* lost the store from CPU1 */ > spin_unlock > > and r1, 1 > bnz 2 /* it was set, go wait */ > ret > > 2: > load r1, addr > and r1, 1 > bnz 2 /* wait until its not set */ > > b 1 /* try again */ > > > > For LL/SC we replace: > > spin_lock > load r1, addr > > ... > > store r2, addr > spin_unlock > > With the (obvious): > > 1: > load-locked r1, addr > > ... > > store-cond r2, addr > bnz 1 /* or whatever branch instruction is required to retry */ > > > In this case the failure cannot happen, because the store from CPU1 > would have invalidated the lock from CPU0 and caused the > store-cond to fail and retry the loop, observing the new value. You did it again, A picture is worth thousand words ! Thx, -Vineet -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim From: Vineet Gupta Message-ID: <56E023A5.2000105@synopsys.com> Date: Wed, 9 Mar 2016 18:52:45 +0530 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >> There is clearly a problem in slub code that it is pairing a test_and_set_bit() >> with a __clear_bit(). Latter can obviously clobber former if they are not a single >> instruction each unlike x86 or they use llock/scond kind of instructions where the >> interim store from other core is detected and causes a retry of whole llock/scond >> sequence. > > Yes, test_and_set_bit() + __clear_bit() is broken. But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so (ignoring the performance thing for discussion sake, which is a side effect of this implementation). So despite the comment below in bit_spinlock.h I don't quite comprehend how this is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed cases, then isn't this true in general for lot more cases in kernel, i.e. pairing atomic lock with non-atomic unlock ? I'm missing something ! | /* | * bit-based spin_unlock() | * non-atomic version, which can be used eg. if the bit lock itself is | * protecting the rest of the flags in the word. | */ | static inline void __bit_spin_unlock(int bitnum, unsigned long *addr) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <56E0024F.4070401@synopsys.com> <20160309114054.GJ6356@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Joonsoo Kim , Andrew Morton , , Christoph Lameter From: Vineet Gupta Message-ID: <56E00EB6.4000201@synopsys.com> Date: Wed, 9 Mar 2016 17:23:26 +0530 MIME-Version: 1.0 In-Reply-To: <20160309114054.GJ6356@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 05:10 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 04:30:31PM +0530, Vineet Gupta wrote: >> FWIW, could we add some background to commit log, specifically what prompted this. >> Something like below... > > Sure.. find below. > >>> +++ b/include/asm-generic/bitops/lock.h >>> @@ -29,16 +29,16 @@ do { \ >>> * @nr: the bit to set >>> * @addr: the address to start counting from >>> * >>> + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all >>> + * the bits in the word are protected by this lock some archs can use weaker >>> + * ops to safely unlock. >>> + * >>> + * See for example x86's implementation. >>> */ >> >> To be able to override/use-generic don't we need #ifndef .... > > I did not follow through the maze, I think the few archs implementing > this simply do not include this file at all. > > I'll let the first person that cares about this worry about that :-) Ok - that's be me :-) although I really don't see much gains in case of ARC LLSC. For us, LD + BCLR + ST is very similar to LLOCK + BCLR + SCOND atleast in terms of cache coherency transactions ! > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Specifically, this came up as a result of hackbench livelock'ing in > slab_lock() on ARC with SMP + SLUB + !LLSC. > > The issue was incorrect pairing of atomic ops. > > slab_lock() -> bit_spin_lock() -> test_and_set_bit() > slab_unlock() -> __bit_spin_unlock() -> __clear_bit() > > The non serializing __clear_bit() was getting "lost" > > 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set > 80543b90: or r3,r2,1 <--- (B) other core unlocks right here > 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Fixes ARC STAR 9000817404 (and probably more). > > Cc: stable@vger.kernel.org > Reported-by: Vineet Gupta > Tested-by: Vineet Gupta > Signed-off-by: Peter Zijlstra (Intel) LGTM. Thx a bunch Peter ! -Vineet > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> <20160309103143.GF25010@twins.programming.kicks-ass.net> CC: Christoph Lameter , , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , "Noam Camus" , , , , , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" From: Vineet Gupta Message-ID: <56E0052B.3080304@synopsys.com> Date: Wed, 9 Mar 2016 16:42:43 +0530 MIME-Version: 1.0 In-Reply-To: <20160309103143.GF25010@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 04:01 PM, Peter Zijlstra wrote: > On Wed, Mar 09, 2016 at 11:13:49AM +0100, Peter Zijlstra wrote: >> --- >> Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() >> >> __clear_bit_unlock() is a special little snowflake. While it carries the >> non-atomic '__' prefix, it is specifically documented to pair with >> test_and_set_bit() and therefore should be 'somewhat' atomic. >> >> Therefore the generic implementation of __clear_bit_unlock() cannot use >> the fully non-atomic __clear_bit() as a default. >> >> If an arch is able to do better; is must provide an implementation of >> __clear_bit_unlock() itself. >> > > FWIW, we could probably undo this if we unified all the spinlock based > atomic ops implementations (there's a whole bunch doing essentially the > same), and special cased __clear_bit_unlock() for that. > > Collapsing them is probably a good idea anyway, just a fair bit of > non-trivial work to figure out all the differences and if they matter > etc.. Indeed I thought about this when we first did the SMP port. The only issue was somewhat more generated code with the hashed spinlocks (vs. my dumb 2 spin locks - which as I see now will also cause false sharing - likely ending up in the same cache line), but I was more of a micro-optimization freak then than I'm now :-) So yeah I agree ! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Peter Zijlstra References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> <56DFC604.6070407@synopsys.com> <20160309101349.GJ6344@twins.programming.kicks-ass.net> CC: "linux-arch@vger.kernel.org" , , Andrew Morton , Helge Deller , , , "James E.J. Bottomley" , Pekka Enberg , , Noam Camus , David Rientjes , Christoph Lameter , , Joonsoo Kim From: Vineet Gupta Message-ID: <56E0024F.4070401@synopsys.com> Date: Wed, 9 Mar 2016 16:30:31 +0530 MIME-Version: 1.0 In-Reply-To: <20160309101349.GJ6344@twins.programming.kicks-ass.net> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote: >>> If you take the lock in __bit_spin_unlock >>> then the race cannot happen. >> >> Of course it won't but that means we penalize all non atomic callers of the API >> with a superfluous spinlock which is not require din first place given the >> definition of API. > > Quite. _However_, your arch is still broken, but not by your fault. Its > the generic-asm code that is wrong. > > The thing is that __bit_spin_unlock() uses __clear_bit_unlock(), which > defaults to __clear_bit(). Which is wrong. > > --- > Subject: bitops: Do not default to __clear_bit() for __clear_bit_unlock() > > __clear_bit_unlock() is a special little snowflake. While it carries the > non-atomic '__' prefix, it is specifically documented to pair with > test_and_set_bit() and therefore should be 'somewhat' atomic. > > Therefore the generic implementation of __clear_bit_unlock() cannot use > the fully non-atomic __clear_bit() as a default. > > If an arch is able to do better; is must provide an implementation of > __clear_bit_unlock() itself. > > Reported-by: Vineet Gupta This needs to be CCed stable as it fixes a real bug for ARC. > Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vineet Gupta FWIW, could we add some background to commit log, specifically what prompted this. Something like below... ---->8------ This came up as a result of hackbench livelock'ing in slab_lock() on ARC with SMP + SLUB + !LLSC. The issue was incorrect pairing of atomic ops. slab_lock() -> bit_spin_lock() -> test_and_set_bit() slab_unlock() -> __bit_spin_unlock() -> __clear_bit() The non serializing __clear_bit() was getting "lost" 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set 80543b90: or r3,r2,1 <--- (B) other core unlocks right here 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) Fixes ARC STAR 9000817404 ---->8------ > --- > include/asm-generic/bitops/lock.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h > index c30266e94806..8ef0ccbf8167 100644 > --- a/include/asm-generic/bitops/lock.h > +++ b/include/asm-generic/bitops/lock.h > @@ -29,16 +29,16 @@ do { \ > * @nr: the bit to set > * @addr: the address to start counting from > * > - * This operation is like clear_bit_unlock, however it is not atomic. > - * It does provide release barrier semantics so it can be used to unlock > - * a bit lock, however it would only be used if no other CPU can modify > - * any bits in the memory until the lock is released (a good example is > - * if the bit lock itself protects access to the other bits in the word). > + * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all > + * the bits in the word are protected by this lock some archs can use weaker > + * ops to safely unlock. > + * > + * See for example x86's implementation. > */ To be able to override/use-generic don't we need #ifndef .... > #define __clear_bit_unlock(nr, addr) \ > do { \ > - smp_mb(); \ > - __clear_bit(nr, addr); \ > + smp_mb__before_atomic(); \ > + clear_bit(nr, addr); \ > } while (0) > > #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic To: Christoph Lameter References: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> <56DEF3D3.6080008@synopsys.com> CC: , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , , , , , "Peter Zijlstra" , "James E.J. Bottomley" , Helge Deller , "linux-arch@vger.kernel.org" From: Vineet Gupta Message-ID: <56DFC604.6070407@synopsys.com> Date: Wed, 9 Mar 2016 12:13:16 +0530 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: +CC linux-arch, parisc folks, PeterZ On Wednesday 09 March 2016 02:10 AM, Christoph Lameter wrote: > On Tue, 8 Mar 2016, Vineet Gupta wrote: > >> # set the bit >> 80543b8e: ld_s r2,[r13,0] <--- (A) Finds PG_locked is set >> 80543b90: or r3,r2,1 <--- (B) other core unlocks right here >> 80543b94: st_s r3,[r13,0] <--- (C) sets PG_locked (overwrites unlock) > > Duh. Guess you need to take the spinlock also in the arch specific > implementation of __bit_spin_unlock(). This is certainly not the only case > in which we use the __ op to unlock. __bit_spin_lock() by definition is *not* required to be atomic, bit_spin_lock() is - so I don't think we need a spinlock there. There is clearly a problem in slub code that it is pairing a test_and_set_bit() with a __clear_bit(). Latter can obviously clobber former if they are not a single instruction each unlike x86 or they use llock/scond kind of instructions where the interim store from other core is detected and causes a retry of whole llock/scond sequence. BTW ARC is not the only arch which suffers from this - other arches potentially also are. AFAIK PARISC also doesn't have atomic r-m-w and also uses a set of external hashed spinlocks to protect the r-m-w sequences. https://lkml.org/lkml/2014/6/1/178 So there also we have the same race because the outer spin lock is not taken for slab_unlock() -> __bit_spin_lock() -> __clear_bit. Arguably I can fix the ARC !LLSC variant of test_and_set_bit() to not set the bit unconditionally but only if it was clear (PARISC does the same). That would be a slight micro-optimization as we won't need another snoop transaction to make line writable and that would also elide this problem, but I think there is a fundamental problem here in slub which is pairing atomic and non atomic ops - for performance reasons. It doesn't work on all arches and/or configurations. > You need a true atomic op or you need to take the "spinlock" in all > cases where you modify the bit. No we don't in __bit_spin_lock and we already do in bit_spin_lock. > If you take the lock in __bit_spin_unlock > then the race cannot happen. Of course it won't but that means we penalize all non atomic callers of the API with a superfluous spinlock which is not require din first place given the definition of API. >> Are you convinced now ! > > Yes, please fix your arch specific code. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Vineet Gupta To: CC: Vineet Gupta , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Noam Camus , , , Subject: [PATCH] mm: slub: Ensure that slab_unlock() is atomic Date: Tue, 8 Mar 2016 20:00:57 +0530 Message-ID: <1457447457-25878-1-git-send-email-vgupta@synopsys.com> MIME-Version: 1.0 Content-Type: text/plain Sender: owner-linux-mm@kvack.org List-ID: We observed livelocks on ARC SMP setup when running hackbench with SLUB. This hardware configuration lacks atomic instructions (LLOCK/SCOND) thus kernel resorts to a central @smp_bitops_lock to protect any R-M-W ops suh as test_and_set_bit() The spinlock itself is implemented using Atomic [EX]change instruction which is always available. The race happened when both cores tried to slab_lock() the same page. c1 c0 ----------- ----------- slab_lock slab_lock slab_unlock Not observing the unlock This in turn happened because slab_unlock() doesn't serialize properly (doesn't use atomic clear) with a concurrent running slab_lock()->test_and_set_bit() Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: Andrew Morton Cc: Noam Camus Cc: Cc: Cc: Cc: Signed-off-by: Vineet Gupta --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c index d8fbd4a6ed59..b7d345a508dc 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -345,7 +345,7 @@ static __always_inline void slab_lock(struct page *page) static __always_inline void slab_unlock(struct page *page) { VM_BUG_ON_PAGE(PageTail(page), page); - __bit_spin_unlock(PG_locked, &page->flags); + bit_spin_unlock(PG_locked, &page->flags); } static inline void set_page_slub_counters(struct page *page, unsigned long counters_new) -- 2.5.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org