LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-03 22:43 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra,
	Frederic Weisbecker, Oleg Nesterov, LKML, Linux PPC dev,
	Anton Blanchard, Victor Kaplansky, Linus Torvalds
In-Reply-To: <1383512363.4776.22.camel@pasglop>

On Mon, Nov 04, 2013 at 07:59:23AM +1100, Benjamin Herrenschmidt wrote:
> On Sun, 2013-11-03 at 16:17 +0100, Peter Zijlstra wrote:
> > On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> > > If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
> > 
> > Well, I'm obviously all for introducing this new barrier, for it will
> > reduce a full mfence on x86 to a compiler barrier. And ppc can use
> > lwsync as opposed to sync afaict. Not sure ARM can do better.
> 
> The patch at the *very least* needs a good description of the semantics
> of the barrier, what does it order vs. what etc...

Agreed.  Also it needs a name that people can live with.  We will get
there.  ;-)

							Thanx, Paul

> Cheers,
> Ben.
> 
> > ---
> > Subject: arch: Introduce new TSO memory barrier smp_tmb()
> > 
> > A few sites could be downgraded from smp_mb() to smp_tmb() and a few
> > site should be upgraded to smp_tmb() that are now using smp_wmb().
> > 
> > XXX hope PaulMck explains things better..
> > 
> > X86 (!OOSTORE), SPARC have native TSO memory models and smp_tmb()
> > reduces to barrier().
> > 
> > PPC can use lwsync instead of sync
> > 
> > For the other archs, have smp_tmb map to smp_mb, as the stronger barrier
> > is always correct but possibly suboptimal.
> > 
> > Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
> > Not-Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> > ---
> >  arch/alpha/include/asm/barrier.h      | 2 ++
> >  arch/arc/include/asm/barrier.h        | 2 ++
> >  arch/arm/include/asm/barrier.h        | 2 ++
> >  arch/arm64/include/asm/barrier.h      | 2 ++
> >  arch/avr32/include/asm/barrier.h      | 1 +
> >  arch/blackfin/include/asm/barrier.h   | 1 +
> >  arch/cris/include/asm/barrier.h       | 2 ++
> >  arch/frv/include/asm/barrier.h        | 1 +
> >  arch/h8300/include/asm/barrier.h      | 2 ++
> >  arch/hexagon/include/asm/barrier.h    | 1 +
> >  arch/ia64/include/asm/barrier.h       | 2 ++
> >  arch/m32r/include/asm/barrier.h       | 2 ++
> >  arch/m68k/include/asm/barrier.h       | 1 +
> >  arch/metag/include/asm/barrier.h      | 3 +++
> >  arch/microblaze/include/asm/barrier.h | 1 +
> >  arch/mips/include/asm/barrier.h       | 3 +++
> >  arch/mn10300/include/asm/barrier.h    | 2 ++
> >  arch/parisc/include/asm/barrier.h     | 1 +
> >  arch/powerpc/include/asm/barrier.h    | 2 ++
> >  arch/s390/include/asm/barrier.h       | 1 +
> >  arch/score/include/asm/barrier.h      | 1 +
> >  arch/sh/include/asm/barrier.h         | 2 ++
> >  arch/sparc/include/asm/barrier_32.h   | 1 +
> >  arch/sparc/include/asm/barrier_64.h   | 3 +++
> >  arch/tile/include/asm/barrier.h       | 2 ++
> >  arch/unicore32/include/asm/barrier.h  | 1 +
> >  arch/x86/include/asm/barrier.h        | 3 +++
> >  arch/xtensa/include/asm/barrier.h     | 1 +
> >  28 files changed, 48 insertions(+)
> > 
> > diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> > index ce8860a0b32d..02ea63897038 100644
> > --- a/arch/alpha/include/asm/barrier.h
> > +++ b/arch/alpha/include/asm/barrier.h
> > @@ -18,12 +18,14 @@ __asm__ __volatile__("mb": : :"memory")
> >  #ifdef CONFIG_SMP
> >  #define __ASM_SMP_MB	"\tmb\n"
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  #define __ASM_SMP_MB
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while (0)
> > diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
> > index f6cb7c4ffb35..456c790fa1ad 100644
> > --- a/arch/arc/include/asm/barrier.h
> > +++ b/arch/arc/include/asm/barrier.h
> > @@ -22,10 +22,12 @@
> >  /* TODO-vineetg verify the correctness of macros here */
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()        mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()       rmb()
> >  #define smp_wmb()       wmb()
> >  #else
> >  #define smp_mb()        barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()       barrier()
> >  #define smp_wmb()       barrier()
> >  #endif
> > diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> > index 60f15e274e6d..bc88a8505673 100644
> > --- a/arch/arm/include/asm/barrier.h
> > +++ b/arch/arm/include/asm/barrier.h
> > @@ -51,10 +51,12 @@
> >  
> >  #ifndef CONFIG_SMP
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #else
> >  #define smp_mb()	dmb(ish)
> > +#define smp_tmb()	smp_mb()
> >  #define smp_rmb()	smp_mb()
> >  #define smp_wmb()	dmb(ishst)
> >  #endif
> > diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> > index d4a63338a53c..ec0531f4892f 100644
> > --- a/arch/arm64/include/asm/barrier.h
> > +++ b/arch/arm64/include/asm/barrier.h
> > @@ -33,10 +33,12 @@
> >  
> >  #ifndef CONFIG_SMP
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #else
> >  #define smp_mb()	asm volatile("dmb ish" : : : "memory")
> > +#define smp_tmb()	asm volatile("dmb ish" : : : "memory")
> >  #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
> >  #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
> >  #endif
> > diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
> > index 0961275373db..6c6ccb9cf290 100644
> > --- a/arch/avr32/include/asm/barrier.h
> > +++ b/arch/avr32/include/asm/barrier.h
> > @@ -20,6 +20,7 @@
> >  # error "The AVR32 port does not support SMP"
> >  #else
> >  # define smp_mb()		barrier()
> > +# define smp_tmb()		barrier()
> >  # define smp_rmb()		barrier()
> >  # define smp_wmb()		barrier()
> >  # define smp_read_barrier_depends() do { } while(0)
> > diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
> > index ebb189507dd7..100f49121a18 100644
> > --- a/arch/blackfin/include/asm/barrier.h
> > +++ b/arch/blackfin/include/asm/barrier.h
> > @@ -40,6 +40,7 @@
> >  #endif /* !CONFIG_SMP */
> >  
> >  #define smp_mb()  mb()
> > +#define smp_tmb() mb()
> >  #define smp_rmb() rmb()
> >  #define smp_wmb() wmb()
> >  #define set_mb(var, value) do { var = value; mb(); } while (0)
> > diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
> > index 198ad7fa6b25..679c33738b4c 100644
> > --- a/arch/cris/include/asm/barrier.h
> > +++ b/arch/cris/include/asm/barrier.h
> > @@ -12,11 +12,13 @@
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()        mb()
> > +#define smp_tmb()       mb()
> >  #define smp_rmb()       rmb()
> >  #define smp_wmb()       wmb()
> >  #define smp_read_barrier_depends()     read_barrier_depends()
> >  #else
> >  #define smp_mb()        barrier()
> > +#define smp_tmb()       barrier()
> >  #define smp_rmb()       barrier()
> >  #define smp_wmb()       barrier()
> >  #define smp_read_barrier_depends()     do { } while(0)
> > diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
> > index 06776ad9f5e9..60354ce13ba0 100644
> > --- a/arch/frv/include/asm/barrier.h
> > +++ b/arch/frv/include/asm/barrier.h
> > @@ -20,6 +20,7 @@
> >  #define read_barrier_depends()	do { } while (0)
> >  
> >  #define smp_mb()			barrier()
> > +#define smp_tmb()			barrier()
> >  #define smp_rmb()			barrier()
> >  #define smp_wmb()			barrier()
> >  #define smp_read_barrier_depends()	do {} while(0)
> > diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
> > index 9e0aa9fc195d..e8e297fa4e9a 100644
> > --- a/arch/h8300/include/asm/barrier.h
> > +++ b/arch/h8300/include/asm/barrier.h
> > @@ -16,11 +16,13 @@
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
> > index 1041a8e70ce8..2dd5b2ad4d21 100644
> > --- a/arch/hexagon/include/asm/barrier.h
> > +++ b/arch/hexagon/include/asm/barrier.h
> > @@ -28,6 +28,7 @@
> >  #define smp_rmb()			barrier()
> >  #define smp_read_barrier_depends()	barrier()
> >  #define smp_wmb()			barrier()
> > +#define smp_tmb()			barrier()
> >  #define smp_mb()			barrier()
> >  #define smp_mb__before_atomic_dec()	barrier()
> >  #define smp_mb__after_atomic_dec()	barrier()
> > diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
> > index 60576e06b6fb..a5f92146b091 100644
> > --- a/arch/ia64/include/asm/barrier.h
> > +++ b/arch/ia64/include/asm/barrier.h
> > @@ -42,11 +42,13 @@
> >  
> >  #ifdef CONFIG_SMP
> >  # define smp_mb()	mb()
> > +# define smp_tmb()	mb()
> >  # define smp_rmb()	rmb()
> >  # define smp_wmb()	wmb()
> >  # define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  # define smp_mb()	barrier()
> > +# define smp_tmb()	barrier()
> >  # define smp_rmb()	barrier()
> >  # define smp_wmb()	barrier()
> >  # define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
> > index 6976621efd3f..a6fa29facd7a 100644
> > --- a/arch/m32r/include/asm/barrier.h
> > +++ b/arch/m32r/include/asm/barrier.h
> > @@ -79,12 +79,14 @@
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while (0)
> > diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
> > index 445ce22c23cb..8ecf52c87847 100644
> > --- a/arch/m68k/include/asm/barrier.h
> > +++ b/arch/m68k/include/asm/barrier.h
> > @@ -13,6 +13,7 @@
> >  #define set_mb(var, value)	({ (var) = (value); wmb(); })
> >  
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	((void)0)
> > diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
> > index c90bfc6bf648..eb179fbce580 100644
> > --- a/arch/metag/include/asm/barrier.h
> > +++ b/arch/metag/include/asm/barrier.h
> > @@ -50,6 +50,7 @@ static inline void wmb(void)
> >  #ifndef CONFIG_SMP
> >  #define fence()		do { } while (0)
> >  #define smp_mb()        barrier()
> > +#define smp_tmb()       barrier()
> >  #define smp_rmb()       barrier()
> >  #define smp_wmb()       barrier()
> >  #else
> > @@ -70,11 +71,13 @@ static inline void fence(void)
> >  	*flushptr = 0;
> >  }
> >  #define smp_mb()        fence()
> > +#define smp_tmb()       fence()
> >  #define smp_rmb()       fence()
> >  #define smp_wmb()       barrier()
> >  #else
> >  #define fence()		do { } while (0)
> >  #define smp_mb()        barrier()
> > +#define smp_tmb()       barrier()
> >  #define smp_rmb()       barrier()
> >  #define smp_wmb()       barrier()
> >  #endif
> > diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
> > index df5be3e87044..d573c170a717 100644
> > --- a/arch/microblaze/include/asm/barrier.h
> > +++ b/arch/microblaze/include/asm/barrier.h
> > @@ -21,6 +21,7 @@
> >  #define set_wmb(var, value)	do { var = value; wmb(); } while (0)
> >  
> >  #define smp_mb()		mb()
> > +#define smp_tmb()		mb()
> >  #define smp_rmb()		rmb()
> >  #define smp_wmb()		wmb()
> >  
> > diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> > index 314ab5532019..535e699eec3b 100644
> > --- a/arch/mips/include/asm/barrier.h
> > +++ b/arch/mips/include/asm/barrier.h
> > @@ -144,15 +144,18 @@
> >  #if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
> >  # ifdef CONFIG_CPU_CAVIUM_OCTEON
> >  #  define smp_mb()	__sync()
> > +#  define smp_tmb()	__sync()
> >  #  define smp_rmb()	barrier()
> >  #  define smp_wmb()	__syncw()
> >  # else
> >  #  define smp_mb()	__asm__ __volatile__("sync" : : :"memory")
> > +#  define smp_tmb()	__asm__ __volatile__("sync" : : :"memory")
> >  #  define smp_rmb()	__asm__ __volatile__("sync" : : :"memory")
> >  #  define smp_wmb()	__asm__ __volatile__("sync" : : :"memory")
> >  # endif
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #endif
> > diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
> > index 2bd97a5c8af7..a345b0776e5f 100644
> > --- a/arch/mn10300/include/asm/barrier.h
> > +++ b/arch/mn10300/include/asm/barrier.h
> > @@ -19,11 +19,13 @@
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define set_mb(var, value)  do { xchg(&var, value); } while (0)
> >  #else  /* CONFIG_SMP */
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define set_mb(var, value)  do { var = value;  mb(); } while (0)
> > diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
> > index e77d834aa803..f53196b589ec 100644
> > --- a/arch/parisc/include/asm/barrier.h
> > +++ b/arch/parisc/include/asm/barrier.h
> > @@ -25,6 +25,7 @@
> >  #define rmb()		mb()
> >  #define wmb()		mb()
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	mb()
> >  #define smp_wmb()	mb()
> >  #define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> > index ae782254e731..d7e8a560f1fe 100644
> > --- a/arch/powerpc/include/asm/barrier.h
> > +++ b/arch/powerpc/include/asm/barrier.h
> > @@ -46,11 +46,13 @@
> >  #endif
> >  
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
> >  #define smp_rmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
> >  #define smp_wmb()	__asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> > index 16760eeb79b0..f0409a874243 100644
> > --- a/arch/s390/include/asm/barrier.h
> > +++ b/arch/s390/include/asm/barrier.h
> > @@ -24,6 +24,7 @@
> >  #define wmb()				mb()
> >  #define read_barrier_depends()		do { } while(0)
> >  #define smp_mb()			mb()
> > +#define smp_tmb()			mb()
> >  #define smp_rmb()			rmb()
> >  #define smp_wmb()			wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> > diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
> > index 0eacb6471e6d..865652083dde 100644
> > --- a/arch/score/include/asm/barrier.h
> > +++ b/arch/score/include/asm/barrier.h
> > @@ -5,6 +5,7 @@
> >  #define rmb()		barrier()
> >  #define wmb()		barrier()
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  
> > diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
> > index 72c103dae300..f8dce7926432 100644
> > --- a/arch/sh/include/asm/barrier.h
> > +++ b/arch/sh/include/asm/barrier.h
> > @@ -39,11 +39,13 @@
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
> > index c1b76654ee76..1037ce189cee 100644
> > --- a/arch/sparc/include/asm/barrier_32.h
> > +++ b/arch/sparc/include/asm/barrier_32.h
> > @@ -8,6 +8,7 @@
> >  #define read_barrier_depends()	do { } while(0)
> >  #define set_mb(__var, __value)  do { __var = __value; mb(); } while(0)
> >  #define smp_mb()	__asm__ __volatile__("":::"memory")
> > +#define smp_tmb()	__asm__ __volatile__("":::"memory")
> >  #define smp_rmb()	__asm__ __volatile__("":::"memory")
> >  #define smp_wmb()	__asm__ __volatile__("":::"memory")
> >  #define smp_read_barrier_depends()	do { } while(0)
> > diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
> > index 95d45986f908..0f3c2fdb86b8 100644
> > --- a/arch/sparc/include/asm/barrier_64.h
> > +++ b/arch/sparc/include/asm/barrier_64.h
> > @@ -34,6 +34,7 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
> >   * memory ordering than required by the specifications.
> >   */
> >  #define mb()	membar_safe("#StoreLoad")
> > +#define tmb()	__asm__ __volatile__("":::"memory")
> >  #define rmb()	__asm__ __volatile__("":::"memory")
> >  #define wmb()	__asm__ __volatile__("":::"memory")
> >  
> > @@ -43,10 +44,12 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	tmb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #else
> >  #define smp_mb()	__asm__ __volatile__("":::"memory")
> > +#define smp_tmb()	__asm__ __volatile__("":::"memory")
> >  #define smp_rmb()	__asm__ __volatile__("":::"memory")
> >  #define smp_wmb()	__asm__ __volatile__("":::"memory")
> >  #endif
> > diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
> > index a9a73da5865d..cad3c6ae28bf 100644
> > --- a/arch/tile/include/asm/barrier.h
> > +++ b/arch/tile/include/asm/barrier.h
> > @@ -127,11 +127,13 @@ mb_incoherent(void)
> >  
> >  #ifdef CONFIG_SMP
> >  #define smp_mb()	mb()
> > +#define smp_tmb()	mb()
> >  #define smp_rmb()	rmb()
> >  #define smp_wmb()	wmb()
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while (0)
> > diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
> > index a6620e5336b6..8b341fffbda6 100644
> > --- a/arch/unicore32/include/asm/barrier.h
> > +++ b/arch/unicore32/include/asm/barrier.h
> > @@ -18,6 +18,7 @@
> >  #define rmb()				barrier()
> >  #define wmb()				barrier()
> >  #define smp_mb()			barrier()
> > +#define smp_tmb()			barrier()
> >  #define smp_rmb()			barrier()
> >  #define smp_wmb()			barrier()
> >  #define read_barrier_depends()		do { } while (0)
> > diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> > index c6cd358a1eec..480201d83af1 100644
> > --- a/arch/x86/include/asm/barrier.h
> > +++ b/arch/x86/include/asm/barrier.h
> > @@ -86,14 +86,17 @@
> >  # define smp_rmb()	barrier()
> >  #endif
> >  #ifdef CONFIG_X86_OOSTORE
> > +# define smp_tmb()	mb()
> >  # define smp_wmb() 	wmb()
> >  #else
> > +# define smp_tmb()	barrier()
> >  # define smp_wmb()	barrier()
> >  #endif
> >  #define smp_read_barrier_depends()	read_barrier_depends()
> >  #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #define smp_read_barrier_depends()	do { } while (0)
> > diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
> > index ef021677d536..7839db843ea5 100644
> > --- a/arch/xtensa/include/asm/barrier.h
> > +++ b/arch/xtensa/include/asm/barrier.h
> > @@ -20,6 +20,7 @@
> >  #error smp_* not defined
> >  #else
> >  #define smp_mb()	barrier()
> > +#define smp_tmb()	barrier()
> >  #define smp_rmb()	barrier()
> >  #define smp_wmb()	barrier()
> >  #endif
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-03 22:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131103200124.GK19466@laptop.lan>

On Sun, Nov 03, 2013 at 09:01:24PM +0100, Peter Zijlstra wrote:
> On Sun, Nov 03, 2013 at 10:08:14AM -0800, Linus Torvalds wrote:
> > On Sun, Nov 3, 2013 at 7:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> > >> If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
> > >
> > > Well, I'm obviously all for introducing this new barrier, for it will
> > > reduce a full mfence on x86 to a compiler barrier. And ppc can use
> > > lwsync as opposed to sync afaict. Not sure ARM can do better.
> > >
> > > ---
> > > Subject: arch: Introduce new TSO memory barrier smp_tmb()
> > 
> > This is specialized enough that I would *really* like the name to be
> > more descriptive. Compare to the special "smp_read_barrier_depends()"
> > maco: it's unusual, and it has very specific semantics, so it gets a
> > long and descriptive name.
> > 
> > Memory ordering is subtle enough without then using names that are
> > subtle in themselves. mb/rmb/wmb are conceptually pretty simple
> > operations, and very basic when talking about memory ordering.
> > "acquire" and "release" are less simple, but have descriptive names
> > and have very specific uses in locking.
> > 
> > In contrast "smp_tmb()" is a *horrible* name, because TSO is a
> > description of the memory ordering, not of a particular barrier. It's
> > also not even clear that you can have a "tso barrier", since the
> > ordering (like acquire/release) presumably is really about one
> > particular *store*, not about some kind of barrier between different
> > operations.
> > 
> > So please describe exactly what the semantics that barrier has, and
> > then name the barrier that way.
> > 
> > I assume that in this particular case, the semantics RCU wants is
> > "write barrier, and no preceding reads can move past this point".

Its semantics order prior reads against subsequent reads, prior reads
against subsequent writes, and prior writes against subsequent writes.
It does -not- order prior writes against subsequent reads.

> > Calling that "smp_tmb()" is f*cking insane, imnsho.
> 
> Fair enough; from what I could gather the proposed semantics are
> RELEASE+WMB, such that neither reads not writes can cross over, writes
> can't cross back, but reads could.
> 
> Since both RELEASE and WMB are trivial under TSO the entire thing
> collapses.

And here are some candidate names, with no attempt to sort sanity from
insanity:

smp_storebuffer_mb() -- A barrier that enforces those orderings
	that do not invalidate the hardware store-buffer optimization.

smp_not_w_r_mb() -- A barrier that orders everything except prior
	writes against subsequent reads.

smp_acqrel_mb() -- A barrier that combines C/C++ acquire and release
	semantics.  (C/C++ "acquire" orders a specific load against
	subsequent loads and stores, while C/C++ "release" orders
	a specific store against prior loads and stores.)

Others?

> Now I'm currently completely confused as to what C/C++ wrecks vs actual
> proper memory order issues; let alone fully comprehend the case that
> started all this.

Each can result in similar wreckage.  In either case, it is about failing
to guarantee needed orderings.

							Thanx, Paul

^ permalink raw reply

* [PATCH v3 1/1] powerpc/embedded6xx: Add support for Motorola/Emerson MVME5100
From: Stephen Chivers @ 2013-11-03 21:07 UTC (permalink / raw)
  To: linuxppc-dev

Add support for the Motorola/Emerson MVME5100 Single Board Computer.

The MVME5100 is a 6U form factor VME64 computer with:

	- A single MPC7410 or MPC750 CPU
	- A HAWK Processor Host Bridge (CPU to PCI) and
	  MultiProcessor Interrupt Controller (MPIC)
	- Up to 500Mb of onboard memory
	- A M48T37 Real Time Clock (RTC) and Non-Volatile Memory chip
	- Two 16550 compatible UARTS
	- Two Intel E100 Fast Ethernets
	- Two PCI Mezzanine Card (PMC) Slots
	- PPCBug Firmware

The HAWK PHB/MPIC is compatible with the MPC10x devices.

There is no onboard disk support. This is usually provided by
installing a PMC in the first PMC slot.

This patch revives the board support, it was present in early 2.6
series kernels. The board support in those days was by Matt Porter of
MontaVista Software.

CSC Australia has around 31 of these boards in service. The kernel in use
for the boards is based on 2.6.31. The boards are operated without disks
from a file server. 

This patch is based on linux-3.12-rc7 and has been boot tested.

V1->V2:
	Address comments by Kular Gama and Scott Wood.
	Minor adjustment to platforms/embedded6xx/Kconfig to ensure
		correct indentation where possible.

V2->V3:
	Address comments by Scott Wood and Ben Herrenschmidt.
	Address errors reported by checkpatch.

Signed-off-by: Stephen Chivers <schivers@csc.com>
---
 arch/powerpc/boot/Makefile                    |    3 +-
 arch/powerpc/boot/dts/mvme5100.dts            |  185 +++++++++++++++++++++
 arch/powerpc/boot/mvme5100.c                  |   27 +++
 arch/powerpc/boot/wrapper                     |    4 +
 arch/powerpc/configs/mvme5100_defconfig       |  144 ++++++++++++++++
 arch/powerpc/platforms/embedded6xx/Kconfig    |   13 ++-
 arch/powerpc/platforms/embedded6xx/Makefile   |    1 +
 arch/powerpc/platforms/embedded6xx/mvme5100.c |  221 +++++++++++++++++++++++++
 8 files changed, 596 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 15ca225..a1c9c86 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -94,7 +94,7 @@ src-plat-$(CONFIG_FSL_SOC_BOOKE) += cuboot-85xx.c cuboot-85xx-cpm2.c
 src-plat-$(CONFIG_EMBEDDED6xx) += cuboot-pq2.c cuboot-mpc7448hpc2.c \
 					cuboot-c2k.c gamecube-head.S \
 					gamecube.c wii-head.S wii.c holly.c \
-					prpmc2800.c
+					prpmc2800.c fixed-head.S mvme5100.c
 src-plat-$(CONFIG_AMIGAONE) += cuboot-amigaone.c
 src-plat-$(CONFIG_PPC_PS3) += ps3-head.S ps3-hvcall.S ps3.c
 src-plat-$(CONFIG_EPAPR_BOOT) += epapr.c epapr-wrapper.c
@@ -285,6 +285,7 @@ image-$(CONFIG_MPC7448HPC2)		+= cuImage.mpc7448hpc2
 image-$(CONFIG_PPC_C2K)			+= cuImage.c2k
 image-$(CONFIG_GAMECUBE)		+= dtbImage.gamecube
 image-$(CONFIG_WII)			+= dtbImage.wii
+image-$(CONFIG_MVME5100)		+= dtbImage.mvme5100
 
 # Board port in arch/powerpc/platform/amigaone/Kconfig
 image-$(CONFIG_AMIGAONE)		+= cuImage.amigaone
diff --git a/arch/powerpc/boot/dts/mvme5100.dts b/arch/powerpc/boot/dts/mvme5100.dts
new file mode 100644
index 0000000..4cb2f05
--- /dev/null
+++ b/arch/powerpc/boot/dts/mvme5100.dts
@@ -0,0 +1,185 @@
+/*
+ * Device Tree Souce for Motorola/Emerson MVME5100.
+ *
+ * Copyright 2013 CSC Australia Pty. Ltd.
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2.  This program is licensed "as is" without
+ * any warranty of any kind, whether express or implied.
+ */
+
+/dts-v1/;
+
+/ {
+	model = "MVME5100";
+	compatible = "MVME5100";
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	aliases {
+		serial0 = &serial0;
+		pci0 = &pci0;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		PowerPC,7410 {
+			device_type = "cpu";
+			reg = <0x0>;
+			/* Following required by dtc but not used */
+			d-cache-line-size = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <32768>;
+			d-cache-size = <32768>;
+			timebase-frequency = <25000000>;
+			clock-frequency = <500000000>;
+			bus-frequency = <100000000>;
+		};
+	};
+
+	memory {
+		device_type = "memory";
+		reg = <0x0 0x20000000>;
+	};
+
+	hawk@fef80000 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "hawk-bridge", "simple-bus";
+		ranges = <0x0 0xfef80000 0x10000>;
+		reg = <0xfef80000 0x10000>;
+
+		serial0: serial@8000 {
+			device_type = "serial";
+			compatible = "ns16550";
+			reg = <0x8000 0x80>;
+			reg-shift = <4>;
+			clock-frequency = <1843200>;
+			current-speed = <9600>;
+			interrupts = <1 1>; // IRQ1 Level Active Low.
+			interrupt-parent = <&mpic>;
+		};
+
+		serial1: serial@8200 {
+			device_type = "serial";
+			compatible = "ns16550";
+			reg = <0x8200 0x80>;
+			reg-shift = <4>;
+			clock-frequency = <1843200>;
+			current-speed = <9600>;
+			interrupts = <1 1>; // IRQ1 Level Active Low.
+			interrupt-parent = <&mpic>;
+		};
+
+		mpic: interrupt-controller@f3f80000 {
+			#interrupt-cells = <2>;
+			#address-cells = <0>;
+			device_type = "open-pic";
+			compatible = "chrp,open-pic";
+			interrupt-controller;
+			reg = <0xf3f80000 0x40000>;
+		};
+	};
+
+	pci0: pci@feff0000 {
+		#address-cells = <3>;
+		#size-cells = <2>;
+		#interrupt-cells = <1>;
+		device_type = "pci";
+		compatible = "hawk-pci";
+		reg = <0xfec00000 0x400000>;
+		8259-interrupt-acknowledge = <0xfeff0030>;
+		ranges = <0x1000000 0x0        0x0 0xfe000000 0x0 0x800000
+			  0x2000000 0x0 0x80000000 0x80000000 0x0 0x74000000>;
+		bus-range = <0 255>;
+		clock-frequency = <33333333>;
+		interrupt-parent = <&mpic>;
+		interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
+		interrupt-map = <
+
+			/*
+			 * This definition (IDSEL 11) duplicates the
+			 * interrupts definition in the i8259
+			 * interrupt controller below.
+			 *
+			 * Do not change the interrupt sense/polarity from
+			 * 0x2 to anything else, doing so will cause endless
+			 * "spurious" i8259 interrupts to be fielded.
+			 */
+			// IDSEL 11 - iPMC712 PCI/ISA Bridge
+			0x5800 0x0 0x0 0x1 &mpic 0x0 0x2
+			0x5800 0x0 0x0 0x2 &mpic 0x0 0x2
+			0x5800 0x0 0x0 0x3 &mpic 0x0 0x2
+			0x5800 0x0 0x0 0x4 &mpic 0x0 0x2
+
+			/* IDSEL 12 - Not Used */
+
+			/* IDSEL 13 - Universe VME Bridge */
+			0x6800 0x0 0x0 0x1 &mpic 0x5 0x1
+			0x6800 0x0 0x0 0x2 &mpic 0x6 0x1
+			0x6800 0x0 0x0 0x3 &mpic 0x7 0x1
+			0x6800 0x0 0x0 0x4 &mpic 0x8 0x1
+
+			/* IDSEL 14 - ENET 1 */
+			0x7000 0x0 0x0 0x1 &mpic 0x2 0x1
+
+			/* IDSEL 15 - Not Used */
+
+			/* IDSEL 16 - PMC Slot 1 */
+			0x8000 0x0 0x0 0x1 &mpic 0x9 0x1
+			0x8000 0x0 0x0 0x2 &mpic 0xa 0x1
+			0x8000 0x0 0x0 0x3 &mpic 0xb 0x1
+			0x8000 0x0 0x0 0x4 &mpic 0xc 0x1
+
+			/* IDSEL 17 - PMC Slot 2 */
+			0x8800 0x0 0x0 0x1 &mpic 0xc 0x1
+			0x8800 0x0 0x0 0x2 &mpic 0x9 0x1
+			0x8800 0x0 0x0 0x3 &mpic 0xa 0x1
+			0x8800 0x0 0x0 0x4 &mpic 0xb 0x1
+
+			/* IDSEL 18 - Not Used */
+
+			/* IDSEL 19 - ENET 2 */
+			0x9800 0x0 0x0 0x1 &mpic 0xd 0x1
+
+			/* IDSEL 20 - PMCSPAN (PCI-X) */
+			0xa000 0x0 0x0 0x1 &mpic 0x9 0x1
+			0xa000 0x0 0x0 0x2 &mpic 0xa 0x1
+			0xa000 0x0 0x0 0x3 &mpic 0xb 0x1
+			0xa000 0x0 0x0 0x4 &mpic 0xc 0x1
+
+		>;
+
+		isa {
+			#address-cells = <2>;
+			#size-cells = <1>;
+			#interrupt-cells = <2>;
+			device_type = "isa";
+			compatible = "isa";
+			ranges = <0x00000001 0 0x01000000 0 0x00000000 0x00001000>;
+			interrupt-parent = <&i8259>;
+
+			i8259: interrupt-controller@20 {
+				#interrupt-cells = <2>;
+				#address-cells = <0>;
+				interrupts = <0 2>;
+				device_type = "interrupt-controller";
+				compatible = "chrp,iic";
+				interrupt-controller;
+				reg = <1 0x00000020 0x00000002
+                                       1 0x000000a0 0x00000002
+                                       1 0x000004d0 0x00000002>;
+				interrupt-parent = <&mpic>;
+			};
+
+		};
+
+	};
+
+	chosen {
+		linux,stdout-path = &serial0;
+        };
+
+};
diff --git a/arch/powerpc/boot/mvme5100.c b/arch/powerpc/boot/mvme5100.c
new file mode 100644
index 0000000..cb865f8
--- /dev/null
+++ b/arch/powerpc/boot/mvme5100.c
@@ -0,0 +1,27 @@
+/*
+ * Motorola/Emerson MVME5100 with PPCBug firmware.
+ *
+ * Author: Stephen Chivers <schivers@csc.com>
+ *
+ * Copyright 2013 CSC Australia Pty. Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ */
+#include "types.h"
+#include "ops.h"
+#include "io.h"
+
+BSS_STACK(4096);
+
+void platform_init(unsigned long r3, unsigned long r4, unsigned long r5)
+{
+	u32			heapsize;
+
+	heapsize = 0x8000000 - (u32)_end; /* 128M */
+	simple_alloc_init(_end, heapsize, 32, 64);
+	fdt_init(_dtb_start);
+	serial_console_init();
+}
diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index cd7af84..401c6a1 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -257,6 +257,10 @@ epapr)
     link_address='0x20000000'
     pie=-pie
     ;;
+mvme5100)
+    platformo="$object/fixed-head.o $object/mvme5100.o"
+    binary=y
+    ;;
 esac
 
 vmz="$tmpdir/`basename \"$kernel\"`.$ext"
diff --git a/arch/powerpc/configs/mvme5100_defconfig b/arch/powerpc/configs/mvme5100_defconfig
new file mode 100644
index 0000000..93c7752
--- /dev/null
+++ b/arch/powerpc/configs/mvme5100_defconfig
@@ -0,0 +1,144 @@
+CONFIG_SYSVIPC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_LOG_BUF_SHIFT=14
+# CONFIG_UTS_NS is not set
+# CONFIG_IPC_NS is not set
+# CONFIG_PID_NS is not set
+# CONFIG_NET_NS is not set
+CONFIG_CC_OPTIMIZE_FOR_SIZE=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+# CONFIG_BLK_DEV_BSG is not set
+# CONFIG_PPC_CHRP is not set
+# CONFIG_PPC_PMAC is not set
+CONFIG_EMBEDDED6xx=y
+CONFIG_MVME5100=y
+CONFIG_KVM_GUEST=y
+CONFIG_HZ_100=y
+# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
+# CONFIG_COMPACTION is not set
+CONFIG_CMDLINE_BOOL=y
+CONFIG_CMDLINE="console=ttyS0,9600 ip=dhcp root=/dev/nfs"
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+# CONFIG_INET_LRO is not set
+# CONFIG_IPV6 is not set
+CONFIG_NETFILTER=y
+CONFIG_NF_CONNTRACK=m
+CONFIG_NF_CT_PROTO_SCTP=m
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_H323=m
+CONFIG_NF_CONNTRACK_IRC=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_PPTP=m
+CONFIG_NF_CONNTRACK_SIP=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NF_CONNTRACK_IPV4=m
+CONFIG_IP_NF_IPTABLES=m
+CONFIG_IP_NF_FILTER=m
+CONFIG_IP_NF_TARGET_REJECT=m
+CONFIG_IP_NF_MANGLE=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+CONFIG_LAPB=m
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_PROC_DEVICETREE=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BLK_DEV_RAM_COUNT=2
+CONFIG_BLK_DEV_RAM_SIZE=8192
+CONFIG_EEPROM_LEGACY=m
+CONFIG_NETDEVICES=y
+CONFIG_TUN=m
+# CONFIG_NET_VENDOR_3COM is not set
+CONFIG_E100=y
+# CONFIG_WLAN is not set
+# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
+# CONFIG_INPUT_KEYBOARD is not set
+# CONFIG_INPUT_MOUSE is not set
+# CONFIG_SERIO is not set
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_NR_UARTS=10
+CONFIG_SERIAL_8250_EXTENDED=y
+CONFIG_SERIAL_8250_MANY_PORTS=y
+CONFIG_SERIAL_8250_SHARE_IRQ=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_HW_RANDOM=y
+CONFIG_I2C=y
+CONFIG_I2C_CHARDEV=y
+CONFIG_I2C_MPC=y
+# CONFIG_HWMON is not set
+CONFIG_VIDEO_OUTPUT_CONTROL=m
+# CONFIG_VGA_CONSOLE is not set
+# CONFIG_HID is not set
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_IOMMU_SUPPORT is not set
+CONFIG_VME_BUS=m
+CONFIG_VME_CA91CX42=m
+CONFIG_EXT2_FS=m
+CONFIG_EXT3_FS=m
+# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
+CONFIG_XFS_FS=m
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=m
+CONFIG_MSDOS_FS=m
+CONFIG_VFAT_FS=m
+CONFIG_PROC_KCORE=y
+CONFIG_TMPFS=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_ROOT_NFS=y
+CONFIG_NFSD=m
+CONFIG_NFSD_V3=y
+CONFIG_CIFS=m
+CONFIG_NLS=y
+CONFIG_NLS_CODEPAGE_437=m
+CONFIG_NLS_CODEPAGE_932=m
+CONFIG_NLS_ISO8859_1=m
+CONFIG_NLS_UTF8=m
+CONFIG_CRC_CCITT=m
+CONFIG_CRC_T10DIF=y
+CONFIG_XZ_DEC=y
+CONFIG_XZ_DEC_X86=y
+CONFIG_XZ_DEC_IA64=y
+CONFIG_XZ_DEC_ARM=y
+CONFIG_XZ_DEC_ARMTHUMB=y
+CONFIG_XZ_DEC_SPARC=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=20
+CONFIG_CRYPTO_CBC=y
+CONFIG_CRYPTO_PCBC=m
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_MICHAEL_MIC=m
+CONFIG_CRYPTO_SHA1=m
+CONFIG_CRYPTO_BLOWFISH=m
+CONFIG_CRYPTO_DES=y
+CONFIG_CRYPTO_SERPENT=m
+CONFIG_CRYPTO_TWOFISH=m
+CONFIG_CRYPTO_DEFLATE=m
+# CONFIG_CRYPTO_ANSI_CPRNG is not set
diff --git a/arch/powerpc/platforms/embedded6xx/Kconfig b/arch/powerpc/platforms/embedded6xx/Kconfig
index 302ba43..6d3c7a9 100644
--- a/arch/powerpc/platforms/embedded6xx/Kconfig
+++ b/arch/powerpc/platforms/embedded6xx/Kconfig
@@ -67,6 +67,18 @@ config PPC_C2K
 	  This option enables support for the GE Fanuc C2K board (formerly
 	  an SBS board).
 
+config MVME5100
+	bool "Motorola/Emerson MVME5100"
+	depends on EMBEDDED6xx
+	select MPIC
+	select PCI
+	select PPC_INDIRECT_PCI
+	select PPC_I8259
+	select PPC_NATIVE
+	help
+	  This option enables support for the Motorola (now Emerson) MVME5100
+	  board.
+
 config TSI108_BRIDGE
 	bool
 	select PCI
@@ -113,4 +125,3 @@ config WII
 	help
 	  Select WII if configuring for the Nintendo Wii.
 	  More information at: <http://gc-linux.sourceforge.net/>
-
diff --git a/arch/powerpc/platforms/embedded6xx/Makefile b/arch/powerpc/platforms/embedded6xx/Makefile
index 66c23e4..cdd48d4 100644
--- a/arch/powerpc/platforms/embedded6xx/Makefile
+++ b/arch/powerpc/platforms/embedded6xx/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_USBGECKO_UDBG)	+= usbgecko_udbg.o
 obj-$(CONFIG_GAMECUBE_COMMON)	+= flipper-pic.o
 obj-$(CONFIG_GAMECUBE)		+= gamecube.o
 obj-$(CONFIG_WII)		+= wii.o hlwd-pic.o
+obj-$(CONFIG_MVME5100)		+= mvme5100.o
diff --git a/arch/powerpc/platforms/embedded6xx/mvme5100.c b/arch/powerpc/platforms/embedded6xx/mvme5100.c
new file mode 100644
index 0000000..25e3bfb
--- /dev/null
+++ b/arch/powerpc/platforms/embedded6xx/mvme5100.c
@@ -0,0 +1,221 @@
+/*
+ * Board setup routines for the Motorola/Emerson MVME5100.
+ *
+ * Copyright 2013 CSC Australia Pty. Ltd.
+ *
+ * Based on earlier code by:
+ *
+ *    Matt Porter, MontaVista Software Inc.
+ *    Copyright 2001 MontaVista Software Inc.
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * Author: Stephen Chivers <schivers@csc.com>
+ *
+ */
+
+#include <linux/of_platform.h>
+
+#include <asm/i8259.h>
+#include <asm/pci-bridge.h>
+#include <asm/mpic.h>
+#include <asm/prom.h>
+#include <mm/mmu_decl.h>
+#include <asm/udbg.h>
+
+#define HAWK_MPIC_SIZE		0x00040000U
+#define MVME5100_PCI_MEM_OFFSET 0x00000000
+
+/* Board register addresses. */
+#define BOARD_STATUS_REG	0xfef88080
+#define BOARD_MODFAIL_REG	0xfef88090
+#define BOARD_MODRST_REG	0xfef880a0
+#define BOARD_TBEN_REG		0xfef880c0
+#define BOARD_SW_READ_REG	0xfef880e0
+#define BOARD_GEO_ADDR_REG	0xfef880e8
+#define BOARD_EXT_FEATURE1_REG	0xfef880f0
+#define BOARD_EXT_FEATURE2_REG	0xfef88100
+
+static phys_addr_t pci_membase;
+static u_char *restart;
+
+static void mvme5100_8259_cascade(unsigned int irq, struct irq_desc *desc)
+{
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	unsigned int cascade_irq = i8259_irq();
+
+	if (cascade_irq != NO_IRQ)
+		generic_handle_irq(cascade_irq);
+
+	chip->irq_eoi(&desc->irq_data);
+}
+
+static void __init mvme5100_pic_init(void)
+{
+	struct mpic *mpic;
+	struct device_node *np;
+	struct device_node *cp = NULL;
+	unsigned int cirq;
+	unsigned long intack = 0;
+	const u32 *prop = NULL;
+
+	np = of_find_node_by_type(NULL, "open-pic");
+	if (!np) {
+		pr_err("Could not find open-pic node\n");
+		return;
+	}
+
+	mpic = mpic_alloc(np, pci_membase, 0, 16, 256, " OpenPIC  ");
+
+	BUG_ON(mpic == NULL);
+	of_node_put(np);
+
+	mpic_assign_isu(mpic, 0, pci_membase + 0x10000);
+
+	mpic_init(mpic);
+
+	cp = of_find_compatible_node(NULL, NULL, "chrp,iic");
+	if (cp == NULL) {
+		pr_warn("mvme5100_pic_init: couldn't find i8259\n");
+		return;
+	}
+
+	cirq = irq_of_parse_and_map(cp, 0);
+	if (cirq == NO_IRQ) {
+		pr_warn("mvme5100_pic_init: no cascade interrupt?\n");
+		return;
+	}
+
+	np = of_find_compatible_node(NULL, "pci", "mpc10x-pci");
+	if (np) {
+		prop = of_get_property(np, "8259-interrupt-acknowledge", NULL);
+
+		if (prop)
+			intack = prop[0];
+
+		of_node_put(np);
+	}
+
+	if (intack)
+		pr_debug("mvme5100_pic_init: PCI 8259 intack at 0x%016lx\n",
+		   intack);
+
+	i8259_init(cp, intack);
+	of_node_put(cp);
+	irq_set_chained_handler(cirq, mvme5100_8259_cascade);
+}
+
+static int __init mvme5100_add_bridge(struct device_node *dev)
+{
+	const int		*bus_range;
+	int			len;
+	struct pci_controller	*hose;
+	unsigned short		devid;
+
+	pr_info("Adding PCI host bridge %s\n", dev->full_name);
+
+	bus_range = of_get_property(dev, "bus-range", &len);
+
+	hose = pcibios_alloc_controller(dev);
+	if (hose == NULL)
+		return -ENOMEM;
+
+	hose->first_busno = bus_range ? bus_range[0] : 0;
+	hose->last_busno = bus_range ? bus_range[1] : 0xff;
+
+	setup_indirect_pci(hose, 0xfe000cf8, 0xfe000cfc, 0);
+
+	pci_process_bridge_OF_ranges(hose, dev, 1);
+
+	early_read_config_word(hose, 0, 0, PCI_DEVICE_ID, &devid);
+
+	if (devid != PCI_DEVICE_ID_MOTOROLA_HAWK) {
+		pr_err("HAWK PHB not present?\n");
+		return 0;
+	}
+
+	early_read_config_dword(hose, 0, 0, PCI_BASE_ADDRESS_1, &pci_membase);
+
+	if (pci_membase == 0) {
+		pr_err("HAWK PHB mibar not correctly set?\n");
+		return 0;
+	}
+
+	pr_info("mvme5100_pic_init: pci_membase: %x\n", pci_membase);
+
+	return 0;
+}
+
+static struct of_device_id mvme5100_of_bus_ids[] __initdata = {
+	{ .compatible = "hawk-bridge", },
+	{},
+};
+
+/*
+ * Setup the architecture
+ */
+static void __init mvme5100_setup_arch(void)
+{
+	struct device_node *np;
+
+	if (ppc_md.progress)
+		ppc_md.progress("mvme5100_setup_arch()", 0);
+
+	for_each_compatible_node(np, "pci", "hawk-pci")
+		mvme5100_add_bridge(np);
+
+	restart = ioremap(BOARD_MODRST_REG, 4);
+}
+
+
+static void mvme5100_show_cpuinfo(struct seq_file *m)
+{
+	seq_puts(m, "Vendor\t\t: Motorola/Emerson\n");
+	seq_puts(m, "Machine\t\t: MVME5100\n");
+}
+
+static void mvme5100_restart(char *cmd)
+{
+
+	local_irq_disable();
+	mtmsr(mfmsr() | MSR_IP);
+
+	out_8((u_char *) restart, 0x01);
+
+	while (1)
+		;
+}
+
+/*
+ * Called very early, device-tree isn't unflattened
+ */
+static int __init mvme5100_probe(void)
+{
+	unsigned long root = of_get_flat_dt_root();
+
+	return of_flat_dt_is_compatible(root, "MVME5100");
+}
+
+static int __init probe_of_platform_devices(void)
+{
+
+	of_platform_bus_probe(NULL, mvme5100_of_bus_ids, NULL);
+	return 0;
+}
+
+machine_device_initcall(mvme5100, probe_of_platform_devices);
+
+define_machine(mvme5100) {
+	.name			= "MVME5100",
+	.probe			= mvme5100_probe,
+	.setup_arch		= mvme5100_setup_arch,
+	.init_IRQ		= mvme5100_pic_init,
+	.show_cpuinfo		= mvme5100_show_cpuinfo,
+	.get_irq		= mpic_get_irq,
+	.restart		= mvme5100_restart,
+	.calibrate_decr		= generic_calibrate_decr,
+	.progress		= udbg_progress,
+};

^ permalink raw reply related

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Benjamin Herrenschmidt @ 2013-11-03 20:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Frederic Weisbecker,
	Oleg Nesterov, LKML, Linux PPC dev, Anton Blanchard,
	Victor Kaplansky, Paul E. McKenney, Linus Torvalds
In-Reply-To: <20131103151704.GJ19466@laptop.lan>

On Sun, 2013-11-03 at 16:17 +0100, Peter Zijlstra wrote:
> On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> > If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
> 
> Well, I'm obviously all for introducing this new barrier, for it will
> reduce a full mfence on x86 to a compiler barrier. And ppc can use
> lwsync as opposed to sync afaict. Not sure ARM can do better.

The patch at the *very least* needs a good description of the semantics
of the barrier, what does it order vs. what etc...

Cheers,
Ben.

> ---
> Subject: arch: Introduce new TSO memory barrier smp_tmb()
> 
> A few sites could be downgraded from smp_mb() to smp_tmb() and a few
> site should be upgraded to smp_tmb() that are now using smp_wmb().
> 
> XXX hope PaulMck explains things better..
> 
> X86 (!OOSTORE), SPARC have native TSO memory models and smp_tmb()
> reduces to barrier().
> 
> PPC can use lwsync instead of sync
> 
> For the other archs, have smp_tmb map to smp_mb, as the stronger barrier
> is always correct but possibly suboptimal.
> 
> Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
> Not-Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> ---
>  arch/alpha/include/asm/barrier.h      | 2 ++
>  arch/arc/include/asm/barrier.h        | 2 ++
>  arch/arm/include/asm/barrier.h        | 2 ++
>  arch/arm64/include/asm/barrier.h      | 2 ++
>  arch/avr32/include/asm/barrier.h      | 1 +
>  arch/blackfin/include/asm/barrier.h   | 1 +
>  arch/cris/include/asm/barrier.h       | 2 ++
>  arch/frv/include/asm/barrier.h        | 1 +
>  arch/h8300/include/asm/barrier.h      | 2 ++
>  arch/hexagon/include/asm/barrier.h    | 1 +
>  arch/ia64/include/asm/barrier.h       | 2 ++
>  arch/m32r/include/asm/barrier.h       | 2 ++
>  arch/m68k/include/asm/barrier.h       | 1 +
>  arch/metag/include/asm/barrier.h      | 3 +++
>  arch/microblaze/include/asm/barrier.h | 1 +
>  arch/mips/include/asm/barrier.h       | 3 +++
>  arch/mn10300/include/asm/barrier.h    | 2 ++
>  arch/parisc/include/asm/barrier.h     | 1 +
>  arch/powerpc/include/asm/barrier.h    | 2 ++
>  arch/s390/include/asm/barrier.h       | 1 +
>  arch/score/include/asm/barrier.h      | 1 +
>  arch/sh/include/asm/barrier.h         | 2 ++
>  arch/sparc/include/asm/barrier_32.h   | 1 +
>  arch/sparc/include/asm/barrier_64.h   | 3 +++
>  arch/tile/include/asm/barrier.h       | 2 ++
>  arch/unicore32/include/asm/barrier.h  | 1 +
>  arch/x86/include/asm/barrier.h        | 3 +++
>  arch/xtensa/include/asm/barrier.h     | 1 +
>  28 files changed, 48 insertions(+)
> 
> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> index ce8860a0b32d..02ea63897038 100644
> --- a/arch/alpha/include/asm/barrier.h
> +++ b/arch/alpha/include/asm/barrier.h
> @@ -18,12 +18,14 @@ __asm__ __volatile__("mb": : :"memory")
>  #ifdef CONFIG_SMP
>  #define __ASM_SMP_MB	"\tmb\n"
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  #define __ASM_SMP_MB
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while (0)
> diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
> index f6cb7c4ffb35..456c790fa1ad 100644
> --- a/arch/arc/include/asm/barrier.h
> +++ b/arch/arc/include/asm/barrier.h
> @@ -22,10 +22,12 @@
>  /* TODO-vineetg verify the correctness of macros here */
>  #ifdef CONFIG_SMP
>  #define smp_mb()        mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()       rmb()
>  #define smp_wmb()       wmb()
>  #else
>  #define smp_mb()        barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()       barrier()
>  #define smp_wmb()       barrier()
>  #endif
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 60f15e274e6d..bc88a8505673 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -51,10 +51,12 @@
>  
>  #ifndef CONFIG_SMP
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #else
>  #define smp_mb()	dmb(ish)
> +#define smp_tmb()	smp_mb()
>  #define smp_rmb()	smp_mb()
>  #define smp_wmb()	dmb(ishst)
>  #endif
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index d4a63338a53c..ec0531f4892f 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -33,10 +33,12 @@
>  
>  #ifndef CONFIG_SMP
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #else
>  #define smp_mb()	asm volatile("dmb ish" : : : "memory")
> +#define smp_tmb()	asm volatile("dmb ish" : : : "memory")
>  #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
>  #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
>  #endif
> diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
> index 0961275373db..6c6ccb9cf290 100644
> --- a/arch/avr32/include/asm/barrier.h
> +++ b/arch/avr32/include/asm/barrier.h
> @@ -20,6 +20,7 @@
>  # error "The AVR32 port does not support SMP"
>  #else
>  # define smp_mb()		barrier()
> +# define smp_tmb()		barrier()
>  # define smp_rmb()		barrier()
>  # define smp_wmb()		barrier()
>  # define smp_read_barrier_depends() do { } while(0)
> diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
> index ebb189507dd7..100f49121a18 100644
> --- a/arch/blackfin/include/asm/barrier.h
> +++ b/arch/blackfin/include/asm/barrier.h
> @@ -40,6 +40,7 @@
>  #endif /* !CONFIG_SMP */
>  
>  #define smp_mb()  mb()
> +#define smp_tmb() mb()
>  #define smp_rmb() rmb()
>  #define smp_wmb() wmb()
>  #define set_mb(var, value) do { var = value; mb(); } while (0)
> diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
> index 198ad7fa6b25..679c33738b4c 100644
> --- a/arch/cris/include/asm/barrier.h
> +++ b/arch/cris/include/asm/barrier.h
> @@ -12,11 +12,13 @@
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()        mb()
> +#define smp_tmb()       mb()
>  #define smp_rmb()       rmb()
>  #define smp_wmb()       wmb()
>  #define smp_read_barrier_depends()     read_barrier_depends()
>  #else
>  #define smp_mb()        barrier()
> +#define smp_tmb()       barrier()
>  #define smp_rmb()       barrier()
>  #define smp_wmb()       barrier()
>  #define smp_read_barrier_depends()     do { } while(0)
> diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
> index 06776ad9f5e9..60354ce13ba0 100644
> --- a/arch/frv/include/asm/barrier.h
> +++ b/arch/frv/include/asm/barrier.h
> @@ -20,6 +20,7 @@
>  #define read_barrier_depends()	do { } while (0)
>  
>  #define smp_mb()			barrier()
> +#define smp_tmb()			barrier()
>  #define smp_rmb()			barrier()
>  #define smp_wmb()			barrier()
>  #define smp_read_barrier_depends()	do {} while(0)
> diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
> index 9e0aa9fc195d..e8e297fa4e9a 100644
> --- a/arch/h8300/include/asm/barrier.h
> +++ b/arch/h8300/include/asm/barrier.h
> @@ -16,11 +16,13 @@
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
> index 1041a8e70ce8..2dd5b2ad4d21 100644
> --- a/arch/hexagon/include/asm/barrier.h
> +++ b/arch/hexagon/include/asm/barrier.h
> @@ -28,6 +28,7 @@
>  #define smp_rmb()			barrier()
>  #define smp_read_barrier_depends()	barrier()
>  #define smp_wmb()			barrier()
> +#define smp_tmb()			barrier()
>  #define smp_mb()			barrier()
>  #define smp_mb__before_atomic_dec()	barrier()
>  #define smp_mb__after_atomic_dec()	barrier()
> diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
> index 60576e06b6fb..a5f92146b091 100644
> --- a/arch/ia64/include/asm/barrier.h
> +++ b/arch/ia64/include/asm/barrier.h
> @@ -42,11 +42,13 @@
>  
>  #ifdef CONFIG_SMP
>  # define smp_mb()	mb()
> +# define smp_tmb()	mb()
>  # define smp_rmb()	rmb()
>  # define smp_wmb()	wmb()
>  # define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  # define smp_mb()	barrier()
> +# define smp_tmb()	barrier()
>  # define smp_rmb()	barrier()
>  # define smp_wmb()	barrier()
>  # define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
> index 6976621efd3f..a6fa29facd7a 100644
> --- a/arch/m32r/include/asm/barrier.h
> +++ b/arch/m32r/include/asm/barrier.h
> @@ -79,12 +79,14 @@
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while (0)
> diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
> index 445ce22c23cb..8ecf52c87847 100644
> --- a/arch/m68k/include/asm/barrier.h
> +++ b/arch/m68k/include/asm/barrier.h
> @@ -13,6 +13,7 @@
>  #define set_mb(var, value)	({ (var) = (value); wmb(); })
>  
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	((void)0)
> diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
> index c90bfc6bf648..eb179fbce580 100644
> --- a/arch/metag/include/asm/barrier.h
> +++ b/arch/metag/include/asm/barrier.h
> @@ -50,6 +50,7 @@ static inline void wmb(void)
>  #ifndef CONFIG_SMP
>  #define fence()		do { } while (0)
>  #define smp_mb()        barrier()
> +#define smp_tmb()       barrier()
>  #define smp_rmb()       barrier()
>  #define smp_wmb()       barrier()
>  #else
> @@ -70,11 +71,13 @@ static inline void fence(void)
>  	*flushptr = 0;
>  }
>  #define smp_mb()        fence()
> +#define smp_tmb()       fence()
>  #define smp_rmb()       fence()
>  #define smp_wmb()       barrier()
>  #else
>  #define fence()		do { } while (0)
>  #define smp_mb()        barrier()
> +#define smp_tmb()       barrier()
>  #define smp_rmb()       barrier()
>  #define smp_wmb()       barrier()
>  #endif
> diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
> index df5be3e87044..d573c170a717 100644
> --- a/arch/microblaze/include/asm/barrier.h
> +++ b/arch/microblaze/include/asm/barrier.h
> @@ -21,6 +21,7 @@
>  #define set_wmb(var, value)	do { var = value; wmb(); } while (0)
>  
>  #define smp_mb()		mb()
> +#define smp_tmb()		mb()
>  #define smp_rmb()		rmb()
>  #define smp_wmb()		wmb()
>  
> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> index 314ab5532019..535e699eec3b 100644
> --- a/arch/mips/include/asm/barrier.h
> +++ b/arch/mips/include/asm/barrier.h
> @@ -144,15 +144,18 @@
>  #if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
>  # ifdef CONFIG_CPU_CAVIUM_OCTEON
>  #  define smp_mb()	__sync()
> +#  define smp_tmb()	__sync()
>  #  define smp_rmb()	barrier()
>  #  define smp_wmb()	__syncw()
>  # else
>  #  define smp_mb()	__asm__ __volatile__("sync" : : :"memory")
> +#  define smp_tmb()	__asm__ __volatile__("sync" : : :"memory")
>  #  define smp_rmb()	__asm__ __volatile__("sync" : : :"memory")
>  #  define smp_wmb()	__asm__ __volatile__("sync" : : :"memory")
>  # endif
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #endif
> diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
> index 2bd97a5c8af7..a345b0776e5f 100644
> --- a/arch/mn10300/include/asm/barrier.h
> +++ b/arch/mn10300/include/asm/barrier.h
> @@ -19,11 +19,13 @@
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define set_mb(var, value)  do { xchg(&var, value); } while (0)
>  #else  /* CONFIG_SMP */
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define set_mb(var, value)  do { var = value;  mb(); } while (0)
> diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
> index e77d834aa803..f53196b589ec 100644
> --- a/arch/parisc/include/asm/barrier.h
> +++ b/arch/parisc/include/asm/barrier.h
> @@ -25,6 +25,7 @@
>  #define rmb()		mb()
>  #define wmb()		mb()
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	mb()
>  #define smp_wmb()	mb()
>  #define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index ae782254e731..d7e8a560f1fe 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -46,11 +46,13 @@
>  #endif
>  
>  #define smp_mb()	mb()
> +#define smp_tmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
>  #define smp_rmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
>  #define smp_wmb()	__asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> index 16760eeb79b0..f0409a874243 100644
> --- a/arch/s390/include/asm/barrier.h
> +++ b/arch/s390/include/asm/barrier.h
> @@ -24,6 +24,7 @@
>  #define wmb()				mb()
>  #define read_barrier_depends()		do { } while(0)
>  #define smp_mb()			mb()
> +#define smp_tmb()			mb()
>  #define smp_rmb()			rmb()
>  #define smp_wmb()			wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
> diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
> index 0eacb6471e6d..865652083dde 100644
> --- a/arch/score/include/asm/barrier.h
> +++ b/arch/score/include/asm/barrier.h
> @@ -5,6 +5,7 @@
>  #define rmb()		barrier()
>  #define wmb()		barrier()
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  
> diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
> index 72c103dae300..f8dce7926432 100644
> --- a/arch/sh/include/asm/barrier.h
> +++ b/arch/sh/include/asm/barrier.h
> @@ -39,11 +39,13 @@
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
> index c1b76654ee76..1037ce189cee 100644
> --- a/arch/sparc/include/asm/barrier_32.h
> +++ b/arch/sparc/include/asm/barrier_32.h
> @@ -8,6 +8,7 @@
>  #define read_barrier_depends()	do { } while(0)
>  #define set_mb(__var, __value)  do { __var = __value; mb(); } while(0)
>  #define smp_mb()	__asm__ __volatile__("":::"memory")
> +#define smp_tmb()	__asm__ __volatile__("":::"memory")
>  #define smp_rmb()	__asm__ __volatile__("":::"memory")
>  #define smp_wmb()	__asm__ __volatile__("":::"memory")
>  #define smp_read_barrier_depends()	do { } while(0)
> diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
> index 95d45986f908..0f3c2fdb86b8 100644
> --- a/arch/sparc/include/asm/barrier_64.h
> +++ b/arch/sparc/include/asm/barrier_64.h
> @@ -34,6 +34,7 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
>   * memory ordering than required by the specifications.
>   */
>  #define mb()	membar_safe("#StoreLoad")
> +#define tmb()	__asm__ __volatile__("":::"memory")
>  #define rmb()	__asm__ __volatile__("":::"memory")
>  #define wmb()	__asm__ __volatile__("":::"memory")
>  
> @@ -43,10 +44,12 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	tmb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #else
>  #define smp_mb()	__asm__ __volatile__("":::"memory")
> +#define smp_tmb()	__asm__ __volatile__("":::"memory")
>  #define smp_rmb()	__asm__ __volatile__("":::"memory")
>  #define smp_wmb()	__asm__ __volatile__("":::"memory")
>  #endif
> diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
> index a9a73da5865d..cad3c6ae28bf 100644
> --- a/arch/tile/include/asm/barrier.h
> +++ b/arch/tile/include/asm/barrier.h
> @@ -127,11 +127,13 @@ mb_incoherent(void)
>  
>  #ifdef CONFIG_SMP
>  #define smp_mb()	mb()
> +#define smp_tmb()	mb()
>  #define smp_rmb()	rmb()
>  #define smp_wmb()	wmb()
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while (0)
> diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
> index a6620e5336b6..8b341fffbda6 100644
> --- a/arch/unicore32/include/asm/barrier.h
> +++ b/arch/unicore32/include/asm/barrier.h
> @@ -18,6 +18,7 @@
>  #define rmb()				barrier()
>  #define wmb()				barrier()
>  #define smp_mb()			barrier()
> +#define smp_tmb()			barrier()
>  #define smp_rmb()			barrier()
>  #define smp_wmb()			barrier()
>  #define read_barrier_depends()		do { } while (0)
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index c6cd358a1eec..480201d83af1 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -86,14 +86,17 @@
>  # define smp_rmb()	barrier()
>  #endif
>  #ifdef CONFIG_X86_OOSTORE
> +# define smp_tmb()	mb()
>  # define smp_wmb() 	wmb()
>  #else
> +# define smp_tmb()	barrier()
>  # define smp_wmb()	barrier()
>  #endif
>  #define smp_read_barrier_depends()	read_barrier_depends()
>  #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	do { } while (0)
> diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
> index ef021677d536..7839db843ea5 100644
> --- a/arch/xtensa/include/asm/barrier.h
> +++ b/arch/xtensa/include/asm/barrier.h
> @@ -20,6 +20,7 @@
>  #error smp_* not defined
>  #else
>  #define smp_mb()	barrier()
> +#define smp_tmb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
>  #endif
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Benjamin Herrenschmidt @ 2013-11-03 20:57 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra,
	Frederic Weisbecker, LKML, Oleg Nesterov, Linux PPC dev,
	David Laight, Anton Blanchard, paulmck
In-Reply-To: <OF57D6EF77.CD2E45CE-ON42257C16.005AA6D8-42257C16.005AB961@il.ibm.com>

On Fri, 2013-11-01 at 18:30 +0200, Victor Kaplansky wrote:
> "David Laight" <David.Laight@aculab.com> wrote on 11/01/2013 06:25:29 PM:
> > gcc will do unexpected memory accesses for bit fields that are
> > adjacent to volatile data.
> > In particular it may generate 64bit sized (and aligned) RMW cycles
> > when accessing bit fields.
> > And yes, this has caused real problems.
> 
> Thanks, I am aware about this bug/feature in gcc.

AFAIK, this has been fixed in 4.8 and 4.7.3 ... 

Cheers,
Ben.

> -- Victor
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Peter Zijlstra @ 2013-11-03 20:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Paul E. McKenney
In-Reply-To: <CA+55aFx_kuvR-0dwJtjvnhtha5QBc5XcLZPRH=WKT+hYVAKOrw@mail.gmail.com>

On Sun, Nov 03, 2013 at 10:08:14AM -0800, Linus Torvalds wrote:
> On Sun, Nov 3, 2013 at 7:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> >> If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
> >
> > Well, I'm obviously all for introducing this new barrier, for it will
> > reduce a full mfence on x86 to a compiler barrier. And ppc can use
> > lwsync as opposed to sync afaict. Not sure ARM can do better.
> >
> > ---
> > Subject: arch: Introduce new TSO memory barrier smp_tmb()
> 
> This is specialized enough that I would *really* like the name to be
> more descriptive. Compare to the special "smp_read_barrier_depends()"
> maco: it's unusual, and it has very specific semantics, so it gets a
> long and descriptive name.
> 
> Memory ordering is subtle enough without then using names that are
> subtle in themselves. mb/rmb/wmb are conceptually pretty simple
> operations, and very basic when talking about memory ordering.
> "acquire" and "release" are less simple, but have descriptive names
> and have very specific uses in locking.
> 
> In contrast "smp_tmb()" is a *horrible* name, because TSO is a
> description of the memory ordering, not of a particular barrier. It's
> also not even clear that you can have a "tso barrier", since the
> ordering (like acquire/release) presumably is really about one
> particular *store*, not about some kind of barrier between different
> operations.
> 
> So please describe exactly what the semantics that barrier has, and
> then name the barrier that way.
> 
> I assume that in this particular case, the semantics RCU wants is
> "write barrier, and no preceding reads can move past this point".
> 
> Calling that "smp_tmb()" is f*cking insane, imnsho.

Fair enough; from what I could gather the proposed semantics are
RELEASE+WMB, such that neither reads not writes can cross over, writes
can't cross back, but reads could.

Since both RELEASE and WMB are trivial under TSO the entire thing
collapses.

Now I'm currently completely confused as to what C/C++ wrecks vs actual
proper memory order issues; let alone fully comprehend the case that
started all this.

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Linus Torvalds @ 2013-11-03 18:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Paul E. McKenney
In-Reply-To: <20131103151704.GJ19466@laptop.lan>

On Sun, Nov 3, 2013 at 7:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
>> If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
>
> Well, I'm obviously all for introducing this new barrier, for it will
> reduce a full mfence on x86 to a compiler barrier. And ppc can use
> lwsync as opposed to sync afaict. Not sure ARM can do better.
>
> ---
> Subject: arch: Introduce new TSO memory barrier smp_tmb()

This is specialized enough that I would *really* like the name to be
more descriptive. Compare to the special "smp_read_barrier_depends()"
maco: it's unusual, and it has very specific semantics, so it gets a
long and descriptive name.

Memory ordering is subtle enough without then using names that are
subtle in themselves. mb/rmb/wmb are conceptually pretty simple
operations, and very basic when talking about memory ordering.
"acquire" and "release" are less simple, but have descriptive names
and have very specific uses in locking.

In contrast "smp_tmb()" is a *horrible* name, because TSO is a
description of the memory ordering, not of a particular barrier. It's
also not even clear that you can have a "tso barrier", since the
ordering (like acquire/release) presumably is really about one
particular *store*, not about some kind of barrier between different
operations.

So please describe exactly what the semantics that barrier has, and
then name the barrier that way.

I assume that in this particular case, the semantics RCU wants is
"write barrier, and no preceding reads can move past this point".

Calling that "smp_tmb()" is f*cking insane, imnsho.

              Linus

^ permalink raw reply

* [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Peter Zijlstra @ 2013-11-03 15:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131103144017.GA25118@linux.vnet.ibm.com>

On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().

Well, I'm obviously all for introducing this new barrier, for it will
reduce a full mfence on x86 to a compiler barrier. And ppc can use
lwsync as opposed to sync afaict. Not sure ARM can do better.

---
Subject: arch: Introduce new TSO memory barrier smp_tmb()

A few sites could be downgraded from smp_mb() to smp_tmb() and a few
site should be upgraded to smp_tmb() that are now using smp_wmb().

XXX hope PaulMck explains things better..

X86 (!OOSTORE), SPARC have native TSO memory models and smp_tmb()
reduces to barrier().

PPC can use lwsync instead of sync

For the other archs, have smp_tmb map to smp_mb, as the stronger barrier
is always correct but possibly suboptimal.

Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
Not-Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 arch/alpha/include/asm/barrier.h      | 2 ++
 arch/arc/include/asm/barrier.h        | 2 ++
 arch/arm/include/asm/barrier.h        | 2 ++
 arch/arm64/include/asm/barrier.h      | 2 ++
 arch/avr32/include/asm/barrier.h      | 1 +
 arch/blackfin/include/asm/barrier.h   | 1 +
 arch/cris/include/asm/barrier.h       | 2 ++
 arch/frv/include/asm/barrier.h        | 1 +
 arch/h8300/include/asm/barrier.h      | 2 ++
 arch/hexagon/include/asm/barrier.h    | 1 +
 arch/ia64/include/asm/barrier.h       | 2 ++
 arch/m32r/include/asm/barrier.h       | 2 ++
 arch/m68k/include/asm/barrier.h       | 1 +
 arch/metag/include/asm/barrier.h      | 3 +++
 arch/microblaze/include/asm/barrier.h | 1 +
 arch/mips/include/asm/barrier.h       | 3 +++
 arch/mn10300/include/asm/barrier.h    | 2 ++
 arch/parisc/include/asm/barrier.h     | 1 +
 arch/powerpc/include/asm/barrier.h    | 2 ++
 arch/s390/include/asm/barrier.h       | 1 +
 arch/score/include/asm/barrier.h      | 1 +
 arch/sh/include/asm/barrier.h         | 2 ++
 arch/sparc/include/asm/barrier_32.h   | 1 +
 arch/sparc/include/asm/barrier_64.h   | 3 +++
 arch/tile/include/asm/barrier.h       | 2 ++
 arch/unicore32/include/asm/barrier.h  | 1 +
 arch/x86/include/asm/barrier.h        | 3 +++
 arch/xtensa/include/asm/barrier.h     | 1 +
 28 files changed, 48 insertions(+)

diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
index ce8860a0b32d..02ea63897038 100644
--- a/arch/alpha/include/asm/barrier.h
+++ b/arch/alpha/include/asm/barrier.h
@@ -18,12 +18,14 @@ __asm__ __volatile__("mb": : :"memory")
 #ifdef CONFIG_SMP
 #define __ASM_SMP_MB	"\tmb\n"
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
 #define __ASM_SMP_MB
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while (0)
diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
index f6cb7c4ffb35..456c790fa1ad 100644
--- a/arch/arc/include/asm/barrier.h
+++ b/arch/arc/include/asm/barrier.h
@@ -22,10 +22,12 @@
 /* TODO-vineetg verify the correctness of macros here */
 #ifdef CONFIG_SMP
 #define smp_mb()        mb()
+#define smp_tmb()	mb()
 #define smp_rmb()       rmb()
 #define smp_wmb()       wmb()
 #else
 #define smp_mb()        barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()       barrier()
 #define smp_wmb()       barrier()
 #endif
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 60f15e274e6d..bc88a8505673 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -51,10 +51,12 @@
 
 #ifndef CONFIG_SMP
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #else
 #define smp_mb()	dmb(ish)
+#define smp_tmb()	smp_mb()
 #define smp_rmb()	smp_mb()
 #define smp_wmb()	dmb(ishst)
 #endif
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index d4a63338a53c..ec0531f4892f 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -33,10 +33,12 @@
 
 #ifndef CONFIG_SMP
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #else
 #define smp_mb()	asm volatile("dmb ish" : : : "memory")
+#define smp_tmb()	asm volatile("dmb ish" : : : "memory")
 #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
 #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
 #endif
diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
index 0961275373db..6c6ccb9cf290 100644
--- a/arch/avr32/include/asm/barrier.h
+++ b/arch/avr32/include/asm/barrier.h
@@ -20,6 +20,7 @@
 # error "The AVR32 port does not support SMP"
 #else
 # define smp_mb()		barrier()
+# define smp_tmb()		barrier()
 # define smp_rmb()		barrier()
 # define smp_wmb()		barrier()
 # define smp_read_barrier_depends() do { } while(0)
diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
index ebb189507dd7..100f49121a18 100644
--- a/arch/blackfin/include/asm/barrier.h
+++ b/arch/blackfin/include/asm/barrier.h
@@ -40,6 +40,7 @@
 #endif /* !CONFIG_SMP */
 
 #define smp_mb()  mb()
+#define smp_tmb() mb()
 #define smp_rmb() rmb()
 #define smp_wmb() wmb()
 #define set_mb(var, value) do { var = value; mb(); } while (0)
diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
index 198ad7fa6b25..679c33738b4c 100644
--- a/arch/cris/include/asm/barrier.h
+++ b/arch/cris/include/asm/barrier.h
@@ -12,11 +12,13 @@
 
 #ifdef CONFIG_SMP
 #define smp_mb()        mb()
+#define smp_tmb()       mb()
 #define smp_rmb()       rmb()
 #define smp_wmb()       wmb()
 #define smp_read_barrier_depends()     read_barrier_depends()
 #else
 #define smp_mb()        barrier()
+#define smp_tmb()       barrier()
 #define smp_rmb()       barrier()
 #define smp_wmb()       barrier()
 #define smp_read_barrier_depends()     do { } while(0)
diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
index 06776ad9f5e9..60354ce13ba0 100644
--- a/arch/frv/include/asm/barrier.h
+++ b/arch/frv/include/asm/barrier.h
@@ -20,6 +20,7 @@
 #define read_barrier_depends()	do { } while (0)
 
 #define smp_mb()			barrier()
+#define smp_tmb()			barrier()
 #define smp_rmb()			barrier()
 #define smp_wmb()			barrier()
 #define smp_read_barrier_depends()	do {} while(0)
diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
index 9e0aa9fc195d..e8e297fa4e9a 100644
--- a/arch/h8300/include/asm/barrier.h
+++ b/arch/h8300/include/asm/barrier.h
@@ -16,11 +16,13 @@
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
index 1041a8e70ce8..2dd5b2ad4d21 100644
--- a/arch/hexagon/include/asm/barrier.h
+++ b/arch/hexagon/include/asm/barrier.h
@@ -28,6 +28,7 @@
 #define smp_rmb()			barrier()
 #define smp_read_barrier_depends()	barrier()
 #define smp_wmb()			barrier()
+#define smp_tmb()			barrier()
 #define smp_mb()			barrier()
 #define smp_mb__before_atomic_dec()	barrier()
 #define smp_mb__after_atomic_dec()	barrier()
diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 60576e06b6fb..a5f92146b091 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -42,11 +42,13 @@
 
 #ifdef CONFIG_SMP
 # define smp_mb()	mb()
+# define smp_tmb()	mb()
 # define smp_rmb()	rmb()
 # define smp_wmb()	wmb()
 # define smp_read_barrier_depends()	read_barrier_depends()
 #else
 # define smp_mb()	barrier()
+# define smp_tmb()	barrier()
 # define smp_rmb()	barrier()
 # define smp_wmb()	barrier()
 # define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
index 6976621efd3f..a6fa29facd7a 100644
--- a/arch/m32r/include/asm/barrier.h
+++ b/arch/m32r/include/asm/barrier.h
@@ -79,12 +79,14 @@
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while (0)
diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
index 445ce22c23cb..8ecf52c87847 100644
--- a/arch/m68k/include/asm/barrier.h
+++ b/arch/m68k/include/asm/barrier.h
@@ -13,6 +13,7 @@
 #define set_mb(var, value)	({ (var) = (value); wmb(); })
 
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	((void)0)
diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index c90bfc6bf648..eb179fbce580 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -50,6 +50,7 @@ static inline void wmb(void)
 #ifndef CONFIG_SMP
 #define fence()		do { } while (0)
 #define smp_mb()        barrier()
+#define smp_tmb()       barrier()
 #define smp_rmb()       barrier()
 #define smp_wmb()       barrier()
 #else
@@ -70,11 +71,13 @@ static inline void fence(void)
 	*flushptr = 0;
 }
 #define smp_mb()        fence()
+#define smp_tmb()       fence()
 #define smp_rmb()       fence()
 #define smp_wmb()       barrier()
 #else
 #define fence()		do { } while (0)
 #define smp_mb()        barrier()
+#define smp_tmb()       barrier()
 #define smp_rmb()       barrier()
 #define smp_wmb()       barrier()
 #endif
diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
index df5be3e87044..d573c170a717 100644
--- a/arch/microblaze/include/asm/barrier.h
+++ b/arch/microblaze/include/asm/barrier.h
@@ -21,6 +21,7 @@
 #define set_wmb(var, value)	do { var = value; wmb(); } while (0)
 
 #define smp_mb()		mb()
+#define smp_tmb()		mb()
 #define smp_rmb()		rmb()
 #define smp_wmb()		wmb()
 
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 314ab5532019..535e699eec3b 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -144,15 +144,18 @@
 #if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
 # ifdef CONFIG_CPU_CAVIUM_OCTEON
 #  define smp_mb()	__sync()
+#  define smp_tmb()	__sync()
 #  define smp_rmb()	barrier()
 #  define smp_wmb()	__syncw()
 # else
 #  define smp_mb()	__asm__ __volatile__("sync" : : :"memory")
+#  define smp_tmb()	__asm__ __volatile__("sync" : : :"memory")
 #  define smp_rmb()	__asm__ __volatile__("sync" : : :"memory")
 #  define smp_wmb()	__asm__ __volatile__("sync" : : :"memory")
 # endif
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #endif
diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
index 2bd97a5c8af7..a345b0776e5f 100644
--- a/arch/mn10300/include/asm/barrier.h
+++ b/arch/mn10300/include/asm/barrier.h
@@ -19,11 +19,13 @@
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define set_mb(var, value)  do { xchg(&var, value); } while (0)
 #else  /* CONFIG_SMP */
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define set_mb(var, value)  do { var = value;  mb(); } while (0)
diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
index e77d834aa803..f53196b589ec 100644
--- a/arch/parisc/include/asm/barrier.h
+++ b/arch/parisc/include/asm/barrier.h
@@ -25,6 +25,7 @@
 #define rmb()		mb()
 #define wmb()		mb()
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	mb()
 #define smp_wmb()	mb()
 #define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index ae782254e731..d7e8a560f1fe 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -46,11 +46,13 @@
 #endif
 
 #define smp_mb()	mb()
+#define smp_tmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
 #define smp_rmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
 #define smp_wmb()	__asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 16760eeb79b0..f0409a874243 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -24,6 +24,7 @@
 #define wmb()				mb()
 #define read_barrier_depends()		do { } while(0)
 #define smp_mb()			mb()
+#define smp_tmb()			mb()
 #define smp_rmb()			rmb()
 #define smp_wmb()			wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
index 0eacb6471e6d..865652083dde 100644
--- a/arch/score/include/asm/barrier.h
+++ b/arch/score/include/asm/barrier.h
@@ -5,6 +5,7 @@
 #define rmb()		barrier()
 #define wmb()		barrier()
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 
diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
index 72c103dae300..f8dce7926432 100644
--- a/arch/sh/include/asm/barrier.h
+++ b/arch/sh/include/asm/barrier.h
@@ -39,11 +39,13 @@
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
index c1b76654ee76..1037ce189cee 100644
--- a/arch/sparc/include/asm/barrier_32.h
+++ b/arch/sparc/include/asm/barrier_32.h
@@ -8,6 +8,7 @@
 #define read_barrier_depends()	do { } while(0)
 #define set_mb(__var, __value)  do { __var = __value; mb(); } while(0)
 #define smp_mb()	__asm__ __volatile__("":::"memory")
+#define smp_tmb()	__asm__ __volatile__("":::"memory")
 #define smp_rmb()	__asm__ __volatile__("":::"memory")
 #define smp_wmb()	__asm__ __volatile__("":::"memory")
 #define smp_read_barrier_depends()	do { } while(0)
diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 95d45986f908..0f3c2fdb86b8 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -34,6 +34,7 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
  * memory ordering than required by the specifications.
  */
 #define mb()	membar_safe("#StoreLoad")
+#define tmb()	__asm__ __volatile__("":::"memory")
 #define rmb()	__asm__ __volatile__("":::"memory")
 #define wmb()	__asm__ __volatile__("":::"memory")
 
@@ -43,10 +44,12 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	tmb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #else
 #define smp_mb()	__asm__ __volatile__("":::"memory")
+#define smp_tmb()	__asm__ __volatile__("":::"memory")
 #define smp_rmb()	__asm__ __volatile__("":::"memory")
 #define smp_wmb()	__asm__ __volatile__("":::"memory")
 #endif
diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
index a9a73da5865d..cad3c6ae28bf 100644
--- a/arch/tile/include/asm/barrier.h
+++ b/arch/tile/include/asm/barrier.h
@@ -127,11 +127,13 @@ mb_incoherent(void)
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
+#define smp_tmb()	mb()
 #define smp_rmb()	rmb()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while (0)
diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
index a6620e5336b6..8b341fffbda6 100644
--- a/arch/unicore32/include/asm/barrier.h
+++ b/arch/unicore32/include/asm/barrier.h
@@ -18,6 +18,7 @@
 #define rmb()				barrier()
 #define wmb()				barrier()
 #define smp_mb()			barrier()
+#define smp_tmb()			barrier()
 #define smp_rmb()			barrier()
 #define smp_wmb()			barrier()
 #define read_barrier_depends()		do { } while (0)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358a1eec..480201d83af1 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -86,14 +86,17 @@
 # define smp_rmb()	barrier()
 #endif
 #ifdef CONFIG_X86_OOSTORE
+# define smp_tmb()	mb()
 # define smp_wmb() 	wmb()
 #else
+# define smp_tmb()	barrier()
 # define smp_wmb()	barrier()
 #endif
 #define smp_read_barrier_depends()	read_barrier_depends()
 #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do { } while (0)
diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
index ef021677d536..7839db843ea5 100644
--- a/arch/xtensa/include/asm/barrier.h
+++ b/arch/xtensa/include/asm/barrier.h
@@ -20,6 +20,7 @@
 #error smp_* not defined
 #else
 #define smp_mb()	barrier()
+#define smp_tmb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
 #endif

^ permalink raw reply related

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-03 14:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131102173239.GB3947@linux.vnet.ibm.com>

On Sat, Nov 02, 2013 at 10:32:39AM -0700, Paul E. McKenney wrote:
> On Fri, Nov 01, 2013 at 03:56:34PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > > > Now the whole crux of the question is if we need barrier A at all, since
> > > > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > > > read.
> > > 
> > > The dependency you are talking about is via the "if" statement?
> > > Even C/C++11 is not required to respect control dependencies.
> > > 
> > > This one is a bit annoying.  The x86 TSO means that you really only
> > > need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> > > barrier, and so on -- but smp_mb() emits a full barrier.
> > > 
> > > Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> > > before reads, writes before writes, and reads before writes, but not
> > > writes before reads?  Another approach would be to define a per-arch
> > > barrier for this particular case.
> > 
> > I suppose we can only introduce new barrier primitives if there's more
> > than 1 use-case.
> 
> There probably are others.

If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
There are some corner cases that can happen with the current smp_wmb()
that would be prevented by smp_tmb().  These corner cases are a bit
strange, as follows:

	struct foo gp;

	void P0(void)
	{
		struct foo *p = kmalloc(sizeof(*p);

		if (!p)
			return;
		ACCESS_ONCE(p->a) = 0;
		BUG_ON(ACCESS_ONCE(p->a));
		rcu_assign_pointer(gp, p);
	}

	void P1(void)
	{
		struct foo *p = rcu_dereference(gp);

		if (!p)
			return;
		ACCESS_ONCE(p->a) = 1;
	}

With smp_wmb(), the BUG_ON() can occur because smp_wmb() does
not prevent CPU from reordering the read in the BUG_ON() with the
rcu_assign_pointer().  With smp_tmb(), it could not.

Now, I am not too worried about this because I cannot think of any use
for code like that in P0() and P1().  But if there was an smp_tmb(),
it would be cleaner to make the BUG_ON() impossible.

							Thanx, Paul

> > > > If the read shows no available space, we simply will not issue those
> > > > writes -- therefore we could argue we can avoid the memory barrier.
> > > 
> > > Proving that means iterating through the permitted combinations of
> > > compilers and architectures...  There is always hand-coded assembly
> > > language, I suppose.
> > 
> > I'm starting to think that while the C/C++ language spec says they can
> > wreck the world by doing these silly optimization, real world users will
> > push back for breaking their existing code.
> > 
> > I'm fairly sure the GCC people _will_ get shouted at _loudly_ when they
> > break the kernel by doing crazy shit like that.
> > 
> > Given its near impossible to write a correct program in C/C++ and
> > tagging the entire kernel with __atomic is equally not going to happen,
> > I think we must find a practical solution.
> > 
> > Either that, or we really need to consider forking the language and
> > compiler :-(
> 
> Depends on how much benefit the optimizations provide.  If they provide
> little or no benefit, I am with you, otherwise we will need to bit some
> bullet or another.  Keep in mind that there is a lot of code in the
> kernel that runs sequentially (e.g., due to being fully protected by
> locks), and aggressive optimizations for that sort of code are harmless.
> 
> Can't say I know the answer at the moment, though.
> 
> 							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 17:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131101145634.GH19466@laptop.lan>

On Fri, Nov 01, 2013 at 03:56:34PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > > Now the whole crux of the question is if we need barrier A at all, since
> > > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > > read.
> > 
> > The dependency you are talking about is via the "if" statement?
> > Even C/C++11 is not required to respect control dependencies.
> > 
> > This one is a bit annoying.  The x86 TSO means that you really only
> > need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> > barrier, and so on -- but smp_mb() emits a full barrier.
> > 
> > Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> > before reads, writes before writes, and reads before writes, but not
> > writes before reads?  Another approach would be to define a per-arch
> > barrier for this particular case.
> 
> I suppose we can only introduce new barrier primitives if there's more
> than 1 use-case.

There probably are others.

> > > If the read shows no available space, we simply will not issue those
> > > writes -- therefore we could argue we can avoid the memory barrier.
> > 
> > Proving that means iterating through the permitted combinations of
> > compilers and architectures...  There is always hand-coded assembly
> > language, I suppose.
> 
> I'm starting to think that while the C/C++ language spec says they can
> wreck the world by doing these silly optimization, real world users will
> push back for breaking their existing code.
> 
> I'm fairly sure the GCC people _will_ get shouted at _loudly_ when they
> break the kernel by doing crazy shit like that.
> 
> Given its near impossible to write a correct program in C/C++ and
> tagging the entire kernel with __atomic is equally not going to happen,
> I think we must find a practical solution.
> 
> Either that, or we really need to consider forking the language and
> compiler :-(

Depends on how much benefit the optimizations provide.  If they provide
little or no benefit, I am with you, otherwise we will need to bit some
bullet or another.  Keep in mind that there is a lot of code in the
kernel that runs sequentially (e.g., due to being fully protected by
locks), and aggressive optimizations for that sort of code are harmless.

Can't say I know the answer at the moment, though.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 17:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, tony.luck, Mathieu Desnoyers, Oleg Nesterov,
	LKML, Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131101161819.GV16117@laptop.programming.kicks-ass.net>

On Fri, Nov 01, 2013 at 05:18:19PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > The dependency you are talking about is via the "if" statement?
> > Even C/C++11 is not required to respect control dependencies.
> > 
> > This one is a bit annoying.  The x86 TSO means that you really only
> > need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> > barrier, and so on -- but smp_mb() emits a full barrier.
> > 
> > Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> > before reads, writes before writes, and reads before writes, but not
> > writes before reads?  Another approach would be to define a per-arch
> > barrier for this particular case.
> 
> Supposing a sane language where we can rely on control flow; would that
> change the story?
> 
> I'm afraid I'm now terminally confused between actual proper memory
> model issues and fucked compilers.

Power and ARM won't speculate stores, but they will happily speculate
loads.  Not sure about Itanium, perhaps Tony knows.  And yes, reordering
by the compilers and CPUs does sometimes seem a bit intertwined.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 17:26 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: mbatty, dhowells, Michael Neuling, Mathieu Desnoyers,
	Peter Zijlstra, LKML, Oleg Nesterov, Linux PPC dev,
	Anton Blanchard, Frederic Weisbecker, lfomicki
In-Reply-To: <20131102163618.GK4067@linux.vnet.ibm.com>

[ Adding David Howells, Lech Fomicki, and Mark Batty on CC for their
  thoughts given previous discussions. ]

On Sat, Nov 02, 2013 at 09:36:18AM -0700, Paul E. McKenney wrote:
> On Fri, Nov 01, 2013 at 03:12:58PM +0200, Victor Kaplansky wrote:
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> > 08:16:02 AM:
> > 
> > > > BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only
> > > > around
> > > > @head read.
> > 
> > Just to be sure, that we are talking about the same code - I was
> > considering
> > ACCESS_ONCE() around @tail in point AAA in the following example from
> > Documentation/circular-buffers.txt for CONSUMER:
> > 
> >         unsigned long head = ACCESS_ONCE(buffer->head);
> >         unsigned long tail = buffer->tail;      /* AAA */
> > 
> >         if (CIRC_CNT(head, tail, buffer->size) >= 1) {
> >                 /* read index before reading contents at that index */
> >                 smp_read_barrier_depends();
> > 
> >                 /* extract one item from the buffer */
> >                 struct item *item = buffer[tail];
> > 
> >                 consume_item(item);
> > 
> >                 smp_mb(); /* finish reading descriptor before incrementing
> > tail */
> > 
> >                 buffer->tail = (tail + 1) & (buffer->size - 1); /* BBB */
> >         }
> 
> Hmmm...  I believe that we need to go back to the original code in
> Documentation/circular-buffers.txt.  I do so at the bottom of this email.
> 
> > > If you omit the ACCESS_ONCE() calls around @tail, the compiler is within
> > > its rights to combine adjacent operations and also to invent loads and
> > > stores, for example, in cases of register pressure.
> > 
> > Right. And I was completely aware about these possible transformations when
> > said that ACCESS_ONCE() around @tail in point AAA is redundant. Moved, or
> > even
> > completely dismissed reads of @tail in consumer code, are not a problem at
> > all,
> > since @tail is written exclusively by CONSUMER side.
> 
> I believe that the lack of ACCESS_ONCE() around the consumer's store
> to buffer->tail is at least a documentation problem.  In the original
> consumer code, it is trapped between an smp_mb() and a spin_unlock(),
> but it is updating something that is read without synchronization by
> some other thread.
> 
> > > It is also within
> > > its rights to do piece-at-a-time loads and stores, which might sound
> > > unlikely, but which can actually has happened when the compiler figures
> > > out exactly what is to be stored at compile time, especially on hardware
> > > that only allows small immediate values.
> > 
> > As for writes to @tail, the ACCESS_ONCE around @tail at point AAA,
> > doesn't prevent in any way an imaginary super-optimizing compiler
> > from moving around the store to @tail (which appears in the code at point
> > BBB).
> > 
> > It is why ACCESS_ONCE at point AAA is completely redundant.
> 
> Agreed, it is under the lock that guards modifications, so AAA does not
> need ACCESS_ONCE().
> 
> OK, here is the producer from Documentation/circular-buffers.txt, with
> some comments added:
> 
> 	spin_lock(&producer_lock);
> 
> 	unsigned long head = buffer->head;

The above is updated only under producer_lock, which we hold, so no
ACCESS_ONCE() is needed for buffer->head.

> 	unsigned long tail = ACCESS_ONCE(buffer->tail); /* PT */
> 
> 	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
> 		/* insert one item into the buffer */
> 		struct item *item = buffer[head];
> 
> 		produce_item(item); /* PD */
> 
> 		smp_wmb(); /* commit the item before incrementing the head */
> 
> 		buffer->head = (head + 1) & (buffer->size - 1);  /* PH */

The above needs to be something like:

		ACCESS_ONCE(buffer->head) = (head + 1) & (buffer->size - 1);

This is because we are writing to a shared variable that might be being
read concurrently.

> 		/* wake_up() will make sure that the head is committed before
> 		 * waking anyone up */
> 		wake_up(consumer);
> 	}
> 
> 	spin_unlock(&producer_lock);
> 
> And here is the consumer, also from Documentation/circular-buffers.txt:
> 
> 	spin_lock(&consumer_lock);
> 
> 	unsigned long head = ACCESS_ONCE(buffer->head); /* CH */
> 	unsigned long tail = buffer->tail;

The above is updated only under consumer_lock, which we hold, so no
ACCESS_ONCE() is needed for buffer->tail.

> 
> 	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
> 		/* read index before reading contents at that index */
> 		smp_read_barrier_depends();
> 
> 		/* extract one item from the buffer */
> 		struct item *item = buffer[tail]; /* CD */
> 
> 		consume_item(item);
> 
> 		smp_mb(); /* finish reading descriptor before incrementing tail */
> 
> 		buffer->tail = (tail + 1) & (buffer->size - 1); /* CT */

And here, for no-execution-cost documentation, if nothing else:

		ACCESS_ONCE(buffer->tail) = (tail + 1) & (buffer->size - 1);

> 	}
> 
> 	spin_unlock(&consumer_lock);
> 
> Here are the ordering requirements as I see them:
> 
> 1.	The producer is not allowed to clobber a location that the
> 	consumer is in the process of reading from.
> 
> 2.	The consumer is not allowed to read from a location that the
> 	producer has not yet completed writing to.
> 
> #1 is helped out by the fact that there is always an empty element in
> the array, so that the producer will need to produce twice in a row
> to catch up to where the consumer is currently consuming.  #2 has no
> such benefit: The consumer can consume an item that has just now been
> produced.
> 
> #1 requires that CD is ordered before CT in a way that pairs with the
> ordering of PT and PD.  There is of course no effective ordering between
> PT and PD within a given call to the producer, but we only need the
> ordering between the read from PT for one call to the producer and the
> PD of the -next- call to the producer, courtesy of the fact that there
> is always one empty cell in the array.  Therefore, the required ordering
> between PT of one call and PD of the next is provided by the unlock-lock
> pair.  The ordering of CD and CT is of course provided by the smp_mb().
> (And yes, I was missing the unlock-lock pair earlier.  In my defense,
> you did leave this unlock-lock pair out of your example.)
> 
> So ordering requirement #1 is handled by the original, but only if you
> leave the locking in place.  The producer's smp_wmb() does not necessarily
> order prior loads against subsequent stores, and the wake_up() only
> guarantees ordering if something was actually awakened.  As noted earlier,
> the "if" does not necessarily provide ordering.
> 
> On to ordering requirement #2.
> 
> This requires that CH and CD is ordered in a way that pairs with ordering
> between PD and PH.  PD and PH are both writes, so the smp_wmb() does
> the trick there.  The consumer side is a bit strange.  On DEC Alpha,
> smp_read_barrier_dependes() turns into smp_mb(), so that case is covered
> (though by accident).  On other architectures, smp_read_barrier_depends()
> generates no code, and there is no data dependency between the CH and CD.
> The dependency is instead between the read from ->tail and the write,

Sigh.  Make that "The dependency is instead between the read from ->tail
and the read from the array."

> and as you noted, ->tail is written by the consumer, not the producer.

And non-dependent reads -can- be speculated, so the
smp_read_barrier_depends() needs to be at least an smp_rmb().

Again, don't take my word for it, try it with either ppcmem or real
weakly ordered hardware.

I am not 100% confident of the patch below, but am getting there.
If a change is really needed, it must of course be propagated to the
uses within the Linux kernel.

							Thanx, Paul

> But my battery is dying, so more later, including ACCESS_ONCE().

documentation: Fix circular-buffer example.

The code sample in Documentation/circular-buffers.txt appears to have a
few ordering bugs.  This patch therefore applies the needed fixes.

Reported-by: Lech Fomicki <lfomicki@poczta.fm>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
index 8117e5bf6065..a36bed3db4ee 100644
--- a/Documentation/circular-buffers.txt
+++ b/Documentation/circular-buffers.txt
@@ -170,7 +170,7 @@ The producer will look something like this:
 
 		smp_wmb(); /* commit the item before incrementing the head */
 
-		buffer->head = (head + 1) & (buffer->size - 1);
+		ACCESS_ONCE(buffer->head) = (head + 1) & (buffer->size - 1);
 
 		/* wake_up() will make sure that the head is committed before
 		 * waking anyone up */
@@ -183,9 +183,14 @@ This will instruct the CPU that the contents of the new item must be written
 before the head index makes it available to the consumer and then instructs the
 CPU that the revised head index must be written before the consumer is woken.
 
-Note that wake_up() doesn't have to be the exact mechanism used, but whatever
-is used must guarantee a (write) memory barrier between the update of the head
-index and the change of state of the consumer, if a change of state occurs.
+Note that wake_up() does not guarantee any sort of barrier unless something
+is actually awakened.  We therefore cannot rely on it for ordering.  However,
+there is always one element of the array left empty.  Therefore, the
+producer must produce two elements before it could possibly corrupt the
+element currently being read by the consumer.  Therefore, the unlock-lock
+pair between consecutive invocations of the consumer provides the necessary
+ordering between the read of the index indicating that the consumer has
+vacated a given element and the write by the producer to that same element.
 
 
 THE CONSUMER
@@ -200,7 +205,7 @@ The consumer will look something like this:
 
 	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
 		/* read index before reading contents at that index */
-		smp_read_barrier_depends();
+		smp_rmb();
 
 		/* extract one item from the buffer */
 		struct item *item = buffer[tail];
@@ -209,7 +214,7 @@ The consumer will look something like this:
 
 		smp_mb(); /* finish reading descriptor before incrementing tail */
 
-		buffer->tail = (tail + 1) & (buffer->size - 1);
+		ACCESS_ONCE(buffer->tail) = (tail + 1) & (buffer->size - 1);
 	}
 
 	spin_unlock(&consumer_lock);
@@ -223,7 +228,10 @@ Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
 This prevents the compiler from discarding and reloading its cached value -
 which some compilers will do across smp_read_barrier_depends().  This isn't
 strictly needed if you can be sure that the opposition index will _only_ be
-used the once.
+used the once.  Similarly, ACCESS_ONCE() is used in both algorithms to
+write the thread's index.  This documents the fact that we are writing
+to something that can be read concurrently and also prevents the compiler
+from tearing the store.
 
 
 ===============

^ permalink raw reply related

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 15:46 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <OF3CB471BA.BACC7843-ON42257C16.0055EF23-42257C16.0058875B@il.ibm.com>

On Fri, Nov 01, 2013 at 06:06:58PM +0200, Victor Kaplansky wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> 05:25:43 PM:
> 
> > I really don't care about "fair" -- I care instead about the kernel
> > working reliably.
> 
> Though I don't see how putting a memory barrier without deep understanding
> why it is needed helps kernel reliability, I do agree that reliability
> is more important than performance.

True enough.  Of course, the same applies to removing memory barriers.

> > And it should also be easy for proponents of removing memory barriers to
> > clearly articulate what orderings their code does and does not need.
> 
> I intentionally took a simplified example of circle buffer from
> Documentation/circular-buffers.txt. I think both sides agree about
> memory ordering requirements in the example. At least I didn't see anyone
> argued about them.

Hard to say.  No one has actually stated them clearly, so how could we
know whether or not we agree.

> > You are assuming control dependencies that the C language does not
> > provide.  Now, for all I know right now, there might well be some other
> > reason why a full barrier is not required, but the "if" statement cannot
> > be that reason.
> >
> > Please review section 1.10 of the C++11 standard (or the corresponding
> > section of the C11 standard, if you prefer).  The point is that the
> > C/C++11 covers only data dependencies, not control dependencies.
> 
> I feel you made a wrong assumption about my expertise in compilers. I don't
> need to reread section 1.10 of the C++11 standard, because I do agree that
> potentially compiler can break the code in our case. And I do agree that
> a compiler barrier() or some other means (including a change of the
> standard)
> can be required in future to prevent a compiler from moving memory accesses
> around.

I was simply reacting to what seemed to me to be your statement that
control dependencies affect ordering.  They don't.  The C/C++ standard
does not in any way respect control dependencies.  In fact, there are
implementations that do not respect control dependencies.  But don't
take my word for it, actually try it out on a weakly ordered system.
Or try out either ppcmem or armmem, which does a full state-space search.

Here is the paper:

	http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/pldi105-sarkar.pdf

And here is the web-based tool:

	http://www.cl.cam.ac.uk/~pes20/ppcmem/

And here is a much faster version that you can run locally:

	http://www.cl.cam.ac.uk/~pes20/weakmemory/index.html

> But "broken" compiler is much wider issue to be deeply discussed in this
> thread. I'm pretty sure that kernel have tons of very subtle
> code that actually creates locks and memory ordering. Such code
> usually just use the "barrier()"  approach to tell gcc not to combine
> or move memory accesses around it.
> 
> Let's just agree for the sake of this memory barrier discussion that we
> *do* need compiler barrier to tell gcc not to combine or move memory
> accesses around it.

Sometimes barrier() is indeed all you need, other times more is needed.

> > Glad we agree on something!
> 
> I'm glad too!
> 
> > Did you miss the following passage in the paragraph you quoted?
> >
> >    "... likewise, your consumer must issue a memory barrier
> >    instruction after removing an item from the queue and before
> >    reading from its memory."
> >
> > That is why DEC Alpha readers need a read-side memory barrier -- it says
> > so right there.  And as either you or Peter noted earlier in this thread,
> > this barrier can be supplied by smp_read_barrier_depends().
> 
> I did not miss that passage. That passage explains why consumer on Alpha
> processor after reading @head is required to execute an additional
> smp_read_barrier_depends() before it can *read* from memory pointed by
> @tail. And I think that I understand why - because the reader have to wait
> till local caches are fully updated and only then it can read data from
> the data buffer.
> 
> But on the producer side, after we read @tail, we don't need to wait for
> update of local caches before we start *write* data to the buffer, since
> the
> producer is the only one who write data there!

Well, we cannot allow the producer to clobber data while the consumer
is reading it out.  That said, I do agree that we should get some help
from the fact that one element of the array is left empty, so that the
producer goes through a full write before clobbering the cell that the
consumer just vacated.

> > I can sympathize if you are having trouble believing this.  After all,
> > it took the DEC Alpha architects a full hour to convince me, and that was
> > in a face-to-face meeting instead of over email.  (Just for the record,
> > it took me even longer to convince them that their earlier documentation
> > did not clearly indicate the need for these read-side barriers.)  But
> > regardless of whether or not I sympathize, DEC Alpha is what it is.
> 
> Again, I do understand quirkiness of the DEC Alpha, and I still think that
> there is no need in *full* memory barrier on producer side - the one
> before writing data to the buffer and which you've put in kfifo
> implementation.

There really does need to be some sort of memory barrier there to
order the read of the index before the write into the array element.
Now, it might well be that this barrier is supplied by the unlock-lock
pair guarding the producer, but either way, there does need to be some
ordering.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 15:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, LKML, Oleg Nesterov,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131101103017.GF19466@laptop.lan>

On Fri, Nov 01, 2013 at 11:30:17AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 01, 2013 at 02:28:14AM -0700, Paul E. McKenney wrote:
> > > This is a completely untenable position.
> > 
> > Indeed it is!
> > 
> > C/C++ never was intended to be used for parallel programming, 
> 
> And yet pretty much all kernels ever written for SMP systems are written
> in it; what drugs are those people smoking?

There was a time when I wished that the C/C++ standards people had added
concurrency to the language 30 years ago, but I eventually realized that
any attempt at that time would have been totally broken.

> Furthermore there's a gazillion parallel userspace programs.

Most of which have very unaggressive concurrency designs.

> > and this is
> > but one of the problems that can arise when we nevertheless use it for
> > parallel programming.  As compilers get smarter (for some definition of
> > "smarter") and as more systems have special-purpose hardware (such as
> > vector units) that are visible to the compiler, we can expect more of
> > this kind of trouble.
> > 
> > This was one of many reasons that I decided to help with the C/C++11
> > effort, whatever anyone might think about the results.
> 
> Well, I applaud your efforts, but given the results I think the C/C++
> people are all completely insane.

If it makes you feel any better, they have the same opinion of all of
us who use C/C++ for concurrency given that the standard provides no
guarantee.

> > > How do the C/C++ people propose to deal with this?
> > 
> > By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> > And thus requiring that all accesses to it be decorated, which in the
> > case of RCU could be buried in the RCU accessors.
> 
> This seems contradictory; marking it atomic would look like:
> 
> struct foo {
> 	unsigned long value;
> 	__atomic void *ptr;
> 	unsigned long value1;
> };
> 
> Clearly we cannot hide this definition in accessors, because then
> accesses to value* won't see the annotation.

#define __rcu __atomic

Though there are probably placement restrictions for __atomic that
current use of __rcu doesn't pay attention to.

> That said; mandating we mark all 'shared' data with __atomic is
> completely untenable and is not backwards compatible.
> 
> To be safe we must assume all data shared unless indicated otherwise.

Something similar to the compiler directives forcing twos-complement
interpretation of signed overflow could be attractive.  Not sure what
it would do to code generation, though.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 17:28 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <OF2F28411B.D04C5B5A-ON42257C16.004E02C5-42257C16.004F420C@il.ibm.com>

On Fri, Nov 01, 2013 at 04:25:42PM +0200, Victor Kaplansky wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> 08:40:15 AM:
> 
> > > void ubuf_read(void)
> > > {
> > >    u64 head, tail;
> > >
> > >    tail = ACCESS_ONCE(ubuf->tail);
> > >    head = ACCESS_ONCE(ubuf->head);
> > >
> > >    /*
> > >     * Ensure we read the buffer boundaries before the actual buffer
> > >     * data...
> > >     */
> > >    smp_rmb(); /* C, matches with B */
> > >
> > >    while (tail != head) {
> > >       obj = ubuf->data + tail;
> > >       /* process obj */
> > >       tail += obj->size;
> > >       tail %= ubuf->size;
> > >    }
> > >
> > >    /*
> > >     * Ensure all data reads are complete before we issue the
> > >     * ubuf->tail update; once that update hits, kbuf_write() can
> > >     * observe and overwrite data.
> > >     */
> > >    smp_mb(); /* D, matches with A */
> > >
> > >    ubuf->tail = tail;
> > > }
> 
> > > Could we replace A and C with an smp_read_barrier_depends()?
> >
> > C, yes, given that you have ACCESS_ONCE() on the fetch from ->tail
> > and that the value fetch from ->tail feeds into the address used for
> > the "obj =" assignment.
> 
> No! You must to have a full smp_rmb() at C. The race on the reader side
> is not between fetch of @tail and read from address pointed by @tail.
> The real race here is between a fetch of @head and read of obj from
> memory pointed by @tail.

I believe you are in fact correct, good catch.

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 16:36 UTC (permalink / raw)
  To: Victor Kaplansky
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, LKML,
	Oleg Nesterov, Linux PPC dev, Anton Blanchard,
	Frederic Weisbecker
In-Reply-To: <OF6655E55D.554B7448-ON42257C16.00442B8B-42257C16.0048995E@il.ibm.com>

On Fri, Nov 01, 2013 at 03:12:58PM +0200, Victor Kaplansky wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> 08:16:02 AM:
> 
> > > BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only
> > > around
> > > @head read.
> 
> Just to be sure, that we are talking about the same code - I was
> considering
> ACCESS_ONCE() around @tail in point AAA in the following example from
> Documentation/circular-buffers.txt for CONSUMER:
> 
>         unsigned long head = ACCESS_ONCE(buffer->head);
>         unsigned long tail = buffer->tail;      /* AAA */
> 
>         if (CIRC_CNT(head, tail, buffer->size) >= 1) {
>                 /* read index before reading contents at that index */
>                 smp_read_barrier_depends();
> 
>                 /* extract one item from the buffer */
>                 struct item *item = buffer[tail];
> 
>                 consume_item(item);
> 
>                 smp_mb(); /* finish reading descriptor before incrementing
> tail */
> 
>                 buffer->tail = (tail + 1) & (buffer->size - 1); /* BBB */
>         }

Hmmm...  I believe that we need to go back to the original code in
Documentation/circular-buffers.txt.  I do so at the bottom of this email.

> > If you omit the ACCESS_ONCE() calls around @tail, the compiler is within
> > its rights to combine adjacent operations and also to invent loads and
> > stores, for example, in cases of register pressure.
> 
> Right. And I was completely aware about these possible transformations when
> said that ACCESS_ONCE() around @tail in point AAA is redundant. Moved, or
> even
> completely dismissed reads of @tail in consumer code, are not a problem at
> all,
> since @tail is written exclusively by CONSUMER side.

I believe that the lack of ACCESS_ONCE() around the consumer's store
to buffer->tail is at least a documentation problem.  In the original
consumer code, it is trapped between an smp_mb() and a spin_unlock(),
but it is updating something that is read without synchronization by
some other thread.

> > It is also within
> > its rights to do piece-at-a-time loads and stores, which might sound
> > unlikely, but which can actually has happened when the compiler figures
> > out exactly what is to be stored at compile time, especially on hardware
> > that only allows small immediate values.
> 
> As for writes to @tail, the ACCESS_ONCE around @tail at point AAA,
> doesn't prevent in any way an imaginary super-optimizing compiler
> from moving around the store to @tail (which appears in the code at point
> BBB).
> 
> It is why ACCESS_ONCE at point AAA is completely redundant.

Agreed, it is under the lock that guards modifications, so AAA does not
need ACCESS_ONCE().

OK, here is the producer from Documentation/circular-buffers.txt, with
some comments added:

	spin_lock(&producer_lock);

	unsigned long head = buffer->head;
	unsigned long tail = ACCESS_ONCE(buffer->tail); /* PT */

	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
		/* insert one item into the buffer */
		struct item *item = buffer[head];

		produce_item(item); /* PD */

		smp_wmb(); /* commit the item before incrementing the head */

		buffer->head = (head + 1) & (buffer->size - 1);  /* PH */

		/* wake_up() will make sure that the head is committed before
		 * waking anyone up */
		wake_up(consumer);
	}

	spin_unlock(&producer_lock);

And here is the consumer, also from Documentation/circular-buffers.txt:

	spin_lock(&consumer_lock);

	unsigned long head = ACCESS_ONCE(buffer->head); /* CH */
	unsigned long tail = buffer->tail;

	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
		/* read index before reading contents at that index */
		smp_read_barrier_depends();

		/* extract one item from the buffer */
		struct item *item = buffer[tail]; /* CD */

		consume_item(item);

		smp_mb(); /* finish reading descriptor before incrementing tail */

		buffer->tail = (tail + 1) & (buffer->size - 1); /* CT */
	}

	spin_unlock(&consumer_lock);

Here are the ordering requirements as I see them:

1.	The producer is not allowed to clobber a location that the
	consumer is in the process of reading from.

2.	The consumer is not allowed to read from a location that the
	producer has not yet completed writing to.

#1 is helped out by the fact that there is always an empty element in
the array, so that the producer will need to produce twice in a row
to catch up to where the consumer is currently consuming.  #2 has no
such benefit: The consumer can consume an item that has just now been
produced.

#1 requires that CD is ordered before CT in a way that pairs with the
ordering of PT and PD.  There is of course no effective ordering between
PT and PD within a given call to the producer, but we only need the
ordering between the read from PT for one call to the producer and the
PD of the -next- call to the producer, courtesy of the fact that there
is always one empty cell in the array.  Therefore, the required ordering
between PT of one call and PD of the next is provided by the unlock-lock
pair.  The ordering of CD and CT is of course provided by the smp_mb().
(And yes, I was missing the unlock-lock pair earlier.  In my defense,
you did leave this unlock-lock pair out of your example.)

So ordering requirement #1 is handled by the original, but only if you
leave the locking in place.  The producer's smp_wmb() does not necessarily
order prior loads against subsequent stores, and the wake_up() only
guarantees ordering if something was actually awakened.  As noted earlier,
the "if" does not necessarily provide ordering.

On to ordering requirement #2.

This requires that CH and CD is ordered in a way that pairs with ordering
between PD and PH.  PD and PH are both writes, so the smp_wmb() does
the trick there.  The consumer side is a bit strange.  On DEC Alpha,
smp_read_barrier_dependes() turns into smp_mb(), so that case is covered
(though by accident).  On other architectures, smp_read_barrier_depends()
generates no code, and there is no data dependency between the CH and CD.
The dependency is instead between the read from ->tail and the write,
and as you noted, ->tail is written by the consumer, not the producer.

But my battery is dying, so more later, including ACCESS_ONCE().

							Thanx, Paul

^ permalink raw reply

* Re: perf events ring buffer memory barrier on powerpc
From: Paul E. McKenney @ 2013-11-02 17:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky
In-Reply-To: <20131101161129.GU16117@laptop.programming.kicks-ass.net>

On Fri, Nov 01, 2013 at 05:11:29PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > > void kbuf_write(int sz, void *buf)
> > > {
> > > 	u64 tail = ACCESS_ONCE(ubuf->tail); /* last location userspace read */
> > > 	u64 offset = kbuf->head; /* we already know where we last wrote */
> > > 	u64 head = offset + sz;
> > > 
> > > 	if (!space(tail, offset, head)) {
> > > 		/* discard @buf */
> > > 		return;
> > > 	}
> > > 
> > > 	/*
> > > 	 * Ensure that if we see the userspace tail (ubuf->tail) such
> > > 	 * that there is space to write @buf without overwriting data
> > > 	 * userspace hasn't seen yet, we won't in fact store data before
> > > 	 * that read completes.
> > > 	 */
> > > 
> > > 	smp_mb(); /* A, matches with D */
> > > 
> > > 	write(kbuf->data + offset, buf, sz);
> > > 	kbuf->head = head % kbuf->size;
> > > 
> > > 	/*
> > > 	 * Ensure that we write all the @buf data before we update the
> > > 	 * userspace visible ubuf->head pointer.
> > > 	 */
> > > 	smp_wmb(); /* B, matches with C */
> > > 
> > > 	ubuf->head = kbuf->head;
> > > }
> 
> > > Now the whole crux of the question is if we need barrier A at all, since
> > > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > > read.
> > 
> > The dependency you are talking about is via the "if" statement?
> > Even C/C++11 is not required to respect control dependencies.
> 
> But surely we must be able to make it so; otherwise you'd never be able
> to write:
> 
> void *ptr = obj1;
> 
> void foo(void)
> {
> 
> 	/* create obj2, obj3 */
> 
> 	smp_wmb(); /* ensure the objs are complete */
> 
> 	/* expose either obj2 or obj3 */
> 	if (x)
> 		ptr = obj2;
> 	else
> 		ptr = obj3;

OK, the smp_wmb() orders the creation and the exposing.  But the
compiler can do this:

	ptr = obj3;
	if (x)
		ptr = obj2;

And that could momentarily expose obj3 to readers, and these readers
might be fatally disappointed by the free() below.  If you instead said:

	if (x)
		ACCESS_ONCE(ptr) = obj2;
	else
		ACCESS_ONCE(ptr) = obj3;

then the general consensus appears to be that the compiler would not
be permitted to carry out the above optimization.  Since you have
the smp_wmb(), readers that are properly ordered (e.g., smp_rmb() or
rcu_dereference()) would be prevented from seeing pre-initialization
state.

> 	/* free the unused one */
> 	if (x)
> 		free(obj3);
> 	else
> 		free(obj2);
> }
> 
> Earlier you said that 'volatile' or '__atomic' avoids speculative
> writes; so would:
> 
> volatile void *ptr = obj1;
> 
> Make the compiler respect control dependencies again? If so, could we
> somehow mark that !space() condition volatile?

The compiler should, but the CPU is still free to ignore the control
dependencies in the general case.

We might be able to rely on weakly ordered hardware refraining
from speculating stores, but not sure that this applies across all
architectures of interest.  We definitely can -not- rely on weakly
ordered hardware refraining from speculating loads.

> Currently the above would be considered a valid pattern. But you're
> saying its not because the compiler is free to expose both obj2 and obj3
> (for however short a time) and thus the free of the 'unused' object is
> incorrect and can cause use-after-free.

Yes, it is definitely unsafe and invalid in absence of ACCESS_ONCE().

> In fact; how can we be sure that:
> 
> void *ptr = NULL;
> 
> void bar(void)
> {
> 	void *obj = malloc(...);
> 
> 	/* fill obj */
> 
> 	if (!err)
> 		rcu_assign_pointer(ptr, obj);
> 	else
> 		free(obj);
> }
> 
> Does not get 'optimized' into:
> 
> void bar(void)
> {
> 	void *obj = malloc(...);
> 	void *old_ptr = ptr;
> 
> 	/* fill obj */
> 
> 	rcu_assign_pointer(ptr, obj);
> 	if (err) { /* because runtime profile data says this is unlikely */
> 		ptr = old_ptr;
> 		free(obj);
> 	}
> }

In this particular case, the barrier() implied by the smp_wmb() in
rcu_assign_pointer() will prevent this "optimization".  However, other
"optimizations" are the reason why I am working to introduce ACCESS_ONCE()
into rcu_assign_pointer.

> We _MUST_ be able to rely on control flow, otherwise me might as well
> all go back to writing kernels in asm.

It isn't -that- bad!  ;-)

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH 0/2] move of_find_next_cache_node to DT core
From: Grant Likely @ 2013-11-02 18:06 UTC (permalink / raw)
  To: Sudeep KarkadaNagesha, Sudeep KarkadaNagesha
  Cc: devicetree@vger.kernel.org, Rob Herring,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <524E9BA9.4000506@arm.com>

On Fri, 04 Oct 2013 11:42:49 +0100, Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> wrote:
> Hi Grant,
> 
> On 18/09/13 17:18, Sudeep KarkadaNagesha wrote:
> > On 18/09/13 15:51, Grant Likely wrote:
> >> On Wed, 18 Sep 2013 11:53:03 +0100, Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> wrote:
> >>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
> >>>
> >>> Hi,
> >>>
> >>> The cache bindings are generic and used by many other architectures
> >>> apart from PPC. These patches fixes and move the existing definition
> >>> of of_find_next_cache_node to DT common code.
> >>>
> >>> Regards,
> >>> Sudeep
> >>
> >> Acked-by: Grant Likely <grant.likely@secretlab.ca>
> >>
> >> However, do you have a user for this function on other architectures
> >> yet? I'd like to see a user for the function in the same patch series..
> >>
> > 
> > Yes I have posted an RFC[1] following this series implementing cacheinfo
> > for ARM similar to x86 implementation. I was not sure if it's good idea
> > to combine it as its still initial RFC version.
> > 
> 
> Do you prefer to have this a independent change or to go with the cache topology
> support patches[1] on ARM ?

Ben's already picked it up, so I'm fine with it.

g.

^ permalink raw reply

* Re: linux-next: manual merge of the dt-rh tree with the powerpc tree
From: Benjamin Herrenschmidt @ 2013-11-01 23:13 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sudeep KarkadaNagesha, Stephen Rothwell, linux-next, linuxppc-dev,
	linux-kernel
In-Reply-To: <52742A2A.9080008@gmail.com>

On Fri, 2013-11-01 at 17:24 -0500, Rob Herring wrote:
> On 11/01/2013 12:20 AM, Stephen Rothwell wrote:
> > Hi Rob,
> > 
> > Today's linux-next merge of the dt-rh tree got a conflict in 
> > arch/powerpc/include/asm/prom.h between commit a3e31b458844 ("of:
> > Move definition of of_find_next_cache_node into common code") from
> > the powerpc tree and commit 0c3f061c195c ("of: implement
> > of_node_to_nid as a weak function") from the dt-rh tree.
> 
> Ben, I can pick these 2 patches up instead if you want to drop them
> and avoid the conflict.

I'd rather not rebase my tree, the conflict seems to be rather trivial
to solve.

Cheers,
Ben.

^ permalink raw reply

* Re: linux-next: manual merge of the dt-rh tree with the powerpc tree
From: Stephen Rothwell @ 2013-11-01 22:27 UTC (permalink / raw)
  To: Rob Herring; +Cc: Sudeep KarkadaNagesha, linux-next, linuxppc-dev, linux-kernel
In-Reply-To: <52742A2A.9080008@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 732 bytes --]

Hi Rob,

On Fri, 01 Nov 2013 17:24:42 -0500 Rob Herring <robherring2@gmail.com> wrote:
>
> On 11/01/2013 12:20 AM, Stephen Rothwell wrote:
> > 
> > Today's linux-next merge of the dt-rh tree got a conflict in 
> > arch/powerpc/include/asm/prom.h between commit a3e31b458844 ("of:
> > Move definition of of_find_next_cache_node into common code") from
> > the powerpc tree and commit 0c3f061c195c ("of: implement
> > of_node_to_nid as a weak function") from the dt-rh tree.
> 
> Ben, I can pick these 2 patches up instead if you want to drop them
> and avoid the conflict.

The conflict is pretty trivial and so should not require any action.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: linux-next: manual merge of the dt-rh tree with the powerpc tree
From: Rob Herring @ 2013-11-01 22:24 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Sudeep KarkadaNagesha, linux-next, linuxppc-dev, linux-kernel
In-Reply-To: <20131101162002.4dd386299ec42e8d8d032be2@canb.auug.org.au>

On 11/01/2013 12:20 AM, Stephen Rothwell wrote:
> Hi Rob,
> 
> Today's linux-next merge of the dt-rh tree got a conflict in 
> arch/powerpc/include/asm/prom.h between commit a3e31b458844 ("of:
> Move definition of of_find_next_cache_node into common code") from
> the powerpc tree and commit 0c3f061c195c ("of: implement
> of_node_to_nid as a weak function") from the dt-rh tree.

Ben, I can pick these 2 patches up instead if you want to drop them
and avoid the conflict.

Rob

^ permalink raw reply

* Re: [PATCHv2 5/8] ASoC: SGTL5000: Enhance the SGTL5000 codec driver about regulator.
From: Mark Brown @ 2013-11-01 18:50 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, b42378, linux-arm-kernel, grant.likely,
	devicetree, ian.campbell, pawel.moll, swarren, rob.herring, oskar,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-6-git-send-email-Li.Xiubo@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 760 bytes --]

On Fri, Nov 01, 2013 at 03:04:52PM +0800, Xiubo Li wrote:
> On VF610 series there are no regulators used, and now whether the
> CONFIG_REGULATOR mirco is enabled or not, for the VF610 audio
> patch series, the board cannot be probe successfully.
> And this patch will solve this issue.

I don't understand what this is for at all, you're just saying there is
a problem you're trying to solve but you don't explain anything about
what the problem is or how your changes address it.

> +#ifndef CONFIG_SND_SOC_FSL_SGTL5000_VF610
>  static int ldo_regulator_register(struct snd_soc_codec *codec,

This is definitely broken, it won't work with multi-platform kernels,
and I don't understand what this is supposed to do - what is the reason
for making this change?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Mark Brown @ 2013-11-01 18:40 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, b42378, linux-arm-kernel, grant.likely,
	devicetree, ian.campbell, pawel.moll, swarren, rob.herring, oskar,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-7-git-send-email-Li.Xiubo@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 785 bytes --]

On Fri, Nov 01, 2013 at 03:04:53PM +0800, Xiubo Li wrote:

> Conflicts:
> 	sound/soc/fsl/Makefile

Ahem.

> +	/* TODO: The SAI driver should figure this out for us */
> +	switch (channels) {
> +	case 2:
> +		snd_soc_dai_set_tdm_slot(cpu_dai, 0xfffffffc, 0xfffffffc, 2, 0);
> +		break;
> +	case 1:
> +		snd_soc_dai_set_tdm_slot(cpu_dai, 0xfffffffe, 0xfffffffe, 1, 0);
> +		break;
> +	default:
> +		return -EINVAL;
> +	}

Yes, it should - this code should probably just be copied straight into
the SAI driver.  If we need to support other configurations we can do
that later.

> +static int fsl_sgtl5000_remove(struct platform_device *pdev)
> +{
> +	snd_soc_unregister_card(&fsl_sgt1500_card);
> +
> +	return 0;
> +}

You're using snd_soc_unregister_card() so you don't need to do this.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCHv2 1/8] ALSA: Add SAI SoC Digital Audio Interface driver.
From: Mark Brown @ 2013-11-01 18:25 UTC (permalink / raw)
  To: Xiubo Li
  Cc: mark.rutland, alsa-devel, linux-doc, tiwai, b18965, timur, perex,
	r65073, LW, linux, b42378, linux-arm-kernel, grant.likely,
	devicetree, ian.campbell, pawel.moll, swarren, rob.herring, oskar,
	fabio.estevam, lgirdwood, linux-kernel, rob, r64188, shawn.guo,
	linuxppc-dev
In-Reply-To: <1383289495-24523-2-git-send-email-Li.Xiubo@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 1005 bytes --]

On Fri, Nov 01, 2013 at 03:04:48PM +0800, Xiubo Li wrote:

> +static int fsl_sai_set_dai_clkdiv(struct snd_soc_dai *cpu_dai,
> +		int div_id, int div)
> +{
> +	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> +	u32 tcr2, rcr2;
> +
> +	if (div_id == FSL_SAI_TX_DIV) {
> +		tcr2 = readl(sai->base + FSL_SAI_TCR2);
> +		tcr2 &= ~FSL_SAI_CR2_DIV_MASK;
> +		tcr2 |= FSL_SAI_CR2_DIV(div);
> +		writel(tcr2, sai->base + FSL_SAI_TCR2);

What is this divider and why does the user have to set it manually?

> +	} else
> +		return -EINVAL;
> +

Coding style?

> +static int fsl_sai_dai_probe(struct snd_soc_dai *dai)
> +{
> +	int ret;
> +	struct fsl_sai *sai = dev_get_drvdata(dai->dev);
> +
> +	ret = clk_prepare_enable(sai->clk);
> +	if (ret)
> +		return ret;

It'd be nicer to only enable the clock while the device is in active
use.

> +	ret = snd_dmaengine_pcm_register(&pdev->dev, NULL,
> +			SND_DMAENGINE_PCM_FLAG_NO_RESIDUE);
> +	if (ret)
> +		return ret;

We should have a devm_ version of this.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* RE: perf events ring buffer memory barrier on powerpc
From: Victor Kaplansky @ 2013-11-01 16:30 UTC (permalink / raw)
  To: David Laight
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra,
	Frederic Weisbecker, LKML, Oleg Nesterov, Linux PPC dev,
	Anton Blanchard, paulmck
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B73D4@saturn3.aculab.com>

"David Laight" <David.Laight@aculab.com> wrote on 11/01/2013 06:25:29 PM:
> gcc will do unexpected memory accesses for bit fields that are
> adjacent to volatile data.
> In particular it may generate 64bit sized (and aligned) RMW cycles
> when accessing bit fields.
> And yes, this has caused real problems.

Thanks, I am aware about this bug/feature in gcc.
-- Victor

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox