[PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes

linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes
       [not found] ` <1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com>
@ 2013-12-11 21:59   ` Paul E. McKenney
  2013-12-12 13:17     ` Ingo Molnar
  2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Peter Zijlstra
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
  2013-12-16 10:39   ` [tip:core/locking] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt tip-bot for Paul E. McKenney
  2 siblings, 2 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-12-11 21:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, oleg, sbw,
	Paul E. McKenney, Linux-Arch

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

No SMP architecture currently supporting Linux allows speculative writes,
so this commit updates Documentation/memory-barriers.txt to prohibit
them in Linux core code.  It also records restrictions on their use.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linux-Arch <linux-arch@vger.kernel.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
---
 Documentation/memory-barriers.txt | 183 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 175 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 2d22da095a60..deafa36aeea1 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -571,11 +571,10 @@ dependency barrier to make it work correctly.  Consider the following bit of
 code:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
-		<data dependency barrier>
-		q = ACCESS_ONCE(b);
+	if (q) {
+		<data dependency barrier>  /* BUG: No data dependency!!! */
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
 
 This will not have the desired effect because there is no actual data
 dependency, but rather a control dependency that the CPU may short-circuit
@@ -584,11 +583,176 @@ the load from b as having happened before the load from a.  In such a
 case what's actually required is:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
+	if (q) {
 		<read barrier>
-		q = ACCESS_ONCE(b);
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
+
+However, stores are not speculated.  This means that ordering -is- provided
+in the following example:
+
+	q = ACCESS_ONCE(a);
+	if (ACCESS_ONCE(q)) {
+		ACCESS_ONCE(b) = p;
+	}
+
+Please note that ACCESS_ONCE() is not optional!  Without the ACCESS_ONCE(),
+the compiler is within its rights to transform this example:
+
+	q = a;
+	if (q) {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something();
+	} else {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something_else();
+	}
+
+into this, which of course defeats the ordering:
+
+	b = p;
+	q = a;
+	if (q)
+		do_something();
+	else
+		do_something_else();
+
+Worse yet, if the compiler is able to prove (say) that the value of
+variable 'a' is always non-zero, it would be well within its rights
+to optimize the original example by eliminating the "if" statement
+as follows:
+
+	q = a;
+	b = p;  /* BUG: Compiler can reorder!!! */
+	do_something();
+
+The solution is again ACCESS_ONCE(), which preserves the ordering between
+the load from variable 'a' and the store to variable 'b':
+
+	q = ACCESS_ONCE(a);
+	if (q) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+You could also use barrier() to prevent the compiler from moving
+the stores to variable 'b', but barrier() would not prevent the
+compiler from proving to itself that a==1 always, so ACCESS_ONCE()
+is also needed.
+
+It is important to note that control dependencies absolutely require a
+a conditional.  For example, the following "optimized" version of
+the above example breaks ordering:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;  /* BUG: No ordering vs. load from a!!! */
+	if (q) {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something();
+	} else {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something_else();
+	}
+
+It is of course legal for the prior load to be part of the conditional,
+for example, as follows:
+
+	if (ACCESS_ONCE(a) > 0) {
+		ACCESS_ONCE(b) = q / 2;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = q / 3;
+		do_something_else();
+	}
+
+This will again ensure that the load from variable 'a' is ordered before the
+stores to variable 'b'.
+
+In addition, you need to be careful what you do with the local variable 'q',
+otherwise the compiler might be able to guess the value and again remove
+the needed conditional.  For example:
+
+	q = ACCESS_ONCE(a);
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+If MAX is defined to be 1, then the compiler knows that (q % MAX) is
+equal to zero, in which case the compiler is within its rights to
+transform the above code into the following:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;
+	do_something_else();
+
+This transformation loses the ordering between the load from variable 'a'
+and the store to variable 'b'.  If you are relying on this ordering, you
+should do something like the following:
+
+	q = ACCESS_ONCE(a);
+	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+Finally, control dependencies do -not- provide transitivity.  This is
+demonstrated by two related examples:
+
+	CPU 0                     CPU 1
+	=====================     =====================
+	r1 = ACCESS_ONCE(x);      r2 = ACCESS_ONCE(y);
+	if (r1 >= 0)              if (r2 >= 0)
+	  ACCESS_ONCE(y) = 1;       ACCESS_ONCE(x) = 1;
+
+	assert(!(r1 == 1 && r2 == 1));
+
+The above two-CPU example will never trigger the assert().  However,
+if control dependencies guaranteed transitivity (which they do not),
+then adding the following two CPUs would guarantee a related assertion:
+
+	CPU 2                     CPU 3
+	=====================     =====================
+	ACCESS_ONCE(x) = 2;       ACCESS_ONCE(y) = 2;
+
+	assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */
+
+But because control dependencies do -not- provide transitivity, the
+above assertion can fail after the combined four-CPU example completes.
+If you need the four-CPU example to provide ordering, you will need
+smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
+
+In summary:
+
+  (*) Control dependencies can order prior loads against later stores.
+      However, they do -not- guarantee any other sort of ordering:
+      Not prior loads against later loads, nor prior stores against
+      later anything.  If you need these other forms of ordering,
+      use smb_rmb(), smp_wmb(), or, in the case of prior stores and
+      later loads, smp_mb().
+
+  (*) Control dependencies require at least one run-time conditional
+      between the prior load and the subsequent store.  If the compiler
+      is able to optimize the conditional away, it will have also
+      optimized away the ordering.  Careful use of ACCESS_ONCE() can
+      help to preserve the needed conditional.
+
+  (*) Control dependencies require that the compiler avoid reordering the
+      dependency into nonexistence.  Careful use of ACCESS_ONCE() or
+      barrier() can help to preserve your control dependency.
+
+  (*) Control dependencies do -not- provide transitivity.  If you
+      need transitivity, use smp_mb().
 
 
 SMP BARRIER PAIRING
@@ -1083,7 +1247,10 @@ compiler from moving the memory accesses either side of it to the other side:
 
 	barrier();
 
-This is a general barrier - lesser varieties of compiler barrier do not exist.
+This is a general barrier -- there are no read-read or write-write variants
+of barrier().  Howevever, ACCESS_ONCE() can be thought of as a weak form
+for barrier() that affects only the specific accesses flagged by the
+ACCESS_ONCE().
 
 The compiler barrier has no direct effect on the CPU, which may then reorder
 things however it wishes.
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier
       [not found] ` <1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com>
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
@ 2013-12-11 21:59   ` Paul E. McKenney
  2013-12-11 21:59     ` Paul E. McKenney
  2013-12-16 10:40     ` [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier tip-bot for Paul E. McKenney
  2013-12-16 10:39   ` [tip:core/locking] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt tip-bot for Paul E. McKenney
  2 siblings, 2 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-12-11 21:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, oleg, sbw,
	Paul E. McKenney, Linux-Arch, Ingo Molnar, Linus Torvalds

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The Linux kernel has traditionally required that an UNLOCK+LOCK pair
act as a full memory barrier when either (1) that UNLOCK+LOCK pair
was executed by the same CPU or task, or (2) the same lock variable
was used for the UNLOCK and LOCK.  It now seems likely that very few
places in the kernel rely on this full-memory-barrier semantic, and
with the advent of queued locks, providing this semantic either requires
complex reasoning, or for some architectures, added overhead.

This commit therefore adds a smp_mb__after_unlock_lock(), which may be
placed after a LOCK primitive to restore the full-memory-barrier semantic.
All definitions are currently no-ops, but will be upgraded for some
architectures when queued locks arrive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linux-Arch <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/spinlock.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 75f34949d9ab..3f2867ff0ced 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,6 +130,16 @@ do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
+/*
+ * Place this after a lock-acquisition primitive to guarantee that
+ * an UNLOCK+LOCK pair act as a full barrier.  This guarantee applies
+ * if the UNLOCK and LOCK are executed by the same CPU or if the
+ * UNLOCK and LOCK operate on the same lock variable.
+ */
+#ifndef smp_mb__after_unlock_lock
+#define smp_mb__after_unlock_lock()	do { } while (0)
+#endif
+
 /**
  * raw_spin_unlock_wait - wait until the spinlock gets unlocked
  * @lock: the spinlock in question.
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
@ 2013-12-11 21:59     ` Paul E. McKenney
  2013-12-16 10:40     ` [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier tip-bot for Paul E. McKenney
  1 sibling, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-12-11 21:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, oleg, sbw,
	Paul E. McKenney, Linux-Arch, Ingo Molnar, Linus Torvalds

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The Linux kernel has traditionally required that an UNLOCK+LOCK pair
act as a full memory barrier when either (1) that UNLOCK+LOCK pair
was executed by the same CPU or task, or (2) the same lock variable
was used for the UNLOCK and LOCK.  It now seems likely that very few
places in the kernel rely on this full-memory-barrier semantic, and
with the advent of queued locks, providing this semantic either requires
complex reasoning, or for some architectures, added overhead.

This commit therefore adds a smp_mb__after_unlock_lock(), which may be
placed after a LOCK primitive to restore the full-memory-barrier semantic.
All definitions are currently no-ops, but will be upgraded for some
architectures when queued locks arrive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linux-Arch <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/spinlock.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 75f34949d9ab..3f2867ff0ced 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,6 +130,16 @@ do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
+/*
+ * Place this after a lock-acquisition primitive to guarantee that
+ * an UNLOCK+LOCK pair act as a full barrier.  This guarantee applies
+ * if the UNLOCK and LOCK are executed by the same CPU or if the
+ * UNLOCK and LOCK operate on the same lock variable.
+ */
+#ifndef smp_mb__after_unlock_lock
+#define smp_mb__after_unlock_lock()	do { } while (0)
+#endif
+
 /**
  * raw_spin_unlock_wait - wait until the spinlock gets unlocked
  * @lock: the spinlock in question.
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
@ 2013-12-12 13:17     ` Ingo Molnar
  2013-12-12 15:08       ` Paul E. McKenney
  2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2013-12-12 13:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv,
	tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec, oleg,
	sbw, Linux-Arch


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> No SMP architecture currently supporting Linux allows speculative writes,
> so this commit updates Documentation/memory-barriers.txt to prohibit
> them in Linux core code.  It also records restrictions on their use.
> 
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

The signoff sequence looks weird here. If this came from Peter 
originally, did you mean:

  From: Peter Zijlstra <peterz@infradead.org>

instead of "From: Paul ..." ?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes
  2013-12-12 13:17     ` Ingo Molnar
@ 2013-12-12 15:08       ` Paul E. McKenney
  2013-12-13 14:18         ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2013-12-12 15:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv,
	tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec, oleg,
	sbw, Linux-Arch

On Thu, Dec 12, 2013 at 02:17:10PM +0100, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > No SMP architecture currently supporting Linux allows speculative writes,
> > so this commit updates Documentation/memory-barriers.txt to prohibit
> > them in Linux core code.  It also records restrictions on their use.
> > 
> > Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> The signoff sequence looks weird here. If this came from Peter 
> originally, did you mean:
> 
>   From: Peter Zijlstra <peterz@infradead.org>
> 
> instead of "From: Paul ..." ?

Indeed I did!  I will fix.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes
  2013-12-12 15:08       ` Paul E. McKenney
@ 2013-12-13 14:18         ` Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-12-13 14:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, linux-kernel, laijs, dipankar, akpm,
	mathieu.desnoyers, josh, niv, tglx, rostedt, dhowells, edumazet,
	darren, fweisbec, oleg, sbw, Linux-Arch

On Thu, Dec 12, 2013 at 07:08:44AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 12, 2013 at 02:17:10PM +0100, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > 
> > > No SMP architecture currently supporting Linux allows speculative writes,
> > > so this commit updates Documentation/memory-barriers.txt to prohibit
> > > them in Linux core code.  It also records restrictions on their use.
> > > 
> > > Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> > The signoff sequence looks weird here. If this came from Peter 
> > originally, did you mean:
> > 
> >   From: Peter Zijlstra <peterz@infradead.org>
> > 
> > instead of "From: Paul ..." ?
> 
> Indeed I did!  I will fix.

Well Paul did rewrite the entire thing into something far better than
what I started out with. I don't mind him dropping my SOB over his own.
I'll settle for an reviewed-by on this ;-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [tip:core/locking] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
       [not found] ` <1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com>
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
@ 2013-12-16 10:39   ` tip-bot for Paul E. McKenney
  2 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Paul E. McKenney @ 2013-12-16 10:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, paulmck, akpm,
	tglx, josh, linux-arch

Commit-ID:  2ecf810121c7eae34473b8fa108112036bc61127
Gitweb:     http://git.kernel.org/tip/2ecf810121c7eae34473b8fa108112036bc61127
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Wed, 11 Dec 2013 13:59:04 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:08 +0100

Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt

The Documentation/memory-barriers.txt file was written before
the need for ACCESS_ONCE() was fully appreciated.  It therefore
contains no ACCESS_ONCE() calls, which can be a problem when
people lift examples from it.  This commit therefore adds
ACCESS_ONCE() calls.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/memory-barriers.txt | 206 +++++++++++++++++++++++---------------
 1 file changed, 126 insertions(+), 80 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 020cccd..1d06723 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -194,18 +194,22 @@ There are some minimal guarantees that may be expected of a CPU:
  (*) On any given CPU, dependent memory accesses will be issued in order, with
      respect to itself.  This means that for:
 
-	Q = P; D = *Q;
+	ACCESS_ONCE(Q) = P; smp_read_barrier_depends(); D = ACCESS_ONCE(*Q);
 
      the CPU will issue the following memory operations:
 
 	Q = LOAD P, D = LOAD *Q
 
-     and always in that order.
+     and always in that order.  On most systems, smp_read_barrier_depends()
+     does nothing, but it is required for DEC Alpha.  The ACCESS_ONCE()
+     is required to prevent compiler mischief.  Please note that you
+     should normally use something like rcu_dereference() instead of
+     open-coding smp_read_barrier_depends().
 
  (*) Overlapping loads and stores within a particular CPU will appear to be
      ordered within that CPU.  This means that for:
 
-	a = *X; *X = b;
+	a = ACCESS_ONCE(*X); ACCESS_ONCE(*X) = b;
 
      the CPU will only issue the following sequence of memory operations:
 
@@ -213,7 +217,7 @@ There are some minimal guarantees that may be expected of a CPU:
 
      And for:
 
-	*X = c; d = *X;
+	ACCESS_ONCE(*X) = c; d = ACCESS_ONCE(*X);
 
      the CPU will only issue:
 
@@ -224,6 +228,41 @@ There are some minimal guarantees that may be expected of a CPU:
 
 And there are a number of things that _must_ or _must_not_ be assumed:
 
+ (*) It _must_not_ be assumed that the compiler will do what you want with
+     memory references that are not protected by ACCESS_ONCE().  Without
+     ACCESS_ONCE(), the compiler is within its rights to do all sorts
+     of "creative" transformations:
+
+     (-) Repeat the load, possibly getting a different value on the second
+         and subsequent loads.  This is especially prone to happen when
+	 register pressure is high.
+
+     (-) Merge adjacent loads and stores to the same location.  The most
+         familiar example is the transformation from:
+
+		while (a)
+			do_something();
+
+         to something like:
+
+		if (a)
+			for (;;)
+				do_something();
+
+         Using ACCESS_ONCE() as follows prevents this sort of optimization:
+
+		while (ACCESS_ONCE(a))
+			do_something();
+
+     (-) "Store tearing", where a single store in the source code is split
+         into smaller stores in the object code.  Note that gcc really
+	 will do this on some architectures when storing certain constants.
+	 It can be cheaper to do a series of immediate stores than to
+	 form the constant in a register and then to store that register.
+
+     (-) "Load tearing", which splits loads in a manner analogous to
+     	 store tearing.
+
  (*) It _must_not_ be assumed that independent loads and stores will be issued
      in the order given.  This means that for:
 
@@ -450,14 +489,14 @@ The usage requirements of data dependency barriers are a little subtle, and
 it's not always obvious that they're needed.  To illustrate, consider the
 following sequence of events:
 
-	CPU 1		CPU 2
-	===============	===============
+	CPU 1		      CPU 2
+	===============	      ===============
 	{ A == 1, B == 2, C = 3, P == &A, Q == &C }
 	B = 4;
 	<write barrier>
-	P = &B
-			Q = P;
-			D = *Q;
+	ACCESS_ONCE(P) = &B
+			      Q = ACCESS_ONCE(P);
+			      D = *Q;
 
 There's a clear data dependency here, and it would seem that by the end of the
 sequence, Q must be either &A or &B, and that:
@@ -477,15 +516,15 @@ Alpha).
 To deal with this, a data dependency barrier or better must be inserted
 between the address load and the data load:
 
-	CPU 1		CPU 2
-	===============	===============
+	CPU 1		      CPU 2
+	===============	      ===============
 	{ A == 1, B == 2, C = 3, P == &A, Q == &C }
 	B = 4;
 	<write barrier>
-	P = &B
-			Q = P;
-			<data dependency barrier>
-			D = *Q;
+	ACCESS_ONCE(P) = &B
+			      Q = ACCESS_ONCE(P);
+			      <data dependency barrier>
+			      D = *Q;
 
 This enforces the occurrence of one of the two implications, and prevents the
 third possibility from arising.
@@ -504,21 +543,22 @@ Another example of where data dependency barriers might be required is where a
 number is read from memory and then used to calculate the index for an array
 access:
 
-	CPU 1		CPU 2
-	===============	===============
+	CPU 1		      CPU 2
+	===============	      ===============
 	{ M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 }
 	M[1] = 4;
 	<write barrier>
-	P = 1
-			Q = P;
-			<data dependency barrier>
-			D = M[Q];
+	ACCESS_ONCE(P) = 1
+			      Q = ACCESS_ONCE(P);
+			      <data dependency barrier>
+			      D = M[Q];
 
 
-The data dependency barrier is very important to the RCU system, for example.
-See rcu_dereference() in include/linux/rcupdate.h.  This permits the current
-target of an RCU'd pointer to be replaced with a new modified target, without
-the replacement target appearing to be incompletely initialised.
+The data dependency barrier is very important to the RCU system,
+for example.  See rcu_assign_pointer() and rcu_dereference() in
+include/linux/rcupdate.h.  This permits the current target of an RCU'd
+pointer to be replaced with a new modified target, without the replacement
+target appearing to be incompletely initialised.
 
 See also the subsection on "Cache Coherency" for a more thorough example.
 
@@ -530,22 +570,23 @@ A control dependency requires a full read memory barrier, not simply a data
 dependency barrier to make it work correctly.  Consider the following bit of
 code:
 
-	q = &a;
+	q = ACCESS_ONCE(a);
 	if (p) {
 		<data dependency barrier>
-		q = &b;
+		q = ACCESS_ONCE(b);
 	}
 	x = *q;
 
 This will not have the desired effect because there is no actual data
-dependency, but rather a control dependency that the CPU may short-circuit by
-attempting to predict the outcome in advance.  In such a case what's actually
-required is:
+dependency, but rather a control dependency that the CPU may short-circuit
+by attempting to predict the outcome in advance, so that other CPUs see
+the load from b as having happened before the load from a.  In such a
+case what's actually required is:
 
-	q = &a;
+	q = ACCESS_ONCE(a);
 	if (p) {
 		<read barrier>
-		q = &b;
+		q = ACCESS_ONCE(b);
 	}
 	x = *q;
 
@@ -561,23 +602,23 @@ barrier, though a general barrier would also be viable.  Similarly a read
 barrier or a data dependency barrier should always be paired with at least an
 write barrier, though, again, a general barrier is viable:
 
-	CPU 1		CPU 2
-	===============	===============
-	a = 1;
+	CPU 1		      CPU 2
+	===============	      ===============
+	ACCESS_ONCE(a) = 1;
 	<write barrier>
-	b = 2;		x = b;
-			<read barrier>
-			y = a;
+	ACCESS_ONCE(b) = 2;   x = ACCESS_ONCE(b);
+			      <read barrier>
+			      y = ACCESS_ONCE(a);
 
 Or:
 
-	CPU 1		CPU 2
-	===============	===============================
+	CPU 1		      CPU 2
+	===============	      ===============================
 	a = 1;
 	<write barrier>
-	b = &a;		x = b;
-			<data dependency barrier>
-			y = *x;
+	ACCESS_ONCE(b) = &a;  x = ACCESS_ONCE(b);
+			      <data dependency barrier>
+			      y = *x;
 
 Basically, the read barrier always has to be there, even though it can be of
 the "weaker" type.
@@ -586,13 +627,13 @@ the "weaker" type.
 match the loads after the read barrier or the data dependency barrier, and vice
 versa:
 
-	CPU 1                           CPU 2
-	===============                 ===============
-	a = 1;           }----   --->{  v = c
-	b = 2;           }    \ /    {  w = d
-	<write barrier>        \        <read barrier>
-	c = 3;           }    / \    {  x = a;
-	d = 4;           }----   --->{  y = b;
+	CPU 1                               CPU 2
+	===================                 ===================
+	ACCESS_ONCE(a) = 1;  }----   --->{  v = ACCESS_ONCE(c);
+	ACCESS_ONCE(b) = 2;  }    \ /    {  w = ACCESS_ONCE(d);
+	<write barrier>            \        <read barrier>
+	ACCESS_ONCE(c) = 3;  }    / \    {  x = ACCESS_ONCE(a);
+	ACCESS_ONCE(d) = 4;  }----   --->{  y = ACCESS_ONCE(b);
 
 
 EXAMPLES OF MEMORY BARRIER SEQUENCES
@@ -1435,12 +1476,12 @@ three CPUs; then should the following sequence of events occur:
 
 	CPU 1				CPU 2
 	===============================	===============================
-	*A = a;				*E = e;
+	ACCESS_ONCE(*A) = a;		ACCESS_ONCE(*E) = e;
 	LOCK M				LOCK Q
-	*B = b;				*F = f;
-	*C = c;				*G = g;
+	ACCESS_ONCE(*B) = b;		ACCESS_ONCE(*F) = f;
+	ACCESS_ONCE(*C) = c;		ACCESS_ONCE(*G) = g;
 	UNLOCK M			UNLOCK Q
-	*D = d;				*H = h;
+	ACCESS_ONCE(*D) = d;		ACCESS_ONCE(*H) = h;
 
 Then there is no guarantee as to what order CPU 3 will see the accesses to *A
 through *H occur in, other than the constraints imposed by the separate locks
@@ -1460,17 +1501,17 @@ However, if the following occurs:
 
 	CPU 1				CPU 2
 	===============================	===============================
-	*A = a;
-	LOCK M		[1]
-	*B = b;
-	*C = c;
-	UNLOCK M	[1]
-	*D = d;				*E = e;
-					LOCK M		[2]
-					*F = f;
-					*G = g;
-					UNLOCK M	[2]
-					*H = h;
+	ACCESS_ONCE(*A) = a;
+	LOCK M		     [1]
+	ACCESS_ONCE(*B) = b;
+	ACCESS_ONCE(*C) = c;
+	UNLOCK M	     [1]
+	ACCESS_ONCE(*D) = d;		ACCESS_ONCE(*E) = e;
+					LOCK M		     [2]
+					ACCESS_ONCE(*F) = f;
+					ACCESS_ONCE(*G) = g;
+					UNLOCK M	     [2]
+					ACCESS_ONCE(*H) = h;
 
 CPU 3 might see:
 
@@ -2177,11 +2218,11 @@ A programmer might take it for granted that the CPU will perform memory
 operations in exactly the order specified, so that if the CPU is, for example,
 given the following piece of code to execute:
 
-	a = *A;
-	*B = b;
-	c = *C;
-	d = *D;
-	*E = e;
+	a = ACCESS_ONCE(*A);
+	ACCESS_ONCE(*B) = b;
+	c = ACCESS_ONCE(*C);
+	d = ACCESS_ONCE(*D);
+	ACCESS_ONCE(*E) = e;
 
 they would then expect that the CPU will complete the memory operation for each
 instruction before moving on to the next one, leading to a definite sequence of
@@ -2228,12 +2269,12 @@ However, it is guaranteed that a CPU will be self-consistent: it will see its
 _own_ accesses appear to be correctly ordered, without the need for a memory
 barrier.  For instance with the following code:
 
-	U = *A;
-	*A = V;
-	*A = W;
-	X = *A;
-	*A = Y;
-	Z = *A;
+	U = ACCESS_ONCE(*A);
+	ACCESS_ONCE(*A) = V;
+	ACCESS_ONCE(*A) = W;
+	X = ACCESS_ONCE(*A);
+	ACCESS_ONCE(*A) = Y;
+	Z = ACCESS_ONCE(*A);
 
 and assuming no intervention by an external influence, it can be assumed that
 the final result will appear to be:
@@ -2250,7 +2291,12 @@ accesses:
 
 in that order, but, without intervention, the sequence may have almost any
 combination of elements combined or discarded, provided the program's view of
-the world remains consistent.
+the world remains consistent.  Note that ACCESS_ONCE() is -not- optional
+in the above example, as there are architectures where a given CPU might
+interchange successive loads to the same location.  On such architectures,
+ACCESS_ONCE() does whatever is necessary to prevent this, for example, on
+Itanium the volatile casts used by ACCESS_ONCE() cause GCC to emit the
+special ld.acq and st.rel instructions that prevent such reordering.
 
 The compiler may also combine, discard or defer elements of the sequence before
 the CPU even sees them.
@@ -2264,13 +2310,13 @@ may be reduced to:
 
 	*A = W;
 
-since, without a write barrier, it can be assumed that the effect of the
-storage of V to *A is lost.  Similarly:
+since, without either a write barrier or an ACCESS_ONCE(), it can be
+assumed that the effect of the storage of V to *A is lost.  Similarly:
 
 	*A = Y;
 	Z = *A;
 
-may, without a memory barrier, be reduced to:
+may, without a memory barrier or an ACCESS_ONCE(), be reduced to:
 
 	*A = Y;
 	Z = Y;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/locking] Documentation/memory-barriers.txt: Prohibit speculative writes
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
  2013-12-12 13:17     ` Ingo Molnar
@ 2013-12-16 10:39     ` tip-bot for Peter Zijlstra
  2013-12-16 10:39       ` tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: tip-bot for Peter Zijlstra @ 2013-12-16 10:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, peterz, paulmck,
	akpm, tglx, josh, linux-arch

Commit-ID:  18c03c61444a211237f3d4782353cb38dba795df
Gitweb:     http://git.kernel.org/tip/18c03c61444a211237f3d4782353cb38dba795df
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 11 Dec 2013 13:59:06 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:11 +0100

Documentation/memory-barriers.txt: Prohibit speculative writes

No SMP architecture currently supporting Linux allows
speculative writes, so this commit updates
Documentation/memory-barriers.txt to prohibit them in Linux core
code.  It also records restrictions on their use.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-3-git-send-email-paulmck@linux.vnet.ibm.com
[ Paul modified the original patch from Peter. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/memory-barriers.txt | 183 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 175 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 2d22da0..deafa36 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -571,11 +571,10 @@ dependency barrier to make it work correctly.  Consider the following bit of
 code:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
-		<data dependency barrier>
-		q = ACCESS_ONCE(b);
+	if (q) {
+		<data dependency barrier>  /* BUG: No data dependency!!! */
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
 
 This will not have the desired effect because there is no actual data
 dependency, but rather a control dependency that the CPU may short-circuit
@@ -584,11 +583,176 @@ the load from b as having happened before the load from a.  In such a
 case what's actually required is:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
+	if (q) {
 		<read barrier>
-		q = ACCESS_ONCE(b);
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
+
+However, stores are not speculated.  This means that ordering -is- provided
+in the following example:
+
+	q = ACCESS_ONCE(a);
+	if (ACCESS_ONCE(q)) {
+		ACCESS_ONCE(b) = p;
+	}
+
+Please note that ACCESS_ONCE() is not optional!  Without the ACCESS_ONCE(),
+the compiler is within its rights to transform this example:
+
+	q = a;
+	if (q) {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something();
+	} else {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something_else();
+	}
+
+into this, which of course defeats the ordering:
+
+	b = p;
+	q = a;
+	if (q)
+		do_something();
+	else
+		do_something_else();
+
+Worse yet, if the compiler is able to prove (say) that the value of
+variable 'a' is always non-zero, it would be well within its rights
+to optimize the original example by eliminating the "if" statement
+as follows:
+
+	q = a;
+	b = p;  /* BUG: Compiler can reorder!!! */
+	do_something();
+
+The solution is again ACCESS_ONCE(), which preserves the ordering between
+the load from variable 'a' and the store to variable 'b':
+
+	q = ACCESS_ONCE(a);
+	if (q) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+You could also use barrier() to prevent the compiler from moving
+the stores to variable 'b', but barrier() would not prevent the
+compiler from proving to itself that a==1 always, so ACCESS_ONCE()
+is also needed.
+
+It is important to note that control dependencies absolutely require a
+a conditional.  For example, the following "optimized" version of
+the above example breaks ordering:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;  /* BUG: No ordering vs. load from a!!! */
+	if (q) {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something();
+	} else {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something_else();
+	}
+
+It is of course legal for the prior load to be part of the conditional,
+for example, as follows:
+
+	if (ACCESS_ONCE(a) > 0) {
+		ACCESS_ONCE(b) = q / 2;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = q / 3;
+		do_something_else();
+	}
+
+This will again ensure that the load from variable 'a' is ordered before the
+stores to variable 'b'.
+
+In addition, you need to be careful what you do with the local variable 'q',
+otherwise the compiler might be able to guess the value and again remove
+the needed conditional.  For example:
+
+	q = ACCESS_ONCE(a);
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+If MAX is defined to be 1, then the compiler knows that (q % MAX) is
+equal to zero, in which case the compiler is within its rights to
+transform the above code into the following:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;
+	do_something_else();
+
+This transformation loses the ordering between the load from variable 'a'
+and the store to variable 'b'.  If you are relying on this ordering, you
+should do something like the following:
+
+	q = ACCESS_ONCE(a);
+	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+Finally, control dependencies do -not- provide transitivity.  This is
+demonstrated by two related examples:
+
+	CPU 0                     CPU 1
+	=====================     =====================
+	r1 = ACCESS_ONCE(x);      r2 = ACCESS_ONCE(y);
+	if (r1 >= 0)              if (r2 >= 0)
+	  ACCESS_ONCE(y) = 1;       ACCESS_ONCE(x) = 1;
+
+	assert(!(r1 == 1 && r2 == 1));
+
+The above two-CPU example will never trigger the assert().  However,
+if control dependencies guaranteed transitivity (which they do not),
+then adding the following two CPUs would guarantee a related assertion:
+
+	CPU 2                     CPU 3
+	=====================     =====================
+	ACCESS_ONCE(x) = 2;       ACCESS_ONCE(y) = 2;
+
+	assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */
+
+But because control dependencies do -not- provide transitivity, the
+above assertion can fail after the combined four-CPU example completes.
+If you need the four-CPU example to provide ordering, you will need
+smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
+
+In summary:
+
+  (*) Control dependencies can order prior loads against later stores.
+      However, they do -not- guarantee any other sort of ordering:
+      Not prior loads against later loads, nor prior stores against
+      later anything.  If you need these other forms of ordering,
+      use smb_rmb(), smp_wmb(), or, in the case of prior stores and
+      later loads, smp_mb().
+
+  (*) Control dependencies require at least one run-time conditional
+      between the prior load and the subsequent store.  If the compiler
+      is able to optimize the conditional away, it will have also
+      optimized away the ordering.  Careful use of ACCESS_ONCE() can
+      help to preserve the needed conditional.
+
+  (*) Control dependencies require that the compiler avoid reordering the
+      dependency into nonexistence.  Careful use of ACCESS_ONCE() or
+      barrier() can help to preserve your control dependency.
+
+  (*) Control dependencies do -not- provide transitivity.  If you
+      need transitivity, use smp_mb().
 
 
 SMP BARRIER PAIRING
@@ -1083,7 +1247,10 @@ compiler from moving the memory accesses either side of it to the other side:
 
 	barrier();
 
-This is a general barrier - lesser varieties of compiler barrier do not exist.
+This is a general barrier -- there are no read-read or write-write variants
+of barrier().  Howevever, ACCESS_ONCE() can be thought of as a weak form
+for barrier() that affects only the specific accesses flagged by the
+ACCESS_ONCE().
 
 The compiler barrier has no direct effect on the CPU, which may then reorder
 things however it wishes.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/locking] Documentation/memory-barriers.txt: Prohibit speculative writes
  2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Peter Zijlstra
@ 2013-12-16 10:39       ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Peter Zijlstra @ 2013-12-16 10:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, peterz, paulmck,
	akpm, tglx, josh, linux-arch

Commit-ID:  18c03c61444a211237f3d4782353cb38dba795df
Gitweb:     http://git.kernel.org/tip/18c03c61444a211237f3d4782353cb38dba795df
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 11 Dec 2013 13:59:06 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:11 +0100

Documentation/memory-barriers.txt: Prohibit speculative writes

No SMP architecture currently supporting Linux allows
speculative writes, so this commit updates
Documentation/memory-barriers.txt to prohibit them in Linux core
code.  It also records restrictions on their use.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-3-git-send-email-paulmck@linux.vnet.ibm.com
[ Paul modified the original patch from Peter. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/memory-barriers.txt | 183 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 175 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 2d22da0..deafa36 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -571,11 +571,10 @@ dependency barrier to make it work correctly.  Consider the following bit of
 code:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
-		<data dependency barrier>
-		q = ACCESS_ONCE(b);
+	if (q) {
+		<data dependency barrier>  /* BUG: No data dependency!!! */
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
 
 This will not have the desired effect because there is no actual data
 dependency, but rather a control dependency that the CPU may short-circuit
@@ -584,11 +583,176 @@ the load from b as having happened before the load from a.  In such a
 case what's actually required is:
 
 	q = ACCESS_ONCE(a);
-	if (p) {
+	if (q) {
 		<read barrier>
-		q = ACCESS_ONCE(b);
+		p = ACCESS_ONCE(b);
 	}
-	x = *q;
+
+However, stores are not speculated.  This means that ordering -is- provided
+in the following example:
+
+	q = ACCESS_ONCE(a);
+	if (ACCESS_ONCE(q)) {
+		ACCESS_ONCE(b) = p;
+	}
+
+Please note that ACCESS_ONCE() is not optional!  Without the ACCESS_ONCE(),
+the compiler is within its rights to transform this example:
+
+	q = a;
+	if (q) {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something();
+	} else {
+		b = p;  /* BUG: Compiler can reorder!!! */
+		do_something_else();
+	}
+
+into this, which of course defeats the ordering:
+
+	b = p;
+	q = a;
+	if (q)
+		do_something();
+	else
+		do_something_else();
+
+Worse yet, if the compiler is able to prove (say) that the value of
+variable 'a' is always non-zero, it would be well within its rights
+to optimize the original example by eliminating the "if" statement
+as follows:
+
+	q = a;
+	b = p;  /* BUG: Compiler can reorder!!! */
+	do_something();
+
+The solution is again ACCESS_ONCE(), which preserves the ordering between
+the load from variable 'a' and the store to variable 'b':
+
+	q = ACCESS_ONCE(a);
+	if (q) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+You could also use barrier() to prevent the compiler from moving
+the stores to variable 'b', but barrier() would not prevent the
+compiler from proving to itself that a==1 always, so ACCESS_ONCE()
+is also needed.
+
+It is important to note that control dependencies absolutely require a
+a conditional.  For example, the following "optimized" version of
+the above example breaks ordering:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;  /* BUG: No ordering vs. load from a!!! */
+	if (q) {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something();
+	} else {
+		/* ACCESS_ONCE(b) = p; -- moved up, BUG!!! */
+		do_something_else();
+	}
+
+It is of course legal for the prior load to be part of the conditional,
+for example, as follows:
+
+	if (ACCESS_ONCE(a) > 0) {
+		ACCESS_ONCE(b) = q / 2;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = q / 3;
+		do_something_else();
+	}
+
+This will again ensure that the load from variable 'a' is ordered before the
+stores to variable 'b'.
+
+In addition, you need to be careful what you do with the local variable 'q',
+otherwise the compiler might be able to guess the value and again remove
+the needed conditional.  For example:
+
+	q = ACCESS_ONCE(a);
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+If MAX is defined to be 1, then the compiler knows that (q % MAX) is
+equal to zero, in which case the compiler is within its rights to
+transform the above code into the following:
+
+	q = ACCESS_ONCE(a);
+	ACCESS_ONCE(b) = p;
+	do_something_else();
+
+This transformation loses the ordering between the load from variable 'a'
+and the store to variable 'b'.  If you are relying on this ordering, you
+should do something like the following:
+
+	q = ACCESS_ONCE(a);
+	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
+	if (q % MAX) {
+		ACCESS_ONCE(b) = p;
+		do_something();
+	} else {
+		ACCESS_ONCE(b) = p;
+		do_something_else();
+	}
+
+Finally, control dependencies do -not- provide transitivity.  This is
+demonstrated by two related examples:
+
+	CPU 0                     CPU 1
+	=====================     =====================
+	r1 = ACCESS_ONCE(x);      r2 = ACCESS_ONCE(y);
+	if (r1 >= 0)              if (r2 >= 0)
+	  ACCESS_ONCE(y) = 1;       ACCESS_ONCE(x) = 1;
+
+	assert(!(r1 == 1 && r2 == 1));
+
+The above two-CPU example will never trigger the assert().  However,
+if control dependencies guaranteed transitivity (which they do not),
+then adding the following two CPUs would guarantee a related assertion:
+
+	CPU 2                     CPU 3
+	=====================     =====================
+	ACCESS_ONCE(x) = 2;       ACCESS_ONCE(y) = 2;
+
+	assert(!(r1 == 2 && r2 == 2 && x == 1 && y == 1)); /* FAILS!!! */
+
+But because control dependencies do -not- provide transitivity, the
+above assertion can fail after the combined four-CPU example completes.
+If you need the four-CPU example to provide ordering, you will need
+smp_mb() between the loads and stores in the CPU 0 and CPU 1 code fragments.
+
+In summary:
+
+  (*) Control dependencies can order prior loads against later stores.
+      However, they do -not- guarantee any other sort of ordering:
+      Not prior loads against later loads, nor prior stores against
+      later anything.  If you need these other forms of ordering,
+      use smb_rmb(), smp_wmb(), or, in the case of prior stores and
+      later loads, smp_mb().
+
+  (*) Control dependencies require at least one run-time conditional
+      between the prior load and the subsequent store.  If the compiler
+      is able to optimize the conditional away, it will have also
+      optimized away the ordering.  Careful use of ACCESS_ONCE() can
+      help to preserve the needed conditional.
+
+  (*) Control dependencies require that the compiler avoid reordering the
+      dependency into nonexistence.  Careful use of ACCESS_ONCE() or
+      barrier() can help to preserve your control dependency.
+
+  (*) Control dependencies do -not- provide transitivity.  If you
+      need transitivity, use smp_mb().
 
 
 SMP BARRIER PAIRING
@@ -1083,7 +1247,10 @@ compiler from moving the memory accesses either side of it to the other side:
 
 	barrier();
 
-This is a general barrier - lesser varieties of compiler barrier do not exist.
+This is a general barrier -- there are no read-read or write-write variants
+of barrier().  Howevever, ACCESS_ONCE() can be thought of as a weak form
+for barrier() that affects only the specific accesses flagged by the
+ACCESS_ONCE().
 
 The compiler barrier has no direct effect on the CPU, which may then reorder
 things however it wishes.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
  2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
  2013-12-11 21:59     ` Paul E. McKenney
@ 2013-12-16 10:40     ` tip-bot for Paul E. McKenney
  2013-12-16 10:40       ` tip-bot for Paul E. McKenney
  1 sibling, 1 reply; 11+ messages in thread
From: tip-bot for Paul E. McKenney @ 2013-12-16 10:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, paulmck, akpm,
	tglx, linux-arch

Commit-ID:  01352fb81658cbf78c55844de8e3d1d606bbf3f8
Gitweb:     http://git.kernel.org/tip/01352fb81658cbf78c55844de8e3d1d606bbf3f8
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Wed, 11 Dec 2013 13:59:08 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:13 +0100

locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier

The Linux kernel has traditionally required that an UNLOCK+LOCK
pair act as a full memory barrier when either (1) that
UNLOCK+LOCK pair was executed by the same CPU or task, or (2)
the same lock variable was used for the UNLOCK and LOCK.  It now
seems likely that very few places in the kernel rely on this
full-memory-barrier semantic, and with the advent of queued
locks, providing this semantic either requires complex
reasoning, or for some architectures, added overhead.

This commit therefore adds a smp_mb__after_unlock_lock(), which
may be placed after a LOCK primitive to restore the
full-memory-barrier semantic. All definitions are currently
no-ops, but will be upgraded for some architectures when queued
locks arrive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-5-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/spinlock.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 75f3494..3f2867f 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,6 +130,16 @@ do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
+/*
+ * Place this after a lock-acquisition primitive to guarantee that
+ * an UNLOCK+LOCK pair act as a full barrier.  This guarantee applies
+ * if the UNLOCK and LOCK are executed by the same CPU or if the
+ * UNLOCK and LOCK operate on the same lock variable.
+ */
+#ifndef smp_mb__after_unlock_lock
+#define smp_mb__after_unlock_lock()	do { } while (0)
+#endif
+
 /**
  * raw_spin_unlock_wait - wait until the spinlock gets unlocked
  * @lock: the spinlock in question.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
  2013-12-16 10:40     ` [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier tip-bot for Paul E. McKenney
@ 2013-12-16 10:40       ` tip-bot for Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Paul E. McKenney @ 2013-12-16 10:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, torvalds, a.p.zijlstra, paulmck, akpm,
	tglx, linux-arch

Commit-ID:  01352fb81658cbf78c55844de8e3d1d606bbf3f8
Gitweb:     http://git.kernel.org/tip/01352fb81658cbf78c55844de8e3d1d606bbf3f8
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Wed, 11 Dec 2013 13:59:08 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:13 +0100

locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier

The Linux kernel has traditionally required that an UNLOCK+LOCK
pair act as a full memory barrier when either (1) that
UNLOCK+LOCK pair was executed by the same CPU or task, or (2)
the same lock variable was used for the UNLOCK and LOCK.  It now
seems likely that very few places in the kernel rely on this
full-memory-barrier semantic, and with the advent of queued
locks, providing this semantic either requires complex
reasoning, or for some architectures, added overhead.

This commit therefore adds a smp_mb__after_unlock_lock(), which
may be placed after a LOCK primitive to restore the
full-memory-barrier semantic. All definitions are currently
no-ops, but will be upgraded for some architectures when queued
locks arrive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-5-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/spinlock.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 75f3494..3f2867f 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,6 +130,16 @@ do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
+/*
+ * Place this after a lock-acquisition primitive to guarantee that
+ * an UNLOCK+LOCK pair act as a full barrier.  This guarantee applies
+ * if the UNLOCK and LOCK are executed by the same CPU or if the
+ * UNLOCK and LOCK operate on the same lock variable.
+ */
+#ifndef smp_mb__after_unlock_lock
+#define smp_mb__after_unlock_lock()	do { } while (0)
+#endif
+
 /**
  * raw_spin_unlock_wait - wait until the spinlock gets unlocked
  * @lock: the spinlock in question.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-12-16 10:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20131211215850.GA810@linux.vnet.ibm.com>
     [not found] ` <1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com>
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
2013-12-12 13:17     ` Ingo Molnar
2013-12-12 15:08       ` Paul E. McKenney
2013-12-13 14:18         ` Peter Zijlstra
2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Peter Zijlstra
2013-12-16 10:39       ` tip-bot for Peter Zijlstra
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
2013-12-11 21:59     ` Paul E. McKenney
2013-12-16 10:40     ` [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier tip-bot for Paul E. McKenney
2013-12-16 10:40       ` tip-bot for Paul E. McKenney
2013-12-16 10:39   ` [tip:core/locking] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt tip-bot for Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).