All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	"Paul E. McKenney" <paulmck@us.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [GIT PULL] locking fixes
Date: Thu, 24 Mar 2016 08:47:10 +0100	[thread overview]
Message-ID: <20160324074710.GA31975@gmail.com> (raw)

Linus,

Please pull the latest locking-urgent-for-linus git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking-urgent-for-linus

   # HEAD: f75d48644c56a31731d17fa693c8175328957e1d bitops: Do not default to __clear_bit() for __clear_bit_unlock()

Documentation updates and a bitops ordering fix.

 Thanks,

	Ingo

------------------>
Paul E. McKenney (7):
      documentation: Fix control dependency and identical stores
      documentation: Fix memory-barriers.txt section references
      documentation: Remove obsolete reference to RCU-protected indexes
      documentation: Subsequent writes ordered by rcu_dereference()
      documentation: Distinguish between local and global transitivity
      documentation:  Add alternative release-acquire outcome
      documentation: Transitivity is not cumulativity

Peter Zijlstra (1):
      bitops: Do not default to __clear_bit() for __clear_bit_unlock()

SeongJae Park (1):
      documentation: Clarify compiler store-fusion example


 Documentation/memory-barriers.txt | 141 +++++++++++++++++++++++++++++++-------
 include/asm-generic/bitops/lock.h |  14 ++--
 2 files changed, 123 insertions(+), 32 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 904ee42d078e..3729cbe60e41 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -232,7 +232,7 @@ GUARANTEES
      with memory references that are not protected by READ_ONCE() and
      WRITE_ONCE().  Without them, the compiler is within its rights to
      do all sorts of "creative" transformations, which are covered in
-     the Compiler Barrier section.
+     the COMPILER BARRIER section.
 
  (*) It _must_not_ be assumed that independent loads and stores will be issued
      in the order given.  This means that for:
@@ -555,6 +555,30 @@ To deal with this, a data dependency barrier or better must be inserted
 This enforces the occurrence of one of the two implications, and prevents the
 third possibility from arising.
 
+A data-dependency barrier must also order against dependent writes:
+
+	CPU 1		      CPU 2
+	===============	      ===============
+	{ A == 1, B == 2, C = 3, P == &A, Q == &C }
+	B = 4;
+	<write barrier>
+	WRITE_ONCE(P, &B);
+			      Q = READ_ONCE(P);
+			      <data dependency barrier>
+			      *Q = 5;
+
+The data-dependency barrier must order the read into Q with the store
+into *Q.  This prohibits this outcome:
+
+	(Q == B) && (B == 4)
+
+Please note that this pattern should be rare.  After all, the whole point
+of dependency ordering is to -prevent- writes to the data structure, along
+with the expensive cache misses associated with those writes.  This pattern
+can be used to record rare error conditions and the like, and the ordering
+prevents such records from being lost.
+
+
 [!] Note that this extremely counterintuitive situation arises most easily on
 machines with split caches, so that, for example, one cache bank processes
 even-numbered cache lines and the other bank processes odd-numbered cache
@@ -565,21 +589,6 @@ odd-numbered bank is idle, one can see the new value of the pointer P (&B),
 but the old value of the variable B (2).
 
 
-Another example of where data dependency barriers might be required is where a
-number is read from memory and then used to calculate the index for an array
-access:
-
-	CPU 1		      CPU 2
-	===============	      ===============
-	{ M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 }
-	M[1] = 4;
-	<write barrier>
-	WRITE_ONCE(P, 1);
-			      Q = READ_ONCE(P);
-			      <data dependency barrier>
-			      D = M[Q];
-
-
 The data dependency barrier is very important to the RCU system,
 for example.  See rcu_assign_pointer() and rcu_dereference() in
 include/linux/rcupdate.h.  This permits the current target of an RCU'd
@@ -800,9 +809,13 @@ site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
       use smp_rmb(), smp_wmb(), or, in the case of prior stores and
       later loads, smp_mb().
 
-  (*) If both legs of the "if" statement begin with identical stores
-      to the same variable, a barrier() statement is required at the
-      beginning of each leg of the "if" statement.
+  (*) If both legs of the "if" statement begin with identical stores to
+      the same variable, then those stores must be ordered, either by
+      preceding both of them with smp_mb() or by using smp_store_release()
+      to carry out the stores.  Please note that it is -not- sufficient
+      to use barrier() at beginning of each leg of the "if" statement,
+      as optimizing compilers do not necessarily respect barrier()
+      in this case.
 
   (*) Control dependencies require at least one run-time conditional
       between the prior load and the subsequent store, and this
@@ -814,7 +827,7 @@ site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
   (*) Control dependencies require that the compiler avoid reordering the
       dependency into nonexistence.  Careful use of READ_ONCE() or
       atomic{,64}_read() can help to preserve your control dependency.
-      Please see the Compiler Barrier section for more information.
+      Please see the COMPILER BARRIER section for more information.
 
   (*) Control dependencies pair normally with other types of barriers.
 
@@ -1257,7 +1270,7 @@ TRANSITIVITY
 
 Transitivity is a deeply intuitive notion about ordering that is not
 always provided by real computer systems.  The following example
-demonstrates transitivity (also called "cumulativity"):
+demonstrates transitivity:
 
 	CPU 1			CPU 2			CPU 3
 	=======================	=======================	=======================
@@ -1305,8 +1318,86 @@ or a level of cache, CPU 2 might have early access to CPU 1's writes.
 General barriers are therefore required to ensure that all CPUs agree
 on the combined order of CPU 1's and CPU 2's accesses.
 
-To reiterate, if your code requires transitivity, use general barriers
-throughout.
+General barriers provide "global transitivity", so that all CPUs will
+agree on the order of operations.  In contrast, a chain of release-acquire
+pairs provides only "local transitivity", so that only those CPUs on
+the chain are guaranteed to agree on the combined order of the accesses.
+For example, switching to C code in deference to Herman Hollerith:
+
+	int u, v, x, y, z;
+
+	void cpu0(void)
+	{
+		r0 = smp_load_acquire(&x);
+		WRITE_ONCE(u, 1);
+		smp_store_release(&y, 1);
+	}
+
+	void cpu1(void)
+	{
+		r1 = smp_load_acquire(&y);
+		r4 = READ_ONCE(v);
+		r5 = READ_ONCE(u);
+		smp_store_release(&z, 1);
+	}
+
+	void cpu2(void)
+	{
+		r2 = smp_load_acquire(&z);
+		smp_store_release(&x, 1);
+	}
+
+	void cpu3(void)
+	{
+		WRITE_ONCE(v, 1);
+		smp_mb();
+		r3 = READ_ONCE(u);
+	}
+
+Because cpu0(), cpu1(), and cpu2() participate in a local transitive
+chain of smp_store_release()/smp_load_acquire() pairs, the following
+outcome is prohibited:
+
+	r0 == 1 && r1 == 1 && r2 == 1
+
+Furthermore, because of the release-acquire relationship between cpu0()
+and cpu1(), cpu1() must see cpu0()'s writes, so that the following
+outcome is prohibited:
+
+	r1 == 1 && r5 == 0
+
+However, the transitivity of release-acquire is local to the participating
+CPUs and does not apply to cpu3().  Therefore, the following outcome
+is possible:
+
+	r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0
+
+As an aside, the following outcome is also possible:
+
+	r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0 && r5 == 1
+
+Although cpu0(), cpu1(), and cpu2() will see their respective reads and
+writes in order, CPUs not involved in the release-acquire chain might
+well disagree on the order.  This disagreement stems from the fact that
+the weak memory-barrier instructions used to implement smp_load_acquire()
+and smp_store_release() are not required to order prior stores against
+subsequent loads in all cases.  This means that cpu3() can see cpu0()'s
+store to u as happening -after- cpu1()'s load from v, even though
+both cpu0() and cpu1() agree that these two operations occurred in the
+intended order.
+
+However, please keep in mind that smp_load_acquire() is not magic.
+In particular, it simply reads from its argument with ordering.  It does
+-not- ensure that any particular value will be read.  Therefore, the
+following outcome is possible:
+
+	r0 == 0 && r1 == 0 && r2 == 0 && r5 == 0
+
+Note that this outcome can happen even on a mythical sequentially
+consistent system where nothing is ever reordered.
+
+To reiterate, if your code requires global transitivity, use general
+barriers throughout.
 
 
 ========================
@@ -1459,7 +1550,7 @@ be fatal in concurrent code.  Here are some examples of these sorts
      the following:
 
 	a = 0;
-	/* Code that does not store to variable a. */
+	... Code that does not store to variable a ...
 	a = 0;
 
      The compiler sees that the value of variable 'a' is already zero, so
@@ -1471,7 +1562,7 @@ be fatal in concurrent code.  Here are some examples of these sorts
      wrong guess:
 
 	WRITE_ONCE(a, 0);
-	/* Code that does not store to variable a. */
+	... Code that does not store to variable a ...
 	WRITE_ONCE(a, 0);
 
  (*) The compiler is within its rights to reorder memory accesses unless
diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index c30266e94806..8ef0ccbf8167 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -29,16 +29,16 @@ do {					\
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
- * This operation is like clear_bit_unlock, however it is not atomic.
- * It does provide release barrier semantics so it can be used to unlock
- * a bit lock, however it would only be used if no other CPU can modify
- * any bits in the memory until the lock is released (a good example is
- * if the bit lock itself protects access to the other bits in the word).
+ * A weaker form of clear_bit_unlock() as used by __bit_lock_unlock(). If all
+ * the bits in the word are protected by this lock some archs can use weaker
+ * ops to safely unlock.
+ *
+ * See for example x86's implementation.
  */
 #define __clear_bit_unlock(nr, addr)	\
 do {					\
-	smp_mb();			\
-	__clear_bit(nr, addr);		\
+	smp_mb__before_atomic();	\
+	clear_bit(nr, addr);		\
 } while (0)
 
 #endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */

             reply	other threads:[~2016-03-24  7:47 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-24  7:47 Ingo Molnar [this message]
  -- strict thread matches above, loose matches on Subject: below --
2026-04-24 13:49 [GIT PULL] locking fixes Ingo Molnar
2026-04-24 18:32 ` pr-tracker-bot
2026-03-29  5:20 Ingo Molnar
2026-03-29 17:29 ` pr-tracker-bot
2025-12-06 11:32 Ingo Molnar
2025-12-06 20:42 ` pr-tracker-bot
2025-09-26 13:46 Ingo Molnar
2025-09-26 20:44 ` pr-tracker-bot
2025-03-28 21:05 Ingo Molnar
2025-03-30 23:04 ` pr-tracker-bot
2025-03-14  9:06 Ingo Molnar
2025-03-14 20:14 ` pr-tracker-bot
2023-09-22 10:12 Ingo Molnar
2023-09-22 20:19 ` pr-tracker-bot
2021-07-11 13:22 Ingo Molnar
2021-07-11 18:22 ` pr-tracker-bot
2021-04-11 12:14 Ingo Molnar
2021-04-11 18:56 ` pr-tracker-bot
2021-03-21 10:53 Ingo Molnar
2021-03-21 18:45 ` pr-tracker-bot
2020-12-27  9:50 Ingo Molnar
2020-12-27 17:27 ` pr-tracker-bot
2020-08-15 11:13 Ingo Molnar
2020-08-16  1:55 ` pr-tracker-bot
2020-01-18 17:53 Ingo Molnar
2020-01-18 21:05 ` pr-tracker-bot
2019-12-17 11:27 Ingo Molnar
2019-12-17 19:20 ` pr-tracker-bot
2019-04-20  7:30 Ingo Molnar
2019-04-20 16:51 ` Linus Torvalds
2019-04-21 18:23   ` Ingo Molnar
2019-04-20 19:25 ` pr-tracker-bot
2019-02-10  8:53 Ingo Molnar
2019-02-10 18:30 ` pr-tracker-bot
2018-10-05  9:36 Ingo Molnar
2018-10-05 23:06 ` Greg Kroah-Hartman
2018-09-15 12:56 Ingo Molnar
2018-07-30 17:49 Ingo Molnar
2018-03-25  8:49 Ingo Molnar
2018-02-15  0:50 Ingo Molnar
2018-01-17 15:24 Ingo Molnar
2018-01-22  9:43 ` Geert Uytterhoeven
2018-01-22 10:39   ` Peter Zijlstra
2018-01-12 13:45 Ingo Molnar
2017-12-15 15:55 Ingo Molnar
2017-10-14 16:01 Ingo Molnar
2017-03-07 20:27 Ingo Molnar
2017-02-28  7:57 Ingo Molnar
2017-02-28 18:37 ` Linus Torvalds
2017-05-03 23:21 ` Linus Torvalds
2017-05-04  5:40   ` Peter Zijlstra
     [not found]     ` <CA+55aFymvtCAYHdz__3Lj=YqmORB7_A-NXrw=+h+60znJVsDTw@mail.gmail.com>
2017-05-04 22:44       ` Greg Kroah-Hartman
2016-12-07 18:42 Ingo Molnar
2016-10-18 10:55 Ingo Molnar
2016-08-18 20:34 Ingo Molnar
2016-08-12 19:32 Ingo Molnar
2016-06-10 12:45 Ingo Molnar
2016-04-28 17:52 Ingo Molnar
2016-04-23 11:22 Ingo Molnar
2015-09-17  7:57 Ingo Molnar
2015-04-18 15:15 Ingo Molnar
2015-02-20 13:37 Ingo Molnar
2015-02-21  0:03 ` Linus Torvalds
2015-02-21  1:51   ` Linus Torvalds
2015-02-23  8:35     ` Christian Borntraeger
2015-02-21  5:07   ` Ingo Molnar
2015-02-21  5:16     ` Ingo Molnar
2015-02-21  5:28       ` Ingo Molnar
2015-01-11  8:39 Ingo Molnar
2014-10-31 11:06 Ingo Molnar
2014-04-16 11:39 Ingo Molnar
2014-01-15 18:15 Ingo Molnar
2009-12-10 19:45 Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160324074710.GA31975@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@us.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.