All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: David Howells <dhowells@redhat.com>,
	Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: Uses for memory barriers
Date: Thu, 19 Oct 2006 15:46:46 -0700	[thread overview]
Message-ID: <20061019224646.GC2265@us.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.0610191557220.8492-100000@iolanthe.rowland.org>

On Thu, Oct 19, 2006 at 04:55:16PM -0400, Alan Stern wrote:
> On Thu, 19 Oct 2006, Paul E. McKenney wrote:
> 
> > > I see no reason to think the control dependency in CPU 1's assignment to B 
> > > is any weaker than a memory barrier.
> > 
> > I am assuming that you have in mind a special restricted memory barrier
> > that applies only to the load of A, and not necessarily to any other
> > preceding operations.  Otherwise, these two code sequences would be
> > equivalent, and they are not (as usual, all variables initially zero):
> > 
> > 	CPU 0, Sequence 1	CPU 0, Sequence 2	CPU 1
> > 
> > 	A=1			A=1			while (C==0);
> > 	while (B==0);		while (B==0);		smp_mb();
> > 	C=1			smp_mb();		assert(A==1);
> > 				C=1
> > 
> > In sequence 1, CPU 1's assertion can fail.  Not so with sequence 2.
> 
> Yes, that's a very good point.  Indeed, I meant a restricted memory
> barrier applying only to the two accesses involved.  In the same sort of
> way rmb() is a restricted memory barrier, applying only to pairs of
> loads.

OK.

> > Regardless of your definition of your posited memory barrier corresponding
> > to the control dependency, a counter example:
> > 
> > 	CPU 1			CPU 2
> > 
> > 	A=1;
> > 	...
> > 	while (A==0);		while (B==0);
> > 	B=1			smp_mb()
> > 				assert(A==1) <fails>
> > 
> > Here, placing an smp_mb() after the "while (A==0)" does make a difference.
> > 
> > Degenerate, perhaps, given that the same CPU is assigning and while-ing,
> > but so it goes.
> 
> The smp_mb() does make a difference.  But it doesn't invalidate my notion
> of a dependency acting as a restricted memory barrier.  The notion allows
> you to conclude from this example only that ld_1(A) >v ld_2(A), which is
> meaningless (using your convention for >v).  It doesn't allow you to
> conclude st_1(A) >v ld_2(A).

Yes, assuming that control dependencies result in your restricted memory
barrier.

> > Even assuming a special restricted memory barrier, the example of DEC
> > Alpha and pointer dereferencing gives me pause.  Feel free to berate
> > me for this, as you have done in the past.  ;-)
> 
> Ah, interesting comment.  With the Alpha and pointer dereferencing, the
> problems arise because of failure to respect a data dependency between two
> loads.  Here I am talking about a dependency between a load and a
> subsequent store, so it isn't the same thing at all.  Failure to respect
> this kind of dependency would mean the CPU was writing a value before it
> knew what value to write (or whether to write it, or where to write it).  
> Not even the most aggressively speculative machine will do that!

http://www.tinker.ncsu.edu/techreports/vssepic.pdf

Not exactly the same thing, but certainly a very similar level of
speculative aggression!

> > Seriously, my judgement of this would depend on exactly what part of
> > the smp_mb() semantics you are claiming for the control dependency.
> > I do not believe that we could make progress without appealing to a
> > specific implementation, so I would rather ignore control dependencies,
> > at least for non-MMIO accesses.  MMIO would be another story altogether.
> 
> What I'm claiming is exactly what was written in an earlier email:
> 
> 	st(A) < st(B) >v ac(B) < ac(A)  implies  st(A) >v ac(A), and
> 
> 	ld(A) < st(B) >v ac(B) < st(A)  implies  st(A) !>v ld(A).
> 
> Here I'm using your convention for >v, and < indicates either an explicit
> barrier between two accesses or a dependency between a load and a later
> store.

Your notion of control-dependency barriers makes sense in an intuitive
sense.  Does Linux rely on it, other than for MMIO accesses?

> > > "Sequentially precedes" means that the system behaves as though there were 
> > > a memory barrier between the two accesses.
> > 
> > OK.  As noted above, if I were to interpret "a memory barrier" as really
> > being everything entailed by smp_mb(), I disagree with your statement in an
> > earlier email stating:
> > 
> > 	Similarly, in the program "if (A) B = 2;" the load(A) sequentially
> > 	precedes the store(B) -- thanks to the dependency or (if you
> > 	prefer) the absence of speculative stores.
> > 
> > However, I don't believe that is what you mean by "a memory barrier" in
> > this case -- my guess again is that you mean a special memory barrier that
> > applies only the the load of A in one direction, but that applies to
> > everything following the load in the other direction.
> 
> It applies to the load of A in one direction and to all later stores in
> the other direction.  Not to later loads.

Ah, good point -- I didn't pick up on the fact that it needn't constrain
later loads.

> > I would use ">p" for the program-order relationship, and probably something
> > like ">b" for the memory-barrier relationship.  There are other orderings,
> > including the control-flow ordering discussed earlier, data dependencies,
> > and so on.
> 
> > The literature is quite inconsistent.  The DEC Alpha manual takes your
> > approach, while Gharachorloo's dissertation takes my approach.  Not to
> > be outdone, Steinke and Nutt's JACM paper (written long after the other
> > two) uses different directions for different types of orderings!!!
> > See http://arxiv.org/PS_cache/cs/pdf/0208/0208027.pdf, page 49,
> > Definitions A.5, A.6 on the one hand and Definition A.7 on the other.  ;-)
> 
> > This is the connotation conflict.  For you, it is confusing to write
> > "A > B" when A precedes B.  For me, it is confusing to write "st < ld"
> > when data flows from the "st" to the "ld".  So, the only way to resolve
> > this is to avoid use of ">" like the plague!
> 
> Okay, let's change the notation.  I don't like <v very much.  Let's not
> worry about potential confusion with implication signs, and use
> 
> 	1:st(A) -> 2:st(A)

Would "=>" work, or does that conflict with something else?

And the number before the colon is the CPU #, right?

> to indicate that 1:st occurs earlier than 2:st in the global ordering of 
> all stores to A.  And let's use
> 
> 	3:st(B) -> 4:ld(B)
> 
> to mean that 4:ld returned the value either of 3:st or of some other store 
> to B occuring later in the global ordering of all such stores.

OK...  Though expressing your English description formally is a bit messy,
it does capture a very useful idiom.

> Lastly, let's use
> 
> 	5:ac(A) +> 6:ac(B)
> 
> to indicate either that the two accesses are separated by a memory barrier 
> or that 5:ac is a load and 6:ac is a dependent store (all occurring on the 
> same CPU).

So the number preceding the colon is the value being loaded or stored?

Either way, the symbols seem reasonable.  In a PDF, I would probably
set a symbol indicating the type of flow over a hollow arrow or something.

> > And in a cache-coherent system, there must be.  Or, more precisely,
> > there must not be different sequences of loads that indicate inconsistent
> > orderings of stores to a given single variable.  If the system can
> > prove that there are no concurrent loads during a given period of
> > time, I guess it would be within its rights to ditch cache coherence
> > for that variable during that time...
> 
> What about indirect indications of inconsistency?  See my example below.

I have some questions about that one.

> > > (BTW, can you explain the difference between "cache coherent" and "cache 
> > > consistent"?  I never quite got it straight...)
> > 
> > "Cache coherent" is the preferred term, though "cache consistent" is
> > sometimes used as a synonym.  If you want to be painfully correct, you
> > would say "cache coherent" when talking about stores to a single variable,
> > and "memory consistency model" when talking about ordering of accesses
> > to multiple variables.
> 
> Hmmm.  Then what about "DMA coherent" vs. "DMA consistent"?

No idea.  Having worked with systems where DMA did not play particularly
nicely with the cache-coherence protocol, they both sound like good things,
though.  ;-)

As near as I can tell by looking around, they are synonyms or nearly so.

> > > The analogy breaks down for pairs of stores.  If stores are blind then 
> > > they can't see other stores -- but we need them to.
> > 
> > I would instead say that you need to execute some loads in order to be
> > able to see the effects of your pairs of stores.
> 
> Consider this example:
> 
> 	CPU 0			CPU 1
> 	-----			-----
> 	A = 1;			B = 2;
> 	mb();			mb();
> 	B = 1;			X = A + 1;
> 	...
> 	assert(!(B==2 && X==1));
> 
> The assertion cannot fail.  But to prove it in our formalism requires 
> writing  st_0(B=1) -> st_1(B=2).  In other words, CPU 1's store to B sees 
> (i.e., overwrites) CPU 0's store to B.

Alternatively, we could use a notation that states that a given load gets
exactly the value from a given store, for example "st ==> ld" as opposed
to "st => ld", where there might be an intervening store.

(1)	B==2 -> st_1(B=2) ==> ld_0(B==2)

	Because there is only one store of 2 into B.

(2)	But st_0(B=1) =p> ld_0(B) -> st_0(B=1) => ld_0(B)

	Here I use "=p>" to indicate program order, and rely on the
	fact that a CPU must see its own accesses in order.

(3)	(1) and (2) imply st_0(B=1) => st_1(B=2) ==> ld_0(B==2)

	So, yes, we do end up saying something about the order of the
	stores, but only indirectly, based on other observations -- in
	this case, program order and direct value sequence.  In other
	words, we can sometimes say things about the order of stores
	even though stores are blind.

(4)	By memory-barrier implication:

	(a)	st_0(A=1) +> st_0(B=1) &&
	
	(b)	st_1(B=2) +> ld_1(A) &&
	
	(c)	st_0(B=1) => st_1(B=2)

	-> st_0(A=1) => ld_1(A)

(5)	Since there is only one store to A: st_0(A=1) ==> ld_1(A==1)

(6)	Therefore, X==2 and the assertion cannot fail if B==2.  But
	if the assertion fails, it must be true that B==2, so the
	assertion cannot fail.

Is that more or less what you had in mind?

							Thanx, Paul

  reply	other threads:[~2006-10-19 22:45 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20060911190005.GA1295@us.ibm.com>
2006-09-12 18:08 ` Uses for memory barriers Alan Stern
2006-09-12 20:23   ` Paul E. McKenney
2006-09-14 14:58     ` Alan Stern
2006-09-15  5:16       ` Paul E. McKenney
2006-09-15 19:48         ` Alan Stern
2006-09-16  4:19           ` Paul E. McKenney
2006-09-16 15:28             ` Alan Stern
2006-09-18 19:13               ` Paul E. McKenney
2006-09-18 20:13                 ` Alan Stern
2006-09-19  0:47                   ` Paul E. McKenney
2006-09-19 16:04                     ` Alan Stern
2006-09-19 16:38                       ` Nick Piggin
2006-09-19 17:40                         ` Alan Stern
2006-09-19 17:51                           ` Nick Piggin
2006-09-19 18:19                             ` Paul E. McKenney
2006-09-19 18:48                               ` Nick Piggin
2006-09-19 19:36                                 ` Paul E. McKenney
2006-09-19 19:48                                   ` Nick Piggin
2006-09-19 20:01                                     ` Paul E. McKenney
2006-09-19 20:38                                       ` Alan Stern
2006-09-21  1:43                                         ` Paul E. McKenney
2006-09-19 18:16                       ` Paul E. McKenney
2006-09-20 19:39                         ` Alan Stern
2006-09-21  1:34                           ` Paul E. McKenney
2006-09-21 20:59                             ` Alan Stern
2006-09-22  5:02                               ` Paul E. McKenney
2006-09-22 20:38                                 ` Alan Stern
2006-09-27 21:06                                 ` Alan Stern
2006-09-30  1:11                                   ` Paul E. McKenney
2006-09-30 21:01                                     ` Alan Stern
2006-10-02  0:06                                       ` Paul E. McKenney
2006-10-02 15:44                                         ` Alan Stern
2006-10-04 15:35                                           ` Paul E. McKenney
2006-10-04 18:04                                             ` Alan Stern
2006-10-13 16:51                                               ` Paul E. McKenney
2006-10-13 18:30                                                 ` Alan Stern
2006-10-13 22:39                                                   ` Paul E. McKenney
2006-10-14  2:27                                                     ` Alan Stern
2006-10-17  1:24                                                       ` Paul E. McKenney
2006-10-17 15:29                                                         ` Alan Stern
2006-10-17 17:27                                                           ` Paul E. McKenney
2006-10-17 19:42                                                             ` Alan Stern
2006-10-17 20:15                                                               ` Paul E. McKenney
2006-10-17 21:21                                                                 ` Alan Stern
2006-10-17 22:58                                                                   ` Paul E. McKenney
2006-10-18 19:05                                                                     ` Alan Stern
2006-10-18 23:01                                                                       ` Paul E. McKenney
2006-10-19 16:44                                                                         ` Alan Stern
2006-10-19 19:21                                                                           ` Paul E. McKenney
2006-10-19 20:55                                                                             ` Alan Stern
2006-10-19 22:46                                                                               ` Paul E. McKenney [this message]
2006-10-20 16:54                                                                                 ` Alan Stern
2006-10-21  0:59                                                                                   ` Paul E. McKenney
2006-10-21 19:47                                                                                     ` Alan Stern
2006-10-21 22:52                                                                                       ` Paul E. McKenney
2006-10-22  2:18                                                                                         ` Alan Stern
2006-10-23  5:32                                                                                           ` Paul E. McKenney
2006-10-23 14:07                                                                                             ` Alan Stern
2006-10-24 17:52                                                                                               ` Paul E. McKenney
     [not found] <200609082230.22225.oliver@neukum.org>
2006-09-08 21:26 ` Alan Stern
2006-09-08 21:46   ` Oliver Neukum
2006-09-08 22:25     ` Alan Stern
2006-09-08 22:49       ` Oliver Neukum
2006-09-09  2:25         ` Alan Stern
2006-09-11 16:21           ` Paul E. McKenney
2006-09-11 16:50             ` Alan Stern
2006-09-11 17:23               ` Segher Boessenkool
2006-09-11 19:04                 ` Paul E. McKenney
2006-09-11 19:03               ` Paul E. McKenney
2006-09-11 17:21             ` Segher Boessenkool
2006-09-11 19:48             ` Oliver Neukum
2006-09-11 20:29               ` Paul E. McKenney
2006-09-12  9:01             ` David Howells
2006-09-12 10:22               ` Oliver Neukum
2006-09-12 14:55                 ` Paul E. McKenney
2006-09-12 15:07                   ` Oliver Neukum
2006-09-12 16:12                     ` Paul E. McKenney
2006-09-12 17:50                     ` Segher Boessenkool
2006-09-12 14:42               ` Paul E. McKenney
2006-09-12  8:57           ` David Howells
     [not found] <200609081929.33027.oliver@neukum.org>
2006-09-08 18:06 ` Alan Stern
2006-09-08 18:22   ` Oliver Neukum
2006-09-07 21:25 Alan Stern
2006-09-07 22:10 ` linux-os (Dick Johnson)
2006-09-08 18:39   ` Alan Stern
2006-09-08  0:14 ` Paul E. McKenney
2006-09-08 15:55   ` Alan Stern
2006-09-08 18:57     ` Paul E. McKenney
2006-09-08 21:23       ` Alan Stern
2006-09-09  0:44         ` Paul E. McKenney
2006-09-11 16:05           ` Alan Stern
2006-09-08  5:52 ` David Schwartz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20061019224646.GC2265@us.ibm.com \
    --to=paulmck@us.ibm.com \
    --cc=dhowells@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.