* Does Itanium permit speculative stores?
From: Paul E. McKenney @ 2013-11-11 17:13 UTC
To: tony.luck; +Cc: peterz, linux-kernel

Hello, Tony,

Does Itanium permit speculative stores?  For example, on Itanium what are
the permitted outcomes of the following litmus test, where both x and y
are initially zero?

        CPU 0                           CPU 1

        r1 = ACCESS_ONCE(x);            r2 = ACCESS_ONCE(y);
        if (r1)                         if (r2)
                ACCESS_ONCE(y) = 1;             ACCESS_ONCE(x) = 1;

In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
given this litmus test?

                                                        Thanx, Paul

* RE: Does Itanium permit speculative stores?
From: Luck, Tony @ 2013-11-12 18:00 UTC
To: paulmck@linux.vnet.ibm.com
Cc: peterz@infradead.org, linux-kernel@vger.kernel.org

> Does Itanium permit speculative stores?  For example, on Itanium what are
> the permitted outcomes of the following litmus test, where both x and y
> are initially zero?

We have a compiler visible speculative read via the "ld.s" and "chk" instructions. But
there is no speculative write ("st.s") instruction. I think you are asking "can out of order
writes become visible in this scenario?"

        CPU 0                           CPU 1

        r1 = ACCESS_ONCE(x);            r2 = ACCESS_ONCE(y);
        if (r1)                         if (r2)
                ACCESS_ONCE(y) = 1;             ACCESS_ONCE(x) = 1;

> In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> given this litmus test?

The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
you should be fine.

-Tony
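
For readers unfamiliar with the macro Tony refers to, here is a minimal userspace sketch of the volatile-cast idiom. The macro body follows the kernel's ACCESS_ONCE() of that era, spelled with GCC's `__typeof__`; `shared_flag` and `set_and_read()` are hypothetical names used only for illustration. The "ld.acq"/"st.rel" code generation described above is specific to gcc on ia64; on other architectures the cast constrains only the compiler, not the hardware.

```c
#include <assert.h>

/*
 * Volatile-cast idiom: the compiler must perform the access as written,
 * without caching, refetching, or eliding it.
 * __typeof__ is a GNU C extension (the kernel spells it typeof).
 */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int shared_flag;	/* hypothetical shared variable */

/* Store a value through the volatile cast, then read it back once. */
static int set_and_read(int v)
{
	ACCESS_ONCE(shared_flag) = v;
	return ACCESS_ONCE(shared_flag);
}
```

Calling set_and_read(1) performs one volatile store followed by one volatile load of shared_flag.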

* Re: Does Itanium permit speculative stores?
From: Peter Zijlstra @ 2013-11-12 18:26 UTC
To: Luck, Tony; +Cc: paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org

On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> > Does Itanium permit speculative stores?  For example, on Itanium what are
> > the permitted outcomes of the following litmus test, where both x and y
> > are initially zero?
>
> We have a compiler visible speculative read via the "ld.s" and "chk" instructions. But
> there is no speculative write ("st.s") instruction. I think you are asking "can out of order
> writes become visible in this scenario?"
>
>         CPU 0                           CPU 1
>
>         r1 = ACCESS_ONCE(x);            r2 = ACCESS_ONCE(y);
>         if (r1)                         if (r2)
>                 ACCESS_ONCE(y) = 1;             ACCESS_ONCE(x) = 1;
>
> > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> > given this litmus test?
>
> The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> you should be fine.

Cute that volatile generates barrier instructions.

But no; I think Paul accidentally formulated his question in C (since we
all speak C) but meant to ask an architectural question.

So the point we're having a discussion on is if any architecture has
visible speculative STORES and if there's an architecture that doesn't
have control dependencies.

On the visible speculative STORES; can, if in the above example we have
regular loads/stores:

        LOAD r1, x              LOAD r2, y
        IF (r1)                 IF (r2)
                STORE y, 1              STORE x, 1

we observe: r1==1 && r2==1

In order for that to be true; we must be able to observe the stores
before the loads are complete -- and therefore before the branches are a
certainty.

Typically if an architecture speculates on branches the result doesn't
become visible/committed until the branch is a certainty -- i.e. linear
branch history.

Alternatively:

        x:=0

        IF (cond)               LOAD r1,x
                STORE x,1
        STORE x,2

Can r1 ever be 1 if we know 'cond' will never be true (runtime
constraint, not compile time so the branch cannot be omitted)?
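
The argument above (that r1==1 && r2==1 requires a store to become visible before its own branch is resolved) can be checked with a small brute-force enumeration. The following is only a toy sketch, not a model of any real architecture: it assumes a single global order of the four memory events, and the names `replay_hits()` and `count_both_one()` are hypothetical. A store placed before its CPU's load is treated as speculative and is accepted only if that load later returns nonzero.

```c
#include <assert.h>
#include <stdbool.h>

/* The four memory events of the litmus test. */
enum event { LOAD0, STORE0, LOAD1, STORE1 };

/*
 * Replay one global ordering of the events.  Returns true if it is a
 * self-consistent execution that ends with r1 == 1 && r2 == 1.
 */
static bool replay_hits(const int order[4])
{
	int x = 0, y = 0;                       /* both initially zero */
	int r[2] = { 0, 0 };                    /* r[0] is r1, r[1] is r2 */
	bool loaded[2] = { false, false };
	bool speculated[2] = { false, false };

	for (int i = 0; i < 4; i++) {
		int cpu = (order[i] == LOAD0 || order[i] == STORE0) ? 0 : 1;
		int *src = cpu ? &y : &x;       /* variable this CPU loads */
		int *dst = cpu ? &x : &y;       /* variable this CPU stores */

		if (order[i] == LOAD0 || order[i] == LOAD1) {
			r[cpu] = *src;
			loaded[cpu] = true;
		} else if (!loaded[cpu]) {
			/* Store made visible before its branch is resolved. */
			speculated[cpu] = true;
			*dst = 1;
		} else if (r[cpu]) {
			/* Ordinary store: branch already known to be taken. */
			*dst = 1;
		}
	}

	/* A speculated store is legitimate only if its branch ends up taken. */
	for (int cpu = 0; cpu < 2; cpu++)
		if (speculated[cpu] && !r[cpu])
			return false;

	return r[0] == 1 && r[1] == 1;
}

/* Count the orderings that produce r1 == 1 && r2 == 1. */
static int count_both_one(bool allow_speculative_stores)
{
	int hits = 0;

	/* Each 8-bit code packs four 2-bit event numbers; keep permutations. */
	for (int code = 0; code < 256; code++) {
		int order[4], pos[4] = { -1, -1, -1, -1 };
		bool is_perm = true;

		for (int i = 0; i < 4; i++) {
			order[i] = (code >> (2 * i)) & 3;
			if (pos[order[i]] != -1)
				is_perm = false;
			pos[order[i]] = i;
		}
		if (!is_perm)
			continue;
		/* Without speculation, each store must follow its own load. */
		if (!allow_speculative_stores &&
		    (pos[STORE0] < pos[LOAD0] || pos[STORE1] < pos[LOAD1]))
			continue;
		if (replay_hits(order))
			hits++;
	}
	return hits;
}
```

In this toy model count_both_one(false) returns 0, because the first event is always a load that reads zero, while count_both_one(true) is nonzero: for instance STORE0, LOAD1, STORE1, LOAD0 yields r1==1 && r2==1.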

* RE: Does Itanium permit speculative stores?
From: Luck, Tony @ 2013-11-12 18:46 UTC
To: Peter Zijlstra; +Cc: paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org

> So the point we're having a discussion on is if any architecture has
> visible speculative STORES and if there's an architecture that doesn't
> have control dependencies.
>
> On the visible speculative STORES; can, if in the above example we have
> regular loads/stores:
>
>         LOAD r1, x              LOAD r2, y
>         IF (r1)                 IF (r2)
>                 STORE y, 1              STORE x, 1
>
> we observe: r1==1 && r2==1
>
> In order for that to be true; we must be able to observe the stores
> before the loads are complete -- and therefore before the branches are a
> certainty.

Even without the ".acq" and ".rel" this code is still safe.

Quoting ia64 SDM vol 1, Section 4.4.7 "Memory Access Ordering":
"In addition, memory writes and flushes must observe control dependencies"

which I take to mean that the STORE can't be visible until we are certain of
the outcome of the conditional.

-Tony
* Re: Does Itanium permit speculative stores? 2013-11-12 18:46 ` Luck, Tony @ 2013-11-12 18:49 ` Peter Zijlstra 2013-11-12 21:29 ` Paul E. McKenney 1 sibling, 0 replies; 11+ messages in thread From: Peter Zijlstra @ 2013-11-12 18:49 UTC (permalink / raw) To: Luck, Tony; +Cc: paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org On Tue, Nov 12, 2013 at 06:46:20PM +0000, Luck, Tony wrote: > > So the point we're having a discussion on is if any architecture has > > visible speculative STORES and if there's an architecture that doesn't > > have control dependencies. > > > > On the visible speculative STORES; can, if in the above example we have > > regular loads/stores: > > > > LOAD r1, x LOAD r2, y > > IF (r1) IF (r2) > > STORE y, 1 STORE x, 1 > > > > we observe: r1==1 && r2==1 > > > > In order for that to be true; we must be able to observe the stores > > before the loads are complete -- and therefore before the branches are a > > certainty. > > Even without the ".acq" and ".rel" this code is still safe. > > Quoting ia64 SDM vol 1, Section 4.4.7 Memory Access Ordering" > "In addition, memory writes and flushes must observe control dependencies" > > which I take to mean that the STORE can't be visible until we are certain of > the outcome of the conditional. Awesome! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Does Itanium permit speculative stores? 2013-11-12 18:46 ` Luck, Tony 2013-11-12 18:49 ` Peter Zijlstra @ 2013-11-12 21:29 ` Paul E. McKenney 1 sibling, 0 replies; 11+ messages in thread From: Paul E. McKenney @ 2013-11-12 21:29 UTC (permalink / raw) To: Luck, Tony; +Cc: Peter Zijlstra, linux-kernel@vger.kernel.org On Tue, Nov 12, 2013 at 06:46:20PM +0000, Luck, Tony wrote: > > So the point we're having a discussion on is if any architecture has > > visible speculative STORES and if there's an architecture that doesn't > > have control dependencies. > > > > On the visible speculative STORES; can, if in the above example we have > > regular loads/stores: > > > > LOAD r1, x LOAD r2, y > > IF (r1) IF (r2) > > STORE y, 1 STORE x, 1 > > > > we observe: r1==1 && r2==1 > > > > In order for that to be true; we must be able to observe the stores > > before the loads are complete -- and therefore before the branches are a > > certainty. > > Even without the ".acq" and ".rel" this code is still safe. > > Quoting ia64 SDM vol 1, Section 4.4.7 Memory Access Ordering" > "In addition, memory writes and flushes must observe control dependencies" > > which I take to mean that the STORE can't be visible until we are certain of > the outcome of the conditional. Even better! Thanx, Paul ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Does Itanium permit speculative stores? 2013-11-12 18:26 ` Peter Zijlstra 2013-11-12 18:46 ` Luck, Tony @ 2013-11-12 21:30 ` Paul E. McKenney 1 sibling, 0 replies; 11+ messages in thread From: Paul E. McKenney @ 2013-11-12 21:30 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Luck, Tony, linux-kernel@vger.kernel.org On Tue, Nov 12, 2013 at 07:26:17PM +0100, Peter Zijlstra wrote: > On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote: > > > Does Itanium permit speculative stores? For example, on Itanium what are > > > the permitted outcomes of the following litmus test, where both x and y > > > are initially zero? > > > > We have a complier visible speculative read via the "ld.s" and "chk" instructions. But > > there is no speculative write ("st.s") instruction. I think you are asking "can out of order > > writes become visible in this scenario?" > > > > CPU 0 CPU 1 > > > > r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); > > if (r1) if (r2) > > ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; > > > > > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium > > > given this litmus test? > > > > The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate > > ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think > > you should be fine. > > Cute that volatile generates barrier instructions. > > But no; I think Paul accidentally formulated his question in C (since we > all speak C) but meant to ask an architectural question. I got both answers, so I am good. ;-) > So the point we're having a discussion on is if any architecture has > visible speculative STORES and if there's an architecture that doesn't > have control dependencies. 
> > On the visible speculative STORES; can, if in the above example we have > regular loads/stores: > > LOAD r1, x LOAD r2, y > IF (r1) IF (r2) > STORE y, 1 STORE x, 1 > > we observe: r1==1 && r2==1 > > In order for that to be true; we must be able to observe the stores > before the loads are complete -- and therefore before the branches are a > certainty. > > Typically if an architecture speculates on branches the result doesn't > become visible/committed until the branch is a certainty -- ie. linear > branch history. > > Alternatively: > > x:=0 > > IF (cond) LOAD r1,x > STORE x,1 > STORE x,2 > > Can r1 ever be 1 if we know 'cond' will never be true (runtime > constraint, not compile time so the branch cannot be omitted). I would have been OK mandating use of ACCESS_ONCE() to prevent speculative stores, but it is even nicer that it is not necessary. Thanx, Paul ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Does Itanium permit speculative stores? 2013-11-12 18:00 ` Luck, Tony 2013-11-12 18:26 ` Peter Zijlstra @ 2013-11-12 18:31 ` Paul E. McKenney 2013-11-12 18:34 ` Peter Zijlstra 2 siblings, 0 replies; 11+ messages in thread From: Paul E. McKenney @ 2013-11-12 18:31 UTC (permalink / raw) To: Luck, Tony; +Cc: peterz@infradead.org, linux-kernel@vger.kernel.org On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote: > > Does Itanium permit speculative stores? For example, on Itanium what are > > the permitted outcomes of the following litmus test, where both x and y > > are initially zero? > > We have a complier visible speculative read via the "ld.s" and "chk" instructions. But > there is no speculative write ("st.s") instruction. I think you are asking "can out of order > writes become visible in this scenario?" > > CPU 0 CPU 1 > > r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); > if (r1) if (r2) > ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; > > > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium > > given this litmus test? > > The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate > ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think > you should be fine. Excellent!!! Thank you for the information! If I understand you correctly, this underscores the importance of using ACCESS_ONCE() -- if you omit them in the above scenario, perhaps you can see out-of-order stores becoming visible in this scenario? Also, this resolves our earlier IRC discussion about Itanium's lack of read-read cache coherence. If you use ACCESS_ONCE properly, then on Itanium the reads will become ld.acq instructions, ensuring the expected cache coherence. Very nice! Thanx, Paul ^ permalink raw reply [flat|nested] 11+ messages in thread

* Re: Does Itanium permit speculative stores?
From: Peter Zijlstra @ 2013-11-12 18:34 UTC
To: Luck, Tony; +Cc: paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org

On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> you should be fine.

Hurm.. so:

+#define smp_store_release(p, v)                                        \
+do {                                                                   \
+       compiletime_assert_atomic_type(*p);                             \
+       switch (sizeof(*p)) {                                           \
+       case 4:                                                         \
+               asm volatile ("st4.rel [%0]=%1"                         \
+                               : "=r" (p) : "r" (v) : "memory");       \
+               break;                                                  \
+       case 8:                                                         \
+               asm volatile ("st8.rel [%0]=%1"                         \
+                               : "=r" (p) : "r" (v) : "memory");       \
+               break;                                                  \
+       }                                                               \
+} while (0)
+
+#define smp_load_acquire(p)                                            \
+({                                                                     \
+       typeof(*p) ___p1;                                               \
+       compiletime_assert_atomic_type(*p);                             \
+       switch (sizeof(*p)) {                                           \
+       case 4:                                                         \
+               asm volatile ("ld4.acq %0=[%1]"                         \
+                               : "=r" (___p1) : "r" (p) : "memory");   \
+               break;                                                  \
+       case 8:                                                         \
+               asm volatile ("ld8.acq %0=[%1]"                         \
+                               : "=r" (___p1) : "r" (p) : "memory");   \
+               break;                                                  \
+       }                                                               \
+       ___p1;                                                          \
+})

That all can be written as:

+#define smp_store_release(p, v)                                        \
+do {                                                                   \
+       compiletime_assert_atomic_type(*p);                             \
+       ACCESS_ONCE(*p) = (v);                                          \
+} while (0)
+
+#define smp_load_acquire(p)                                            \
+({                                                                     \
+       typeof(*p) ___p1 = ACCESS_ONCE(*p);                             \
+       compiletime_assert_atomic_type(*p);                             \
+       ___p1;                                                          \
+})

On ia64? Totally much simpler!
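
For comparison, the ordering that a store-release/load-acquire pair is meant to provide can be sketched portably in userspace with C11 atomics. This is an analogy under stated assumptions, not the kernel implementation and not ia64-specific: `payload`, `ready`, `publish()`, and `consume()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;        /* plain data, published via the flag */
static atomic_int ready;   /* written with release, read with acquire */

/* Writer: fill in the data, then release-store the flag. */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Reader: acquire-load the flag.  If it is set, the writer's earlier
 * store to payload is guaranteed to be visible.  Returns -1 if the
 * data has not been published yet.
 */
static int consume(void)
{
	if (atomic_load_explicit(&ready, memory_order_acquire))
		return payload;
	return -1;
}
```

The guarantee matters when publish() and consume() run on different threads; a reader that observes the flag set cannot then see a stale payload.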
* Re: Does Itanium permit speculative stores? 2013-11-11 17:13 Does Itanium permit speculative stores? Paul E. McKenney 2013-11-12 18:00 ` Luck, Tony @ 2013-11-27 4:55 ` Jon Masters 2013-11-27 17:19 ` Paul E. McKenney 1 sibling, 1 reply; 11+ messages in thread From: Jon Masters @ 2013-11-27 4:55 UTC (permalink / raw) To: paulmck; +Cc: tony.luck, peterz, linux-kernel On 11/11/2013 12:13 PM, Paul E. McKenney wrote: > Hello, Tony, > > Does Itanium permit speculative stores? For example, on Itanium what are > the permitted outcomes of the following litmus test, where both x and y > are initially zero? > > CPU 0 CPU 1 > > r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); > if (r1) if (r2) > ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; > > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium > given this litmus test? > > Thanx, Paul > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > Btw, I was reading through some UEFI docs and noticed a reference to "A Formal Specification of Intel Itanium Processor Family Memory Ordering", then remembered this thread. In case it's of use: http://www.intel.com/design/itanium/downloads/251429.htm Jon. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Does Itanium permit speculative stores? 2013-11-27 4:55 ` Jon Masters @ 2013-11-27 17:19 ` Paul E. McKenney 0 siblings, 0 replies; 11+ messages in thread From: Paul E. McKenney @ 2013-11-27 17:19 UTC (permalink / raw) To: Jon Masters; +Cc: tony.luck, peterz, linux-kernel On Tue, Nov 26, 2013 at 11:55:58PM -0500, Jon Masters wrote: > On 11/11/2013 12:13 PM, Paul E. McKenney wrote: > > Hello, Tony, > > > > Does Itanium permit speculative stores? For example, on Itanium what are > > the permitted outcomes of the following litmus test, where both x and y > > are initially zero? > > > > CPU 0 CPU 1 > > > > r1 = ACCESS_ONCE(x); r2 = ACCESS_ONCE(y); > > if (r1) if (r2) > > ACCESS_ONCE(y) = 1; ACCESS_ONCE(x) = 1; > > > > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium > > given this litmus test? > > > > Thanx, Paul > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > > Btw, I was reading through some UEFI docs and noticed a reference to "A > Formal Specification of Intel Itanium Processor Family Memory Ordering", > then remembered this thread. In case it's of use: > > http://www.intel.com/design/itanium/downloads/251429.htm I have seen this, but there have been too many times when I have fooled myself about what the words mean (with DEC Alpha back in the late 90s being the most impressive example). So while I do learn what I can from them, they are unfortunately not a substitute for asking. ;-) Besides, some of the Itanium locking code uses instructions that the above manual is silent about. Thanx, Paul ^ permalink raw reply [flat|nested] 11+ messages in thread