Re: Intel Memory Ordering White Paper

public inbox for linux-kernel@vger.kernel.org

* Re: Intel Memory Ordering White Paper
  2007-09-08 17:48       ` Nick Piggin
@ 2007-09-07 18:13         ` Nick Piggin
  2007-09-08  8:53           ` Andi Kleen
  2007-09-08 11:34         ` dean gaudet
  1 sibling, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2007-09-07 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: ak, Jesse Barnes, linux-kernel

On Sunday 09 September 2007 03:48, Nick Piggin wrote:

> There is some suggestion in the source code that non-temporal stores
> (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> when operating on wb memory. What's the situation there?

Sorry, it looks from the AMD document like nontemporal stores to wb
memory can go out of order. It is a bit hard to decipher what the types
mean.

If this is the case, we can either retain the sfence in smp_wmb(), or noop
it, and put explicit sfences around any place that performs nontemporal
stores...

Anyway, the lfence should be able to go away without so much trouble.
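
A minimal sketch of the second option above (no-op smp_wmb(), with an
explicit sfence wherever nontemporal stores are used). It uses the SSE2
intrinsic _mm_stream_si32 (movnti) rather than the movntq mentioned in
the thread, and falls back to ordinary stores when SSE2 is unavailable.
The function name nt_copy_ints is hypothetical and not taken from the
kernel.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si32, _mm_sfence */
#endif

/* Copy n ints with non-temporal stores, then fence once at the end.
 * Hypothetical helper for illustration only. */
static inline void nt_copy_ints(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    for (size_t i = 0; i < n; i++)
        _mm_stream_si32(&dst[i], src[i]);  /* movnti: weakly ordered store */
    _mm_sfence();  /* order the NT stores before any later store */
#else
    for (size_t i = 0; i < n; i++)         /* fallback: ordinary stores */
        dst[i] = src[i];
#endif
}
```

A single fence after the whole copy keeps the cost per bulk operation
small, which is the trade-off discussed later in the thread.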


* Re: Intel Memory Ordering White Paper
  2007-09-08  8:53           ` Andi Kleen
@ 2007-09-07 19:57             ` Nick Piggin
  2007-09-08 10:19               ` Andi Kleen
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2007-09-07 19:57 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, Jesse Barnes, linux-kernel

On Saturday 08 September 2007 18:53, Andi Kleen wrote:
> On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> > On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> > > There is some suggestion in the source code that non-temporal stores
> > > (movntq) are weakly ordered. But AFAIKS from the documents, it is
> > > ordered when operating on wb memory. What's the situation there?
> >
> > Sorry, it looks from the AMD document like nontemporal stores to wb
> > memory can go out of order.
>
> Yes, that is how NT stores are defined.
>
> > If this is the case, we can either retain the sfence in smp_wmb(), or
> > noop it, and put explicit sfences around any place that performs
> > nontemporal stores...
>
> We do this already, but in most cases it doesn't matter anyways. We AFAIK
> do not rely on any ordering for copy_*_user for example. There are not
> that many users of nt so it's not a huge issue.

OK, but we just don't want to be making lots of little exceptions. For
bulk copies, I don't see it being a big issue to always sfence around
them (it would be a relatively minor cost).


> > Anyway, the lfence should be able to go away without so much trouble.
>
> You mean sfence? lfence in rmb is definitely needed.

I mean lfence in smp_rmb().


> sfence on x86-64 is not strictly needed, but also shouldn't hurt very much
> so I always kept it in.
>
> -Andi


* Re: Intel Memory Ordering White Paper
  2007-09-08 10:19               ` Andi Kleen
@ 2007-09-07 20:32                 ` Nick Piggin
  2007-09-08 20:37                   ` H. Peter Anvin
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2007-09-07 20:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, Jesse Barnes, linux-kernel

On Saturday 08 September 2007 20:19, Andi Kleen wrote:
> On Friday 07 September 2007 21:57:35 Nick Piggin wrote:
> > > > Anyway, the lfence should be able to go away without so much trouble.
> > >
> > > You mean sfence? lfence in rmb is definitely needed.
> >
> > I mean lfence in smp_rmb().
>
> One point of rmb is to stop speculative loads and I don't think we
> can get that without lfence.

smp_rmb() should not need to do anything because loads are done
in order anyway. Both AMD and Intel have committed to this now.

The important point is that they *appear* to be done in order. AFAIK,
the CPUs can still do speculative and out of order loads, but throw
out the results if they could be wrong.


* Re: Intel Memory Ordering White Paper
  2007-09-08 10:30   ` Alan Cox
@ 2007-09-07 20:46     ` Nick Piggin
  0 siblings, 0 replies; 22+ messages in thread
From: Nick Piggin @ 2007-09-07 20:46 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jesse Barnes, linux-kernel

On Saturday 08 September 2007 20:30, Alan Cox wrote:
> On Sat, 8 Sep 2007 18:54:57 +1000
>
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > > FYI, we just released a new white paper describing memory ordering for
> > > Intel processors:
> > > http://developer.intel.com/products/processor/manuals/index.htm
> > >
> > > Should help answer some questions about some of the ordering primitives
> > > we use on i386 and x86_64.
> >
> > So, can we finally noop smp_rmb and smp_wmb on x86?
>
> Nakked-by: Alan Cox <alan@redhat.com>
>
> You can only no-op it on 64bit Intel processors. On 32bit it needs to be
> conditional on your processor family (or back compat for it), as
> the Pentium Pro has some serious store ordering errata (hence the way it
> needs lock decb for spin_unlock).

We already noop smp_wmb on i386 even when CONFIG_X86_PPRO_FENCE is set.

I'm not sure if either erratum can be solved completely by adding lock ops
in barrier instructions anyway: they both seem to involve situations where
there is just a single problematic cacheline in question.


* Re: Intel Memory Ordering White Paper
  2007-09-08 10:29 ` Alan Cox
@ 2007-09-07 20:49   ` Nick Piggin
  2007-09-08 14:11     ` Alan Cox
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2007-09-07 20:49 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jesse Barnes, linux-kernel

On Saturday 08 September 2007 20:29, Alan Cox wrote:
> On Fri, 7 Sep 2007 15:26:50 -0700
>
> Jesse Barnes <jesse.barnes@intel.com> wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
>
> Nice - but it appears to be 64bit only - and indeed it appears to be
> untrue for real 32bit because of the Pentium Pro fencing errata.

As I said, we're not doing anything special in barriers for the ppro errata
today anyway.


> The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
> which have exactly the same behaviour (the IDT Winchip as we run it
> profoundly differs)

AMD processors guarantee loads are ordered and stores are ordered
(with exceptions of non-temporal, and non-wb policy).

As for the others that do out of order stores, are any of them SMP?


* Intel Memory Ordering White Paper
@ 2007-09-07 22:26 Jesse Barnes
  2007-09-08  8:54 ` Nick Piggin
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Jesse Barnes @ 2007-09-07 22:26 UTC (permalink / raw)
  To: linux-kernel

FYI, we just released a new white paper describing memory ordering for 
Intel processors:
http://developer.intel.com/products/processor/manuals/index.htm

Should help answer some questions about some of the ordering primitives 
we use on i386 and x86_64.

Jesse


* Re: Intel Memory Ordering White Paper
  2007-09-08  8:54 ` Nick Piggin
@ 2007-09-07 23:20   ` Linus Torvalds
  2007-09-08 17:34     ` Nick Piggin
  2007-09-08 10:30   ` Alan Cox
  1 sibling, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2007-09-07 23:20 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Jesse Barnes, linux-kernel



On Sat, 8 Sep 2007, Nick Piggin wrote:
> 
> So, can we finally noop smp_rmb and smp_wmb on x86?

Did AMD already release their version? If so, we should probably add a 
commit that does that in somewhat early 2.6.24 rc, and add the pointers to 
the whitepapers in the commit message.

		Linus


* Re: Intel Memory Ordering White Paper
  2007-09-07 18:13         ` Nick Piggin
@ 2007-09-08  8:53           ` Andi Kleen
  2007-09-07 19:57             ` Nick Piggin
  0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2007-09-08  8:53 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Linus Torvalds, Jesse Barnes, linux-kernel

On Friday 07 September 2007 20:13:12 Nick Piggin wrote:
> On Sunday 09 September 2007 03:48, Nick Piggin wrote:
> 
> > There is some suggestion in the source code that non-temporal stores
> > (movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
> > when operating on wb memory. What's the situation there?
> 
> Sorry, it looks from the AMD document like nontemporal stores to wb
> memory can go out of order.

Yes, that is how NT stores are defined.
 
> If this is the case, we can either retain the sfence in smp_wmb(), or noop
> it, and put explicit sfences around any place that performs nontemporal
> stores...

We do this already, but in most cases it doesn't matter anyways. We AFAIK
do not rely on any ordering for copy_*_user for example. There are not
that many users of nt so it's not a huge issue.

> 
> Anyway, the lfence should be able to go away without so much trouble.

You mean sfence? lfence in rmb is definitely needed.

sfence on x86-64 is not strictly needed, but also shouldn't hurt very much 
so I always kept it in.

-Andi


* Re: Intel Memory Ordering White Paper
  2007-09-07 22:26 Intel Memory Ordering White Paper Jesse Barnes
@ 2007-09-08  8:54 ` Nick Piggin
  2007-09-07 23:20   ` Linus Torvalds
  2007-09-08 10:30   ` Alan Cox
  2007-09-08 10:29 ` Alan Cox
  2007-09-12 18:26 ` Dr. David Alan Gilbert
  2 siblings, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2007-09-08  8:54 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 374 bytes --]

On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> FYI, we just released a new white paper describing memory ordering for
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
>
> Should help answer some questions about some of the ordering primitives
> we use on i386 and x86_64.

So, can we finally noop smp_rmb and smp_wmb on x86?

[-- Attachment #2: x86-barrier-opt.patch --]
[-- Type: text/x-diff, Size: 1011 bytes --]

Index: linux-2.6/include/asm-i386/system.h
===================================================================
--- linux-2.6.orig/include/asm-i386/system.h
+++ linux-2.6/include/asm-i386/system.h
@@ -286,7 +286,7 @@ static inline unsigned long get_limit(un
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
-#define smp_rmb()	rmb()
+#define smp_rmb()	barrier()
 #define smp_wmb()	wmb()
 #define smp_read_barrier_depends()	read_barrier_depends()
 #define set_mb(var, value) do { (void) xchg(&var, value); } while (0)
Index: linux-2.6/include/asm-x86_64/system.h
===================================================================
--- linux-2.6.orig/include/asm-x86_64/system.h
+++ linux-2.6/include/asm-x86_64/system.h
@@ -141,8 +141,8 @@ static inline void write_cr8(unsigned lo
 
 #ifdef CONFIG_SMP
 #define smp_mb()	mb()
-#define smp_rmb()	rmb()
-#define smp_wmb()	wmb()
+#define smp_rmb()	barrier()
+#define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	do {} while(0)
 #else
 #define smp_mb()	barrier()
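
For reference, a minimal sketch of what the patch relies on: barrier()
is a compiler-only barrier that emits no instruction, so the CPU's own
load-load and store-store ordering has to do the rest. The sketch_*
macros and the msg_* helpers are hypothetical names, and the pattern is
only valid under the x86 ordering assumptions discussed in this thread
(ordinary stores to wb memory, no nontemporal or string stores).

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: emits no instruction (GCC/Clang syntax). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for the patched macros; x86-only by assumption. */
#define sketch_smp_rmb() barrier()
#define sketch_smp_wmb() barrier()

struct msg {
    volatile int data;
    volatile int ready;
};

/* Writer: store the payload, then set the flag. */
static inline void msg_publish(struct msg *m, int value)
{
    m->data = value;
    sketch_smp_wmb();  /* stop the compiler reordering the two stores */
    m->ready = 1;
}

/* Reader: returns 1 and fills *out if a message is available, else 0. */
static inline int msg_try_consume(struct msg *m, int *out)
{
    assert(out != NULL);
    if (!m->ready)
        return 0;
    sketch_smp_rmb();  /* stop the compiler hoisting the data load */
    *out = m->data;
    return 1;
}
```

On an architecture with weaker ordering the two sketch_* macros would
need real fence instructions; that difference is exactly what the patch
removes for x86.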


* Re: Intel Memory Ordering White Paper
  2007-09-07 19:57             ` Nick Piggin
@ 2007-09-08 10:19               ` Andi Kleen
  2007-09-07 20:32                 ` Nick Piggin
  0 siblings, 1 reply; 22+ messages in thread
From: Andi Kleen @ 2007-09-08 10:19 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Linus Torvalds, Jesse Barnes, linux-kernel

On Friday 07 September 2007 21:57:35 Nick Piggin wrote:

> 
> > > Anyway, the lfence should be able to go away without so much trouble.
> >
> > You mean sfence? lfence in rmb is definitely needed.
> 
> I mean lfence in smp_rmb().

One point of rmb is to stop speculative loads and I don't think we 
can get that without lfence.

-Andi



* Re: Intel Memory Ordering White Paper
  2007-09-07 22:26 Intel Memory Ordering White Paper Jesse Barnes
  2007-09-08  8:54 ` Nick Piggin
@ 2007-09-08 10:29 ` Alan Cox
  2007-09-07 20:49   ` Nick Piggin
  2007-09-12 18:26 ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 22+ messages in thread
From: Alan Cox @ 2007-09-08 10:29 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: linux-kernel

On Fri, 7 Sep 2007 15:26:50 -0700
Jesse Barnes <jesse.barnes@intel.com> wrote:

> FYI, we just released a new white paper describing memory ordering for 
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
> 
> Should help answer some questions about some of the ordering primitives 
> we use on i386 and x86_64.

Nice - but it appears to be 64bit only - and indeed it appears to be
untrue for real 32bit because of the Pentium Pro fencing errata.

The kernel also runs on IDT Winchip, Cyrix and AMD processors not all of
which have exactly the same behaviour (the IDT Winchip as we run it
profoundly differs)

Alan


* Re: Intel Memory Ordering White Paper
  2007-09-08  8:54 ` Nick Piggin
  2007-09-07 23:20   ` Linus Torvalds
@ 2007-09-08 10:30   ` Alan Cox
  2007-09-07 20:46     ` Nick Piggin
  1 sibling, 1 reply; 22+ messages in thread
From: Alan Cox @ 2007-09-08 10:30 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Jesse Barnes, linux-kernel

On Sat, 8 Sep 2007 18:54:57 +1000
Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> On Saturday 08 September 2007 08:26, Jesse Barnes wrote:
> > FYI, we just released a new white paper describing memory ordering for
> > Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering primitives
> > we use on i386 and x86_64.
> 
> So, can we finally noop smp_rmb and smp_wmb on x86?

Nakked-by: Alan Cox <alan@redhat.com>

You can only no-op it on 64bit Intel processors. On 32bit it needs to be
conditional on your processor family (or back compat for it), as
the Pentium Pro has some serious store ordering errata (hence the way it
needs lock decb for spin_unlock).

Alan
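
A minimal sketch of the kind of conditional unlock described above,
written with C11 atomics instead of kernel inline assembly. The switch
SKETCH_NEEDS_LOCKED_UNLOCK and the sketch_lock_* names are hypothetical
and are not kernel options or kernel functions; the point is only that
the release path is a plain store on processors with reliable store
ordering and a locked read-modify-write otherwise.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical build switch for this sketch only; not a kernel option. */
#ifndef SKETCH_NEEDS_LOCKED_UNLOCK
#define SKETCH_NEEDS_LOCKED_UNLOCK 0
#endif

typedef struct {
    atomic_int locked;  /* 0 = free, 1 = held */
} sketch_lock;

/* Try once to take the lock; returns 1 on success, 0 if already held.
 * The exchange is a locked read-modify-write on x86. */
static inline int sketch_lock_try(sketch_lock *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static inline void sketch_lock_release(sketch_lock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
#if SKETCH_NEEDS_LOCKED_UNLOCK
    /* Locked operation: for CPUs whose plain stores may be misordered. */
    (void)atomic_exchange_explicit(&l->locked, 0, memory_order_seq_cst);
#else
    /* Plain store: sufficient when stores are not reordered. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
#endif
}
```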


* Re: Intel Memory Ordering White Paper
  2007-09-08 17:48       ` Nick Piggin
  2007-09-07 18:13         ` Nick Piggin
@ 2007-09-08 11:34         ` dean gaudet
  2007-09-08 12:08           ` Petr Vandrovec
  1 sibling, 1 reply; 22+ messages in thread
From: dean gaudet @ 2007-09-08 11:34 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Linus Torvalds, ak, Jesse Barnes, linux-kernel

On Sun, 9 Sep 2007, Nick Piggin wrote:

> I've also heard that string operations do not follow the normal ordering, but
> that's just with respect to individual loads/stores in the one operation, I
> hope? And they will still follow ordering rules WRT surrounding loads and
> stores?

see section 7.2.3 of intel volume 3A...

"Code dependent upon sequential store ordering should not use the string 
operations for the entire data structure to be stored. Data and semaphores 
should be separated. Order dependent code should use a discrete semaphore 
uniquely stored to after any string operations to allow correctly ordered 
data to be seen by all processors."

i think we need sfence after things like copy_page, clear_page, and 
possibly copy_user... at least on intel processors with fast strings 
option enabled.

-dean


* Re: Intel Memory Ordering White Paper
  2007-09-08 11:34         ` dean gaudet
@ 2007-09-08 12:08           ` Petr Vandrovec
  2007-09-08 12:27             ` dean gaudet
  0 siblings, 1 reply; 22+ messages in thread
From: Petr Vandrovec @ 2007-09-08 12:08 UTC (permalink / raw)
  To: dean gaudet; +Cc: Nick Piggin, Linus Torvalds, ak, Jesse Barnes, linux-kernel

dean gaudet wrote:
> On Sun, 9 Sep 2007, Nick Piggin wrote:
> 
>> I've also heard that string operations do not follow the normal ordering, but
>> that's just with respect to individual loads/stores in the one operation, I
>> hope? And they will still follow ordering rules WRT surrounding loads and
>> stores?
> 
> see section 7.2.3 of intel volume 3A...
> 
> "Code dependent upon sequential store ordering should not use the string 
> operations for the entire data structure to be stored. Data and semaphores 
> should be separated. Order dependent code should use a discrete semaphore 
> uniquely stored to after any string operations to allow correctly ordered 
> data to be seen by all processors."
> 
> i think we need sfence after things like copy_page, clear_page, and 
> possibly copy_user... at least on intel processors with fast strings 
> option enabled.

I do not think so.  I believe that authors are trying to say that

struct { uint8 lock; uint8 data; } x;

lea (x.data),%edi
mov $2,%ecx
std
rep movsb

to set both data and lock does not guarantee that x.lock will be set 
after x.data and that you should do

lea (x.data),%edi
std
movsb
movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me

instead (and yes, I know that my example is silly).
							Petr
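
The same reading of the manual text can be sketched in C: do the bulk
copy first, then set the flag with a separate, discrete store. This is
only an illustration under assumptions; struct record and the record_*
helpers are hypothetical, memcpy may or may not compile to a rep string
instruction, and whether a discrete store alone is sufficient is
disputed in the follow-up to this message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

struct record {
    unsigned char data[64];
    atomic_uchar  lock;  /* 0 = not yet published, 1 = published */
};

/* Fill the payload with one bulk copy, then publish it with a store
 * that is not part of the bulk operation. */
static inline void record_publish(struct record *r, const void *src,
                                  size_t len)
{
    assert(len <= sizeof r->data);
    memcpy(r->data, src, len);  /* bulk copy of the data only */
    /* Discrete store to the flag, issued after the bulk copy. */
    atomic_store_explicit(&r->lock, 1, memory_order_release);
}

static inline int record_is_published(struct record *r)
{
    return atomic_load_explicit(&r->lock, memory_order_acquire) == 1;
}
```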



* Re: Intel Memory Ordering White Paper
  2007-09-08 12:08           ` Petr Vandrovec
@ 2007-09-08 12:27             ` dean gaudet
  0 siblings, 0 replies; 22+ messages in thread
From: dean gaudet @ 2007-09-08 12:27 UTC (permalink / raw)
  To: Petr Vandrovec
  Cc: Nick Piggin, Linus Torvalds, ak, Jesse Barnes, linux-kernel

On Sat, 8 Sep 2007, Petr Vandrovec wrote:

> dean gaudet wrote:
> > On Sun, 9 Sep 2007, Nick Piggin wrote:
> > 
> > > I've also heard that string operations do not follow the normal ordering,
> > > but that's just with respect to individual loads/stores in the one
> > > operation, I hope? And they will still follow ordering rules WRT
> > > surrounding loads and stores?
> > 
> > see section 7.2.3 of intel volume 3A...
> > 
> > "Code dependent upon sequential store ordering should not use the string
> > operations for the entire data structure to be stored. Data and semaphores
> > should be separated. Order dependent code should use a discrete semaphore
> > uniquely stored to after any string operations to allow correctly ordered
> > data to be seen by all processors."
> > 
> > i think we need sfence after things like copy_page, clear_page, and possibly
> > copy_user... at least on intel processors with fast strings option enabled.
> 
> I do not think so.  I believe that authors are trying to say that
> 
> struct { uint8 lock; uint8 data; } x;
> 
> lea (x.data),%edi
> mov $2,%ecx
> std
> rep movsb
> 
> to set both data and lock does not guarantee that x.lock will be set after
> x.data and that you should do
> 
> lea (x.data),%edi
> std
> movsb
> movsb  # or mov (%esi),%al; mov %al,(%edi), but movsb looks discrete enough to me
> 
> instead (and yes, I know that my example is silly).

no it's worse than that -- intel fast string stores can become globally 
visible in any order at all w.r.t. normal loads or stores... so take all 
those great examples in their recent whitepaper and throw out all the 
ordering guarantees for addresses on different cachelines if any of the 
stores are rep string.

for example transitive store ordering for locations on multiple cachelines 
is not guaranteed at all.  the kernel could return a zero page and one 
core could see the zeroes out of order with another core performing some 
sort of lockless data structure operation.

fast strings don't break ordering from the point of view of the core 
performing the rep string operation, but externally there are no 
guarantees (it's right there in the docs).

-dean
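
A minimal sketch of the suggestion made earlier in this subthread: zero
a page with a rep string store and follow it with an sfence before the
page is handed to anyone else. It assumes x86-64 with GCC or Clang
inline assembly and falls back to memset elsewhere. The names
clear_page_sketch and SKETCH_PAGE_SIZE are hypothetical, and whether
the sfence is actually required (or sufficient) is the open question
here, not a settled conclusion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Zero one page, then fence so the zeroing stores are ordered before
 * any later store that publishes the page. */
static inline void clear_page_sketch(void *page)
{
    assert(page != NULL);
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void  *dst = page;
    size_t n   = SKETCH_PAGE_SIZE;
    /* rep stosb: store AL to [RDI], RCX times (DF is clear per the ABI) */
    __asm__ __volatile__("rep stosb"
                         : "+D"(dst), "+c"(n)
                         : "a"(0)
                         : "memory");
    __asm__ __volatile__("sfence" ::: "memory");
#else
    memset(page, 0, SKETCH_PAGE_SIZE);  /* portable fallback, no fence */
#endif
}
```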


* Re: Intel Memory Ordering White Paper
  2007-09-07 20:49   ` Nick Piggin
@ 2007-09-08 14:11     ` Alan Cox
  0 siblings, 0 replies; 22+ messages in thread
From: Alan Cox @ 2007-09-08 14:11 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Jesse Barnes, linux-kernel

> AMD processors guarantee loads are ordered and stores are ordered
> (with exceptions of non-temporal, and non-wb policy).
> 
> As for the others that do out of order stores, are any of them SMP?

IDT winchip isn't, Geode isn't


* Re: Intel Memory Ordering White Paper
  2007-09-07 23:20   ` Linus Torvalds
@ 2007-09-08 17:34     ` Nick Piggin
  2007-09-08 17:48       ` Nick Piggin
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2007-09-08 17:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jesse Barnes, linux-kernel

On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
> On Sat, 8 Sep 2007, Nick Piggin wrote:
> > So, can we finally noop smp_rmb and smp_wmb on x86?
>
> Did AMD already release their version? If so, we should probably add a
> commit that does that in somewhat early 2.6.24 rc, and add the pointers to
> the whitepapers in the commit message.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

AMD64 Architecture Programmer's Manual Volume 2: System Programming
section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
first page says

"Loads do not pass previous loads (loads are not re-ordered). Stores do
not pass previous stores (stores are not re-ordered)"

So, yes, it should be easy to do.


* Re: Intel Memory Ordering White Paper
  2007-09-08 17:34     ` Nick Piggin
@ 2007-09-08 17:48       ` Nick Piggin
  2007-09-07 18:13         ` Nick Piggin
  2007-09-08 11:34         ` dean gaudet
  0 siblings, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2007-09-08 17:48 UTC (permalink / raw)
  To: Linus Torvalds, ak; +Cc: Jesse Barnes, linux-kernel

On Sunday 09 September 2007 03:34, Nick Piggin wrote:
> On Saturday 08 September 2007 09:20, Linus Torvalds wrote:
> > On Sat, 8 Sep 2007, Nick Piggin wrote:
> > > So, can we finally noop smp_rmb and smp_wmb on x86?
> >
> > Did AMD already release their version? If so, we should probably add a
> > commit that does that in somewhat early 2.6.24 rc, and add the pointers
> > to the whitepapers in the commit message.
>
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
>
> AMD64 Architecture Programmer's Manual Volume 2: System Programming
> section 7.2: Multiprocessor Memory Access Ordering, a paragraph on the
> first page says
>
> "Loads do not pass previous loads (loads are not re-ordered). Stores do
> not pass previous stores (stores are not re-ordered)"
>
> So, yes, it should be easy to do.

There is some suggestion in the source code that non-temporal stores
(movntq) are weakly ordered. But AFAIKS from the documents, it is ordered
when operating on wb memory. What's the situation there?

I've also heard that string operations do not follow the normal ordering, but
that's just with respect to individual loads/stores in the one operation, I
hope? And they will still follow ordering rules WRT surrounding loads and
stores?


* Re: Intel Memory Ordering White Paper
  2007-09-07 20:32                 ` Nick Piggin
@ 2007-09-08 20:37                   ` H. Peter Anvin
  0 siblings, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2007-09-08 20:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andi Kleen, Linus Torvalds, Jesse Barnes, linux-kernel

Nick Piggin wrote:
> smp_rmb() should not need to do anything because loads are done
> in order anyway. Both AMD and Intel have committed to this now.
> 
> The important point is that they *appear* to be done in order. AFAIK,
> the CPUs can still do speculative and out of order loads, but throw
> out the results if they could be wrong.

Is there anything even semiofficial from VIA?  Not that the x86 
architecture isn't pretty much definable as the AMD-Intel consensus...

	-hpa



* Re: Intel Memory Ordering White Paper
  2007-09-07 22:26 Intel Memory Ordering White Paper Jesse Barnes
  2007-09-08  8:54 ` Nick Piggin
  2007-09-08 10:29 ` Alan Cox
@ 2007-09-12 18:26 ` Dr. David Alan Gilbert
  2007-09-19 16:26   ` Jesse Barnes
  2 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2007-09-12 18:26 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: linux-kernel

* Jesse Barnes (jesse.barnes@intel.com) wrote:
> FYI, we just released a new white paper describing memory ordering for 
> Intel processors:
> http://developer.intel.com/products/processor/manuals/index.htm
> 
> Should help answer some questions about some of the ordering primitives 
> we use on i386 and x86_64.

Hi Jesse,
  Thanks for letting everyone know about that paper, however - it
has confused me somewhat; there seem to be differences between that
description and the one in the 'Intel 64 and IA-32 Architectures
Software Developer's Manual' and I'd like to understand whether
this paper is designed just to explain points or is actually 
intended to change what can be expected of the processor.

That ordering doc states:
'Loads are not reordered with other loads'

Vol3a section 7.2.1 of the architecture manual states:

'Reads can be carried out speculatively and in any order.'

Is this a:
  1) Change in the definition of the architecture that existing
processors actually follow anyway.
  2) A difference between what the processor does and what is visible
to the software (the intro to this paper does seem to emphasize
software visibility more than the architecture manual).
  3) Some other difference I haven't spotted.

The other thing that made me think about it was that the Itanium
Architecture Software Dev Manual vol2 2.1.2 states that the Itanium
uses ld.acq/st.rel (acquire/release) references to
'operate according to the IA-32 ordering model.' which I think means
that all those loads are in order relative to all the other acquire
loads?

Dave

-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    | Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/


* Re: Intel Memory Ordering White Paper
  2007-09-12 18:26 ` Dr. David Alan Gilbert
@ 2007-09-19 16:26   ` Jesse Barnes
  2007-09-19 17:29     ` Andi Kleen
  0 siblings, 1 reply; 22+ messages in thread
From: Jesse Barnes @ 2007-09-19 16:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: linux-kernel

On Wednesday, September 12, 2007 11:26 am Dr. David Alan Gilbert wrote:
> * Jesse Barnes (jesse.barnes@intel.com) wrote:
> > FYI, we just released a new white paper describing memory ordering
> > for Intel processors:
> > http://developer.intel.com/products/processor/manuals/index.htm
> >
> > Should help answer some questions about some of the ordering
> > primitives we use on i386 and x86_64.
>
> Hi Jesse,
>   Thanks for letting everyone know about that paper, however - it
> has confused me somewhat; there seem to be differences between that
> description and the one in the 'Intel 64 and IA-32
> Architectures Software Developer's Manual' and I'd like to understand
> whether this paper is designed just to explain points or is actually
> intended to change what can be expected of the processor.
>
> That ordering doc states:
> 'Loads are not reordered with other loads'
>
> Vol3a section 7.2.1 of the architecture manual states:
>
> 'Reads can be carried out speculatively and in any order.'
>
> Is this a:
>   1) Change in the definition of the architecture that existing
> processors actually follow anyway.
>   2) A difference between what the processor does and what is visible
> to the software (the intro to this paper does seem to emphasize
> software visibility more than the architecture manual).
>   3) Some other difference I haven't spotted.

It's really both (1) and (2).  This document will become part of the 
regular manuals when the next version is published.  And yes, 
processors may do something different internally, but software can rely 
on the behavior described by the rules in the document.

Jesse


* Re: Intel Memory Ordering White Paper
  2007-09-19 16:26   ` Jesse Barnes
@ 2007-09-19 17:29     ` Andi Kleen
  0 siblings, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2007-09-19 17:29 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Dr. David Alan Gilbert, linux-kernel

Jesse Barnes <jesse.barnes@intel.com> writes:
> 
> It's really both (1) and (2).  This document will become part of the 
> regular manuals when the next version is published.  And yes, 
> processors may do something different internally, but software can rely 
> on the behavior described by the rules in the document.

... until the first erratum comes around. With the multitude of x86
cores being introduced all the time (how many did only Intel just announce at 
IDF?) that is going to happen sooner or later.

i386 with full legacy enabled already has to care about old PPros and 
those seriously violate write ordering.

-Andi


Thread overview: 22+ messages
2007-09-07 22:26 Intel Memory Ordering White Paper Jesse Barnes
2007-09-08  8:54 ` Nick Piggin
2007-09-07 23:20   ` Linus Torvalds
2007-09-08 17:34     ` Nick Piggin
2007-09-08 17:48       ` Nick Piggin
2007-09-07 18:13         ` Nick Piggin
2007-09-08  8:53           ` Andi Kleen
2007-09-07 19:57             ` Nick Piggin
2007-09-08 10:19               ` Andi Kleen
2007-09-07 20:32                 ` Nick Piggin
2007-09-08 20:37                   ` H. Peter Anvin
2007-09-08 11:34         ` dean gaudet
2007-09-08 12:08           ` Petr Vandrovec
2007-09-08 12:27             ` dean gaudet
2007-09-08 10:30   ` Alan Cox
2007-09-07 20:46     ` Nick Piggin
2007-09-08 10:29 ` Alan Cox
2007-09-07 20:49   ` Nick Piggin
2007-09-08 14:11     ` Alan Cox
2007-09-12 18:26 ` Dr. David Alan Gilbert
2007-09-19 16:26   ` Jesse Barnes
2007-09-19 17:29     ` Andi Kleen
