Re: [PATCH] Document Linux's memory barriers [try #5]

public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: David Howells <dhowells@redhat.com>,
	Andrew Morton <akpm@osdl.org>,
	linux-arch@vger.kernel.org, linuxppc64-dev@ozlabs.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Document Linux's memory barriers [try #5]
Date: Thu, 16 Mar 2006 22:23:00 -0800	[thread overview]
Message-ID: <20060317062300.GE1323@us.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0603162118450.3618@g5.osdl.org>

On Thu, Mar 16, 2006 at 09:32:03PM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 16 Mar 2006, Paul E. McKenney wrote:
> > 
> > In section 7.1.1 on page 195, it says:
> > 
> > 	For cacheable memory types, the following rules govern
> > 	read ordering:
> > 
> > 	o	Out-of-order reads are allowed.  Out-of-order reads
> > 		can occur as a result of out-of-order instruction
> > 		execution or speculative execution.  The processor
> > 		can read memory out-of-order to allow out-of-order
> > 		to proceed.
> > 
> > 	o	Speculative reads are allows ... [but no effect on
> > 		ordering beyond that given in the other rules, near
> > 		as I can tell]
> > 
> > 	o	Reads can be reordered ahead of writes.  Reads are
> > 		generally given a higher priority by the processor
> > 		than writes because instruction execution stalls
> > 		if the read data required by an instruction is not
> > 		immediately available.  Allowing reads ahead of
> > 		writes usually maximizes software performance.
> 
> These are just the same as the x86 ordering. Notice how reads can pass 
> (earlier) writes, but won't be pushed back after later writes. That's very 
> much the x86 ordering (together with the "CPU ordering" for writes).

OK, so you are not arguing with the "AMD" row, but rather with the
x86 row.  So I was looking in the wrong manual.  Specifically, you are
saying that the x86's "Loads Reordered After Stores" cell should be
blank rather than "Y", right?

> > > (Also, x86 doesn't have an incoherent instruction cache - some older x86 
> > > cores have an incoherent instruction decode _buffer_, but that's a 
> > > slightly different issue with basically no effect on any sane program).
> > 
> > Newer cores check the linear address, so code generated in a different
> > address space now needs to do CPUID.  This is admittedly an unusual
> > case -- perhaps I was getting overly worked up about it.  I based this
> > on Section 10.6 on page 10-21 (physical page 405) of Intel's "IA-32
> > Intel Architecture Software Developer's Manual Volume 3: System
> > Programming Guide", 2004.  PDF available (as of 2/16/2005) from:
> > 
> > 	ftp://download.intel.com/design/Pentium4/manuals/25366814.pdf
> 
> Not according to the docs I have.
> 
> The _prefetch_ queue is invalidated based on the linear address, but not 
> the caches. The caches are coherent, and the prefetch is also coherent in 
> modern cores wrt linear address (but old cores, like the original i386, 
> would literally not see the write, so you could do
> 
> 		movl $1234,1f
> 	1:	xorl %eax,%eax
> 
> and the "movl" would overwrite the "xorl", but the "xorl" would still get 
> executed if it was in the 16-byte prefetch buffer or whatever).
> 
> Modern cores will generally be _totally_ serialized, so if you write to 
> the next instruction, I think most modern cores will notice it. It's only 
> if you use paging or something to write to the physical address to 
> something that is in the prefetch buffers that it can get you.

Yep.  But I would not put it past some JIT writer to actually do something
like double-mapping the JITed code to two different linear addresses.

> Now, the prefetching has gotten longer over time, but it is basically 
> still just a few tens of instructions, and any serializing instruction 
> will force it to be serialized with the cache.

Agreed.

> It's really a non-issue, because regular self-modifying code will trigger 
> the linear address check, and system code will always end up doing an 
> "iret" or other serializing instruction, so it really doesn't trigger. 

Only if there is a context switch between the writing of the instruction
and the executing of it.  Might not be the case if someone double-maps
the memory or some other similar stunt.  And I agree modern cores seem
to be getting less aggressive in their search for instruction-level
parallelism, but it doesn't take too much speculative-execution capability
to (sometimes!) get some pretty strange stuff loaded into the instruction
prefetch buffer.

> So in practice, you really should see it as being entirely coherent. You 
> have to do some _really_ strange sh*t to ever see anything different.

No argument with your last sentence!  (And, believe it or not, there
was a time when self-modifying code was considered manly rather than
strange.  But that was back in the days of 4096-byte main memories...)

I believe we are in violent agreement on this one.  The column label in
the table is "Incoherent Instruction Cache/Pipeline".  You are saying
that only the pipeline can be incoherent, and even then, only in strange
situations.  But the only situation in which I would leave a cell in
this column blank would be if -both- the cache -and- the pipeline were
-always- coherent, even in strange situations.  So I believe that this
one still needs to stay "Y".

I would rather see someone's JIT execute an extra CPUID after generating
a new chunk of code than to see it fail strangely -- but only sometimes,
and not reproducibly.

Would it help if the column were instead labelled "Incoherent Instruction
Cache or Pipeline", replacing the current "/" with "or"?

							Thanx, Paul

next prev parent reply	other threads:[~2006-03-17  6:22 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20060315200956.4a9e2cb3.akpm@osdl.org>
2006-03-09 20:29 ` [PATCH] Document Linux's memory barriers [try #4] David Howells
2006-03-09 23:34   ` Paul Mackerras
2006-03-09 23:45     ` Michael Buesch
2006-03-09 23:56       ` Linus Torvalds
2006-03-10  0:07         ` Michael Buesch
2006-03-10  0:48     ` Alan Cox
2006-03-10  0:54       ` Paul Mackerras
2006-03-10 15:19     ` David Howells
2006-03-11  0:01       ` Paul Mackerras
2006-03-10  5:28   ` Nick Piggin
2006-03-15 11:10     ` David Howells
2006-03-15 11:51       ` Nick Piggin
2006-03-15 13:47         ` David Howells
2006-03-15 23:21           ` Nick Piggin
2006-03-12 17:15   ` Eric W. Biederman
2006-03-14 21:26     ` David Howells
2006-03-14 21:48       ` Paul Mackerras
2006-03-14 23:59         ` David Howells
2006-03-15  0:20           ` Linus Torvalds
2006-03-15  1:19             ` David Howells
2006-03-15  1:47               ` Linus Torvalds
2006-03-15  1:25             ` Nick Piggin
2006-03-15  0:54           ` Paul Mackerras
2006-03-15 14:23   ` [PATCH] Document Linux's memory barriers [try #5] David Howells
2006-03-16 23:17     ` Paul E. McKenney
2006-03-16 23:55       ` Linus Torvalds
2006-03-17  1:29         ` Paul E. McKenney
2006-03-17  5:32           ` Linus Torvalds
2006-03-17  6:23             ` Paul E. McKenney [this message]
2006-03-23 18:34       ` David Howells
2006-03-23 19:28         ` Linus Torvalds
2006-03-23 22:26         ` Paul E. McKenney
     [not found]     ` <21253.1142509812@warthog.cambridge.redhat.com>
     [not found]       ` <Pine.LNX.4.64.0603160914410.3618@g5.osdl.org>
2006-03-17  1:20         ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060317062300.GE1323@us.ibm.com \
    --to=paulmck@us.ibm.com \
    --cc=akpm@osdl.org \
    --cc=dhowells@redhat.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc64-dev@ozlabs.org \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox