From: Linus Torvalds <torvalds@linux-foundation.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>, David Howells <dhowells@redhat.com>,
Ulrich Drepper <drepper@redhat.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 0/3] 64-bit futexes: Intro
Date: Wed, 4 Jun 2008 13:38:50 -0700 (PDT)
Message-ID: <alpine.LFD.1.10.0806041320020.3473@woody.linux-foundation.org>
In-Reply-To: <alpine.LFD.1.10.0806041249470.3473@woody.linux-foundation.org>
On Wed, 4 Jun 2008, Linus Torvalds wrote:
>
> So when you do
>
> 	movb reg,(byteptr)
> 	movl (byteptr),reg
>
> you may actually get old data in the upper 24 bits, along with new data in
> the lower 8.
Put another way: the CPU may internally effectively rewrite the two
instructions as
	movb reg,tmpreg         (tmp = writebuffer)
	movl (byteptr),reg      (do the 32-bit read of the old cached contents)
	movb tmpreg,reg         (writebuffer snoop by reads)
	movb tmpreg,(byteptr)   (writebuffer actually drains into cacheline)
and *if* your algorithm is robust wrt these kinds of rewrites, you're ok.
But notice how there are two accesses to the cacheline, and how they are
actually in the "wrong" order, and how the cacheline could have been
updated by another CPU in between.
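
The four-step rewrite above can be mimicked with a toy model. This is only a sketch in Python, not a description of real hardware: `cache`, `store_buffer`, `store8`, `load32` and `drain` are names made up for illustration, and the starting value 0x11223344 is arbitrary.

```python
# Toy model of one core: a 4-byte "cacheline" plus a store buffer.
# All names are illustrative; real hardware is not built like this.

cache = bytearray((0x11223344).to_bytes(4, "little"))  # coherent contents
store_buffer = []                                      # pending (offset, byte) stores


def store8(offset, value):
    """movb reg,(byteptr): the store only enters the store buffer."""
    store_buffer.append((offset, value & 0xFF))


def load32():
    """movl (byteptr),reg: read the cacheline, then let pending stores override."""
    data = bytearray(cache)             # 32-bit read of the old cached contents
    for offset, value in store_buffer:  # store buffer snoop by reads
        data[offset] = value
    return int.from_bytes(data, "little")


def drain():
    """The store buffer finally writes into the cacheline."""
    while store_buffer:
        offset, value = store_buffer.pop(0)
        cache[offset] = value


store8(0, 0xAA)
reg = load32()
cache_before_drain = int.from_bytes(cache, "little")
drain()
cache_after_drain = int.from_bytes(cache, "little")

print(hex(reg), hex(cache_before_drain), hex(cache_after_drain))
```

In this model the load returns the new low byte while the cacheline still holds the old word, so the read of the cacheline happens before the write to it, even though the program order is the other way around.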
Does this actually happen? Yeah, I do believe it does. Is it a death knell
for anybody trying to be clever with overlapping reads/writes? No, you can
still have things like causality rules that guarantee that your algorithm
is perfectly stable in the face of these kinds of reordering. But it *is*
one of the few re-orderings that I think is theoretically architecturally
visible.
For example, let's start out with a 32-bit word that contains zero, and
three CPUs. One CPU does
	cmpxchgl 0->0x01010101,mem
another one does
	cmpxchgl 0x01010101->0x02020202,mem
and the third one does that
	movb $0x03,mem
	movl mem,reg
and after it all completed, you may have 0x02020203 in memory, but "reg"
on the third CPU contains 0x01010103.
Note how NO OTHER CPU could _possibly_ have seen that value! That value
never ever existed in any caches. If the final value was 0x02020203, then
both the cmpxchgl's must have worked, so the cache coherency contents
*must* have gone from 0 -> 0x01010101 -> 0x02020202 -> 0x02020203 (with
the movb actually getting the exclusive cache access last).
So the third CPU saw a value for that load that actually *never* existed
in any cache-line: 0x01010103. Exactly because the x86 memory ordering
allows the store buffer contents to be forwarded within a CPU core.
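
One hand-picked interleaving of the three-CPU example can be replayed in a small Python simulation. It is a sketch under loud assumptions: `Core`, `cmpxchg32`, `read_mem` and `seen` are invented for illustration, the schedule is chosen by hand rather than produced by any real CPU, and the store buffer is modelled as a plain list of byte stores.

```python
# Toy replay of the three-CPU example. Names are illustrative only and the
# interleaving below is one hand-picked schedule, not a hardware trace.

mem = bytearray(4)  # shared, coherent 32-bit word, starts at zero


def read_mem():
    return int.from_bytes(mem, "little")


def cmpxchg32(old, new):
    """Locked compare-and-exchange, applied directly to coherent memory."""
    if read_mem() != old:
        return False
    mem[:] = new.to_bytes(4, "little")
    return True


class Core:
    def __init__(self):
        self.store_buffer = []  # pending (offset, byte) stores

    def store8(self, offset, value):
        self.store_buffer.append((offset, value))

    def load32(self):
        data = bytearray(mem)
        for offset, value in self.store_buffer:  # store-to-load forwarding
            data[offset] = value
        return int.from_bytes(data, "little")

    def drain(self):
        for offset, value in self.store_buffer:
            mem[offset] = value
        self.store_buffer.clear()


seen = [read_mem()]  # every value coherent memory ever held
cpu3 = Core()

ok1 = cmpxchg32(0, 0x01010101)           # CPU 1
seen.append(read_mem())
cpu3.store8(0, 0x03)                     # CPU 3: movb $0x03,mem (buffered)
reg = cpu3.load32()                      # CPU 3: movl mem,reg
ok2 = cmpxchg32(0x01010101, 0x02020202)  # CPU 2
seen.append(read_mem())
cpu3.drain()                             # CPU 3's byte reaches the cacheline last
seen.append(read_mem())

print(hex(reg), hex(read_mem()), [hex(v) for v in seen])
```

With this schedule both compare-and-exchanges succeed, memory ends at 0x02020203, and `reg` holds 0x01010103, which does not appear anywhere in `seen`.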
And this is why atomic locked instructions are special. They bypass the
store buffer (or at least they _act_ as if they do - they likely actually
use the store buffer, but they flush it and the instruction pipeline
before the load and wait for it to drain after, and have a lock on the
cacheline that they take as part of the load, and release as part of the
store - all to make sure that the cacheline doesn't go away in between and
that nobody else can see the store buffer contents while this is going
on).
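
To see what "acting as if they bypass the store buffer" buys you, here is the same kind of toy model with the third CPU's byte store done as a locked operation. Again this is only a sketch: `locked_store8`, `commit`, `history` and friends are hypothetical names, and the model simply drains the buffer and writes straight to coherent memory, which is an assumption about the visible effect and not a claim about how any CPU implements it.

```python
# Toy model: CPU 3's byte store is "locked", so it is never left sitting
# in the store buffer. Names and the model itself are illustrative assumptions.

mem = bytearray(4)  # coherent 32-bit word, starts at zero
store_buffer = []   # CPU 3's pending (offset, byte) stores
history = [0]       # every value coherent memory has held


def read_mem():
    return int.from_bytes(mem, "little")


def commit(offset, value):
    mem[offset] = value
    history.append(read_mem())


def drain():
    while store_buffer:
        commit(*store_buffer.pop(0))


def cmpxchg32(old, new):
    """Locked 32-bit compare-and-exchange on coherent memory."""
    if read_mem() != old:
        return False
    mem[:] = new.to_bytes(4, "little")
    history.append(read_mem())
    return True


def locked_store8(offset, value):
    """Acts as if it bypasses the store buffer: drain first, then write."""
    drain()
    commit(offset, value)


def load32():
    data = bytearray(mem)
    for offset, value in store_buffer:  # nothing is pending after a locked op
        data[offset] = value
    return int.from_bytes(data, "little")


ok1 = cmpxchg32(0, 0x01010101)           # CPU 1
locked_store8(0, 0x03)                   # CPU 3: locked byte store
reg = load32()                           # CPU 3: 32-bit load
ok2 = cmpxchg32(0x01010101, 0x02020202)  # CPU 2: fails, the word has changed

print(hex(reg), ok1, ok2, [hex(v) for v in history])
```

In this model the load still returns 0x01010103, but now that value really did exist in coherent memory, and the second compare-and-exchange fails because it sees it too.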
This is also why there is so much room for improvement in locked
instruction performance - you don't _have_ to flush things if you just are
very careful about tracking how and when you use which elements in the
store buffer, and track the ordering of cache accesses by all of this.
Linus