From: Nick Piggin <npiggin@suse.de>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, paulus@samba.org
Subject: Re: [patch 2/2] powerpc: optimise smp_wmb
Date: Thu, 22 May 2008 02:30:54 +0200
Message-ID: <20080522003054.GA23271@wotan.suse.de>
In-Reply-To: <7dc38d9080603be9c25b8f649e5df0a0@kernel.crashing.org>
On Wed, May 21, 2008 at 10:12:03PM +0200, Segher Boessenkool wrote:
> > From memory, I measured lwsync is 5 times faster than eieio on
> >a dual G5. This was on a simple microbenchmark that made use of
> >smp_wmb for store ordering, but it did not involve any IO access
> >(which presumably would disadvantage eieio further).
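(For illustration, a store-ordering loop of this sort is roughly what is meant -- a simplified sketch, not the exact code; it just times cacheable stores separated by each barrier, using the time base register.)

/* Simplified sketch of a store-ordering micro-benchmark: time a loop
 * of two cacheable stores separated by lwsync vs. eieio.  Assumes a
 * 64-bit powerpc (e.g. G5) user-space build; mftb reads the time
 * base register. */
#include <stdio.h>
#include <stdint.h>

static volatile int a, b;		/* the two stores being ordered */

static inline uint64_t mftb(void)
{
	uint64_t tb;
	__asm__ __volatile__("mftb %0" : "=r" (tb));
	return tb;
}

#define ITERS	10000000

int main(void)
{
	uint64_t t0, t1, t2;
	int i;

	t0 = mftb();
	for (i = 0; i < ITERS; i++) {
		a = i;
		__asm__ __volatile__("lwsync" : : : "memory");
		b = i;
	}
	t1 = mftb();
	for (i = 0; i < ITERS; i++) {
		a = i;
		__asm__ __volatile__("eieio" : : : "memory");
		b = i;
	}
	t2 = mftb();

	printf("lwsync: %llu timebase ticks\n", (unsigned long long)(t1 - t0));
	printf("eieio:  %llu timebase ticks\n", (unsigned long long)(t2 - t1));
	return 0;
}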
>
> This is very much specific to your particular benchmark.
>
> On the 970, there are two differences between lwsync and eieio:
>
> 1) lwsync cannot issue before all previous loads are done; eieio
> does not have this restriction.
>
> Then, both of them fly through the execution core; the core doesn't
> wait for the barrier insn to complete in the storage system. In both
> cases, a barrier is inserted into both the L2 queues and the
> non-cacheable queues. These barriers are both removed at the
> same time, that is, when both are the oldest in their queue and
> have done their thing.
>
> 2) For eieio, the non-cacheable unit waits for all previous
> (non-cacheable) stores to complete, and then arbitrates for the
> bus and sends an EIEIO transaction.
>
> Your benchmark doesn't do non-cacheable stores, so it would seem
> the five-time slowdown is caused by that bus arbitration (and the
> bus transaction). Maybe your cacheable stores hit the bus as well,
> that would make this worse. Your benchmark also doesn't see the
> negative effects from 1).
>
> In "real" code, I expect 2) to be pretty much invisible (the store
> queues will never be completely filled up), but 1) shouldn't be very
> bad either. So it's a wash. But only a real benchmark will tell.
OK, interesting, thanks. Yes, the "benchmark" is not a good one, but
it verified for me that there is a difference there. That, combined
with IBM's documentation saying lwsync is preferred for store/store
ordering, is my rationale for sending the patch. A real benchmark
would be nice, but it would probably be hard to notice any improvement.
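For reference, the change itself amounts to roughly this (a simplified
sketch, not the literal patch -- the real definition also has to cope
with older 32-bit CPUs that don't implement lwsync):

/* before: eieio also orders non-cacheable accesses, which
 * smp_wmb() does not need */
#define smp_wmb()	__asm__ __volatile__ ("eieio"  : : : "memory")

/* after: lwsync gives the cacheable store/store ordering that
 * smp_wmb() actually requires */
#define smp_wmb()	__asm__ __volatile__ ("lwsync" : : : "memory")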
> >Given the G5 speedup, I'd be surprised if there is not an improvement
> >on POWER4 and 5 as well,
>
> The 970 storage subsystem and the POWER4 one are very different.
> Or maybe these queues are just about the last thing that _is_
> identical, I dunno, there aren't public POWER4 docs for this ;-)
>
> >although no idea about POWER6 or cell...
>
> No idea about POWER6; for CBE, the backend works similar to the
> 970 one.
>
> Given that the architecture says to use lwsync for cases like this,
> it would be very surprising if it performed (much) worse than eieio,
> eh? ;-) So I think your patch is a win; just wanted to clarify on
> your five-time slowdown number.
Sure, thanks!
Nick
Thread overview: 23+ messages
2008-05-21 14:10 [patch 1/2] powerpc: rmb fix Nick Piggin
2008-05-21 14:12 ` [patch 2/2] powerpc: optimise smp_wmb Nick Piggin
2008-05-21 15:26 ` Benjamin Herrenschmidt
2008-05-21 15:34 ` Nick Piggin
2008-05-21 15:43 ` Benjamin Herrenschmidt
2008-05-21 15:47 ` Nick Piggin
2008-05-21 16:02 ` Benjamin Herrenschmidt
2008-05-21 20:51 ` Segher Boessenkool
2008-05-21 16:01 ` Nick Piggin
2008-05-21 20:12 ` Segher Boessenkool
2008-05-21 20:44 ` Benjamin Herrenschmidt
2008-05-21 22:07 ` Segher Boessenkool
2008-05-22 0:30 ` Nick Piggin [this message]
2008-05-21 20:16 ` Segher Boessenkool
2008-05-21 15:27 ` [patch 1/2] powerpc: rmb fix Benjamin Herrenschmidt
2008-05-21 15:32 ` Nick Piggin
2008-05-21 15:43 ` Benjamin Herrenschmidt
2008-05-23 2:14 ` Paul Mackerras
2008-05-23 4:40 ` Nick Piggin
2008-05-23 4:53 ` Paul Mackerras
2008-05-23 5:48 ` Nick Piggin
2008-05-23 6:40 ` Paul Mackerras
2008-05-26 1:38 ` Nick Piggin