From: Nick Piggin <npiggin@suse.de>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, paulus@samba.org
Subject: Re: [patch 2/2] powerpc: optimise smp_wmb
Date: Thu, 22 May 2008 02:30:54 +0200
Message-ID: <20080522003054.GA23271@wotan.suse.de>
In-Reply-To: <7dc38d9080603be9c25b8f649e5df0a0@kernel.crashing.org>
On Wed, May 21, 2008 at 10:12:03PM +0200, Segher Boessenkool wrote:
> >From memory, I measured lwsync is 5 times faster than eieio on
> >a dual G5. This was on a simple microbenchmark that made use of
> >smp_wmb for store ordering, but it did not involve any IO access
> >(which presumably would disadvantage eieio further).
>
> This is very much specific to your particular benchmark.
>
> On the 970, there are two differences between lwsync and eieio:
>
> 1) lwsync cannot issue before all previous loads are done; eieio
> does not have this restriction.
>
> Then, they both fly through the execution core, it doesn't wait
> for the barrier insn to complete in the storage system. In both
> cases, a barrier is inserted into both the L2 queues and the
> non-cacheable queues. These barriers are both removed at the
> same time, that is, when both are the oldest in their queue and
> have done their thing.
>
> 2) For eieio, the non-cacheable unit waits for all previous
> (non-cacheable) stores to complete, and then arbitrates for the
> bus and sends an EIEIO transaction.
>
> Your benchmark doesn't do non-cacheable stores, so it would seem
> the five-time slowdown is caused by that bus arbitration (and the
> bus transaction). Maybe your cacheable stores hit the bus as well,
> that would make this worse. Your benchmark also doesn't see the
> negative effects from 1).
>
> In "real" code, I expect 2) to be pretty much invisible (the store
> queues will never be completely filled up), but 1) shouldn't be very
> bad either. So it's a wash. But only a real benchmark will tell.
OK, interesting, thanks. Yes, the "benchmark" is not a good one, but
it verified for me that there is a difference there. That, combined with
IBM's documents saying lwsync is preferred for store/store ordering,
is my rationale for sending the patch. A real benchmark would be nice,
but it would probably be hard to notice any improvement.
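For reference, the store/store ordering case being discussed looks
roughly like the sketch below. This is a userspace approximation, not
the kernel code or the microbenchmark: my_smp_wmb(), my_smp_rmb(),
publish() and try_consume() are made-up names, and the C11 fences are
an assumption standing in for the kernel barriers (a release fence is
at least as strong as a store/store barrier, and GCC emits lwsync for
it on 64-bit PowerPC).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for smp_wmb()/smp_rmb(). These are
 * assumptions for illustration, not the kernel definitions. */
#define my_smp_wmb() atomic_thread_fence(memory_order_release)
#define my_smp_rmb() atomic_thread_fence(memory_order_acquire)

static int payload;       /* plain data in ordinary cacheable memory */
static atomic_int ready;  /* flag that announces the payload */

/* Writer side: the data store must become visible before the flag store. */
void publish(int value)
{
	payload = value;
	my_smp_wmb();	/* store/store ordering: payload before ready */
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns true and fills *out once the flag has been seen. */
bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;
	my_smp_rmb();	/* load/load ordering: ready before payload */
	*out = payload;
	return true;
}
```

The intent is that publish() runs on one CPU and try_consume() polls
on another. No IO or non-cacheable access is involved, which is the
case where the cheaper barrier is sufficient.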
> >Given the G5 speedup, I'd be surprised if there is not an improvement
> >on POWER4 and 5 as well,
>
> The 970 storage subsystem and the POWER4 one are very different.
> Or maybe these queues are just about the last thing that _is_
> identical, I dunno, there aren't public POWER4 docs for this ;-)
>
> >although no idea about POWER6 or cell...
>
> No idea about POWER6; for CBE, the backend works similar to the
> 970 one.
>
> Given that the architecture says to use lwsync for cases like this,
> it would be very surprising if it performed (much) worse than eieio,
> eh? ;-) So I think your patch is a win; just wanted to clarify on
> your five-time slowdown number.
Sure, thanks!
Nick