linuxppc-dev.lists.ozlabs.org archive mirror
From: Nick Piggin <npiggin@suse.de>
To: Segher Boessenkool <segher@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, paulus@samba.org
Subject: Re: [patch 2/2] powerpc: optimise smp_wmb
Date: Thu, 22 May 2008 02:30:54 +0200
Message-ID: <20080522003054.GA23271@wotan.suse.de>
In-Reply-To: <7dc38d9080603be9c25b8f649e5df0a0@kernel.crashing.org>

On Wed, May 21, 2008 at 10:12:03PM +0200, Segher Boessenkool wrote:
> >From memory, I measured lwsync is 5 times faster than eieio on
> >a dual G5. This was on a simple microbenchmark that made use of
> >smp_wmb for store ordering, but it did not involve any IO access
> >(which presumably would disadvantage eieio further).
> 
> This is very much specific to your particular benchmark.
> 
> On the 970, there are two differences between lwsync and eieio:
> 
> 1) lwsync cannot issue before all previous loads are done; eieio
> does not have this restriction.
> 
> Then, they both fly through the execution core, it doesn't wait
> for the barrier insn to complete in the storage system.  In both
> cases, a barrier is inserted into both the L2 queues and the
> non-cacheable queues.  These barriers are both removed at the
> same time, that is, when both are the oldest in their queue and
> have done their thing.
> 
> 2) For eieio, the non-cacheable unit waits for all previous
> (non-cacheable) stores to complete, and then arbitrates for the
> bus and sends an EIEIO transaction.
> 
> Your benchmark doesn't do non-cacheable stores, so it would seem
> the five-time slowdown is caused by that bus arbitration (and the
> bus transaction).  Maybe your cacheable stores hit the bus as well,
> that would make this worse.  Your benchmark also doesn't see the
> negative effects from 1).
> 
> In "real" code, I expect 2) to be pretty much invisible (the store
> queues will never be completely filled up), but 1) shouldn't be very
> bad either.  So it's a wash.  But only a real benchmark will tell.

OK, interesting, thanks. Yes, the "benchmark" is not a good one, but
it verified for me that there is a difference there. That, combined
with IBM's documents saying lwsync is preferred for store/store
ordering, is my rationale for sending the patch. A real benchmark
would be nice, but it would probably be hard to notice any
improvement.
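
As an illustration of the store/store ordering under discussion, here
is a minimal sketch of a producer publishing data behind a flag. It is
not the patch itself: sketch_smp_wmb(), struct mailbox, publish() and
try_consume() are hypothetical names, the PowerPC branch assumes a core
that implements lwsync, and the C11 fence in the other branch is only a
stand-in so the sketch builds on other architectures.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical write barrier; NOT the kernel's actual smp_wmb(). */
#if defined(__powerpc__) || defined(__powerpc64__)
/* Assumes the core implements lwsync (some embedded cores do not). */
#define sketch_smp_wmb() __asm__ __volatile__("lwsync" ::: "memory")
#else
/* Portable stand-in for other architectures: a C11 release fence. */
#define sketch_smp_wmb() atomic_thread_fence(memory_order_release)
#endif

/* Hypothetical shared structure: a payload guarded by a ready flag. */
struct mailbox {
	_Atomic uint32_t payload;
	_Atomic uint32_t ready;
};

/* Producer: the payload store must be visible before the flag store. */
static void publish(struct mailbox *m, uint32_t value)
{
	atomic_store_explicit(&m->payload, value, memory_order_relaxed);
	sketch_smp_wmb();	/* order the two cacheable stores */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag has been seen. */
static int try_consume(struct mailbox *m, uint32_t *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
		return 0;
	/* Pairing read-side barrier, so the payload load is not reordered
	 * ahead of the flag load. */
	atomic_thread_fence(memory_order_acquire);
	*out = atomic_load_explicit(&m->payload, memory_order_relaxed);
	return 1;
}
```

Both stores here go to cacheable memory, which is the case where the
thread expects lwsync to be sufficient; nothing in this sketch orders
non-cacheable (IO) accesses.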

 
> >Given the G5 speedup, I'd be surprised if there is not an improvement
> >on POWER4 and 5 as well,
> 
> The 970 storage subsystem and the POWER4 one are very different.
> Or maybe these queues are just about the last thing that _is_
> identical, I dunno, there aren't public POWER4 docs for this ;-)
> 
> >although no idea about POWER6 or cell...
> 
> No idea about POWER6; for CBE, the backend works similar to the
> 970 one.
> 
> Given that the architecture says to use lwsync for cases like this,
> it would be very surprising if it performed (much) worse than eieio,
> eh? ;-)  So I think your patch is a win; just wanted to clarify on
> your five-time slowdown number.

Sure, thanks!

Nick
