linuxppc-dev.lists.ozlabs.org archive mirror
* [OT] 7450
@ 2001-03-09  8:59 Giuliano Pochini
  2001-03-09 19:08 ` Dan Malek
  0 siblings, 1 reply; 7+ messages in thread
From: Giuliano Pochini @ 2001-03-09  8:59 UTC (permalink / raw)
  To: linuxppc-dev


I have read many messages about 7450 performance problems. Are there any
test results made with Linux?  Will GCC have optimizations
(workarounds?) for the 7450's longer pipeline?


Bye.
    Giuliano Pochini ->)|(<- Shiny Network {AS6665} ->)|(<-

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/


* Re: [OT] 7450
  2001-03-09  8:59 [OT] 7450 Giuliano Pochini
@ 2001-03-09 19:08 ` Dan Malek
  2001-03-10 15:15   ` Holger Bettag
  0 siblings, 1 reply; 7+ messages in thread
From: Dan Malek @ 2001-03-09 19:08 UTC (permalink / raw)
  To: Giuliano Pochini; +Cc: linuxppc-dev


Giuliano Pochini wrote:
>
> I read many msg about 7450 performance problems.

From who?  People that are actually running hardware or
speculating from rumors based on documentation that doesn't exist?

> .... Are there any
> test results made with Linux ?

Not yet.  Any of us with actual hardware are bound by NDAs
that would prohibit discussing such things.

> .... Will GCC have optimizations
> (workarounds?) for the 7450's longer pipeline?

People are working on it.


	-- Dan


* Re: [OT] 7450
  2001-03-09 19:08 ` Dan Malek
@ 2001-03-10 15:15   ` Holger Bettag
  2001-03-11  1:43     ` David Edelsohn
  2001-03-15 13:05     ` Holger Bettag
  0 siblings, 2 replies; 7+ messages in thread
From: Holger Bettag @ 2001-03-10 15:15 UTC (permalink / raw)
  To: linuxppc-dev


Dan Malek <dan@mvista.com> writes:

>
> Giuliano Pochini wrote:
> >
> > I read many msg about 7450 performance problems.
>
> From who?  People that are actually running hardware or
> speculating from rumors based on documentation that doesn't exist?
>
There are some Mac benchmarks flying around, which don't make the 7450
look all too good.

Some of the benchmarks are obviously bogus: some memory bandwidth measurements
are consistently off by exactly a factor of two. Other benchmarks have later
been shown to be heavily dependent on gfx drivers.

But at least one issue remains, and that is surprisingly low FP performance.
This was measured with one ray tracing application and with an MP3 coder.
I currently believe that thus far, no PPC compiler has made much effort
to schedule FP operations. With just three cycles of latency for a
multiply-add, you can get away with rather sloppy code (in fact, I know
of no shorter FPU pipeline in any other CPUs that reach comparable clock
speeds).

But with five cycles of FP latency, scheduling becomes really important.
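
As a rough illustration of why the two extra cycles matter, here is a toy
model. It is not a simulation of the real 7400 or 7450: it assumes a fully
pipelined FPU that can start one multiply-add per cycle, and it counts the
cycles needed when the work is split over a number of independent
accumulator chains. The function name, the one-issue-per-cycle rule and the
operation counts are all assumptions made for the sketch.

```python
def fp_cycles(n_ops, latency, chains):
    """Cycles to finish n_ops multiply-adds on a toy in-order FPU.

    Assumptions (hypothetical, for illustration only):
      - one multiply-add can be issued per cycle,
      - each op depends on the previous op of the same chain,
      - ops are interleaved round-robin over `chains` accumulators.
    """
    ready = [0] * chains   # cycle at which each chain's last result is ready
    next_slot = 0          # earliest cycle at which the next op may issue
    for i in range(n_ops):
        c = i % chains
        issue = max(next_slot, ready[c])  # stall until the operand is ready
        ready[c] = issue + latency
        next_slot = issue + 1
    return max(ready)


if __name__ == "__main__":
    for latency in (3, 5):
        for chains in (1, 3, 5):
            print(f"latency={latency} chains={chains}: "
                  f"{fp_cycles(12, latency, chains)} cycles")
```

In this model a single dependent chain of 12 multiply-adds takes 36 cycles
at latency 3 and 60 cycles at latency 5. Three interleaved chains fully hide
a latency of 3 (14 cycles) but not a latency of 5 (22 cycles), and it takes
five chains to hide the longer pipeline (16 cycles).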

Branch efficiency has also decreased somewhat compared to the 7400. The
most notable slowdown is that taken branches are no longer 'free', because
the L1 instruction cache now has a latency of three instead of two cycles,
and the branch target instruction cache can only supply enough instructions
for one clock cycle.
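
The arithmetic behind that can be sketched as follows. This is a simplified
reading of the paragraph above, not a description of the real fetch unit:
it assumes the only cost of a correctly predicted taken branch is the wait
for the instruction cache, and that the branch target instruction cache
(BTIC) covers the first cycle of that wait. The function and parameter
names are made up for the example.

```python
def taken_branch_bubble(icache_latency, btic_cycles=1):
    """Empty fetch cycles after a taken branch in a simplified model.

    The BTIC supplies instructions for `btic_cycles` cycles, and the L1
    instruction cache delivers in cycle number `icache_latency`; every
    cycle in between is a bubble. Both rules are assumptions.
    """
    return max(0, icache_latency - 1 - btic_cycles)


if __name__ == "__main__":
    print("2-cycle I-cache (7400-like):", taken_branch_bubble(2), "bubble cycles")
    print("3-cycle I-cache (7450-like):", taken_branch_bubble(3), "bubble cycles")
```

Under these assumptions the two-cycle cache leaves no gap, while the
three-cycle cache leaves one empty cycle per taken branch.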

I was quite surprised by this, because usually branch efficiency becomes
more important the more instructions can be issued per cycle. There _are_
quite a few things a compiler can do to lessen the impact of slower
branches, but I'm not yet sure if this will fully balance the disadvantages.
Furthermore, this kind of 'speculative code motion' is very specific to
the CPU microarchitecture; i.e. code optimized this way for a 7450 might not
run optimally on a 7400.

[...]
> > .... Will GCC have optimizations
> > (workarounds?) for the 7450's longer pipeline?
>
> People are working on it.
>
These aren't really "workarounds". Nowadays CPU architecture and compiler
capabilities have to be regarded together. It may well be the case that
the chip designers made a sound decision to move certain complexities to
the software side rather than to the hardware side.

Any such compiler improvements will also be of use for older PPCs. But
Amdahl's Law strikes again: as G3 and ('old') G4 don't spend much of
their time processing branches, further improvements won't have a big
impact on overall performance. That's why putting such optimizations into
the compiler would have been mostly wasted effort - up to now.
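
To put a number on the Amdahl's Law argument, here is a small sketch. The
branch-time fractions used below are invented purely for illustration; they
are not measurements of any G3 or G4.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of run time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    # Hypothetical: branch handling becomes twice as cheap.
    for fraction in (0.05, 0.20):
        s = amdahl_speedup(fraction, 2.0)
        print(f"{fraction:.0%} of time in branches -> overall speedup {s:.3f}x")
```

If only 5% of the time goes to branches, halving their cost gains less than
3% overall; at 20% the same optimization gains about 11%.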

  Holger



* Re: [OT] 7450
  2001-03-10 15:15   ` Holger Bettag
@ 2001-03-11  1:43     ` David Edelsohn
  2001-03-15 13:05     ` Holger Bettag
  1 sibling, 0 replies; 7+ messages in thread
From: David Edelsohn @ 2001-03-11  1:43 UTC (permalink / raw)
  To: Holger Bettag; +Cc: linuxppc-dev


	There are a number of performance improvements in the forthcoming
GCC 3.0 and later releases (including a further improved scheduler and
software pipelining) which should address some of the performance
limitations in the current GCC release.

David


* Re: [OT] 7450
@ 2001-03-12 11:57 Giuliano Pochini
  2001-03-13  5:56 ` Timothy A. Seufert
  0 siblings, 1 reply; 7+ messages in thread
From: Giuliano Pochini @ 2001-03-12 11:57 UTC (permalink / raw)
  To: linuxppc-dev


>> Giuliano Pochini wrote:
>> >
>> > I read many msg about 7450 performance problems.
>>
>> From who?  People that are actually running hardware or
>> speculating from rumors based on documentation that doesn't exist?
>>

>There are some Mac benchmarks flying around, which don't make the 7450
>look all too good.
>[...]
>But at least one issue remains, and that is surprisingly low FP performance.
>This was measured with one ray tracing application and with an MP3 coder.
>I currently believe that thus far, no PPC compiler has made much effort
>to schedule FP operations. With just three cycles of latency for a
>multiply-add, you can get away with rather sloppy code (in fact, I know
>of no shorter FPU pipeline in any other CPUs that reach comparable clock
>speeds).

>But with five cycles of FP latency, scheduling becomes really important.

Hmm, software compiled for the 604 probably runs fine on the 7450, because
the 604 had similar latencies and a 6-stage pipeline.

>> > .... Will GCC have optimizations
>> > (workarounds?) for the 7450's longer pipeline?
>>
>> People are working on it.

>These aren't really "workarounds". Nowadays CPU architecture and compiler
>capabilities have to be regarded together. It may well be the case that
>the chip designers made a sound decision to move certain complexities to
>the software side rather than to the hardware side.

Yes, but it's not a good thing. People cannot recompile their software.


Bye.
    Giuliano Pochini ->)|(<- Shiny Network {AS6665} ->)|(<-



* Re: [OT] 7450
  2001-03-12 11:57 Giuliano Pochini
@ 2001-03-13  5:56 ` Timothy A. Seufert
  0 siblings, 0 replies; 7+ messages in thread
From: Timothy A. Seufert @ 2001-03-13  5:56 UTC (permalink / raw)
  To: Giuliano Pochini, linuxppc-dev


At 12:57 PM +0100 3/12/01, Giuliano Pochini wrote:

>>But with five cycles of FP latency, scheduling becomes really important.
>
>Hmm, software compiled for the 604 probably runs fine on the 7450, because
>the 604 had similar latencies and a 6-stage pipeline.

The 604 has a 3-stage FPU, same as just about every other PPC Apple
has used.  The 7450 has a 5-stage FPU.

   Tim Seufert


* Re: [OT] 7450
  2001-03-10 15:15   ` Holger Bettag
  2001-03-11  1:43     ` David Edelsohn
@ 2001-03-15 13:05     ` Holger Bettag
  1 sibling, 0 replies; 7+ messages in thread
From: Holger Bettag @ 2001-03-15 13:05 UTC (permalink / raw)
  To: Holger Bettag; +Cc: linuxppc-dev


Holger Bettag <hobold@Informatik.Uni-Bremen.DE> writes:

>
> Dan Malek <dan@mvista.com> writes:
>
> > Giuliano Pochini wrote:
> > >
> > > I read many msg about 7450 performance problems.
> >
> > From who?  People that are actually running hardware or
> > speculating from rumors based on documentation that doesn't exist?
> >
> There are some Mac benchmarks flying around, which don't make the 7450
> look all too good.
>
I got some more info on this issue, and it currently seems that some
benchmarks are penalized not by the CPU core, but by the rather slow
L3 speeds. Apparently, the external L3 runs at a quarter of core speed
on the current high-end G4 Macs; i.e. 166 and 183 MHz for the 666 and
733 MHz models, compared to the 266 MHz L2 of the 533 MHz model.
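
The clock figures can be checked with a few lines of arithmetic. The 4:1
divider for the L3 comes from the paragraph above; the 2:1 divider for the
533 MHz model's L2 is only inferred from the quoted 266 MHz figure. The
function name is made up for the example.

```python
def cache_clock_mhz(core_mhz, divider):
    """Cache clock in whole MHz for a given core clock and clock divider."""
    return core_mhz // divider


if __name__ == "__main__":
    # (core MHz, cache level, divider); the 2:1 ratio is an inference.
    models = [(533, "L2", 2), (666, "L3", 4), (733, "L3", 4)]
    for core, level, divider in models:
        print(f"{core} MHz core: {level} at {cache_clock_mhz(core, divider)} MHz")
```

So the faster cores are paired with an external cache that is clocked
80 to 100 MHz lower than on the 533 MHz model.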

I have heard of examples of both branch intensive and FPU intensive code
where the 7450 is faster per clock than the 7400, as long as the working
set fits in the on-chip caches.

The currently available docs from Motorola say that no L3 configurations
can be tested that result in speeds over 200 MHz. This seems to be a
limitation of the testing procedure/equipment.

Is someone out there running the 7450 in non-Apple environments? What L3
configurations are used there?

  Holger

