* Re: compiler optimization? something else?
[not found] <20000308150957.A12031@drow.res.cmu.edu>
@ 2000-03-08 20:21 ` Sean Harding
0 siblings, 0 replies; 13+ messages in thread
From: Sean Harding @ 2000-03-08 20:21 UTC
To: Daniel Jacobowitz; +Cc: linuxppc-dev
On Wed, 8 Mar 2000, Daniel Jacobowitz wrote:
> Lame has massive hand-coded assembly sections, if I am not mistaken.
Possible, but again, it works extremely well on alpha, sparc and pa-risc
in addition to x86. This isn't simply a case of an app being heavily
optimised for x86 and working poorly on ppc. That's obviously something
the compiler can't help.
sean
--
Sean Harding sharding@dogcow.org |"art may imitate life
http://www.dogcow.org/sean/ | but life imitates t.v."
| --ani difranco
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/
* Re: compiler optimization? something else?
[not found] <20000308204420.29276.qmail@web1705.mail.yahoo.com>
@ 2000-03-08 21:22 ` David Edelsohn
0 siblings, 0 replies; 13+ messages in thread
From: David Edelsohn @ 2000-03-08 21:22 UTC
To: Gabriel Ricard; +Cc: Sean Harding, linuxppc-dev, pcg
PentiumGCC is significantly based on the Pentium optimization work
for GCC performed by Intel.
Cygnus is currently working on a large number of PowerPC
optimizations as part of a contract from IBM. All of this work is
occurring in a branch of the public GCC CVS repository.
Any and all architecture-specific and architecture-independent
optimization work is appreciated, but I think the performance problems and
the current optimization work underway need to be understood by those
offering to help.
Thanks, David
* Re: compiler optimization? something else?
@ 2000-03-08 23:01 Dan Bethe
0 siblings, 0 replies; 13+ messages in thread
From: Dan Bethe @ 2000-03-08 23:01 UTC
To: Gabriel Ricard, jgrantha, jcarr, jhaas, linuxppc-dev; +Cc: pcg
Guys. Let's get Marc a Powerbook. A relevant Powerbook, which I
assume would require a G3, would cost about $1000-2000. See
http://www.powerbookguy.com, who's a new/used dealer in Oakland, CA.
Perhaps a 1400 could be upgraded with a G3 card, as a more
cost-effective purchase. I dunno, but afaik the only relevant criteria
are these:
* cheap, small, easy to ship to Germany without wrecking it
* G3 cpu
* plenny of RAM, maybe at least 96 MB
And Marc could supply whatever else, such as display, ethernet (file
server), etc.
OTOH, if Marc has additional skills such as device drivers, we can
send him a current system such as Wallstreet or Lombard, and have him
additionally code some support for graphics, power management,
Cardbus/pcmcia, and USB.
Is that correct? Let's hear from Marc and from the LinuxPPC guys. I
don't know who else is at Linuxppc.com, who would be the most likely
candidate for funding such a purchase. Gabe and I would be glad to
make the purchase and ship it out.
--- Gabriel Ricard <g_ricard@yahoo.com> wrote:
>
> I've been looking for ways to get Marc Lehman a
> PowerBook so he could work on optimizing the PowerPC
> GCC compiler. Marc (pcg@goof.com) is the person behind
> the PentiumGCC compiler. I've talked with him and
> others about this many times now and he is totally
> willing to work on it, except he doesn't have a PPC
> system, and doesn't have room for a desktop machine,
> or else I woulda shipped him one, even though he lives
> in Europe.
>
> If I can't find anyone willing to donate a Powerbook,
> or help chip in for one maybe it would be possible to
> set up a box somewhere on a network connection, maybe I
> can get the guys at VA to let me stick a box in the
> community racks, then he could work on it remotely.
=====
"Don't expect your own messiah; this neverworld which you desire is
only in your mind." -- http://www.dreamtheater.net/songb4.htm#IV5
* Re: compiler optimization? something else?
[not found] <Pine.GSO.4.05.10003081359150.17227-100000@ophelia.dogcow.org>
@ 2000-03-09 7:33 ` Geert Uytterhoeven
2000-03-09 8:53 ` Timothy A. Seufert
2000-03-09 10:24 ` Gabriel Paubert
2 siblings, 0 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2000-03-09 7:33 UTC
To: Sean Harding; +Cc: David Edelsohn, linuxppc-dev
On Wed, 8 Mar 2000, Sean Harding wrote:
> On Wed, 8 Mar 2000, David Edelsohn wrote:
> (FWIW, another metric...It took 41 minutes to compile a kernel on the
> powermac. It took just over 6 minutes on my PII 350 desktop. The pentium
> was compressing an mp3 at the same time).
I assume the PPC compiled a PPC kernel and the ia32 compiled an ia32 kernel?
Then your benchmark is irrelevant since it hides the complexity difference for
generating good code between ia32 and PPC.
Please redo the timings cross-compiling a PPC kernel on the ia32 or
cross-compiling an ia32 kernel on the PPC.
BTW, if you do timings for integer wavelet transforms, you'll find out that the
PPC is about as fast as (or faster than) the PII 350. For this particular case my
200 MHz 604e outperformed all PCs and Suns I had access to at the university
(SMP doesn't count).
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: compiler optimization? something else?
[not found] <Pine.GSO.4.05.10003081359150.17227-100000@ophelia.dogcow.org>
2000-03-09 7:33 ` Geert Uytterhoeven
@ 2000-03-09 8:53 ` Timothy A. Seufert
2000-03-09 10:24 ` Gabriel Paubert
2 siblings, 0 replies; 13+ messages in thread
From: Timothy A. Seufert @ 2000-03-09 8:53 UTC
To: Sean Harding; +Cc: linuxppc-dev
At 2:18 PM -0800 3/8/00, Sean Harding wrote:
>Obviously a 150MHz PPC workstation isn't going to keep
>up with a 12 proc 500MHz alpha server. But I find it hard to believe that
>it should be as far out of line as it is.
>
>(FWIW, another metric...It took 41 minutes to compile a kernel on the
>powermac. It took just over 6 minutes on my PII 350 desktop. The pentium
>was compressing an mp3 at the same time).
Kernel compiles depend on more than just the CPU. Even if you have
enough RAM to avoid swapping, a fast disk does enhance compilation
times. I believe it also ends up touching a lot of memory, so RAM
speed is important.
In fact, RAM access is an area where some of Apple's older machines
really look weak. The original generation of PCI Macs (which almost
all the PCI Mac clones like yours were derived from) has pretty poor
memory performance.
For what it's worth, I just did a clean build of the kernel, and it
took about 8 minutes. Here's the output of "time make":
real 8m11.447s
user 7m13.080s
sys 0m49.350s
Note the significant system (I/O) time.
This is on a 366 MHz beige G3 with 512K L2 cache, a 5400 RPM IDE
disk, and 192MB RAM. It wasn't compressing an MP3 at the same time
though. :) Assuming the PII's time wouldn't improve enormously if
it were not doing the compression, we're probably within a factor of
2 here. SPECint95 scores suggest that a G3 should generally be
slightly faster than a PII at the same clock rate, so this could
either be normal variation (SPEC doesn't perfectly predict
everything) or a case where performance isn't as good as it could be.
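As a rough check on the figures above, the sketch below converts the "time make" output to seconds and works out the system-time share and the ratios being compared. It assumes the PII build took exactly 6 minutes and the PowerMac build exactly 41 minutes (the thread only gives approximate times), and the helper name parse_time is purely illustrative.

```python
import re

def parse_time(text):
    """Convert a shell-style duration such as '8m11.447s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Figures from the "time make" output quoted in the message.
real = parse_time("8m11.447s")
user = parse_time("7m13.080s")
sys_time = parse_time("0m49.350s")

cpu = user + sys_time              # time the CPU was actually busy
sys_share = sys_time / real        # fraction of elapsed time spent in the kernel

# Assumed round figures for the other two machines (approximate in the thread).
pii_seconds = 6 * 60
powermac_seconds = 41 * 60

g3_vs_pii = real / pii_seconds
powermac_vs_g3 = powermac_seconds / real

print(f"user + sys: {cpu:.2f} s of {real:.2f} s elapsed")
print(f"system share of elapsed time: {sys_share:.1%}")
print(f"G3 build time / PII build time: {g3_vs_pii:.2f}")
print(f"PowerMac build time / G3 build time: {powermac_vs_g3:.2f}")
```

Under those assumptions the G3 build is well inside the "factor of 2" mentioned above, while the 41-minute PowerMac build is roughly five times slower than the G3.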
BTW, one thing which is different about RAM use between the two
platforms is that the PPC tends to use more of it -- PPC binaries are
significantly larger than x86. This is architectural, not a compiler
problem; the PPC instruction set (like most general purpose 32-bit
RISCs) wasn't designed for high code density. In fact, it's one of
the least dense (or most bloated, take your pick) architectures. The
x86 was designed in an era when memory was a lot more precious, so
its binaries are fairly compact.
Tim Seufert
* Re: compiler optimization? something else?
[not found] <Pine.GSO.4.05.10003081359150.17227-100000@ophelia.dogcow.org>
2000-03-09 7:33 ` Geert Uytterhoeven
2000-03-09 8:53 ` Timothy A. Seufert
@ 2000-03-09 10:24 ` Gabriel Paubert
2 siblings, 0 replies; 13+ messages in thread
From: Gabriel Paubert @ 2000-03-09 10:24 UTC
To: Sean Harding; +Cc: David Edelsohn, linuxppc-dev
On Wed, 8 Mar 2000, Sean Harding wrote:
> (FWIW, another metric...It took 41 minutes to compile a kernel on the
> powermac. It took just over 6 minutes on my PII 350 desktop. The pentium
> was compressing an mp3 at the same time).
Which compiler version are you using? A full kernel compile takes just
above 7 minutes on my 233 MHz 750 (MVME2400, 1 MB L2 cache but running at a
slow 93 MHz, I suspect Motorola got the L2 setup wrong) with root and
everything NFS mounted and 32 MB of RAM (no swap).
Are you sure the L2 cache is enabled?
Gabriel.
* Re: compiler optimization? something else?
[not found] <20000308113436.A3472@dogcow.org>
@ 2000-03-09 14:46 ` Franz Sirl
2000-03-09 16:04 ` David Edelsohn
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Franz Sirl @ 2000-03-09 14:46 UTC
To: Sean Harding; +Cc: linuxppc-dev
At 20:34 08.03.00, Sean Harding wrote:
>I posted this on comp.os.linux.powerpc without any response. Perhaps someone
>here can address it.
>
>Is anyone working on optimizing the compiler for PPC? It seems like it could
>use a little work. Maybe that's not actually the problem here, but it's the
>first thing that springs to mind.
>
>I recently was playing with encoding mp3s using LAME on several
>systems. LAME 3.63beta compiles easily out of the box, but it is
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
But what compiler options did it use with your compile? I downloaded it
myself and it seems to default to a simple -O, which is a rather
non-optimal choice. Try editing the Makefile and play with combinations of
these options:
-O2, -O3, -funroll-loops, -funroll-all-loops, -ffast-math (dunno if this
one has an effect on PPC at all), -finline-functions, -mcpu=604, -mcpu=750
Do that and come back with a table listing your encoding times for the
different switch combinations.
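A minimal sketch of how such a table could be laid out: it only enumerates switch combinations and prints an empty results table to fill in by hand. The grouping into optimization level, CPU selection and optional extras is an assumption made for illustration, -funroll-all-loops is left out to keep the list short, and nothing here runs the compiler or the encoder.

```python
import itertools

# Switches taken from the list above; "-O" is the Makefile default
# mentioned in the message and serves as the baseline.
OPT_LEVELS = ["-O", "-O2", "-O3"]
CPU_FLAGS = ["-mcpu=604", "-mcpu=750"]
EXTRAS = ["-funroll-loops", "-ffast-math", "-finline-functions"]

def flag_combinations():
    """Return every combination of one opt level, one CPU flag and any subset of extras."""
    combos = []
    for opt, cpu in itertools.product(OPT_LEVELS, CPU_FLAGS):
        for count in range(len(EXTRAS) + 1):
            for extras in itertools.combinations(EXTRAS, count):
                combos.append(" ".join((opt, cpu) + extras))
    return combos

combos = flag_combinations()
width = max(len(combo) for combo in combos)

# Print a table skeleton; the timing column is filled in after each run.
print(f"{'compiler switches':<{width}}  encode time")
for combo in combos:
    print(f"{combo:<{width}}  ........")
```

With three levels, two CPU choices and three optional switches this yields 48 rows, so in practice it may be worth timing the optimization levels first and adding the extras only to the best one.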
>quite a bit slower than it seems it should be. For one sample file, the
>encoding took 32 minutes on my PPC 604/150. The same file, encoded with the
>same options, took 6 minutes on my PII 366 laptop. I get similar speeds on
>Tru64 alpha systems, HP-UX pa-risc systems and Solaris sparc systems.
>Obviously there are differences in speed for these systems based on the CPU,
>etc. But they all perform in line with what I would expect to see. It's
>just ppc that's way slower.
Well, if you set the compiler options to suboptimal values, this is what
you would expect. I think I saw hand-tuned compiler options for all
platforms you listed, just not for Linux/PPC+gcc, which usually means
nobody cared until now. Remember Linux is a collaborative effort and if
you care about something being done/implemented/optimized/etc you usually
have to do it yourself or maybe kick the right people :-).
>The system is not i/o bound during the encode, and there were no other
>CPU-intensive processes running at the time. If this were an isolated case,
>I wouldn't worry too much about it. But it seems like most CPU-intensive
>tasks are slower than they should be.
>
>My system is pretty generic LinuxPPC 1999 right now. Here's what I have:
>
>juliet ~
>26% uname -a
>Linux juliet 2.2.13 #1 Fri Nov 12 23:01:37 PST 1999 ppc unknown
>
>juliet ~
>27% gcc --version
>egcs-2.91.66
Upgrade to gcc-2.95.2, <ftp://devel.linuxppc.org/users/fsirl/R5/RPMS/ppc/>.
Though I don't believe this will give you a really big improvement in code
optimization, it is a big step forward in compiler correctness on PPC.
Franz.
* Re: compiler optimization? something else?
2000-03-09 14:46 ` compiler optimization? something else? Franz Sirl
@ 2000-03-09 16:04 ` David Edelsohn
2000-03-11 22:32 ` Giuliano Pochini
2000-03-10 9:18 ` Gabriel Paubert
2000-03-11 22:31 ` Giuliano Pochini
2 siblings, 1 reply; 13+ messages in thread
From: David Edelsohn @ 2000-03-09 16:04 UTC
To: Franz Sirl; +Cc: Sean Harding, linuxppc-dev
>>>>> Franz Sirl writes:
Franz> -O2, -O3, -funroll-loops, -funroll-all-loops, -ffast-math (dunno if this
Franz> one has an effect on PPC at all), -finline-functions, -mcpu=604, -mcpu=750
-ffast-math does enable use of the FP "fsel" instruction for
conditional moves. -ffast-math also affects other floating-point
optimizations in the compiler, not just specific architecture features.
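To illustrate why a compiler might want relaxed floating-point rules before using fsel, here is a small Python model. It relies on the architectural behaviour of the instruction (the result is the second source operand if the first is greater than or equal to zero, otherwise the third, and a NaN selector takes the "otherwise" path). The function names and the min example are illustrative; this is not a description of what GCC actually emits.

```python
import math

def fsel(a, c, b):
    """Model of PowerPC fsel: c if a >= 0.0, else b (a NaN selector yields b)."""
    return c if a >= 0.0 else b

def min_branch(x, y):
    """The source-level expression: x < y ? x : y."""
    return x if x < y else y

def min_fsel(x, y):
    """Branch-free version: select on the sign of x - y."""
    return fsel(x - y, y, x)

nan = float("nan")

# Ordinary operands: both forms agree.
print(min_branch(1.0, 2.0), min_fsel(1.0, 2.0))
print(min_branch(2.0, 1.0), min_fsel(2.0, 1.0))

# NaN operand: the comparison is false, so the branch form returns y,
# but x - y is NaN and the fsel form falls through to x.
print(min_branch(nan, 1.0), min_fsel(nan, 1.0))
```

The two forms agree for ordinary numbers but can disagree once a NaN is involved, which is one plausible reason for tying the transformation to -ffast-math.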
David
* Re: compiler optimization? something else?
2000-03-09 14:46 ` compiler optimization? something else? Franz Sirl
2000-03-09 16:04 ` David Edelsohn
@ 2000-03-10 9:18 ` Gabriel Paubert
2000-03-10 15:58 ` David Edelsohn
2000-03-11 22:31 ` Giuliano Pochini
2 siblings, 1 reply; 13+ messages in thread
From: Gabriel Paubert @ 2000-03-10 9:18 UTC
To: Franz Sirl; +Cc: Sean Harding, linuxppc-dev
On Thu, 9 Mar 2000, Franz Sirl wrote:
> But what compiler options did it use with your compile? I downloaded it
> myself and it seems to default to a simple -O, which is a rather
> non-optimal choice. Try editing the Makefile and play with combinations of
> these options:
>
> -O2, -O3, -funroll-loops, -funroll-all-loops, -ffast-math (dunno if this
> one has an effect on PPC at all), -finline-functions, -mcpu=604, -mcpu=750
-ffast-math has some effects in the generic part of the compiler
(simplifying some expressions and constant folding) since even some
obvious mathematical relationships do not hold with IEEE floating point.
Main reasons are the existence of NaNs and the distinction between
positive and negative zeroes. (x+0 and x-0 can't be simplified, x==x may
not be true...). -ffast-math will also replace divisions by constants
with multiplies, which may be slightly less precise but is always
significantly faster (especially on a 604).
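These IEEE corner cases are easy to reproduce. The sketch below uses Python floats, which are IEEE double precision on common platforms, as a stand-in for C doubles; the variable names are illustrative.

```python
import math

nan = float("nan")
neg_zero = -0.0

# x == x is false when x is a NaN, so the comparison cannot be folded to "true".
nan_equals_itself = (nan == nan)

# x + 0 is not always x: adding +0.0 to -0.0 gives +0.0, changing the sign bit.
orig_sign = math.copysign(1.0, neg_zero)
sum_sign = math.copysign(1.0, neg_zero + 0.0)

# Dividing by a constant and multiplying by its reciprocal can round differently.
quotient = 3.0 / 10.0
product = 3.0 * (1.0 / 10.0)

print("nan == nan:", nan_equals_itself)
print("sign of -0.0:", orig_sign, " sign of -0.0 + 0.0:", sum_sign)
print("3.0 / 10.0         =", repr(quotient))
print("3.0 * (1.0 / 10.0) =", repr(product))
```

The last pair differs in the final bit, which is the "slightly less precise" effect of replacing a division with a multiply.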
If you don't have a 601 -fgfxopt might also help since it will generate
fsel instructions for min/max/conditional moves. fsel is unfortunately not
implemented on 601 so you can't set gfxopt for distributions.
Gabriel.
* Re: compiler optimization? something else?
2000-03-10 9:18 ` Gabriel Paubert
@ 2000-03-10 15:58 ` David Edelsohn
0 siblings, 0 replies; 13+ messages in thread
From: David Edelsohn @ 2000-03-10 15:58 UTC
To: Gabriel Paubert; +Cc: Franz Sirl, Sean Harding, linuxppc-dev
>>>>> Gabriel Paubert writes:
Gabriel> If you don't have a 601 -fgfxopt might also help since it will generate
Gabriel> fsel instructions for min/max/conditional moves. fsel is unfortunately not
Gabriel> implemented on 601 so you can't set gfxopt for distributions.
Using the appropriate -mcpu=XXX option (e.g., -mcpu=604) sets
-fgfxopt accordingly.
David
* Re: compiler optimization? something else?
2000-03-11 22:32 ` Giuliano Pochini
@ 2000-03-11 19:44 ` David Edelsohn
0 siblings, 0 replies; 13+ messages in thread
From: David Edelsohn @ 2000-03-11 19:44 UTC
To: Giuliano Pochini; +Cc: Franz Sirl, Sean Harding, linuxppc-dev
Fused multiply-add is always used on POWER/PowerPC unless the user
specifically requests that it not be used. -ffast-math does not affect
it.
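For readers unfamiliar with the instruction: a fused multiply-add computes a*b + c with a single rounding at the end, whereas a separate multiply and add round twice, so the results can differ. The sketch below emulates the fused form with exact rational arithmetic. The function name and operand values are chosen for illustration, and Python floats are assumed to be IEEE doubles.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate a*b + c with one final rounding by computing exactly, then converting."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Operands chosen so that the exact product needs more bits than a double holds.
a = b = 1.0 + 2.0 ** -30
c = -(1.0 + 2.0 ** -29)

separate = a * b + c                  # product is rounded before the add
fused = fused_multiply_add(a, b, c)   # only the final sum is rounded

print("separate multiply and add:", separate)
print("fused multiply-add:       ", fused)
```

Here the separately rounded product loses its low-order term, so the unfused result is exactly zero while the fused result keeps the small remainder.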
David
* Re: compiler optimization? something else?
2000-03-09 14:46 ` compiler optimization? something else? Franz Sirl
2000-03-09 16:04 ` David Edelsohn
2000-03-10 9:18 ` Gabriel Paubert
@ 2000-03-11 22:31 ` Giuliano Pochini
2 siblings, 0 replies; 13+ messages in thread
From: Giuliano Pochini @ 2000-03-11 22:31 UTC
To: Franz Sirl; +Cc: Sean Harding, linuxppc-dev
> >juliet ~
> >27% gcc --version
> >egcs-2.91.66
>
> Upgrade to gcc-2.95.2, <ftp://devel.linuxppc.org/users/fsirl/R5/RPMS/ppc/>.
> Though I don't believe this will give you a really big improvement in code
> optimization, it is a big step forward in compiler correctness on PPC.
I don't see any speed improvement from egcs-1.1.2 to gcc 2.95.2. Various
tests done with lame and mpg123 show that 2.95.2 produces code only 0.5%
faster or so. Executable size is 10-15% smaller.
Bye.
* Re: compiler optimization? something else?
2000-03-09 16:04 ` David Edelsohn
@ 2000-03-11 22:32 ` Giuliano Pochini
2000-03-11 19:44 ` David Edelsohn
0 siblings, 1 reply; 13+ messages in thread
From: Giuliano Pochini @ 2000-03-11 22:32 UTC
To: David Edelsohn; +Cc: Franz Sirl, Sean Harding, linuxppc-dev
> Franz> -O2, -O3, -funroll-loops, -funroll-all-loops, -ffast-math (dunno if this
> Franz> one has an effect on PPC at all), -finline-functions, -mcpu=604, -mcpu=750
>
> -ffast-math does enable use of the FP "fsel" instruction for
> conditional moves. -ffast-math also affects other floating-point
> optimizations in the compiler, not just specific architecture features.
And mul-add's & friends ?
Bye.
Thread overview: 13+ messages
[not found] <20000308113436.A3472@dogcow.org>
2000-03-09 14:46 ` compiler optimization? something else? Franz Sirl
2000-03-09 16:04 ` David Edelsohn
2000-03-11 22:32 ` Giuliano Pochini
2000-03-11 19:44 ` David Edelsohn
2000-03-10 9:18 ` Gabriel Paubert
2000-03-10 15:58 ` David Edelsohn
2000-03-11 22:31 ` Giuliano Pochini
[not found] <Pine.GSO.4.05.10003081359150.17227-100000@ophelia.dogcow.org>
2000-03-09 7:33 ` Geert Uytterhoeven
2000-03-09 8:53 ` Timothy A. Seufert
2000-03-09 10:24 ` Gabriel Paubert
2000-03-08 23:01 Dan Bethe
[not found] <20000308204420.29276.qmail@web1705.mail.yahoo.com>
2000-03-08 21:22 ` David Edelsohn
[not found] <20000308150957.A12031@drow.res.cmu.edu>
2000-03-08 20:21 ` Sean Harding