* [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
@ 2005-08-16 9:23 Kurt Fitzner
2005-08-16 13:00 ` Michael S. Zick
2005-08-17 6:19 ` Grant Grundler
0 siblings, 2 replies; 21+ messages in thread
From: Kurt Fitzner @ 2005-08-16 9:23 UTC (permalink / raw)
To: parisc-linux
In the interim until I can source an ISA/EISA fast ethernet card, I've
been playing with my new C160. I decided to benchmark it and compare it
to my B132L. To my surprise, when it came to integer operations, the
B132L outperforms the C160!
I benchmarked using nbench, which is a Linux/Unix port of Byte
magazine's ByteMark. Full results from both machines are at the end of
this post.
Both machines were benchmarked using identical binaries compiled with:
-O3 -march=1.1 -mschedule=7300 -mfast-indirect-calls -mgas
Thinking that the scheduling and architecture might be slowing down the
C160, I recompiled it with:
-O3 -march=2.0 -mschedule=8000 -mfast-indirect-calls -mgas
When that produced even worse results, I tried -march=2.0 vs 1.1 and
-mschedule=8000 vs 7300 separately. Each one alone slows down the
benchmark and the effect is additive. It seems that in Linux, right
now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
If you look at the individual results, in most areas the C160 performs
about 20% better than the B132. It's just that in a few areas, the C160
has absolutely dismal performance. Numeric sorting and the assignment
algorithm were both notably slower on the C160.
With a clock speed 20% faster, I must admit that the C160's poor showing
was a disappointment. I'm wondering if this is because there isn't a
64-bit userland yet. Is stepping down to 32-bit on the C160 hurting its
performance that badly?
I suppose (assuming I'm correct about the reason for the performance
drop) my options are to wait for 64-bit userland or to put HPUX on it.
Is there any way for someone who is quite strong in system-level
programming in general, though weak in Linux kernel programming
specifically, to help the 64-bit userland effort? Is there a project
web site for this effort?
Kurt
Phong (C160):
------------------------------------------------------------------
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 37.51 : 0.96 : 0.32
STRING SORT : 5.0486 : 2.26 : 0.35
BITFIELD : 1.6052e+07 : 2.75 : 0.58
FP EMULATION : 8.4215 : 4.04 : 0.93
FOURIER : 1102.1 : 1.25 : 0.70
ASSIGNMENT : 0.59547 : 2.27 : 0.59
IDEA : 115.34 : 1.76 : 0.52
HUFFMAN : 89.382 : 2.48 : 0.79
NEURAL NET : 1.6905 : 2.72 : 1.14
LU DECOMPOSITION : 41.254 : 2.14 : 1.54
=======================ORIGINAL BYTEMARK RESULTS=======================
INTEGER INDEX : 2.187
FLOATING-POINT INDEX: 1.938
Baseline: MSDOS P90, 256 KB L2-cache, Watcom* compiler 10.0
===========================LINUX DATA BELOW============================
CPU : Raven U 160 (9000/780/C160) 160MHz
L2 Cache : 512 KB (WB, 0-way associative)
OS : Linux 2.6.10-pa11-phong-3
C compiler : gcc version 3.3.5 (Debian 1:3.3.5-13)
libc : ld-2.3.2.so
MEMORY INDEX : 0.491
INTEGER INDEX : 0.591
FLOATING-POINT INDEX: 1.075
Baseline: Linux AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
=======================================================================
Megabyte (B132L):
------------------------------------------------------------------
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 60.695 : 1.56 : 0.51
STRING SORT : 3.3905 : 1.51 : 0.23
BITFIELD : 1.1081e+07 : 1.90 : 0.40
FP EMULATION : 6.0832 : 2.92 : 0.67
FOURIER : 876.58 : 1.00 : 0.56
ASSIGNMENT : 0.80283 : 3.05 : 0.79
IDEA : 150.04 : 2.29 : 0.68
HUFFMAN : 76.017 : 2.11 : 0.67
NEURAL NET : 1.1334 : 1.82 : 0.77
LU DECOMPOSITION : 41.733 : 2.16 : 1.56
=======================ORIGINAL BYTEMARK RESULTS=======================
INTEGER INDEX : 2.121
FLOATING-POINT INDEX: 1.577
Baseline: MSDOS P90, 256 KB L2-cache, Watcom* compiler 10.0
===========================LINUX DATA BELOW============================
CPU : Merlin L2 132 (9000/778/B132L) 132MHz
Cache : 64 KB (WB, 0-way associative)
OS : Linux 2.6.8.1-pa11-megabyte-20050720
C compiler : gcc version 3.3.5 (Debian 1:3.3.5-13)
libc : ld-2.3.2.so
MEMORY INDEX : 0.419
INTEGER INDEX : 0.630
FLOATING-POINT INDEX: 0.875
Baseline: Linux AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
=======================================================================
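For a per-test view of the two tables above, the raw iterations/sec figures can be divided directly. The following is only a sketch: the numbers are copied from the nbench output in this message, while the names `c160`, `b132l`, and `speed_ratios` are made up for illustration.

```python
# Iterations/sec copied from the two nbench tables in this message.
c160 = {
    "NUMERIC SORT": 37.51,
    "STRING SORT": 5.0486,
    "BITFIELD": 1.6052e+07,
    "FP EMULATION": 8.4215,
    "FOURIER": 1102.1,
    "ASSIGNMENT": 0.59547,
    "IDEA": 115.34,
    "HUFFMAN": 89.382,
    "NEURAL NET": 1.6905,
    "LU DECOMPOSITION": 41.254,
}
b132l = {
    "NUMERIC SORT": 60.695,
    "STRING SORT": 3.3905,
    "BITFIELD": 1.1081e+07,
    "FP EMULATION": 6.0832,
    "FOURIER": 876.58,
    "ASSIGNMENT": 0.80283,
    "IDEA": 150.04,
    "HUFFMAN": 76.017,
    "NEURAL NET": 1.1334,
    "LU DECOMPOSITION": 41.733,
}


def speed_ratios(machine_a, machine_b):
    """Return machine_a / machine_b iterations/sec for every test."""
    return {test: machine_a[test] / machine_b[test] for test in machine_a}


ratios = speed_ratios(c160, b132l)

# Print slowest-relative tests first; a ratio below 1.0 means the C160 lost.
for test, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{test:<17} {ratio:5.2f}")

slower_on_c160 = [test for test, ratio in ratios.items() if ratio < 1.0]
```

With these figures, any test with a ratio below 1.0 ran slower on the C160 than on the B132L, which makes it easy to see which individual tests pull the integer index down.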
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-16 9:23 [parisc-linux] B132L outperforms C160 - 64-bit userland needed? Kurt Fitzner
@ 2005-08-16 13:00 ` Michael S. Zick
2005-08-17 0:03 ` Kurt Fitzner
2005-08-17 6:19 ` Grant Grundler
1 sibling, 1 reply; 21+ messages in thread
From: Michael S. Zick @ 2005-08-16 13:00 UTC (permalink / raw)
To: parisc-linux
On Tue August 16 2005 04:23, Kurt Fitzner wrote:
> In the interim until I can source an ISA/EISA fast ethernet card, I've
> been playing with my new C160. I decided to benchmark it and compare it
> to my B132L. To my surprise, when it came to integer operations, the
> B132L outperforms the C160!
>
> I benchmarked using nbench, which is a Linux/Unix port of Byte
> magazine's ByteMark. Full results from both machines are at the end of
> this post.
>
> Both machines were benchmarked using identical binaries compiled with:
> -O3 -march=1.1 -mschedule=7300 -mfast-indirect-calls -mgas
>
> Thinking that the scheduling and architecture might be slowing down the
> C160, I recompiled it with:
> -O3 -march=2.0 -mschedule=8000 -mfast-indirect-calls -mgas
>
> When that produced even worse results, I tried -march=2.0 vs 1.1 and
> -mschedule=8000 vs 7300 separately. Each one alone slows down the
> benchmark and the effect is additive. It seems that in Linux, right
> now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
>
It would be interesting to see if this also holds with a newer GCC.
(3.4, 4.0, 4.1)
>
> If you look at the individual results, in most areas the C160 performs
> about 20% better than the B132. It's just that in a few areas, the C160
> has absolutely dismal performance. Numeric sorting and the assignment
> algorithm were both notably slower on the C160.
>
> With a clock speed 20% faster, I must admit that the C160's poor showing
> was a disappointment. I'm wondering if this is because there isn't a
> 64-bit userland yet. Is stepping down to 32-bit on the C160 hurting its
> performance that badly?
>
Try the same version kernel on both machines - you might just be seeing
the difference between 2.6.8 and 2.6.10. (or 32bit and 64bit kernels).
Also, what compiler was used to build the kernel?
Mike
> I suppose (assuming I'm correct about the reason for the performance
> drop) my options are to wait for 64-bit userland or to put HPUX on it.
>
> Is there any way for someone who is quite strong in system-level
> programming in general, though weak in Linux kernel programming
> specifically, to help the 64-bit userland effort? Is there a project
> web site for this effort?
>
> Kurt
>
>
> Phong (C160):
> ------------------------------------------------------------------
> TEST : Iterations/sec. : Old Index : New Index
> : : Pentium 90* : AMD K6/233*
> --------------------:------------------:-------------:------------
> NUMERIC SORT : 37.51 : 0.96 : 0.32
> STRING SORT : 5.0486 : 2.26 : 0.35
> BITFIELD : 1.6052e+07 : 2.75 : 0.58
> FP EMULATION : 8.4215 : 4.04 : 0.93
> FOURIER : 1102.1 : 1.25 : 0.70
> ASSIGNMENT : 0.59547 : 2.27 : 0.59
> IDEA : 115.34 : 1.76 : 0.52
> HUFFMAN : 89.382 : 2.48 : 0.79
> NEURAL NET : 1.6905 : 2.72 : 1.14
> LU DECOMPOSITION : 41.254 : 2.14 : 1.54
> =======================ORIGINAL BYTEMARK RESULTS=======================
> INTEGER INDEX : 2.187
> FLOATING-POINT INDEX: 1.938
> Baseline: MSDOS P90, 256 KB L2-cache, Watcom* compiler 10.0
> ===========================LINUX DATA BELOW============================
> CPU : Raven U 160 (9000/780/C160) 160MHz
> L2 Cache : 512 KB (WB, 0-way associative)
> OS : Linux 2.6.10-pa11-phong-3
> C compiler : gcc version 3.3.5 (Debian 1:3.3.5-13)
> libc : ld-2.3.2.so
> MEMORY INDEX : 0.491
> INTEGER INDEX : 0.591
> FLOATING-POINT INDEX: 1.075
> Baseline: Linux AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
> =======================================================================
>
>
> Megabyte (B132L):
> ------------------------------------------------------------------
> TEST : Iterations/sec. : Old Index : New Index
> : : Pentium 90* : AMD K6/233*
> --------------------:------------------:-------------:------------
> NUMERIC SORT : 60.695 : 1.56 : 0.51
> STRING SORT : 3.3905 : 1.51 : 0.23
> BITFIELD : 1.1081e+07 : 1.90 : 0.40
> FP EMULATION : 6.0832 : 2.92 : 0.67
> FOURIER : 876.58 : 1.00 : 0.56
> ASSIGNMENT : 0.80283 : 3.05 : 0.79
> IDEA : 150.04 : 2.29 : 0.68
> HUFFMAN : 76.017 : 2.11 : 0.67
> NEURAL NET : 1.1334 : 1.82 : 0.77
> LU DECOMPOSITION : 41.733 : 2.16 : 1.56
> =======================ORIGINAL BYTEMARK RESULTS=======================
> INTEGER INDEX : 2.121
> FLOATING-POINT INDEX: 1.577
> Baseline: MSDOS P90, 256 KB L2-cache, Watcom* compiler 10.0
> ===========================LINUX DATA BELOW============================
> CPU : Merlin L2 132 (9000/778/B132L) 132MHz
> Cache : 64 KB (WB, 0-way associative)
> OS : Linux 2.6.8.1-pa11-megabyte-20050720
> C compiler : gcc version 3.3.5 (Debian 1:3.3.5-13)
> libc : ld-2.3.2.so
> MEMORY INDEX : 0.419
> INTEGER INDEX : 0.630
> FLOATING-POINT INDEX: 0.875
> Baseline: Linux AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
> =======================================================================
>
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-16 13:00 ` Michael S. Zick
@ 2005-08-17 0:03 ` Kurt Fitzner
2005-08-17 1:32 ` John David Anglin
0 siblings, 1 reply; 21+ messages in thread
From: Kurt Fitzner @ 2005-08-17 0:03 UTC (permalink / raw)
To: parisc-linux
Michael S. Zick wrote:
>>When that produced even worse results, I tried -march=2.0 vs 1.1 and
>>-mschedule=8000 vs 7300 separately. Each one alone slows down the
>>benchmark and the effect is additive. It seems that in Linux, right
>>now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
>>
>
> It would be interesting to see if this also holds with a newer GCC.
> (3.4, 4.0, 4.1)
I can't see how it would be different. Isn't compiling for PA2.0/8000
in Linux tying one of GCC's hands behind its back? You're telling it you
want good code for a 64-bit CPU, but it can't produce 64-bit code.
Is there any real possibility that this is compiler-related and not
simply a 32 vs. 64 bit issue? If there is a real chance of this, I'll
bite the bullet and actually test out newer GCC versions.
> Try the same version kernel on both machines - you might just be seeing
> the difference between 2.6.8 and 2.6.10. (or 32bit and 64bit kernels).
I've installed 2.6.8.1 on the C160 to match the B132L. I'm seeing a 2%
increase in speed across the board with the different kernel. I
attribute this to my having set the CPU to 7300 in the kernel settings.
The speed increase is exactly consistent with the difference I see in
executables on my C160 when compiled with -march=1.1/-mschedule=7300
rather than 2.0/8000. The C160 still underperforms significantly when
compared to the B132L.
I have not yet compiled a 64-bit kernel for my C160. All advice I have
read is that this is a complete waste of time since without a 64-bit
userland it will just make the kernel bigger and slower. Is there
likely to be any benefit at all to a 64-bit kernel?
> Also, what compiler was used to build the kernel?
Same as used to build nBench, gcc 3.3.5 (Debian 1:3.3.5-13).
Kurt.
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 0:03 ` Kurt Fitzner
@ 2005-08-17 1:32 ` John David Anglin
2005-08-17 1:48 ` Michael S. Zick
2005-08-17 3:43 ` Kurt Fitzner
0 siblings, 2 replies; 21+ messages in thread
From: John David Anglin @ 2005-08-17 1:32 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
> >>When that produced even worse results, I tried -march=2.0 vs 1.1 and
> >>-mschedule=8000 vs 7300 separately. Each one alone slows down the
> >>benchmark and the effect is additive. It seems that in Linux, right
> >>now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
> >>
In theory, -mschedule=8000 should only be used on machines with PA 2.0
processors (i.e., not the B132L). It is tweaked to the number of execution
units, etc, in the PA 2.0 processor. How much difference this makes in the
real world is not clear. I haven't seen any numbers. As far as the
models themselves, they haven't changed since they were added by Jeff
Law somewhere around GCC 3.0. It would be interesting to see how they
compare on the same cpu, same os, etc.
As far as PA 2.0 versus 1.1 goes, the main differences affecting 32-bit code
are some new branch instructions. There are also some new FP instructions
but these are somewhat compromised by linker bugs. In non floating-point
code, I would expect the PA 2.0 features to make their presence felt
in code with large functions.
> > It would be interesting to see if this also holds with a newer GCC.
> > (3.4, 4.0, 4.1)
There have been a lot of optimization improvements in GCC since 3.3.
It would be useful to see how effective they are in real applications
and in benchmark performance. As far as the PA backend goes, there
haven't been any major performance improvements added since 3.3. The
changes mainly are bug fixes.
> I can't see how it would be different. Isn't compiling for PA2.0/8000
> in Linux tying one of GCC's hands behind its back? You're telling it you
> want good code for a 64-bit CPU, but it can't produce 64-bit code.
>
> Is there any real possibility that this is compiler-related and not
> simply a 32 vs. 64 bit issue? If there is a real chance of this, I'll
> bite the bullet and actually test out newer GCC versions.
64-bit code isn't going to make your apps run faster. There is more
overhead in data accesses in 64-bit code (i.e., they go through the DLT)
than in 32-bit apps. Also, a lot more sign extensions are needed. In
terms of a GCC build, the difference is about 15-20%. The 64-bit tools
are less mature. So generally, you only want to use 64-bit apps when
they can benefit from the larger address space.
> I have not yet compiled a 64-bit kernel for my C160. All advice I have
> read is that this is a complete waste of time since without a 64-bit
> userland it will just make the kernel bigger and slower. Is there
> likely to be any benefit at all to a 64-bit kernel?
I doubt it. You only want to use a 64-bit kernel when you have a machine
with lots of memory.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 1:32 ` John David Anglin
@ 2005-08-17 1:48 ` Michael S. Zick
2005-08-17 3:43 ` Kurt Fitzner
1 sibling, 0 replies; 21+ messages in thread
From: Michael S. Zick @ 2005-08-17 1:48 UTC (permalink / raw)
To: parisc-linux
On Tue August 16 2005 20:32, John David Anglin wrote:
> > >>When that produced even worse results, I tried -march=2.0 vs 1.1 and
> > >>-mschedule=8000 vs 7300 separately. Each one alone slows down the
> > >>benchmark and the effect is additive. It seems that in Linux, right
> > >>now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
> > >>
>
> In theory, -mschedule=8000 should only be used on machines with PA 2.0
> processors (i.e., not the B132L). It is tweaked to the number of execution
> units, etc, in the PA 2.0 processor. How much difference this makes in the
> real world is not clear. I haven't seen any numbers. As far as the
> models themselves, they haven't changed since they were added by Jeff
> Law somewhere around GCC 3.0. It would be interesting to see how they
> compare on the same cpu, same os, etc.
>
One other suggestion.
Try running the Povray benchmark - It is compute intensive and long
enough running to turn other kernel activity into noise.
You should be able to just: apt-get install povray
But it could be that the benchmark is separately packaged.
I think Joel tried hppa-Povray - he might have information to share.
Mike
> As far as PA 2.0 versus 1.1 goes, the main differences affecting 32-bit code
> are some new branch instructions. There are also some new FP instructions
> but these are somewhat compromised by linker bugs. In non floating-point
> code, I would expect the PA 2.0 features to make their presence felt
> in code with large functions.
>
> > > It would be interesting to see if this also holds with a newer GCC.
> > > (3.4, 4.0, 4.1)
>
> There have been a lot of optimization improvements in GCC since 3.3.
> It would be useful to see how effective they are in real applications
> and in benchmark performance. As far as the PA backend goes, there
> haven't been any major performance improvements added since 3.3. The
> changes mainly are bug fixes.
>
> > I can't see how it would be different. Isn't compiling for PA2.0/8000
> > in Linux tying one of GCC's hands behind its back? You're telling it you
> > want good code for a 64-bit CPU, but it can't produce 64-bit code.
> >
> > Is there any real possibility that this is compiler-related and not
> > simply a 32 vs. 64 bit issue? If there is a real chance of this, I'll
> > bite the bullet and actually test out newer GCC versions.
>
> 64-bit code isn't going to make your apps run faster. There is more
> overhead in data accesses in 64-bit code (i.e., they go through the DLT)
> than in 32-bit apps. Also, a lot more sign extensions are needed. In
> terms of a GCC build, the difference is about 15-20%. The 64-bit tools
> are less mature. So generally, you only want to use 64-bit apps when
> they can benefit from the larger address space.
>
> > I have not yet compiled a 64-bit kernel for my C160. All advice I have
> > read is that this is a complete waste of time since without a 64-bit
> > userland it will just make the kernel bigger and slower. Is there
> > likely to be any benefit at all to a 64-bit kernel?
>
> I doubt it. You only want to use a 64-bit kernel when you have a machine
> with lots of memory.
>
> Dave
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 1:32 ` John David Anglin
2005-08-17 1:48 ` Michael S. Zick
@ 2005-08-17 3:43 ` Kurt Fitzner
2005-08-17 6:37 ` Grant Grundler
2005-08-17 14:16 ` John David Anglin
1 sibling, 2 replies; 21+ messages in thread
From: Kurt Fitzner @ 2005-08-17 3:43 UTC (permalink / raw)
To: parisc-linux
John David Anglin wrote:
>>>>When that produced even worse results, I tried -march=2.0 vs 1.1 and
>>>>-mschedule=8000 vs 7300 separately. Each one alone slows down the
>>>>benchmark and the effect is additive. It seems that in Linux, right
>>>>now at least, compiling with -march=2.0 or -mschedule=8000 is a Bad Thing.
>>>>
>
>
> In theory, -mschedule=8000 should only be used on machines with PA 2.0
> processors (i.e., not the B132L).
I must not have been clear. Wanting to have as level a playing field as
possible for benchmarking, I first used identical binaries for both the
PA 1.1 and PA 2.0 machines. These binaries were built as
-march=1.1/-mschedule=7300. When I first noticed that the B132L was
outperforming the C160, I then recompiled the C160 binaries as
-march=2.0/-mschedule=8000. My thinking was that the 1.1/7300
binaries on the C160 were perhaps causing the poor results. The
recompiled 2.0/8000 binaries had even /poorer/ results, on the order of
a consistent two percent reduction in performance over 1.1/7300.
> ...It is tweaked to the number of execution
> units, etc, in the PA 2.0 processor. How much difference this makes in the
> real world is not clear. I haven't seen any numbers. As far as the
> models themselves, they haven't changed since they were added by Jeff
> Law somewhere around GCC 3.0. It would be interesting to see how they
> compare on the same cpu, same os, etc.
In Linux 2.6.10-pa11 and 2.6.8.1-pa11 with a 32 bit kernel on a C160
(PA-8000 cpu)
* -march=1.1 produces code that performs approximately one percent
faster than -march=2.0
* -mschedule=7300 produces code that is approx. one percent faster than
-mschedule=8000
* The two above are additive - 1.1/7300 is two percent faster than 2.0/8000
* Code compiled as 1.1/7300 and also run on a C160 with its kernel
configured with the PA7300LC processor type (as opposed to configured as
PA8000) enjoys another ~2% speed boost that is additive.
So, to be completely clear, where the baseline is a C160, Linux
2.6.8.1-pa11 configured for PA8000 cpu and where the user binary is
compiled with -march=2.0 and -mschedule=8000:
- User binary compiled with -march=1.1 = +1% performance
- User binary compiled with -mschedule=7300 = +1% performance
- Kernel configured as PA7300LC = +2% performance
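As a rough check on how the three gains stack, here is a small sketch. It takes the approximate percentages quoted above at face value; whether they combine by simple addition or by compounding is an assumption either way, and the function names are invented for illustration.

```python
def combine_additive(gains):
    """Treat each fractional gain as a simple sum."""
    return sum(gains)


def combine_multiplicative(gains):
    """Treat each fractional gain as compounding on top of the others."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0


# Approximate gains reported over the 2.0/8000 baseline on the C160.
gains = [
    0.01,  # user binary built with -march=1.1 instead of 2.0
    0.01,  # user binary built with -mschedule=7300 instead of 8000
    0.02,  # kernel configured as PA7300LC instead of PA8000
]

additive = combine_additive(gains)
compounded = combine_multiplicative(gains)
print(f"additive:   {additive:.4%}")
print(f"compounded: {compounded:.4%}")
```

At gains this small the two models differ by only a few hundredths of a percent, so "additive" is a fair description of measurements taken at roughly one-percent resolution.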
> As far as PA 2.0 versus 1.1 goes, the main differences affecting 32-bit code
> are some new branch instructions. There are also some new FP instructions
> but these are somewhat compromised by linker bugs. In non floating-point
> code, I would expect the PA 2.0 features to make their presence felt
> in code with large functions.
Is there anything in PA 2.0 that you would expect to cause poorer
performance in any circumstance when compared to PA 1.1?
> There have been a lot of optimization improvements in GCC since 3.3.
> It would be useful to see how effective they are in real applications
> and in benchmark performance. As far as the PA backend goes, there
> haven't been any major performance improvements added since 3.3. The
> changes mainly are bug fixes.
So, what I'm hearing is that I might expect to see better code across
the board with a post 3.3 compiler, but that there is unlikely to be a
change in what I am seeing with 1.1 vs 2.0, 7300 vs 8000?
> 64-bit code isn't going to make your apps run faster. There is more
> overhead in data accesses in 64-bit code (i.e., they go through the DLT)
> than in 32-bit apps. Also, a lot more sign extensions are needed. In
> terms of a GCC build, the difference is about 15-20%. The 64-bit tools
> are less mature. So generally, you only want to use 64-bit apps when
> they can benefit from the larger address space.
I'm not strong on the PA architecture - I assumed on a 64 bit machine
that the data bus would be 64 bits wide. Thus, I would have thought
that 64-bit compiled apps on such a machine would run at least as fast
as 32 bit ones, and be superior in some areas such as when they had to
deal with 64 bit values like file offsets. If less bit-width is better,
shouldn't we all be going back to 6502? :)
In any case, if what I'm seeing isn't due to 32 vs 64 bit, then I am
completely baffled by what I'm seeing. Why would a C160 running at a
20% faster clock speed, having eight times the cache, and a design that
should be superior in every sense run slower than B132L? I'm not trying
to blame my machine's poor performance on anyone, but I must admit that
having read as much as I can on the C160 vs B132L that I can find no
explanation in the hardware.
Kurt.
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 3:43 ` Kurt Fitzner
@ 2005-08-17 6:37 ` Grant Grundler
2005-08-17 14:16 ` John David Anglin
1 sibling, 0 replies; 21+ messages in thread
From: Grant Grundler @ 2005-08-17 6:37 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
On Tue, Aug 16, 2005 at 09:43:51PM -0600, Kurt Fitzner wrote:
> > As far as PA 2.0 versus 1.1 goes, the main differences affecting 32-bit code
> > are some new branch instructions. There are also some new FP instructions
> > but these are somewhat compromised by linker bugs. In non floating-point
> > code, I would expect the PA 2.0 features to make their presence felt
> > in code with large functions.
>
> Is there anything in PA 2.0 that you would expect to cause poorer
> performance in any circumstance when compared to PA 1.1?
No - PA2.0 HW will in general perform better.
I agree it's odd that it doesn't.
And it may not be true for HPUX acc compiler or other gcc versions.
> > 64-bit code isn't going to make your apps run faster. There is more
> > overhead in data accesses in 64-bit code (i.e., they go through the DLT)
> > than in 32-bit apps. Also, a lot more sign extensions are needed. In
> > terms of a GCC build, the difference is about 15-20%. The 64-bit tools
> > are less mature. So generally, you only want to use 64-bit apps when
> > they can benefit from the larger address space.
>
> I'm not strong on the PA architecture - I assumed on a 64 bit machine
> that the data bus would be 64 bits wide.
It's not. Compare "Runway" vs "GSC+" bus on URL below.
> Thus, I would have thought
> that 64-bit compiled apps on such a machine would run at least as fast
> as 32 bit ones, and be superior in some areas such as when they had to
> deal with 64 bit values like file offsets. If less bit-width is better,
> shouldn't we all be going back to 6502? :)
No. What Dave didn't mention (but implied) was that data structures
get bigger (64 bit vs 32-bit for the same field) and thus require
more cachelines for the same data structure. Note that the CPU loads
*cachelines* from memory and not individual pointers or fields.
And thus must potentially load *more* cachelines when running 64-bit
mode than when running 32-bit (narrow) mode apps.
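The cacheline point can be illustrated with a little arithmetic. This is a sketch under stated assumptions: the structure layout (six pointers plus two 32-bit integers), the 32-byte line size, and the helper names are all hypothetical, padding is ignored, and the structure is assumed to start on a line boundary. None of these figures are taken from PA-RISC documentation.

```python
INT_BYTES = 4  # a 32-bit integer field in both models


def struct_size(n_pointers, n_ints, pointer_bytes):
    """Size in bytes of a structure with the given fields, ignoring padding."""
    return n_pointers * pointer_bytes + n_ints * INT_BYTES


def cachelines_needed(size_bytes, line_bytes):
    """Number of cachelines touched, assuming line-aligned placement."""
    return -(-size_bytes // line_bytes)  # ceiling division


LINE_BYTES = 32  # assumed line size, for illustration only

narrow = struct_size(n_pointers=6, n_ints=2, pointer_bytes=4)  # 32-bit pointers
wide = struct_size(n_pointers=6, n_ints=2, pointer_bytes=8)    # 64-bit pointers

narrow_lines = cachelines_needed(narrow, LINE_BYTES)
wide_lines = cachelines_needed(wide, LINE_BYTES)
print(f"32-bit: {narrow} bytes, {narrow_lines} line(s)")
print(f"64-bit: {wide} bytes, {wide_lines} line(s)")
```

In this made-up layout the same structure grows from 32 to 56 bytes, so walking it touches two lines instead of one, which is the extra memory traffic described above.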
> In any case, if what I'm seeing isn't due to 32 vs 64 bit, then I am
> completely baffled by what I'm seeing. Why would a C160 running at a
> 20% faster clock speed, having eight times the cache, and a design that
> should be superior in every sense run slower than B132L?
The chipset was designed with different tradeoffs. C160 can take 1.5 or 2GB
of RAM. B132L maxes out at 768MB (and that might not even be supported).
> I'm not trying
> to blame my machine's poor performance on anyone, but I must admit that
> having read as much as I can on the C160 vs B132L that I can find no
> explanation in the hardware.
Try http://www.openpa.net/index.html
(That's the "The OpenPA Project" off the www.parisc-linux.org Nav Bar.)
The chipsets in the two boxes are very different.
It's definitely a difference in the system, not just the CPU.
grant
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 3:43 ` Kurt Fitzner
2005-08-17 6:37 ` Grant Grundler
@ 2005-08-17 14:16 ` John David Anglin
1 sibling, 0 replies; 21+ messages in thread
From: John David Anglin @ 2005-08-17 14:16 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
> > ...It is tweaked to the number of execution
> > units, etc, in the PA 2.0 processor. How much difference this makes in the
> > real world is not clear. I haven't seen any numbers. As far as the
> > models themselves, they haven't changed since they were added by Jeff
> > Law somewhere around GCC 3.0. It would be interesting to see how they
> > compare on the same cpu, same os, etc.
>
> In Linux 2.6.10-pa11 and 2.6.8.1-pa11 with a 32 bit kernel on a C160
> (PA-8000 cpu)
> * -march=1.1 produces code that performs approximately one percent
> faster than -march=2.0
> * -mschedule=7300 produces code that is approx. one percent faster than
> -mschedule=8000
> * The two above are additive - 1.1/7300 is two percent faster than 2.0/8000
> * Code compiled as 1.1/7300 and also run on a C160 with its kernel
> configured with the PA7300LC processor type (as opposed to configured as
> PA8000) enjoys another ~2% speed boost that is additive.
>
> So, to be completely clear, where the baseline is a C160, Linux
> 2.6.8.1-pa11 configured for PA8000 cpu and where the user binary is
> compiled with -march=2.0 and -mschedule=8000:
> - User binary compiled with -march=1.1 = +1% performance
> - User binary compiled with -mschedule=7300 = +1% performance
> - Kernel configured as PA7300LC = +2% performance
>
> > As far as PA 2.0 versus 1.1 goes, the main differences affecting 32-bit code
> > are some new branch instructions. There are also some new FP instructions
> > but these are somewhat compromised by linker bugs. In non floating-point
> > code, I would expect the PA 2.0 features to make their presence felt
> > in code with large functions.
>
> Is there anything in PA 2.0 that you would expect to cause poorer
> performance in any circumstance when compared to PA 1.1?
The only thing that comes to mind is that some PA 2.0 floating-point
instructions require assist from the kernel. It might be beneficial
to avoid these exceptions.
The numbers suggest that there may be issues with PA 8000 scheduling
and PA 2.0 code generation, but I think more testing is needed. It
would be very useful if the actual problem could be identified.
> > There have been a lot of optimization improvements in GCC since 3.3.
> > It would be useful to see how effective they are in real applications
> > and in benchmark performance. As far as the PA backend goes, there
> > haven't been any major performance improvements added since 3.3. The
> > changes mainly are bug fixes.
>
> So, what I'm hearing is that I might expect to see better code across
> the board with a post 3.3 compiler, but that there is unlikely to be a
> change in what I am seeing with 1.1 vs 2.0, 7300 vs 8000?
That's the current situation. There are certainly improvements
that could be made to the PA backend in GCC. Some issues are:
1) Long offsets in PA 2.0 floating-point loads and stores. This is
a 32-bit linker bug.
2) The PA has many integer conditions that aren't exploited. If these
could be exploited, we might save an insn in some loops. This might
yield a significant performance increase in some loops.
3) We need to look at improving the load sequence for self data when
generating PIC code. Currently, I believe that we always load data
through the DLT. This requires three instructions, with two being
memory accesses. For self data, this can be shortened to two insns
and one memory access. When self data can be accessed using a 16-bit
offset, data can be accessed in one memory access. So, it might be
useful to have a code generation option for tiny data.
4) There have been reports that utilizing the left/right addressability
of the floating-point registers in 32-bit code causes unnecessary
processor stalls.
> > 64-bit code isn't going to make your apps run faster. There is more
> > overhead in data accesses in 64-bit code (i.e., they go through the DLT)
> > than in 32-bit apps. Also, a lot more sign extensions are needed. In
> > terms of a GCC build, the difference is about 15-20%. The 64-bit tools
> > are less mature. So generally, you only want to use 64-bit apps when
> > they can benefit from the larger address space.
>
> I'm not strong on the PA architecture - I assumed on a 64 bit machine
> that the data bus would be 64 bits wide. Thus, I would have thought
> that 64-bit compiled apps on such a machine would run at least as fast
> as 32 bit ones, and be superior in some areas such as when they had to
> deal with 64 bit values like file offsets. If less bit-width is better,
> shouldn't we all be going back to 6502? :)
That's true. However, there's only a benefit when an application uses
a lot of 64-bit data, and/or benefits from the 64-bit address space.
Integer applications still typically use a lot of 32-bit signed integers
for saving space and portability. When the PA does a 32-bit load in
wide mode, the value is zero extended. However, in many cases the value
needs to be sign extended. Thus, in many cases you pay a penalty when
64-bit registers are needed.
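To make the zero- versus sign-extension point concrete, here is a minimal Python model of a 32-bit load into a 64-bit register. It is only an illustration, not PA-RISC code: the helper names (load_word_zero_extended, sign_extend_32_to_64, as_signed64) are made up for this sketch, and the extra function call stands in for the extra instruction a compiler would have to emit.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def load_word_zero_extended(raw32: int) -> int:
    """Model a 32-bit load in wide mode: the upper 32 bits come back as zero."""
    return raw32 & MASK32


def sign_extend_32_to_64(reg64: int) -> int:
    """Model the extra step: copy bit 31 into bits 32..63."""
    low = reg64 & MASK32
    if low & 0x80000000:
        return low | 0xFFFFFFFF00000000
    return low


def as_signed64(reg64: int) -> int:
    """Interpret a 64-bit register value as a two's-complement integer."""
    reg64 &= MASK64
    return reg64 - (1 << 64) if reg64 & (1 << 63) else reg64


# A 32-bit signed -5 as it sits in memory (0xFFFFFFFB).
raw = (-5) & MASK32

zero_extended = load_word_zero_extended(raw)
fixed_up = sign_extend_32_to_64(zero_extended)

print(as_signed64(zero_extended))  # wrong value if used directly as a 64-bit int
print(as_signed64(fixed_up))       # correct only after the additional step
```

Non-negative values come out the same either way; it is only negative 32-bit values that need the fix-up, but the compiler generally cannot know the sign in advance and so has to pay for the extension whenever a full 64-bit signed value is required.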
> In any case, if what I'm seeing isn't due to 32 vs 64 bit, then I am
> completely baffled by what I'm seeing. Why would a C160 running at a
> 20% faster clock speed, having eight times the cache, and a design that
> should be superior in every sense run slower than B132L? I'm not trying
> to blame my machine's poor performance on anyone, but I must admit that
> having read as much as I can on the C160 vs B132L that I can find no
> explanation in the hardware.
I really don't know. You should be able to run PA 1.1 code for the
B132L on the C160. If the same code runs slower on the C160, I would
first suspect that it's due to a difference in configuration (disks,
graphics, etc).
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-16 9:23 [parisc-linux] B132L outperforms C160 - 64-bit userland needed? Kurt Fitzner
2005-08-16 13:00 ` Michael S. Zick
@ 2005-08-17 6:19 ` Grant Grundler
2005-08-17 18:42 ` Kurt Fitzner
2005-08-17 20:38 ` Carlos O'Donell
1 sibling, 2 replies; 21+ messages in thread
From: Grant Grundler @ 2005-08-17 6:19 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
On Tue, Aug 16, 2005 at 03:23:28AM -0600, Kurt Fitzner wrote:
> To my surprise, when it came to integer operations, the
> B132L outperforms the C160!
Thanks for posting the results...looks like fun!
....
> If you look at the individual results, in most areas the C160 performs
> about 20% better than the B132. It's just that in a few areas, the C160
> has absolutely dismal performance. Numeric sorting and the assignment
> algorithm were both notably slower on the C160.
Could these workloads be thrashing memory or cache?
Anyone know if the C160 has slower memory latency since the Memory
controller is on the MMU and not directly attached to the runway bus?
> With a clock speed 20% faster, I must admit that the C160's poor showing
> was a disappointment. I'm wondering if this is because there isn't a
> 64-bit userland yet. Is stepping down to 32-bit on the C160 hurting its
> performance that badly?
It really depends on what the tests are doing.
Can you characterize the tests you care about better?
You might also investigate why the NEURAL NET test is so much
faster on the C160: 1.6905 (C160) vs 1.1334 (B132L).
That's much more than the clock speed difference.
I suspect it's a cache-friendly algorithm.
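As a quick check of that comparison, the two ratios can be worked out directly. This is just arithmetic on the numbers quoted above; the 160 MHz and 132 MHz clock rates are an assumption taken from the model names, and the speedup helper is a made-up name for this sketch.

```python
def speedup(new: float, old: float) -> float:
    """Return how many times faster `new` is than `old`."""
    return new / old


# Clock rates assumed from the model names (C160 = 160 MHz, B132L = 132 MHz).
clock_ratio = speedup(160.0, 132.0)

# NEURAL NET results quoted in the message above.
neural_net_ratio = speedup(1.6905, 1.1334)

print(f"clock ratio:      {clock_ratio:.2f}x")
print(f"NEURAL NET ratio: {neural_net_ratio:.2f}x")
```

The clock accounts for roughly a 1.2x gain, while the NEURAL NET result improves by roughly 1.5x, so something other than clock speed (cache size being the obvious candidate) is contributing.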
> I suppose (assuming I'm correct about the reason for the performance
> drop) my options are to wait for 64-bit userland or to put HPUX on it.
HPUX has a very good parisc compiler. It's possible recent gcc is
approaching the acc performance in most cases, but *a lot* of tuning
went into acc and PBO (Profile Based Optimization) is still the
easiest way to get nearly optimal performance for any program.
If performance on a C160 is that important, my advice is to dump
both boxes and buy any Pentium M (1.x GHz) laptop :^). But that would
be boring... :^P
> Is there any way which someone can help the 64-bit userland effort who
> is quite strong in system-level programming in general though weak in
> Linux kernel programming specifically? Is there a project web site for
> this effort?
Just this mailing list. Carlos O'Donell is occasionally hacking at it.
Sounds like you might be able to help him.
thanks again,
grant
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 6:19 ` Grant Grundler
@ 2005-08-17 18:42 ` Kurt Fitzner
2005-08-17 18:56 ` Kyle McMartin
2005-08-17 19:40 ` Andrew Sharp
2005-08-17 20:38 ` Carlos O'Donell
1 sibling, 2 replies; 21+ messages in thread
From: Kurt Fitzner @ 2005-08-17 18:42 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
Grant Grundler wrote:
> If performance on a C160 is that important, my advice is to dump
> both boxes and buy any Pentium M (1.x GHz) laptop :^). But that would
> be boring... :^P
Boring indeed. :)
Actually, my thought is that my little C160 may be simply shedding light
on a problem that's more global. The C160 is the slowest PA2.0 machine
made. It's one of the few machines where a PA2.0 CPU is in the same
speed range as PA1.1 CPUs.
What I'm wondering is this: if a C160 is performing at the level of a
B132L, then could perhaps ALL PA2.0 machines be underperforming? It
could be just that my C160 is one of the few machines that was slow
enough anyway that it raised a red flag. If that's the case and
something minor-ish is making everyone's PA2.0 machines underperform, it
may be something worth looking into.
Now, A doesn't necessarily HAVE to follow B here, but I don't think it's a
totally unreasonable suspicion. I may be chasing my tail, but at least
it's something to start with to help me learn the ropes.
Kurt.
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 18:42 ` Kurt Fitzner
@ 2005-08-17 18:56 ` Kyle McMartin
2005-08-17 19:40 ` Andrew Sharp
1 sibling, 0 replies; 21+ messages in thread
From: Kyle McMartin @ 2005-08-17 18:56 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
On Wed, Aug 17, 2005 at 12:42:06PM -0600, Kurt Fitzner wrote:
> Now, A doesn't necessarily HAVE to follow B here, but I don't think it's a
> totally unreasonable suspicion. I may be chasing my tail, but at least
> it's something to start with to help me learn the ropes.
>
Well, seeing as I doubt we'll ever see any PA2.0 CPU ERS (or even 7200?)
then I don't really know what you want to change.
Cheers,
--
Kyle McMartin
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 18:42 ` Kurt Fitzner
2005-08-17 18:56 ` Kyle McMartin
@ 2005-08-17 19:40 ` Andrew Sharp
2005-08-18 5:27 ` Kurt Fitzner
1 sibling, 1 reply; 21+ messages in thread
From: Andrew Sharp @ 2005-08-17 19:40 UTC (permalink / raw)
To: parisc-linux
On Wed, Aug 17, 2005 at 12:42:06PM -0600, Kurt Fitzner wrote:
> Grant Grundler wrote:
>
> > If performance on a C160 is that important, my advice is to dump
> > both boxes and buy any Pentium M (1.x GHz) laptop :^). But that would
> > be boring... :^P
>
> Boring indeed. :)
>
> Actually, my thought is that my little C160 may be simply shedding light
> on a problem that's more global. The C160 is the slowest PA2.0 machine
> made. It's one of the few machines where a PA2.0 CPU is in the same
> speed range as PA1.1 CPUs.
It seems like you're answering your own question here. I'm an
hppa-ignoramus, but the thing that sticks out the most of all this is that
the machines aren't that different in clock rate, but are that different
in cache size. So stop ignoring that difference. Cache thrashing is
a real, and sad, international problem. Take those benchmarks that
produce slower results on a P4/2MB cache processor than they do on a
P4/512KB processor. Same motherboard, OS, benchmark. Only difference,
4x bigger cache. If the processor is spending more time loading cache
lines, it's spending less time computing. Cache systems are designed to
improve the performance of general computing tasks, and many benchmarks
fall outside that realm. What's interesting is that some processors
have caches so big that these benchmarks do still fit inside them (8MB,
16MB, etc) and those processors look really good on those benchmarks
(pucker up your lips and say "SPEC"), compared to cheaper, but really
much faster, processors.
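The "fits in cache or doesn't" effect described above can be sketched with a toy direct-mapped cache model. This is a deliberately simplified illustration, not a model of any PA-RISC or Pentium cache: the miss_rate function, the line counts, and the repeated linear sweep are all assumptions chosen to make the cliff visible.

```python
def miss_rate(working_set_lines: int, cache_lines: int, passes: int = 10) -> float:
    """Sweep a working set repeatedly through a toy direct-mapped cache.

    Each memory line maps to slot (line % cache_lines). Returns the
    fraction of accesses that had to load a line into the cache.
    """
    tags = [None] * cache_lines
    misses = 0
    accesses = 0
    for _ in range(passes):
        for line in range(working_set_lines):
            slot = line % cache_lines
            if tags[slot] != line:
                misses += 1          # line not present: load it, evicting the old one
                tags[slot] = line
            accesses += 1
    return misses / accesses


# Hypothetical sizes: a 128-line cache and three working sets.
for lines in (64, 128, 256):
    print(lines, miss_rate(lines, cache_lines=128))
```

As long as the working set fits, only the first pass misses. Once it is twice the cache size, every slot is shared by two lines that keep evicting each other, and in this model every single access misses. A benchmark whose data lands on the wrong side of that boundary will say more about cache size than about the processor.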
And as for the gcc compiler switch differences, it seems that maybe
the 3.3.5 or whatever just isn't that great when it comes to 2.0/8000
optimizations. Not really a shocker, but good to know, as it makes the
case for taking a similar look at 4.0 and 4.1 performance. Mayhaps there
is still time to put some changes into gcc 4.x before a final version is
nailed down for etch or ?
I must be getting bored by pentiums. What's an old, dual
processor parisc/2.0 machine go for on ebay these days? ~:^)
a
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 19:40 ` Andrew Sharp
@ 2005-08-18 5:27 ` Kurt Fitzner
2005-08-18 7:17 ` Grant Grundler
2005-08-20 6:21 ` Grant Grundler
0 siblings, 2 replies; 21+ messages in thread
From: Kurt Fitzner @ 2005-08-18 5:27 UTC (permalink / raw)
To: parisc-linux
Andrew Sharp wrote:
> So stop ignoring that difference. Cache thrashing is
> a real, and sad, international problem. Take those benchmarks that
> produce slower results on a P4/2MB cache processor than they do on a
> P4/512KB processor. Same motherboard, OS, benchmark. Only difference,
> 4x bigger cache. If the processor is spending more time loading cache
> lines, it's spending less time computing. Cache systems are designed to
> improve the performance of general computing tasks, and many benchmarks
> fall outside that realm.
Ok, I bit the bullet. I took an image of the hard drive to restore
later, and installed HPUX. I then obtained gcc 3.4.3 from the HPUX
software porting archive and recompiled the benchmark with that. Raw
data is at the end.
Next off, for nbench running on a C160/HPUX I see a memory throughput
increase of 2.67X, integer calc increase of 1.4X, and floating point
increase of 2.24X when compared to the C160/Linux.
The data show exactly the sort of performance increase that I originally
expected to see when comparing a B132L and C160. Strongly improved
integer and vastly improved memory performance.
The performance issue on my C160/Linux was not due to cache line
loading/thrashing. It's not due to slower memory, nor due to some odd
combination of C160 architecture changes and an old benchmark
program. I am convinced that there is some problem in Linux - probably
some otherwise minor thing. Whatever it is, it's sucking performance
from PA8000 systems - and perhaps other PA8x00 ones.
What really locked it in for me that it's an OS issue, aside from the
dramatic results below, is what happened when I changed optimization
switches in HPUX. I see the exact same 2% drop in performance when I go
from binaries compiled as 1.1/7300 to 2.0/8000 that I saw in Linux.
Same compiler on both OSes producing code that reacts exactly the same
way when optimizations are changed.
I'm convinced - how can I convince the group? I'm quite willing to give
ssh access to my machines for the results to be verified. I should be
able to swap between HPUX/Linux fairly quickly now that I have images of
both.
Kurt.
p.s. I installed HPUX in 64-bit mode. Which, so I'm told, should have
decreased my performance. I'd like to get my C160 running Linux in
64-bit mode, but I can't. :(
------------------------------------------------------------------
TEST                : Iterations/sec.  : Old Index   : New Index
                    :                  : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT        :           99.68  :       2.56  :       0.84
STRING SORT         :          14.104  :       6.30  :       0.98
BITFIELD            :      2.1599e+07  :       3.70  :       0.77
FP EMULATION        :          8.8339  :       4.24  :       0.98
FOURIER             :          2660.2  :       3.03  :       1.70
ASSIGNMENT          :          1.8832  :       7.17  :       1.86
IDEA                :          116.73  :       1.79  :       0.53
HUFFMAN             :           121.4  :       3.37  :       1.08
NEURAL NET          :            2.81  :       4.51  :       1.90
LU DECOMPOSITION    :           116.6  :       6.04  :       4.36
==================================================================
OS                  : HP-UX B.11.11
C compiler          : gcc version 3.4.3
MEMORY INDEX        : 1.120
INTEGER INDEX       : 0.827
FLOATING-POINT INDEX: 2.414
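
For readers without the earlier Linux figures to hand, the speedups quoted in this message can be turned back into approximate C160/Linux indices. This is only a back-calculation from the HP-UX indices above and the stated 2.67X / 1.4X / 2.24X ratios, so the results are rounded approximations rather than measured values; the variable names are made up for this sketch.

```python
# HP-UX indices from the summary above.
hpux_index = {"memory": 1.120, "integer": 0.827, "floating-point": 2.414}

# HP-UX over Linux speedups as stated in the message.
stated_speedup = {"memory": 2.67, "integer": 1.4, "floating-point": 2.24}

# Implied Linux index = HP-UX index / speedup (approximate, since the
# stated ratios are themselves rounded).
implied_linux_index = {
    name: hpux_index[name] / stated_speedup[name] for name in hpux_index
}

for name, value in implied_linux_index.items():
    print(f"{name:15s} HP-UX {hpux_index[name]:.3f}  implied Linux ~{value:.3f}")
```

The memory index shows by far the largest gap, which fits the suspicion that the difference lies in how the OS handles memory and cache rather than in the generated code.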
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-18 5:27 ` Kurt Fitzner
@ 2005-08-18 7:17 ` Grant Grundler
2005-08-20 6:21 ` Grant Grundler
1 sibling, 0 replies; 21+ messages in thread
From: Grant Grundler @ 2005-08-18 7:17 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
On Wed, Aug 17, 2005 at 11:27:24PM -0600, Kurt Fitzner wrote:
> p.s. I installed HPUX in 64-bit mode. Which, so I'm told, should have
> decreased my performance. I'd like to get my C160 running Linux in
> 64-bit mode, but I can't. :(
Sure you can. Recent 64-bit kernels will boot on the C160.
It's just not an obvious win in most cases.
I need to think about your other comments.
parisc-linux is less than optimal for cache behavior,
especially kernels that predate the 2.6.10-ish timeframe, when
James Bottomley fixed up some of the worst offenses.
But if it's an OS problem (it could be), running lmbench
on both HPUX and parisc-linux might shed light on where
the issue is.
thanks for posting the HPUX NBench results.
grant
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-18 5:27 ` Kurt Fitzner
2005-08-18 7:17 ` Grant Grundler
@ 2005-08-20 6:21 ` Grant Grundler
1 sibling, 0 replies; 21+ messages in thread
From: Grant Grundler @ 2005-08-20 6:21 UTC (permalink / raw)
To: Kurt Fitzner; +Cc: parisc-linux
On Wed, Aug 17, 2005 at 11:27:24PM -0600, Kurt Fitzner wrote:
...
> Next off, for nbench running on a C160/HPUX I see a memory throughput
> increase of 2.67X, integer calc increase of 1.4X, and floating point
> increase of 2.24X when compared to the C160/Linux.
...
> The performance issue on my C160/Linux was not due to cache line
> loading/thrashing.
...
> What really locked it in for me that it's an OS issue,
...
It clearly is an OS issue. And thinking about it more,
I'm convinced this benchmark is exposing a cache utilization
(or lack thereof) issue. HPUX is very efficient at
NOT flushing the cache. Linux flushes a lot more.
thanks,
grant
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-17 6:19 ` Grant Grundler
2005-08-17 18:42 ` Kurt Fitzner
@ 2005-08-17 20:38 ` Carlos O'Donell
1 sibling, 0 replies; 21+ messages in thread
From: Carlos O'Donell @ 2005-08-17 20:38 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
On Wed, Aug 17, 2005 at 12:19:13AM -0600, Grant Grundler wrote:
> > Is there any way which someone can help the 64-bit userland effort who
> > is quite strong in system-level programming in general though weak in
> > Linux kernel programming specifically? Is there a project web site for
> > this effort?
>
> Just this mailing list. Carlos O'Donell is occasionally hacking at it.
> Sounds like you might be able to help him.
http://wiki.parisc-linux.org/userspace64
c.
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
@ 2005-08-18 8:27 Joel Soete
2005-08-20 6:26 ` Grant Grundler
0 siblings, 1 reply; 21+ messages in thread
From: Joel Soete @ 2005-08-18 8:27 UTC (permalink / raw)
To: grundler; +Cc: parisc-linux, tsg45800
> On Wed, Aug 17, 2005 at 11:27:24PM -0600, Kurt Fitzner wrote:
> > p.s. I installed HPUX in 64-bit mode. Which, so I'm told, should have
> > decreased my performance. I'd like to get my C160 running Linux in
> > 64-bit mode, but I can't. :(
>
> Sure you can. recent 64-bit kernels will boot on C160.
> It's just not an obvious win in most cases.
>
Hmm, does this system need the ccio-dma and zalon:ncr53c720 drivers, as my
d380 does?
[...]
>
> But if it's an OS problem (it could be), running lmbench
> on both HPUX and parisc-linux might shed light on where
> the issue is.
>
On my d380 those two drivers don't seem to be 64-bit ready :_( (2.6.12-pa2).
(I hope to find simple instrumentation tools like IKD (though it is not
maintained for 2.6) or LTT to help me figure out whether some race
condition, as I suspect, occurs or not.)
Thanks,
Joel
* Re: [parisc-linux] B132L outperforms C160 - 64-bit userland needed?
2005-08-18 8:27 Joel Soete
@ 2005-08-20 6:26 ` Grant Grundler
[not found] ` <430778F2.8020406@tiscali.be>
0 siblings, 1 reply; 21+ messages in thread
From: Grant Grundler @ 2005-08-20 6:26 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux, tsg45800
On Thu, Aug 18, 2005 at 10:27:37AM +0200, Joel Soete wrote:
> Hmm, does this system need the ccio-dma and zalon:ncr53c720 drivers,
> as my d380 does?
The C160 will need both ccio-dma and zalon.
We've run ccio-dma in 64-bit in the past.
I'm not as sure about the zalon driver.
> On my d380 those two drivers don't seem to be 64-bit
> ready :_( (2.6.12-pa2)
Doesn't compile or doesn't boot?
> (I hope to find simple instrumentation tools like IKD
> (though it is not maintained for 2.6) or LTT to help me
> figure out whether some race condition, as I suspect,
> occurs or not.)
There are plenty of things that don't work for palinux.
The parisc port is just doing "ok" - not excellent.
grant