public inbox for linux-kernel@vger.kernel.org
* HPL benchmarking linux kernel for shared memory performance.
@ 2008-03-16  5:38 Allan Menezes
  2008-03-16 10:42 ` Marcin Slusarz
  2008-03-16 18:13 ` Roger Heflin
  0 siblings, 2 replies; 5+ messages in thread
From: Allan Menezes @ 2008-03-16  5:38 UTC
  To: linux-kernel

Hi,
    I have narrowed down the kernel version at which this begins to
happen. I am benchmarking a single node with HPL, Open MPI 1.3 beta and
GotoBLAS v1.24 for personal, noncommercial reasons.
I ran a single node with the command $ mpirun -np 1 ./xhpl
With kernel 2.6.23.14 I get over 38 Gflops,
but with kernel 2.6.23.15, compiled with the same .config, I get 7.xx
Gflops, which is about 1/5th of what the other kernel gives.
Keep in mind that only the kernel (from kernel.org) has changed, not the
hardware or anything else; everything else is the same! Yet the
performance drops to 1/5th.
There is no network involved in this single-node, quad-core Intel test,
just shared memory.
So the shared memory or SMP performance of the newer kernels is far, far
worse than that of kernels up to 2.6.23.14!
Even with 2.6.25-rc5 the performance is degraded!
Can you please help me find out why this is occurring? Please advise!
My setup: quad-core Intel Q6600 overclocked stably to 2.88 GHz, 6 GB of
dual-channel DDR2-800 RAM.
Allan Menezes

* HPL benchmarking linux kernel for shared memory performance.
@ 2008-03-22  4:05 Allan Menezes
  2008-03-22 10:04 ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Allan Menezes @ 2008-03-22  4:05 UTC
  To: linux-kernel

Hi,
   Sorry to have bothered you all, but there is no degradation of
performance with HPL on kernels up to 2.6.25-rc6. It was my silly error!
The motherboards I use are two Asus P5E-VM DO and three Asus P5K-VM with
Intel Q6600 quad-core CPUs (G0 stepping). Overclocking them slightly, I
now get 165 GFlops for my 5-node quad-core cluster
with GotoBLAS v1.24, HPL and the open source Open MPI v1.3 alpha.
My error was that CPUID LIMIT was enabled in the BIOS. This option is
meant for older Windows systems and prevents the cpuid instruction from
filling register EAX with values above 0x03 at boot.
So cat /proc/cpuinfo showed only 2 cores for a quad core on all nodes,
which gave me lousy performance on every kernel except some like
2.6.23.13 and 2.6.23.14, since I was oversubscribing each node with 4
processes; top -H confirmed that.
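
The oversubscription described above can be caught before a run by
comparing the number of processors the kernel reports with the number of
processes started per node. The following is only a minimal sketch: the
function names are made up for illustration, and it assumes the usual
/proc/cpuinfo layout with one "processor : N" line per logical CPU.

```python
def count_logical_cpus(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def is_oversubscribed(cpuinfo_text, procs_per_node):
    """True when more processes are started than CPUs the kernel sees."""
    return procs_per_node > count_logical_cpus(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a node, is_oversubscribed(read_cpuinfo(), 4) would have returned True
in the situation described here, where only 2 cores were visible and 4
processes were started.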
I have now disabled CPUID LIMIT on each node, and with kernel 2.6.25-rc6
cat /proc/cpuinfo shows all 4 processors on each node and I get
38.44 - 38.9 Gflops per node. Of course, I am stably overclocking each
Q6600 by approx. 480 MHz to get that; otherwise, for the Intel
Kentsfield Q6600 at 2.4 GHz the peak is 38.4 GFlops (reference: paper by
Dr. Jack Dongarra et al.).
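
The peak figure above follows from cores x clock x floating-point
operations per cycle. The sketch below is only an illustration: the value
of 4 double-precision operations per cycle per core is an assumption
implied by the 38.4 GFlops figure at 2.4 GHz, and 2.88 GHz is the
overclocked speed (2.4 GHz plus approx. 480 MHz).

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle=4):
    """Theoretical peak in GFlops; flops_per_cycle=4 is an assumption."""
    return cores * clock_ghz * flops_per_cycle


def efficiency(measured_gflops, peak):
    """Fraction of the theoretical peak actually reached."""
    return measured_gflops / peak


stock_peak = peak_gflops(4, 2.4)    # stock Q6600, about 38.4 GFlops
oc_peak = peak_gflops(4, 2.88)      # overclocked node, about 46.08 GFlops

node_low = efficiency(38.44, oc_peak)     # lower measured per-node figure
node_high = efficiency(38.9, oc_peak)     # upper measured per-node figure
cluster = efficiency(165.0, 5 * oc_peak)  # 5-node cluster result

print(round(stock_peak, 2), round(oc_peak, 2))
print(round(node_low, 3), round(node_high, 3), round(cluster, 3))
```

Under these assumptions the measured per-node numbers are a bit over 80%
of the overclocked peak, and the 5-node result is somewhat lower, as one
would expect once the network is involved.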
The BIOS message is misleading: it says "Disable for Windows XP", and
since I am running Linux I had it enabled until I Googled CPUID LIMIT,
found out what it does and experimented!
So, for your information, disable CPUID LIMIT in the BIOS when running
Linux on quad-core or larger systems! It is meant for old OSes like
Win98.
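
One possible way to spot this setting from a running system is the
"cpuid level" field that x86 kernels print in /proc/cpuinfo. The sketch
below is hedged: the helper names are hypothetical, and the threshold of
3 is taken from the 0x03 value mentioned above, so treat a hit as a hint
to check the BIOS rather than as proof.

```python
def cpuid_levels(cpuinfo_text):
    """Collect the 'cpuid level' value reported for each processor."""
    levels = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpuid level":
            levels.append(int(value.strip()))
    return levels


def cpuid_limit_suspected(cpuinfo_text, threshold=3):
    """True if any processor reports a cpuid level at or below threshold.

    threshold=3 is an assumption based on the 0x03 value in this thread.
    """
    levels = cpuid_levels(cpuinfo_text)
    return bool(levels) and min(levels) <= threshold
```

Passing the contents of /proc/cpuinfo to cpuid_limit_suspected() on each
node gives a quick check before starting a benchmark run.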
Thank you, and sorry for the bother; it was my error and not a fault
in the kernels. This is FYI only.
Cheers,
Allan Menezes

