* Disturbing wide variation in execution time
@ 2005-07-07  6:44 Sheo Shanker Prasad
  2005-07-07  7:10 ` David S. Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Sheo Shanker Prasad @ 2005-07-07  6:44 UTC (permalink / raw)
  To: linux-kernel

I will appreciate your help in eliminating a disturbing wide variation (by a 
factor of 2 to 2.5) in the execution time of a test (execution benchmark) 
program under identical conditions even when the machine is freshly started 
(rebooted) and no other user program is running (not even e-mail or Internet 
browser).

I have a dual Opteron 250 (2.4 GHz) running SuSE 9.3 Pro & Linux version 
2.6.11.4-21.7-smp (geeko@buildhost) (gcc version 3.3.5 20050117 (prerelease) 
(SUSE Linux)) #1 SMP Thu Jun 2 14:23:14 UTC 2005. The motherboard is Tyan 
Thunder K8W (S2885 ANRF) with AMI BIOS.

The machine has 4GB of PC3200 DDR RAM, two DIMMs on each CPU.

The original machine was bought from a vendor about 6 months ago. At that time it 
was running SuSE 9.1 Pro and the execution time for the same test program was 
consistently the same (around 2m 37s +/- a few %). Then the motherboard 
failed and the machine went totally dead. The vendor then replaced the failed 
motherboard with a new Tyan Thunder K8W and installed SuSE 9.3. I am not 
sure whether or not the AMI BIOS was also replaced.

When the repaired machine was started, I began to notice the disturbing wide 
variation and the frequent significant slowdown of the machine, as exhibited 
by the factor of 2 to 2.5 increase in the execution time of the test program 
described above.  Sometimes it would be quite fast (executing in the original 
2m 40s), sometimes a factor of 2.5 slower, and sometimes in between.

I have already done the following tests. I have tested the memory using both 
memtest86+ version 1.6 and memtest86-3.2. In both tests, run over 3 cycles, NO 
memory error was reported. I also ran the Linux version of the BYTE benchmark 
for the memory, floating point and integer indices. The results matched those 
reported by others for their Opteron 250.

Nevertheless, I have this wide and random variation in the execution time of 
the given program under identical conditions. Guided by the comments I 
received from the suse-amd64 user mailing list and the advice posted on LKML.ORG 
(this list), I tried booting with the option "mem=3000M" (significantly less 
than 4000M). That does not help either.

I am now perplexed as to why the machine is behaving with such unpredictable 
speeds, varying by factors of 2 to 2.5. What could be the cause and how can 
I get rid of it and make the machine reliable and efficient? (Also, when I 
boot with mem=3000M, does that mean that the remaining memory is wasted? 
What is the significance of putting that limit on the memory?)

Your help will be greatly appreciated.

Best regards.

Sheo
(Sheo S. Prasad)
Creative Research Enterprises
6354 Camino del Lago
Pleasanton, CA 94566, USA
Voice Phone: (+1) 925 426-9341
Fax   Phone: (+1) 925 426-9417
e-mail: ssp@CreativeResearch.org



* Re: Disturbing wide variation in execution time
  2005-07-07  6:44 Disturbing wide variation in execution time Sheo Shanker Prasad
@ 2005-07-07  7:10 ` David S. Miller
  2005-07-07  8:03   ` Sheo Shanker Prasad
  2005-07-07 18:46 ` Philippe Troin
  2005-07-08  0:41 ` michael
  2 siblings, 1 reply; 7+ messages in thread
From: David S. Miller @ 2005-07-07  7:10 UTC (permalink / raw)
  To: ssp; +Cc: linux-kernel

From: Sheo Shanker Prasad <ssp@creativeresearch.org>
Date: Wed, 6 Jul 2005 23:44:53 -0700

> I will appreciate your help in eliminating a disturbing wide variation (by a 
> factor of 2 to 2.5) in the execution time of a test (execution benchmark) 
> program under identical conditions even when the machine is freshly started 
> (rebooted) and no other user program is running (not even e-mail or Internet 
> browser).

You haven't told us exactly what your test program does.


* Re: Disturbing wide variation in execution time
  2005-07-07  7:10 ` David S. Miller
@ 2005-07-07  8:03   ` Sheo Shanker Prasad
  2005-07-07  9:53     ` Eric Piel
  0 siblings, 1 reply; 7+ messages in thread
From: Sheo Shanker Prasad @ 2005-07-07  8:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

Dear David,

The program is an atmospheric chemistry-transport modeling code that computes 
the distributions of atmospheric species (e.g., ozone) as a function of 
latitude and altitude and how that changes with time.

Thanks for taking time to think about my problem. I greatly appreciate it.

I hope to hear from you soon.

Regards.

Sheo


On Thursday July 7 2005 12:10 am, you wrote:
> From: Sheo Shanker Prasad <ssp@creativeresearch.org>
> Date: Wed, 6 Jul 2005 23:44:53 -0700
>
> > I will appreciate your help in eliminating a disturbing wide variation
> > (by a factor of 2 to 2.5) in the execution time of a test (execution
> > benchmark) program under identical conditions even when the machine is
> > freshly started (rebooted) and no other user program is running (not even
> > e-mail or Internet browser).
>
> You haven't told us exactly what your test program does.


* Re: Disturbing wide variation in execution time
  2005-07-07  8:03   ` Sheo Shanker Prasad
@ 2005-07-07  9:53     ` Eric Piel
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Piel @ 2005-07-07  9:53 UTC (permalink / raw)
  To: Sheo Shanker Prasad; +Cc: David S. Miller, linux-kernel

(please do not top-post)

07.07.2005 10:03, Sheo Shanker Prasad wrote:
> Dear David,
> 
> The program is an atmospheric chemistry-transport modeling code that computes 
> the distributions of atmospheric species (e.g., ozone) as a function of 
> latitude and altitude and how that changes with time.
Well, to let us examine your problem, you need to describe what your 
program does *technically*, i.e.: lots of disk access, lots of 
network access, lots of CPU use, allocating huge amounts of memory... In 
particular, is it CPU bound or I/O bound? You should also mention how 
many tasks/threads your program is composed of.

It's good to describe your hardware: cluster of NUMA computers or just 
one x86? How much memory, what kind of network, how many hard disks, 
which exact version of the kernel....

Cherry on the top, you could include a *small* program which exhibits the 
same problem.
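
A minimal sketch of how the CPU-bound versus I/O-bound question could be checked from the outside, assuming Python 3 on a Unix-like system. The helper names `time_runs` and `spread` and the stand-in command are illustrative only and do not come from this thread; substitute the real test program. If CPU time tracks wall time closely, the run is probably CPU bound; if CPU time is much smaller, the program is mostly waiting (on I/O, for example). For a multi-threaded program, CPU time can exceed wall time.

```python
import os
import subprocess
import sys
import time


def time_runs(cmd, repeats=5):
    """Run cmd `repeats` times; return a list of (wall_seconds, cpu_seconds)."""
    results = []
    for _ in range(repeats):
        before = os.times()
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        wall = time.perf_counter() - start
        after = os.times()
        # CPU time (user + system) consumed by child processes during this run.
        cpu = ((after.children_user - before.children_user)
               + (after.children_system - before.children_system))
        results.append((wall, cpu))
    return results


def spread(results):
    """Ratio of the slowest to the fastest wall-clock time."""
    walls = [wall for wall, _ in results]
    return max(walls) / min(walls)


if __name__ == "__main__":
    # Stand-in workload (hypothetical); replace with the real benchmark command.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    runs = time_runs(cmd)
    for wall, cpu in runs:
        print(f"wall {wall:.2f}s  cpu {cpu:.2f}s")
    print(f"slowest/fastest = {spread(runs):.2f}")
```

A slowest/fastest ratio near 1 would match the "roughly the same time" expectation; a ratio of 2 to 2.5 would reproduce the reported variation.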

> 
> Thanks for taking time to think about my problem. I greatly appreciate it.
> 
To answer your question briefly, _in general_ a program which takes more 
than a few seconds to execute should take roughly the same time to 
execute every time.


Eric


* Re: Disturbing wide variation in execution time
  2005-07-07  6:44 Disturbing wide variation in execution time Sheo Shanker Prasad
  2005-07-07  7:10 ` David S. Miller
@ 2005-07-07 18:46 ` Philippe Troin
  2005-07-08 21:51   ` Sheo Shanker Prasad
  2005-07-08  0:41 ` michael
  2 siblings, 1 reply; 7+ messages in thread
From: Philippe Troin @ 2005-07-07 18:46 UTC (permalink / raw)
  To: Sheo Shanker Prasad; +Cc: linux-kernel

Sheo Shanker Prasad <ssp@creativeresearch.org> writes:

> I will appreciate your help in eliminating a disturbing wide
> variation (by a factor of 2 to 2.5) in the execution time of a test
> (execution benchmark) program under identical conditions even when
> the machine is freshly started (rebooted) and no other user program
> is running (not even e-mail or Internet browser).
> 
> I have a dual Opteron 250 (2.4 GHz) running SuSE 9.3 Pro & Linux
> version 2.6.11.4-21.7-smp (geeko@buildhost) (gcc version 3.3.5
> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2 14:23:14 UTC
> 2005. The motherboard is Tyan Thunder K8W (S2885 ANRF) with AMI BIOS.
> 
> The machine has 4GB of PC3200 DDR RAM, two DIMMs on each CPU.
> 
> The original machine was bought from a vendor about 6 months ago. At
> that time it was running SuSE 9.1 Pro and the execution time for the
> same test program was consistently the same (around 2m 37s +/- a few
> %). Then the motherboard failed and the machine went totally
> dead. The vendor then replaced the failed motherboard with a new
> Tyan Thunder K8W and installed SuSE 9.3. I am not sure whether
> or not the AMI BIOS was also replaced.
> 
> When the repaired machine was started, I began to notice the
> disturbing wide variation and the frequent significant slowdown of
> the machine, as exhibited by the factor of 2 to 2.5 increase in the
> execution time of the test program described above.  Sometimes it
> would be quite fast (executing in the original 2m 40s), sometimes
> a factor of 2.5 slower, and sometimes in between.

8< snip >8

 1. Are you running an i386 kernel or an x86_64 kernel?

 2. Which BIOS version?

 3. Is node interleaving enabled in the BIOS?
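
A rough sketch of how the first and third questions could be checked from a running system, assuming Python 3 on Linux with sysfs mounted. The helper name `numa_nodes` is made up for illustration. It relies on the assumption that, with node interleaving enabled in the BIOS, the kernel is typically presented with a single memory node, so seeing more than one node directory suggests interleaving is off; this is a hint, not proof.

```python
import glob
import os
import platform
import re


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids exposed under sysfs (empty if none)."""
    ids = []
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        match = re.fullmatch(r"node(\d+)", os.path.basename(path))
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    # platform.machine() reports the architecture the kernel runs as,
    # e.g. "x86_64" for a 64-bit kernel or "i686" for a 32-bit one.
    print("kernel architecture:", platform.machine())
    print("NUMA nodes seen by the kernel:", numa_nodes())
```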

Phil.


* Re: Disturbing wide variation in execution time
  2005-07-07  6:44 Disturbing wide variation in execution time Sheo Shanker Prasad
  2005-07-07  7:10 ` David S. Miller
  2005-07-07 18:46 ` Philippe Troin
@ 2005-07-08  0:41 ` michael
  2 siblings, 0 replies; 7+ messages in thread
From: michael @ 2005-07-08  0:41 UTC (permalink / raw)
  To: Sheo Shanker Prasad; +Cc: linux-kernel

Sheo Shanker Prasad <ssp@creativeresearch.org> writes:
[...]
> When the repaired machine was started, I began to notice the disturbing wide 
> variation and the frequent significant slowdown of the machine, as exhibited 
> by the factor of 2 to 2.5 increase in the execution time of the test program 
> described above.  Sometimes it would be quite fast (executing in the original 
> 2m 40s), sometimes a factor of 2.5 slower, and sometimes in between.

Something stuffed on the CPU heatsink causing thermal speed throttling?
Just a wild guess.
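
One rough way to look for a reduced clock speed, sketched under assumptions: Python 3 on Linux, and a kernel that reports the current frequency in the "cpu MHz" lines of /proc/cpuinfo. Whether thermal throttling is actually reflected there depends on the hardware and driver, so treat a normal reading as inconclusive. The function name `cpu_mhz` is illustrative.

```python
import os


def cpu_mhz(cpuinfo_text):
    """Extract the per-CPU 'cpu MHz' values from /proc/cpuinfo-style text."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    return speeds


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as handle:
            print("cpu MHz per processor:", cpu_mhz(handle.read()))
    else:
        print(path, "not available on this system")
```

Sampling this while the test program runs, in both a fast and a slow run, would show whether the reported clock differs between them.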

Michael.


* Re: Disturbing wide variation in execution time
  2005-07-07 18:46 ` Philippe Troin
@ 2005-07-08 21:51   ` Sheo Shanker Prasad
  0 siblings, 0 replies; 7+ messages in thread
From: Sheo Shanker Prasad @ 2005-07-08 21:51 UTC (permalink / raw)
  To: Philippe Troin; +Cc: linux-kernel

On Thursday July 7 2005 11:46 am, Philippe Troin wrote:
> Sheo Shanker Prasad <ssp@creativeresearch.org> writes:
> > I will appreciate your help in eliminating a disturbing wide
> > variation (by a factor of 2 to 2.5) in the execution time of a test
> > (execution benchmark) program under identical conditions even when
> > the machine is freshly started (rebooted) and no other user program
> > is running (not even e-mail or Internet browser).
> >
> > I have a dual Opteron 250 (2.4 GHz) running SuSE 9.3 Pro & Linux
> > version 2.6.11.4-21.7-smp (geeko@buildhost) (gcc version 3.3.5
> > 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2 14:23:14 UTC
> > 2005. The motherboard is Tyan Thunder K8W (S2885 ANRF) with AMI BIOS.
> >
> > The machine has 4GB of PC3200 DDR RAM, two DIMMs on each CPU.
> >
> > The original machine was bought from a vendor about 6 months ago. At
> > that time it was running SuSE 9.1 Pro and the execution time for the
> > same test program was consistently the same (around 2m 37s +/- a few
> > %). Then the motherboard failed and the machine went totally
> > dead. The vendor then replaced the failed motherboard with a new
> > Tyan Thunder K8W and installed SuSE 9.3. I am not sure whether
> > or not the AMI BIOS was also replaced.
> >
> > When the repaired machine was started, I began to notice the
> > disturbing wide variation and the frequent significant slowdown of
> > the machine, as exhibited by the factor of 2 to 2.5 increase in the
> > execution time of the test program described above.  Sometimes it
> > would be quite fast (executing in the original 2m 40s), sometimes
> > a factor of 2.5 slower, and sometimes in between.
>
> 8< snip >8

Thanks very much for taking the time to think about my problem. Here are 
the answers to your questions.
>
>  1. Are you running an i386 kernel or an x86_64 kernel?

I think I am running an x86_64 kernel.  I think so because I had asked the 
vendor of the machine to install x86_64 and because the file

System.map-2.6.11.4-21.7-smp 

in the /boot directory has an entry: ffffffff804f0000 T x86_64_start_kernel

and that directory also contains the gzipped file:

 symvers-2.6.11.4-21.7-x86_64-smp.gz

The operating system is Linux version 2.6.11.4-21.7-smp (geeko@buildhost)  
(gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2 
14:23:14 UTC 2005
>
>  2. Which BIOS version?

The BIOS is AMIBIOS version 08.00.10 with the build date of 02/11/05 
09:44:04 and the ID 0AAAA001.

>
>  3. Is node interleaving enabled in the BIOS?

When I go through the BIOS setup, I do not see any option to turn node 
interleaving ON or OFF. However, I think that the two CPUs (as node0 and 
node1) are made NUMA aware by default, but I could be quite wrong. 

Since I am not sure, here are the contents of 

 /sys/devices/system/node/node0/numastat:

numa_hit 3620274
numa_miss 0
numa_foreign 0
interleave_hit 21903
local_node 3610298
other_node 9976

Similarly, the following are the contents of

  /sys/devices/system/node/node1/numastat

numa_hit 3089426
numa_miss 0
numa_foreign 0
interleave_hit 38355
local_node 3072605
other_node 16821
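
To put these counters in proportion, here is a small sketch (Python 3) that parses the two listings above and computes what fraction of each node's page allocations were made on behalf of processes running on the other node. The names `parse_numastat` and `remote_fraction` are illustrative. Note the assumption in reading the result: these counters describe where pages were allocated, not how often memory is later accessed across nodes, so a small fraction does not by itself rule out a NUMA placement problem.

```python
def parse_numastat(text):
    """Parse 'name value' lines from a numastat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def remote_fraction(stats):
    """Share of this node's allocations requested by a process on another node."""
    total = stats["local_node"] + stats["other_node"]
    return stats["other_node"] / total


# Values copied from the two numastat listings above.
NODE0 = """numa_hit 3620274
numa_miss 0
numa_foreign 0
interleave_hit 21903
local_node 3610298
other_node 9976"""

NODE1 = """numa_hit 3089426
numa_miss 0
numa_foreign 0
interleave_hit 38355
local_node 3072605
other_node 16821"""

if __name__ == "__main__":
    for name, text in (("node0", NODE0), ("node1", NODE1)):
        stats = parse_numastat(text)
        print(f"{name}: remote fraction = {remote_fraction(stats):.4%}")
```

In both listings local_node plus other_node equals numa_hit, and numa_miss is 0, so at the time of the snapshot every allocation was satisfied on the node that was asked for.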

>
> Phil.

 Thanks again, Phil, and I hope to hear from you soon.
-- 
Best regards.

Sheo


