public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Help understanding slow down
@ 2004-05-24  0:32 Phy Prabab
  2004-05-24  0:57 ` Phy Prabab
  0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  0:32 UTC (permalink / raw)
  To: linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 991 bytes --]

Hello,

I need some help understanding what is at issue with
the extreme lsow down in build times for a custom
executable on different kernel versions.  The
difference is pretty huge:

RH 2.4.20-13-7 : ~1m.10s
2.4.22         : ~1m.40s
2.4.26         : ~2m.15s
2.6.1          : ~3m.40s
2.6.2          : ~4m.00s
2.6.3          : ~4m.00s
2.6.6          : ~3m.15s
2.6.6-mm4      : ~2m.10s
2.6.6-mm5      : ~2m.50s
2.6.7-p1       : ~2m.80s
(ran five times on every kernel to get approximate
time listed)

The question is, how can I get the newer kernels to
scream like the older kernels?

I have moved all files in question to the local disk
to rule out network issues (though the 2.6.x kernels
are faster at net access).  I have run the make
command in debug mode and find no differnce betHz w/8G
RAM.

Thank you for your assistance.
Phy



	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  0:32 Help understanding slow down Phy Prabab
@ 2004-05-24  0:57 ` Phy Prabab
  2004-05-24  1:04   ` William Lee Irwin III
  2004-05-24  6:33   ` Paul Rolland
  0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  0:57 UTC (permalink / raw)
  To: Phy Prabab, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2053 bytes --]

Just for more clarification, here is a perfect
example:

2.6.7-p1:
24.86user 51.77system 2:58.87elapsed 42%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+7591686minor)pagefaults
0swaps

2.4.21:
28.68user 34.98system 1:12.34elapsed 87%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5691267major+1130523minor)pagefaults
0swaps


Both runs on the same machine with the same process
(making headers).

Could someone give me some pointers/directions on
where to look.

Thank you for your time.
Phy



--- Phy Prabab <phyprabab@yahoo.com> wrote:
> Hello,
> 
> I need some help understanding what is at issue with
> the extreme lsow down in build times for a custom
> executable on different kernel versions.  The
> difference is pretty huge:
> 
> RH 2.4.20-13-7 : ~1m.10s
> 2.4.22         : ~1m.40s
> 2.4.26         : ~2m.15s
> 2.6.1          : ~3m.40s
> 2.6.2          : ~4m.00s
> 2.6.3          : ~4m.00s
> 2.6.6          : ~3m.15s
> 2.6.6-mm4      : ~2m.10s
> 2.6.6-mm5      : ~2m.50s
> 2.6.7-p1       : ~2m.80s
> (ran five times on every kernel to get approximate
> time listed)
> 
> The question is, how can I get the newer kernels to
> scream like the older kernels?
> 
> I have moved all files in question to the local disk
> to rule out network issues (though the 2.6.x kernels
> are faster at net access).  I have run the make
> command in debug mode and find no differnce betHz
> w/8G
> RAM.
> 
> Thank you for your assistance.
> Phy
> 
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains – Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer 
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  0:57 ` Phy Prabab
@ 2004-05-24  1:04   ` William Lee Irwin III
  2004-05-24  1:25     ` Jakob Oestergaard
  2004-05-24  6:33   ` Paul Rolland
  1 sibling, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24  1:04 UTC (permalink / raw)
  To: Phy Prabab; +Cc: linux-kernel

On Sun, May 23, 2004 at 05:57:51PM -0700, Phy Prabab wrote:
> Just for more clarification, here is a perfect
> example:
> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps

Thanks. This reveals that the performance regression is system time.
Could you please rerun the 2.6.7-rc1 test with kernel profiling on,
and reset the profile buffer prior just prior to the run?

This is the kernel command line option profile=1, and then as root,
echo 1 > /proc/profile just prior to the run.


-- wli

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:04   ` William Lee Irwin III
@ 2004-05-24  1:25     ` Jakob Oestergaard
  2004-05-24  1:28       ` William Lee Irwin III
  0 siblings, 1 reply; 31+ messages in thread
From: Jakob Oestergaard @ 2004-05-24  1:25 UTC (permalink / raw)
  To: William Lee Irwin III, Phy Prabab, linux-kernel

On Sun, May 23, 2004 at 06:04:55PM -0700, William Lee Irwin III wrote:
> On Sun, May 23, 2004 at 05:57:51PM -0700, Phy Prabab wrote:
> > Just for more clarification, here is a perfect
> > example:
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (5691267major+1130523minor)pagefaults
> > 0swaps
> 
> Thanks. This reveals that the performance regression is system time.

Eh, not if I read the numbers right:

2.6.7-p1: 24.86user 51.77system 2:58.87elapsed 42%CPU

24.86 + 51.77 = 76.63 seconds on CPU, 102.24 seconds of waiting

2.4.21: 28.68user 34.98system 1:12.34elapsed 87%CPU

28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds of waiting

So, 2.6.7-p1 spends 16.79 seconds more in the kernel as you observed,
but it spends 93.56 seconds more waiting for I/O (or whatever).

Unless I'm totally missing something, the wait seems to be the
regression.

 / jakob


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:25     ` Jakob Oestergaard
@ 2004-05-24  1:28       ` William Lee Irwin III
  2004-05-24  1:51         ` Phy Prabab
  0 siblings, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24  1:28 UTC (permalink / raw)
  To: Jakob Oestergaard, Phy Prabab, linux-kernel

On Mon, May 24, 2004 at 03:25:53AM +0200, Jakob Oestergaard wrote:
> Eh, not if I read the numbers right:
> 2.6.7-p1: 24.86user 51.77system 2:58.87elapsed 42%CPU
> 24.86 + 51.77 = 76.63 seconds on CPU, 102.24 seconds of waiting
> 2.4.21: 28.68user 34.98system 1:12.34elapsed 87%CPU
> 28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds of waiting
> So, 2.6.7-p1 spends 16.79 seconds more in the kernel as you observed,
> but it spends 93.56 seconds more waiting for I/O (or whatever).
> Unless I'm totally missing something, the wait seems to be the
> regression.

I'm sorry, you're right. Let's start by looking into IO activity.
Phy, could you log the output of vmstat(1) during the runs?


-- wli

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
@ 2004-05-24  1:37 Robert M. Stockmann
  2004-05-24  1:50 ` Phy Prabab
  2004-05-24  3:31 ` Robert M. Stockmann
  0 siblings, 2 replies; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24  1:37 UTC (permalink / raw)
  To: linux-kernel

Hi  Phy,

Could you also give the uname -a output of both the 2.4.x and
the 2.6.x kernels ? And also mention which make options you use ? :

make	  : 1 thread , load is around 1.00
make -j 2 : 2 theads , load is around 2.00
make -j 4 : 4 theads , load is around 4.00

# uname -a	(dual-Xeon SMP machine)
Linux jackson.stokkie.net 2.4.23abi #1 SMP Sun Jan 25 03:28:48 CET 2004 i686 unknown unknown GNU/Linux
                                       ^^^^

# uname -a	(poor PII 400 MHz machine)
Linux tapebox.stokkie.net 2.6.6 #1 Thu May 13 05:48:57 CEST 2004 i686 unknown unknown GNU/Linux
                                  ^^^ no SMP

Also , is your Opteron machine rather different as the mainstream Opteron
boards, i.e. are there some exotic drivers used ?

Regards,

Robert

> Just for more clarification, here is a perfect
> example:
>  
> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
>  
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps
>  
>  
> Both runs on the same machine with the same process
> (making headers).
>  
> Could someone give me some pointers/directions on
> where to look.
>  
> Thank you for your time.
> Phy
>  
> --- Phy Prabab <phyprabab@xxxxxxxxx> wrote:
>> Hello,
>> 
>> I need some help understanding what is at issue with
>> the extreme lsow down in build times for a custom
>> executable on different kernel versions. The
>> difference is pretty huge:
>> 
>> RH 2.4.20-13-7 : ~1m.10s
>> 2.4.22 : ~1m.40s
>> 2.4.26 : ~2m.15s
>> 2.6.1 : ~3m.40s
>> 2.6.2 : ~4m.00s
>> 2.6.3 : ~4m.00s
>> 2.6.6 : ~3m.15s
>> 2.6.6-mm4 : ~2m.10s
>> 2.6.6-mm5 : ~2m.50s
>> 2.6.7-p1 : ~2m.80s
>> (ran five times on every kernel to get approximate
>> time listed)
>> 
>> The question is, how can I get the newer kernels to
>> scream like the older kernels?
>> 
>> I have moved all files in question to the local disk
>> to rule out network issues (though the 2.6.x kernels
>> are faster at net access). I have run the make
>> command in debug mode and find no differnce betHz
>> w/8G
>> RAM.
>> 
>> Thank you for your assistance.
>> Phy
>> 
>> __________________________________
>> Do you Yahoo!?
>> Yahoo! Domains ? Claim yours for only $14.70/year
>> http://smallbusiness.promotions.yahoo.com/offer 

- 
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org  stock@stokkie.net


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:37 Robert M. Stockmann
@ 2004-05-24  1:50 ` Phy Prabab
  2004-05-24  3:31 ` Robert M. Stockmann
  1 sibling, 0 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  1:50 UTC (permalink / raw)
  To: Robert M. Stockmann, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 3656 bytes --]

Robert,

This is a x86 machine which I have transitioned to for
the time being.  I had submitted a previous email
discussing the Opteron, however, did not hear back so
tried the x86 tests.

uname -a
Linux c1m12 2.6.7-rc1 #2 SMP Sun May 23 18:16:47 PDT
2004 i686 i686 i386 GNU/Linux

making headers for my custom binary is basically just
making links to all the different header file
scattered within the source code (some dependancy
make).

Phy


--- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> Hi  Phy,
> 
> Could you also give the uname -a output of both the
> 2.4.x and
> the 2.6.x kernels ? And also mention which make
> options you use ? :
> 
> make	  : 1 thread , load is around 1.00
> make -j 2 : 2 theads , load is around 2.00
> make -j 4 : 4 theads , load is around 4.00
> 
> # uname -a	(dual-Xeon SMP machine)
> Linux jackson.stokkie.net 2.4.23abi #1 SMP Sun Jan
> 25 03:28:48 CET 2004 i686 unknown unknown GNU/Linux
>                                        ^^^^
> 
> # uname -a	(poor PII 400 MHz machine)
> Linux tapebox.stokkie.net 2.6.6 #1 Thu May 13
> 05:48:57 CEST 2004 i686 unknown unknown GNU/Linux
>                                   ^^^ no SMP
> 
> Also , is your Opteron machine rather different as
> the mainstream Opteron
> boards, i.e. are there some exotic drivers used ?
> 
> Regards,
> 
> Robert
> 
> > Just for more clarification, here is a perfect
> > example:
> >  
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> >  
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs
> (5691267major+1130523minor)pagefaults
> > 0swaps
> >  
> >  
> > Both runs on the same machine with the same
> process
> > (making headers).
> >  
> > Could someone give me some pointers/directions on
> > where to look.
> >  
> > Thank you for your time.
> > Phy
> >  
> > --- Phy Prabab <phyprabab@xxxxxxxxx> wrote:
> >> Hello,
> >> 
> >> I need some help understanding what is at issue
> with
> >> the extreme lsow down in build times for a custom
> >> executable on different kernel versions. The
> >> difference is pretty huge:
> >> 
> >> RH 2.4.20-13-7 : ~1m.10s
> >> 2.4.22 : ~1m.40s
> >> 2.4.26 : ~2m.15s
> >> 2.6.1 : ~3m.40s
> >> 2.6.2 : ~4m.00s
> >> 2.6.3 : ~4m.00s
> >> 2.6.6 : ~3m.15s
> >> 2.6.6-mm4 : ~2m.10s
> >> 2.6.6-mm5 : ~2m.50s
> >> 2.6.7-p1 : ~2m.80s
> >> (ran five times on every kernel to get
> approximate
> >> time listed)
> >> 
> >> The question is, how can I get the newer kernels
> to
> >> scream like the older kernels?
> >> 
> >> I have moved all files in question to the local
> disk
> >> to rule out network issues (though the 2.6.x
> kernels
> >> are faster at net access). I have run the make
> >> command in debug mode and find no differnce betHz
> >> w/8G
> >> RAM.
> >> 
> >> Thank you for your assistance.
> >> Phy
> >> 
> >> __________________________________
> >> Do you Yahoo!?
> >> Yahoo! Domains ? Claim yours for only $14.70/year
> >> http://smallbusiness.promotions.yahoo.com/offer 
> 
> - 
> Robert M. Stockmann - RHCE
> Network Engineer - UNIX/Linux Specialist
> crashrecovery.org  stock@stokkie.net
> 
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:28       ` William Lee Irwin III
@ 2004-05-24  1:51         ` Phy Prabab
  2004-05-24  2:42           ` Nick Piggin
  2004-05-24  2:43           ` Andrew Morton
  0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  1:51 UTC (permalink / raw)
  To: William Lee Irwin III, Jakob Oestergaard, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2916 bytes --]

Okay, so ran the test with vmstat running.  I ran it
capturing every 1 sec.  Here is an excert:

procs                      memory      swap         
io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi   
bo   in    cs us sy id wa
 1  0      0 8153848  17000  82348    0    0     0    
0 4568  4028  6 16 78  0
 0  1      0 8154168  17008  82340    0    0     0  
160 4596  4079  7 17 76  1
 1  0      0 8153848  17008  82340    0    0     0    
0 4511  3998  7 16 76  0
 1  0      0 8153912  17008  82340    0    0     0    
0 4460  3952  7 14 79  0
 1  0      0 8153784  17016  82332    0    0     0    
0 4437  3962  7 16 77  0
 1  0      0 8153528  17016  82332    0    0     0    
0 4444  3927  7 14 78  0
 1  1      0 8153784  17024  82392    0    0     0  
144 4399  3895  7 15 77  1
 0  0      0 8153592  17024  82392    0    0     0    
0 4367  3821  7 15 78  0
 1  0      0 8153848  17024  82392    0    0     0    
0 4393  3926  6 16 78  0
 1  0      0 8153528  17024  82460    0    0     0    
0 4438  3960  8 14 78  0
 1  0      0 8154040  17024  82460    0    0     0    
0 4415  3912  6 15 78  0
 1  1      0 8153720  17032  82452    0    0     0  
140 4457  3953  7 15 77  1
 1  0      0 8153784  17032  82452    0    0     0    
0 4437  3889  7 14 79  0
 0  0      0 8153784  17040  82444    0    0     0    
0 4398  3903  8 15 77  0
 1  0      0 8153464  17040  82444    0    0     0    
0 4398  3902  7 14 79  0
 0  0      0 8153528  17040  82444    0    0     0    
0 4447  3922  6 17 77  0
 0  1      0 8153720  17052  82432    0    0     0  
144 4490  3960  6 16 77  1
 0  0      0 8153656  17056  82428    0    0     0    
0 4449  3954  7 15 78  0


--- William Lee Irwin III <wli@holomorphy.com> wrote:
> On Mon, May 24, 2004 at 03:25:53AM +0200, Jakob
> Oestergaard wrote:
> > Eh, not if I read the numbers right:
> > 2.6.7-p1: 24.86user 51.77system 2:58.87elapsed
> 42%CPU
> > 24.86 + 51.77 = 76.63 seconds on CPU, 102.24
> seconds of waiting
> > 2.4.21: 28.68user 34.98system 1:12.34elapsed
> 87%CPU
> > 28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds
> of waiting
> > So, 2.6.7-p1 spends 16.79 seconds more in the
> kernel as you observed,
> > but it spends 93.56 seconds more waiting for I/O
> (or whatever).
> > Unless I'm totally missing something, the wait
> seems to be the
> > regression.
> 
> I'm sorry, you're right. Let's start by looking into
> IO activity.
> Phy, could you log the output of vmstat(1) during
> the runs?
> 
> 
> -- wli
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:51         ` Phy Prabab
@ 2004-05-24  2:42           ` Nick Piggin
  2004-05-24  2:43           ` Andrew Morton
  1 sibling, 0 replies; 31+ messages in thread
From: Nick Piggin @ 2004-05-24  2:42 UTC (permalink / raw)
  To: Phy Prabab; +Cc: William Lee Irwin III, Jakob Oestergaard, linux-kernel

Phy Prabab wrote:
> Okay, so ran the test with vmstat running.  I ran it
> capturing every 1 sec.  Here is an excert:
> 
> procs                      memory      swap         
> io     system         cpu
>  r  b   swpd   free   buff  cache   si   so    bi   bo   in    cs us sy id wa
>  1  0      0 8153848  17000  82348    0    0     0   0 4568  4028  6 16 78  0
>  0  1      0 8154168  17008  82340    0    0     0 160 4596  4079  7 17 76  1
>  1  0      0 8153848  17008  82340    0    0     0   0 4511  3998  7 16 76  0
>  1  0      0 8153912  17008  82340    0    0     0   0 4460  3952  7 14 79  0
>  1  0      0 8153784  17016  82332    0    0     0   0 4437  3962  7 16 77  0
>  1  0      0 8153528  17016  82332    0    0     0   0 4444  3927  7 14 78  0
>  1  1      0 8153784  17024  82392    0    0     0 144 4399  3895  7 15 77  1
>  0  0      0 8153592  17024  82392    0    0     0   0 4367  3821  7 15 78  0
>  1  0      0 8153848  17024  82392    0    0     0   0 4393  3926  6 16 78  0
>  1  0      0 8153528  17024  82460    0    0     0   0 4438  3960  8 14 78  0
>  1  0      0 8154040  17024  82460    0    0     0   0 4415  3912  6 15 78  0
>  1  1      0 8153720  17032  82452    0    0     0 140 4457  3953  7 15 77  1
>  1  0      0 8153784  17032  82452    0    0     0   0 4437  3889  7 14 79  0
>  0  0      0 8153784  17040  82444    0    0     0   0 4398  3903  8 15 77  0
>  1  0      0 8153464  17040  82444    0    0     0   0 4398  3902  7 14 79  0
>  0  0      0 8153528  17040  82444    0    0     0   0 4447  3922  6 17 77  0
>  0  1      0 8153720  17052  82432    0    0     0 144 4490  3960  6 16 77  1
>  0  0      0 8153656  17056  82428    0    0     0   0 4449  3954  7 15 78  0
> 

Funny, you are basically not doing any IO at all.
How many CPUs are in your system?

As WLI noted, 2.6 is still taking more system time,
so a profile might still be useful.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:51         ` Phy Prabab
  2004-05-24  2:42           ` Nick Piggin
@ 2004-05-24  2:43           ` Andrew Morton
  2004-05-24  6:23             ` Phy Prabab
  1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-24  2:43 UTC (permalink / raw)
  To: Phy Prabab; +Cc: wli, jakob, linux-kernel

Phy Prabab <phyprabab@yahoo.com> wrote:
>
> Okay, so ran the test with vmstat running.  I ran it
>  capturing every 1 sec.  Here is an excert:

That was horridly wordwrapped.

procs                      memory      swap         io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi   bo   in    cs us sy id wa
 1  0      0 8153848  17000  82348    0    0     0    0 4568  4028  6 16 78  0
 0  1      0 8154168  17008  82340    0    0     0  160 4596  4079  7 17 76  1
 1  0      0 8153848  17008  82340    0    0     0    0 4511  3998  7 16 76  0
 1  0      0 8153912  17008  82340    0    0     0    0 4460  3952  7 14 79  0
 1  0      0 8153784  17016  82332    0    0     0    0 4437  3962  7 16 77  0
 1  0      0 8153528  17016  82332    0    0     0    0 4444  3927  7 14 78  0
 1  1      0 8153784  17024  82392    0    0     0  144 4399  3895  7 15 77  1
 0  0      0 8153592  17024  82392    0    0     0    0 4367  3821  7 15 78  0
 1  0      0 8153848  17024  82392    0    0     0    0 4393  3926  6 16 78  0
 1  0      0 8153528  17024  82460    0    0     0    0 4438  3960  8 14 78  0
 1  0      0 8154040  17024  82460    0    0     0    0 4415  3912  6 15 78  0
 1  1      0 8153720  17032  82452    0    0     0  140 4457  3953  7 15 77  1
 1  0      0 8153784  17032  82452    0    0     0    0 4437  3889  7 14 79  0
 0  0      0 8153784  17040  82444    0    0     0    0 4398  3903  8 15 77  0
 1  0      0 8153464  17040  82444    0    0     0    0 4398  3902  7 14 79  0
 0  0      0 8153528  17040  82444    0    0     0    0 4447  3922  6 17 77  0
 0  1      0 8153720  17052  82432    0    0     0  144 4490  3960  6 16 77  1
 0  0      0 8153656  17056  82428    0    0     0    0 4449  3954  7 15 78  0

This is a single-CPU machine, yes?

Your application is spending most of the time in an explicit sleep of some
form.  Suggest you run strace against it and see what it's up to.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  1:37 Robert M. Stockmann
  2004-05-24  1:50 ` Phy Prabab
@ 2004-05-24  3:31 ` Robert M. Stockmann
  2004-05-24  6:14   ` Phy Prabab
  1 sibling, 1 reply; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24  3:31 UTC (permalink / raw)
  To: linux-kernel

Hi

> procs memory swap io system cpu
>
> r b swpd free    buff  cache si so bi bo  in   cs   us sy id wa
> 1 0 0    8153848 17000 82348 0  0  0  0   4568 4028 6  16 78 0
> 0 1 0    8154168 17008 82340 0  0  0  160 4596 4079 7  17 76 1
> 1 0 0    8153848 17008 82340 0  0  0  0   4511 3998 7  16 76 0
> 1 0 0    8153912 17008 82340 0  0  0  0   4460 3952 7  14 79 0
> 1 0 0    8153784 17016 82332 0  0  0  0   4437 3962 7  16 77 0
> 1 0 0    8153528 17016 82332 0  0  0  0   4444 3927 7  14 78 0
> 1 1 0    8153784 17024 82392 0  0  0  144 4399 3895 7  15 77 1
> 0 0 0    8153592 17024 82392 0  0  0  0   4367 3821 7  15 78 0
> 1 0 0    8153848 17024 82392 0  0  0  0   4393 3926 6  16 78 0
> 1 0 0    8153528 17024 82460 0  0  0  0   4438 3960 8  14 78 0
> 1 0 0    8154040 17024 82460 0  0  0  0   4415 3912 6  15 78 0
> 1 1 0    8153720 17032 82452 0  0  0  140 4457 3953 7  15 77 1
> 1 0 0    8153784 17032 82452 0  0  0  0   4437 3889 7  14 79 0
> 0 0 0    8153784 17040 82444 0  0  0  0   4398 3903 8  15 77 0
> 1 0 0    8153464 17040 82444 0  0  0  0   4398 3902 7  14 79 0
> 0 0 0    8153528 17040 82444 0  0  0  0   4447 3922 6  17 77 0
> 0 1 0    8153720 17052 82432 0  0  0  144 4490 3960 6  16 77 1
> 0 0 0    8153656 17056 82428 0  0  0  0   4449 3954 7  15 78 0

FIELD DESCRIPTION FOR VM MODE
   Procs
       r: The number of processes waiting for run time.
       b: The number of processes in uninterruptible sleep.

During the 18 seconds of displayed stats, you have 12 seconds in which
1 process is waiting for run time. Which process is that, and where is
it waiting for ?

Is your custom excecutable compile project causing this or does a simple
compile task display the same slowdowns? As a small bench test with a source
we all can download, try to compile bison-1.75.tar.bz2 :

# ./configure ; time make
...
make[1]: Leaving directory `/home/stock/tmp/src/bison-1.75'
37.33user 3.64system 0:40.66elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (121059major+64165minor)pagefaults 0swaps

regards,

Robert
-- 
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org  stock@stokkie.net


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  3:31 ` Robert M. Stockmann
@ 2004-05-24  6:14   ` Phy Prabab
  2004-05-24 18:36     ` Robert M. Stockmann
  0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  6:14 UTC (permalink / raw)
  To: Robert M. Stockmann, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 4838 bytes --]

Okay, here we go:
./configure;/usr/bin/time make
make[1]: Leaving directory `/var/tmp/bison-1.875'
9.39user 1.18system 0:11.59elapsed 91%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+147530minor)pagefaults 0swaps

 0  0      0 8151040  19704  87396    0    0     0  
148 1673  1230  3  6 90  1
 0  0      0 8150912  19712  87456    0    0     0    
0 2697  2553 10 13 78  0
 0  0      0 8150712  19712  87388    0    0     4    
0 3233  3404 15 19 66  0
 1  0      0 8147512  19724  87716    0    0   276    
0 2477  2671 23 15 60  2
 1  0      0 8149944  19736  87840    0    0   132    
0 2940  3048 17 17 65  1
 1  1      0 8148144  19736  87840    0    0     0   
32 2647  2466 22 16 62  1
 1  0      0 8150176  19744  87832    0    0     0  
212 2831  2837 19 17 63  0
 1  0      0 8148704  19744  87832    0    0    12    
0 2983  3051 18 18 63  0
 1  0      0 8149984  19744  87832    0    0     0    
0 3260  3632 20 22 58  0
 1  0      0 8149792  19744  87832    0    0     0    
0 3165  3366 19 22 59  0
 0  1      0 8150240  19760  87952    0    0   108   
36 2810  2795 16 17 64  3
 1  0      0 8148768  19768  87944    0    0    12  
208 2851  2775 18 15 65  1
 1  0      0 8149152  19784  88472    0    0   576  
200 2670  2681 16 15 64  5
 1  0      0 8148960  19796  88596    0    0    80    
0 3387  4263 10 20 71  0
 1  0      0 8147864  19800  88728    0    0     0    
0 3070  3834 23 16 60  0
 0  1      0 8148440  19812  88988    0    0     0  
468 3914  4592  7 18 73  1
procs                      memory      swap         
io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi   
bo   in    cs us sy id wa
 1  0      0 8147352  19844  89228    0    0   120  
300 3070  3900 15 17 65  4
 1  0      0 8144600  19852  89288    0    0     0  
104 1850  1008 35  8 57  1
 1  0      0 8143576  19864  89412    0    0     0    
0 1416   501 43  5 52  0


A little faster than the result you published,
however, it sheds no light for me.  Any thing that
seems to point out some issues?

Thanks,
Phy

--- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> Hi
> 
> > procs memory swap io system cpu
> >
> > r b swpd free    buff  cache si so bi bo  in   cs 
>  us sy id wa
> > 1 0 0    8153848 17000 82348 0  0  0  0   4568
> 4028 6  16 78 0
> > 0 1 0    8154168 17008 82340 0  0  0  160 4596
> 4079 7  17 76 1
> > 1 0 0    8153848 17008 82340 0  0  0  0   4511
> 3998 7  16 76 0
> > 1 0 0    8153912 17008 82340 0  0  0  0   4460
> 3952 7  14 79 0
> > 1 0 0    8153784 17016 82332 0  0  0  0   4437
> 3962 7  16 77 0
> > 1 0 0    8153528 17016 82332 0  0  0  0   4444
> 3927 7  14 78 0
> > 1 1 0    8153784 17024 82392 0  0  0  144 4399
> 3895 7  15 77 1
> > 0 0 0    8153592 17024 82392 0  0  0  0   4367
> 3821 7  15 78 0
> > 1 0 0    8153848 17024 82392 0  0  0  0   4393
> 3926 6  16 78 0
> > 1 0 0    8153528 17024 82460 0  0  0  0   4438
> 3960 8  14 78 0
> > 1 0 0    8154040 17024 82460 0  0  0  0   4415
> 3912 6  15 78 0
> > 1 1 0    8153720 17032 82452 0  0  0  140 4457
> 3953 7  15 77 1
> > 1 0 0    8153784 17032 82452 0  0  0  0   4437
> 3889 7  14 79 0
> > 0 0 0    8153784 17040 82444 0  0  0  0   4398
> 3903 8  15 77 0
> > 1 0 0    8153464 17040 82444 0  0  0  0   4398
> 3902 7  14 79 0
> > 0 0 0    8153528 17040 82444 0  0  0  0   4447
> 3922 6  17 77 0
> > 0 1 0    8153720 17052 82432 0  0  0  144 4490
> 3960 6  16 77 1
> > 0 0 0    8153656 17056 82428 0  0  0  0   4449
> 3954 7  15 78 0
> 
> FIELD DESCRIPTION FOR VM MODE
>    Procs
>        r: The number of processes waiting for run
> time.
>        b: The number of processes in uninterruptible
> sleep.
> 
> During the 18 seconds of displayed stats, you have
> 12 seconds in which
> 1 process is waiting for run time. Which process is
> that, and where is
> it waiting for ?
> 
> Is your custom excecutable compile project causing
> this or does a simple
> compile task display the same slowdowns? As a small
> bench test with a source
> we all can download, try to compile
> bison-1.75.tar.bz2 :
> 
> # ./configure ; time make
> ...
> make[1]: Leaving directory
> `/home/stock/tmp/src/bison-1.75'
> 37.33user 3.64system 0:40.66elapsed 100%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (121059major+64165minor)pagefaults
> 0swaps
> 
> regards,
> 
> Robert
> -- 
> Robert M. Stockmann - RHCE
> Network Engineer - UNIX/Linux Specialist
> crashrecovery.org  stock@stokkie.net
> 
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  2:43           ` Andrew Morton
@ 2004-05-24  6:23             ` Phy Prabab
  2004-05-24  6:27               ` William Lee Irwin III
  0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  6:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: wli, jakob, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2484 bytes --]

Sorry for the late reply.   No this is a dual Xeon
3.06Ghz.  All runs are with the same hardware and same
make/build system.

How do I generate a profile of the system.  I have
alrady compiled profiling and have enabled the kernel.

Thanks,
Phy
--- Andrew Morton <akpm@osdl.org> wrote:
> Phy Prabab <phyprabab@yahoo.com> wrote:
> >
> > Okay, so ran the test with vmstat running.  I ran
> it
> >  capturing every 1 sec.  Here is an excert:
> 
> That was horridly wordwrapped.
> 
> procs                      memory      swap        
> io     system         cpu
>  r  b   swpd   free   buff  cache   si   so    bi  
> bo   in    cs us sy id wa
>  1  0      0 8153848  17000  82348    0    0     0  
>  0 4568  4028  6 16 78  0
>  0  1      0 8154168  17008  82340    0    0     0 
> 160 4596  4079  7 17 76  1
>  1  0      0 8153848  17008  82340    0    0     0  
>  0 4511  3998  7 16 76  0
>  1  0      0 8153912  17008  82340    0    0     0  
>  0 4460  3952  7 14 79  0
>  1  0      0 8153784  17016  82332    0    0     0  
>  0 4437  3962  7 16 77  0
>  1  0      0 8153528  17016  82332    0    0     0  
>  0 4444  3927  7 14 78  0
>  1  1      0 8153784  17024  82392    0    0     0 
> 144 4399  3895  7 15 77  1
>  0  0      0 8153592  17024  82392    0    0     0  
>  0 4367  3821  7 15 78  0
>  1  0      0 8153848  17024  82392    0    0     0  
>  0 4393  3926  6 16 78  0
>  1  0      0 8153528  17024  82460    0    0     0  
>  0 4438  3960  8 14 78  0
>  1  0      0 8154040  17024  82460    0    0     0  
>  0 4415  3912  6 15 78  0
>  1  1      0 8153720  17032  82452    0    0     0 
> 140 4457  3953  7 15 77  1
>  1  0      0 8153784  17032  82452    0    0     0  
>  0 4437  3889  7 14 79  0
>  0  0      0 8153784  17040  82444    0    0     0  
>  0 4398  3903  8 15 77  0
>  1  0      0 8153464  17040  82444    0    0     0  
>  0 4398  3902  7 14 79  0
>  0  0      0 8153528  17040  82444    0    0     0  
>  0 4447  3922  6 17 77  0
>  0  1      0 8153720  17052  82432    0    0     0 
> 144 4490  3960  6 16 77  1
>  0  0      0 8153656  17056  82428    0    0     0  
>  0 4449  3954  7 15 78  0
> 
> This is a single-CPU machine, yes?
> 
> Your application is spending most of the time in an
> explicit sleep of some
> form.  Suggest you run strace against it and see
> what it's up to.


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:23             ` Phy Prabab
@ 2004-05-24  6:27               ` William Lee Irwin III
  2004-05-24  6:39                 ` Phy Prabab
  0 siblings, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24  6:27 UTC (permalink / raw)
  To: Phy Prabab; +Cc: Andrew Morton, jakob, linux-kernel

On Sun, May 23, 2004 at 11:23:37PM -0700, Phy Prabab wrote:
> Sorry for the late reply.   No this is a dual Xeon
> 3.06Ghz.  All runs are with the same hardware and same
> make/build system.
> How do I generate a profile of the system.  I have
> alrady compiled profiling and have enabled the kernel.

readprofile -n -m /boot/System.map-`uname -r` | sort -rn -k 1,1 | head -22

No need for compiled-in support, just boot with profile=1.


-- wli

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  0:57 ` Phy Prabab
  2004-05-24  1:04   ` William Lee Irwin III
@ 2004-05-24  6:33   ` Paul Rolland
  2004-05-24 17:28     ` Roger Luethi
  1 sibling, 1 reply; 31+ messages in thread
From: Paul Rolland @ 2004-05-24  6:33 UTC (permalink / raw)
  To: 'Phy Prabab', linux-kernel

Hello, 

> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
> 
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps
> 
> 
> Both runs on the same machine with the same process
> (making headers).
> 
> Could someone give me some pointers/directions on
> where to look.

Any reason why there is such a difference in the pagefaults
numbers between 2.4.x and 2.6.x ????
Could it explain a part of the time differences ?

Paul



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:27               ` William Lee Irwin III
@ 2004-05-24  6:39                 ` Phy Prabab
  2004-05-24  7:21                   ` Phy Prabab
  2004-05-24  7:53                   ` Andrew Morton
  0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  6:39 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1996 bytes --]

Okay, here is the output:

317955 total                                     
0.1302
263633 poll_idle                               
4545.3966
  6764 do_page_fault                             
5.1951
  3650 kmap_atomic                              
28.2946
  3288 __d_lookup                               
12.0000
  1856 atomic_dec_and_lock                      
30.4262
  1823 do_no_page                                
2.4870
  1398 do_anonymous_page                         
3.1991
  1168 link_path_walk                            
0.5266
  1138 find_get_page                            
19.9649
  1128 zap_pte_range                             
1.2450
  1075 nfs_lookup_revalidate                     
0.7739
  1039 page_remove_rmap                          
5.0437
  1006 in_group_p                                
9.4906
   987 handle_mm_fault                           
2.2742
   913 do_wp_page                                
1.1499
   816 finish_task_switch                        
6.6885
   772 tg3_poll                                  
2.6621
   762 rpcauth_lookup_credcache                  
1.3275
   722 page_add_file_rmap                        
4.1257
   623 system_call                              
14.1591
   607 strncpy_from_user                         
6.8202

Thanks again!
Phy



--- William Lee Irwin III <wli@holomorphy.com> wrote:
> On Sun, May 23, 2004 at 11:23:37PM -0700, Phy Prabab
> wrote:
> > Sorry for the late reply.   No this is a dual Xeon
> > 3.06Ghz.  All runs are with the same hardware and
> same
> > make/build system.
> > How do I generate a profile of the system.  I have
> > alrady compiled profiling and have enabled the
> kernel.
> 
> readprofile -n -m /boot/System.map-`uname -r` | sort
> -rn -k 1,1 | head -22
> 
> No need for compiled-in support, just boot with
> profile=1.
> 
> 
> -- wli


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:39                 ` Phy Prabab
@ 2004-05-24  7:21                   ` Phy Prabab
  2004-05-24  7:44                     ` Nick Piggin
  2004-05-24 14:13                     ` Martin J. Bligh
  2004-05-24  7:53                   ` Andrew Morton
  1 sibling, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  7:21 UTC (permalink / raw)
  To: Phy Prabab, William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2912 bytes --]

For more information, I ran the same test on 2.4.21
with profiling enabled:
22133 total               0.0102
18255 default_idle        351.0577
634 do_page_fault         0.4412
342 do_no_page            0.3654
231 d_lookup              0.8134
173 do_anonymous_page     0.2790
157 system_call           2.8036
157 pte_alloc_atomic      0.2973
130 do_wp_page            0.1222
102 __find_get_page       1.5000
99 atomic_dec_and_lock    1.3750
90 handle_mm_fault        0.2616
89 zap_page_range         0.0792
86 link_path_walk         0.0343
62 mraccessf              0.6458
59 permission             0.4470
56 mrunlock               0.5600
42 strncpy_from_user      0.5526
40 strnlen_user           0.4545
40 __do_generic_file_read 0.0316
39 __free_pages           1.2188
35 filemap_nopage         0.0648

Also refomated the previous report to make it
readable:

> 317955 total                 0.1302
> 263633 poll_idle             4545.3966
> 6764 do_page_fault           5.1951
> 3650 kmap_atomic             28.2946
> 3288 __d_lookup              12.0000
> 1856 atomic_dec_and_lock     30.4262
> 1823 do_no_page              2.4870
> 1398 do_anonymous_page       3.1991
> 1168 link_path_walk          0.5266
> 1138 find_get_page           19.9649
                               1.2450
> 1075 nfs_lookup_revalidate   0.7739
> 1039 page_remove_rmap        5.0437
> 1006 in_group_p              9.4906
> 987 handle_mm_fault          2.2742
> 913 do_wp_page               1.1499
> 816 finish_task_switch       6.6885
> 772 tg3_poll                 2.6621
> 762 rpcauth_lookup_credcache 1.3275
> 722 page_add_file_rmap       4.1257
> 623 system_call              14.1591
> 607 strncpy_from_user        6.8202
> 
> Thanks again!
> Phy
> 
> 
> 
> --- William Lee Irwin III <wli@holomorphy.com>
> wrote:
> > On Sun, May 23, 2004 at 11:23:37PM -0700, Phy
> Prabab
> > wrote:
> > > Sorry for the late reply.   No this is a dual
> Xeon
> > > 3.06Ghz.  All runs are with the same hardware
> and
> > same
> > > make/build system.
> > > How do I generate a profile of the system.  I
> have
> > > alrady compiled profiling and have enabled the
> > kernel.
> > 
> > readprofile -n -m /boot/System.map-`uname -r` |
> sort
> > -rn -k 1,1 | head -22
> > 
> > No need for compiled-in support, just boot with
> > profile=1.
> > 
> > 
> > -- wli
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains – Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer 
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:21                   ` Phy Prabab
@ 2004-05-24  7:44                     ` Nick Piggin
  2004-05-24  7:50                       ` Phy Prabab
  2004-05-24 14:13                     ` Martin J. Bligh
  1 sibling, 1 reply; 31+ messages in thread
From: Nick Piggin @ 2004-05-24  7:44 UTC (permalink / raw)
  To: Phy Prabab; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel

Phy Prabab wrote:
> For more information, I ran the same test on 2.4.21

...

Is hyperthreading enabled on 2.6 and 2.4? Can you send
a cat /proc/cpuinfo | grep processor for a 2.4 and a 2.6
kernel please?

If you have hyperthreading enabled, try 2.6.7-rc1 with
SMT (hyperthreading) scheduler support enabled in the
kernel config. It is in processor type and features menu.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:44                     ` Nick Piggin
@ 2004-05-24  7:50                       ` Phy Prabab
  2004-05-24  7:59                         ` Nick Piggin
  0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  7:50 UTC (permalink / raw)
  To: Nick Piggin; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 829 bytes --]

NO HT, disabled in bios and did not enable in kernel:
cat /proc/cpuinfo|grep processor|wc -l
  2
grep SMT .config (2.6.7-rc1)
# CONFIG_SCHED_SMT is not set

On 2.4.21 I also include "append=noht"

Thanks!
Phy



--- Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> Phy Prabab wrote:
> > For more information, I ran the same test on
> 2.4.21
> 
> ...
> 
> Is hyperthreading enabled on 2.6 and 2.4? Can you
> send
> a cat /proc/cpuinfo | grep processor for a 2.4 and a
> 2.6
> kernel please?
> 
> If you have hyperthreading enabled, try 2.6.7-rc1
> with
> SMT (hyperthreading) scheduler support enabled in
> the
> kernel config. It is in processor type and features
> menu.
> 


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:39                 ` Phy Prabab
  2004-05-24  7:21                   ` Phy Prabab
@ 2004-05-24  7:53                   ` Andrew Morton
  2004-05-24  8:00                     ` Phy Prabab
  2004-05-25 10:32                     ` Ingo Molnar
  1 sibling, 2 replies; 31+ messages in thread
From: Andrew Morton @ 2004-05-24  7:53 UTC (permalink / raw)
  To: Phy Prabab; +Cc: wli, jakob, linux-kernel

Phy Prabab <phyprabab@yahoo.com> wrote:
>
> Okay, here is the output:
> 
>  317955 total                                     
>  0.1302
>  263633 poll_idle                               
>  4545.3966
>    6764 do_page_fault                             
>  5.1951
>    3650 kmap_atomic 

It's strange that you appear to be using poll_idle().  Maybe it's an error
in the profiler - it sometimes makes mistakes in identifying small
functions.

But if you _are_ using poll_idle() and if your CPU is hyperthreaded then
yes, one "CPU" is going to take a performance hit from the "idle" one.

Is your CPU hyperthreaded?  (cat /proc/cpuinfo)

Are you booting with "idle=poll"?  Do `dmesg -s 1000000 | grep idle'.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:50                       ` Phy Prabab
@ 2004-05-24  7:59                         ` Nick Piggin
  2004-05-24 10:43                           ` Jakob Oestergaard
  0 siblings, 1 reply; 31+ messages in thread
From: Nick Piggin @ 2004-05-24  7:59 UTC (permalink / raw)
  To: Phy Prabab; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel

Phy Prabab wrote:
> NO HT, disabled in bios and did not enable in kernel:
> cat /proc/cpuinfo|grep processor|wc -l
>   2
> grep SMT .config (2.6.7-rc1)
> # CONFIG_SCHED_SMT is not set
> 
> On 2.4.21 I also include "append=noht"
> 

OK good, that makes things simpler.

I'm out of ideas though. The kernel just doesn't seem to be
the problem here. Can you put together a testcase that causes
the problem and that we can download and reproduce it?

Otherwise, try to work out why your userspace processes are
more often idle under the 2.6 kernel.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:53                   ` Andrew Morton
@ 2004-05-24  8:00                     ` Phy Prabab
  2004-05-25 10:32                     ` Ingo Molnar
  1 sibling, 0 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24  8:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: wli, jakob, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1255 bytes --]

Andrew,

I have turned HT off in the bios (caused non
deterministic performance) and I did not configure the
kernel for SMT.  Additionally, looking at dmesg shows
no idle-poll (and I do not boot the kernel with
idle-poll).  
Should I post my config for this kernel?

Thank you for your time.
Phy
--- Andrew Morton <akpm@osdl.org> wrote:
> Phy Prabab <phyprabab@yahoo.com> wrote:
> >
> > Okay, here is the output:
> > 
> >  317955 total                                     
> >  0.1302
> >  263633 poll_idle                               
> >  4545.3966
> >    6764 do_page_fault                             
> >  5.1951
> >    3650 kmap_atomic 
> 
> It's strange that you appear to be using
> poll_idle().  Maybe it's an error
> in the profiler - it sometimes makes mistakes in
> identifying small
> functions.
> 
> But if you _are_ using poll_idle() and if your CPU
> is hyperthreaded then
> yes, one "CPU" is going to take a performance hit
> from the "idle" one.
> 
> Is your CPU hyperthreaded?  (cat /proc/cpuinfo)
> 
> Are you booting with "idle=poll"?  Do `dmesg -s
> 1000000 | grep idle'.
> 


	
		
__________________________________
Do you Yahoo!?
Yahoo! Domains – Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:59                         ` Nick Piggin
@ 2004-05-24 10:43                           ` Jakob Oestergaard
  0 siblings, 0 replies; 31+ messages in thread
From: Jakob Oestergaard @ 2004-05-24 10:43 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Phy Prabab, William Lee Irwin III, Andrew Morton, linux-kernel

On Mon, May 24, 2004 at 05:59:36PM +1000, Nick Piggin wrote:
> Phy Prabab wrote:
> >NO HT, disabled in bios and did not enable in kernel:
> >cat /proc/cpuinfo|grep processor|wc -l
> >  2
> >grep SMT .config (2.6.7-rc1)
> ># CONFIG_SCHED_SMT is not set
> >
> >On 2.4.21 I also include "append=noht"
> >
> 
> OK good, that makes things simpler.
> 
> I'm out of ideas though. The kernel just doesn't seem to be
> the problem here. Can you put together a testcase that causes
> the problem and that we can download and reproduce it?

You could try running your program with 'strace -T' to see which system
calls are causing the slowdown.

If you make a lot of system calls, this can be a little tedious to go
thru though...  But it should reveal to you where your program is
waiting.

Note; if you use gprof and compile with the '-pg' switch, the profiler
will show you a profile of CPU time, not wall-clock time.  Your
performance problem is wall-clock time, and if your program actually
spends a lot of time waiting in some system call, this will *NOT* show
up in a normal gprof profile.

 / jakob


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:21                   ` Phy Prabab
  2004-05-24  7:44                     ` Nick Piggin
@ 2004-05-24 14:13                     ` Martin J. Bligh
  1 sibling, 0 replies; 31+ messages in thread
From: Martin J. Bligh @ 2004-05-24 14:13 UTC (permalink / raw)
  To: Phy Prabab, William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel

I'm curious where all those kmap_atomic's come from. Do you have
CONFIG_HIGHPTE enabled? If so, please repeat with it off, and see
if that helps ... if not, grab the kcg patch out my tree, and see
where they're coming from (or shove a printk in there doing a dump_stack
for every 10,000 interations or so)

M.

--Phy Prabab <phyprabab@yahoo.com> wrote (on Monday, May 24, 2004 00:21:07 -0700):

> For more information, I ran the same test on 2.4.21
> with profiling enabled:
> 22133 total               0.0102
> 18255 default_idle        351.0577
> 634 do_page_fault         0.4412
> 342 do_no_page            0.3654
> 231 d_lookup              0.8134
> 173 do_anonymous_page     0.2790
> 157 system_call           2.8036
> 157 pte_alloc_atomic      0.2973
> 130 do_wp_page            0.1222
> 102 __find_get_page       1.5000
> 99 atomic_dec_and_lock    1.3750
> 90 handle_mm_fault        0.2616
> 89 zap_page_range         0.0792
> 86 link_path_walk         0.0343
> 62 mraccessf              0.6458
> 59 permission             0.4470
> 56 mrunlock               0.5600
> 42 strncpy_from_user      0.5526
> 40 strnlen_user           0.4545
> 40 __do_generic_file_read 0.0316
> 39 __free_pages           1.2188
> 35 filemap_nopage         0.0648
> 
> Also refomated the previous report to make it
> readable:
> 
>> 317955 total                 0.1302
>> 263633 poll_idle             4545.3966
>> 6764 do_page_fault           5.1951
>> 3650 kmap_atomic             28.2946
>> 3288 __d_lookup              12.0000
>> 1856 atomic_dec_and_lock     30.4262
>> 1823 do_no_page              2.4870
>> 1398 do_anonymous_page       3.1991
>> 1168 link_path_walk          0.5266
>> 1138 find_get_page           19.9649
>                                1.2450
>> 1075 nfs_lookup_revalidate   0.7739
>> 1039 page_remove_rmap        5.0437
>> 1006 in_group_p              9.4906
>> 987 handle_mm_fault          2.2742
>> 913 do_wp_page               1.1499
>> 816 finish_task_switch       6.6885
>> 772 tg3_poll                 2.6621
>> 762 rpcauth_lookup_credcache 1.3275
>> 722 page_add_file_rmap       4.1257
>> 623 system_call              14.1591
>> 607 strncpy_from_user        6.8202
>> 
>> Thanks again!
>> Phy
>> 
>> 
>> 
>> --- William Lee Irwin III <wli@holomorphy.com>
>> wrote:
>> > On Sun, May 23, 2004 at 11:23:37PM -0700, Phy
>> Prabab
>> > wrote:
>> > > Sorry for the late reply.   No this is a dual
>> Xeon
>> > > 3.06Ghz.  All runs are with the same hardware
>> and
>> > same
>> > > make/build system.
>> > > How do I generate a profile of the system.  I
>> have
>> > > alrady compiled profiling and have enabled the
>> > kernel.
>> > 
>> > readprofile -n -m /boot/System.map-`uname -r` |
>> sort
>> > -rn -k 1,1 | head -22
>> > 
>> > No need for compiled-in support, just boot with
>> > profile=1.
>> > 
>> > 
>> > -- wli
>> 
>> 
>> 	
>> 		
>> __________________________________
>> Do you Yahoo!?
>> Yahoo! Domains – Claim yours for only $14.70/year
>> http://smallbusiness.promotions.yahoo.com/offer 
>> -
>> To unsubscribe from this list: send the line
>> "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at 
>> http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains – Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:33   ` Paul Rolland
@ 2004-05-24 17:28     ` Roger Luethi
  0 siblings, 0 replies; 31+ messages in thread
From: Roger Luethi @ 2004-05-24 17:28 UTC (permalink / raw)
  To: Paul Rolland; +Cc: 'Phy Prabab', linux-kernel

On Mon, 24 May 2004 08:33:49 +0200, Paul Rolland wrote:
> Hello, 
> 
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> > 
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (5691267major+1130523minor)pagefaults
> > 0swaps
> > 
> > 
> > Both runs on the same machine with the same process
> > (making headers).
> > 
> > Could someone give me some pointers/directions on
> > where to look.
> 
> Any reason why there is such a difference in the pagefaults
> numbers between 2.4.x and 2.6.x ????
> Could it explain a part of the time differences ?

Probably no. Last time I checked, the major fault field for rusage
contained bogus data (included minor faults as well). Looks like that
got fixed in 2.6.

Roger

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  6:14   ` Phy Prabab
@ 2004-05-24 18:36     ` Robert M. Stockmann
  0 siblings, 0 replies; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24 18:36 UTC (permalink / raw)
  To: Phy Prabab; +Cc: linux-kernel

On Sun, 23 May 2004, Phy Prabab wrote:

> Okay, here we go:
> ./configure;/usr/bin/time make
> make[1]: Leaving directory `/var/tmp/bison-1.875'
> 9.39user 1.18system 0:11.59elapsed 91%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+147530minor)pagefaults 0swaps
> 
>  0  0      0 8151040  19704  87396    0    0     0  
> 148 1673  1230  3  6 90  1
>  0  0      0 8150912  19712  87456    0    0     0    
> 0 2697  2553 10 13 78  0
>  0  0      0 8150712  19712  87388    0    0     4    
> 0 3233  3404 15 19 66  0
>  1  0      0 8147512  19724  87716    0    0   276    
> 0 2477  2671 23 15 60  2
>  1  0      0 8149944  19736  87840    0    0   132    
> 0 2940  3048 17 17 65  1
>  1  1      0 8148144  19736  87840    0    0     0   
> 32 2647  2466 22 16 62  1
>  1  0      0 8150176  19744  87832    0    0     0  
> 212 2831  2837 19 17 63  0
>  1  0      0 8148704  19744  87832    0    0    12    
> 0 2983  3051 18 18 63  0
>  1  0      0 8149984  19744  87832    0    0     0    
> 0 3260  3632 20 22 58  0
>  1  0      0 8149792  19744  87832    0    0     0    
> 0 3165  3366 19 22 59  0
>  0  1      0 8150240  19760  87952    0    0   108   
> 36 2810  2795 16 17 64  3
>  1  0      0 8148768  19768  87944    0    0    12  
> 208 2851  2775 18 15 65  1
>  1  0      0 8149152  19784  88472    0    0   576  
> 200 2670  2681 16 15 64  5
>  1  0      0 8148960  19796  88596    0    0    80    
> 0 3387  4263 10 20 71  0
>  1  0      0 8147864  19800  88728    0    0     0    
> 0 3070  3834 23 16 60  0
>  0  1      0 8148440  19812  88988    0    0     0  
> 468 3914  4592  7 18 73  1
> procs                      memory      swap         
> io     system         cpu
>  r  b   swpd   free   buff  cache   si   so    bi   
> bo   in    cs us sy id wa
>  1  0      0 8147352  19844  89228    0    0   120  
> 300 3070  3900 15 17 65  4
>  1  0      0 8144600  19852  89288    0    0     0  
> 104 1850  1008 35  8 57  1
>  1  0      0 8143576  19864  89412    0    0     0    
> 0 1416   501 43  5 52  0
> 
> 
> A little faster than the result you published,
> however, it sheds no light for me.  Any thing that
> seems to point out some issues?

Well you need to do this two times : using 2.6.7p1
and 2.4.21. and compare the results. Maybe compile a larger 
gnu software package so the difference in speed is more clear.

Robert
> 
> --- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> > Hi
> > 
> > > procs memory swap io system cpu
> > >
> > > r b swpd free    buff  cache si so bi bo  in   cs 
> >  us sy id wa
> > > 1 0 0    8153848 17000 82348 0  0  0  0   4568
> > 4028 6  16 78 0
> > > 0 1 0    8154168 17008 82340 0  0  0  160 4596
> > 4079 7  17 76 1
> > > 1 0 0    8153848 17008 82340 0  0  0  0   4511
> > 3998 7  16 76 0
> > > 1 0 0    8153912 17008 82340 0  0  0  0   4460
> > 3952 7  14 79 0
> > > 1 0 0    8153784 17016 82332 0  0  0  0   4437
> > 3962 7  16 77 0
> > > 1 0 0    8153528 17016 82332 0  0  0  0   4444
> > 3927 7  14 78 0
> > > 1 1 0    8153784 17024 82392 0  0  0  144 4399
> > 3895 7  15 77 1
> > > 0 0 0    8153592 17024 82392 0  0  0  0   4367
> > 3821 7  15 78 0
> > > 1 0 0    8153848 17024 82392 0  0  0  0   4393
> > 3926 6  16 78 0
> > > 1 0 0    8153528 17024 82460 0  0  0  0   4438
> > 3960 8  14 78 0
> > > 1 0 0    8154040 17024 82460 0  0  0  0   4415
> > 3912 6  15 78 0
> > > 1 1 0    8153720 17032 82452 0  0  0  140 4457
> > 3953 7  15 77 1
> > > 1 0 0    8153784 17032 82452 0  0  0  0   4437
> > 3889 7  14 79 0
> > > 0 0 0    8153784 17040 82444 0  0  0  0   4398
> > 3903 8  15 77 0
> > > 1 0 0    8153464 17040 82444 0  0  0  0   4398
> > 3902 7  14 79 0
> > > 0 0 0    8153528 17040 82444 0  0  0  0   4447
> > 3922 6  17 77 0
> > > 0 1 0    8153720 17052 82432 0  0  0  144 4490
> > 3960 6  16 77 1
> > > 0 0 0    8153656 17056 82428 0  0  0  0   4449
> > 3954 7  15 78 0
> > 
> > FIELD DESCRIPTION FOR VM MODE
> >    Procs
> >        r: The number of processes waiting for run
> > time.
> >        b: The number of processes in uninterruptible
> > sleep.
> > 
> > During the 18 seconds of displayed stats, you have
> > 12 seconds in which
> > 1 process is waiting for run time. Which process is
> > that, and where is
> > it waiting for ?
> > 
> > Is your custom excecutable compile project causing
> > this or does a simple
> > compile task display the same slowdowns? As a small
> > bench test with a source
> > we all can download, try to compile
> > bison-1.75.tar.bz2 :
> > 
> > # ./configure ; time make
> > ...
> > make[1]: Leaving directory
> > `/home/stock/tmp/src/bison-1.75'
> > 37.33user 3.64system 0:40.66elapsed 100%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (121059major+64165minor)pagefaults
> > 0swaps
> > 
> > regards,
> > 
> > Robert
> > -- 
> > Robert M. Stockmann - RHCE
> > Network Engineer - UNIX/Linux Specialist
> > crashrecovery.org  stock@stokkie.net
> > 
> > -
> > To unsubscribe from this list: send the line
> > "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at 
> > http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains – Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer 
> 

-- 
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org  stock@stokkie.net


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-25 10:32                     ` Ingo Molnar
@ 2004-05-25  9:29                       ` Andrew Morton
  2004-05-25 11:42                         ` Ingo Molnar
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-25  9:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

Ingo Molnar <mingo@elte.hu> wrote:
>
>  with the patch below we will print a big fat warning. (I did not want to
>  deny idle=poll altogether - future HT implementations might work fine
>  with polling idle.)

idle=poll is handy when profiling the kernel with oprofile clock-unhalted
events.  Because if you use the normal halt-based idle loop no profile
"ticks" are accounted to idle time at all and the results are really hard
to understand.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-25 11:42                         ` Ingo Molnar
@ 2004-05-25  9:58                           ` Andrew Morton
  2004-05-25 12:25                             ` Ingo Molnar
  0 siblings, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-25  9:58 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

Ingo Molnar <mingo@elte.hu> wrote:
>
> 
> * Andrew Morton <akpm@osdl.org> wrote:
> 
> > Ingo Molnar <mingo@elte.hu> wrote:
> > >
> > >  with the patch below we will print a big fat warning. (I did not want to
> > >  deny idle=poll altogether - future HT implementations might work fine
> > >  with polling idle.)
> > 
> > idle=poll is handy when profiling the kernel with oprofile
> > clock-unhalted events.  Because if you use the normal halt-based idle
> > loop no profile "ticks" are accounted to idle time at all and the
> > results are really hard to understand.
> 
> it makes it a bit more plausible, but kernel profiling based on ticks in
> a HT environment is still quite unreliable, even with idle=poll. The HT
> cores will yield to each other on various occasions - like spinlock
> loops. This disproportionatly increases the hits of various looping
> functions, creating false impressions of lock contention where there's
> only little contention. Plus idle=poll is a constant ~20% performance
> drain on the non-idle HT core, further distorting the profile. HT makes
> profiling really hard, no matter what.

But often one is looking for relativities rather than real absolute
numbers.  (In which case the absent idle time doesn't matter, but it freaks
me out...)

> but ... we agree on the warning printk, right?

yup.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-24  7:53                   ` Andrew Morton
  2004-05-24  8:00                     ` Phy Prabab
@ 2004-05-25 10:32                     ` Ingo Molnar
  2004-05-25  9:29                       ` Andrew Morton
  1 sibling, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 10:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


* Andrew Morton <akpm@osdl.org> wrote:

> But if you _are_ using poll_idle() and if your CPU is hyperthreaded
> then yes, one "CPU" is going to take a performance hit from the "idle"
> one.

with the patch below we will print a big fat warning. (I did not want to
deny idle=poll altogether - future HT implementations might work fine
with polling idle.)

	Ingo

Signed-off-by: Ingo Molnar <mingo@elte.hu>

--- linux/arch/i386/kernel/process.c.orig	
+++ linux/arch/i386/kernel/process.c	
@@ -202,6 +202,10 @@ static int __init idle_setup (char *str)
 	if (!strncmp(str, "poll", 4)) {
 		printk("using polling idle threads.\n");
 		pm_idle = poll_idle;
+#ifdef CONFIG_SMP
+		if (smp_num_siblings > 1)
+			printk("WARNING: polling idle and HT enabled, performance may degrade.\n");
+#endif
 	} else if (!strncmp(str, "halt", 4)) {
 		printk("using halt in idle threads.\n");
 		pm_idle = default_idle;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-25  9:29                       ` Andrew Morton
@ 2004-05-25 11:42                         ` Ingo Molnar
  2004-05-25  9:58                           ` Andrew Morton
  0 siblings, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 11:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


* Andrew Morton <akpm@osdl.org> wrote:

> Ingo Molnar <mingo@elte.hu> wrote:
> >
> >  with the patch below we will print a big fat warning. (I did not want to
> >  deny idle=poll altogether - future HT implementations might work fine
> >  with polling idle.)
> 
> idle=poll is handy when profiling the kernel with oprofile
> clock-unhalted events.  Because if you use the normal halt-based idle
> loop no profile "ticks" are accounted to idle time at all and the
> results are really hard to understand.

it makes it a bit more plausible, but kernel profiling based on ticks in
a HT environment is still quite unreliable, even with idle=poll. The HT
cores will yield to each other on various occasions - like spinlock
loops. This disproportionatly increases the hits of various looping
functions, creating false impressions of lock contention where there's
only little contention. Plus idle=poll is a constant ~20% performance
drain on the non-idle HT core, further distorting the profile. HT makes
profiling really hard, no matter what.

but ... we agree on the warning printk, right?

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Help understanding slow down
  2004-05-25  9:58                           ` Andrew Morton
@ 2004-05-25 12:25                             ` Ingo Molnar
  0 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 12:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel


* Andrew Morton <akpm@osdl.org> wrote:

> > it makes it a bit more plausible, but kernel profiling based on ticks in
> > a HT environment is still quite unreliable, even with idle=poll. The HT
> > cores will yield to each other on various occasions - like spinlock
> > loops. This disproportionatly increases the hits of various looping
> > functions, creating false impressions of lock contention where there's
> > only little contention. Plus idle=poll is a constant ~20% performance
> > drain on the non-idle HT core, further distorting the profile. HT makes
> > profiling really hard, no matter what.
> 
> But often one is looking for relativities rather than real absolute
> numbers.  (In which case the absent idle time doesn't matter, but it
> freaks me out...)

but it's the relativities that get skewed by HT, fundamentally - both
with and without idle=poll. A function that does not generate alot of
HT-yield activity will fare much less on the profiler histogram than a
function that happens to hit some contention point every now and then.
Also, cachemisses get far more prominent due to HT (they are a yield
point too) - suppressing other, possibly more important functions in the
list. So a kernel profile done under HT gives a rough idea but a
function could easily have half the overhead that the profiler credits
it to have. Profiling under HT magnifies the effect of lock-contention
and cache-misses, and shrinks the effect of algorithmic overhead. Plus
this skewing is not deterministic, it depends on the actual system
activity.

unfortunately i see no good way to do reliable time-based profiling
under HT, given the interaction of logical CPUs. The virtual CPUs break
the single-EIP notion of 'what is this physical CPU doing now'. Perhaps
it would be more reliable to not use time but use some of the other
triggers: e.g. # of uops retired?

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2004-05-25 10:36 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-05-24  0:32 Help understanding slow down Phy Prabab
2004-05-24  0:57 ` Phy Prabab
2004-05-24  1:04   ` William Lee Irwin III
2004-05-24  1:25     ` Jakob Oestergaard
2004-05-24  1:28       ` William Lee Irwin III
2004-05-24  1:51         ` Phy Prabab
2004-05-24  2:42           ` Nick Piggin
2004-05-24  2:43           ` Andrew Morton
2004-05-24  6:23             ` Phy Prabab
2004-05-24  6:27               ` William Lee Irwin III
2004-05-24  6:39                 ` Phy Prabab
2004-05-24  7:21                   ` Phy Prabab
2004-05-24  7:44                     ` Nick Piggin
2004-05-24  7:50                       ` Phy Prabab
2004-05-24  7:59                         ` Nick Piggin
2004-05-24 10:43                           ` Jakob Oestergaard
2004-05-24 14:13                     ` Martin J. Bligh
2004-05-24  7:53                   ` Andrew Morton
2004-05-24  8:00                     ` Phy Prabab
2004-05-25 10:32                     ` Ingo Molnar
2004-05-25  9:29                       ` Andrew Morton
2004-05-25 11:42                         ` Ingo Molnar
2004-05-25  9:58                           ` Andrew Morton
2004-05-25 12:25                             ` Ingo Molnar
2004-05-24  6:33   ` Paul Rolland
2004-05-24 17:28     ` Roger Luethi
  -- strict thread matches above, loose matches on Subject: below --
2004-05-24  1:37 Robert M. Stockmann
2004-05-24  1:50 ` Phy Prabab
2004-05-24  3:31 ` Robert M. Stockmann
2004-05-24  6:14   ` Phy Prabab
2004-05-24 18:36     ` Robert M. Stockmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox