* Help understanding slow down
@ 2004-05-24 0:32 Phy Prabab
2004-05-24 0:57 ` Phy Prabab
0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 0:32 UTC (permalink / raw)
To: linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 991 bytes --]
Hello,
I need some help understanding what is at issue with
the extreme lsow down in build times for a custom
executable on different kernel versions. The
difference is pretty huge:
RH 2.4.20-13-7 : ~1m.10s
2.4.22 : ~1m.40s
2.4.26 : ~2m.15s
2.6.1 : ~3m.40s
2.6.2 : ~4m.00s
2.6.3 : ~4m.00s
2.6.6 : ~3m.15s
2.6.6-mm4 : ~2m.10s
2.6.6-mm5 : ~2m.50s
2.6.7-p1 : ~2m.80s
(ran five times on every kernel to get approximate
time listed)
The question is, how can I get the newer kernels to
scream like the older kernels?
I have moved all files in question to the local disk
to rule out network issues (though the 2.6.x kernels
are faster at net access). I have run the make
command in debug mode and find no differnce betHz w/8G
RAM.
Thank you for your assistance.
Phy
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 0:32 Help understanding slow down Phy Prabab
@ 2004-05-24 0:57 ` Phy Prabab
2004-05-24 1:04 ` William Lee Irwin III
2004-05-24 6:33 ` Paul Rolland
0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 0:57 UTC (permalink / raw)
To: Phy Prabab, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2053 bytes --]
Just for more clarification, here is a perfect
example:
2.6.7-p1:
24.86user 51.77system 2:58.87elapsed 42%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+7591686minor)pagefaults
0swaps
2.4.21:
28.68user 34.98system 1:12.34elapsed 87%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5691267major+1130523minor)pagefaults
0swaps
Both runs on the same machine with the same process
(making headers).
Could someone give me some pointers/directions on
where to look.
Thank you for your time.
Phy
--- Phy Prabab <phyprabab@yahoo.com> wrote:
> Hello,
>
> I need some help understanding what is at issue with
> the extreme lsow down in build times for a custom
> executable on different kernel versions. The
> difference is pretty huge:
>
> RH 2.4.20-13-7 : ~1m.10s
> 2.4.22 : ~1m.40s
> 2.4.26 : ~2m.15s
> 2.6.1 : ~3m.40s
> 2.6.2 : ~4m.00s
> 2.6.3 : ~4m.00s
> 2.6.6 : ~3m.15s
> 2.6.6-mm4 : ~2m.10s
> 2.6.6-mm5 : ~2m.50s
> 2.6.7-p1 : ~2m.80s
> (ran five times on every kernel to get approximate
> time listed)
>
> The question is, how can I get the newer kernels to
> scream like the older kernels?
>
> I have moved all files in question to the local disk
> to rule out network issues (though the 2.6.x kernels
> are faster at net access). I have run the make
> command in debug mode and find no differnce betHz
> w/8G
> RAM.
>
> Thank you for your assistance.
> Phy
>
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 0:57 ` Phy Prabab
@ 2004-05-24 1:04 ` William Lee Irwin III
2004-05-24 1:25 ` Jakob Oestergaard
2004-05-24 6:33 ` Paul Rolland
1 sibling, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24 1:04 UTC (permalink / raw)
To: Phy Prabab; +Cc: linux-kernel
On Sun, May 23, 2004 at 05:57:51PM -0700, Phy Prabab wrote:
> Just for more clarification, here is a perfect
> example:
> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps
Thanks. This reveals that the performance regression is system time.
Could you please rerun the 2.6.7-rc1 test with kernel profiling on,
and reset the profile buffer prior just prior to the run?
This is the kernel command line option profile=1, and then as root,
echo 1 > /proc/profile just prior to the run.
-- wli
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:04 ` William Lee Irwin III
@ 2004-05-24 1:25 ` Jakob Oestergaard
2004-05-24 1:28 ` William Lee Irwin III
0 siblings, 1 reply; 31+ messages in thread
From: Jakob Oestergaard @ 2004-05-24 1:25 UTC (permalink / raw)
To: William Lee Irwin III, Phy Prabab, linux-kernel
On Sun, May 23, 2004 at 06:04:55PM -0700, William Lee Irwin III wrote:
> On Sun, May 23, 2004 at 05:57:51PM -0700, Phy Prabab wrote:
> > Just for more clarification, here is a perfect
> > example:
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (5691267major+1130523minor)pagefaults
> > 0swaps
>
> Thanks. This reveals that the performance regression is system time.
Eh, not if I read the numbers right:
2.6.7-p1: 24.86user 51.77system 2:58.87elapsed 42%CPU
24.86 + 51.77 = 76.63 seconds on CPU, 102.24 seconds of waiting
2.4.21: 28.68user 34.98system 1:12.34elapsed 87%CPU
28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds of waiting
So, 2.6.7-p1 spends 16.79 seconds more in the kernel as you observed,
but it spends 93.56 seconds more waiting for I/O (or whatever).
Unless I'm totally missing something, the wait seems to be the
regression.
/ jakob
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:25 ` Jakob Oestergaard
@ 2004-05-24 1:28 ` William Lee Irwin III
2004-05-24 1:51 ` Phy Prabab
0 siblings, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24 1:28 UTC (permalink / raw)
To: Jakob Oestergaard, Phy Prabab, linux-kernel
On Mon, May 24, 2004 at 03:25:53AM +0200, Jakob Oestergaard wrote:
> Eh, not if I read the numbers right:
> 2.6.7-p1: 24.86user 51.77system 2:58.87elapsed 42%CPU
> 24.86 + 51.77 = 76.63 seconds on CPU, 102.24 seconds of waiting
> 2.4.21: 28.68user 34.98system 1:12.34elapsed 87%CPU
> 28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds of waiting
> So, 2.6.7-p1 spends 16.79 seconds more in the kernel as you observed,
> but it spends 93.56 seconds more waiting for I/O (or whatever).
> Unless I'm totally missing something, the wait seems to be the
> regression.
I'm sorry, you're right. Let's start by looking into IO activity.
Phy, could you log the output of vmstat(1) during the runs?
-- wli
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
@ 2004-05-24 1:37 Robert M. Stockmann
2004-05-24 1:50 ` Phy Prabab
2004-05-24 3:31 ` Robert M. Stockmann
0 siblings, 2 replies; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24 1:37 UTC (permalink / raw)
To: linux-kernel
Hi Phy,
Could you also give the uname -a output of both the 2.4.x and
the 2.6.x kernels ? And also mention which make options you use ? :
make : 1 thread , load is around 1.00
make -j 2 : 2 theads , load is around 2.00
make -j 4 : 4 theads , load is around 4.00
# uname -a (dual-Xeon SMP machine)
Linux jackson.stokkie.net 2.4.23abi #1 SMP Sun Jan 25 03:28:48 CET 2004 i686 unknown unknown GNU/Linux
^^^^
# uname -a (poor PII 400 MHz machine)
Linux tapebox.stokkie.net 2.6.6 #1 Thu May 13 05:48:57 CEST 2004 i686 unknown unknown GNU/Linux
^^^ no SMP
Also , is your Opteron machine rather different as the mainstream Opteron
boards, i.e. are there some exotic drivers used ?
Regards,
Robert
> Just for more clarification, here is a perfect
> example:
>
> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
>
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps
>
>
> Both runs on the same machine with the same process
> (making headers).
>
> Could someone give me some pointers/directions on
> where to look.
>
> Thank you for your time.
> Phy
>
> --- Phy Prabab <phyprabab@xxxxxxxxx> wrote:
>> Hello,
>>
>> I need some help understanding what is at issue with
>> the extreme lsow down in build times for a custom
>> executable on different kernel versions. The
>> difference is pretty huge:
>>
>> RH 2.4.20-13-7 : ~1m.10s
>> 2.4.22 : ~1m.40s
>> 2.4.26 : ~2m.15s
>> 2.6.1 : ~3m.40s
>> 2.6.2 : ~4m.00s
>> 2.6.3 : ~4m.00s
>> 2.6.6 : ~3m.15s
>> 2.6.6-mm4 : ~2m.10s
>> 2.6.6-mm5 : ~2m.50s
>> 2.6.7-p1 : ~2m.80s
>> (ran five times on every kernel to get approximate
>> time listed)
>>
>> The question is, how can I get the newer kernels to
>> scream like the older kernels?
>>
>> I have moved all files in question to the local disk
>> to rule out network issues (though the 2.6.x kernels
>> are faster at net access). I have run the make
>> command in debug mode and find no differnce betHz
>> w/8G
>> RAM.
>>
>> Thank you for your assistance.
>> Phy
>>
>> __________________________________
>> Do you Yahoo!?
>> Yahoo! Domains ? Claim yours for only $14.70/year
>> http://smallbusiness.promotions.yahoo.com/offer
-
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org stock@stokkie.net
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:37 Robert M. Stockmann
@ 2004-05-24 1:50 ` Phy Prabab
2004-05-24 3:31 ` Robert M. Stockmann
1 sibling, 0 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 1:50 UTC (permalink / raw)
To: Robert M. Stockmann, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 3656 bytes --]
Robert,
This is a x86 machine which I have transitioned to for
the time being. I had submitted a previous email
discussing the Opteron, however, did not hear back so
tried the x86 tests.
uname -a
Linux c1m12 2.6.7-rc1 #2 SMP Sun May 23 18:16:47 PDT
2004 i686 i686 i386 GNU/Linux
making headers for my custom binary is basically just
making links to all the different header file
scattered within the source code (some dependancy
make).
Phy
--- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> Hi Phy,
>
> Could you also give the uname -a output of both the
> 2.4.x and
> the 2.6.x kernels ? And also mention which make
> options you use ? :
>
> make : 1 thread , load is around 1.00
> make -j 2 : 2 theads , load is around 2.00
> make -j 4 : 4 theads , load is around 4.00
>
> # uname -a (dual-Xeon SMP machine)
> Linux jackson.stokkie.net 2.4.23abi #1 SMP Sun Jan
> 25 03:28:48 CET 2004 i686 unknown unknown GNU/Linux
> ^^^^
>
> # uname -a (poor PII 400 MHz machine)
> Linux tapebox.stokkie.net 2.6.6 #1 Thu May 13
> 05:48:57 CEST 2004 i686 unknown unknown GNU/Linux
> ^^^ no SMP
>
> Also , is your Opteron machine rather different as
> the mainstream Opteron
> boards, i.e. are there some exotic drivers used ?
>
> Regards,
>
> Robert
>
> > Just for more clarification, here is a perfect
> > example:
> >
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> >
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs
> (5691267major+1130523minor)pagefaults
> > 0swaps
> >
> >
> > Both runs on the same machine with the same
> process
> > (making headers).
> >
> > Could someone give me some pointers/directions on
> > where to look.
> >
> > Thank you for your time.
> > Phy
> >
> > --- Phy Prabab <phyprabab@xxxxxxxxx> wrote:
> >> Hello,
> >>
> >> I need some help understanding what is at issue
> with
> >> the extreme lsow down in build times for a custom
> >> executable on different kernel versions. The
> >> difference is pretty huge:
> >>
> >> RH 2.4.20-13-7 : ~1m.10s
> >> 2.4.22 : ~1m.40s
> >> 2.4.26 : ~2m.15s
> >> 2.6.1 : ~3m.40s
> >> 2.6.2 : ~4m.00s
> >> 2.6.3 : ~4m.00s
> >> 2.6.6 : ~3m.15s
> >> 2.6.6-mm4 : ~2m.10s
> >> 2.6.6-mm5 : ~2m.50s
> >> 2.6.7-p1 : ~2m.80s
> >> (ran five times on every kernel to get
> approximate
> >> time listed)
> >>
> >> The question is, how can I get the newer kernels
> to
> >> scream like the older kernels?
> >>
> >> I have moved all files in question to the local
> disk
> >> to rule out network issues (though the 2.6.x
> kernels
> >> are faster at net access). I have run the make
> >> command in debug mode and find no differnce betHz
> >> w/8G
> >> RAM.
> >>
> >> Thank you for your assistance.
> >> Phy
> >>
> >> __________________________________
> >> Do you Yahoo!?
> >> Yahoo! Domains ? Claim yours for only $14.70/year
> >> http://smallbusiness.promotions.yahoo.com/offer
>
> -
> Robert M. Stockmann - RHCE
> Network Engineer - UNIX/Linux Specialist
> crashrecovery.org stock@stokkie.net
>
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:28 ` William Lee Irwin III
@ 2004-05-24 1:51 ` Phy Prabab
2004-05-24 2:42 ` Nick Piggin
2004-05-24 2:43 ` Andrew Morton
0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 1:51 UTC (permalink / raw)
To: William Lee Irwin III, Jakob Oestergaard, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2916 bytes --]
Okay, so ran the test with vmstat running. I ran it
capturing every 1 sec. Here is an excert:
procs memory swap
io system cpu
r b swpd free buff cache si so bi
bo in cs us sy id wa
1 0 0 8153848 17000 82348 0 0 0
0 4568 4028 6 16 78 0
0 1 0 8154168 17008 82340 0 0 0
160 4596 4079 7 17 76 1
1 0 0 8153848 17008 82340 0 0 0
0 4511 3998 7 16 76 0
1 0 0 8153912 17008 82340 0 0 0
0 4460 3952 7 14 79 0
1 0 0 8153784 17016 82332 0 0 0
0 4437 3962 7 16 77 0
1 0 0 8153528 17016 82332 0 0 0
0 4444 3927 7 14 78 0
1 1 0 8153784 17024 82392 0 0 0
144 4399 3895 7 15 77 1
0 0 0 8153592 17024 82392 0 0 0
0 4367 3821 7 15 78 0
1 0 0 8153848 17024 82392 0 0 0
0 4393 3926 6 16 78 0
1 0 0 8153528 17024 82460 0 0 0
0 4438 3960 8 14 78 0
1 0 0 8154040 17024 82460 0 0 0
0 4415 3912 6 15 78 0
1 1 0 8153720 17032 82452 0 0 0
140 4457 3953 7 15 77 1
1 0 0 8153784 17032 82452 0 0 0
0 4437 3889 7 14 79 0
0 0 0 8153784 17040 82444 0 0 0
0 4398 3903 8 15 77 0
1 0 0 8153464 17040 82444 0 0 0
0 4398 3902 7 14 79 0
0 0 0 8153528 17040 82444 0 0 0
0 4447 3922 6 17 77 0
0 1 0 8153720 17052 82432 0 0 0
144 4490 3960 6 16 77 1
0 0 0 8153656 17056 82428 0 0 0
0 4449 3954 7 15 78 0
--- William Lee Irwin III <wli@holomorphy.com> wrote:
> On Mon, May 24, 2004 at 03:25:53AM +0200, Jakob
> Oestergaard wrote:
> > Eh, not if I read the numbers right:
> > 2.6.7-p1: 24.86user 51.77system 2:58.87elapsed
> 42%CPU
> > 24.86 + 51.77 = 76.63 seconds on CPU, 102.24
> seconds of waiting
> > 2.4.21: 28.68user 34.98system 1:12.34elapsed
> 87%CPU
> > 28.68 + 34.98 = 63.66 seconds on CPU, 8.68 seconds
> of waiting
> > So, 2.6.7-p1 spends 16.79 seconds more in the
> kernel as you observed,
> > but it spends 93.56 seconds more waiting for I/O
> (or whatever).
> > Unless I'm totally missing something, the wait
> seems to be the
> > regression.
>
> I'm sorry, you're right. Let's start by looking into
> IO activity.
> Phy, could you log the output of vmstat(1) during
> the runs?
>
>
> -- wli
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:51 ` Phy Prabab
@ 2004-05-24 2:42 ` Nick Piggin
2004-05-24 2:43 ` Andrew Morton
1 sibling, 0 replies; 31+ messages in thread
From: Nick Piggin @ 2004-05-24 2:42 UTC (permalink / raw)
To: Phy Prabab; +Cc: William Lee Irwin III, Jakob Oestergaard, linux-kernel
Phy Prabab wrote:
> Okay, so ran the test with vmstat running. I ran it
> capturing every 1 sec. Here is an excert:
>
> procs memory swap
> io system cpu
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 0 8153848 17000 82348 0 0 0 0 4568 4028 6 16 78 0
> 0 1 0 8154168 17008 82340 0 0 0 160 4596 4079 7 17 76 1
> 1 0 0 8153848 17008 82340 0 0 0 0 4511 3998 7 16 76 0
> 1 0 0 8153912 17008 82340 0 0 0 0 4460 3952 7 14 79 0
> 1 0 0 8153784 17016 82332 0 0 0 0 4437 3962 7 16 77 0
> 1 0 0 8153528 17016 82332 0 0 0 0 4444 3927 7 14 78 0
> 1 1 0 8153784 17024 82392 0 0 0 144 4399 3895 7 15 77 1
> 0 0 0 8153592 17024 82392 0 0 0 0 4367 3821 7 15 78 0
> 1 0 0 8153848 17024 82392 0 0 0 0 4393 3926 6 16 78 0
> 1 0 0 8153528 17024 82460 0 0 0 0 4438 3960 8 14 78 0
> 1 0 0 8154040 17024 82460 0 0 0 0 4415 3912 6 15 78 0
> 1 1 0 8153720 17032 82452 0 0 0 140 4457 3953 7 15 77 1
> 1 0 0 8153784 17032 82452 0 0 0 0 4437 3889 7 14 79 0
> 0 0 0 8153784 17040 82444 0 0 0 0 4398 3903 8 15 77 0
> 1 0 0 8153464 17040 82444 0 0 0 0 4398 3902 7 14 79 0
> 0 0 0 8153528 17040 82444 0 0 0 0 4447 3922 6 17 77 0
> 0 1 0 8153720 17052 82432 0 0 0 144 4490 3960 6 16 77 1
> 0 0 0 8153656 17056 82428 0 0 0 0 4449 3954 7 15 78 0
>
Funny, you are basically not doing any IO at all.
How many CPUs are in your system?
As WLI noted, 2.6 is still taking more system time,
so a profile might still be useful.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:51 ` Phy Prabab
2004-05-24 2:42 ` Nick Piggin
@ 2004-05-24 2:43 ` Andrew Morton
2004-05-24 6:23 ` Phy Prabab
1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-24 2:43 UTC (permalink / raw)
To: Phy Prabab; +Cc: wli, jakob, linux-kernel
Phy Prabab <phyprabab@yahoo.com> wrote:
>
> Okay, so ran the test with vmstat running. I ran it
> capturing every 1 sec. Here is an excert:
That was horridly wordwrapped.
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 0 8153848 17000 82348 0 0 0 0 4568 4028 6 16 78 0
0 1 0 8154168 17008 82340 0 0 0 160 4596 4079 7 17 76 1
1 0 0 8153848 17008 82340 0 0 0 0 4511 3998 7 16 76 0
1 0 0 8153912 17008 82340 0 0 0 0 4460 3952 7 14 79 0
1 0 0 8153784 17016 82332 0 0 0 0 4437 3962 7 16 77 0
1 0 0 8153528 17016 82332 0 0 0 0 4444 3927 7 14 78 0
1 1 0 8153784 17024 82392 0 0 0 144 4399 3895 7 15 77 1
0 0 0 8153592 17024 82392 0 0 0 0 4367 3821 7 15 78 0
1 0 0 8153848 17024 82392 0 0 0 0 4393 3926 6 16 78 0
1 0 0 8153528 17024 82460 0 0 0 0 4438 3960 8 14 78 0
1 0 0 8154040 17024 82460 0 0 0 0 4415 3912 6 15 78 0
1 1 0 8153720 17032 82452 0 0 0 140 4457 3953 7 15 77 1
1 0 0 8153784 17032 82452 0 0 0 0 4437 3889 7 14 79 0
0 0 0 8153784 17040 82444 0 0 0 0 4398 3903 8 15 77 0
1 0 0 8153464 17040 82444 0 0 0 0 4398 3902 7 14 79 0
0 0 0 8153528 17040 82444 0 0 0 0 4447 3922 6 17 77 0
0 1 0 8153720 17052 82432 0 0 0 144 4490 3960 6 16 77 1
0 0 0 8153656 17056 82428 0 0 0 0 4449 3954 7 15 78 0
This is a single-CPU machine, yes?
Your application is spending most of the time in an explicit sleep of some
form. Suggest you run strace against it and see what it's up to.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 1:37 Robert M. Stockmann
2004-05-24 1:50 ` Phy Prabab
@ 2004-05-24 3:31 ` Robert M. Stockmann
2004-05-24 6:14 ` Phy Prabab
1 sibling, 1 reply; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24 3:31 UTC (permalink / raw)
To: linux-kernel
Hi
> procs memory swap io system cpu
>
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 0 8153848 17000 82348 0 0 0 0 4568 4028 6 16 78 0
> 0 1 0 8154168 17008 82340 0 0 0 160 4596 4079 7 17 76 1
> 1 0 0 8153848 17008 82340 0 0 0 0 4511 3998 7 16 76 0
> 1 0 0 8153912 17008 82340 0 0 0 0 4460 3952 7 14 79 0
> 1 0 0 8153784 17016 82332 0 0 0 0 4437 3962 7 16 77 0
> 1 0 0 8153528 17016 82332 0 0 0 0 4444 3927 7 14 78 0
> 1 1 0 8153784 17024 82392 0 0 0 144 4399 3895 7 15 77 1
> 0 0 0 8153592 17024 82392 0 0 0 0 4367 3821 7 15 78 0
> 1 0 0 8153848 17024 82392 0 0 0 0 4393 3926 6 16 78 0
> 1 0 0 8153528 17024 82460 0 0 0 0 4438 3960 8 14 78 0
> 1 0 0 8154040 17024 82460 0 0 0 0 4415 3912 6 15 78 0
> 1 1 0 8153720 17032 82452 0 0 0 140 4457 3953 7 15 77 1
> 1 0 0 8153784 17032 82452 0 0 0 0 4437 3889 7 14 79 0
> 0 0 0 8153784 17040 82444 0 0 0 0 4398 3903 8 15 77 0
> 1 0 0 8153464 17040 82444 0 0 0 0 4398 3902 7 14 79 0
> 0 0 0 8153528 17040 82444 0 0 0 0 4447 3922 6 17 77 0
> 0 1 0 8153720 17052 82432 0 0 0 144 4490 3960 6 16 77 1
> 0 0 0 8153656 17056 82428 0 0 0 0 4449 3954 7 15 78 0
FIELD DESCRIPTION FOR VM MODE
Procs
r: The number of processes waiting for run time.
b: The number of processes in uninterruptible sleep.
During the 18 seconds of displayed stats, you have 12 seconds in which
1 process is waiting for run time. Which process is that, and where is
it waiting for ?
Is your custom excecutable compile project causing this or does a simple
compile task display the same slowdowns? As a small bench test with a source
we all can download, try to compile bison-1.75.tar.bz2 :
# ./configure ; time make
...
make[1]: Leaving directory `/home/stock/tmp/src/bison-1.75'
37.33user 3.64system 0:40.66elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (121059major+64165minor)pagefaults 0swaps
regards,
Robert
--
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org stock@stokkie.net
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 3:31 ` Robert M. Stockmann
@ 2004-05-24 6:14 ` Phy Prabab
2004-05-24 18:36 ` Robert M. Stockmann
0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 6:14 UTC (permalink / raw)
To: Robert M. Stockmann, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 4838 bytes --]
Okay, here we go:
./configure;/usr/bin/time make
make[1]: Leaving directory `/var/tmp/bison-1.875'
9.39user 1.18system 0:11.59elapsed 91%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+147530minor)pagefaults 0swaps
0 0 0 8151040 19704 87396 0 0 0
148 1673 1230 3 6 90 1
0 0 0 8150912 19712 87456 0 0 0
0 2697 2553 10 13 78 0
0 0 0 8150712 19712 87388 0 0 4
0 3233 3404 15 19 66 0
1 0 0 8147512 19724 87716 0 0 276
0 2477 2671 23 15 60 2
1 0 0 8149944 19736 87840 0 0 132
0 2940 3048 17 17 65 1
1 1 0 8148144 19736 87840 0 0 0
32 2647 2466 22 16 62 1
1 0 0 8150176 19744 87832 0 0 0
212 2831 2837 19 17 63 0
1 0 0 8148704 19744 87832 0 0 12
0 2983 3051 18 18 63 0
1 0 0 8149984 19744 87832 0 0 0
0 3260 3632 20 22 58 0
1 0 0 8149792 19744 87832 0 0 0
0 3165 3366 19 22 59 0
0 1 0 8150240 19760 87952 0 0 108
36 2810 2795 16 17 64 3
1 0 0 8148768 19768 87944 0 0 12
208 2851 2775 18 15 65 1
1 0 0 8149152 19784 88472 0 0 576
200 2670 2681 16 15 64 5
1 0 0 8148960 19796 88596 0 0 80
0 3387 4263 10 20 71 0
1 0 0 8147864 19800 88728 0 0 0
0 3070 3834 23 16 60 0
0 1 0 8148440 19812 88988 0 0 0
468 3914 4592 7 18 73 1
procs memory swap
io system cpu
r b swpd free buff cache si so bi
bo in cs us sy id wa
1 0 0 8147352 19844 89228 0 0 120
300 3070 3900 15 17 65 4
1 0 0 8144600 19852 89288 0 0 0
104 1850 1008 35 8 57 1
1 0 0 8143576 19864 89412 0 0 0
0 1416 501 43 5 52 0
A little faster than the result you published,
however, it sheds no light for me. Any thing that
seems to point out some issues?
Thanks,
Phy
--- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> Hi
>
> > procs memory swap io system cpu
> >
> > r b swpd free buff cache si so bi bo in cs
> us sy id wa
> > 1 0 0 8153848 17000 82348 0 0 0 0 4568
> 4028 6 16 78 0
> > 0 1 0 8154168 17008 82340 0 0 0 160 4596
> 4079 7 17 76 1
> > 1 0 0 8153848 17008 82340 0 0 0 0 4511
> 3998 7 16 76 0
> > 1 0 0 8153912 17008 82340 0 0 0 0 4460
> 3952 7 14 79 0
> > 1 0 0 8153784 17016 82332 0 0 0 0 4437
> 3962 7 16 77 0
> > 1 0 0 8153528 17016 82332 0 0 0 0 4444
> 3927 7 14 78 0
> > 1 1 0 8153784 17024 82392 0 0 0 144 4399
> 3895 7 15 77 1
> > 0 0 0 8153592 17024 82392 0 0 0 0 4367
> 3821 7 15 78 0
> > 1 0 0 8153848 17024 82392 0 0 0 0 4393
> 3926 6 16 78 0
> > 1 0 0 8153528 17024 82460 0 0 0 0 4438
> 3960 8 14 78 0
> > 1 0 0 8154040 17024 82460 0 0 0 0 4415
> 3912 6 15 78 0
> > 1 1 0 8153720 17032 82452 0 0 0 140 4457
> 3953 7 15 77 1
> > 1 0 0 8153784 17032 82452 0 0 0 0 4437
> 3889 7 14 79 0
> > 0 0 0 8153784 17040 82444 0 0 0 0 4398
> 3903 8 15 77 0
> > 1 0 0 8153464 17040 82444 0 0 0 0 4398
> 3902 7 14 79 0
> > 0 0 0 8153528 17040 82444 0 0 0 0 4447
> 3922 6 17 77 0
> > 0 1 0 8153720 17052 82432 0 0 0 144 4490
> 3960 6 16 77 1
> > 0 0 0 8153656 17056 82428 0 0 0 0 4449
> 3954 7 15 78 0
>
> FIELD DESCRIPTION FOR VM MODE
> Procs
> r: The number of processes waiting for run
> time.
> b: The number of processes in uninterruptible
> sleep.
>
> During the 18 seconds of displayed stats, you have
> 12 seconds in which
> 1 process is waiting for run time. Which process is
> that, and where is
> it waiting for ?
>
> Is your custom excecutable compile project causing
> this or does a simple
> compile task display the same slowdowns? As a small
> bench test with a source
> we all can download, try to compile
> bison-1.75.tar.bz2 :
>
> # ./configure ; time make
> ...
> make[1]: Leaving directory
> `/home/stock/tmp/src/bison-1.75'
> 37.33user 3.64system 0:40.66elapsed 100%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (121059major+64165minor)pagefaults
> 0swaps
>
> regards,
>
> Robert
> --
> Robert M. Stockmann - RHCE
> Network Engineer - UNIX/Linux Specialist
> crashrecovery.org stock@stokkie.net
>
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 2:43 ` Andrew Morton
@ 2004-05-24 6:23 ` Phy Prabab
2004-05-24 6:27 ` William Lee Irwin III
0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 6:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: wli, jakob, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2484 bytes --]
Sorry for the late reply. No this is a dual Xeon
3.06Ghz. All runs are with the same hardware and same
make/build system.
How do I generate a profile of the system. I have
alrady compiled profiling and have enabled the kernel.
Thanks,
Phy
--- Andrew Morton <akpm@osdl.org> wrote:
> Phy Prabab <phyprabab@yahoo.com> wrote:
> >
> > Okay, so ran the test with vmstat running. I ran
> it
> > capturing every 1 sec. Here is an excert:
>
> That was horridly wordwrapped.
>
> procs memory swap
> io system cpu
> r b swpd free buff cache si so bi
> bo in cs us sy id wa
> 1 0 0 8153848 17000 82348 0 0 0
> 0 4568 4028 6 16 78 0
> 0 1 0 8154168 17008 82340 0 0 0
> 160 4596 4079 7 17 76 1
> 1 0 0 8153848 17008 82340 0 0 0
> 0 4511 3998 7 16 76 0
> 1 0 0 8153912 17008 82340 0 0 0
> 0 4460 3952 7 14 79 0
> 1 0 0 8153784 17016 82332 0 0 0
> 0 4437 3962 7 16 77 0
> 1 0 0 8153528 17016 82332 0 0 0
> 0 4444 3927 7 14 78 0
> 1 1 0 8153784 17024 82392 0 0 0
> 144 4399 3895 7 15 77 1
> 0 0 0 8153592 17024 82392 0 0 0
> 0 4367 3821 7 15 78 0
> 1 0 0 8153848 17024 82392 0 0 0
> 0 4393 3926 6 16 78 0
> 1 0 0 8153528 17024 82460 0 0 0
> 0 4438 3960 8 14 78 0
> 1 0 0 8154040 17024 82460 0 0 0
> 0 4415 3912 6 15 78 0
> 1 1 0 8153720 17032 82452 0 0 0
> 140 4457 3953 7 15 77 1
> 1 0 0 8153784 17032 82452 0 0 0
> 0 4437 3889 7 14 79 0
> 0 0 0 8153784 17040 82444 0 0 0
> 0 4398 3903 8 15 77 0
> 1 0 0 8153464 17040 82444 0 0 0
> 0 4398 3902 7 14 79 0
> 0 0 0 8153528 17040 82444 0 0 0
> 0 4447 3922 6 17 77 0
> 0 1 0 8153720 17052 82432 0 0 0
> 144 4490 3960 6 16 77 1
> 0 0 0 8153656 17056 82428 0 0 0
> 0 4449 3954 7 15 78 0
>
> This is a single-CPU machine, yes?
>
> Your application is spending most of the time in an
> explicit sleep of some
> form. Suggest you run strace against it and see
> what it's up to.
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:23 ` Phy Prabab
@ 2004-05-24 6:27 ` William Lee Irwin III
2004-05-24 6:39 ` Phy Prabab
0 siblings, 1 reply; 31+ messages in thread
From: William Lee Irwin III @ 2004-05-24 6:27 UTC (permalink / raw)
To: Phy Prabab; +Cc: Andrew Morton, jakob, linux-kernel
On Sun, May 23, 2004 at 11:23:37PM -0700, Phy Prabab wrote:
> Sorry for the late reply. No this is a dual Xeon
> 3.06Ghz. All runs are with the same hardware and same
> make/build system.
> How do I generate a profile of the system. I have
> alrady compiled profiling and have enabled the kernel.
readprofile -n -m /boot/System.map-`uname -r` | sort -rn -k 1,1 | head -22
No need for compiled-in support, just boot with profile=1.
-- wli
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 0:57 ` Phy Prabab
2004-05-24 1:04 ` William Lee Irwin III
@ 2004-05-24 6:33 ` Paul Rolland
2004-05-24 17:28 ` Roger Luethi
1 sibling, 1 reply; 31+ messages in thread
From: Paul Rolland @ 2004-05-24 6:33 UTC (permalink / raw)
To: 'Phy Prabab', linux-kernel
Hello,
> 2.6.7-p1:
> 24.86user 51.77system 2:58.87elapsed 42%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+7591686minor)pagefaults
> 0swaps
>
> 2.4.21:
> 28.68user 34.98system 1:12.34elapsed 87%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5691267major+1130523minor)pagefaults
> 0swaps
>
>
> Both runs on the same machine with the same process
> (making headers).
>
> Could someone give me some pointers/directions on
> where to look.
Any reason why there is such a difference in the pagefaults
numbers between 2.4.x and 2.6.x ????
Could it explain a part of the time differences ?
Paul
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:27 ` William Lee Irwin III
@ 2004-05-24 6:39 ` Phy Prabab
2004-05-24 7:21 ` Phy Prabab
2004-05-24 7:53 ` Andrew Morton
0 siblings, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 6:39 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1996 bytes --]
Okay, here is the output:
317955 total
0.1302
263633 poll_idle
4545.3966
6764 do_page_fault
5.1951
3650 kmap_atomic
28.2946
3288 __d_lookup
12.0000
1856 atomic_dec_and_lock
30.4262
1823 do_no_page
2.4870
1398 do_anonymous_page
3.1991
1168 link_path_walk
0.5266
1138 find_get_page
19.9649
1128 zap_pte_range
1.2450
1075 nfs_lookup_revalidate
0.7739
1039 page_remove_rmap
5.0437
1006 in_group_p
9.4906
987 handle_mm_fault
2.2742
913 do_wp_page
1.1499
816 finish_task_switch
6.6885
772 tg3_poll
2.6621
762 rpcauth_lookup_credcache
1.3275
722 page_add_file_rmap
4.1257
623 system_call
14.1591
607 strncpy_from_user
6.8202
Thanks again!
Phy
--- William Lee Irwin III <wli@holomorphy.com> wrote:
> On Sun, May 23, 2004 at 11:23:37PM -0700, Phy Prabab
> wrote:
> > Sorry for the late reply. No this is a dual Xeon
> > 3.06Ghz. All runs are with the same hardware and
> same
> > make/build system.
> > How do I generate a profile of the system. I have
> > alrady compiled profiling and have enabled the
> kernel.
>
> readprofile -n -m /boot/System.map-`uname -r` | sort
> -rn -k 1,1 | head -22
>
> No need for compiled-in support, just boot with
> profile=1.
>
>
> -- wli
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:39 ` Phy Prabab
@ 2004-05-24 7:21 ` Phy Prabab
2004-05-24 7:44 ` Nick Piggin
2004-05-24 14:13 ` Martin J. Bligh
2004-05-24 7:53 ` Andrew Morton
1 sibling, 2 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 7:21 UTC (permalink / raw)
To: Phy Prabab, William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 2912 bytes --]
For more information, I ran the same test on 2.4.21
with profiling enabled:
22133 total 0.0102
18255 default_idle 351.0577
634 do_page_fault 0.4412
342 do_no_page 0.3654
231 d_lookup 0.8134
173 do_anonymous_page 0.2790
157 system_call 2.8036
157 pte_alloc_atomic 0.2973
130 do_wp_page 0.1222
102 __find_get_page 1.5000
99 atomic_dec_and_lock 1.3750
90 handle_mm_fault 0.2616
89 zap_page_range 0.0792
86 link_path_walk 0.0343
62 mraccessf 0.6458
59 permission 0.4470
56 mrunlock 0.5600
42 strncpy_from_user 0.5526
40 strnlen_user 0.4545
40 __do_generic_file_read 0.0316
39 __free_pages 1.2188
35 filemap_nopage 0.0648
Also refomated the previous report to make it
readable:
> 317955 total 0.1302
> 263633 poll_idle 4545.3966
> 6764 do_page_fault 5.1951
> 3650 kmap_atomic 28.2946
> 3288 __d_lookup 12.0000
> 1856 atomic_dec_and_lock 30.4262
> 1823 do_no_page 2.4870
> 1398 do_anonymous_page 3.1991
> 1168 link_path_walk 0.5266
> 1138 find_get_page 19.9649
1.2450
> 1075 nfs_lookup_revalidate 0.7739
> 1039 page_remove_rmap 5.0437
> 1006 in_group_p 9.4906
> 987 handle_mm_fault 2.2742
> 913 do_wp_page 1.1499
> 816 finish_task_switch 6.6885
> 772 tg3_poll 2.6621
> 762 rpcauth_lookup_credcache 1.3275
> 722 page_add_file_rmap 4.1257
> 623 system_call 14.1591
> 607 strncpy_from_user 6.8202
>
> Thanks again!
> Phy
>
>
>
> --- William Lee Irwin III <wli@holomorphy.com>
> wrote:
> > On Sun, May 23, 2004 at 11:23:37PM -0700, Phy
> Prabab
> > wrote:
> > > Sorry for the late reply. No this is a dual
> Xeon
> > > 3.06Ghz. All runs are with the same hardware
> and
> > same
> > > make/build system.
> > > How do I generate a profile of the system. I
> have
> > > alrady compiled profiling and have enabled the
> > kernel.
> >
> > readprofile -n -m /boot/System.map-`uname -r` |
> sort
> > -rn -k 1,1 | head -22
> >
> > No need for compiled-in support, just boot with
> > profile=1.
> >
> >
> > -- wli
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer
> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:21 ` Phy Prabab
@ 2004-05-24 7:44 ` Nick Piggin
2004-05-24 7:50 ` Phy Prabab
2004-05-24 14:13 ` Martin J. Bligh
1 sibling, 1 reply; 31+ messages in thread
From: Nick Piggin @ 2004-05-24 7:44 UTC (permalink / raw)
To: Phy Prabab; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel
Phy Prabab wrote:
> For more information, I ran the same test on 2.4.21
...
Is hyperthreading enabled on 2.6 and 2.4? Can you send
a cat /proc/cpuinfo | grep processor for a 2.4 and a 2.6
kernel please?
If you have hyperthreading enabled, try 2.6.7-rc1 with
SMT (hyperthreading) scheduler support enabled in the
kernel config. It is in processor type and features menu.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:44 ` Nick Piggin
@ 2004-05-24 7:50 ` Phy Prabab
2004-05-24 7:59 ` Nick Piggin
0 siblings, 1 reply; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 7:50 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 829 bytes --]
NO HT, disabled in bios and did not enable in kernel:
cat /proc/cpuinfo|grep processor|wc -l
2
grep SMT .config (2.6.7-rc1)
# CONFIG_SCHED_SMT is not set
On 2.4.21 I also include "append=noht"
Thanks!
Phy
--- Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> Phy Prabab wrote:
> > For more information, I ran the same test on
> 2.4.21
>
> ...
>
> Is hyperthreading enabled on 2.6 and 2.4? Can you
> send
> a cat /proc/cpuinfo | grep processor for a 2.4 and a
> 2.6
> kernel please?
>
> If you have hyperthreading enabled, try 2.6.7-rc1
> with
> SMT (hyperthreading) scheduler support enabled in
> the
> kernel config. It is in processor type and features
> menu.
>
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:39 ` Phy Prabab
2004-05-24 7:21 ` Phy Prabab
@ 2004-05-24 7:53 ` Andrew Morton
2004-05-24 8:00 ` Phy Prabab
2004-05-25 10:32 ` Ingo Molnar
1 sibling, 2 replies; 31+ messages in thread
From: Andrew Morton @ 2004-05-24 7:53 UTC (permalink / raw)
To: Phy Prabab; +Cc: wli, jakob, linux-kernel
Phy Prabab <phyprabab@yahoo.com> wrote:
>
> Okay, here is the output:
>
> 317955 total
> 0.1302
> 263633 poll_idle
> 4545.3966
> 6764 do_page_fault
> 5.1951
> 3650 kmap_atomic
It's strange that you appear to be using poll_idle(). Maybe it's an error
in the profiler - it sometimes makes mistakes in identifying small
functions.
But if you _are_ using poll_idle() and if your CPU is hyperthreaded then
yes, one "CPU" is going to take a performance hit from the "idle" one.
Is your CPU hyperthreaded? (cat /proc/cpuinfo)
Are you booting with "idle=poll"? Do `dmesg -s 1000000 | grep idle'.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:50 ` Phy Prabab
@ 2004-05-24 7:59 ` Nick Piggin
2004-05-24 10:43 ` Jakob Oestergaard
0 siblings, 1 reply; 31+ messages in thread
From: Nick Piggin @ 2004-05-24 7:59 UTC (permalink / raw)
To: Phy Prabab; +Cc: William Lee Irwin III, Andrew Morton, jakob, linux-kernel
Phy Prabab wrote:
> NO HT, disabled in bios and did not enable in kernel:
> cat /proc/cpuinfo|grep processor|wc -l
> 2
> grep SMT .config (2.6.7-rc1)
> # CONFIG_SCHED_SMT is not set
>
> On 2.4.21 I also include "append=noht"
>
OK good, that makes things simpler.
I'm out of ideas though. The kernel just doesn't seem to be
the problem here. Can you put together a testcase that causes
the problem and that we can download and reproduce it?
Otherwise, try to work out why your userspace processes are
more often idle under the 2.6 kernel.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:53 ` Andrew Morton
@ 2004-05-24 8:00 ` Phy Prabab
2004-05-25 10:32 ` Ingo Molnar
1 sibling, 0 replies; 31+ messages in thread
From: Phy Prabab @ 2004-05-24 8:00 UTC (permalink / raw)
To: Andrew Morton; +Cc: wli, jakob, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 1255 bytes --]
Andrew,
I have turned HT off in the bios (caused non
deterministic performance) and I did not configure the
kernel for SMT. Additionally, looking at dmesg shows
no idle-poll (and I do not boot the kernel with
idle-poll).
Should I post my config for this kernel?
Thank you for your time.
Phy
--- Andrew Morton <akpm@osdl.org> wrote:
> Phy Prabab <phyprabab@yahoo.com> wrote:
> >
> > Okay, here is the output:
> >
> > 317955 total
> > 0.1302
> > 263633 poll_idle
> > 4545.3966
> > 6764 do_page_fault
> > 5.1951
> > 3650 kmap_atomic
>
> It's strange that you appear to be using
> poll_idle(). Maybe it's an error
> in the profiler - it sometimes makes mistakes in
> identifying small
> functions.
>
> But if you _are_ using poll_idle() and if your CPU
> is hyperthreaded then
> yes, one "CPU" is going to take a performance hit
> from the "idle" one.
>
> Is your CPU hyperthreaded? (cat /proc/cpuinfo)
>
> Are you booting with "idle=poll"? Do `dmesg -s
> 1000000 | grep idle'.
>
__________________________________
Do you Yahoo!?
Yahoo! Domains Claim yours for only $14.70/year
http://smallbusiness.promotions.yahoo.com/offer
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:59 ` Nick Piggin
@ 2004-05-24 10:43 ` Jakob Oestergaard
0 siblings, 0 replies; 31+ messages in thread
From: Jakob Oestergaard @ 2004-05-24 10:43 UTC (permalink / raw)
To: Nick Piggin
Cc: Phy Prabab, William Lee Irwin III, Andrew Morton, linux-kernel
On Mon, May 24, 2004 at 05:59:36PM +1000, Nick Piggin wrote:
> Phy Prabab wrote:
> >NO HT, disabled in bios and did not enable in kernel:
> >cat /proc/cpuinfo|grep processor|wc -l
> > 2
> >grep SMT .config (2.6.7-rc1)
> ># CONFIG_SCHED_SMT is not set
> >
> >On 2.4.21 I also include "append=noht"
> >
>
> OK good, that makes things simpler.
>
> I'm out of ideas though. The kernel just doesn't seem to be
> the problem here. Can you put together a testcase that causes
> the problem and that we can download and reproduce it?
You could try running your program with 'strace -T' to see which system
calls are causing the slowdown.
If you make a lot of system calls, this can be a little tedious to go
thru though... But it should reveal to you where your program is
waiting.
Note; if you use gprof and compile with the '-pg' switch, the profiler
will show you a profile of CPU time, not wall-clock time. Your
performance problem is wall-clock time, and if your program actually
spends a lot of time waiting in some system call, this will *NOT* show
up in a normal gprof profile.
/ jakob
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:21 ` Phy Prabab
2004-05-24 7:44 ` Nick Piggin
@ 2004-05-24 14:13 ` Martin J. Bligh
1 sibling, 0 replies; 31+ messages in thread
From: Martin J. Bligh @ 2004-05-24 14:13 UTC (permalink / raw)
To: Phy Prabab, William Lee Irwin III; +Cc: Andrew Morton, jakob, linux-kernel
I'm curious where all those kmap_atomic's come from. Do you have
CONFIG_HIGHPTE enabled? If so, please repeat with it off, and see
if that helps ... if not, grab the kcg patch out my tree, and see
where they're coming from (or shove a printk in there doing a dump_stack
for every 10,000 interations or so)
M.
--Phy Prabab <phyprabab@yahoo.com> wrote (on Monday, May 24, 2004 00:21:07 -0700):
> For more information, I ran the same test on 2.4.21
> with profiling enabled:
> 22133 total 0.0102
> 18255 default_idle 351.0577
> 634 do_page_fault 0.4412
> 342 do_no_page 0.3654
> 231 d_lookup 0.8134
> 173 do_anonymous_page 0.2790
> 157 system_call 2.8036
> 157 pte_alloc_atomic 0.2973
> 130 do_wp_page 0.1222
> 102 __find_get_page 1.5000
> 99 atomic_dec_and_lock 1.3750
> 90 handle_mm_fault 0.2616
> 89 zap_page_range 0.0792
> 86 link_path_walk 0.0343
> 62 mraccessf 0.6458
> 59 permission 0.4470
> 56 mrunlock 0.5600
> 42 strncpy_from_user 0.5526
> 40 strnlen_user 0.4545
> 40 __do_generic_file_read 0.0316
> 39 __free_pages 1.2188
> 35 filemap_nopage 0.0648
>
> Also refomated the previous report to make it
> readable:
>
>> 317955 total 0.1302
>> 263633 poll_idle 4545.3966
>> 6764 do_page_fault 5.1951
>> 3650 kmap_atomic 28.2946
>> 3288 __d_lookup 12.0000
>> 1856 atomic_dec_and_lock 30.4262
>> 1823 do_no_page 2.4870
>> 1398 do_anonymous_page 3.1991
>> 1168 link_path_walk 0.5266
>> 1138 find_get_page 19.9649
> 1.2450
>> 1075 nfs_lookup_revalidate 0.7739
>> 1039 page_remove_rmap 5.0437
>> 1006 in_group_p 9.4906
>> 987 handle_mm_fault 2.2742
>> 913 do_wp_page 1.1499
>> 816 finish_task_switch 6.6885
>> 772 tg3_poll 2.6621
>> 762 rpcauth_lookup_credcache 1.3275
>> 722 page_add_file_rmap 4.1257
>> 623 system_call 14.1591
>> 607 strncpy_from_user 6.8202
>>
>> Thanks again!
>> Phy
>>
>>
>>
>> --- William Lee Irwin III <wli@holomorphy.com>
>> wrote:
>> > On Sun, May 23, 2004 at 11:23:37PM -0700, Phy
>> Prabab
>> > wrote:
>> > > Sorry for the late reply. No this is a dual
>> Xeon
>> > > 3.06Ghz. All runs are with the same hardware
>> and
>> > same
>> > > make/build system.
>> > > How do I generate a profile of the system. I
>> have
>> > > alrady compiled profiling and have enabled the
>> > kernel.
>> >
>> > readprofile -n -m /boot/System.map-`uname -r` |
>> sort
>> > -rn -k 1,1 | head -22
>> >
>> > No need for compiled-in support, just boot with
>> > profile=1.
>> >
>> >
>> > -- wli
>>
>>
>>
>>
>> __________________________________
>> Do you Yahoo!?
>> Yahoo! Domains Claim yours for only $14.70/year
>> http://smallbusiness.promotions.yahoo.com/offer
>> -
>> To unsubscribe from this list: send the line
>> "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at
>> http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:33 ` Paul Rolland
@ 2004-05-24 17:28 ` Roger Luethi
0 siblings, 0 replies; 31+ messages in thread
From: Roger Luethi @ 2004-05-24 17:28 UTC (permalink / raw)
To: Paul Rolland; +Cc: 'Phy Prabab', linux-kernel
On Mon, 24 May 2004 08:33:49 +0200, Paul Rolland wrote:
> Hello,
>
> > 2.6.7-p1:
> > 24.86user 51.77system 2:58.87elapsed 42%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (13major+7591686minor)pagefaults
> > 0swaps
> >
> > 2.4.21:
> > 28.68user 34.98system 1:12.34elapsed 87%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (5691267major+1130523minor)pagefaults
> > 0swaps
> >
> >
> > Both runs on the same machine with the same process
> > (making headers).
> >
> > Could someone give me some pointers/directions on
> > where to look.
>
> Any reason why there is such a difference in the pagefaults
> numbers between 2.4.x and 2.6.x ????
> Could it explain a part of the time differences ?
Probably no. Last time I checked, the major fault field for rusage
contained bogus data (included minor faults as well). Looks like that
got fixed in 2.6.
Roger
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 6:14 ` Phy Prabab
@ 2004-05-24 18:36 ` Robert M. Stockmann
0 siblings, 0 replies; 31+ messages in thread
From: Robert M. Stockmann @ 2004-05-24 18:36 UTC (permalink / raw)
To: Phy Prabab; +Cc: linux-kernel
On Sun, 23 May 2004, Phy Prabab wrote:
> Okay, here we go:
> ./configure;/usr/bin/time make
> make[1]: Leaving directory `/var/tmp/bison-1.875'
> 9.39user 1.18system 0:11.59elapsed 91%CPU
> (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (0major+147530minor)pagefaults 0swaps
>
> 0 0 0 8151040 19704 87396 0 0 0
> 148 1673 1230 3 6 90 1
> 0 0 0 8150912 19712 87456 0 0 0
> 0 2697 2553 10 13 78 0
> 0 0 0 8150712 19712 87388 0 0 4
> 0 3233 3404 15 19 66 0
> 1 0 0 8147512 19724 87716 0 0 276
> 0 2477 2671 23 15 60 2
> 1 0 0 8149944 19736 87840 0 0 132
> 0 2940 3048 17 17 65 1
> 1 1 0 8148144 19736 87840 0 0 0
> 32 2647 2466 22 16 62 1
> 1 0 0 8150176 19744 87832 0 0 0
> 212 2831 2837 19 17 63 0
> 1 0 0 8148704 19744 87832 0 0 12
> 0 2983 3051 18 18 63 0
> 1 0 0 8149984 19744 87832 0 0 0
> 0 3260 3632 20 22 58 0
> 1 0 0 8149792 19744 87832 0 0 0
> 0 3165 3366 19 22 59 0
> 0 1 0 8150240 19760 87952 0 0 108
> 36 2810 2795 16 17 64 3
> 1 0 0 8148768 19768 87944 0 0 12
> 208 2851 2775 18 15 65 1
> 1 0 0 8149152 19784 88472 0 0 576
> 200 2670 2681 16 15 64 5
> 1 0 0 8148960 19796 88596 0 0 80
> 0 3387 4263 10 20 71 0
> 1 0 0 8147864 19800 88728 0 0 0
> 0 3070 3834 23 16 60 0
> 0 1 0 8148440 19812 88988 0 0 0
> 468 3914 4592 7 18 73 1
> procs memory swap
> io system cpu
> r b swpd free buff cache si so bi
> bo in cs us sy id wa
> 1 0 0 8147352 19844 89228 0 0 120
> 300 3070 3900 15 17 65 4
> 1 0 0 8144600 19852 89288 0 0 0
> 104 1850 1008 35 8 57 1
> 1 0 0 8143576 19864 89412 0 0 0
> 0 1416 501 43 5 52 0
>
>
> A little faster than the result you published,
> however, it sheds no light for me. Any thing that
> seems to point out some issues?
Well you need to do this two times : using 2.6.7p1
and 2.4.21. and compare the results. Maybe compile a larger
gnu software package so the difference in speed is more clear.
Robert
>
> --- "Robert M. Stockmann" <stock@stokkie.net> wrote:
> > Hi
> >
> > > procs memory swap io system cpu
> > >
> > > r b swpd free buff cache si so bi bo in cs
> > us sy id wa
> > > 1 0 0 8153848 17000 82348 0 0 0 0 4568
> > 4028 6 16 78 0
> > > 0 1 0 8154168 17008 82340 0 0 0 160 4596
> > 4079 7 17 76 1
> > > 1 0 0 8153848 17008 82340 0 0 0 0 4511
> > 3998 7 16 76 0
> > > 1 0 0 8153912 17008 82340 0 0 0 0 4460
> > 3952 7 14 79 0
> > > 1 0 0 8153784 17016 82332 0 0 0 0 4437
> > 3962 7 16 77 0
> > > 1 0 0 8153528 17016 82332 0 0 0 0 4444
> > 3927 7 14 78 0
> > > 1 1 0 8153784 17024 82392 0 0 0 144 4399
> > 3895 7 15 77 1
> > > 0 0 0 8153592 17024 82392 0 0 0 0 4367
> > 3821 7 15 78 0
> > > 1 0 0 8153848 17024 82392 0 0 0 0 4393
> > 3926 6 16 78 0
> > > 1 0 0 8153528 17024 82460 0 0 0 0 4438
> > 3960 8 14 78 0
> > > 1 0 0 8154040 17024 82460 0 0 0 0 4415
> > 3912 6 15 78 0
> > > 1 1 0 8153720 17032 82452 0 0 0 140 4457
> > 3953 7 15 77 1
> > > 1 0 0 8153784 17032 82452 0 0 0 0 4437
> > 3889 7 14 79 0
> > > 0 0 0 8153784 17040 82444 0 0 0 0 4398
> > 3903 8 15 77 0
> > > 1 0 0 8153464 17040 82444 0 0 0 0 4398
> > 3902 7 14 79 0
> > > 0 0 0 8153528 17040 82444 0 0 0 0 4447
> > 3922 6 17 77 0
> > > 0 1 0 8153720 17052 82432 0 0 0 144 4490
> > 3960 6 16 77 1
> > > 0 0 0 8153656 17056 82428 0 0 0 0 4449
> > 3954 7 15 78 0
> >
> > FIELD DESCRIPTION FOR VM MODE
> > Procs
> > r: The number of processes waiting for run
> > time.
> > b: The number of processes in uninterruptible
> > sleep.
> >
> > During the 18 seconds of displayed stats, you have
> > 12 seconds in which
> > 1 process is waiting for run time. Which process is
> > that, and where is
> > it waiting for ?
> >
> > Is your custom excecutable compile project causing
> > this or does a simple
> > compile task display the same slowdowns? As a small
> > bench test with a source
> > we all can download, try to compile
> > bison-1.75.tar.bz2 :
> >
> > # ./configure ; time make
> > ...
> > make[1]: Leaving directory
> > `/home/stock/tmp/src/bison-1.75'
> > 37.33user 3.64system 0:40.66elapsed 100%CPU
> > (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (121059major+64165minor)pagefaults
> > 0swaps
> >
> > regards,
> >
> > Robert
> > --
> > Robert M. Stockmann - RHCE
> > Network Engineer - UNIX/Linux Specialist
> > crashrecovery.org stock@stokkie.net
> >
> > -
> > To unsubscribe from this list: send the line
> > "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at
> > http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Domains Claim yours for only $14.70/year
> http://smallbusiness.promotions.yahoo.com/offer
>
--
Robert M. Stockmann - RHCE
Network Engineer - UNIX/Linux Specialist
crashrecovery.org stock@stokkie.net
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-25 10:32 ` Ingo Molnar
@ 2004-05-25 9:29 ` Andrew Morton
2004-05-25 11:42 ` Ingo Molnar
0 siblings, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-25 9:29 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
Ingo Molnar <mingo@elte.hu> wrote:
>
> with the patch below we will print a big fat warning. (I did not want to
> deny idle=poll altogether - future HT implementations might work fine
> with polling idle.)
idle=poll is handy when profiling the kernel with oprofile clock-unhalted
events. Because if you use the normal halt-based idle loop no profile
"ticks" are accounted to idle time at all and the results are really hard
to understand.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-25 11:42 ` Ingo Molnar
@ 2004-05-25 9:58 ` Andrew Morton
2004-05-25 12:25 ` Ingo Molnar
0 siblings, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2004-05-25 9:58 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
Ingo Molnar <mingo@elte.hu> wrote:
>
>
> * Andrew Morton <akpm@osdl.org> wrote:
>
> > Ingo Molnar <mingo@elte.hu> wrote:
> > >
> > > with the patch below we will print a big fat warning. (I did not want to
> > > deny idle=poll altogether - future HT implementations might work fine
> > > with polling idle.)
> >
> > idle=poll is handy when profiling the kernel with oprofile
> > clock-unhalted events. Because if you use the normal halt-based idle
> > loop no profile "ticks" are accounted to idle time at all and the
> > results are really hard to understand.
>
> it makes it a bit more plausible, but kernel profiling based on ticks in
> a HT environment is still quite unreliable, even with idle=poll. The HT
> cores will yield to each other on various occasions - like spinlock
> loops. This disproportionatly increases the hits of various looping
> functions, creating false impressions of lock contention where there's
> only little contention. Plus idle=poll is a constant ~20% performance
> drain on the non-idle HT core, further distorting the profile. HT makes
> profiling really hard, no matter what.
But often one is looking for relativities rather than real absolute
numbers. (In which case the absent idle time doesn't matter, but it freaks
me out...)
> but ... we agree on the warning printk, right?
yup.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-24 7:53 ` Andrew Morton
2004-05-24 8:00 ` Phy Prabab
@ 2004-05-25 10:32 ` Ingo Molnar
2004-05-25 9:29 ` Andrew Morton
1 sibling, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 10:32 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> But if you _are_ using poll_idle() and if your CPU is hyperthreaded
> then yes, one "CPU" is going to take a performance hit from the "idle"
> one.
with the patch below we will print a big fat warning. (I did not want to
deny idle=poll altogether - future HT implementations might work fine
with polling idle.)
Ingo
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--- linux/arch/i386/kernel/process.c.orig
+++ linux/arch/i386/kernel/process.c
@@ -202,6 +202,10 @@ static int __init idle_setup (char *str)
if (!strncmp(str, "poll", 4)) {
printk("using polling idle threads.\n");
pm_idle = poll_idle;
+#ifdef CONFIG_SMP
+ if (smp_num_siblings > 1)
+ printk("WARNING: polling idle and HT enabled, performance may degrade.\n");
+#endif
} else if (!strncmp(str, "halt", 4)) {
printk("using halt in idle threads.\n");
pm_idle = default_idle;
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-25 9:29 ` Andrew Morton
@ 2004-05-25 11:42 ` Ingo Molnar
2004-05-25 9:58 ` Andrew Morton
0 siblings, 1 reply; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 11:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> Ingo Molnar <mingo@elte.hu> wrote:
> >
> > with the patch below we will print a big fat warning. (I did not want to
> > deny idle=poll altogether - future HT implementations might work fine
> > with polling idle.)
>
> idle=poll is handy when profiling the kernel with oprofile
> clock-unhalted events. Because if you use the normal halt-based idle
> loop no profile "ticks" are accounted to idle time at all and the
> results are really hard to understand.
it makes it a bit more plausible, but kernel profiling based on ticks in
a HT environment is still quite unreliable, even with idle=poll. The HT
cores will yield to each other on various occasions - like spinlock
loops. This disproportionatly increases the hits of various looping
functions, creating false impressions of lock contention where there's
only little contention. Plus idle=poll is a constant ~20% performance
drain on the non-idle HT core, further distorting the profile. HT makes
profiling really hard, no matter what.
but ... we agree on the warning printk, right?
Ingo
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Help understanding slow down
2004-05-25 9:58 ` Andrew Morton
@ 2004-05-25 12:25 ` Ingo Molnar
0 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2004-05-25 12:25 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
* Andrew Morton <akpm@osdl.org> wrote:
> > it makes it a bit more plausible, but kernel profiling based on ticks in
> > a HT environment is still quite unreliable, even with idle=poll. The HT
> > cores will yield to each other on various occasions - like spinlock
> > loops. This disproportionatly increases the hits of various looping
> > functions, creating false impressions of lock contention where there's
> > only little contention. Plus idle=poll is a constant ~20% performance
> > drain on the non-idle HT core, further distorting the profile. HT makes
> > profiling really hard, no matter what.
>
> But often one is looking for relativities rather than real absolute
> numbers. (In which case the absent idle time doesn't matter, but it
> freaks me out...)
but it's the relativities that get skewed by HT, fundamentally - both
with and without idle=poll. A function that does not generate alot of
HT-yield activity will fare much less on the profiler histogram than a
function that happens to hit some contention point every now and then.
Also, cachemisses get far more prominent due to HT (they are a yield
point too) - suppressing other, possibly more important functions in the
list. So a kernel profile done under HT gives a rough idea but a
function could easily have half the overhead that the profiler credits
it to have. Profiling under HT magnifies the effect of lock-contention
and cache-misses, and shrinks the effect of algorithmic overhead. Plus
this skewing is not deterministic, it depends on the actual system
activity.
unfortunately i see no good way to do reliable time-based profiling
under HT, given the interaction of logical CPUs. The virtual CPUs break
the single-EIP notion of 'what is this physical CPU doing now'. Perhaps
it would be more reliable to not use time but use some of the other
triggers: e.g. # of uops retired?
Ingo
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2004-05-25 10:36 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-05-24 0:32 Help understanding slow down Phy Prabab
2004-05-24 0:57 ` Phy Prabab
2004-05-24 1:04 ` William Lee Irwin III
2004-05-24 1:25 ` Jakob Oestergaard
2004-05-24 1:28 ` William Lee Irwin III
2004-05-24 1:51 ` Phy Prabab
2004-05-24 2:42 ` Nick Piggin
2004-05-24 2:43 ` Andrew Morton
2004-05-24 6:23 ` Phy Prabab
2004-05-24 6:27 ` William Lee Irwin III
2004-05-24 6:39 ` Phy Prabab
2004-05-24 7:21 ` Phy Prabab
2004-05-24 7:44 ` Nick Piggin
2004-05-24 7:50 ` Phy Prabab
2004-05-24 7:59 ` Nick Piggin
2004-05-24 10:43 ` Jakob Oestergaard
2004-05-24 14:13 ` Martin J. Bligh
2004-05-24 7:53 ` Andrew Morton
2004-05-24 8:00 ` Phy Prabab
2004-05-25 10:32 ` Ingo Molnar
2004-05-25 9:29 ` Andrew Morton
2004-05-25 11:42 ` Ingo Molnar
2004-05-25 9:58 ` Andrew Morton
2004-05-25 12:25 ` Ingo Molnar
2004-05-24 6:33 ` Paul Rolland
2004-05-24 17:28 ` Roger Luethi
-- strict thread matches above, loose matches on Subject: below --
2004-05-24 1:37 Robert M. Stockmann
2004-05-24 1:50 ` Phy Prabab
2004-05-24 3:31 ` Robert M. Stockmann
2004-05-24 6:14 ` Phy Prabab
2004-05-24 18:36 ` Robert M. Stockmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox