* AIM9 regression
@ 2008-09-23 18:14 Christoph Lameter
2008-09-23 20:36 ` Stephen Hemminger
2008-09-24 5:12 ` Herbert Xu
0 siblings, 2 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 18:14 UTC (permalink / raw)
To: David Miller; +Cc: Netdev, Herbert Xu
I just don't seem to be able to get 2.6.27 to behave in a speedy way
network-wise. I configured out various components (netfilter, etc.) but I still
keep getting these aim9 results against 2.6.22:
47 misc_rtns_1    448038.00   430118.00  -17920.00   -4.00% Auxiliary Loops/second
48 dir_rtns_1    2412587.41  2723000.00  310412.59   12.87% Directory Operations/second
49 shell_rtns_1      364.30      345.80     -18.50   -5.08% Shell Scripts/second
50 shell_rtns_2      364.20      355.34      -8.86   -2.43% Shell Scripts/second
51 shell_rtns_3      363.30      353.60      -9.70   -2.67% Shell Scripts/second
52 series_1      6694290.00  6706690.00   12400.00    0.19% Series Evaluations/second
53 shared_memory 1042900.00  1080630.00   37730.00    3.62% Shared Memory Operations/second
54 tcp_test       352035.00   278442.00  -73593.00  -20.91% TCP/IP Messages/second
55 udp_test       640940.00   585570.00  -55370.00   -8.64% UDP/IP DataGrams/second
56 fifo_test      772440.00   932330.00  159890.00   20.70% FIFO Messages/second
57 stream_pipe   1222870.00  1230140.00    7270.00    0.59% Stream Pipe Messages/second
58 dgram_pipe    1143106.89  1152730.00    9623.11    0.84% DataGram Pipe Messages/second
59 pipe_cpy       867850.00  1065430.00  197580.00   22.77% Pipe Messages/second
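For reference, the last two numeric columns of the report above follow from the first two: the difference is the second result minus the first, and the percentage is that difference relative to the first result. A minimal Python sketch, assuming the first column is the baseline run (the helper name `percent_change` is made up for illustration):

```python
def percent_change(baseline, current):
    """Return (absolute delta, delta as a percentage of the baseline)."""
    delta = current - baseline
    return delta, 100.0 * delta / baseline


# A few rows copied from the report above: (first column, second column).
rows = {
    "tcp_test": (352035.00, 278442.00),
    "udp_test": (640940.00, 585570.00),
    "pipe_cpy": (867850.00, 1065430.00),
}

for name, (baseline, current) in rows.items():
    delta, pct = percent_change(baseline, current)
    print(f"{name:10s} {delta:12.2f} {pct:8.2f}%")
```

Measuring against the first column is what makes the tcp_test drop come out at roughly 21% rather than the larger figure one would get by dividing by the second column.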
* Re: AIM9 regression
2008-09-23 18:14 AIM9 regression Christoph Lameter
@ 2008-09-23 20:36 ` Stephen Hemminger
2008-09-23 20:40 ` Christoph Lameter
2008-09-24 5:12 ` Herbert Xu
1 sibling, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-23 20:36 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, Netdev, Herbert Xu
On Tue, 23 Sep 2008 13:14:27 -0500
Christoph Lameter <cl@linux-foundation.org> wrote:
> I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> wise. Configured out various components (netfilter, etc etc) but I still keep
> getting these aim9 result against 2.6.22:
>
> 47 misc_rtns_1    448038.00   430118.00  -17920.00   -4.00% Auxiliary Loops/second
> 48 dir_rtns_1    2412587.41  2723000.00  310412.59   12.87% Directory Operations/second
> 49 shell_rtns_1      364.30      345.80     -18.50   -5.08% Shell Scripts/second
> 50 shell_rtns_2      364.20      355.34      -8.86   -2.43% Shell Scripts/second
> 51 shell_rtns_3      363.30      353.60      -9.70   -2.67% Shell Scripts/second
> 52 series_1      6694290.00  6706690.00   12400.00    0.19% Series Evaluations/second
> 53 shared_memory 1042900.00  1080630.00   37730.00    3.62% Shared Memory Operations/second
> 54 tcp_test       352035.00   278442.00  -73593.00  -20.91% TCP/IP Messages/second
> 55 udp_test       640940.00   585570.00  -55370.00   -8.64% UDP/IP DataGrams/second
> 56 fifo_test      772440.00   932330.00  159890.00   20.70% FIFO Messages/second
> 57 stream_pipe   1222870.00  1230140.00    7270.00    0.59% Stream Pipe Messages/second
> 58 dgram_pipe    1143106.89  1152730.00    9623.11    0.84% DataGram Pipe Messages/second
> 59 pipe_cpy       867850.00  1065430.00  197580.00   22.77% Pipe Messages/second
Hardware configuration please?
* Re: AIM9 regression
2008-09-23 20:36 ` Stephen Hemminger
@ 2008-09-23 20:40 ` Christoph Lameter
2008-09-23 20:43 ` Christoph Lameter
2008-09-24 1:20 ` Jeff Garzik
0 siblings, 2 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 20:40 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu
Stephen Hemminger wrote:
> Hardware configuration please?
Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667MHz memory.
* Re: AIM9 regression
2008-09-23 20:40 ` Christoph Lameter
@ 2008-09-23 20:43 ` Christoph Lameter
2008-09-24 1:20 ` Jeff Garzik
1 sibling, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 20:43 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu
Christoph Lameter wrote:
> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory.
So 8G RAM, 8 processors total.
* Re: AIM9 regression
2008-09-23 20:40 ` Christoph Lameter
2008-09-23 20:43 ` Christoph Lameter
@ 2008-09-24 1:20 ` Jeff Garzik
2008-09-24 3:11 ` David Miller
1 sibling, 1 reply; 20+ messages in thread
From: Jeff Garzik @ 2008-09-24 1:20 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Stephen Hemminger, David Miller, Netdev, Herbert Xu
Christoph Lameter wrote:
> Stephen Hemminger wrote:
>
>> Hardware configuration please?
>
> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory.
Network hardware configuration?
Or is the TCP test over loopback?
* Re: AIM9 regression
2008-09-24 1:20 ` Jeff Garzik
@ 2008-09-24 3:11 ` David Miller
2008-09-24 14:20 ` Christoph Lameter
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 3:11 UTC (permalink / raw)
To: jeff; +Cc: cl, shemminger, netdev, herbert
From: Jeff Garzik <jeff@garzik.org>
Date: Tue, 23 Sep 2008 21:20:40 -0400
> Christoph Lameter wrote:
> > Stephen Hemminger wrote:
> >
> >> Hardware configuration please?
> > Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory.
>
> Network hardware configuration?
>
> Or is the TCP test over loopback?
I'm pretty sure it's over loopback :-)
* Re: AIM9 regression
2008-09-23 18:14 AIM9 regression Christoph Lameter
2008-09-23 20:36 ` Stephen Hemminger
@ 2008-09-24 5:12 ` Herbert Xu
2008-09-24 5:18 ` David Miller
1 sibling, 1 reply; 20+ messages in thread
From: Herbert Xu @ 2008-09-24 5:12 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, Netdev
On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> wise. Configured out various components (netfilter, etc etc) but I still keep
> getting these aim9 result against 2.6.22:
Could you please compare this against something less ancient,
like 2.6.26 perhaps?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: AIM9 regression
2008-09-24 5:12 ` Herbert Xu
@ 2008-09-24 5:18 ` David Miller
2008-09-24 15:16 ` Stephen Hemminger
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 5:18 UTC (permalink / raw)
To: herbert; +Cc: cl, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 24 Sep 2008 13:12:37 +0800
> On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > wise. Configured out various components (netfilter, etc etc) but I still keep
> > getting these aim9 result against 2.6.22:
>
> Could you please compare this against something less ancient,
> like 2.6.26 perhaps?
Herbert, this is part of the tbench regression issues. Christoph
took tbench from 2.6.22 until 2.6.27 and at basically every release
tbench performance suffered noticeably.
Now, he's taking the AIM9 benchmark networking numbers and showing
that the same exact effect is seen there too.
It really behooves us to start doing something proactive about this
blindingly obvious set of networking performance regressions through
the past 6 or so releases instead of barking at the reporters saying
things like "try this, try that, what's your config" etc.
:-)
* Re: AIM9 regression
2008-09-24 3:11 ` David Miller
@ 2008-09-24 14:20 ` Christoph Lameter
0 siblings, 0 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-24 14:20 UTC (permalink / raw)
To: David Miller; +Cc: jeff, shemminger, netdev, herbert
David Miller wrote:
> From: Jeff Garzik <jeff@garzik.org>
> Date: Tue, 23 Sep 2008 21:20:40 -0400
>
>> Christoph Lameter wrote:
>>> Stephen Hemminger wrote:
>>>
>>>> Hardware configuration please?
>>> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory.
>> Network hardware configuration?
>>
>> Or is the TCP test over loopback?
>
> I'm pretty sure it's over loopback :-)
Correct.
* Re: AIM9 regression
2008-09-24 5:18 ` David Miller
@ 2008-09-24 15:16 ` Stephen Hemminger
2008-09-24 19:10 ` Christoph Lameter
2008-09-24 19:36 ` David Miller
0 siblings, 2 replies; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-24 15:16 UTC (permalink / raw)
To: David Miller; +Cc: herbert, cl, netdev
On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Wed, 24 Sep 2008 13:12:37 +0800
>
> > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > getting these aim9 result against 2.6.22:
> >
> > Could you please compare this against something less ancient,
> > like 2.6.26 perhaps?
>
> Herbert, this is part of the tbench regression issues. Christoph
> took tbench from 2.6.22 until 2.6.27 and at basically every release
> tbench performance suffered noticably.
>
> Now, he's taking the AIM9 benchmark networking numbers and showing
> that the same exact effect is seen there too.
>
> It really behooves us to start doing something proactive about this
> blindingly obvious set of networking performance regressions through
> the past 6 or so releases instead of barking at the reporters saying
> things like "try this, try that, what's your config" etc.
>
> :-)
These loopback benchmarks are often more sensitive to scheduler than networking
changes.
* Re: AIM9 regression
2008-09-24 15:16 ` Stephen Hemminger
@ 2008-09-24 19:10 ` Christoph Lameter
2008-09-24 19:53 ` David Miller
2008-09-24 19:36 ` David Miller
1 sibling, 1 reply; 20+ messages in thread
From: Christoph Lameter @ 2008-09-24 19:10 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, herbert, netdev
Stephen Hemminger wrote:
> These loopback benchmarks are often more sensitive to scheduler than networking
> changes.
Just ran a test with real NICs which show the same issues. I guess I need to
get familiar with the network stack and start hacking on it. Sigh.
* Re: AIM9 regression
2008-09-24 15:16 ` Stephen Hemminger
2008-09-24 19:10 ` Christoph Lameter
@ 2008-09-24 19:36 ` David Miller
2008-09-29 14:24 ` Ilpo Järvinen
1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 19:36 UTC (permalink / raw)
To: shemminger; +Cc: herbert, cl, netdev
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 24 Sep 2008 08:16:03 -0700
> On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Wed, 24 Sep 2008 13:12:37 +0800
> >
> > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > > getting these aim9 result against 2.6.22:
> > >
> > > Could you please compare this against something less ancient,
> > > like 2.6.26 perhaps?
> >
> > Herbert, this is part of the tbench regression issues. Christoph
> > took tbench from 2.6.22 until 2.6.27 and at basically every release
> > tbench performance suffered noticably.
> >
> > Now, he's taking the AIM9 benchmark networking numbers and showing
> > that the same exact effect is seen there too.
> >
> > It really behooves us to start doing something proactive about this
> > blindingly obvious set of networking performance regressions through
> > the past 6 or so releases instead of barking at the reporters saying
> > things like "try this, try that, what's your config" etc.
> >
> > :-)
>
> These loopback benchmarks are often more sensitive to scheduler than networking
> changes.
When it gets to 20%, I strongly start to doubt that, and this is exactly
what's happening here.
What is it going to take to actually get someone to start profiling and
analyzing this? :-)
* Re: AIM9 regression
2008-09-24 19:10 ` Christoph Lameter
@ 2008-09-24 19:53 ` David Miller
2008-09-24 21:34 ` Stephen Hemminger
0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 19:53 UTC (permalink / raw)
To: cl; +Cc: shemminger, herbert, netdev
From: Christoph Lameter <cl@linux-foundation.org>
Date: Wed, 24 Sep 2008 14:10:54 -0500
> Stephen Hemminger wrote:
>
> > These loopback benchmarks are often more sensitive to scheduler than networking
> > changes.
>
> Just ran a test with real NICs which show the same issues. I guess I need to
> get familiar with the network stack and start hacking on it. Sigh.
I feel your pain, I think people are being very unreasonable in their
analysis of your numbers, and for this I want to personally apologize.
It's clearly a networking issue in my eyes, and I wish my co-developers
in networking would treat it as such instead of pushing the blame under
the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
without any facts on this specific case to back up such accusations.
* Re: AIM9 regression
2008-09-24 19:53 ` David Miller
@ 2008-09-24 21:34 ` Stephen Hemminger
2008-09-24 22:26 ` David Miller
0 siblings, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-24 21:34 UTC (permalink / raw)
To: David Miller; +Cc: cl, herbert, netdev
On Wed, 24 Sep 2008 12:53:46 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
> From: Christoph Lameter <cl@linux-foundation.org>
> Date: Wed, 24 Sep 2008 14:10:54 -0500
>
> > Stephen Hemminger wrote:
> >
> > > These loopback benchmarks are often more sensitive to scheduler than networking
> > > changes.
> >
> > Just ran a test with real NICs which show the same issues. I guess I need to
> > get familiar with the network stack and start hacking on it. Sigh.
>
> I feel your pain, I think people are being very unreasonable in their
> analysis of your numbers, and for this I want to personally apologize.
>
> It's clearly a networking issue in my eyes, and I wish my co-developers
> in networking would treat it as such instead of pushing the blame under
> the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
> without any facts on this specific case to back up such accusations.
Is this a one time change, or has networking been getting slower over time?
* Re: AIM9 regression
2008-09-24 21:34 ` Stephen Hemminger
@ 2008-09-24 22:26 ` David Miller
0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2008-09-24 22:26 UTC (permalink / raw)
To: shemminger; +Cc: cl, herbert, netdev
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 24 Sep 2008 14:34:19 -0700
> On Wed, 24 Sep 2008 12:53:46 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
> > From: Christoph Lameter <cl@linux-foundation.org>
> > Date: Wed, 24 Sep 2008 14:10:54 -0500
> >
> > > Stephen Hemminger wrote:
> > >
> > > > These loopback benchmarks are often more sensitive to scheduler than networking
> > > > changes.
> > >
> > > Just ran a test with real NICs which show the same issues. I guess I need to
> > > get familiar with the network stack and start hacking on it. Sigh.
> >
> > I feel your pain, I think people are being very unreasonable in their
> > analysis of your numbers, and for this I want to personally apologize.
> >
> > It's clearly a networking issue in my eyes, and I wish my co-developers
> > in networking would treat it as such instead of pushing the blame under
> > the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit
> > without any facts on this specific case to back up such accusations.
>
> Is this a one time change, or has networking been getting slower over time?
As per the tbench thread, it's been getting slower and slower, every
single release, since as far back as people have tested, which seems
to be 2.6.22 or thereabouts.
* Re: AIM9 regression
2008-09-24 19:36 ` David Miller
@ 2008-09-29 14:24 ` Ilpo Järvinen
2008-09-29 14:54 ` Christoph Lameter
0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2008-09-29 14:24 UTC (permalink / raw)
To: David Miller; +Cc: shemminger, herbert, cl, netdev
On Wed, 24 Sep 2008, David Miller wrote:
> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Wed, 24 Sep 2008 08:16:03 -0700
>
> > On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
> > David Miller <davem@davemloft.net> wrote:
> >
> > > From: Herbert Xu <herbert@gondor.apana.org.au>
> > > Date: Wed, 24 Sep 2008 13:12:37 +0800
> > >
> > > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > > > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > > > getting these aim9 result against 2.6.22:
> > > >
> > > > Could you please compare this against something less ancient,
> > > > like 2.6.26 perhaps?
> > >
> > > Herbert, this is part of the tbench regression issues. Christoph
> > > took tbench from 2.6.22 until 2.6.27 and at basically every release
> > > tbench performance suffered noticably.
> > >
> > > Now, he's taking the AIM9 benchmark networking numbers and showing
> > > that the same exact effect is seen there too.
> > >
> > > It really behooves us to start doing something proactive about this
> > > blindingly obvious set of networking performance regressions through
> > > the past 6 or so releases instead of barking at the reporters saying
> > > things like "try this, try that, what's your config" etc.
> > >
> > > :-)
> >
> > These loopback benchmarks are often more sensitive to scheduler than
> > networking changes.
>
> When it gets to %20, I strong start to doubt that, and this is exactly
> what's happening here.
>
> What is it going to take to actually get someone to start profiling and
> analyzing this? :-)
...Earlier I was thinking of answering "time?", but now that I've been there,
"more time" seems more appropriate... So far I haven't been able to find a way
to produce a reproducible series of result numbers with aim9 tcp_test... the
results seem to vary within that (at least) 20% margin. Can Christoph actually
get stable numbers out of it with the 27-rcs? (I haven't extensively tested .22
with long test durations yet, but the same problem seems to occur with it as
well when short tests are used.)
...And one thing I've learned: I couldn't even finish a test run with conntrack
and default settings, as ipv4 conntrack ran out of entries :-).
Oh, I almost forgot: I did get a stable regression with lockdep, though.
I hope we've gained some more detection power in return for the
lost performance.
These are the functions with the largest variation (in absolute numbers)
between three consecutive 1000-second runs of aim9 tcp_test, with aim9 and its
data on tmpfs (nodebug-nonf config). The columns are 3 x oprofile (abs, %),
func, max-min, (max-min)/min:
266288 1.0221  420190 1.6457  614494 2.4039  vfs_read                 348206  1.30763
233649 0.8968  317763 1.2446  508838 1.9906  vfs_write                275189  1.17779
228732 0.8779  496359 1.9440  324747 1.2704  dnotify_parent           267627  1.17005
671548 2.5776  592604 2.3210  445792 1.7440  inet_csk_get_port        225756  0.506416
392960 1.5083  362665 1.4204  491234 1.9217  netif_rx                 128569  0.354512
121337 0.4657  208314 0.8159  249783 0.9772  do_sync_write            128446  1.05859
164951 0.6331  168276 0.6591  285451 1.1167  loopback_xmit            120500  0.73052
359659 1.3805  242133 0.9483  256785 1.0046  __tcp_select_window      117526  0.485378
876319 3.3636  762690 2.9872  772554 3.0223  tcp_sendmsg              113629  0.148985
266895 1.0244  199204 0.7802  176985 0.6924  tcp_established_options   89910  0.508009
689652 2.6471  647962 2.5378  608943 2.3822  dev_queue_xmit            80709  0.132539
206754 0.7936  265523 1.0400  284087 1.1114  __kmalloc_track_caller    77333  0.374034
544026 2.0882  496654 1.9452  571982 2.2376  tcp_recvmsg               75328  0.151671
600414 2.3046  525704 2.0590  567588 2.2204  ip_queue_xmit             74710  0.142114
131820 0.5060   59259 0.2321  121586 0.4757  getnstimeofday            72561  1.22447
 67061 0.2574  132155 0.5176  137914 0.5395  rw_verify_area            70853  1.05655
129676 0.4977   60652 0.2376   98307 0.3846  sock_rfree                69024  1.13803
535701 2.0562  586248 2.2961  517563 2.0247  ip_finish_output          68685  0.132708
692187 2.6568  634962 2.4869  623888 2.4407  tcp_rcv_established       68299  0.109473
949233 3.6435  900741 3.5279  882256 3.4514  tcp_transmit_skb          66977  0.0759156
...as I said, the variation in the aim9 results was ~20% at most.
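The last two columns of the profile table can be recomputed from the three per-run sample counts. A small Python sketch of that arithmetic, using three rows copied from the table (the helper name `spread` is only illustrative):

```python
def spread(samples):
    """Return (max - min, (max - min) / min) for a list of sample counts."""
    lo, hi = min(samples), max(samples)
    return hi - lo, (hi - lo) / lo


# Per-run absolute sample counts for a few functions from the table.
profile = {
    "vfs_read": [266288, 420190, 614494],
    "inet_csk_get_port": [671548, 592604, 445792],
    "tcp_transmit_skb": [949233, 900741, 882256],
}

# Sort by absolute spread, largest first, as the table does.
ordered = sorted(profile.items(), key=lambda kv: spread(kv[1])[0], reverse=True)
for func, counts in ordered:
    abs_spread, rel_spread = spread(counts)
    print(f"{func:20s} {abs_spread:8d} {rel_spread:.6g}")
```

Sorting by the absolute spread puts heavily sampled functions first; sorting by the relative spread instead would highlight functions whose cost changes the most in proportion to their own baseline.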
--
i.
* Re: AIM9 regression
2008-09-29 14:24 ` Ilpo Järvinen
@ 2008-09-29 14:54 ` Christoph Lameter
2008-09-29 15:12 ` Ilpo Järvinen
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-29 14:54 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: David Miller, shemminger, herbert, netdev
Ilpo Järvinen wrote:
> ...I was thinking earlier to answer "time?", but now once been there, it
> seems that more time is more appropriate... So far I haven't been able to
> find a way to create a reproducable serie of result numbers with aim9
> tcp_test... it seems that the results vary within that (at least) 20%
> margin. Can Christoph actually get stable numbers out of it with 27-rcs
> (I haven't extensively tested .22 yet with long test durations but it
> seems that same problem occurs with it as well if short tests were used)?
Results fluctuate between 10 and 25%. The problem occurs with the short
durations as well. If this is due to the additional code complexity in later
kernels, as we suspect, then it may be an issue with cpu cache effectiveness.
Going to 64 bit binaries also yields a significant hit (as high as 30%), which
also indicates caching issues.
Both 64 bit kernels and later kernels cause the variability of results to
increase. Going to 64 bit has double the effect of going to a 2.6.27 kernel.
All of this indicates cpu caching issues. The L1 cache may become ineffective
due to the increased cache footprint.
* Re: AIM9 regression
2008-09-29 14:54 ` Christoph Lameter
@ 2008-09-29 15:12 ` Ilpo Järvinen
2008-09-29 15:36 ` Stephen Hemminger
2008-10-31 14:57 ` Ilpo Järvinen
2 siblings, 0 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2008-09-29 15:12 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev
On Mon, 29 Sep 2008, Christoph Lameter wrote:
> Ilpo Järvinen wrote:
>
> > So far I haven't been able to
> > find a way to create a reproducable serie of result numbers with aim9
> > tcp_test... it seems that the results vary within that (at least) 20%
> > margin. Can Christoph actually get stable numbers out of it with 27-rcs
> > (I haven't extensively tested .22 yet with long test durations but it
> > seems that same problem occurs with it as well if short tests were used)?
>
> Results fluctuate between 10 - 25%. The problem occurs with the short
> durations as well. If this is due to the additional code complexity in later
> kernels as we suspect then it may be an issue with cpu cache effectiveness.
Hmm... I'll try to extract some very raw (and possibly somewhat skewed)
numbers out of that based on the profiles I have and some of acme's tools.
> Going to 64 bit binaries also yields a significant hit (as high as 30%)
> which also indicates caching issues.
>
> Both 64 bit kernels and later kernels cause the variability of results to
> increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
> of cpu caching issues. The L1 cache may become ineffective due to the
> increased cache footprint.
Ok. I was testing 64bit only... Next I'll probably try a large number of very
short tests to see whether the results converge once enough samples are
taken; let's hope I don't need many days to get to that point... :-)
--
i.
* Re: AIM9 regression
2008-09-29 14:54 ` Christoph Lameter
2008-09-29 15:12 ` Ilpo Järvinen
@ 2008-09-29 15:36 ` Stephen Hemminger
2008-10-31 14:57 ` Ilpo Järvinen
2 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-29 15:36 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, herbert, netdev, Ilpo Järvinen
----- Original Message -----
From: "Christoph Lameter" <cl@linux-foundation.org>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Cc: "David Miller" <davem@davemloft.net>, shemminger@vyatta.com, herbert@gondor.apana.org.au, netdev@vger.kernel.org
Sent: Monday, September 29, 2008 4:54:11 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: AIM9 regression
Ilpo Järvinen wrote:
> ...I was thinking earlier to answer "time?", but now once been there, it
> seems that more time is more appropriate... So far I haven't been able to
> find a way to create a reproducable serie of result numbers with aim9
> tcp_test... it seems that the results vary within that (at least) 20%
> margin. Can Christoph actually get stable numbers out of it with 27-rcs
> (I haven't extensively tested .22 yet with long test durations but it
> seems that same problem occurs with it as well if short tests were used)?
Results fluctuate between 10 - 25%. The problem occurs with the short
durations as well. If this is due to the additional code complexity in later
kernels as we suspect then it may be an issue with cpu cache effectiveness.
Going to 64 bit binaries also yields a significant hit (as high as 30%) which
also indicates caching issues.
Both 64 bit kernels and later kernels cause the variability of results to
increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
of cpu caching issues. The L1 cache may become ineffective due to the
increased cache footprint.
-------------
One of the items showing up in the profile is the local side port allocation.
Is the ephemeral port range getting full? If it is then the random port scan
could take a long time to find the next free slot, especially now that source
ports are randomized.
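To get a feel for why a nearly full ephemeral port range could matter, here is a toy Python model. It is not the kernel's port allocator (inet_csk_get_port); it only assumes a search that starts at a random port and scans upward until it finds a free one, and it assumes a range of 32768-61000. All function names are made up for illustration.

```python
import random


def probes_to_find_free_port(in_use, low, high, rng):
    """Toy model: start at a random port in [low, high] and scan upward
    (wrapping around) until a free port is found. Returns the number of
    ports examined, or None if every port in the range is occupied."""
    size = high - low + 1
    start = rng.randrange(size)
    for i in range(size):
        port = low + (start + i) % size
        if port not in in_use:
            return i + 1
    return None


def average_probes(fill_fraction, low=32768, high=61000, trials=2000, seed=1):
    """Average number of ports examined when a given fraction of the
    range (assumed to be below 1.0) is occupied at random positions."""
    rng = random.Random(seed)
    ports = list(range(low, high + 1))
    in_use = set(rng.sample(ports, int(fill_fraction * len(ports))))
    results = [probes_to_find_free_port(in_use, low, high, rng)
               for _ in range(trials)]
    return sum(results) / trials


for fill in (0.0, 0.5, 0.9, 0.99):
    print(f"{fill:4.0%} full: {average_probes(fill):7.1f} probes on average")
```

In this model the search cost stays small until the range is mostly occupied and then grows quickly, which is the kind of behaviour the question above is probing for. Whether the real allocator behaves this way on the test machine would have to be measured.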
* Re: AIM9 regression
2008-09-29 14:54 ` Christoph Lameter
2008-09-29 15:12 ` Ilpo Järvinen
2008-09-29 15:36 ` Stephen Hemminger
@ 2008-10-31 14:57 ` Ilpo Järvinen
2 siblings, 0 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2008-10-31 14:57 UTC (permalink / raw)
To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev
On Mon, 29 Sep 2008, Christoph Lameter wrote:
> Ilpo Järvinen wrote:
>
> > ...I was thinking earlier to answer "time?", but now once been there, it
> > seems that more time is more appropriate... So far I haven't been able to
> > find a way to create a reproducable serie of result numbers with aim9
> > tcp_test... it seems that the results vary within that (at least) 20%
> > margin. Can Christoph actually get stable numbers out of it with 27-rcs
> > (I haven't extensively tested .22 yet with long test durations but it
> > seems that same problem occurs with it as well if short tests were used)?
>
> Results fluctuate between 10 - 25%. The problem occurs with the short
> durations as well. If this is due to the additional code complexity in later
> kernels as we suspect then it may be an issue with cpu cache effectiveness.
>
> Going to 64 bit binaries also yields a significant hit (as high as 30%) which
> also indicates caching issues.
>
> Both 64 bit kernels and later kernels cause the variability of results to
> increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
> of cpu caching issues. The L1 cache may become ineffective due to the
> increased cache footprint.
I experimented with it a bit and changed tcp_test to bind to a supplied
port instead of relying on the port allocator's randomness; both the server
and the client port were done like that. However, I had to turn
tcp_tw_recycle on to get the test to actually return a result instead of
-ESOMETHING. In addition, I did sync & drop_caches before each run (I'm not
sure whether it actually reduced the variation a bit or whether I just
imagined it; if it did anything, I'd expect it to damp artifacts caused by
the test harness), plus a sleep 20 before each 20-second test.
Port allocator could be benchmarked separately if so desired.
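The idea of binding both ends to supplied ports, so that the port allocator is taken out of the measurement, can be sketched in Python as follows. This is not the modified aim9 tcp_test, which is not shown in this thread; the function name and its parameters are hypothetical, and the sketch only demonstrates a client that binds to a fixed source port before connecting over loopback.

```python
import socket


def loopback_roundtrip(server_port, client_port, payload=b"ping"):
    """Send payload over loopback TCP with both endpoints bound to
    caller-supplied ports instead of kernel-chosen ephemeral ones.
    Returns the client port as seen by the server, and the data received."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", server_port))
    server.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Binding before connect() fixes the source port, so no ephemeral
    # port has to be allocated for this connection.
    client.bind(("127.0.0.1", client_port))
    client.connect(("127.0.0.1", server_port))

    conn, peer = server.accept()
    client.sendall(payload)
    data = conn.recv(len(payload))
    for sock in (conn, client, server):
        sock.close()
    return peer[1], data
```

A call such as `loopback_roundtrip(50007, 50008)` (the port numbers are arbitrary) should report 50008 as the peer port. Reusing the same port pair in quick succession can fail while the previous connection is still in TIME_WAIT, which is possibly related to the tcp_tw_recycle remark above, though that is a guess.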
Here are my current numbers with 64-bit (nodebug & nonf):
           .28-rc2-gsmthg
           GSO/TSO
.22        off       on
240700     232398    224194
241187     236722    227610
243940     237388    229472
244367     237469    229576
246134     238569    229680
246211     238680    229999
246400     238693    230262
248761     239076    230404
250934     239107    231404
251203     239152    231562
251572     239215    231912
254158     239863    232744
256407     239912    234017
257329     240022    -EINTR
259560     241352    -EINTR
http://www.cs.helsinki.fi/u/ijjarvin/aim9/res.png
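One way to boil the sample lists above down to a single comparison is to take the median of each column. The sketch below does that in Python; the median is a choice made here for illustration, not something the message uses, and the column reading (.22, then .28-rc2 with GSO/TSO off, then .28-rc2 with GSO/TSO on) is an assumption about the table header.

```python
from statistics import median

# Samples copied from the table above; the two -EINTR runs are left out.
k22 = [240700, 241187, 243940, 244367, 246134, 246211, 246400, 248761,
       250934, 251203, 251572, 254158, 256407, 257329, 259560]
k28_off = [232398, 236722, 237388, 237469, 238569, 238680, 238693, 239076,
           239107, 239152, 239215, 239863, 239912, 240022, 241352]
k28_on = [224194, 227610, 229472, 229576, 229680, 229999, 230262, 230404,
          231404, 231562, 231912, 232744, 234017]


def relative_change(baseline, samples):
    """Median of samples relative to the median of baseline, in percent."""
    base = median(baseline)
    return 100.0 * (median(samples) - base) / base


print(f".22 median:          {median(k22)}")
print(f".28-rc2 GSO/TSO off: {median(k28_off)} "
      f"({relative_change(k22, k28_off):+.2f}%)")
print(f".28-rc2 GSO/TSO on:  {median(k28_on)} "
      f"({relative_change(k22, k28_on):+.2f}%)")
```

With these samples both .28-rc2 columns land within single-digit percentages of the .22 median.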
TSO/GSO does modulo operations every so often, but Dave is currently
evaluating how to get rid of that, discussed here:
http://marc.info/?t=122411618000004&r=1&w=2
...There is still some uncertainty about where the remainder of Evgeniy's
GSO/TSO off/on difference comes from.
2.6.27-rc7 has basically the same numbers as 2.6.28-rc2, though
I accidentally had ftrace on there, so some extra nops were present.
There is still some regression to attack, but it seems to be considerably
less than 20% once testing of net_random()'s output is removed.
--
i.