* AIM9 regression
@ 2008-09-23 18:14 Christoph Lameter
  2008-09-23 20:36 ` Stephen Hemminger
  2008-09-24 5:12 ` Herbert Xu
  0 siblings, 2 replies; 20+ messages in thread
From: Christoph Lameter @ 2008-09-23 18:14 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev, Herbert Xu

I just don't seem to be able to get 2.6.27 to behave in a speedy way network
wise. Configured out various components (netfilter, etc etc) but I still keep
getting these aim9 results against 2.6.22:

47 misc_rtns_1      448038.00   430118.00  -17920.00   -4.00% Auxiliary Loops/second
48 dir_rtns_1      2412587.41  2723000.00  310412.59   12.87% Directory Operations/second
49 shell_rtns_1        364.30      345.80     -18.50   -5.08% Shell Scripts/second
50 shell_rtns_2        364.20      355.34      -8.86   -2.43% Shell Scripts/second
51 shell_rtns_3        363.30      353.60      -9.70   -2.67% Shell Scripts/second
52 series_1        6694290.00  6706690.00   12400.00    0.19% Series Evaluations/second
53 shared_memory   1042900.00  1080630.00   37730.00    3.62% Shared Memory Operations/second
54 tcp_test         352035.00   278442.00  -73593.00  -20.91% TCP/IP Messages/second
55 udp_test         640940.00   585570.00  -55370.00   -8.64% UDP/IP DataGrams/second
56 fifo_test        772440.00   932330.00  159890.00   20.70% FIFO Messages/second
57 stream_pipe     1222870.00  1230140.00    7270.00    0.59% Stream Pipe Messages/second
58 dgram_pipe      1143106.89  1152730.00    9623.11    0.84% DataGram Pipe Messages/second
59 pipe_cpy         867850.00  1065430.00  197580.00   22.77% Pipe Messages/second

^ permalink raw reply	[flat|nested] 20+ messages in thread
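The last two numeric columns of the aim9 table above can be rederived from the first two. The post does not label its columns, so the sketch below assumes the first value is the 2.6.22 baseline and the second is the 2.6.27 result; the names `ROWS`, `change` and `report` are illustrative only.

```python
# Rows copied from the aim9 table above: (test, first column, second column).
# Assumption: the first column is the 2.6.22 baseline and the second is 2.6.27;
# the original post does not label them.
ROWS = [
    ("tcp_test", 352035.00, 278442.00),
    ("udp_test", 640940.00, 585570.00),
    ("fifo_test", 772440.00, 932330.00),
    ("pipe_cpy", 867850.00, 1065430.00),
]


def change(baseline, current):
    """Return (absolute delta, percent change relative to the baseline)."""
    delta = current - baseline
    return delta, 100.0 * delta / baseline


def report(rows):
    """Map each test name to its (delta, percent change) pair."""
    return {name: change(base, cur) for name, base, cur in rows}


if __name__ == "__main__":
    for name, (delta, pct) in report(ROWS).items():
        print(f"{name:10s} {delta:12.2f} {pct:8.2f}%")
```

Under that reading, a negative percentage means the newer kernel is slower on that test, which matches the way the tcp_test and udp_test rows are discussed in the rest of the thread.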
* Re: AIM9 regression
  2008-09-23 18:14 AIM9 regression Christoph Lameter
@ 2008-09-23 20:36 ` Stephen Hemminger
  2008-09-23 20:40   ` Christoph Lameter
  2008-09-24 5:12 ` Herbert Xu
  1 sibling, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-23 20:36 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, Netdev, Herbert Xu

On Tue, 23 Sep 2008 13:14:27 -0500
Christoph Lameter <cl@linux-foundation.org> wrote:

> I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> wise. Configured out various components (netfilter, etc etc) but I still keep
> getting these aim9 result against 2.6.22:
>
> 47 misc_rtns_1      448038.00   430118.00  -17920.00   -4.00% Auxiliary Loops/second
> 48 dir_rtns_1      2412587.41  2723000.00  310412.59   12.87% Directory Operations/second
> 49 shell_rtns_1        364.30      345.80     -18.50   -5.08% Shell Scripts/second
> 50 shell_rtns_2        364.20      355.34      -8.86   -2.43% Shell Scripts/second
> 51 shell_rtns_3        363.30      353.60      -9.70   -2.67% Shell Scripts/second
> 52 series_1        6694290.00  6706690.00   12400.00    0.19% Series Evaluations/second
> 53 shared_memory   1042900.00  1080630.00   37730.00    3.62% Shared Memory Operations/second
> 54 tcp_test         352035.00   278442.00  -73593.00  -20.91% TCP/IP Messages/second
> 55 udp_test         640940.00   585570.00  -55370.00   -8.64% UDP/IP DataGrams/second
> 56 fifo_test        772440.00   932330.00  159890.00   20.70% FIFO Messages/second
> 57 stream_pipe     1222870.00  1230140.00    7270.00    0.59% Stream Pipe Messages/second
> 58 dgram_pipe      1143106.89  1152730.00    9623.11    0.84% DataGram Pipe Messages/second
> 59 pipe_cpy         867850.00  1065430.00  197580.00   22.77% Pipe Messages/second

Hardware configuration please?

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-23 20:36 ` Stephen Hemminger @ 2008-09-23 20:40 ` Christoph Lameter 2008-09-23 20:43 ` Christoph Lameter 2008-09-24 1:20 ` Jeff Garzik 0 siblings, 2 replies; 20+ messages in thread From: Christoph Lameter @ 2008-09-23 20:40 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu Stephen Hemminger wrote: > Hardware configuration please? Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-23 20:40 ` Christoph Lameter @ 2008-09-23 20:43 ` Christoph Lameter 2008-09-24 1:20 ` Jeff Garzik 1 sibling, 0 replies; 20+ messages in thread From: Christoph Lameter @ 2008-09-23 20:43 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David Miller, Netdev, Herbert Xu Christoph Lameter wrote: > Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory. So 8G Ram, 8 processors total. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-23 20:40 ` Christoph Lameter 2008-09-23 20:43 ` Christoph Lameter @ 2008-09-24 1:20 ` Jeff Garzik 2008-09-24 3:11 ` David Miller 1 sibling, 1 reply; 20+ messages in thread From: Jeff Garzik @ 2008-09-24 1:20 UTC (permalink / raw) To: Christoph Lameter; +Cc: Stephen Hemminger, David Miller, Netdev, Herbert Xu Christoph Lameter wrote: > Stephen Hemminger wrote: > >> Hardware configuration please? > > Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory. Network hardware configuration? Or is the TCP test over loopback? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 1:20 ` Jeff Garzik @ 2008-09-24 3:11 ` David Miller 2008-09-24 14:20 ` Christoph Lameter 0 siblings, 1 reply; 20+ messages in thread From: David Miller @ 2008-09-24 3:11 UTC (permalink / raw) To: jeff; +Cc: cl, shemminger, netdev, herbert From: Jeff Garzik <jeff@garzik.org> Date: Tue, 23 Sep 2008 21:20:40 -0400 > Christoph Lameter wrote: > > Stephen Hemminger wrote: > > > >> Hardware configuration please? > > Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory. > > Network hardware configuration? > > Or is the TCP test over loopback? I'm pretty sure it's over loopback :-) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 3:11 ` David Miller @ 2008-09-24 14:20 ` Christoph Lameter 0 siblings, 0 replies; 20+ messages in thread From: Christoph Lameter @ 2008-09-24 14:20 UTC (permalink / raw) To: David Miller; +Cc: jeff, shemminger, netdev, herbert David Miller wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Tue, 23 Sep 2008 21:20:40 -0400 > >> Christoph Lameter wrote: >>> Stephen Hemminger wrote: >>> >>>> Hardware configuration please? >>> Dual Processor 4 core 8G, Xeon X5460 @ 3.16GHz, 667Mhz memory. >> Network hardware configuration? >> >> Or is the TCP test over loopback? > > I'm pretty sure it's over loopback :-) Correct. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-23 18:14 AIM9 regression Christoph Lameter 2008-09-23 20:36 ` Stephen Hemminger @ 2008-09-24 5:12 ` Herbert Xu 2008-09-24 5:18 ` David Miller 1 sibling, 1 reply; 20+ messages in thread From: Herbert Xu @ 2008-09-24 5:12 UTC (permalink / raw) To: Christoph Lameter; +Cc: David Miller, Netdev On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote: > I just dont seem to be able to get 2.6.27 to behave in a speedy way network > wise. Configured out various components (netfilter, etc etc) but I still keep > getting these aim9 result against 2.6.22: Could you please compare this against something less ancient, like 2.6.26 perhaps? Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression
  2008-09-24 5:12 ` Herbert Xu
@ 2008-09-24 5:18   ` David Miller
  2008-09-24 15:16     ` Stephen Hemminger
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 5:18 UTC (permalink / raw)
  To: herbert; +Cc: cl, netdev

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 24 Sep 2008 13:12:37 +0800

> On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > wise. Configured out various components (netfilter, etc etc) but I still keep
> > getting these aim9 result against 2.6.22:
>
> Could you please compare this against something less ancient,
> like 2.6.26 perhaps?

Herbert, this is part of the tbench regression issues. Christoph
took tbench from 2.6.22 until 2.6.27 and at basically every release
tbench performance suffered noticeably.

Now, he's taking the AIM9 benchmark networking numbers and showing
that the same exact effect is seen there too.

It really behooves us to start doing something proactive about this
blindingly obvious set of networking performance regressions through
the past 6 or so releases instead of barking at the reporters saying
things like "try this, try that, what's your config" etc.

:-)

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 5:18 ` David Miller @ 2008-09-24 15:16 ` Stephen Hemminger 2008-09-24 19:10 ` Christoph Lameter 2008-09-24 19:36 ` David Miller 0 siblings, 2 replies; 20+ messages in thread From: Stephen Hemminger @ 2008-09-24 15:16 UTC (permalink / raw) To: David Miller; +Cc: herbert, cl, netdev On Tue, 23 Sep 2008 22:18:31 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Herbert Xu <herbert@gondor.apana.org.au> > Date: Wed, 24 Sep 2008 13:12:37 +0800 > > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote: > > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network > > > wise. Configured out various components (netfilter, etc etc) but I still keep > > > getting these aim9 result against 2.6.22: > > > > Could you please compare this against something less ancient, > > like 2.6.26 perhaps? > > Herbert, this is part of the tbench regression issues. Christoph > took tbench from 2.6.22 until 2.6.27 and at basically every release > tbench performance suffered noticably. > > Now, he's taking the AIM9 benchmark networking numbers and showing > that the same exact effect is seen there too. > > It really behooves us to start doing something proactive about this > blindingly obvious set of networking performance regressions through > the past 6 or so releases instead of barking at the reporters saying > things like "try this, try that, what's your config" etc. > > :-) These loopback benchmarks are often more sensitive to scheduler than networking changes. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 15:16 ` Stephen Hemminger @ 2008-09-24 19:10 ` Christoph Lameter 2008-09-24 19:53 ` David Miller 2008-09-24 19:36 ` David Miller 1 sibling, 1 reply; 20+ messages in thread From: Christoph Lameter @ 2008-09-24 19:10 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David Miller, herbert, netdev Stephen Hemminger wrote: > These loopback benchmarks are often more sensitive to scheduler than networking > changes. Just ran a test with real NICs which show the same issues. I guess I need to get familiar with the network stack and start hacking on it. Sigh. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 19:10 ` Christoph Lameter @ 2008-09-24 19:53 ` David Miller 2008-09-24 21:34 ` Stephen Hemminger 0 siblings, 1 reply; 20+ messages in thread From: David Miller @ 2008-09-24 19:53 UTC (permalink / raw) To: cl; +Cc: shemminger, herbert, netdev From: Christoph Lameter <cl@linux-foundation.org> Date: Wed, 24 Sep 2008 14:10:54 -0500 > Stephen Hemminger wrote: > > > These loopback benchmarks are often more sensitive to scheduler than networking > > changes. > > Just ran a test with real NICs which show the same issues. I guess I need to > get familiar with the network stack and start hacking on it. Sigh. I feel your pain, I think people are being very unreasonable in their analysis of your numbers, and for this I want to personally apologize. It's clearly a networking issue in my eyes, and I wish my co-developers in networking would treat it as such instead of pushing the blame under the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit without any facts on this specific case to back up such accusations. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 19:53 ` David Miller @ 2008-09-24 21:34 ` Stephen Hemminger 2008-09-24 22:26 ` David Miller 0 siblings, 1 reply; 20+ messages in thread From: Stephen Hemminger @ 2008-09-24 21:34 UTC (permalink / raw) To: David Miller; +Cc: cl, herbert, netdev On Wed, 24 Sep 2008 12:53:46 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Christoph Lameter <cl@linux-foundation.org> > Date: Wed, 24 Sep 2008 14:10:54 -0500 > > > Stephen Hemminger wrote: > > > > > These loopback benchmarks are often more sensitive to scheduler than networking > > > changes. > > > > Just ran a test with real NICs which show the same issues. I guess I need to > > get familiar with the network stack and start hacking on it. Sigh. > > I feel your pain, I think people are being very unreasonable in their > analysis of your numbers, and for this I want to personally apologize. > > It's clearly a networking issue in my eyes, and I wish my co-developers > in networking would treat it as such instead of pushing the blame under > the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit > without any facts on this specific case to back up such accusations. Is this a one time change, or has networking been getting slower over time? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 21:34 ` Stephen Hemminger @ 2008-09-24 22:26 ` David Miller 0 siblings, 0 replies; 20+ messages in thread From: David Miller @ 2008-09-24 22:26 UTC (permalink / raw) To: shemminger; +Cc: cl, herbert, netdev From: Stephen Hemminger <shemminger@vyatta.com> Date: Wed, 24 Sep 2008 14:34:19 -0700 > On Wed, 24 Sep 2008 12:53:46 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > > > From: Christoph Lameter <cl@linux-foundation.org> > > Date: Wed, 24 Sep 2008 14:10:54 -0500 > > > > > Stephen Hemminger wrote: > > > > > > > These loopback benchmarks are often more sensitive to scheduler than networking > > > > changes. > > > > > > Just ran a test with real NICs which show the same issues. I guess I need to > > > get familiar with the network stack and start hacking on it. Sigh. > > > > I feel your pain, I think people are being very unreasonable in their > > analysis of your numbers, and for this I want to personally apologize. > > > > It's clearly a networking issue in my eyes, and I wish my co-developers > > in networking would treat it as such instead of pushing the blame under > > the carpet and saying "scheduler", "SLUB", and all kinds of other bullshit > > without any facts on this specific case to back up such accusations. > > Is this a one time change, or has networking been getting slower over time? As per the tbench thread, it's been getting slower and slower, every single release, since as far back as people have tested, which seems to be 2.6.22 or thereabouts. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression
  2008-09-24 15:16 ` Stephen Hemminger
  2008-09-24 19:10   ` Christoph Lameter
@ 2008-09-24 19:36   ` David Miller
  2008-09-29 14:24     ` Ilpo Järvinen
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2008-09-24 19:36 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, cl, netdev

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 24 Sep 2008 08:16:03 -0700

> On Tue, 23 Sep 2008 22:18:31 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Wed, 24 Sep 2008 13:12:37 +0800
> >
> > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote:
> > > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network
> > > > wise. Configured out various components (netfilter, etc etc) but I still keep
> > > > getting these aim9 result against 2.6.22:
> > >
> > > Could you please compare this against something less ancient,
> > > like 2.6.26 perhaps?
> >
> > Herbert, this is part of the tbench regression issues. Christoph
> > took tbench from 2.6.22 until 2.6.27 and at basically every release
> > tbench performance suffered noticably.
> >
> > Now, he's taking the AIM9 benchmark networking numbers and showing
> > that the same exact effect is seen there too.
> >
> > It really behooves us to start doing something proactive about this
> > blindingly obvious set of networking performance regressions through
> > the past 6 or so releases instead of barking at the reporters saying
> > things like "try this, try that, what's your config" etc.
> >
> > :-)
>
> These loopback benchmarks are often more sensitive to scheduler than networking
> changes.

When it gets to 20%, I strongly start to doubt that, and this is exactly
what's happening here.

What is it going to take to actually get someone to start profiling
and analyzing this? :-)

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-24 19:36 ` David Miller @ 2008-09-29 14:24 ` Ilpo Järvinen 2008-09-29 14:54 ` Christoph Lameter 0 siblings, 1 reply; 20+ messages in thread From: Ilpo Järvinen @ 2008-09-29 14:24 UTC (permalink / raw) To: David Miller; +Cc: shemminger, herbert, cl, netdev On Wed, 24 Sep 2008, David Miller wrote: > From: Stephen Hemminger <shemminger@vyatta.com> > Date: Wed, 24 Sep 2008 08:16:03 -0700 > > > On Tue, 23 Sep 2008 22:18:31 -0700 (PDT) > > David Miller <davem@davemloft.net> wrote: > > > > > From: Herbert Xu <herbert@gondor.apana.org.au> > > > Date: Wed, 24 Sep 2008 13:12:37 +0800 > > > > > > > On Tue, Sep 23, 2008 at 01:14:27PM -0500, Christoph Lameter wrote: > > > > > I just dont seem to be able to get 2.6.27 to behave in a speedy way network > > > > > wise. Configured out various components (netfilter, etc etc) but I still keep > > > > > getting these aim9 result against 2.6.22: > > > > > > > > Could you please compare this against something less ancient, > > > > like 2.6.26 perhaps? > > > > > > Herbert, this is part of the tbench regression issues. Christoph > > > took tbench from 2.6.22 until 2.6.27 and at basically every release > > > tbench performance suffered noticably. > > > > > > Now, he's taking the AIM9 benchmark networking numbers and showing > > > that the same exact effect is seen there too. > > > > > > It really behooves us to start doing something proactive about this > > > blindingly obvious set of networking performance regressions through > > > the past 6 or so releases instead of barking at the reporters saying > > > things like "try this, try that, what's your config" etc. > > > > > > :-) > > > > These loopback benchmarks are often more sensitive to scheduler than > > networking changes. > > When it gets to %20, I strong start to doubt that, and this is exactly > what's happening here. > > What is it going to take to actually get someone to start profiling and > analyzing this? 
:-)

...I was thinking earlier to answer "time?", but now once been there, it
seems that more time is more appropriate... So far I haven't been able to
find a way to create a reproducible series of result numbers with aim9
tcp_test... it seems that the results vary within that (at least) 20%
margin. Can Christoph actually get stable numbers out of it with 27-rcs
(I haven't extensively tested .22 yet with long test durations but it
seems that same problem occurs with it as well if short tests were used)?

...And what I've learned, I couldn't even finish a test run with conntrack
and default settings as ipv4 conntrack ran out of entries :-).

Ow, almost forgot, I got some stable regression with lockdep though, I
hope we've gotten some more power to its detection in return for the lost
performance.

I got these top variations (in absolute numbers) between three consecutive
runs of 1000 seconds aim9 tcp_test (3xoprof(abs,%), func, max-min,
(max-min)/min), aim9+its data on tmpfs (with nodebug-nonf config):

266288 1.0221  420190 1.6457  614494 2.4039  vfs_read                 348206  1.30763
233649 0.8968  317763 1.2446  508838 1.9906  vfs_write                275189  1.17779
228732 0.8779  496359 1.9440  324747 1.2704  dnotify_parent           267627  1.17005
671548 2.5776  592604 2.3210  445792 1.7440  inet_csk_get_port        225756  0.506416
392960 1.5083  362665 1.4204  491234 1.9217  netif_rx                 128569  0.354512
121337 0.4657  208314 0.8159  249783 0.9772  do_sync_write            128446  1.05859
164951 0.6331  168276 0.6591  285451 1.1167  loopback_xmit            120500  0.73052
359659 1.3805  242133 0.9483  256785 1.0046  __tcp_select_window      117526  0.485378
876319 3.3636  762690 2.9872  772554 3.0223  tcp_sendmsg              113629  0.148985
266895 1.0244  199204 0.7802  176985 0.6924  tcp_established_options   89910  0.508009
689652 2.6471  647962 2.5378  608943 2.3822  dev_queue_xmit            80709  0.132539
206754 0.7936  265523 1.0400  284087 1.1114  __kmalloc_track_caller    77333  0.374034
544026 2.0882  496654 1.9452  571982 2.2376  tcp_recvmsg               75328  0.151671
600414 2.3046  525704 2.0590  567588 2.2204  ip_queue_xmit             74710  0.142114
131820 0.5060   59259 0.2321  121586 0.4757  getnstimeofday            72561  1.22447
 67061 0.2574  132155 0.5176  137914 0.5395  rw_verify_area            70853  1.05655
129676 0.4977   60652 0.2376   98307 0.3846  sock_rfree                69024  1.13803
535701 2.0562  586248 2.2961  517563 2.0247  ip_finish_output          68685  0.132708
692187 2.6568  634962 2.4869  623888 2.4407  tcp_rcv_established       68299  0.109473
949233 3.6435  900741 3.5279  882256 3.4514  tcp_transmit_skb          66977  0.0759156

...like said, the variation in the aim9 results was ~20% at most.

-- 
 i.

^ permalink raw reply	[flat|nested] 20+ messages in thread
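The two right-hand columns of the profile listing above (max-min and (max-min)/min) follow directly from the three per-run sample counts. A minimal sketch of that arithmetic, using four rows copied from the listing; the names `SAMPLES`, `spread` and `rank_by_spread` are illustrative and not part of oprofile or of the tooling used in the thread:

```python
# Per-run sample counts for a few functions, copied from the listing above.
# Each tuple holds the absolute counts from the three consecutive runs.
SAMPLES = {
    "vfs_read": (266288, 420190, 614494),
    "inet_csk_get_port": (671548, 592604, 445792),
    "tcp_sendmsg": (876319, 762690, 772554),
    "tcp_transmit_skb": (949233, 900741, 882256),
}


def spread(counts):
    """Return (max - min, (max - min) / min) for one function's run counts."""
    lo, hi = min(counts), max(counts)
    return hi - lo, (hi - lo) / lo


def rank_by_spread(samples):
    """List (name, absolute spread, relative spread), largest spread first."""
    rows = [(name, *spread(counts)) for name, counts in samples.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, abs_spread, rel_spread in rank_by_spread(SAMPLES):
        print(f"{name:20s} {abs_spread:8d} {rel_spread:.6f}")
```

Sorting by the absolute spread reproduces the ordering used in the listing, where functions with large run-to-run swings such as vfs_read come first even though their share of total samples is modest.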
* Re: AIM9 regression 2008-09-29 14:24 ` Ilpo Järvinen @ 2008-09-29 14:54 ` Christoph Lameter 2008-09-29 15:12 ` Ilpo Järvinen ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Christoph Lameter @ 2008-09-29 14:54 UTC (permalink / raw) To: Ilpo Järvinen; +Cc: David Miller, shemminger, herbert, netdev Ilpo Järvinen wrote: > ...I was thinking earlier to answer "time?", but now once been there, it > seems that more time is more appropriate... So far I haven't been able to > find a way to create a reproducable serie of result numbers with aim9 > tcp_test... it seems that the results vary within that (at least) 20% > margin. Can Christoph actually get stable numbers out of it with 27-rcs > (I haven't extensively tested .22 yet with long test durations but it > seems that same problem occurs with it as well if short tests were used)? Results fluctuate between 10 - 25%. The problem occurs with the short durations as well. If this is due to the additional code complexity in later kernels as we suspect then it may be an issue with cpu cache effectiveness. Going to 64 bit binaries also yields a significant hit (as high as 30%) which also indicates caching issues. Both 64 bit kernels and later kernels cause the variability of results to increase. 64 bit has double the effect than a 2.6.27 kernel. All indications of cpu caching issues. The L1 cache may become ineffective due to the increased cache footprint. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression 2008-09-29 14:54 ` Christoph Lameter @ 2008-09-29 15:12 ` Ilpo Järvinen 2008-09-29 15:36 ` Stephen Hemminger 2008-10-31 14:57 ` Ilpo Järvinen 2 siblings, 0 replies; 20+ messages in thread From: Ilpo Järvinen @ 2008-09-29 15:12 UTC (permalink / raw) To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev [-- Attachment #1: Type: TEXT/PLAIN, Size: 1466 bytes --] On Mon, 29 Sep 2008, Christoph Lameter wrote: > Ilpo Järvinen wrote: > > > So far I haven't been able to > > find a way to create a reproducable serie of result numbers with aim9 > > tcp_test... it seems that the results vary within that (at least) 20% > > margin. Can Christoph actually get stable numbers out of it with 27-rcs > > (I haven't extensively tested .22 yet with long test durations but it > > seems that same problem occurs with it as well if short tests were used)? > > Results fluctuate between 10 - 25%. The problem occurs with the short > durations as well. If this is due to the additional code complexity in later > kernels as we suspect then it may be an issue with cpu cache effectiveness. Hmm... I'll try to extract some very raw (and possible somewhat skewed) numbers out of that based on the profiles I have and some acme's tools. > Going to 64 bit binaries also yields a significant hit (as high as 30%) > which also indicates caching issues. > > Both 64 bit kernels and later kernels cause the variability of results to > increase. 64 bit has double the effect than a 2.6.27 kernel. All indications > of cpu caching issues. The L1 cache may become ineffective due to the > increased cache footprint. Ok. I was testing 64bit only... I'll probably try next some very short tests in vast numbers to see if the results converge after enough samples are taken, lets hope I don't need many days to get to such point... :-) -- i. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: AIM9 regression
  2008-09-29 14:54 ` Christoph Lameter
  2008-09-29 15:12   ` Ilpo Järvinen
@ 2008-09-29 15:36   ` Stephen Hemminger
  2008-10-31 14:57   ` Ilpo Järvinen
  2 siblings, 0 replies; 20+ messages in thread
From: Stephen Hemminger @ 2008-09-29 15:36 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, herbert, netdev, Ilpo Järvinen

----- Original Message -----
From: "Christoph Lameter" <cl@linux-foundation.org>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Cc: "David Miller" <davem@davemloft.net>, shemminger@vyatta.com, herbert@gondor.apana.org.au, netdev@vger.kernel.org
Sent: Monday, September 29, 2008 4:54:11 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: AIM9 regression

Ilpo Järvinen wrote:

> ...I was thinking earlier to answer "time?", but now once been there, it
> seems that more time is more appropriate... So far I haven't been able to
> find a way to create a reproducable serie of result numbers with aim9
> tcp_test... it seems that the results vary within that (at least) 20%
> margin. Can Christoph actually get stable numbers out of it with 27-rcs
> (I haven't extensively tested .22 yet with long test durations but it
> seems that same problem occurs with it as well if short tests were used)?

Results fluctuate between 10 - 25%. The problem occurs with the short
durations as well. If this is due to the additional code complexity in later
kernels as we suspect then it may be an issue with cpu cache effectiveness.

Going to 64 bit binaries also yields a significant hit (as high as 30%) which
also indicates caching issues.

Both 64 bit kernels and later kernels cause the variability of results to
increase. 64 bit has double the effect than a 2.6.27 kernel. All indications
of cpu caching issues. The L1 cache may become ineffective due to the
increased cache footprint.

-------------

One of the items showing up in the profile is the local side port allocation.
Is the ephemeral port range getting full? If it is then the random port scan
could take a long time to find the next free slot, especially now that source
ports are randomized.

^ permalink raw reply	[flat|nested] 20+ messages in thread
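The concern about a nearly full ephemeral port range can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the kernel's `inet_csk_get_port()` logic: it picks a random starting port and scans forward with wraparound until it finds a free one, over a hypothetical 1000-port range with a randomly chosen set of occupied ports. All function names and parameters are illustrative.

```python
import random


def probes_to_find_free(used, low, high, rng):
    """Pick a random start in [low, high] and scan forward (wrapping around)
    until a free port is found. Returns the number of ports examined, or
    None if every port in the range is taken."""
    size = high - low + 1
    start = rng.randrange(size)
    for step in range(size):
        port = low + (start + step) % size
        if port not in used:
            return step + 1
    return None


def mean_probes(occupancy, low=32768, high=33767, trials=500, seed=1):
    """Average probe count when a fraction `occupancy` (< 1.0) of the
    hypothetical range is already in use, with occupied ports chosen
    uniformly at random for each trial."""
    rng = random.Random(seed)
    size = high - low + 1
    total = 0
    for _ in range(trials):
        used = set(rng.sample(range(low, high + 1), int(occupancy * size)))
        total += probes_to_find_free(used, low, high, rng)
    return total / trials


if __name__ == "__main__":
    for occupancy in (0.0, 0.5, 0.9, 0.99):
        print(f"occupancy {occupancy:4.2f}: {mean_probes(occupancy):7.2f} probes on average")
```

In this model the cost stays small while the range is mostly free and climbs steeply only as occupancy approaches 100%, which is the regime the question above is asking about. Whether the benchmark actually reaches that regime is not established by the thread at this point.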
* Re: AIM9 regression 2008-09-29 14:54 ` Christoph Lameter 2008-09-29 15:12 ` Ilpo Järvinen 2008-09-29 15:36 ` Stephen Hemminger @ 2008-10-31 14:57 ` Ilpo Järvinen 2 siblings, 0 replies; 20+ messages in thread From: Ilpo Järvinen @ 2008-10-31 14:57 UTC (permalink / raw) To: Christoph Lameter; +Cc: David Miller, shemminger, Herbert Xu, Netdev [-- Attachment #1: Type: TEXT/PLAIN, Size: 2813 bytes --] On Mon, 29 Sep 2008, Christoph Lameter wrote: > Ilpo Järvinen wrote: > > > ...I was thinking earlier to answer "time?", but now once been there, it > > seems that more time is more appropriate... So far I haven't been able to > > find a way to create a reproducable serie of result numbers with aim9 > > tcp_test... it seems that the results vary within that (at least) 20% > > margin. Can Christoph actually get stable numbers out of it with 27-rcs > > (I haven't extensively tested .22 yet with long test durations but it > > seems that same problem occurs with it as well if short tests were used)? > > Results fluctuate between 10 - 25%. The problem occurs with the short > durations as well. If this is due to the additional code complexity in later > kernels as we suspect then it may be an issue with cpu cache effectiveness. > > Going to 64 bit binaries also yields a significant hit (as high as 30%) which > also indicates caching issues. > > Both 64 bit kernels and later kernels cause the variability of results to > increase. 64 bit has double the effect than a 2.6.27 kernel. All indications > of cpu caching issues. The L1 cache may become ineffective due to the > increased cache footprint. I experimented with it some and changed tcp_test to bind into supplied port instead of relying on the port allocator randomness, both server and client port were do like that. However, I had to turn tcp_tw_recycle on to get the test to actually return instead of -ESOMETHING. 
In addition I did sync & drop_caches before each run (I'm not sure if it did
actually reduce variation a bit or did I just imagine, I'd expect it to
damp test harness caused artifacts if it did something) + sleep 20 before
each 20 seconds test. Port allocator could be benchmarked separately if so
desired.

Here are my current numbers with 64-bit (nodebug & nonf):

         .22     .28-rc2-gsmthg
GSO/TSO          off      on
         240700  232398   224194
         241187  236722   227610
         243940  237388   229472
         244367  237469   229576
         246134  238569   229680
         246211  238680   229999
         246400  238693   230262
         248761  239076   230404
         250934  239107   231404
         251203  239152   231562
         251572  239215   231912
         254158  239863   232744
         256407  239912   234017
         257329  240022   -EINTR
         259560  241352   -EINTR

http://www.cs.helsinki.fi/u/ijjarvin/aim9/res.png

TSO/GSO does modulos every so often but Dave is currently evaluating how
to get rid of that, discussed here:

  http://marc.info/?t=122411618000004&r=1&w=2

...Still some uncertainty where the remaining of Evgeniy's G&TSO off/on
difference comes from.

2.6.27-rc7 has basically the same numbers as 2.6.28-rc2 though I
accidentally had there ftrace on so some extra nops were present.

Still some regression to attack, but there seems to be considerably less than
20% when testing for net_random()'s output is removed.

-- 
 i.

^ permalink raw reply	[flat|nested] 20+ messages in thread
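One way to put a number on "considerably less than 20%" is to compare the medians of the three result columns above. The sketch below assumes the columns read as .22, .28-rc2 with GSO/TSO off, and .28-rc2 with GSO/TSO on, and it leaves the two -EINTR runs out of the last column; `RUNS` and `relative_to` are illustrative names.

```python
from statistics import median

# Samples copied from the table above. Assumption: the columns are
# .22, .28-rc2 with GSO/TSO off, and .28-rc2 with GSO/TSO on.
# The two runs that ended in -EINTR are omitted from the last column.
RUNS = {
    ".22": [
        240700, 241187, 243940, 244367, 246134, 246211, 246400, 248761,
        250934, 251203, 251572, 254158, 256407, 257329, 259560,
    ],
    ".28-rc2 GSO/TSO off": [
        232398, 236722, 237388, 237469, 238569, 238680, 238693, 239076,
        239107, 239152, 239215, 239863, 239912, 240022, 241352,
    ],
    ".28-rc2 GSO/TSO on": [
        224194, 227610, 229472, 229576, 229680, 229999, 230262, 230404,
        231404, 231562, 231912, 232744, 234017,
    ],
}


def relative_to(baseline, other):
    """Percent difference of median(other) against median(baseline)."""
    base = median(baseline)
    return 100.0 * (median(other) - base) / base


if __name__ == "__main__":
    base = RUNS[".22"]
    for label, values in RUNS.items():
        print(f"{label:22s} median {median(values):8.0f} "
              f"({relative_to(base, values):+.2f}% vs .22)")
```

The median is used here rather than the mean because the message reports individual runs with visible spread; this is a choice made for the illustration, not something the thread prescribes.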