* Poor gige performance with 2.4.20-pre* @ 2002-09-28 22:57 Richard Gooch 2002-09-29 2:12 ` Xiaoliang (David) Wei 2002-09-29 2:32 ` Ben Greear 0 siblings, 2 replies; 10+ messages in thread From: Richard Gooch @ 2002-09-28 22:57 UTC (permalink / raw) To: netdev Hi, all. For a while now I've noticed poor performance with gige cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because of the cheap-ass Addtron cards I bought (these use the ns83820 chip). But now that the Intel E1000 cards are pretty cheap too, I've grabbed a couple (part number: PWLA8390MT) and see the same problem. In fact, the E1000 cards are no better than the Addtron cards. I'm using the D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. The basic test I do is to send 100 MB over a TCP connection from one machine to the other. The results are: Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s; Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s. This is quite a bit less than what gige is supposed to give. Is this expected? Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 10+ messages in thread
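For a sense of scale, the two results above can be compared against a rough ceiling for TCP payload throughput on gigabit Ethernet at MTU 1500. The sketch below is only back-of-the-envelope arithmetic. It assumes standard Ethernet framing overhead, IPv4 and TCP headers without options, and that "MB" means 10^6 bytes; none of these are stated in the thread.

```python
# Rough upper bound on TCP payload throughput over gigabit Ethernet.
# Assumptions (not from the thread): IPv4 + TCP without options,
# standard Ethernet framing, and 1 MB = 10**6 bytes.
LINK_BITS_PER_SEC = 1_000_000_000
MTU = 1500
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 header + TCP header, no options


def max_goodput_mb_per_s(mtu=MTU, l3l4_headers=IP_TCP_HEADERS):
    """Payload bytes per second (in MB/s) if the wire is kept 100% busy."""
    payload = mtu - l3l4_headers      # TCP payload bytes per frame
    wire_bytes = mtu + ETH_OVERHEAD   # bytes of wire time per frame
    return LINK_BITS_PER_SEC / 8 * payload / wire_bytes / 1e6


ceiling = max_goodput_mb_per_s()
for label, measured in (("PIII -> Athlon", 58), ("Athlon -> PIII", 23)):
    share = measured / ceiling
    print(f"{label}: {measured} MB/s is {share:.0%} of ~{ceiling:.0f} MB/s")
```

Under these assumptions the ceiling is a little under 119 MB/s, so both reported figures are well below half of what the link could carry in principle.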
* Re: Poor gige performance with 2.4.20-pre* 2002-09-28 22:57 Poor gige performance with 2.4.20-pre* Richard Gooch @ 2002-09-29 2:12 ` Xiaoliang (David) Wei 2002-09-29 6:34 ` Richard Gooch 2002-09-29 2:32 ` Ben Greear 1 sibling, 1 reply; 10+ messages in thread From: Xiaoliang (David) Wei @ 2002-09-29 2:12 UTC (permalink / raw) To: Richard Gooch, netdev Hi, Did you do the experiments on WAN or LAN? What are the other configurations, such as: The sending/receiving buffer (I think it should be larger than Bandwidth*Delay)? > Hi, all. For a while now I've noticed poor performance with gige > cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because > of the cheap-ass Addtron cards I bought (these use the ns83820 chip). > But now that the Intel E1000 cards are pretty cheap too, I've grabbed > a couple (part number: PWLA8390MT) and see the same problem. In fact, > the E1000 cards are no better than the Addtron cards. I'm using the > D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. > > The basic test I do is to send 100 MB over a TCP connection from one > machine to the other. The results are: > > Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s > Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s > > This is quite a bit less than what gige is supposed to give. Is this > expected? > > Regards, > > Richard.... > Permanent: rgooch@atnf.csiro.au > Current: rgooch@ras.ucalgary.ca > > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Poor gige performance with 2.4.20-pre* 2002-09-29 2:12 ` Xiaoliang (David) Wei @ 2002-09-29 6:34 ` Richard Gooch 2002-09-30 0:45 ` Benjamin LaHaise 0 siblings, 1 reply; 10+ messages in thread From: Richard Gooch @ 2002-09-29 6:34 UTC (permalink / raw) To: Xiaoliang (David) Wei; +Cc: netdev Xiaoliang Wei writes: > Hi, > Did you do the experiments on WAN or LAN? What are the other > configurations, such as: The sending/receiving buffer (I think it > should be larger than Bandwidth*Delay)? This is all on a LAN (of course; expecting good performance from a WAN is pretty futile). I use a buffer size of 256 KiB. Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 10+ messages in thread
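The bandwidth*delay rule of thumb raised above can be checked with a few lines of arithmetic. This is a minimal sketch: the 0.2 ms round-trip time is a purely hypothetical value for a switched LAN (the thread gives no RTT measurement), while the 256 KiB buffer size is the one Richard reports.

```python
# Bandwidth-delay product (BDP) versus the reported 256 KiB buffer.
# ASSUMED_LAN_RTT is a hypothetical figure, not a measurement from the thread.
GIGE_BITS_PER_SEC = 1_000_000_000
ASSUMED_LAN_RTT = 0.0002          # seconds (0.2 ms), assumption
SOCKET_BUFFER = 256 * 1024        # bytes, as reported in the thread


def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_bits_per_sec * rtt_seconds / 8


bdp = bdp_bytes(GIGE_BITS_PER_SEC, ASSUMED_LAN_RTT)
print(f"BDP is about {bdp:.0f} bytes")
print(f"256 KiB buffer is about {SOCKET_BUFFER / bdp:.1f}x the BDP")
```

With an RTT in that range the buffer is already several times larger than the bandwidth-delay product, which would suggest that buffer size alone is unlikely to explain the shortfall. A real measurement of the RTT would be needed to confirm this.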
* Re: Poor gige performance with 2.4.20-pre* 2002-09-29 6:34 ` Richard Gooch @ 2002-09-30 0:45 ` Benjamin LaHaise 2002-09-30 0:53 ` Richard Gooch 0 siblings, 1 reply; 10+ messages in thread From: Benjamin LaHaise @ 2002-09-30 0:45 UTC (permalink / raw) To: Richard Gooch; +Cc: Xiaoliang (David) Wei, netdev On Sun, Sep 29, 2002 at 12:34:02AM -0600, Richard Gooch wrote: > This is all on a LAN (of course; expecting good performance from a WAN > is pretty futile). I use a buffer size of 256 KiB. From my experience tuning on a 550MHz P3 Xeon, you're better off using a buffer size of 8-16KB that stays in the L1 cache. Of course, that was without actually doing anything useful with the data being transferred. Gige really does need a faster cpu in the ghz+ range. As for ns83820, it's a work in progress. Some of the recent bugfixes may have reduced performance, so it may need to be retuned. -ben -- GMS rules. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Poor gige performance with 2.4.20-pre* 2002-09-30 0:45 ` Benjamin LaHaise @ 2002-09-30 0:53 ` Richard Gooch 0 siblings, 0 replies; 10+ messages in thread From: Richard Gooch @ 2002-09-30 0:53 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Xiaoliang (David) Wei, netdev Benjamin LaHaise writes: > On Sun, Sep 29, 2002 at 12:34:02AM -0600, Richard Gooch wrote: > > This is all on a LAN (of course; expecting good performance from a WAN > > is pretty futile). I use a buffer size of 256 KiB. > > From my experience tuning on a 550MHz P3 Xeon, you're better off > using a buffer size of 8-16KB that stays in the L1 cache. Of > course, that was without actually doing anything useful with the > data being transferred. Gige really does need a faster cpu in the > ghz+ range. As for ns83820, it's a work in progress. Some of the > recent bugfixes may have reduced performance, so it may need to be > retuned. Using 8 KiB buffer reduces performance, 16 KiB is almost the same as using 256 KiB. Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Poor gige performance with 2.4.20-pre* 2002-09-28 22:57 Poor gige performance with 2.4.20-pre* Richard Gooch 2002-09-29 2:12 ` Xiaoliang (David) Wei @ 2002-09-29 2:32 ` Ben Greear 2002-09-29 19:22 ` Richard Gooch 1 sibling, 1 reply; 10+ messages in thread From: Ben Greear @ 2002-09-29 2:32 UTC (permalink / raw) To: Richard Gooch; +Cc: netdev Richard Gooch wrote: > Hi, all. For a while now I've noticed poor performance with gige > cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because > of the cheap-ass Addtron cards I bought (these use the ns83820 chip). > But now that the Intel E1000 cards are pretty cheap too, I've grabbed > a couple (part number: PWLA8390MT) and see the same problem. In fact, > the E1000 cards are no better than the Addtron cards. I'm using the > D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. Machine: dual Athlon, 1.66 GHz, 64/66 MHz PCI, 512MB RAM, 2 Intel PRO/1000 MT server NICs. Kernel: 2.4.20-pre7, pre8 (same behaviour) I was able to send and receive 400Mbps between two cards on the machine simultaneously. This is sustained over a period of time until the box crashes after an hour or so :( Using pktgen, I could generate 860Mbps in one direction from one port to another on the same machine (crashed after an hour or so here too). Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the e1000 module, that helps tremendously when using smaller packets. I tried the e1000 driver in 2.5.38 on the machine, it ran at about 1/3 of the speed, and crashed in under 5 minutes... So, the performance could be better, but what is really killing me is stability at this point... > > The basic test I do is to send 100 MB over a TCP connection from one > machine to the other. The results are: > > Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s > Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s > > This is quite a bit less than what gige is supposed to give. Is this > expected? > > Regards, > > Richard.... 
> Permanent: rgooch@atnf.csiro.au > Current: rgooch@ras.ucalgary.ca > -- Ben Greear <greearb@candelatech.com> <Ben_Greear AT excite.com> President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Poor gige performance with 2.4.20-pre* 2002-09-29 2:32 ` Ben Greear @ 2002-09-29 19:22 ` Richard Gooch 2002-09-29 19:32 ` Ben Greear 0 siblings, 1 reply; 10+ messages in thread From: Richard Gooch @ 2002-09-29 19:22 UTC (permalink / raw) To: Ben Greear; +Cc: netdev Ben Greear writes: > Richard Gooch wrote: > > Hi, all. For a while now I've noticed poor performance with gige > > cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because > > of the cheap-ass Addtron cards I bought (these use the ns83820 chip). > > But now that the Intel E1000 cards are pretty cheap too, I've grabbed > > a couple (part number: PWLA8390MT) and see the same problem. In fact, > > the E1000 cards are no better than the Addtron cards. I'm using the > > D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. > > Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the > e1000 module, that helps tremendously when using smaller packets. Didn't help at all. Just to summarise, I've got: options e1000 TxDescriptors=4096 RxDescriptors=1024 net.ipv4.tcp_rmem = 262144 262144 262144 net.ipv4.tcp_wmem = 262144 262144 262144 MTU=1500 I'm doing read(2)/write(2) to/from a user-space buffer over a TCP socket with 256 KiB buffer size. Is the E1000 supposed to have hardware interrupt mitigation (thus avoiding the need for NAPI)? Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 10+ messages in thread
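The kind of test described above (write(2) from a user-space buffer into a TCP socket, read(2) on the other side, then divide bytes by elapsed time) might look roughly like the following. This is only a sketch and not Richard's actual test program: the function names are hypothetical, it runs both ends over loopback in one process for convenience, and it leaves the kernel socket buffers at whatever the tcp_rmem/tcp_wmem sysctls provide.

```python
# Minimal TCP throughput sketch (hypothetical helper names, loopback only).
# It mimics "send N bytes from a user-space buffer and time it"; it is not
# the test program used in the thread.
import socket
import threading
import time


def _sink(server, chunk):
    """Accept one connection and discard everything until EOF."""
    conn, _ = server.accept()
    with conn:
        while conn.recv(chunk):
            pass


def measure_throughput(total_bytes, chunk=256 * 1024, host="127.0.0.1"):
    """Send total_bytes in chunk-sized writes; return the rate in MB/s."""
    server = socket.create_server((host, 0))   # port 0: let the OS pick one
    port = server.getsockname()[1]
    receiver = threading.Thread(target=_sink, args=(server, chunk))
    receiver.start()

    payload = bytes(chunk)                      # zero-filled user-space buffer
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(chunk, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                             # wait until the sink saw EOF
    elapsed = time.perf_counter() - start
    server.close()
    return sent / elapsed / 1e6


if __name__ == "__main__":
    print(f"{measure_throughput(100 * 10**6):.0f} MB/s over loopback")
```

Loopback numbers say nothing about the NIC or driver, of course. To reproduce the thread's setup the sink would have to run on the second machine, and the chunk size could be varied (for example 8 KiB, 16 KiB, 256 KiB) to compare with the figures reported elsewhere in the thread.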
* Re: Poor gige performance with 2.4.20-pre* 2002-09-29 19:22 ` Richard Gooch @ 2002-09-29 19:32 ` Ben Greear 2002-09-29 20:54 ` Richard Gooch 0 siblings, 1 reply; 10+ messages in thread From: Ben Greear @ 2002-09-29 19:32 UTC (permalink / raw) To: Richard Gooch; +Cc: netdev Richard Gooch wrote: > Ben Greear writes: > >>Richard Gooch wrote: >> >>> Hi, all. For a while now I've noticed poor performance with gige >>>cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because >>>of the cheap-ass Addtron cards I bought (these use the ns83820 chip). >>>But now that the Intel E1000 cards are pretty cheap too, I've grabbed >>>a couple (part number: PWLA8390MT) and see the same problem. In fact, >>>the E1000 cards are no better than the Addtron cards. I'm using the >>>D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. >> >>Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the >>e1000 module, that helps tremendously when using smaller packets. > > > Didn't help at all. Just to summarise, I've got: > options e1000 TxDescriptors=4096 RxDescriptors=1024 > net.ipv4.tcp_rmem = 262144 262144 262144 > net.ipv4.tcp_wmem = 262144 262144 262144 > MTU=1500 > > I'm doing read(2)/write(2) to/from a user-space buffer over a TCP > socket with 256 KiB buffer size. > > Is the E1000 supposed to have hardware interrupt mitigation (thus > avoiding the need for NAPI)? NAPI did not greatly improve the performance I saw with larger packets, but it did help with smaller (say, 60 byte) packets. One other thing I saw with TCP connections: They started off slow, but after a few seconds they were reaching their peak throughput. How long are you running your test? Ben > > Regards, > > Richard.... 
> Permanent: rgooch@atnf.csiro.au > Current: rgooch@ras.ucalgary.ca > -- Ben Greear <greearb@candelatech.com> <Ben_Greear AT excite.com> President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Poor gige performance with 2.4.20-pre* 2002-09-29 19:32 ` Ben Greear @ 2002-09-29 20:54 ` Richard Gooch 2002-09-30 21:21 ` Jon Fraser 0 siblings, 1 reply; 10+ messages in thread From: Richard Gooch @ 2002-09-29 20:54 UTC (permalink / raw) To: Ben Greear; +Cc: netdev Ben Greear writes: > Richard Gooch wrote: > > Ben Greear writes: > > > >>Richard Gooch wrote: > >> > >>> Hi, all. For a while now I've noticed poor performance with gige > >>>cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because > >>>of the cheap-ass Addtron cards I bought (these use the ns83820 chip). > >>>But now that the Intel E1000 cards are pretty cheap too, I've grabbed > >>>a couple (part number: PWLA8390MT) and see the same problem. In fact, > >>>the E1000 cards are no better than the Addtron cards. I'm using the > >>>D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. > >> > >>Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the > >>e1000 module, that helps tremendously when using smaller packets. > > > > Didn't help at all. Just to summarise, I've got: > > options e1000 TxDescriptors=4096 RxDescriptors=1024 > > net.ipv4.tcp_rmem = 262144 262144 262144 > > net.ipv4.tcp_wmem = 262144 262144 262144 > > MTU=1500 > > > > I'm doing read(2)/write(2) to/from a user-space buffer over a TCP > > socket with 256 KiB buffer size. > > > > Is the E1000 supposed to have hardware interrupt mitigation (thus > > avoiding the need for NAPI)? > > NAPI did not greatly improve the performance I saw with larger packets, > but it did help with smaller (say, 60 byte) packets. My packets should be 1500 bytes, or close to it. > One other thing I saw with TCP connections: They started off slow, > but after a few seconds they were reacing their peak throughput. > How long are you running your test? I normally send 100 MB, so that's around 2 seconds or more. Sending 1 GB doesn't change anything (other than the test taking 20 seconds or more). 
Oh, BTW: some possibly relevant config options: CONFIG_M686=y CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set CONFIG_HIGHMEM=y CONFIG_HIGHIO=y CONFIG_SMP=y CONFIG_E1000=m Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: Poor gige performance with 2.4.20-pre* 2002-09-29 20:54 ` Richard Gooch @ 2002-09-30 21:21 ` Jon Fraser 0 siblings, 0 replies; 10+ messages in thread From: Jon Fraser @ 2002-09-30 21:21 UTC (permalink / raw) To: netdev Hello, I'm new to this list, so please bear with me. I'm doing similar tests with gige and am seeing similar issues. I have two different but similar test machines, both running 2.4.18. Dell 1550: dual 1 GHz PIII, 256k cache, ServerWorks HE chipset, Intel E1000 (82542 chipset). Embedded card: dual 1.266 GHz PIII, 512k cache, ServerWorks HE chipset, embedded Intel E1000 (82543 chipset). We're using IXIA test gear to source/sink the packets. The systems are just ip-forwarding the traffic back out the same interface. That is, we have the gige set up with aliases so it is on two different nets. I'm trying to find the bottlenecks in small packet performance. With large packets, we can exceed 900 Mbps on the embedded card, so that's not an issue. The Dell 1550 seems to run out of bus bandwidth before reaching that level. With 64 byte packets, we can achieve 250 kpps running dual processor. This consumes about 65% of each cpu. Can't go faster without dropping a significant percentage of the packets. If we run with the 82543 interrupts tied to a single processor, we can achieve about 285 kpps, at which point we're using 95% of the single cpu. Running a uniprocessor kernel, we top out around 350 kpps. There's nothing else running on the boxes. I'm perplexed by a couple of issues. The network performance of the SMP kernel with the gige bound to a single processor is only about 80% of the UP kernel. Is this typical? Are the causes of the performance degradation well known? With the gige running on both processors, we get rather poor performance. We can't even reach the same number of pps on two processors that we can with one. 
Using cpu performance measurement counters, we seem to reach a point where there is as much time being spent doing cache invalidates as there is doing real work. All the queues and statistics are per-cpu in the 2.4.18 kernel. Are there other known problems causing excessive cache invalidates? Are there any significant improvements in later kernels? Thanks in advance, Jon Fraser > -----Original Message----- > From: netdev-bounce@oss.sgi.com [mailto:netdev-bounce@oss.sgi.com]On > Behalf Of Richard Gooch > Sent: Sunday, September 29, 2002 4:54 PM > To: Ben Greear > Cc: netdev@oss.sgi.com > Subject: Re: Poor gige performance with 2.4.20-pre* > > > Ben Greear writes: > > Richard Gooch wrote: > > > Ben Greear writes: > > > > > >>Richard Gooch wrote: > > >> > > >>> Hi, all. For a while now I've noticed poor performance > with gige > > >>>cards under 2.4.19 and 2.4.20-pre*. At first I thought > it was because > > >>>of the cheap-ass Addtron cards I bought (these use the > ns83820 chip). > > >>>But now that the Intel E1000 cards are pretty cheap too, > I've grabbed > > >>>a couple (part number: PWLA8390MT) and see the same > problem. In fact, > > >>>the E1000 cards are no better than the Addtron cards. > I'm using the > > >>>D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes. > > >> > > >>Try setting the TxDescriptors=4096 RxDescriptors=1024 > when loading the > > >>e1000 module, that helps tremendously when using smaller packets. > > > > > > Didn't help at all. Just to summarise, I've got: > > > options e1000 TxDescriptors=4096 RxDescriptors=1024 > > > net.ipv4.tcp_rmem = 262144 262144 262144 > > > net.ipv4.tcp_wmem = 262144 262144 262144 > > > MTU=1500 > > > > > > I'm doing read(2)/write(2) to/from a user-space buffer over a TCP > > > socket with 256 KiB buffer size. > > > > > > Is the E1000 supposed to have hardware interrupt mitigation (thus > > > avoiding the need for NAPI)? 
> > > > NAPI did not greatly improve the performance I saw with > larger packets, > > but it did help with smaller (say, 60 byte) packets. > > My packets should be 1500 bytes, or close to it. > > > One other thing I saw with TCP connections: They started off slow, > > but after a few seconds they were reacing their peak throughput. > > How long are you running your test? > > I normally send 100 MB, so that's around 2 seconds or more. Sending > 1 GB doesn't change anything (other than the test taking 20 seconds or > more). > > Oh, BTW: some possibly relevant config options: > CONFIG_M686=y > CONFIG_HIGHMEM4G=y > # CONFIG_HIGHMEM64G is not set > CONFIG_HIGHMEM=y > CONFIG_HIGHIO=y > CONFIG_SMP=y > CONFIG_E1000=m > > Regards, > > Richard.... > Permanent: rgooch@atnf.csiro.au > Current: rgooch@ras.ucalgary.ca > > ^ permalink raw reply [flat|nested] 10+ messages in thread
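The small-packet figures in Jon Fraser's message can be put next to the theoretical gigabit line rate for minimum-size frames. The sketch below is plain arithmetic and assumes that "64 byte packets" means 64-byte Ethernet frames including the FCS (a common test-gear convention, but not stated in the thread), with the usual 8 bytes of preamble and 12 bytes of inter-frame gap per frame.

```python
# Reported forwarding rates versus gigabit line rate for 64-byte frames.
# Assumption (not from the thread): 64 bytes is the full Ethernet frame
# including FCS, plus 8 bytes preamble/SFD and 12 bytes inter-frame gap.
LINK_BITS_PER_SEC = 1_000_000_000
WIRE_OVERHEAD = 8 + 12   # preamble+SFD and inter-frame gap, bytes per frame


def line_rate_pps(frame_bytes):
    """Maximum frames per second the link can carry at this frame size."""
    return LINK_BITS_PER_SEC / ((frame_bytes + WIRE_OVERHEAD) * 8)


max_pps = line_rate_pps(64)
results_kpps = {
    "SMP, both CPUs": 250,
    "SMP, interrupts on one CPU": 285,
    "UP kernel": 350,
}
for name, kpps in results_kpps.items():
    print(f"{name}: {kpps} kpps = {kpps * 1000 / max_pps:.0%} of line rate")
print(f"SMP single-CPU relative to UP: {285 / 350:.0%}")
```

Under these assumptions line rate is roughly 1.49 Mpps, so even the best reported result is about a quarter of it, and the 285 kpps versus 350 kpps comparison matches the "about 80%" figure quoted in the message.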