netdev.vger.kernel.org archive mirror
* Poor gige performance with 2.4.20-pre*
@ 2002-09-28 22:57 Richard Gooch
  2002-09-29  2:12 ` Xiaoliang (David) Wei
  2002-09-29  2:32 ` Ben Greear
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Gooch @ 2002-09-28 22:57 UTC (permalink / raw)
  To: netdev

  Hi, all. For a while now I've noticed poor performance with gige
cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
But now that the Intel E1000 cards are pretty cheap too, I've grabbed
a couple (part number: PWLA8390MT) and see the same problem. In fact,
the E1000 cards are no better than the Addtron cards. I'm using the
D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.

The basic test I do is to send 100 MB over a TCP connection from one
machine to the other. The results are:

Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s
Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s

This is quite a bit less than what gige is supposed to give. Is this
expected?
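
For concreteness, a minimal sketch of the kind of sender used in such a test
(push 100 MB through write(2) on a TCP socket and time it) might look like
the following; the address and port are placeholders, not details from this
thread, and this is an illustration rather than the actual test program:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *host = argc > 1 ? argv[1] : "192.168.1.2"; /* placeholder */
	size_t total = 100UL * 1024 * 1024;	/* 100 MB to send */
	size_t bufsize = 256 * 1024;		/* user-space buffer size */
	size_t sent = 0;
	char *buf = malloc(bufsize);
	struct sockaddr_in sin;
	struct timeval t0, t1;
	double secs;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0 || buf == NULL) {
		perror("setup");
		return 1;
	}
	memset(buf, 0xaa, bufsize);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(5001);		/* placeholder port */
	if (inet_pton(AF_INET, host, &sin.sin_addr) != 1 ||
	    connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		perror("connect");
		return 1;
	}

	gettimeofday(&t0, NULL);
	while (sent < total) {
		size_t chunk = total - sent < bufsize ? total - sent : bufsize;
		ssize_t n = write(fd, buf, chunk);	/* blocking send */

		if (n <= 0) {
			perror("write");
			return 1;
		}
		sent += n;
	}
	gettimeofday(&t1, NULL);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%.1f MB/s\n", sent / secs / (1024.0 * 1024.0));
	close(fd);
	free(buf);
	return 0;
}

(The receiver would just accept() a connection and read() into a similar
buffer until the sender closes the socket.)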

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-28 22:57 Poor gige performance with 2.4.20-pre* Richard Gooch
@ 2002-09-29  2:12 ` Xiaoliang (David) Wei
  2002-09-29  6:34   ` Richard Gooch
  2002-09-29  2:32 ` Ben Greear
  1 sibling, 1 reply; 10+ messages in thread
From: Xiaoliang (David) Wei @ 2002-09-29  2:12 UTC (permalink / raw)
  To: Richard Gooch, netdev

Hi,
    Did you run the experiments over a WAN or a LAN?  What are the other
configuration settings, such as the sending/receiving buffer sizes (I think
they should be larger than bandwidth*delay)?
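
    (As a rough illustration of that product, assuming a LAN-scale RTT of
about 0.2 ms: 1 Gbit/s is roughly 125 MB/s, so bandwidth*delay is only about
125 MB/s * 0.0002 s, i.e. around 25 KB; on long-delay paths the buffer is
far more likely to be the limiting factor.)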

>   Hi, all. For a while now I've noticed poor performance with gige
> cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
> of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
> But now that the Intel E1000 cards are pretty cheap too, I've grabbed
> a couple (part number: PWLA8390MT) and see the same problem. In fact,
> the E1000 cards are no better than the Addtron cards. I'm using the
> D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.
>
> The basic test I do is to send 100 MB over a TCP connection from one
> machine to the other. The results are:
>
> Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s
> Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s
>
> This is quite a bit less than what gige is supposed to give. Is this
> expected?
>
> Regards,
>
> Richard....
> Permanent: rgooch@atnf.csiro.au
> Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-28 22:57 Poor gige performance with 2.4.20-pre* Richard Gooch
  2002-09-29  2:12 ` Xiaoliang (David) Wei
@ 2002-09-29  2:32 ` Ben Greear
  2002-09-29 19:22   ` Richard Gooch
  1 sibling, 1 reply; 10+ messages in thread
From: Ben Greear @ 2002-09-29  2:32 UTC (permalink / raw)
  To: Richard Gooch; +Cc: netdev

Richard Gooch wrote:
>   Hi, all. For a while now I've noticed poor performance with gige
> cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
> of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
> But now that the Intel E1000 cards are pretty cheap too, I've grabbed
> a couple (part number: PWLA8390MT) and see the same problem. In fact,
> the E1000 cards are no better than the Addtron cards. I'm using the
> D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.

Machine: dual Athlon, 1.66 GHz, 64-bit/66 MHz PCI, 512 MB RAM,
   2 Intel PRO/1000 MT server NICs.
   Kernel: 2.4.20-pre7, pre8 (same behaviour)

I was able to send and receive 400 Mbps between two cards on the machine
simultaneously.  This is sustained over a period of time until the box
crashes after an hour or so :(

Using pktgen, I could generate 860 Mbps in one direction from one port
to another on the same machine (crashed after an hour or so here too).

Try setting TxDescriptors=4096 RxDescriptors=1024 when loading the
e1000 module; that helps tremendously with smaller packets.

I tried the e1000 driver in 2.5.38 on the machine; it ran at about 1/3 of
the speed, and crashed in under 5 minutes...

So, the performance could be better, but what is really killing me is
stability at this point...

> 
> The basic test I do is to send 100 MB over a TCP connection from one
> machine to the other. The results are:
> 
> Dual PIII 450 MHz -> Dual Athlon 1.6 GHz yields 58 MB/s
> Dual Athlon 1.6 GHz -> Dual PIII 450 MHz yields 23 MB/s
> 
> This is quite a bit less than what gige is supposed to give. Is this
> expected?
> 
> 				Regards,
> 
> 					Richard....
> Permanent: rgooch@atnf.csiro.au
> Current:   rgooch@ras.ucalgary.ca
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-29  2:12 ` Xiaoliang (David) Wei
@ 2002-09-29  6:34   ` Richard Gooch
  2002-09-30  0:45     ` Benjamin LaHaise
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Gooch @ 2002-09-29  6:34 UTC (permalink / raw)
  To: Xiaoliang (David) Wei; +Cc: netdev

Xiaoliang Wei writes:
> Hi,
>     Did you do the experiments on WAN or LAN? What's the other
> configurations, such as: The sending/receiving buffer(I think it
> should be larger than Bandwidth*Delay)?

This is all on a LAN (of course; expecting good performance from a WAN
is pretty futile). I use a buffer size of 256 KiB.

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-29  2:32 ` Ben Greear
@ 2002-09-29 19:22   ` Richard Gooch
  2002-09-29 19:32     ` Ben Greear
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Gooch @ 2002-09-29 19:22 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

Ben Greear writes:
> Richard Gooch wrote:
> >   Hi, all. For a while now I've noticed poor performance with gige
> > cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
> > of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
> > But now that the Intel E1000 cards are pretty cheap too, I've grabbed
> > a couple (part number: PWLA8390MT) and see the same problem. In fact,
> > the E1000 cards are no better than the Addtron cards. I'm using the
> > D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.
>
> Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the
> e1000 module, that helps tremendously when using smaller packets.

Didn't help at all. Just to summarise, I've got:
options e1000 TxDescriptors=4096 RxDescriptors=1024
net.ipv4.tcp_rmem = 262144 262144 262144
net.ipv4.tcp_wmem = 262144 262144 262144
MTU=1500
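
(For context, these are the usual places such settings live: the e1000 line
is a modules.conf-style option line, and the tcp_rmem/tcp_wmem values are
sysctls, e.g. applied with something like
sysctl -w net.ipv4.tcp_rmem="262144 262144 262144" or via /etc/sysctl.conf.)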

I'm doing read(2)/write(2) to/from a user-space buffer over a TCP
socket with 256 KiB buffer size.
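
A hedged sketch (not the poster's actual code) of what a 256 KiB socket
buffer setup typically looks like; note the kernel clamps explicit requests
to net.core.wmem_max / net.core.rmem_max:

#include <stdio.h>
#include <sys/socket.h>

/* Illustration only: ask for 256 KiB send/receive buffers on an
 * already-created TCP socket fd; the kernel may clamp the request. */
static int set_256k_buffers(int fd)
{
	int size = 256 * 1024;

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) < 0 ||
	    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) < 0) {
		perror("setsockopt");
		return -1;
	}
	return 0;
}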

Is the E1000 supposed to have hardware interrupt mitigation (thus
avoiding the need for NAPI)?

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-29 19:22   ` Richard Gooch
@ 2002-09-29 19:32     ` Ben Greear
  2002-09-29 20:54       ` Richard Gooch
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2002-09-29 19:32 UTC (permalink / raw)
  To: Richard Gooch; +Cc: netdev

Richard Gooch wrote:
> Ben Greear writes:
> 
>>Richard Gooch wrote:
>>
>>>  Hi, all. For a while now I've noticed poor performance with gige
>>>cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
>>>of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
>>>But now that the Intel E1000 cards are pretty cheap too, I've grabbed
>>>a couple (part number: PWLA8390MT) and see the same problem. In fact,
>>>the E1000 cards are no better than the Addtron cards. I'm using the
>>>D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.
>>
>>Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the
>>e1000 module, that helps tremendously when using smaller packets.
> 
> 
> Didn't help at all. Just to summarise, I've got:
> options e1000 TxDescriptors=4096 RxDescriptors=1024
> net.ipv4.tcp_rmem = 262144 262144 262144
> net.ipv4.tcp_wmem = 262144 262144 262144
> MTU=1500
> 
> I'm doing read(2)/write(2) to/from a user-space buffer over a TCP
> socket with 256 KiB buffer size.
> 
> Is the E1000 supposed to have hardware interrupt mitigation (thus
> avoiding the need for NAPI)?

NAPI did not greatly improve the performance I saw with larger packets,
but it did help with smaller (say, 60 byte) packets.

One other thing I saw with TCP connections: they started off slow, but after
a few seconds they were reaching their peak throughput.  How long are you
running your test?
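
(Rough scale, assuming standard slow start: the window roughly doubles each
round trip, so ramping from a couple of 1500-byte segments to a 256 KiB
window takes on the order of 7-8 RTTs, which is milliseconds on a LAN,
though delayed ACKs or any loss can stretch that considerably.)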

Ben

> 
> 				Regards,
> 
> 					Richard....
> Permanent: rgooch@atnf.csiro.au
> Current:   rgooch@ras.ucalgary.ca
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-29 19:32     ` Ben Greear
@ 2002-09-29 20:54       ` Richard Gooch
  2002-09-30 21:21         ` Jon Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Gooch @ 2002-09-29 20:54 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

Ben Greear writes:
> Richard Gooch wrote:
> > Ben Greear writes:
> > 
> >>Richard Gooch wrote:
> >>
> >>>  Hi, all. For a while now I've noticed poor performance with gige
> >>>cards under 2.4.19 and 2.4.20-pre*. At first I thought it was because
> >>>of the cheap-ass Addtron cards I bought (these use the ns83820 chip).
> >>>But now that the Intel E1000 cards are pretty cheap too, I've grabbed
> >>>a couple (part number: PWLA8390MT) and see the same problem. In fact,
> >>>the E1000 cards are no better than the Addtron cards. I'm using the
> >>>D-Link DGS-1008T 8-port gige switch. MTU=1500 bytes.
> >>
> >>Try setting the TxDescriptors=4096 RxDescriptors=1024 when loading the
> >>e1000 module, that helps tremendously when using smaller packets.
> > 
> > Didn't help at all. Just to summarise, I've got:
> > options e1000 TxDescriptors=4096 RxDescriptors=1024
> > net.ipv4.tcp_rmem = 262144 262144 262144
> > net.ipv4.tcp_wmem = 262144 262144 262144
> > MTU=1500
> > 
> > I'm doing read(2)/write(2) to/from a user-space buffer over a TCP
> > socket with 256 KiB buffer size.
> > 
> > Is the E1000 supposed to have hardware interrupt mitigation (thus
> > avoiding the need for NAPI)?
> 
> NAPI did not greatly improve the performance I saw with larger packets,
> but it did help with smaller (say, 60 byte) packets.

My packets should be 1500 bytes, or close to it.

> One other thing I saw with TCP connections: They started off slow,
> but after a few seconds they were reaching their peak throughput.
> How long are you running your test?

I normally send 100 MB, so that's around 2 seconds or more. Sending
1 GB doesn't change anything (other than the test taking 20 seconds or
more).

Oh, BTW: some possibly relevant config options:
CONFIG_M686=y
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_HIGHIO=y
CONFIG_SMP=y
CONFIG_E1000=m

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-29  6:34   ` Richard Gooch
@ 2002-09-30  0:45     ` Benjamin LaHaise
  2002-09-30  0:53       ` Richard Gooch
  0 siblings, 1 reply; 10+ messages in thread
From: Benjamin LaHaise @ 2002-09-30  0:45 UTC (permalink / raw)
  To: Richard Gooch; +Cc: Xiaoliang (David) Wei, netdev

On Sun, Sep 29, 2002 at 12:34:02AM -0600, Richard Gooch wrote:
> This is all on a LAN (of course; expecting good performance from a WAN
> is pretty futile). I use a buffer size of 256 KiB.

From my experience tuning on a 550 MHz P3 Xeon, you're better off using a
buffer size of 8-16 KB that stays in the L1 cache.  Of course, that was
without actually doing anything useful with the data being transferred.
Gige really does need a faster CPU in the GHz+ range.  As for ns83820,
it's a work in progress.  Some of the recent bugfixes may have reduced
performance, so it may need to be retuned.
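
(For scale: the Pentium III's L1 data cache is 16 KB, which is why an
8-16 KB working buffer can stay resident where a 256 KiB one cannot.)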

		-ben
-- 
GMS rules.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Poor gige performance with 2.4.20-pre*
  2002-09-30  0:45     ` Benjamin LaHaise
@ 2002-09-30  0:53       ` Richard Gooch
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Gooch @ 2002-09-30  0:53 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Xiaoliang (David) Wei, netdev

Benjamin LaHaise writes:
> On Sun, Sep 29, 2002 at 12:34:02AM -0600, Richard Gooch wrote:
> > This is all on a LAN (of course; expecting good performance from a WAN
> > is pretty futile). I use a buffer size of 256 KiB.
> 
> From my experience tuning on a 550MHz P3 Xeon, you're better off
> using a buffer size of 8-16KB that stays in the L1 cache.  Of
> course, that was without actually doing anything useful with the
> data being transferred.  Gige really does need a faster cpu in the
> ghz+ range.  As for ns83820, it's a work in progress.  Some of the
> recent bugfixes may have reduced performance, so it may need to be
> retuned.

Using an 8 KiB buffer reduces performance; 16 KiB is almost the same as
using 256 KiB.

				Regards,

					Richard....
Permanent: rgooch@atnf.csiro.au
Current:   rgooch@ras.ucalgary.ca

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: Poor gige performance with 2.4.20-pre*
  2002-09-29 20:54       ` Richard Gooch
@ 2002-09-30 21:21         ` Jon Fraser
  0 siblings, 0 replies; 10+ messages in thread
From: Jon Fraser @ 2002-09-30 21:21 UTC (permalink / raw)
  To: netdev



Hello,

	I'm new to this list, so please bear with me.


	I'm doing similar tests with gige and am seeing similar
issues.  I have two different but similar test machines, both running 2.4.18:

	Dell 1550
		dual 1 GHz PIII, 256 KB cache
		ServerWorks HE chipset
		Intel E1000, 82542 chipset

	embedded card
		dual 1.266 GHz PIII, 512 KB cache
		ServerWorks HE chipset
		embedded Intel E1000, 82543 chipset

We're using IXIA test gear to source/sink the packets.  The systems are
just ip-forwarding the traffic back out the same interface.  That is, we
have the gige set up with aliases so it is on two different nets.

I'm trying to find the bottlenecks in small-packet performance.  With large
packets we can exceed 900 Mbps on the embedded card, so that's not an issue.
The Dell 1550 seems to run out of bus bandwidth before reaching that level.

With 64-byte packets, we can achieve 250 kpps running dual processor.
This consumes about 65% of each CPU.  We can't go faster without dropping
a significant percentage of the packets.

If we run with the 82543 interrupts tied to a single processor, we can
achieve about 285 kpps, at which point we're using 95% of the single CPU.
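
(For reference, the usual way to pin a NIC's interrupt to one CPU on 2.4 is
to write a CPU mask to /proc/irq/<N>/smp_affinity, e.g.
echo 1 > /proc/irq/24/smp_affinity, where the IRQ number 24 is only an
example.)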

Running a uniprocessor kernel, we top out around 350 kpps.

There's nothing else running on the boxes.


I'm perplexed by a couple of issues.

The network performance of the SMP kernel with the gige bound to a single
processor is only about 80% of the UP kernel.  Is this typical?  Are the
causes of the performance degradation well known?


With the gige running on both processors, we get rather poor performance.
We can't even reach the same number of pps on two processors that we can
with one.  Using CPU performance measurement counters, we seem to reach a
point where there is as much time being spent doing cache invalidates as
there is doing real work.  All the queues and statistics are per-CPU in the
2.4.18 kernel.  Are there other known problems causing excessive cache
invalidates?  Are there any significant improvements in later kernels?


	Thanks in advance,

	Jon Fraser

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2002-09-30 21:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-28 22:57 Poor gige performance with 2.4.20-pre* Richard Gooch
2002-09-29  2:12 ` Xiaoliang (David) Wei
2002-09-29  6:34   ` Richard Gooch
2002-09-30  0:45     ` Benjamin LaHaise
2002-09-30  0:53       ` Richard Gooch
2002-09-29  2:32 ` Ben Greear
2002-09-29 19:22   ` Richard Gooch
2002-09-29 19:32     ` Ben Greear
2002-09-29 20:54       ` Richard Gooch
2002-09-30 21:21         ` Jon Fraser
