netdev.vger.kernel.org archive mirror
* bonding vs 802.3ad/Cisco EtherChannel link aggregation
@ 2002-09-12 18:39 Boris Protopopov
  2002-09-12 23:34 ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Boris Protopopov @ 2002-09-12 18:39 UTC (permalink / raw)
  To: linux-net, netdev

Hi, I have a couple of questions about bonding vs. link aggregation in GigE
space. I may be doing something wrong at the user level, or there might be
system-level issues I am not aware of. 

I am trying to increase the bandwidth of GigE connections between boxes in my
cluster by bonding or aggregating several GigE links together. The simplest
setup that I have is two boxes with dual GigE e1000 cards connected directly to
each other by crossover cables. The boxes are running Red Hat 7.3. I can
successfully bond the links; however, I do not see any increase in Netpipe
bandwidth compared to the case of a single GigE link. My CPU utilization is
around 20%, and I can get ~900 Mbits/sec over a single link (MTU 7500). With
bonding, CPU utilization is marginally higher, and there is no increase (or
even some decrease) in the Netpipe numbers. Looking at the ifconfig eth*
statistics, I can see that the traffic is distributed evenly between the
connections. The cards are plugged into a 100 MHz, 64-bit PCI-X slot. Memory
bandwidth seems to be sufficient (estimated with "stream" at ~1200 MB/s). Can
anybody offer an explanation? 

Another question is about link aggregation. Intel offers a "teaming" option
(with the driver and utils) for aggregating the e1000 cards in an EtherChannel-
or 802.3ad-compatible way. From the few docs I could find that have any
technical merit (the 802.3ad doc is on order), it appears that EtherChannel and
802.3ad impose some sort of Ethernet frame ordering restrictions on the traffic
spread across the aggregated links. So, in my case, I can see (ifconfig
statistics) that one link is always sending GigE frames, whereas the other link
is always receiving. Obviously, in this arrangement, Netpipe will not benefit
from the aggregation. A response that I got from Intel customer support
indicated that "this is how EtherChannel works". Can someone explain why there
are such ordering restrictions? If I am not mistaken, TCP/IP would handle
out-of-order frames transparently because it is designed to do so.
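
For illustration, here is a minimal sketch (in C, with made-up MAC addresses;
not Cisco's or Intel's actual algorithm) of the kind of frame distribution an
EtherChannel/802.3ad-style aggregator typically does: it hashes address fields
and takes the result modulo the number of links, so all frames of a given
"conversation" stay on one link (and therefore stay in order), but a single
pair of hosts also never uses more than one link per direction:

#include <stdio.h>

/* Pick a physical link for a frame by hashing its MAC addresses. */
static int pick_link(const unsigned char *src_mac,
                     const unsigned char *dst_mac, int nlinks)
{
    /* XOR the low byte of each address -- a common choice of hash */
    return (src_mac[5] ^ dst_mac[5]) % nlinks;
}

int main(void)
{
    unsigned char host_a[6] = { 0x00, 0x02, 0xb3, 0x10, 0x20, 0x30 };
    unsigned char host_b[6] = { 0x00, 0x02, 0xb3, 0x40, 0x50, 0x61 };

    /* Every frame from A to B lands on the same one of the two links. */
    printf("A->B always uses link %d of 2\n", pick_link(host_a, host_b, 2));
    return 0;
}

With only two hosts there is only one address pair, so each direction ends up
pinned to a single port, which matches the ifconfig counters described above.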

Thanks in advance for your help,
Boris Protopopov.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-12 18:39 bonding vs 802.3ad/Cisco EtherChannel link aggregation Boris Protopopov
@ 2002-09-12 23:34 ` David S. Miller
  2002-09-13 14:29   ` Chris Friesen
  0 siblings, 1 reply; 19+ messages in thread
From: David S. Miller @ 2002-09-12 23:34 UTC (permalink / raw)
  To: borisp; +Cc: linux-net, netdev


Bonding does not help with single stream performance.
You have to have multiple apps generating multiple streams
of data before you'll realize any improvement.
Therefore netpipe is a bad test for what you're doing.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: bonding vs 802.3ad/Cisco EtherChannel link aggregation
@ 2002-09-13  1:30 Feldman, Scott
  2002-09-13 14:50 ` Boris Protopopov
  0 siblings, 1 reply; 19+ messages in thread
From: Feldman, Scott @ 2002-09-13  1:30 UTC (permalink / raw)
  To: 'David S. Miller', borisp; +Cc: linux-net, netdev

> Bonding does not help with single stream performance.
> You have to have multiple apps generating multiple streams
> of data before you'll realize any improvement.
> Therefore netpipe is a bad test for what you're doing.

The analogy that I like is to imagine a culvert under your driveway and you
want to fill up the ditch on the other side, so you stick your garden hose
in the culvert.  The rate of water flow is good, but you're just not
utilizing the volume (bandwidth) of the culvert.  So you stick your
neighbor's garden hose in the culvert.  And so on.  Now the ditch is filling
up.

So stick a bunch of garden hoses (streams) into that culvert (gigabit) and
flood it to the point of saturation, and now measure the efficiency of the
system (CPU %). How much CPU is left to do other useful work?  The lower the
CPU utilization, the higher the efficiency of the system.

Ok, the analogy is corny, and it doesn't have anything to do with bonding,
but you'd be surprised how often this question comes up.

-scott

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-12 23:34 ` David S. Miller
@ 2002-09-13 14:29   ` Chris Friesen
  2002-09-13 22:22     ` Cacophonix
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Friesen @ 2002-09-13 14:29 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-net, netdev

"David S. Miller" wrote:
> 
> Bonding does not help with single stream performance.
> You have to have multiple apps generating multiple streams
> of data before you'll realize any improvement.
> Therefore netpipe is a bad test for what you're doing.

This has always confused me.  Why doesn't the bonding driver try and spread all the traffic over all
the links?  This would allow for a single stream to use the full aggregate bandwidth of all bonded
links.

Chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-13  1:30 Feldman, Scott
@ 2002-09-13 14:50 ` Boris Protopopov
  0 siblings, 0 replies; 19+ messages in thread
From: Boris Protopopov @ 2002-09-13 14:50 UTC (permalink / raw)
  To: Feldman, Scott; +Cc: 'David S. Miller', linux-net, netdev

Hi Scott, David, thanks for your replies.
I agree that it seems that bonding does not help with a single connection (btw,
is this a single TCP connection, or an ordered pair of IP addresses, or an
ordered pair of MAC addresses?). My question is: why? 

Scott, I am not sure I understand your analogy. It seems that you suggest that I
cannot saturate GigE with a single hose (a single Netpipe connection, I am
guessing). My experiments, however, indicate that I can - I get ~900 Mbits/sec
out of 1000 Mbits/sec with 20% CPU utilization when I use a single Netpipe
connection. I am guessing the ~10% underutilization is caused by the protocol
overheads (headers/trailers). I added another GigE link, bonded it with the
first one, and repeated the experiment. I saw the Ethernet frames evenly
distributed between the two links (ifconfig statistics), which, in my opinion,
indicated that a single Netpipe connection was in fact using both links. The CPU
utilization rose a bit, but it was nowhere near even 50%. The memory subsystem
seems to be able to handle over 1000 MB/s, and the peripheral bus is 100 MHz,
64-bit PCI-X (good for 800 MB/s). So, obviously, there is a bottleneck
somewhere, but I don't understand where. I think my CPU/memory/I/O hose is big
enough to fill up two GigE links (~200 MB/s); however, this is not
happening. My question is: why?
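
For reference, a quick back-of-envelope check of the numbers above (illustrative
C only; the exact figures depend on the chipset and on how many times each
payload byte actually crosses the bus):

#include <stdio.h>

int main(void)
{
    /* 64-bit PCI-X at 100 MHz moves 8 bytes per clock */
    double pcix_MBps = 100e6 * 8 / 1e6;

    /* Two GigE links in one direction: 2 x 125 MB/s of raw line rate */
    double two_gige_MBps = 2 * (1000e6 / 8) / 1e6;

    /* Framing cost per frame at MTU 7500: 14-byte Ethernet header, 4-byte
     * FCS, 8-byte preamble and 12-byte inter-frame gap on the wire, plus
     * 40 bytes of IP+TCP headers inside the 7500-byte IP packet. */
    double wire    = 7500 + 14 + 4 + 8 + 12;
    double payload = 7500 - 40;

    printf("PCI-X 64/100 raw bandwidth    : %.0f MB/s\n", pcix_MBps);
    printf("Two GigE links, one direction : %.0f MB/s\n", two_gige_MBps);
    printf("TCP goodput share at MTU 7500 : %.1f%% of line rate\n",
           100.0 * payload / wire);
    return 0;
}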

I know why this happens when I use teaming: because, on each side, one GigE
link is dedicated to sending, and the other one to receiving (again, I use
ifconfig statistics to see what's happening). According to what I was told, this
is because 802.3ad and EtherChannel insist that all GigE frames that belong to a
"conversation" (I have been unable to find a technical definition of
"conversation" so far) are delivered in order. I would like to understand what a
"conversation" is, and why TCP/IP could not handle out-of-order delivery in its
usual manner. I am guessing it is because it would be too slow or would incur
too much CPU overhead. I was wondering if someone knew for sure :)

Thanks again, hope to get more opinions (it seems I am not alone :)),
Boris.


"Feldman, Scott" wrote:
> 
> > Bonding does not help with single stream performance.
> > You have to have multiple apps generating multiple streams
> > of data before you'll realize any improvement.
> > Therefore netpipe is a bad test for what you're doing.
> 
> The analogy that I like is to imagine a culvert under your driveway and you
> want to fill up the ditch on the other side, so you stick your garden hose
> in the culvert.  The rate of water flow is good, but you're just not
> utilizing the volume (bandwidth) of the culvert.  So you stick your
> neighbors garden hose in the culvert.  And so on.  Now the ditch is filling
> up.
> 
> So stick a bunch of garden hoses (streams) into that culvert (gigabit) and
> flood it to the point of saturation, and now measure the efficiency of the
> system (CPU %). How much CPU is left to do other useful work?  The lower the
> CPU utilization, the higher the efficiency of the system.
> 
> Ok, the analogy is corny, and it doesn't have anything to do with bonding,
> but you'd be surprised how often this question comes up.
> 
> -scott

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-13 14:29   ` Chris Friesen
@ 2002-09-13 22:22     ` Cacophonix
  2002-09-16 13:23       ` Chris Friesen
  0 siblings, 1 reply; 19+ messages in thread
From: Cacophonix @ 2002-09-13 22:22 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-net, netdev


--- Chris Friesen <cfriesen@nortelnetworks.com> wrote:
> "David S. Miller" wrote:
> > 
> > Bonding does not help with single stream performance.
> > You have to have multiple apps generating multiple streams
> > of data before you'll realize any improvement.
> > Therefore netpipe is a bad test for what you're doing.
> 
> This has always confused me.  Why doesn't the bonding driver try and spread all the
> traffic over all
> the links?  This would allow for a single stream to use the full aggregate bandwidth of
> all bonded
> links.

Because then you risk heavy packet reordering within an individual flow,
which can be detrimental in some cases.
--karthik



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-13 22:22     ` Cacophonix
@ 2002-09-16 13:23       ` Chris Friesen
  2002-09-16 16:09         ` Ben Greear
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Friesen @ 2002-09-16 13:23 UTC (permalink / raw)
  To: Cacophonix; +Cc: linux-net, netdev

Cacophonix wrote:
> 
> --- Chris Friesen <cfriesen@nortelnetworks.com> wrote:

> > This has always confused me.  Why doesn't the bonding driver try and spread
> > all the traffic over all the links?
> 
> Because then you risk heavy packet reordering within an individual flow,
> which can be detrimental in some cases.
> --karthik

I can see how it could make the receiving host work more on reassembly, but if throughput is key,
wouldn't you still end up better off if you could push twice as many packets through the pipe?

Chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 13:23       ` Chris Friesen
@ 2002-09-16 16:09         ` Ben Greear
  2002-09-16 19:55           ` David S. Miller
  2002-09-17 10:16           ` jamal
  0 siblings, 2 replies; 19+ messages in thread
From: Ben Greear @ 2002-09-16 16:09 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Cacophonix, linux-net, netdev

Chris Friesen wrote:
> Cacophonix wrote:
> 
>>--- Chris Friesen <cfriesen@nortelnetworks.com> wrote:
> 
> 
>>>This has always confused me.  Why doesn't the bonding driver try and spread
>>>all the traffic over all the links?
>>
>>Because then you risk heavy packet reordering within an individual flow,
>>which can be detrimental in some cases.
>>--karthik
> 
> 
> I can see how it could make the receiving host work more on reassembly, but if throughput is key,
> wouldn't you still end up better if you can push twice as many packets through the pipe?
> 
> Chris

Also, I notice lots of out-of-order packets on a single gigE link when running at high
speeds (SMP machine), so the kernel is still having to reorder quite a few packets.
Has anyone done any tests to see how much worse it is with dual-port bonding?

NAPI helps my problem, but does not make it go away entirely.

Ben

> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 16:09         ` Ben Greear
@ 2002-09-16 19:55           ` David S. Miller
  2002-09-16 21:10             ` Chris Friesen
  2002-09-17 10:16           ` jamal
  1 sibling, 1 reply; 19+ messages in thread
From: David S. Miller @ 2002-09-16 19:55 UTC (permalink / raw)
  To: greearb; +Cc: cfriesen, cacophonix, linux-net, netdev

   From: Ben Greear <greearb@candelatech.com>
   Date: Mon, 16 Sep 2002 09:09:42 -0700
   
   NAPI helps my problem, but does not make it go away entirely.

There is a lot of logic in our TCP input, btw, that notices packet
reordering on the receive side and acts accordingly (i.e., it does not
fire off fast retransmits or back off prematurely when reordering
is detected).
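
For illustration, a toy model of that idea (made-up names and structure, not
the kernel's actual code): the sender's duplicate-ACK threshold grows with the
largest reordering distance it has observed, so reordered segments are not
immediately treated as losses:

#include <stdio.h>

struct toy_tcp {
    int dupacks;      /* duplicate ACKs seen for the current hole    */
    int reordering;   /* largest reordering distance observed so far */
};

static int should_fast_retransmit(const struct toy_tcp *tp)
{
    int dupthresh = tp->reordering > 3 ? tp->reordering : 3;
    return tp->dupacks >= dupthresh;
}

static void note_reordering(struct toy_tcp *tp, int distance)
{
    if (distance > tp->reordering)
        tp->reordering = distance;   /* raise the dupack threshold */
}

int main(void)
{
    struct toy_tcp tp = { 3, 3 };    /* 3 dupacks, default threshold */

    printf("no reordering seen -> retransmit? %d\n",
           should_fast_retransmit(&tp));

    note_reordering(&tp, 7);         /* segments seen 7 places out of order */
    printf("reordering of 7    -> retransmit? %d\n",
           should_fast_retransmit(&tp));
    return 0;
}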

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: bonding vs 802.3ad/Cisco EtherChannel link aggregation
@ 2002-09-16 20:12 Yan-Fa Li
  0 siblings, 0 replies; 19+ messages in thread
From: Yan-Fa Li @ 2002-09-16 20:12 UTC (permalink / raw)
  To: 'Ben Greear', Chris Friesen; +Cc: Cacophonix, linux-net, netdev

I had similar problems with NAPI and DL2K.  I was only able to "resolve" the
issue by forcing my application and the NIC to a single CPU using CPU affinity
hacks.
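
For reference, one shape such an affinity hack can take (the IRQ number below
is only an example, not a value from this setup): pin the NIC's interrupt to
one CPU by writing a mask to /proc/irq/<N>/smp_affinity, and bind the
application to the same CPU with whatever scheduler-affinity tool or patch is
available:

#include <stdio.h>

int main(void)
{
    const char *path = "/proc/irq/24/smp_affinity";   /* the NIC's IRQ */
    FILE *f = fopen(path, "w");

    if (!f) {
        perror(path);
        return 1;
    }
    fputs("1\n", f);    /* bitmask 0x1: deliver this IRQ to CPU 0 only */
    fclose(f);
    return 0;
}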

-----Original Message-----
From: Ben Greear [mailto:greearb@candelatech.com] 
Sent: Monday, September 16, 2002 9:10 AM
To: Chris Friesen
Cc: Cacophonix; linux-net@vger.kernel.org; netdev@oss.sgi.com
Subject: Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation


Chris Friesen wrote:
> Cacophonix wrote:
> 
>>--- Chris Friesen <cfriesen@nortelnetworks.com> wrote:
> 
> 
>>>This has always confused me.  Why doesn't the bonding driver try and 
>>>spread all the traffic over all the links?
>>
>>Because then you risk heavy packet reordering within an individual 
>>flow, which can be detrimental in some cases. --karthik
> 
> 
> I can see how it could make the receiving host work more on 
> reassembly, but if throughput is key, wouldn't you still end up better 
> if you can push twice as many packets through the pipe?
> 
> Chris

Also, I notice lots of out-of-order packets on a single gigE link when
running at high speeds (SMP machine), so the kernel is still having to
reorder quite a few packets. Has anyone done any tests to see how much worse
it is with dual-port bonding?

NAPI helps my problem, but does not make it go away entirely.

Ben

> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 21:10             ` Chris Friesen
@ 2002-09-16 21:04               ` David S. Miller
  2002-09-16 21:22                 ` Chris Friesen
  0 siblings, 1 reply; 19+ messages in thread
From: David S. Miller @ 2002-09-16 21:04 UTC (permalink / raw)
  To: cfriesen; +Cc: greearb, cacophonix, linux-net, netdev

   From: Chris Friesen <cfriesen@nortelnetworks.com>
   Date: Mon, 16 Sep 2002 17:10:06 -0400
   
   Okay, that makes me even more curious why we don't send successive
   packets out successive pipes in a bonded link.

This is not done because it leads to packet reordering, which,
if bad enough, can trigger retransmits.

Scott Feldman's posting mentioned this, as did one other, I
think.

The same flow (which in this context means a TCP connection) must
go over the same link to avoid packet reordering at the receiver.
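
For illustration, a minimal sketch of what a flow-keyed transmit policy looks
like (made-up hash and addresses, not the bonding driver's actual code): every
segment of a given TCP connection hashes to the same slave, while different
connections can still spread across slaves:

#include <stdio.h>
#include <stdint.h>

static int slave_for_flow(uint32_t saddr, uint32_t daddr,
                          uint16_t sport, uint16_t dport, int nslaves)
{
    uint32_t h = saddr ^ daddr ^ ((uint32_t)sport << 16 | dport);
    return h % nslaves;
}

int main(void)
{
    /* two made-up connections between the same pair of hosts */
    uint32_t a = 0xc0a80001, b = 0xc0a80002;   /* 192.168.0.1 and .2 */

    printf("connection 1 -> slave %d\n", slave_for_flow(a, b, 32768, 5001, 2));
    printf("connection 2 -> slave %d\n", slave_for_flow(a, b, 32769, 5002, 2));
    return 0;
}

A single-connection test like Netpipe therefore only ever exercises one slave
per direction under such a policy.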

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 19:55           ` David S. Miller
@ 2002-09-16 21:10             ` Chris Friesen
  2002-09-16 21:04               ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Friesen @ 2002-09-16 21:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: greearb, cacophonix, linux-net, netdev

"David S. Miller" wrote:

> There is a lot of logic in our TCP input btw that notices packet
> reordering on the receive side and acts accordingly (ie. it does not
> fire off fast retransmits or backoff prematurely when reordering
> is detected)

Okay, that makes me even more curious why we don't send successive packets out successive pipes in a
bonded link.

It seems to me that when sending packets out a bonded link, we should scan for the next device that
has an opening on its queue (perhaps also taking into account link speed, etc.) so as to try to keep
the entire aggregate link working at max capacity.

Does this not make sense in the real world for some reason?  Maybe a config option
CONFIG_MAX_BONDED_THROUGHPUT to allow a single stream to take up the entire aggregate link even if
it makes the receiver do more work?

Chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 21:22                 ` Chris Friesen
@ 2002-09-16 21:17                   ` David S. Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David S. Miller @ 2002-09-16 21:17 UTC (permalink / raw)
  To: cfriesen; +Cc: greearb, cacophonix, linux-net, netdev

   From: Chris Friesen <cfriesen@nortelnetworks.com>
   Date: Mon, 16 Sep 2002 17:22:30 -0400

   I did see those posts, but then I saw yours on how the linux
   receive end does the right thing with regards to reordering, and
   that confused me.

It's a last-ditch effort to keep receive performance on the TCP
connection reasonable; it is not meant to be a normal mode
of operation.  The reordering detection in the TCP stack does
not get you back to optimal receive performance; it simply can't.

For one thing, all of the fast paths in the TCP input paths require
that the packets arrive in order.

If bonding did what you suggest, reordering would become the norm
and that isn't going to be good for TCP input performance whether
we have reordering detection logic or not.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 21:04               ` David S. Miller
@ 2002-09-16 21:22                 ` Chris Friesen
  2002-09-16 21:17                   ` David S. Miller
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Friesen @ 2002-09-16 21:22 UTC (permalink / raw)
  To: David S. Miller; +Cc: greearb, cacophonix, linux-net, netdev

"David S. Miller" wrote:
>    From: Chris Friesen <cfriesen@nortelnetworks.com>
> 
>    Okay, that makes me even more curious why we don't send successive
>    packets out successive pipes in a bonded link.
> 
> This is not done because it leads to packet reordering which
> if bad enough can trigger retransmits.
> 
> Scott Feldman's posting mentioned this, as did one other I
> think.

I did see those posts, but then I saw yours on how the Linux receive end does the right thing with
regard to reordering, and that confused me.

So if I have it right, Linux-to-Linux could theoretically work okay with a single stream over multiple
links (potentially causing lots of reordering), but Linux-to-router would not work well.


Chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-16 16:09         ` Ben Greear
  2002-09-16 19:55           ` David S. Miller
@ 2002-09-17 10:16           ` jamal
  2002-09-17 16:43             ` Ben Greear
  1 sibling, 1 reply; 19+ messages in thread
From: jamal @ 2002-09-17 10:16 UTC (permalink / raw)
  To: Ben Greear; +Cc: Chris Friesen, Cacophonix, linux-net, netdev



On Mon, 16 Sep 2002, Ben Greear wrote:

> Also, I notice lots of out-of-order packets on a single gigE link when running at high
> speeds (SMP machine), so the kernel is still having to reorder quite a few packets.
> Has anyone done any tests to see how much worse it is with dual-port bonding?

It will depend on the scheduling algorithm used.
Always remember that reordering is BAD for TCP and you shall be fine.
Typically for TCP you want to run a single flow over a single NIC.
If you are running some UDP-type control app in a cluster environment
where ordering is a non-issue, then you could maximize throughput
by sending as fast as you can on all interfaces.

>
> NAPI helps my problem, but does not make it go away entirely.
>

Could you be more specific as to where you see reordering with NAPI?
Please don't disappear. What is it that you see that makes you believe
there's reordering with NAPI, i.e., describe your test setup, etc.

cheers,
jamal


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-17 10:16           ` jamal
@ 2002-09-17 16:43             ` Ben Greear
  2002-09-18  1:07               ` jamal
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Greear @ 2002-09-17 16:43 UTC (permalink / raw)
  To: jamal; +Cc: Chris Friesen, Cacophonix, linux-net, netdev

jamal wrote:
> 
> On Mon, 16 Sep 2002, Ben Greear wrote:
> 
> 
>>Also, I notice lots of out-of-order packets on a single gigE link when running at high
>>speeds (SMP machine), so the kernel is still having to reorder quite a few packets.
>>Has anyone done any tests to see how much worse it is with dual-port bonding?
> 
> 
> It will depend on the scheduling algorithm used.
> Always remember that reordering is BAD for TCP and you shall be fine.
> Typical for TCP you want to run a single flow onto a single NIC.
> If you are running some UDP type control app in a cluster environment
> where ordering is a non-issue then you could maximize throughput
> by sending as fast as you can on all interfaces.
> 
> 
>>NAPI helps my problem, but does not make it go away entirely.
>>
> 
> 
> Could you be more specific as to where you see reordering with NAPI?
> Please don't disappear. What is it that you see that makes you believe
> there's reordering with NAPI, i.e., describe your test setup, etc.

I have a program that sends and receives UDP packets with 32-bit sequence
numbers.  I can detect OOO packets if they fit into the last 10 packets
received.  If they are farther out of order than that, the code treats
them as dropped....
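
For illustration, a simplified version of that accounting (a sketch, not the
actual program): packets that arrive late but within the last 10 sequence
numbers count as out-of-order, anything older stays counted as dropped:

#include <stdio.h>
#include <stdint.h>

#define OOO_WINDOW 10

struct stats {
    uint32_t next_seq;            /* next sequence number we expect */
    unsigned long ooo, dropped;
};

static void account(struct stats *st, uint32_t seq)
{
    if (seq == st->next_seq) {
        st->next_seq++;
    } else if (seq > st->next_seq) {
        /* a gap: everything skipped is provisionally counted as dropped */
        st->dropped += seq - st->next_seq;
        st->next_seq = seq + 1;
    } else if (st->next_seq - seq <= OOO_WINDOW) {
        st->ooo++;                /* late, but within the 10-packet window */
        if (st->dropped)
            st->dropped--;        /* it was not really lost after all */
    }
    /* older than the window: leave it counted as dropped */
}

int main(void)
{
    struct stats st = { 0, 0, 0 };
    uint32_t trace[] = { 0, 1, 3, 2, 4, 5, 20, 6 };   /* made-up arrivals */
    unsigned int i;

    for (i = 0; i < sizeof(trace) / sizeof(trace[0]); i++)
        account(&st, trace[i]);
    printf("out-of-order: %lu, dropped: %lu\n", st.ooo, st.dropped);
    return 0;
}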

I used smp_affinity to tie a NIC to a CPU, and the NAPI patch for 2.4.20-pre7.

When sending and receiving 250Mbps of UDP/IP traffic (sending over a crossover
cable to the other NIC on the same machine), I see the occasional OOO packet.  I also
see bogus dropped packets, which means sometimes the order is off by 10 or more
packets....

The other fun thing about this setup is that after running around 65 billion bytes
with this test, the machine crashes with an oops.  I can repeat this at will, but so far only on
this dual-proc machine, and only using the e1000 driver (NAPI or regular).  Soon, I'll
test with a 4-port tulip NIC to see if I can take the e1000 out of the equation...

I can also repeat this at slower speeds (50Mbps send & receive), and using
the pktgen tool.  If anyone else is running high sustained SEND & RECEIVE traffic,
I would be interested to know about their stability!

I have had a single-proc machine running the exact same (SMP) kernel as the dual-proc
machine, but using the tulip driver, and it has run solid for 2 days, sending &
receiving over 1.3 trillion bytes :)

Thanks,
Ben




> 
> cheers,
> jamal
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-17 16:43             ` Ben Greear
@ 2002-09-18  1:07               ` jamal
  2002-09-18  4:06                 ` Ben Greear
  0 siblings, 1 reply; 19+ messages in thread
From: jamal @ 2002-09-18  1:07 UTC (permalink / raw)
  To: Ben Greear; +Cc: Chris Friesen, Cacophonix, linux-net, netdev



On Tue, 17 Sep 2002, Ben Greear wrote:

> I have a program that sends and receives UDP packets with 32-bit sequence
> numbers.  I can detect OOO packets if they fit into the last 10 packets
> received.  If they are farther out of order than that, the code treats
> them as dropped....
>

So let me understand this:
You have a packet going out eth0 looped back to eth0 and you are seeing
reordering? What sort of things are happening from departure at the UDP socket
to arrival on the other side? Are you doing anything funky yourself, or
is it all kernel?

> I used smp_affinity to tie a NIC to a CPU, and the NAPI patch for 2.4.20-pre7.

Did you get reordering with affinity?

>
> When sending and receiving 250Mbps of UDP/IP traffic (sending over cross-over
> cable to other NIC on same machine), I see the occasional OOO packet.  I also
> see bogus dropped packets, which means sometimes the order is off by 10 or more
> packets....

A lot of shit is happening at that rate, in particular with the PCI bus. If
I understood correctly, a packet crosses the bus about 4 times?

>
> The other fun thing about this setup is that after running around 65 billion bytes
> with this test, the machine crashes with an OOPs.

No clue what's going on - probably a race somewhere, and I can't help since
I don't have this NIC, but if you get Robert excited he might be able to
help you.

> I can repeat this at will, but so far only on
> this dual-proc machine, and only using the e1000 driver (NAPI or regular).  Soon, I'll
> test with a 4-port tulip NIC to see if I can take the e1000 out of the equation...
>

I doubt you'll see it there.

cheers,
jamal

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-18  1:07               ` jamal
@ 2002-09-18  4:06                 ` Ben Greear
  2002-09-18 11:48                   ` jamal
  0 siblings, 1 reply; 19+ messages in thread
From: Ben Greear @ 2002-09-18  4:06 UTC (permalink / raw)
  To: jamal; +Cc: Chris Friesen, Cacophonix, linux-net, netdev

jamal wrote:
> 
> On Tue, 17 Sep 2002, Ben Greear wrote:
> 
> 
>>I have a program that sends and receives UDP packets with 32-bit sequence
>>numbers.  I can detect OOO packets if they fit into the last 10 packets
>>received.  If they are farther out of order than that, the code treats
>>them as dropped....
>>
> 
> 
> So let me understand this:
> You have a packet going out eth0 looped back to eth0 and you are seeing

I go out eth2 and come in eth3, both e1000 copper NICs.

> reordering? What sort of things are happening from departure at udp socket
> to arrival on the other side? Are you doing anything funky yourself or
> it is all kernel?

I see reordering on regular old socket calls from user space, and I see
the same thing with pktgen packets as well (which clears the stack of fault).

For user space, I am binding to source IP, setting SO_BINDTODEVICE, and O_NONBLOCK.
Nothing overly weird I believe.

pktgen has its own ways; basically it grabs the packets in the dev.c receive method,
near where the bridging code grabs its packets...

I see no reordering at all with tulip 4-port NICs and single-processor machines.

> 
> 
>>I used smp_affinity to tie a NIC to a CPU, and the NAPI patch for 2.4.20-pre7.
> 
> 
> Did you get reordering with affinity?

Yes.  I'm not sure I tried affinity w/out NAPI though.  I definitely tried
it with NAPI and saw reordering.

> 
> 
>>When sending and receiving 250Mbps of UDP/IP traffic (sending over cross-over
>>cable to other NIC on same machine), I see the occasional OOO packet.  I also
>>see bogus dropped packets, which means sometimes the order is off by 10 or more
>>packets....
> 
> 
> A lot of shit is happening at that rate in particular with PCI bus. If
> i understood correctly a packet crosses the bus about 4 times?

Two times I believe, but I'm sending in both directions, so there is about
1Gbps across the bus, not even counting the UDP, IP, and Ethernet headers.

> 
> 
>>The other fun thing about this setup is that after running around 65 billion bytes
>>with this test, the machine crashes with an OOPs.
> 
> 
> No clue what's going on - probably a race somewhere, and I can't help since
> I don't have this NIC, but if you get Robert excited he might be able to
> help you.

He seems moderately interested, suggested I try the one in 2.5.[latest].
I'll be able to do that next week, assuming it's trivial to compile it
for 2.4.19....

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: bonding vs 802.3ad/Cisco EtherChannel link aggregation
  2002-09-18  4:06                 ` Ben Greear
@ 2002-09-18 11:48                   ` jamal
  0 siblings, 0 replies; 19+ messages in thread
From: jamal @ 2002-09-18 11:48 UTC (permalink / raw)
  To: Ben Greear; +Cc: Chris Friesen, Cacophonix, linux-net, netdev



On Tue, 17 Sep 2002, Ben Greear wrote:

> I see reordering on regular old socket calls from user space, and I see
> the same thing with pktgen packets as well (which clears the stack of fault).
>
> > Did you get reordering with affinity?
>
> Yes.  I'm not sure I tried affinity w/out NAPI though.  I definitely tried
> it with NAPI and saw reordering.
>

Ok, now it is getting interesting. I would start to be suspicious
about either the hardware or the driver.
Please try without NAPI and affinity and post what you see (not that this
result matters much, but it should help confirm things).

cheers,
jamal

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2002-09-18 11:48 UTC | newest]

Thread overview: 19+ messages
2002-09-12 18:39 bonding vs 802.3ad/Cisco EtherChannel link aggregation Boris Protopopov
2002-09-12 23:34 ` David S. Miller
2002-09-13 14:29   ` Chris Friesen
2002-09-13 22:22     ` Cacophonix
2002-09-16 13:23       ` Chris Friesen
2002-09-16 16:09         ` Ben Greear
2002-09-16 19:55           ` David S. Miller
2002-09-16 21:10             ` Chris Friesen
2002-09-16 21:04               ` David S. Miller
2002-09-16 21:22                 ` Chris Friesen
2002-09-16 21:17                   ` David S. Miller
2002-09-17 10:16           ` jamal
2002-09-17 16:43             ` Ben Greear
2002-09-18  1:07               ` jamal
2002-09-18  4:06                 ` Ben Greear
2002-09-18 11:48                   ` jamal
  -- strict thread matches above, loose matches on Subject: below --
2002-09-13  1:30 Feldman, Scott
2002-09-13 14:50 ` Boris Protopopov
2002-09-16 20:12 Yan-Fa Li
