netdev.vger.kernel.org archive mirror
From: "Chris Friesen" <cfriesen@nortel.com>
To: Andy Gospodarek <andy@greyhouse.net>
Cc: netdev@vger.kernel.org, bonding-devel@lists.sourceforge.net,
	fubar@us.ibm.com, ctindel@users.sourceforge.net
Subject: Re: [Bonding-devel] quick help with bonding?
Date: Thu, 29 Mar 2007 16:08:47 -0600
Message-ID: <460C38EF.1080509@nortel.com>
In-Reply-To: <20070329181617.GA25770@gospo.rdu.redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2054 bytes --]

Andy Gospodarek wrote:

> Can you elaborate on what isn't going well with this driver/hardware?  

I have a ppc64 blade running a customized 2.6.10 kernel.  At init time,
two of our gigE links (eth4 and eth5) are bonded together to form bond0.
The bond has an MTU of 9000 and uses ARP monitoring.  We're using an
ethernet driver with a modified RX path for jumbo frames[1].  With the
stock driver, bonding seems to work fine.

The problem is that eth5 seems to be bouncing up and down every 15 sec
or so (see the attached log excerpt).  Also, "ifconfig" shows that only
3 packets totalling 250 bytes have gone out eth5, even though I know
that the ARP monitoring code in the bond layer is sending 10 ARPs/sec
out the link.


eth5      Link encap:Ethernet  HWaddr 00:03:CC:51:01:3E
           inet6 addr: fe80::203:ccff:fe51:13e/64 Scope:Link
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:9000  Metric:1
           RX packets:119325 errors:90283 dropped:90283 overruns:90283 frame:0
           TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:8978310 (8.5 MiB)  TX bytes:250 (250.0 b)
           Base address:0x3840 Memory:92220000-92240000
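
As a rough sanity check on the TX counter above, here is a minimal sketch
of how many ARP probes one would expect on the link.  It assumes one probe
per monitoring interval (the 100 ms interval comes from the attached log)
and, purely as an assumption, that the counters were read about 98 seconds
after the bond came up, which is the span covered by the log excerpt.  The
function name is made up for illustration.

```python
def expected_arp_probes(arp_interval_ms: int, elapsed_s: float) -> int:
    """Lower bound on probes sent, assuming one ARP request per interval."""
    return int(elapsed_s * 1000) // arp_interval_ms


# 100 ms interval -> 10 probes per second.
per_second = expected_arp_probes(100, 1)

# Assumed observation window: 20:54:08 to 20:55:46 in the log, i.e. 98 s.
over_log_window = expected_arp_probes(100, 98)

print(per_second, over_log_window)  # 10 980
```

Under those assumptions the link should have carried on the order of 980
probes, against the 3 TX packets that ifconfig reports.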


I had initially suspected that it might be due to the "u32 jiffies" 
stuff in bonding.h, but changing that doesn't seem to fix the issue.
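
For context on why a 32-bit jiffies field looks suspicious on a 64-bit
machine, here is a generic sketch.  It does not reproduce the bonding
code; the helper names are hypothetical and the tick values are made up.
It only shows what happens when a 64-bit tick count is stored in a u32
and later subtracted from the full-width counter.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def store_u32(ticks: int) -> int:
    """Keep only the low 32 bits, as assigning to a u32 field would."""
    return ticks & U32_MASK


def ticks_since(now: int, stamp: int) -> int:
    """Unsigned 64-bit subtraction, like C arithmetic on unsigned long."""
    return (now - stamp) & U64_MASK


# Hypothetical tick values just past the 32-bit boundary.
stamp = (1 << 32) + 40
now = (1 << 32) + 50

full_width = ticks_since(now, stamp)             # 10 ticks elapsed
truncated = ticks_since(now, store_u32(stamp))   # off by 2**32

print(full_width, truncated)
```

With the full-width stamp the elapsed time is 10 ticks; with the truncated
stamp it comes out as 2**32 + 10, which any timeout comparison would treat
as a very stale timestamp.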

If I boot the system and then log in and manually create the bond link 
(rather than it happening at init time) then I don't see the problem.

If it matters at all, normally the system boots from eth4.  I'm going to 
try booting from eth6 and see if the problem still occurs.


Chris




[1] I'm not sure if I'm supposed to mention the specific driver, as it
hasn't been officially released yet, so I'll keep this high-level.
Normally for jumbo frames you need to allocate a large physically
contiguous buffer.  With the modified driver, rather than receiving into
a contiguous buffer, the incoming packet is split across multiple pages
which are then reassembled into an sk_buff and passed up the stack.
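
To make the footnote concrete, here is a toy model of the idea.  It is
not the driver: the 4096-byte page size and the function names are
assumptions, and a real driver would typically attach the pages to the
sk_buff as fragments rather than copy the data as this sketch does.

```python
from typing import List

PAGE_SIZE = 4096  # assumed page size, for illustration only


def split_into_pages(frame: bytes, page_size: int = PAGE_SIZE) -> List[bytes]:
    """Model the NIC writing one frame into consecutive page-sized buffers."""
    return [frame[i:i + page_size] for i in range(0, len(frame), page_size)]


def reassemble(pages: List[bytes]) -> bytes:
    """Model rebuilding the packet from its page-sized pieces."""
    return b"".join(pages)


frame = bytes(range(256)) * 35 + bytes(40)   # 9000-byte stand-in payload
pages = split_into_pages(frame)

print(len(frame), [len(p) for p in pages])   # 9000 [4096, 4096, 808]
```

A 9000-byte frame lands in three pages, so no single allocation has to be
larger than one page.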

[-- Attachment #2: bond_log.txt --]
[-- Type: text/plain, Size: 3505 bytes --]

Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: ARP monitoring set to 100 ms with 2 target(s): 172.24.136.0 172.24.137.0
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: ARP monitoring set to 100 ms with 2 target(s): 172.25.136.0 172.25.137.0
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: Warning: failed to get speed/duplex from eth4, speed forced to 100Mbps, duplex forced to Full.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: enslaving eth4 as an active interface with an up link.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: Warning: failed to get speed/duplex from eth5, speed forced to 100Mbps, duplex forced to Full.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: enslaving eth5 as an active interface with an up link.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth4 to be reset in 30000 msec.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth4 is now down.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: now running without any active interface !
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: link status definitely up for interface eth5
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth4
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth4 is now up
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:54:09 base0-0-0-5-0-11-1 kernel: bonding: interface eth4 reset delay set to 600 msec.
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:55:30 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5
Mar 29 20:55:30 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:55:30 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:55:30 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:55:45 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5
Mar 29 20:55:45 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:55:46 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth5 to be reset in 30000 msec.
Mar 29 20:55:46 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
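
The bounce period can be read off the log mechanically.  The sketch below
is one way to do it: it pulls the timestamps of the "interface eth5 is now
up" messages from a few lines copied from the excerpt above and prints the
gaps between them.  The regular expression and function names are my own,
and the year 2007 is taken from the message date because syslog
timestamps do not carry one.

```python
import re
from datetime import datetime

# Lines copied verbatim from the log excerpt above.
LOG = """\
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:54:59 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:55:15 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now down.
Mar 29 20:55:30 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
Mar 29 20:55:45 base0-0-0-5-0-11-1 kernel: bonding: bond0: interface eth5 is now up
"""

UP_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*interface (\S+) is now up")


def up_times(log, iface, year=2007):
    """Return the timestamps at which `iface` was reported up."""
    times = []
    for line in log.splitlines():
        m = UP_RE.match(line)
        if m and m.group(2) == iface:
            stamp = "%d %s" % (year, m.group(1))
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(gaps_in_seconds(up_times(LOG, "eth5")))  # [16, 15, 15]
```

The gaps of 16, 15 and 15 seconds match the "every 15 sec or so" figure
quoted in the message body.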

