netdev.vger.kernel.org archive mirror
* bonding can't change to another slave if you ifdown the active slave
@ 2011-03-04  2:15 Weiping Pan
  2011-03-05  0:38 ` Jay Vosburgh
  2011-03-05  2:53 ` Andy Gospodarek
  0 siblings, 2 replies; 13+ messages in thread
From: Weiping Pan @ 2011-03-04  2:15 UTC
  To: netdev, bonding-devel; +Cc: Linda Wang

Hi,

I'm testing the Linux bonding driver, and I have found a problem in
balance-rr mode:
the bond does not switch to another slave if you ifdown the active slave.
Any comments are warmly welcome!

regards
Weiping Pan

My host runs Fedora 14 with VirtualBox (4.0.2) installed, and I have
enabled 4 NICs for the guest system.
My guest runs Fedora 14 too.
First, on my host, I run:
[pwp@localhost linux-2.6.35-comment]$ uname -a
Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7 07:04:18 UTC 2011 i686 i686 i386 GNU/Linux

[pwp@localhost linux-2.6.35-comment]$ sudo ifconfig eth0:0 192.168.1.100 netmask 255.255.255.0 up
[pwp@localhost linux-2.6.35-comment]$ sudo ifconfig
eth0      Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
           inet addr:10.66.65.228  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::6631:50ff:fe3a:b0b5/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:811505 errors:0 dropped:0 overruns:0 frame:0
           TX packets:777018 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:709681583 (676.8 MiB)  TX bytes:71520005 (68.2 MiB)
           Interrupt:17

eth0:0    Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
           inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           Interrupt:17

Then I enable bonding on my guest. I run:
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7 07:04:18 UTC 2011 i686 i686 i386 GNU/Linux

[root@localhost ~]# ifconfig
eth6      Link encap:Ethernet  HWaddr 08:00:27:3A:4D:BD
           inet addr:10.66.65.167  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fe3a:4dbd/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:65 errors:0 dropped:0 overruns:0 frame:0
           TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:9916 (9.6 KiB)  TX bytes:3090 (3.0 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
           inet addr:10.66.65.169  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:feb5:fcd1/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)

eth9      Link encap:Ethernet  HWaddr 08:00:27:C7:7B:FC
           inet addr:10.66.65.216  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fec7:7bfc/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:123 errors:0 dropped:0 overruns:0 frame:0
           TX packets:123 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:13036 (12.7 KiB)  TX bytes:13036 (12.7 KiB)

[root@localhost ~]# ifconfig eth7 down
[root@localhost ~]# ifconfig eth8 down
[root@localhost ~]# dmesg -c
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth7

[root@localhost ~]# dmesg
[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
[  304.496468] bonding: MII link monitoring set to 100 ms
[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  355.322250] bonding: bond0: enslaving eth7 as an active interface with an up link.
[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  365.394052] bond0: no IPv6 routers present

[pwp@localhost ~]$ ping 192.168.1.100 -c 10
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.196 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.365 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.259 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.135 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.194 ms
64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.225 ms
64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.189 ms
64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.274 ms
64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=1.07 ms
64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.274 ms

--- 192.168.1.100 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9002ms
rtt min/avg/max/mdev = 0.135/0.319/1.079/0.260 ms

[root@localhost ~]# ifenslave bond0 eth8
[root@localhost ~]# dmesg
[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
[  304.496468] bonding: MII link monitoring set to 100 ms
[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  355.322250] bonding: bond0: enslaving eth7 as an active interface with an up link.
[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  365.394052] bond0: no IPv6 routers present
[  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  510.917312] bonding: bond0: enslaving eth8 as an active interface with an up link.

[pwp@localhost ~]$ ping 192.168.1.100 -c 10
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.182 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.211 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.270 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.248 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.132 ms
64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.291 ms
64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.246 ms
64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.272 ms
64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=0.293 ms
64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.133 ms

--- 192.168.1.100 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 0.132/0.227/0.293/0.060 ms

[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
           RX packets:311 errors:0 dropped:0 overruns:0 frame:0
           TX packets:61 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:38075 (37.1 KiB)  TX bytes:8698 (8.4 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:181 errors:0 dropped:0 overruns:0 frame:0
           TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:22297 (21.7 KiB)  TX bytes:4578 (4.4 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:130 errors:0 dropped:0 overruns:0 frame:0
           TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:15778 (15.4 KiB)  TX bytes:4120 (4.0 KiB)

[root@localhost ~]# ifconfig eth7 down
[root@localhost ~]# dmesg
[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
[  304.496468] bonding: MII link monitoring set to 100 ms
[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  355.322250] bonding: bond0: enslaving eth7 as an active interface with an up link.
[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  365.394052] bond0: no IPv6 routers present
[  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[  510.917312] bonding: bond0: enslaving eth8 as an active interface with an up link.
[  592.208534] bonding: bond0: link status definitely down for interface eth7, disabling it

Now, if the bonding driver worked correctly, eth8 would be the active
slave and the network connection would be fine.
__But__ ...

[pwp@localhost ~]$ ping 192.168.1.100 -c 10
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
From 192.168.1.5 icmp_seq=10 Destination Host Unreachable

--- 192.168.1.100 ping statistics ---
10 packets transmitted, 0 received, +1 errors, 100% packet loss, time 8999ms

How strange!

[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
           RX packets:357 errors:0 dropped:0 overruns:0 frame:0
           TX packets:76 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:42971 (41.9 KiB)  TX bytes:9832 (9.6 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
           inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:163 errors:0 dropped:0 overruns:0 frame:0
           TX packets:37 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:19073 (18.6 KiB)  TX bytes:5254 (5.1 KiB)

[root@localhost ~]# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff   C                     eth6
192.168.1.100                    (incomplete)                              bond0

I think something may be wrong with ARP,
so I run ping and tcpdump at the same time.

[pwp@localhost ~]$ ping 192.168.1.100 -c 10
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
From 192.168.1.5 icmp_seq=2 Destination Host Unreachable
From 192.168.1.5 icmp_seq=3 Destination Host Unreachable
From 192.168.1.5 icmp_seq=4 Destination Host Unreachable
From 192.168.1.5 icmp_seq=6 Destination Host Unreachable
From 192.168.1.5 icmp_seq=7 Destination Host Unreachable
From 192.168.1.5 icmp_seq=8 Destination Host Unreachable
From 192.168.1.5 icmp_seq=9 Destination Host Unreachable
From 192.168.1.5 icmp_seq=10 Destination Host Unreachable

--- 192.168.1.100 ping statistics ---
10 packets transmitted, 0 received, +8 errors, 100% packet loss, time 9002ms
pipe 3

And meanwhile,
[root@localhost ~]# tcpdump -i bond0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
02:46:56.983092 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:46:57.984040 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:46:58.988442 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:00.987340 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:01.988136 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:02.990033 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:04.985086 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:05.992368 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:06.996727 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, length 28
02:47:17.231106 ARP, Request who-has dhcp-65-32.nay.redhat.com tell dhcp-65-180.nay.redhat.com, length 46
^C
10 packets captured
10 packets received by filter
0 packets dropped by kernel


But I'm sure eth8 itself works. If I reload the bonding driver and
enslave only eth8, the ping succeeds:

[root@localhost ~]# modprobe -r bonding
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth8

[pwp@localhost ~]$ ping 192.168.1.100 -c 10
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.683 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.222 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.265 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.237 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.214 ms
64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.214 ms
64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.238 ms
64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.152 ms
64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=0.234 ms
64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.221 ms

--- 192.168.1.100 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9004ms
rtt min/avg/max/mdev = 0.152/0.268/0.683/0.141 ms

[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:feb5:fcd1/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
           RX packets:263 errors:0 dropped:0 overruns:0 frame:0
           TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:28246 (27.5 KiB)  TX bytes:9810 (9.5 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:263 errors:0 dropped:0 overruns:0 frame:0
           TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:28246 (27.5 KiB)  TX bytes:9810 (9.5 KiB)

[root@localhost ~]# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff   C                     eth6
192.168.1.100            ether   64:31:50:3a:b0:b5   C                     bond0






* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-04  2:15 bonding can't change to another slave if you ifdown the active slave Weiping Pan
@ 2011-03-05  0:38 ` Jay Vosburgh
  2011-03-07  3:23   ` Weiping Pan
  2011-03-05  2:53 ` Andy Gospodarek
  1 sibling, 1 reply; 13+ messages in thread
From: Jay Vosburgh @ 2011-03-05  0:38 UTC
  To: Weiping Pan; +Cc: netdev, bonding-devel, Linda Wang

Weiping Pan <panweiping3@gmail.com> wrote:

>I'm doing some Linux bonding driver test, and I find a problem in
>balance-rr mode.
>That's it can't change to another slave if you ifdown the active slave.
>Any comments are warmly welcomed!

	I followed your recipe on a somewhat more recent kernel (2.6.37)
and using real hardware, and I don't see the problem you describe.

	I do have a couple of questions, further down.

[...]
>My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4

	I've never tried VirtualBox, but it may be that its virtual
switch is misbehaving.  One possibility that comes to mind is that the
virtual switch is confused by seeing the same MAC address on multiple
ports (which is a problem with a hardware virtual switch I'm familiar
with).
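
	The shared-MAC condition can be confirmed from the guest. In the
ifconfig output quoted in this thread, bond0, eth7 and eth8 all report
08:00:27:26:1B:DB once both slaves are enslaved. The following is only a
rough sketch: the function name shared_macs and its optional directory
argument are made up for illustration, and it assumes the usual
/sys/class/net/<iface>/address layout.

```shell
#!/bin/sh
# Print every MAC address that is carried by more than one interface,
# followed by the comma-separated interface names.
# $1 (optional, an assumption for testing): directory to scan instead
# of /sys/class/net.
shared_macs() {
    root="${1:-/sys/class/net}"
    for f in "$root"/*/address; do
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        # Normalise to lower case so 1B:DB and 1b:db compare equal.
        printf '%s %s\n' "$(tr 'A-F' 'a-f' < "$f")" "$iface"
    done | sort | awk '
        {
            if ($1 in names) names[$1] = names[$1] "," $2
            else names[$1] = $2
            count[$1]++
        }
        END {
            for (m in count)
                if (count[m] > 1) print m, names[m]
        }
    ' | sort
}
```

Running shared_macs with no argument on the guest lists each MAC that a
switch would see on more than one port; it says nothing about how the
VirtualBox switch reacts to that.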

>nics for the guest system.
>My guest is Fedora 14 too.
>First on my host, I run:
>[pwp@localhost linux-2.6.35-comment]$ uname -a
>Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7
>07:04:18 UTC 2011 i686 i686 i386 GNU/Linux
>
>[pwp@localhost linux-2.6.35-comment]$ sudo ifconfig eth0:0 192.168.1.100
>netmask 255.255.255.0 up
>[pwp@localhost linux-2.6.35-comment]$ sudo ifconfig
>eth0      Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
>          inet addr:10.66.65.228  Bcast:10.66.65.255  Mask:255.255.254.0
>          inet6 addr: fe80::6631:50ff:fe3a:b0b5/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:811505 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:777018 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:709681583 (676.8 MiB)  TX bytes:71520005 (68.2 MiB)
>          Interrupt:17
>
>eth0:0    Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
>          inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          Interrupt:17
>
>Then I enable bonding on my guest, I run:
>[root@localhost ~]# uname -a
>Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7
>07:04:18 UTC 2011 i686 i686 i386 GNU/Linux
>
>[root@localhost ~]# ifconfig
>eth6      Link encap:Ethernet  HWaddr 08:00:27:3A:4D:BD
>          inet addr:10.66.65.167  Bcast:10.66.65.255  Mask:255.255.254.0
>          inet6 addr: fe80::a00:27ff:fe3a:4dbd/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:65 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:9916 (9.6 KiB)  TX bytes:3090 (3.0 KiB)
>
>eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
>          inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>
>eth8      Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
>          inet addr:10.66.65.169  Bcast:10.66.65.255  Mask:255.255.254.0
>          inet6 addr: fe80::a00:27ff:feb5:fcd1/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>
>eth9      Link encap:Ethernet  HWaddr 08:00:27:C7:7B:FC
>          inet addr:10.66.65.216  Bcast:10.66.65.255  Mask:255.255.254.0
>          inet6 addr: fe80::a00:27ff:fec7:7bfc/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>
>lo        Link encap:Local Loopback
>          inet addr:127.0.0.1  Mask:255.0.0.0
>          inet6 addr: ::1/128 Scope:Host
>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>          RX packets:123 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:123 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:13036 (12.7 KiB)  TX bytes:13036 (12.7 KiB)
>
>[root@localhost ~]# ifconfig eth7 down
>[root@localhost ~]# ifconfig eth8 down
>[root@localhost ~]# dmesg -c
>[root@localhost ~]# modprobe bonding mode=0 miimon=100
>[root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
>[root@localhost ~]# ifenslave bond0 eth7
>
>[root@localhost ~]# dmesg
>[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>(September 26, 2009)
>[  304.496468] bonding: MII link monitoring set to 100 ms
>[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>Control: RX
>[  355.322250] bonding: bond0: enslaving eth7 as an active interface
>with an up link.
>[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>[  365.394052] bond0: no IPv6 routers present
>
>[pwp@localhost ~]$ ping 192.168.1.100 -c 10

	At this point, what is in the routing table ("ip route show")
and the ARP table ("ip neigh show")?

>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.196 ms
>64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.365 ms
>64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.259 ms
>64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.135 ms
>64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.194 ms
>64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.225 ms
>64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.189 ms
>64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.274 ms
>64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=1.07 ms
>64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.274 ms
>
>--- 192.168.1.100 ping statistics ---
>10 packets transmitted, 10 received, 0% packet loss, time 9002ms
>rtt min/avg/max/mdev = 0.135/0.319/1.079/0.260 ms
>
>[root@localhost ~]# ifenslave bond0 eth8
>[root@localhost ~]# dmesg
>[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>(September 26, 2009)
>[  304.496468] bonding: MII link monitoring set to 100 ms
>[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>Control: RX
>[  355.322250] bonding: bond0: enslaving eth7 as an active interface
>with an up link.
>[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>[  365.394052] bond0: no IPv6 routers present
>[  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
>Control: RX
>[  510.917312] bonding: bond0: enslaving eth8 as an active interface
>with an up link.
>
>[pwp@localhost ~]$ ping 192.168.1.100 -c 10
>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.182 ms
>64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.211 ms
>64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.270 ms
>64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.248 ms
>64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.132 ms
>64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.291 ms
>64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.246 ms
>64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.272 ms
>64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=0.293 ms
>64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.133 ms
>
>--- 192.168.1.100 ping statistics ---
>10 packets transmitted, 10 received, 0% packet loss, time 9000ms
>rtt min/avg/max/mdev = 0.132/0.227/0.293/0.060 ms
>
>[root@localhost ~]# ifconfig
>bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
>          inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:311 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:61 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:38075 (37.1 KiB)  TX bytes:8698 (8.4 KiB)
>
>eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:181 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:22297 (21.7 KiB)  TX bytes:4578 (4.4 KiB)
>
>eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:130 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:15778 (15.4 KiB)  TX bytes:4120 (4.0 KiB)
>
>[root@localhost ~]# ifconfig eth7 down

	Next question: just after setting eth7 down, what do the routing
and ARP tables look like?

>[root@localhost ~]# dmesg
>[  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>(September 26, 2009)
>[  304.496468] bonding: MII link monitoring set to 100 ms
>[  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>[  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>Control: RX
>[  355.322250] bonding: bond0: enslaving eth7 as an active interface
>with an up link.
>[  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>[  365.394052] bond0: no IPv6 routers present
>[  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
>Control: RX
>[  510.917312] bonding: bond0: enslaving eth8 as an active interface
>with an up link.
>[  592.208534] bonding: bond0: link status definitely down for interface
>eth7, disabling it
>
>Now, if bonding driver works well, eth8 will be the active slave, and
>the network connection is ok.
>__But__ ...
>
>[pwp@localhost ~]$ ping 192.168.1.100 -c 10
>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>From 192.168.1.5 icmp_seq=10 Destination Host Unreachable
>
>--- 192.168.1.100 ping statistics ---
>10 packets transmitted, 0 received, +1 errors, 100% packet loss, time 8999ms
>
>How strange!
>
>[root@localhost ~]# ifconfig
>bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
>          inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:357 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:76 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:42971 (41.9 KiB)  TX bytes:9832 (9.6 KiB)
>
>eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>          inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:163 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:37 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:19073 (18.6 KiB)  TX bytes:5254 (5.1 KiB)
>
>[root@localhost ~]# arp
>Address                  HWtype  HWaddress           Flags
>Mask            Iface
>corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff
>C                     eth6
>192.168.1.100
>(incomplete)                              bond0
>
>I think maybe there is something wrong about arp.
>So I run ping and tcpdump synchronously.
>
>[pwp@localhost ~]$ ping 192.168.1.100 -c 10
>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>From 192.168.1.5 icmp_seq=2 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=3 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=4 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=6 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=7 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=8 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=9 Destination Host Unreachable
>From 192.168.1.5 icmp_seq=10 Destination Host Unreachable
>
>--- 192.168.1.100 ping statistics ---
>10 packets transmitted, 0 received, +8 errors, 100% packet loss, time 9002ms
>pipe 3
>
>And meanwhile,
>[root@localhost ~]# tcpdump -i bond0 -p arp
>tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
>02:46:56.983092 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>length 28
[...]

	At this point, does tcpdump on the host system see the incoming
ARP requests?

>But I'm sure eth8 works well.
>
>[root@localhost ~]# modprobe -r bonding
>[root@localhost ~]# modprobe bonding mode=0 miimon=100
>[root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
>[root@localhost ~]# ifenslave bond0 eth8
>
>[pwp@localhost ~]$ ping 192.168.1.100 -c 10
>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.683 ms
>64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.222 ms
>64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.265 ms
>64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.237 ms
>64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.214 ms
>64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.214 ms
>64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.238 ms
>64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.152 ms
>64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=0.234 ms
>64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.221 ms
>
>--- 192.168.1.100 ping statistics ---
>10 packets transmitted, 10 received, 0% packet loss, time 9004ms
>rtt min/avg/max/mdev = 0.152/0.268/0.683/0.141 ms
>
>[root@localhost ~]# ifconfig
>bond0     Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
>          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
>          inet6 addr: fe80::a00:27ff:feb5:fcd1/64 Scope:Link
>          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:263 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:28246 (27.5 KiB)  TX bytes:9810 (9.5 KiB)
>
>eth8      Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:263 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:28246 (27.5 KiB)  TX bytes:9810 (9.5 KiB)
>
>[root@localhost ~]# arp
>Address                  HWtype  HWaddress           Flags
>Mask            Iface
>corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff
>C                     eth6
>192.168.1.100            ether   64:31:50:3a:b0:b5
>C                     bond0

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-04  2:15 bonding can't change to another slave if you ifdown the active slave Weiping Pan
  2011-03-05  0:38 ` Jay Vosburgh
@ 2011-03-05  2:53 ` Andy Gospodarek
  2011-03-05 13:49   ` Nicolas de Pesloüan
  2011-03-07  4:20   ` Weiping Pan
  1 sibling, 2 replies; 13+ messages in thread
From: Andy Gospodarek @ 2011-03-05  2:53 UTC
  To: Weiping Pan; +Cc: netdev, bonding-devel, Linda Wang

On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
> Hi,
>
> I'm doing some Linux bonding driver test, and I find a problem in
> balance-rr mode.
> That's it can't change to another slave if you ifdown the active slave.
> Any comments are warmly welcomed!
>
> regards
> Weiping Pan
>
> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
> nics for the guest system.

Does this mean you are passing 4 NICs from your host to your guest
(maybe via direct pci-device assignment to the guest) or are you
creating 4 virtual devices on the host that are in a bridge group on the
host?

[...]
> [root@localhost ~]# ifconfig eth7 down

This is not a great way to test link failure with bonding.  The best way
is to actually pull the cable so the interface is truly down.

> [root@localhost ~]# dmesg
> [  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
> (September 26, 2009)
> [  304.496468] bonding: MII link monitoring set to 100 ms
> [  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
> [  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
> Control: RX
> [  355.322250] bonding: bond0: enslaving eth7 as an active interface
> with an up link.
> [  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> [  365.394052] bond0: no IPv6 routers present
> [  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
> Control: RX
> [  510.917312] bonding: bond0: enslaving eth8 as an active interface
> with an up link.
> [  592.208534] bonding: bond0: link status definitely down for interface
> eth7, disabling it

I suspect I know, but what does /proc/net/bonding/bond0 look like?
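
For anyone following along, here is a minimal sketch for pulling the per-slave link state out of that file. The helper name `bond_slave_status` is made up for illustration; it assumes the usual "Slave Interface:" and "MII Status:" fields that the bonding driver prints, and it skips the bond-level "MII Status:" line that appears before the first slave.

```shell
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# bond_slave_status is a hypothetical helper name, not part of any tool.
bond_slave_status() {
    # Default to bond0; pass another path to inspect a different bond.
    awk '/^Slave Interface:/ { slave = $3 }
         /^MII Status:/ && slave != "" { print slave, $3 }' \
        "${1:-/proc/net/bonding/bond0}"
}
```

Calling `bond_slave_status` with no argument on the guest reads /proc/net/bonding/bond0 directly, which makes it easy to compare what bonding believes about each slave before and after the link is taken down.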

[...]
> And meanwhile,
> [root@localhost ~]# tcpdump -i bond0 -p arp
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
> 02:46:56.983092 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:46:57.984040 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:46:58.988442 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:00.987340 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:01.988136 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:02.990033 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:04.985086 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:05.992368 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:06.996727 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
> length 28
> 02:47:17.231106 ARP, Request who-has dhcp-65-32.nay.redhat.com tell
> dhcp-65-180.nay.redhat.com, length 46
> ^C
> 10 packets captured
> 10 packets received by filter
> 0 packets dropped by kernel
>
>

What does a tcpdump on eth0 look like?  I'm curious if these arp
requests make it there or if the responses are the frames being dropped
(possibly by the connected bridge/switch).
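
One way to answer that question quickly is to count requests against replies in the capture. This is only a sketch: `count_arp` is a hypothetical helper name, and it assumes tcpdump's default one-line text output with "ARP, Request" and "ARP, Reply" in each line.

```shell
# Summarize ARP requests vs. replies from tcpdump's text output on stdin.
# count_arp is a hypothetical helper name used only for this sketch.
count_arp() {
    awk '/ARP, Request/ { req++ }
         /ARP, Reply/   { rep++ }
         END { printf "requests=%d replies=%d\n", req, rep }'
}
```

For example, `tcpdump -n -i eth0 -c 20 arp | count_arp` on the host stops after 20 packets and prints the totals. Many requests with zero replies would suggest the requests arrive but are never answered; adding `-e` to tcpdump also shows the source MAC of each frame.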


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-05  2:53 ` Andy Gospodarek
@ 2011-03-05 13:49   ` Nicolas de Pesloüan
  2011-03-07  3:13     ` Weiping Pan
  2011-03-07  4:20   ` Weiping Pan
  1 sibling, 1 reply; 13+ messages in thread
From: Nicolas de Pesloüan @ 2011-03-05 13:49 UTC (permalink / raw)
  To: Weiping Pan; +Cc: Andy Gospodarek, netdev, bonding-devel, Linda Wang

Le 05/03/2011 03:53, Andy Gospodarek a écrit :
> On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
>> Hi,
>>
>> I'm doing some Linux bonding driver test, and I find a problem in
>> balance-rr mode.
>> That's it can't change to another slave if you ifdown the active slave.
>> Any comments are warmly welcomed!
>>
>> regards
>> Weiping Pan
>>
>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
>> nics for the guest system.
>
> Does this mean you are passing 4 NICs from your host to your guest
> (maybe via direct pci-device assignment to the guest) or are you
> creating 4 virtual devices on the host that are in a bridge group on the
> host?

VirtualBox does not allow assignment of a PCI device to the guest. The network interfaces on the guest 
are purely virtual, with several modes available. To help troubleshoot this 
problem, we need to know the mode of each of the virtual interfaces. Possible modes are NAT, 
bridged, internal-network, and host-only-network.

Please provide the output of the following command:

VBoxManage showvminfo <your-vm-uuid> | grep ^NIC

To display your vm uuid, use the following command:

VBoxManage list vms

>
> [...]
>> [root@localhost ~]# ifconfig eth7 down
>
> This is not a great way to test link failure with bonding.  The best way
> is to actually pull the cable so the interface is truly down.

To virtually plug or unplug the cable from a virtual interface, use the following command, replacing 
the # with the interface number (from 1 to 8):

VBoxManage controlvm <your-vm-uuid> setlinkstate# on
VBoxManage controlvm <your-vm-uuid> setlinkstate# off
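
As a rough sketch, the two commands can be wrapped so that the NIC number and state are checked before VBoxManage is called. It assumes the usual `VBoxManage controlvm <vm> setlinkstate<N> on|off` form; the function name `vbox_setlink` and the `VBOXMANAGE` override variable are hypothetical and exist only for this example.

```shell
# Toggle the virtual cable of one NIC on a running VM.
# vbox_setlink and the VBOXMANAGE override are hypothetical, for illustration.
vbox_setlink() {
    vm=$1 nic=$2 state=$3
    # Only NIC numbers 1 to 8 and the states on/off are accepted.
    case $nic in [1-8]) ;; *) echo "NIC must be 1-8" >&2; return 2 ;; esac
    case $state in on|off) ;; *) echo "state must be on or off" >&2; return 2 ;; esac
    "${VBOXMANAGE:-VBoxManage}" controlvm "$vm" "setlinkstate$nic" "$state"
}
```

For instance, `vbox_setlink <your-vm-uuid> 2 off` unplugs the cable of NIC 2, and the same call with `on` plugs it back in. Setting `VBOXMANAGE=echo` first prints the command instead of running it.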

	Nicolas.


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-05 13:49   ` Nicolas de Pesloüan
@ 2011-03-07  3:13     ` Weiping Pan
  2011-03-07 21:15       ` Nicolas de Pesloüan
  0 siblings, 1 reply; 13+ messages in thread
From: Weiping Pan @ 2011-03-07  3:13 UTC (permalink / raw)
  To: Nicolas de Pesloüan
  Cc: Andy Gospodarek, netdev, bonding-devel, Linda Wang

On 03/05/2011 09:49 PM, Nicolas de Pesloüan wrote:
> Le 05/03/2011 03:53, Andy Gospodarek a écrit :
>> On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
>>> Hi,
>>>
>>> I'm doing some Linux bonding driver test, and I find a problem in
>>> balance-rr mode.
>>> That's it can't change to another slave if you ifdown the active slave.
>>> Any comments are warmly welcomed!
>>>
>>> regards
>>> Weiping Pan
>>>
>>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
>>> nics for the guest system.
>>
>> Does this mean you are passing 4 NICs from your host to your guest
>> (maybe via direct pci-device assignment to the guest) or are you
>> creating 4 virtual devices on the host that are in a bridge group on the
>> host?
>
> VirtualBox does not allow assignment of a PCI device to the guest. The 
> network interfaces on the guest are purely virtual, with several 
> modes available. To help troubleshoot this problem, 
> we need to know the mode of each of the virtual interfaces. Possible 
> modes are NAT, bridged, internal-network, and host-only-network.
>
> Please provide the output of the following command:
>
> VBoxManage showvminfo <your-vm-uuid> | grep ^NIC
>
> To display your vm uuid, use the following command:
>
> VBoxManage list vms
[root@localhost ~]# VBoxManage showvminfo 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 |grep ^NIC
NIC 1:           MAC: 0800270481A8, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 2:           MAC: 08002778F641, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 3:           MAC: 080027C408BA, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 4:           MAC: 080027DB339A, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 5:           disabled
NIC 6:           disabled
NIC 7:           disabled
NIC 8:           disabled

When the guest starts, I find that:
NIC 1: eth7
NIC 2: eth6
NIC 3: eth9
NIC 4: eth8

>
>>
>> [...]
>>> [root@localhost ~]# ifconfig eth7 down
>>
>> This is not a great way to test link failure with bonding.  The best way
>> is to actually pull the cable so the interface is truly down.
>
> To virtually plug or unplug the cable from a virtual interface, use 
> the following command, replacing the # with the interface number (from 
> 1 to 8):
>
> VBoxManage controlvm <your-vm-uuid> setlinkstate# on
> VBoxManage controlvm <your-vm-uuid> setlinkstate# off
I repeated my test following your guide, but it still doesn't work!

First on my host,
ifconfig eth0:0 192.168.1.100 netmask 255.255.255.0 up

Then I restart my guest,
[root@localhost ~]# ifconfig
eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:10.66.65.128  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:22 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:3608 (3.5 KiB)  TX bytes:1152 (1.1 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:04:81:A8
           inet addr:10.66.65.53  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fe04:81a8/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:23 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:3668 (3.5 KiB)  TX bytes:1152 (1.1 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:DB:33:9A
           inet addr:10.66.65.237  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fedb:339a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:147 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:14783 (14.4 KiB)  TX bytes:1152 (1.1 KiB)

eth9      Link encap:Ethernet  HWaddr 08:00:27:C4:08:BA
           inet addr:10.66.65.125  Bcast:10.66.65.255  Mask:255.255.254.0
           inet6 addr: fe80::a00:27ff:fec4:8ba/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:147 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:14783 (14.4 KiB)  TX bytes:1152 (1.1 KiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:16 errors:0 dropped:0 overruns:0 frame:0
           TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:880 (880.0 b)  TX bytes:880 (880.0 b)

[root@localhost ~]# ifconfig eth6 down
[root@localhost ~]# ifconfig eth7 down
[root@localhost ~]# ifconfig eth8 down
[root@localhost ~]# ifconfig eth9 down
[root@localhost ~]# ip route show
[root@localhost ~]# ip neigh show
[root@localhost ~]# ifconfig
lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:16 errors:0 dropped:0 overruns:0 frame:0
           TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:880 (880.0 b)  TX bytes:880 (880.0 b)

[root@localhost ~]# dmesg -c &> /dev/null
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth6
[root@localhost ~]# ifenslave bond0 eth7
[root@localhost ~]# dmesg
[  164.865840] bonding: Ethernet Channel Bonding Driver: v3.6.0 
(September 26, 2009)
[  164.865845] bonding: MII link monitoring set to 100 ms
[  181.186201] ADDRCONF(NETDEV_UP): bond0: link is not ready
[  191.549252] bonding: bond0: enslaving eth6 as an active interface 
with a down link.
[  191.552653] e1000: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[  191.586166] bonding: bond0: link status definitely up for interface eth6.
[  191.586315] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  193.420974] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[  193.434907] bonding: bond0: enslaving eth7 as an active interface 
with an up link.

[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
           RX packets:95 errors:0 dropped:0 overruns:0 frame:0
           TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:13415 (13.1 KiB)  TX bytes:4140 (4.0 KiB)

eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:10.66.65.128  Bcast:10.66.65.255  Mask:255.255.254.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:48 errors:0 dropped:0 overruns:0 frame:0
           TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:7464 (7.2 KiB)  TX bytes:1822 (1.7 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:47 errors:0 dropped:0 overruns:0 frame:0
           TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:5951 (5.8 KiB)  TX bytes:2318 (2.2 KiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:16 errors:0 dropped:0 overruns:0 frame:0
           TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:880 (880.0 b)  TX bytes:880 (880.0 b)

[root@localhost ~]# ping 192.168.1.100 -c 5
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=1.98 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.955 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.209 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.277 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.289 ms

--- 192.168.1.100 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4002ms
rtt min/avg/max/mdev = 0.209/0.742/1.984/0.678 ms

[root@localhost ~]# ip route show
192.168.1.0/24 dev bond0  proto kernel  scope link  src 192.168.1.5
10.66.64.0/23 dev eth6  proto kernel  scope link  src 10.66.65.128  
metric 1
default via 10.66.65.254 dev eth6  proto static
[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0 lladdr 64:31:50:3a:b0:b5 STALE


And on host,
[root@localhost ~]# VBoxManage controlvm 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 setlinkstate2 off

Then on guest,
[root@localhost ~]# ethtool eth6
Settings for eth6:
         Supported ports: [ TP ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Advertised pause frame use: No
         Advertised auto-negotiation: Yes
         Speed: Unknown!
         Duplex: Unknown! (255)
         Port: Twisted Pair
         PHYAD: 0
         Transceiver: internal
         Auto-negotiation: on
         MDI-X: Unknown
         Supports Wake-on: umbg
         Wake-on: d
         Current message level: 0x00000007 (7)
         Link detected: no

[root@localhost ~]# dmesg
[  164.865840] bonding: Ethernet Channel Bonding Driver: v3.6.0 
(September 26, 2009)
[  164.865845] bonding: MII link monitoring set to 100 ms
[  181.186201] ADDRCONF(NETDEV_UP): bond0: link is not ready
[  191.549252] bonding: bond0: enslaving eth6 as an active interface 
with a down link.
[  191.552653] e1000: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[  191.586166] bonding: bond0: link status definitely up for interface eth6.
[  191.586315] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  193.420974] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[  193.434907] bonding: bond0: enslaving eth7 as an active interface 
with an up link.
[  202.018085] bond0: no IPv6 routers present
[  238.834001] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[  434.010205] e1000: eth6 NIC Link is Down
[  434.011661] bonding: bond0: link status definitely down for interface 
eth6, disabling it

[root@localhost ~]# ping 192.168.1.100 -c 5
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
 From 192.168.1.5 icmp_seq=2 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=3 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=4 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=5 Destination Host Unreachable

--- 192.168.1.100 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4001ms
pipe 3

[root@localhost ~]# ip route show
192.168.1.0/24 dev bond0  proto kernel  scope link  src 192.168.1.5
[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0  FAILED

Ping on the guest while running tcpdump on the host,
on guest:
[root@localhost ~]# ping 192.168.1.100
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
 From 192.168.1.5 icmp_seq=2 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=3 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=4 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=6 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=7 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=8 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=10 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=11 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=12 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=14 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=15 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=16 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=18 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=19 Destination Host Unreachable
 From 192.168.1.5 icmp_seq=20 Destination Host Unreachable
^C
--- 192.168.1.100 ping statistics ---
21 packets transmitted, 0 received, +15 errors, 100% packet loss, time 
20005ms
pipe 3

on host:
[root@localhost ~]# tcpdump -i eth0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:00:50.474242 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:50.474256 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:52.469651 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:52.469661 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:52.632719 ARP, Request who-has dhcp-65-29.nay.redhat.com tell 
corerouter.nay.redhat.com, length 46
11:00:53.192150 ARP, Request who-has dhcp-65-14.nay.redhat.com tell 
corerouter.nay.redhat.com, length 46
11:00:53.471246 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:53.471257 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:54.474627 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:54.474636 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:56.472050 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:56.472060 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:57.475211 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:57.475220 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:58.476840 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:58.476849 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:00:58.624738 ARP, Request who-has dhcp-65-29.nay.redhat.com tell 
corerouter.nay.redhat.com, length 46
11:01:00.477029 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:01:00.477038 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
^C
19 packets captured
19 packets received by filter
0 packets dropped by kernel
[root@localhost ~]# ip neigh show
192.168.1.5 dev eth0 lladdr 08:00:27:78:f6:41 STALE
10.66.65.254 dev eth0 lladdr 00:1d:45:20:d5:ff REACHABLE


regards
Weiping Pan


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-05  0:38 ` Jay Vosburgh
@ 2011-03-07  3:23   ` Weiping Pan
  0 siblings, 0 replies; 13+ messages in thread
From: Weiping Pan @ 2011-03-07  3:23 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, bonding-devel, Linda Wang

On 03/05/2011 08:38 AM, Jay Vosburgh wrote:
> Weiping Pan<panweiping3@gmail.com>  wrote:
>
>> I'm doing some Linux bonding driver test, and I find a problem in
>> balance-rr mode.
>> That's it can't change to another slave if you ifdown the active slave.
>> Any comments are warmly welcomed!
> 	I followed your recipe on a somewhat more recent kernel (2.6.37)
> and using real hardware, and I don't see the problem you describe.
>
> 	I do have a couple of questions, further down.
>
> [...]
>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
> 	I've not ever tried virtualbox, but it may be that its virtual
> switch is misbehaving.  One possibility that comes to mind is that the
> virtual switch is confused by seeing the same MAC address on multiple
> ports (which is a problem with a hardware virtual switch I'm familiar
> with).
I use bridged mode in VirtualBox.
[root@localhost ~]# VBoxManage showvminfo 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 |grep ^NIC
NIC 1:           MAC: 0800270481A8, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 2:           MAC: 08002778F641, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 3:           MAC: 080027C408BA, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 4:           MAC: 080027DB339A, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 5:           disabled
NIC 6:           disabled
NIC 7:           disabled
NIC 8:           disabled
>> nics for the guest system.
>> My guest is Fedora 14 too.
>> First on my host, I run:
>> [pwp@localhost linux-2.6.35-comment]$ uname -a
>> Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7
>> 07:04:18 UTC 2011 i686 i686 i386 GNU/Linux
>>
>> [pwp@localhost linux-2.6.35-comment]$ sudo ifconfig eth0:0 192.168.1.100
>> netmask 255.255.255.0 up
>> [pwp@localhost linux-2.6.35-comment]$ sudo ifconfig
>> eth0      Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
>>           inet addr:10.66.65.228  Bcast:10.66.65.255  Mask:255.255.254.0
>>           inet6 addr: fe80::6631:50ff:fe3a:b0b5/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:811505 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:777018 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:709681583 (676.8 MiB)  TX bytes:71520005 (68.2 MiB)
>>           Interrupt:17
>>
>> eth0:0    Link encap:Ethernet  HWaddr 64:31:50:3A:B0:B5
>>           inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           Interrupt:17
>>
>> Then I enable bonding on my guest, I run:
>> [root@localhost ~]# uname -a
>> Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7
>> 07:04:18 UTC 2011 i686 i686 i386 GNU/Linux
>>
>> [root@localhost ~]# ifconfig
>> eth6      Link encap:Ethernet  HWaddr 08:00:27:3A:4D:BD
>>           inet addr:10.66.65.167  Bcast:10.66.65.255  Mask:255.255.254.0
>>           inet6 addr: fe80::a00:27ff:fe3a:4dbd/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:65 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:9916 (9.6 KiB)  TX bytes:3090 (3.0 KiB)
>>
>> eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
>>           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>>
>> eth8      Link encap:Ethernet  HWaddr 08:00:27:B5:FC:D1
>>           inet addr:10.66.65.169  Bcast:10.66.65.255  Mask:255.255.254.0
>>           inet6 addr: fe80::a00:27ff:feb5:fcd1/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>>
>> eth9      Link encap:Ethernet  HWaddr 08:00:27:C7:7B:FC
>>           inet addr:10.66.65.216  Bcast:10.66.65.255  Mask:255.255.254.0
>>           inet6 addr: fe80::a00:27ff:fec7:7bfc/64 Scope:Link
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:57 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:7358 (7.1 KiB)  TX bytes:1152 (1.1 KiB)
>>
>> lo        Link encap:Local Loopback
>>           inet addr:127.0.0.1  Mask:255.0.0.0
>>           inet6 addr: ::1/128 Scope:Host
>>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>>           RX packets:123 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:123 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:13036 (12.7 KiB)  TX bytes:13036 (12.7 KiB)
>>
>> [root@localhost ~]# ifconfig eth7 down
>> [root@localhost ~]# ifconfig eth8 down
>> [root@localhost ~]# dmesg -c
>> [root@localhost ~]# modprobe bonding mode=0 miimon=100
>> [root@localhost ~]# ifconfig bond0 192.168.1.5 netmask 255.255.255.0 up
>> [root@localhost ~]# ifenslave bond0 eth7
>>
>> [root@localhost ~]# dmesg
>> [  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>> (September 26, 2009)
>> [  304.496468] bonding: MII link monitoring set to 100 ms
>> [  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>> [  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  355.322250] bonding: bond0: enslaving eth7 as an active interface
>> with an up link.
>> [  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> [  365.394052] bond0: no IPv6 routers present
>>
>> [pwp@localhost ~]$ ping 192.168.1.100 -c 10
> 	At this point, what is in the routing table ("ip route show")
> and the ARP table ("ip neigh show")?
[root@localhost ~]# ip route show
192.168.1.0/24 dev bond0  proto kernel  scope link  src 192.168.1.5
10.66.64.0/23 dev eth7  proto kernel  scope link  src 10.66.65.53  metric 1
10.66.64.0/23 dev eth6  proto kernel  scope link  src 10.66.65.128  
metric 1
default via 10.66.65.254 dev eth7  proto static
[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0 lladdr 64:31:50:3a:b0:b5 REACHABLE


>> PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>> 64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.196 ms
>> 64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.365 ms
>> 64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.259 ms
>> 64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.135 ms
>> 64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.194 ms
>> 64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.225 ms
>> 64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.189 ms
>> 64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.274 ms
>> 64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=1.07 ms
>> 64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.274 ms
>>
>> --- 192.168.1.100 ping statistics ---
>> 10 packets transmitted, 10 received, 0% packet loss, time 9002ms
>> rtt min/avg/max/mdev = 0.135/0.319/1.079/0.260 ms
>>
>> [root@localhost ~]# ifenslave bond0 eth8
>> [root@localhost ~]# dmesg
>> [  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>> (September 26, 2009)
>> [  304.496468] bonding: MII link monitoring set to 100 ms
>> [  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>> [  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  355.322250] bonding: bond0: enslaving eth7 as an active interface
>> with an up link.
>> [  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> [  365.394052] bond0: no IPv6 routers present
>> [  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  510.917312] bonding: bond0: enslaving eth8 as an active interface
>> with an up link.
>>
>> [pwp@localhost ~]$ ping 192.168.1.100 -c 10
>> PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>> 64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.182 ms
>> 64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.211 ms
>> 64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.270 ms
>> 64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.248 ms
>> 64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.132 ms
>> 64 bytes from 192.168.1.100: icmp_req=6 ttl=64 time=0.291 ms
>> 64 bytes from 192.168.1.100: icmp_req=7 ttl=64 time=0.246 ms
>> 64 bytes from 192.168.1.100: icmp_req=8 ttl=64 time=0.272 ms
>> 64 bytes from 192.168.1.100: icmp_req=9 ttl=64 time=0.293 ms
>> 64 bytes from 192.168.1.100: icmp_req=10 ttl=64 time=0.133 ms
>>
>> --- 192.168.1.100 ping statistics ---
>> 10 packets transmitted, 10 received, 0% packet loss, time 9000ms
>> rtt min/avg/max/mdev = 0.132/0.227/0.293/0.060 ms
>>
>> [root@localhost ~]# ifconfig
>> bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
>>           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>>           RX packets:311 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:61 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:38075 (37.1 KiB)  TX bytes:8698 (8.4 KiB)
>>
>> eth7      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:10.66.65.154  Bcast:10.66.65.255  Mask:255.255.254.0
>>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>>           RX packets:181 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:22297 (21.7 KiB)  TX bytes:4578 (4.4 KiB)
>>
>> eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
>>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>>           RX packets:130 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:15778 (15.4 KiB)  TX bytes:4120 (4.0 KiB)
>>
>> [root@localhost ~]# ifconfig eth7 down
> 	Next question: just after setting eth7 down, what do the routing
> and ARP tables look like?
[root@localhost ~]# ifconfig eth7 down
[root@localhost ~]# ip route show
192.168.1.0/24 dev bond0  proto kernel  scope link  src 192.168.1.5
10.66.64.0/23 dev eth6  proto kernel  scope link  src 10.66.65.128  
metric 1
default via 10.66.65.254 dev eth6  proto static
[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0 lladdr 64:31:50:3a:b0:b5 REACHABLE


>> [root@localhost ~]# dmesg
>> [  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>> (September 26, 2009)
>> [  304.496468] bonding: MII link monitoring set to 100 ms
>> [  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>> [  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  355.322250] bonding: bond0: enslaving eth7 as an active interface
>> with an up link.
>> [  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> [  365.394052] bond0: no IPv6 routers present
>> [  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  510.917312] bonding: bond0: enslaving eth8 as an active interface
>> with an up link.
>> [  592.208534] bonding: bond0: link status definitely down for interface
>> eth7, disabling it
>>
>> Now, if bonding driver works well, eth8 will be the active slave, and
>> the network connection is ok.
>> __But__ ...
>>
>> [pwp@localhost ~]$ ping 192.168.1.100 -c 10
>> PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>> From 192.168.1.5 icmp_seq=10 Destination Host Unreachable
>> --- 192.168.1.100 ping statistics ---
>> 10 packets transmitted, 0 received, +1 errors, 100% packet loss, time 8999ms
>>
>> How strange!
>>
>> [root@localhost ~]# ifconfig
>> bond0     Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0
>>           inet6 addr: fe80::a00:27ff:fe26:1bdb/64 Scope:Link
>>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>>           RX packets:357 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:76 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:0
>>           RX bytes:42971 (41.9 KiB)  TX bytes:9832 (9.6 KiB)
>>
>> eth8      Link encap:Ethernet  HWaddr 08:00:27:26:1B:DB
>>           inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
>>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>>           RX packets:163 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:37 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:19073 (18.6 KiB)  TX bytes:5254 (5.1 KiB)
>>
>> [root@localhost ~]# arp
>> Address                  HWtype  HWaddress           Flags
>> Mask            Iface
>> corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff
>> C                     eth6
>> 192.168.1.100
>> (incomplete)                              bond0
>>
>> I think maybe there is something wrong about arp.
>> So I run ping and tcpdump synchronously.
>>
>> [pwp@localhost ~]$ ping 192.168.1.100 -c 10
>> PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>> From 192.168.1.5 icmp_seq=2 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=3 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=4 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=6 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=7 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=8 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=9 Destination Host Unreachable
>> From 192.168.1.5 icmp_seq=10 Destination Host Unreachable
>> --- 192.168.1.100 ping statistics ---
>> 10 packets transmitted, 0 received, +8 errors, 100% packet loss, time 9002ms
>> pipe 3
>>
>> And meanwhile,
>> [root@localhost ~]# tcpdump -i bond0 -p arp
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
>> 02:46:56.983092 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
> [...]
>
> 	At this point, does tcpdump on the host system see the incoming
> ARP requests?
Yes. On host,
[root@localhost ~]# tcpdump -i eth0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:21:01.721704 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:01.721714 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:02.723536 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:02.723548 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:03.019325 ARP, Request who-has 10.66.4.107 tell 10.66.4.108, length 46
11:21:04.018956 ARP, Request who-has 10.66.4.107 tell 10.66.4.108, length 46
11:21:04.720847 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:04.720856 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:05.018627 ARP, Request who-has 10.66.4.107 tell 10.66.4.108, length 46
11:21:05.722297 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:05.722308 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:06.724211 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
11:21:06.724220 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
^C
13 packets captured
13 packets received by filter
0 packets dropped by kernel

The host sees the requests, but the capture shows no ARP reply. Maybe the host doesn't reply? I'm not sure.
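
One rough way to check this is to count ARP requests and replies for the target address in the capture. The sketch below is only an illustration: `count_arp` is a hypothetical helper name, and it assumes tcpdump's usual text format, with "ARP, Request who-has ... tell ..." for requests and "ARP, Reply ... is-at ..." for replies.

```shell
#!/bin/sh
# Count ARP requests and replies for one IPv4 address in tcpdump text output.
# Reads tcpdump lines on stdin; $1 is the address being resolved.
# count_arp is a hypothetical helper, not part of tcpdump.
count_arp() {
    awk -v ip="$1" '
        # Request lines contain: "ARP, Request who-has <ip> tell <sender>"
        index($0, "ARP, Request who-has " ip " tell ") { req++ }
        # Reply lines contain: "ARP, Reply <ip> is-at <mac>"
        index($0, "ARP, Reply " ip " is-at ")          { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

Fed the host capture above (for example from a saved file, `count_arp 192.168.1.100 < arp.txt`), this would report ten requests and no replies, which matches what is visible by eye.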

regards
Weiping Pan

* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-05  2:53 ` Andy Gospodarek
  2011-03-05 13:49   ` Nicolas de Pesloüan
@ 2011-03-07  4:20   ` Weiping Pan
  1 sibling, 0 replies; 13+ messages in thread
From: Weiping Pan @ 2011-03-07  4:20 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev, bonding-devel, Linda Wang

On 03/05/2011 10:53 AM, Andy Gospodarek wrote:
> On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
>> Hi,
>>
>> I'm doing some Linux bonding driver test, and I find a problem in
>> balance-rr mode.
>> That's it can't change to another slave if you ifdown the active slave.
>> Any comments are warmly welcomed!
>>
>> regards
>> Weiping Pan
>>
>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
>> nics for the guest system.
> Does this mean you are passing 4 NICs from your host to your guest
> (maybe via direct pci-device assignment to the guest) or are you
> creating 4 virtual devices on the host that are in a bridge group on the
> host?
>
> [...]
I use bridged mode in VirtualBox.
[root@localhost ~]# VBoxManage showvminfo 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 |grep ^NIC
NIC 1:           MAC: 0800270481A8, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 2:           MAC: 08002778F641, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 3:           MAC: 080027C408BA, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 4:           MAC: 080027DB339A, Attachment: Bridged Interface 
'eth0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 5:           disabled
NIC 6:           disabled
NIC 7:           disabled
NIC 8:           disabled

>> [root@localhost ~]# ifconfig eth7 down
> This is not a great way to test link failure with bonding.  The best way
> is to actually pull the cable so the interface is truly down.
OK, but I think bonding should still work in this situation.
>> [root@localhost ~]# dmesg
>> [  304.496463] bonding: Ethernet Channel Bonding Driver: v3.6.0
>> (September 26, 2009)
>> [  304.496468] bonding: MII link monitoring set to 100 ms
>> [  353.527680] ADDRCONF(NETDEV_UP): bond0: link is not ready
>> [  355.321626] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  355.322250] bonding: bond0: enslaving eth7 as an active interface
>> with an up link.
>> [  355.323503] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> [  365.394052] bond0: no IPv6 routers present
>> [  510.913797] e1000: eth8 NIC Link is Up 1000 Mbps Full Duplex, Flow
>> Control: RX
>> [  510.917312] bonding: bond0: enslaving eth8 as an active interface
>> with an up link.
>> [  592.208534] bonding: bond0: link status definitely down for interface
>> eth7, disabling it
> I suspect I know, but what does /proc/net/bonding/bond0 look like?
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth7
MII Status: down
Link Failure Count: 1
Permanent HW addr: 08:00:27:04:81:a8

Slave Interface: eth8
MII Status: up
Link Failure Count: 0
Permanent HW addr: 08:00:27:db:33:9a
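
The per-slave state in this output can be pulled out with a short awk filter. This is a sketch only: `bond_slave_status` is a made-up name, and it assumes the v3.6.0 layout shown above, where each "Slave Interface:" line is followed by that slave's own "MII Status:" line.

```shell
#!/bin/sh
# Print "<slave> <mii-status>" for each slave found in the text of
# /proc/net/bonding/<bond>, read from stdin.
# bond_slave_status is a hypothetical helper name.
bond_slave_status() {
    awk -F': ' '
        # Remember the slave whose section we have just entered.
        $1 == "Slave Interface" { slave = $2; next }
        # The bond-level "MII Status" line comes before any slave section,
        # so only report a status once a slave name has been seen.
        $1 == "MII Status" && slave != "" { print slave, $2; slave = "" }
    '
}
```

Run as `bond_slave_status < /proc/net/bonding/bond0`, it would print `eth7 down` and `eth8 up` for the output above.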

> [...]
>> And meanwhile,
>> [root@localhost ~]# tcpdump -i bond0 -p arp
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
>> 02:46:56.983092 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:46:57.984040 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:46:58.988442 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:00.987340 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:01.988136 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:02.990033 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:04.985086 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:05.992368 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:06.996727 ARP, Request who-has 192.168.1.100 tell 192.168.1.5,
>> length 28
>> 02:47:17.231106 ARP, Request who-has dhcp-65-32.nay.redhat.com tell
>> dhcp-65-180.nay.redhat.com, length 46
>> ^C
>> 10 packets captured
>> 10 packets received by filter
>> 0 packets dropped by kernel
>>
>>
> What does a tcpdump on eth0 look like?  I'm curious if these arp
> requests make it there or if the responses are the frames being dropped
> (possibly by the connected bridge/switch).
on host,
[root@localhost ~]# tcpdump -i eth0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
12:18:24.885306 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:24.885320 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:26.880019 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:26.880030 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:27.881584 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:27.881593 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:28.883657 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:28.883671 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:30.881699 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:30.881709 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:31.885003 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:31.885012 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:31.942278 ARP, Request who-has dhcp-65-14.nay.redhat.com tell 
corerouter.nay.redhat.com, length 46
12:18:32.721861 ARP, Request who-has dhcp-65-29.nay.redhat.com tell 
corerouter.nay.redhat.com, length 46
12:18:32.888740 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28
12:18:32.888748 ARP, Request who-has 192.168.1.100 tell 192.168.1.5, 
length 28

[root@localhost ~]# ip route show
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.100
10.66.64.0/23 dev eth0  proto kernel  scope link  src 10.66.65.228  
metric 1
default via 10.66.65.254 dev eth0  proto static
[root@localhost ~]# ip neigh show
192.168.1.5 dev eth0 lladdr 08:00:27:04:81:a8 STALE
10.66.65.254 dev eth0 lladdr 00:1d:45:20:d5:ff REACHABLE

regards
Weiping Pan


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-07  3:13     ` Weiping Pan
@ 2011-03-07 21:15       ` Nicolas de Pesloüan
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas de Pesloüan @ 2011-03-07 21:15 UTC (permalink / raw)
  To: Weiping Pan; +Cc: Andy Gospodarek, netdev, bonding-devel, Linda Wang

Le 07/03/2011 04:13, Weiping Pan a écrit :
> On 03/05/2011 09:49 PM, Nicolas de Pesloüan wrote:
>> Le 05/03/2011 03:53, Andy Gospodarek a écrit :
>>> On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
>>>> Hi,
>>>>
>>>> I'm doing some Linux bonding driver test, and I find a problem in
>>>> balance-rr mode.
>>>> That's it can't change to another slave if you ifdown the active slave.
>>>> Any comments are warmly welcomed!
>>>>
>>>> regards
>>>> Weiping Pan
>>>>
>>>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
>>>> nics for the guest system.
>>>
>>> Does this mean you are passing 4 NICs from your host to your guest
>>> (maybe via direct pci-device assignment to the guest) or are you
>>> creating 4 virtual devices on the host that are in a bridge group on the
>>> host?
>>
>> VirtualBox does not allow assignment of pci-device to the guest. The
>> network interfaces on the guest are pure virtual one, with several
>> modes available. In order to help you trouble shooting this problem,
>> we need to know the mode form each of the virtual interfaces. Possible
>> modes are NAT, bridged, internal-network, and host-only-network.
>>
>> Please provide the output of the following command:
>>
>> VBoxManage showvminfo <your-vm-uuid> | grep ^NIC
>>
>> To display your vm uuid, use the following command:
>>
>> VBoxManage list vms
> [root@localhost ~]# VBoxManage showvminfo
> 67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 |grep ^NIC
> NIC 1: MAC: 0800270481A8, Attachment: Bridged Interface 'eth0', Cable
> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
> Mbps, Boot priority: 0
> NIC 2: MAC: 08002778F641, Attachment: Bridged Interface 'eth0', Cable
> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
> Mbps, Boot priority: 0
> NIC 3: MAC: 080027C408BA, Attachment: Bridged Interface 'eth0', Cable
> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
> Mbps, Boot priority: 0
> NIC 4: MAC: 080027DB339A, Attachment: Bridged Interface 'eth0', Cable
> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
> Mbps, Boot priority: 0
> NIC 5: disabled
> NIC 6: disabled
> NIC 7: disabled
> NIC 8: disabled
>
> And when guest starts, i find that:
> NIC 1: eth7
> NIC 2: eth6
> NIC 3: eth9
> NIC 4: eth8

Would you mind testing with "Host-only Interface 'vboxnet0'", instead of "Bridged Interface 'eth0'"?

All the bonding tests I run use this setup, and link failure detection works well.

	Nicolas.

* Re: bonding can't change to another slave if you ifdown the active slave
@ 2011-03-08  6:52 Weiping Pan
  2011-03-08 12:51 ` WANG Cong
  0 siblings, 1 reply; 13+ messages in thread
From: Weiping Pan @ 2011-03-08  6:52 UTC (permalink / raw)
  To: netdev


On 03/08/2011 05:15 AM, Nicolas de Pesloüan wrote:
> Le 07/03/2011 04:13, Weiping Pan a écrit :
>> On 03/05/2011 09:49 PM, Nicolas de Pesloüan wrote:
>>> Le 05/03/2011 03:53, Andy Gospodarek a écrit :
>>>> On Fri, Mar 04, 2011 at 10:15:17AM +0800, Weiping Pan wrote:
>>>>> Hi,
>>>>>
>>>>> I'm doing some Linux bonding driver test, and I find a problem in
>>>>> balance-rr mode.
>>>>> That's it can't change to another slave if you ifdown the active 
>>>>> slave.
>>>>> Any comments are warmly welcomed!
>>>>>
>>>>> regards
>>>>> Weiping Pan
>>>>>
>>>>> My host is Fedora 14, and I install VirtualBox (4.0.2), and enable 4
>>>>> nics for the guest system.
>>>>
>>>> Does this mean you are passing 4 NICs from your host to your guest
>>>> (maybe via direct pci-device assignment to the guest) or are you
>>>> creating 4 virtual devices on the host that are in a bridge group 
>>>> on the
>>>> host?
>>>
>>> VirtualBox does not allow assignment of pci-device to the guest. The
>>> network interfaces on the guest are pure virtual one, with several
>>> modes available. In order to help you trouble shooting this problem,
>>> we need to know the mode form each of the virtual interfaces. Possible
>>> modes are NAT, bridged, internal-network, and host-only-network.
>>>
>>> Please provide the output of the following command:
>>>
>>> VBoxManage showvminfo <your-vm-uuid> | grep ^NIC
>>>
>>> To display your vm uuid, use the following command:
>>>
>>> VBoxManage list vms
>> [root@localhost ~]# VBoxManage showvminfo
>> 67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 |grep ^NIC
>> NIC 1: MAC: 0800270481A8, Attachment: Bridged Interface 'eth0', Cable
>> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
>> Mbps, Boot priority: 0
>> NIC 2: MAC: 08002778F641, Attachment: Bridged Interface 'eth0', Cable
>> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
>> Mbps, Boot priority: 0
>> NIC 3: MAC: 080027C408BA, Attachment: Bridged Interface 'eth0', Cable
>> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
>> Mbps, Boot priority: 0
>> NIC 4: MAC: 080027DB339A, Attachment: Bridged Interface 'eth0', Cable
>> connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0
>> Mbps, Boot priority: 0
>> NIC 5: disabled
>> NIC 6: disabled
>> NIC 7: disabled
>> NIC 8: disabled
>>
>> And when guest starts, i find that:
>> NIC 1: eth7
>> NIC 2: eth6
>> NIC 3: eth9
>> NIC 4: eth8
>
> Would you mind testing with "Host-only Interface 'vboxnet0'", instead 
> of "Bridged Interface 'eth0'"?
>
> All the bonding tests I do use this setup and the link failure 
> detection work well.
>
>     Nicolas.
OK, I switched to "Host-only" mode and came to a conclusion: if the first
enslaved NIC is pulled out, bonding can't handle it well.

First test:
I enslave eth6 first, then pull it out; bonding stops working.
on host,
[root@localhost ~]# VBoxManage -v
4.0.2r69518

[root@localhost ~]# VBoxManage showvminfo 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2|grep ^NIC
NIC 1:           MAC: 0800270481A8, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 2:           MAC: 08002778F641, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 3:           MAC: 080027C408BA, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 4:           MAC: 080027DB339A, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 5:           disabled
NIC 6:           disabled
NIC 7:           disabled
NIC 8:           disabled

[root@localhost ~]# ifconfig vboxnet0
vboxnet0  Link encap:Ethernet  HWaddr 0A:00:27:00:00:00
           inet addr:192.168.56.1  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::800:27ff:fe00:0/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:268 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 b)  TX bytes:52456 (51.2 KiB)

restart guest,
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.35.11-83.fc14.i686 #1 SMP Mon Feb 7 
07:04:18 UTC 2011 i686 i686 i386 GNU/Linux

[root@localhost ~]# ifconfig
eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.56.101  Bcast:192.168.56.255  
Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:04:81:A8
           inet addr:192.168.56.102  Bcast:192.168.56.255  
Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe04:81a8/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:DB:33:9A
           inet addr:192.168.56.104  Bcast:192.168.56.255  
Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fedb:339a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:63 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:19456 (19.0 KiB)  TX bytes:1152 (1.1 KiB)

eth9      Link encap:Ethernet  HWaddr 08:00:27:C4:08:BA
           inet addr:192.168.56.103  Bcast:192.168.56.255  
Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fec4:8ba/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:63 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:19456 (19.0 KiB)  TX bytes:1152 (1.1 KiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:16 errors:0 dropped:0 overruns:0 frame:0
           TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:880 (880.0 b)  TX bytes:880 (880.0 b)

so
NIC 1:           eth7
NIC 2:           eth6
NIC 3:           eth9
NIC 4:           eth8
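
The mapping above was worked out by comparing MAC addresses by eye. A small helper can convert the MAC format printed by VBoxManage into the colon-separated form used on the guest. This is only a sketch, and `vbox_mac_to_colon` is a hypothetical name.

```shell
#!/bin/sh
# Convert a VirtualBox-style MAC (e.g. 0800270481A8) into the lowercase,
# colon-separated form printed by the guest (e.g. 08:00:27:04:81:a8).
# vbox_mac_to_colon is a hypothetical helper name.
vbox_mac_to_colon() {
    printf '%s\n' "$1" |
        tr 'A-F' 'a-f' |            # lowercase the hex digits
        sed 's/../&:/g; s/:$//'     # add ":" after each byte, drop the last one
}
```

On the guest, something like `ip -o link | grep -i "$(vbox_mac_to_colon 0800270481A8)"` would then show which ethN belongs to NIC 1. This has to be done before enslaving: once an interface joins bond0, ifconfig reports the bond's MAC for it, and the original address is only visible as "Permanent HW addr" in /proc/net/bonding/bond0.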

on guest,
[root@localhost ~]# for i in eth{6..9}; do ifconfig $i down; done
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.56.2 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth6
[root@localhost ~]# ifenslave bond0 eth7
[root@localhost ~]# ping 192.168.56.1 -c 5
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_req=1 ttl=64 time=1.83 ms
64 bytes from 192.168.56.1: icmp_req=2 ttl=64 time=0.149 ms
64 bytes from 192.168.56.1: icmp_req=3 ttl=64 time=0.204 ms
64 bytes from 192.168.56.1: icmp_req=4 ttl=64 time=0.294 ms
64 bytes from 192.168.56.1: icmp_req=5 ttl=64 time=0.412 ms

--- 192.168.56.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4003ms
rtt min/avg/max/mdev = 0.149/0.578/1.832/0.633 ms
[root@localhost ~]# arp
Address                  HWtype  HWaddress           Flags 
Mask            Iface
192.168.56.1             ether   0a:00:27:00:00:00   
C                     bond0

on host,
[root@localhost ~]# arp
Address                  HWtype  HWaddress           Flags 
Mask            Iface
192.168.56.2             ether   08:00:27:78:f6:41   
C                     vboxnet0
corerouter.nay.redhat.c  ether   00:1d:45:20:d5:ff   
C                     eth0
[root@localhost ~]# VBoxManage controlvm 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 setlinkstate2 off

on guest,
[root@localhost ~]# ping 192.168.56.1 -c 5
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
 From 192.168.56.2 icmp_seq=2 Destination Host Unreachable
 From 192.168.56.2 icmp_seq=3 Destination Host Unreachable
 From 192.168.56.2 icmp_seq=4 Destination Host Unreachable
 From 192.168.56.2 icmp_seq=5 Destination Host Unreachable

--- 192.168.56.1 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4001ms
pipe 3
[root@localhost ~]# dmesg
[  513.093249] e1000: eth6 NIC Link is Down
[  513.129123] bonding: bond0: link status definitely down for interface 
eth6, disabling it
[root@localhost ~]# ip route show
192.168.56.0/24 dev bond0  proto kernel  scope link  src 192.168.56.2
192.168.56.0/24 dev eth7  proto kernel  scope link  src 192.168.56.101  
metric 1
[root@localhost ~]# ip neigh show
192.168.56.1 dev bond0  FAILED

on host,
[root@localhost ~]# ip route show
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.100
192.168.56.0/24 dev vboxnet0  proto kernel  scope link  src 192.168.56.1
10.66.64.0/23 dev eth0  proto kernel  scope link  src 10.66.65.228  
metric 1
default via 10.66.65.254 dev eth0  proto static
[root@localhost ~]# ip neigh show
192.168.56.2 dev vboxnet0 lladdr 08:00:27:78:f6:41 STALE
10.66.65.254 dev eth0 lladdr 00:1d:45:20:d5:ff REACHABLE


Second test:
I enslave eth6 first, then pull eth7 out; bonding keeps working.
on guest,
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.56.2 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth6
[root@localhost ~]# ifenslave bond0 eth7
[root@localhost ~]# ping 192.168.56.1 -c 5
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_req=1 ttl=64 time=0.902 ms
64 bytes from 192.168.56.1: icmp_req=2 ttl=64 time=0.260 ms
64 bytes from 192.168.56.1: icmp_req=3 ttl=64 time=0.237 ms
64 bytes from 192.168.56.1: icmp_req=4 ttl=64 time=0.335 ms
64 bytes from 192.168.56.1: icmp_req=5 ttl=64 time=0.170 ms

--- 192.168.56.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 0.170/0.380/0.902/0.267 ms

[root@localhost ~]# dmesg
[ 1162.524825] bonding: MII link monitoring set to 100 ms
[ 1165.845586] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 1174.505912] e1000: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[ 1174.512372] bonding: bond0: enslaving eth6 as an active interface 
with an up link.
[ 1174.512897] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 1178.857322] ICMPv6 NA: someone advertises our address 
fe80:0000:0000:0000:0a00:27ff:fe78:f641 on bond0!
[ 1178.858649] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow 
Control: RX
[ 1178.861391] bonding: bond0: enslaving eth7 as an active interface 
with an up link.
[ 1184.682110] bond0: no IPv6 routers present

on host,
[root@localhost ~]# VBoxManage controlvm 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 setlinkstate1 off

on guest,
[root@localhost ~]# ping 192.168.56.1 -c 5
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_req=1 ttl=64 time=0.599 ms
64 bytes from 192.168.56.1: icmp_req=2 ttl=64 time=0.150 ms
64 bytes from 192.168.56.1: icmp_req=3 ttl=64 time=0.224 ms
64 bytes from 192.168.56.1: icmp_req=4 ttl=64 time=0.154 ms
64 bytes from 192.168.56.1: icmp_req=5 ttl=64 time=0.189 ms

--- 192.168.56.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 0.150/0.263/0.599/0.170 ms
[root@localhost ~]# dmesg
[ 1281.421231] e1000: eth7 NIC Link is Down
[ 1281.492178] bonding: bond0: link status definitely down for interface 
eth7, disabling it


many thanks
Weiping Pan






* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-08  6:52 Weiping Pan
@ 2011-03-08 12:51 ` WANG Cong
  2011-03-09  2:40   ` Weiping Pan
  2011-03-09  3:38   ` Weiping Pan
  0 siblings, 2 replies; 13+ messages in thread
From: WANG Cong @ 2011-03-08 12:51 UTC (permalink / raw)
  To: netdev

On Tue, 08 Mar 2011 14:52:52 +0800, Weiping Pan wrote:

> ok, I use "Host-only mode" and get
> 
> 
> 
> an conclusion, that if the first enslaved nic is pulled out, bonding
> can't handle well.
> 
> First test.
> I first enslave eth6, then pull it out, bonding doesn't work. on host,
...
> 
> 
> Second test
> I first enslave eth6, then pull eth7 out, bonding works well. on guest,

Can you show me your /proc/net/bonding/bond0 before and after pulling down
eth6 or eth7? And what does `ip link show` say?


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-08 12:51 ` WANG Cong
@ 2011-03-09  2:40   ` Weiping Pan
  2011-03-09  6:02     ` Américo Wang
  2011-03-09  3:38   ` Weiping Pan
  1 sibling, 1 reply; 13+ messages in thread
From: Weiping Pan @ 2011-03-09  2:40 UTC (permalink / raw)
  To: WANG Cong; +Cc: netdev

On 03/08/2011 08:51 PM, WANG Cong wrote:
> On Tue, 08 Mar 2011 14:52:52 +0800, Weiping Pan wrote:
>
>> ok, I use "Host-only mode" and get
>>
>>
>>
>> an conclusion, that if the first enslaved nic is pulled out, bonding
>> can't handle well.
>>
>> First test.
>> I first enslave eth6, then pull it out, bonding doesn't work. on host,
> ...
>>
>> Second test
>> I first enslave eth6, then pull eth7 out, bonding works well. on guest,
> Can you show me your /proc/net/bonding/bond0 before and after pulling down
> eth6 or eth7? And what does `ip link show` say?
>
OK, let me repeat my test and gather more information.
on host,
[root@localhost ~]# VBoxManage -v
4.0.2r69518
[root@localhost ~]# VBoxManage showvminfo 
67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2|grep ^NIC
NIC 1:           MAC: 0800270481A8, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 2:           MAC: 08002778F641, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 3:           MAC: 080027C408BA, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0
NIC 4:           MAC: 080027DB339A, Attachment: Host-only Interface 
'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, 
Reported speed: 0 Mbps, Boot priority: 0


[root@localhost ~]# ifconfig
vboxnet0  Link encap:Ethernet  HWaddr 0A:00:27:00:00:00
           inet addr:192.168.56.1  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::800:27ff:fe00:0/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3438 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 b)  TX bytes:296716 (289.7 KiB)

vboxnet0:0 Link encap:Ethernet  HWaddr 0A:00:27:00:00:00
           inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

vboxnet0:1 Link encap:Ethernet  HWaddr 0A:00:27:00:00:00
           inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
[root@localhost ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 vboxnet0
192.168.56.0    *               255.255.255.0   U     0      0        0 vboxnet0
10.66.64.0      *               255.255.254.0   U     1      0        0 eth0
default         corerouter.nay. 0.0.0.0         UG    0      0        0 eth0
[root@localhost ~]# ip neigh show
10.66.65.254 dev eth0 lladdr 00:1d:45:20:d5:ff REACHABLE


Then I restarted the guest:
[root@localhost ~]# ifconfig
eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:04:81:A8
           inet addr:192.168.56.102  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe04:81a8/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

eth8      Link encap:Ethernet  HWaddr 08:00:27:DB:33:9A
           inet addr:192.168.56.104  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fedb:339a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

eth9      Link encap:Ethernet  HWaddr 08:00:27:C4:08:BA
           inet addr:192.168.56.103  Bcast:192.168.56.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fec4:8ba/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:14 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:6772 (6.6 KiB)  TX bytes:1152 (1.1 KiB)

So, according to the MAC addresses, the mapping is:
NIC 1:           eth7
NIC 2:           eth6
NIC 3:           eth9
NIC 4:           eth8
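
The mapping above can be reproduced mechanically. As a rough sketch only (the function names and the optional second argument are hypothetical, not part of any tool), the following shell helpers convert a VBoxManage-style MAC into the colon-separated form and look it up under /sys/class/net. This only works before enslaving: once a NIC joins bond0 it reports the bond's MAC, as the later ifconfig output shows.

```shell
# Convert a VirtualBox-style MAC (e.g. 0800270481A8) into the
# lowercase colon-separated form the kernel reports.
vbox_mac_to_colon() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f' | sed 's/../&:/g; s/:$//'
}

# Print the name of every interface whose current address matches the MAC.
# $1: MAC as printed by VBoxManage
# $2: sysfs root (hypothetical knob for testing; defaults to /sys/class/net)
iface_for_mac() {
    mac=$(vbox_mac_to_colon "$1")
    root=${2:-/sys/class/net}
    for f in "$root"/*/address; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "$mac" ]; then
            basename "$(dirname "$f")"
        fi
    done
}
```

For example, `iface_for_mac 0800270481A8` on the guest would be expected to print the interface that VirtualBox calls NIC 1.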


[root@localhost ~]# ifconfig eth6 down
[root@localhost ~]# ifconfig eth7 down
[root@localhost ~]# ifconfig eth8 down
[root@localhost ~]# ifconfig eth9 down
[root@localhost ~]# modprobe bonding mode=0 miimon=100
[root@localhost ~]# ifconfig bond0 192.168.1.2 netmask 255.255.255.0 up
[root@localhost ~]# ifenslave bond0 eth6
[root@localhost ~]# ifenslave bond0 eth7
[root@localhost ~]# dmesg
[ 1436.344751] bonding: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
[ 1436.344756] bonding: MII link monitoring set to 100 ms
[ 1480.485933] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 1490.087608] e1000: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 1490.091795] bonding: bond0: enslaving eth6 as an active interface with an up link.
[ 1490.092326] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 1492.676643] e1000: eth7 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 1492.684839] bonding: bond0: enslaving eth7 as an active interface with an up link.
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth6
MII Status: up
Link Failure Count: 0
Permanent HW addr: 08:00:27:78:f6:41

Slave Interface: eth7
MII Status: up
Link Failure Count: 0
Permanent HW addr: 08:00:27:04:81:a8
[root@localhost ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
           inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
           RX packets:40 errors:0 dropped:0 overruns:0 frame:0
           TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:19632 (19.1 KiB)  TX bytes:4070 (3.9 KiB)

eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:21 errors:0 dropped:0 overruns:0 frame:0
           TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:10158 (9.9 KiB)  TX bytes:1802 (1.7 KiB)

eth7      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
           inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
           RX packets:19 errors:0 dropped:0 overruns:0 frame:0
           TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:9474 (9.2 KiB)  TX bytes:2268 (2.2 KiB)

[root@localhost ~]# ping 192.168.1.100 -c 5
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=3.69 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.242 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.154 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.270 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.278 ms

--- 192.168.1.100 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4004ms
rtt min/avg/max/mdev = 0.154/0.928/3.696/1.384 ms

[root@localhost ~]# dmesg -c &>/dev/null

on host,
[root@localhost ~]# VBoxManage controlvm 67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 setlinkstate2 off

on guest,
[root@localhost ~]# dmesg
[ 1731.945272] e1000: eth6 NIC Link is Down
[ 1732.026207] bonding: bond0: link status definitely down for interface eth6, disabling it
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth6
MII Status: down
Link Failure Count: 1
Permanent HW addr: 08:00:27:78:f6:41

Slave Interface: eth7
MII Status: up
Link Failure Count: 0
Permanent HW addr: 08:00:27:04:81:a8
[root@localhost ~]# ping 192.168.1.100 -c 5
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
From 192.168.1.2 icmp_seq=2 Destination Host Unreachable
From 192.168.1.2 icmp_seq=3 Destination Host Unreachable
From 192.168.1.2 icmp_seq=4 Destination Host Unreachable
From 192.168.1.2 icmp_seq=5 Destination Host Unreachable

--- 192.168.1.100 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4001ms
pipe 3
[root@localhost ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth7: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP qlen 1000
     link/ether 08:00:27:78:f6:41 brd ff:ff:ff:ff:ff:ff
3: eth6: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 state DOWN qlen 1000
     link/ether 08:00:27:78:f6:41 brd ff:ff:ff:ff:ff:ff
4: eth9: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
     link/ether 08:00:27:c4:08:ba brd ff:ff:ff:ff:ff:ff
5: eth8: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
     link/ether 08:00:27:db:33:9a brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
     link/ether 08:00:27:78:f6:41 brd ff:ff:ff:ff:ff:ff
[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0  FAILED

on host,
[root@localhost ~]# ip neigh show
10.66.65.254 dev eth0 lladdr 00:1d:45:20:d5:ff STALE
192.168.1.2 dev vboxnet0 lladdr 08:00:27:78:f6:41 STALE

On the guest, I ran ping while capturing ARP traffic with tcpdump:
[root@localhost ~]# ping 192.168.1.100
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
From 192.168.1.2 icmp_seq=11 Destination Host Unreachable
From 192.168.1.2 icmp_seq=12 Destination Host Unreachable
From 192.168.1.2 icmp_seq=13 Destination Host Unreachable
From 192.168.1.2 icmp_seq=15 Destination Host Unreachable
From 192.168.1.2 icmp_seq=16 Destination Host Unreachable
From 192.168.1.2 icmp_seq=17 Destination Host Unreachable
From 192.168.1.2 icmp_seq=19 Destination Host Unreachable
From 192.168.1.2 icmp_seq=20 Destination Host Unreachable
From 192.168.1.2 icmp_seq=21 Destination Host Unreachable
From 192.168.1.2 icmp_seq=23 Destination Host Unreachable
From 192.168.1.2 icmp_seq=24 Destination Host Unreachable
From 192.168.1.2 icmp_seq=25 Destination Host Unreachable
From 192.168.1.2 icmp_seq=27 Destination Host Unreachable
From 192.168.1.2 icmp_seq=28 Destination Host Unreachable
From 192.168.1.2 icmp_seq=29 Destination Host Unreachable
^C
--- 192.168.1.100 ping statistics ---
30 packets transmitted, 0 received, +15 errors, 100% packet loss, time 29011ms
pipe 3
[root@localhost ~]# tcpdump -i bond0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
21:22:40.694058 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:41.695875 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:42.698067 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:44.689068 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:45.689837 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:46.692076 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:48.691080 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:49.693828 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:50.696074 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:52.693070 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:53.693837 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:54.696075 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:56.694072 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:57.696063 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:22:58.698096 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:00.696065 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:01.698072 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:02.700063 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:04.698071 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:05.700504 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
21:23:06.702009 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
^C
21 packets captured
21 packets received by filter
0 packets dropped by kernel

meanwhile on host,
[root@localhost ~]# tcpdump -i vboxnet0 -p arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vboxnet0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:22:56.936833 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:22:56.936839 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:22:57.938415 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:22:57.938422 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:22:58.939875 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:22:58.939881 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:00.937292 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:00.937299 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:01.939242 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:01.939249 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:02.940559 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:02.940565 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:04.938148 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:04.938156 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:05.939688 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:05.939695 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:06.941288 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
10:23:06.941295 ARP, Request who-has 192.168.1.100 tell 192.168.1.2, length 28
^C
18 packets captured
18 packets received by filter
0 packets dropped by kernel


This may be the cause of the problem: the guest never receives an ARP
reply. The host sees the requests on vboxnet0, but no reply shows up in
either capture, so maybe VirtualBox's virtual network does not pass it on.
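
One way to quantify this, as a sketch only, is to count ARP requests and replies in tcpdump's text output. The helper name arp_summary is made up for illustration; it relies on tcpdump labelling packets "ARP, Request" and "ARP, Reply" in its default output.

```shell
# Summarize ARP traffic from tcpdump text output read on stdin.
# Example use (interface name taken from the captures above):
#   tcpdump -n -i vboxnet0 -c 20 arp | arp_summary
arp_summary() {
    awk '
        /ARP, Request/ { req++ }
        /ARP, Reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

Fed the host capture above, this would report 18 requests and no replies, which matches the guest's neighbour entry staying in the FAILED state.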

on host,
[root@localhost ~]# VBoxManage controlvm 67b83c47-0ee2-46bc-b0ff-e0eb43edc1c2 setlinkstate2 on

on guest,
[root@localhost ~]# dmesg
[ 2392.304591] e1000: eth6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 2392.381223] bonding: bond0: link status definitely up for interface eth6.
[root@localhost ~]# ping 192.168.1.100 -c5
PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=0.648 ms
64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.208 ms
64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.187 ms
64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.274 ms
64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.216 ms

--- 192.168.1.100 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4002ms
rtt min/avg/max/mdev = 0.187/0.306/0.648/0.174 ms

[root@localhost ~]# ip neigh show
192.168.1.100 dev bond0 lladdr 0a:00:27:00:00:00 REACHABLE


thanks
Weiping Pan




* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-08 12:51 ` WANG Cong
  2011-03-09  2:40   ` Weiping Pan
@ 2011-03-09  3:38   ` Weiping Pan
  1 sibling, 0 replies; 13+ messages in thread
From: Weiping Pan @ 2011-03-09  3:38 UTC (permalink / raw)
  To: WANG Cong; +Cc: netdev

On 03/08/2011 08:51 PM, WANG Cong wrote:
> On Tue, 08 Mar 2011 14:52:52 +0800, Weiping Pan wrote:
>
>> OK, I used "Host-only mode" and reached a conclusion: if the first
>> enslaved NIC is pulled out, bonding can't handle it well.
>>
>> First test.
>> I first enslave eth6, then pull it out, bonding doesn't work. on host,
> ...
>>
>> Second test
>> I first enslave eth6, then pull eth7 out, bonding works well. on guest,
> Can you show me your /proc/net/bonding/bond0 before and after pulling down
> eth6 or eth7? And what does `ip link show` say?
>
I tested it again under KVM, and there is no problem there.
Maybe there is a bug in VirtualBox.

thanks
Weiping Pan


* Re: bonding can't change to another slave if you ifdown the active slave
  2011-03-09  2:40   ` Weiping Pan
@ 2011-03-09  6:02     ` Américo Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Américo Wang @ 2011-03-09  6:02 UTC (permalink / raw)
  To: Weiping Pan; +Cc: WANG Cong, netdev

On Wed, Mar 09, 2011 at 10:40:36AM +0800, Weiping Pan wrote:

<...>

>[root@localhost ~]# ifconfig
>bond0     Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
>          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
>          inet6 addr: fe80::a00:27ff:fe78:f641/64 Scope:Link
>          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:40 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:19632 (19.1 KiB)  TX bytes:4070 (3.9 KiB)
>
>eth6      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
>          inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:21 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:10158 (9.9 KiB)  TX bytes:1802 (1.7 KiB)
>
>eth7      Link encap:Ethernet  HWaddr 08:00:27:78:F6:41
>          inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
>          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>          RX packets:19 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:9474 (9.2 KiB)  TX bytes:2268 (2.2 KiB)


It's odd that the bond sits on a different subnet from the two NICs you enslaved.

>
>[root@localhost ~]# ping 192.168.1.100 -c 5
>PING 192.168.1.100 (192.168.1.100) 56(84) bytes of data.
>64 bytes from 192.168.1.100: icmp_req=1 ttl=64 time=3.69 ms
>64 bytes from 192.168.1.100: icmp_req=2 ttl=64 time=0.242 ms
>64 bytes from 192.168.1.100: icmp_req=3 ttl=64 time=0.154 ms
>64 bytes from 192.168.1.100: icmp_req=4 ttl=64 time=0.270 ms
>64 bytes from 192.168.1.100: icmp_req=5 ttl=64 time=0.278 ms
>

What does the routing table in your guest look like?


>[root@localhost ~]# ip link show
>1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>2: eth7: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP qlen 1000
>    link/ether 08:00:27:78:f6:41 brd ff:ff:ff:ff:ff:ff
>3: eth6: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 state DOWN qlen 1000

Clearly this is equivalent to pulling the cable out of eth6,
which is your first slave.
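
For comparing the bond state before and after such a link change, a small helper along these lines could condense /proc/net/bonding/bond0 to one line per slave. This is only a sketch: bond_slave_status is a hypothetical name, and the parsing assumes the file layout shown earlier in the thread (a "Slave Interface:" line followed by that slave's "MII Status:" line).

```shell
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# $1: path to the status file (defaults to /proc/net/bonding/bond0)
bond_slave_status() {
    awk -F': ' '
        # Remember which slave the following fields belong to.
        /^Slave Interface:/ { slave = $2 }
        # The bond-level "MII Status" line comes before any slave, so skip it.
        /^MII Status:/ && slave != "" { print slave, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run once before and once after the link is switched off, the two outputs can be compared directly with diff.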



Thread overview: 13+ messages
2011-03-04  2:15 bonding can't change to another slave if you ifdown the active slave Weiping Pan
2011-03-05  0:38 ` Jay Vosburgh
2011-03-07  3:23   ` Weiping Pan
2011-03-05  2:53 ` Andy Gospodarek
2011-03-05 13:49   ` Nicolas de Pesloüan
2011-03-07  3:13     ` Weiping Pan
2011-03-07 21:15       ` Nicolas de Pesloüan
2011-03-07  4:20   ` Weiping Pan
  -- strict thread matches above, loose matches on Subject: below --
2011-03-08  6:52 Weiping Pan
2011-03-08 12:51 ` WANG Cong
2011-03-09  2:40   ` Weiping Pan
2011-03-09  6:02     ` Américo Wang
2011-03-09  3:38   ` Weiping Pan
