* bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
@ 2011-12-27 14:31 Narendra_K
2011-12-27 19:56 ` Nicolas de Pesloüan
2011-12-28 20:08 ` Jay Vosburgh
0 siblings, 2 replies; 6+ messages in thread
From: Narendra_K @ 2011-12-27 14:31 UTC (permalink / raw)
To: netdev; +Cc: fubar
[-- Attachment #1: Type: text/plain, Size: 4775 bytes --]
Hello,
On kernel version 3.2-rc6, when a bonding device is configured in 'balance-alb' mode,
ping reported packet loss. Looking at the protocol trace, it seemed that the lost
packets had the destination MAC address of the inactive slave.
Scenario:
Host under test:
bond0 IP addr: 10.2.2.1 - balance-alb mode, 2 or more slaves.
Remote Host1: 10.2.2.11
Remote Host2: 10.2.2.2
Ping the Host1 IP. Observe that there is no packet loss:
# ping 10.2.2.11
PING 10.2.2.11 (10.2.2.11) 56(84) bytes of data.
64 bytes from 10.2.2.11: icmp_seq=1 ttl=64 time=0.156 ms
64 bytes from 10.2.2.11: icmp_seq=2 ttl=64 time=0.130 ms
64 bytes from 10.2.2.11: icmp_seq=3 ttl=64 time=0.151 ms
64 bytes from 10.2.2.11: icmp_seq=4 ttl=64 time=0.137 ms
64 bytes from 10.2.2.11: icmp_seq=5 ttl=64 time=0.151 ms
64 bytes from 10.2.2.11: icmp_seq=6 ttl=64 time=0.129 ms
^C
--- 10.2.2.11 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 4997ms
rtt min/avg/max/mdev = 0.129/0.142/0.156/0.014 ms
Now ping the Host2 IP. Observe that there is packet loss. It is reproducible almost
every time.
# ping 10.2.2.2
PING 10.2.2.2 (10.2.2.2) 56(84) bytes of data.
64 bytes from 10.2.2.2: icmp_seq=6 ttl=64 time=0.108 ms
64 bytes from 10.2.2.2: icmp_seq=7 ttl=64 time=0.104 ms
64 bytes from 10.2.2.2: icmp_seq=8 ttl=64 time=0.119 ms
64 bytes from 10.2.2.2: icmp_seq=56 ttl=64 time=0.139 ms
64 bytes from 10.2.2.2: icmp_seq=57 ttl=64 time=0.111 ms
^C
--- 10.2.2.2 ping statistics ---
75 packets transmitted, 5 received, 93% packet loss, time 74037ms
rtt min/avg/max/mdev = 0.104/0.116/0.139/0.014 ms
More information:
Hardware information:
Dell PowerEdge R610
# lspci | grep -i ether
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
Kernel version:
3.2.0-rc6
# ethtool -i bond0
driver: bonding
version: 3.7.1
By observing the packets on remote HOST2, the sequence is:
1. 'bond0' broadcasts an ARP request with source MAC equal to the
'bond0' MAC address and receives an ARP response to it.
The next few packets are received.
2. After some time, there are 2 ARP replies from 'bond0' to HOST2
with source MAC equal to the 'inactive slave' MAC address. Now HOST2 sends
ICMP responses with destination MAC equal to the inactive slave MAC address
and these packets are dropped.
The Wireshark protocol trace is attached to this note.
3. The behavior was independent of the network adapter models.
4. Also, I added a few debug prints in 'eth_type_trans' and it seemed that the 'inactive slave'
was not receiving any frames destined to it (00:21:9b:9d:a5:74) except ARP broadcasts.
Setting the 'inactive slave' to 'promisc' mode made bond0 see the responses.
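A minimal sketch of that check (using the slave names shown further below,
em3 being the 'inactive' slave; this is only a debugging aid, not a fix):
# ip link set em3 promisc on     <--- packet drops to 10.2.2.2 stop while promisc is on
# ping -c 20 10.2.2.2
# ip link set em3 promisc off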
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: em2
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: em2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:9b:9d:a5:72
Slave queue ID: 0
Slave Interface: em3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:9b:9d:a5:74 <--- 1
Slave queue ID: 0
Slave Interface: em4
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:21:9b:9d:a5:76
Slave queue ID: 0
# ip addr show dev bond0
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 00:21:9b:9d:a5:72 brd ff:ff:ff:ff:ff:ff
inet 10.2.2.1/24 brd 10.2.2.255 scope global bond0
inet6 fe80::221:9bff:fe9d:a572/64 scope link
valid_lft forever preferred_lft forever
# ip addr show dev em2
3: em2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 00:21:9b:9d:a5:72 brd ff:ff:ff:ff:ff:ff
# ip addr show dev em3
4: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 00:21:9b:9d:a5:74 brd ff:ff:ff:ff:ff:ff
# ip addr show dev em4
5: em4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
link/ether 00:21:9b:9d:a5:76 brd ff:ff:ff:ff:ff:ff
It would be great if you have any insight into this. Please let me know if any additional information is required.
With regards,
Narendra K
[-- Attachment #2: linux-3.2-rc6-balance-alb-protocol-trace --]
[-- Type: application/octet-stream, Size: 14230 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
2011-12-27 14:31 bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6 Narendra_K
@ 2011-12-27 19:56 ` Nicolas de Pesloüan
2011-12-28 7:25 ` Narendra_K
2011-12-28 20:08 ` Jay Vosburgh
1 sibling, 1 reply; 6+ messages in thread
From: Nicolas de Pesloüan @ 2011-12-27 19:56 UTC (permalink / raw)
To: Narendra_K; +Cc: netdev, fubar
On 27/12/2011 15:31, Narendra_K@Dell.com wrote:
> Hello,
>
> On kernel version 3.2-rc6, when a bonding device is configured in 'balance-alb' mode,
> ping reported packet loss. Looking at the protocol trace, it seemed that the lost
> packets had the destination MAC address of the inactive slave.
>
[snip]
Can you provide the output of the following command?
grep . /sys/class/net/bond0/bonding/*
Thanks,
Nicolas.
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
2011-12-27 19:56 ` Nicolas de Pesloüan
@ 2011-12-28 7:25 ` Narendra_K
2011-12-28 17:59 ` Narendra_K
0 siblings, 1 reply; 6+ messages in thread
From: Narendra_K @ 2011-12-28 7:25 UTC (permalink / raw)
To: nicolas.2p.debian; +Cc: netdev, fubar
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Nicolas de Pesloüan
> Sent: Wednesday, December 28, 2011 1:26 AM
> To: K, Narendra
> Cc: netdev@vger.kernel.org; fubar@us.ibm.com
> Subject: Re: bonding device in balance-alb mode shows packet loss in kernel
> 3.2-rc6
>
> On 27/12/2011 15:31, Narendra_K@Dell.com wrote:
> > Hello,
> >
> > On kernel version 3.2-rc6, when a bonding device is configured in
> > 'balance-alb' mode, ping reported packet loss. Looking at the
> > protocol trace, it seemed that the lost packets had the destination MAC
> > address of the inactive slave.
> >
>
> [snip]
>
> Can you provide the output of the following command?
>
> grep . /sys/class/net/bond0/bonding/*
Hi, thanks for the response. Please find the output here -
# grep . /sys/class/net/bond0/bonding/*
/sys/class/net/bond0/bonding/active_slave:em2
/sys/class/net/bond0/bonding/ad_select:stable 0
/sys/class/net/bond0/bonding/all_slaves_active:0
/sys/class/net/bond0/bonding/arp_interval:0
/sys/class/net/bond0/bonding/arp_validate:none 0
/sys/class/net/bond0/bonding/downdelay:0
/sys/class/net/bond0/bonding/fail_over_mac:none 0
/sys/class/net/bond0/bonding/lacp_rate:slow 0
/sys/class/net/bond0/bonding/miimon:100
/sys/class/net/bond0/bonding/mii_status:up
/sys/class/net/bond0/bonding/min_links:0
/sys/class/net/bond0/bonding/mode:balance-alb 6
/sys/class/net/bond0/bonding/num_grat_arp:1
/sys/class/net/bond0/bonding/num_unsol_na:1
/sys/class/net/bond0/bonding/primary_reselect:always 0
/sys/class/net/bond0/bonding/queue_id:em2:0 em3:0 em4:0
/sys/class/net/bond0/bonding/resend_igmp:1
/sys/class/net/bond0/bonding/slaves:em2 em3 em4
/sys/class/net/bond0/bonding/updelay:0
/sys/class/net/bond0/bonding/use_carrier:1
/sys/class/net/bond0/bonding/xmit_hash_policy:layer2 0
With regards,
Narendra K
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
2011-12-28 7:25 ` Narendra_K
@ 2011-12-28 17:59 ` Narendra_K
0 siblings, 0 replies; 6+ messages in thread
From: Narendra_K @ 2011-12-28 17:59 UTC (permalink / raw)
To: nicolas.2p.debian; +Cc: netdev, fubar, Shyam_Iyer, Surya_Prabhakar
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-
> > owner@vger.kernel.org] On Behalf Of Nicolas de Pesloüan
> > Sent: Wednesday, December 28, 2011 1:26 AM
> > To: K, Narendra
> > Cc: netdev@vger.kernel.org; fubar@us.ibm.com
> > Subject: Re: bonding device in balance-alb mode shows packet loss in
> > kernel
> > 3.2-rc6
> >
> > On 27/12/2011 15:31, Narendra_K@Dell.com wrote:
> > > Hello,
> > >
> > > On kernel version 3.2-rc6, when a bonding device is configured in
> > > 'balance-alb' mode, ping reported packet loss. Looking at the
> > > protocol trace, it seemed that the lost packets had the destination
> > > MAC address of the inactive slave.
> > >
> >
> > [snip]
> >
> > Can you provide the output of the following command?
> >
> > grep . /sys/class/net/bond0/bonding/*
>
> Hi, thanks for the response. Please find the output here -
>
> # grep . /sys/class/net/bond0/bonding/*
> /sys/class/net/bond0/bonding/active_slave:em2
> /sys/class/net/bond0/bonding/ad_select:stable 0
> /sys/class/net/bond0/bonding/all_slaves_active:0
> /sys/class/net/bond0/bonding/arp_interval:0
> /sys/class/net/bond0/bonding/arp_validate:none 0
> /sys/class/net/bond0/bonding/downdelay:0
> /sys/class/net/bond0/bonding/fail_over_mac:none 0
> /sys/class/net/bond0/bonding/lacp_rate:slow 0
> /sys/class/net/bond0/bonding/miimon:100
> /sys/class/net/bond0/bonding/mii_status:up
> /sys/class/net/bond0/bonding/min_links:0
> /sys/class/net/bond0/bonding/mode:balance-alb 6
> /sys/class/net/bond0/bonding/num_grat_arp:1
> /sys/class/net/bond0/bonding/num_unsol_na:1
> /sys/class/net/bond0/bonding/primary_reselect:always 0
> /sys/class/net/bond0/bonding/queue_id:em2:0 em3:0 em4:0
> /sys/class/net/bond0/bonding/resend_igmp:1
> /sys/class/net/bond0/bonding/slaves:em2 em3 em4
> /sys/class/net/bond0/bonding/updelay:0
> /sys/class/net/bond0/bonding/use_carrier:1
> /sys/class/net/bond0/bonding/xmit_hash_policy:layer2 0
Hi, this information might be useful. Since setting the 'inactive slave' to promisc mode stopped the packet drops, and 'eth_type_trans' showed only ARP broadcasts on the inactive slave, I manually assigned the 'permanent HW address' to the inactive slave, and the packet drops stopped:
ifconfig em3 hw ether <perm HW addr>
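For reference, the equivalent with iproute2 would be (a sketch, using the
permanent HW address of em3 shown in /proc/net/bonding/bond0 earlier):
# ip link set dev em3 address 00:21:9b:9d:a5:74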
With regards,
Narendra K
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
2011-12-27 14:31 bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6 Narendra_K
2011-12-27 19:56 ` Nicolas de Pesloüan
@ 2011-12-28 20:08 ` Jay Vosburgh
2011-12-30 12:22 ` Narendra_K
1 sibling, 1 reply; 6+ messages in thread
From: Jay Vosburgh @ 2011-12-28 20:08 UTC (permalink / raw)
To: Narendra_K; +Cc: netdev
<Narendra_K@Dell.com> wrote:
>Hello,
>
>On kernel version 3.2-rc6, when a bonding device is configured in 'balance-alb' mode,
>ping reported packet loss. Looking at the protocol trace, it seemed that the lost
>packets had the destination MAC address of the inactive slave.
In balance-alb mode, there isn't really an "inactive" slave in
the same sense as for active-backup mode. For this mode, the "inactive"
slave flag is used to suppress duplicates for multicast and broadcasts,
to prevent multiple copies of those from being received (if each slave
gets one copy). Unicast traffic should pass normally to all slaves.
Each slave also keeps a discrete MAC address, and peers are assigned to
particular slaves via tailored ARP messages (so, different peers may see
a different MAC for the bond's IP address).
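For illustration (assuming the peer is a Linux host), which slave MAC a given
peer was assigned can be seen in its neighbour cache entry for the bond's IP:
# ip neigh show to 10.2.2.1     <--- run on the peer; 'lladdr' is the slave MAC it was told to use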
>Scenario:
>
>Host under test:
>
>bond0 IP addr: 10.2.2.1 - balance-alb mode, 2 or more slaves.
>
>Remote Host1: 10.2.2.11
>
>Remote Host2: 10.2.2.2
>
>Ping to Host 1 IP. Observe that there is no packet loss
>
># ping 10.2.2.11
>PING 10.2.2.11 (10.2.2.11) 56(84) bytes of data.
>64 bytes from 10.2.2.11: icmp_seq=1 ttl=64 time=0.156 ms
>64 bytes from 10.2.2.11: icmp_seq=2 ttl=64 time=0.130 ms
>64 bytes from 10.2.2.11: icmp_seq=3 ttl=64 time=0.151 ms
>64 bytes from 10.2.2.11: icmp_seq=4 ttl=64 time=0.137 ms
>64 bytes from 10.2.2.11: icmp_seq=5 ttl=64 time=0.151 ms
>64 bytes from 10.2.2.11: icmp_seq=6 ttl=64 time=0.129 ms
>^C
>--- 10.2.2.11 ping statistics ---
>6 packets transmitted, 6 received, 0% packet loss, time 4997ms
>rtt min/avg/max/mdev = 0.129/0.142/0.156/0.014 ms
>
>Now ping to Host2 IP. Observe that there is packet loss. It is reproducible almost
>always.
>
># ping 10.2.2.2
>PING 10.2.2.2 (10.2.2.2) 56(84) bytes of data.
>64 bytes from 10.2.2.2: icmp_seq=6 ttl=64 time=0.108 ms
>64 bytes from 10.2.2.2: icmp_seq=7 ttl=64 time=0.104 ms
>64 bytes from 10.2.2.2: icmp_seq=8 ttl=64 time=0.119 ms
>64 bytes from 10.2.2.2: icmp_seq=56 ttl=64 time=0.139 ms
>64 bytes from 10.2.2.2: icmp_seq=57 ttl=64 time=0.111 ms
>^C
>--- 10.2.2.2 ping statistics ---
>75 packets transmitted, 5 received, 93% packet loss, time 74037ms
>rtt min/avg/max/mdev = 0.104/0.116/0.139/0.014 ms
>
>More information:
>
>Hardware information:
>Dell PowerEdge R610
>
># lspci | grep -i ether
>01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>
>Kernel version:
>3.2.0-rc6
>
># ethtool -i bond0
>driver: bonding
>version: 3.7.1
>
>By observing the packets on remote HOST2, the sequence is:
>
>1. 'bond0' broadcasts an ARP request with source MAC equal to the
>'bond0' MAC address and receives an ARP response to it.
>The next few packets are received.
In this case, it means the peer has been assigned to the "em2"
slave.
>2. After some time, there are 2 ARP replies from 'bond0' to HOST2
>with source MAC equal to the 'inactive slave' MAC address. Now HOST2 sends
>ICMP responses with destination MAC equal to the inactive slave MAC address
>and these packets are dropped.
This part is not unusual for the balance-alb mode; the traffic
is periodically rebalanced, and in this case the peer HOST2 was likely
assigned to a different slave than it was previously. I'm not sure why
the packets don't reach their destination, but they shouldn't be dropped
due to the slave being "inactive," as I explained above.
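For illustration, the rebalancing can be watched from HOST2 (assuming its
interface is eth0) by capturing ARP with link-layer headers; the source MAC
of the replies from 10.2.2.1 should change when the peer is moved to another
slave:
# tcpdump -e -n -i eth0 arp and host 10.2.2.1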
>The wireshark protocol trace is attached to this note.
>
>3. The behavior was independent of the network adapter models.
>
>4. Also, I added a few debug prints in 'eth_type_trans' and it seemed that the 'inactive slave'
>was not receiving any frames destined to it (00:21:9b:9d:a5:74) except ARP broadcasts.
>Setting the 'inactive slave' to 'promisc' mode made bond0 see the responses.
This seems very strange, since the MAC information shown later
suggests that the slaves all are using their original MAC addresses, so
the packets ought to be delivered.
I'm out of the office until next week, so I won't have an
opportunity to try and reproduce this myself until then. I wonder if
something in the rx_handler changes over the last few months has broken
this, although a look at the code suggests that it should be doing the
right things.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6
2011-12-28 20:08 ` Jay Vosburgh
@ 2011-12-30 12:22 ` Narendra_K
0 siblings, 0 replies; 6+ messages in thread
From: Narendra_K @ 2011-12-30 12:22 UTC (permalink / raw)
To: fubar; +Cc: netdev, Surya_Prabhakar, Shyam_Iyer
> -----Original Message-----
> From: Jay Vosburgh [mailto:fubar@us.ibm.com]
> Sent: Thursday, December 29, 2011 1:39 AM
> To: K, Narendra
> Cc: netdev@vger.kernel.org
> Subject: Re: bonding device in balance-alb mode shows packet loss in kernel
> 3.2-rc6
>
> <Narendra_K@Dell.com> wrote:
>
> >By observing the packets on remote HOST2, the sequence is
> >
> >1. 'bond0' broadcasts an ARP request with source MAC equal to the 'bond0'
> >MAC address and receives an ARP response to it.
> >The next few packets are received.
>
> In this case, it means the peer has been assigned to the "em2"
> slave.
>
> >2. After some time, there are 2 ARP replies from 'bond0' to HOST2 with
> >source MAC equal to the 'inactive slave' MAC address. Now HOST2 sends ICMP
> >responses with destination MAC equal to the inactive slave MAC address and these
> >packets are dropped.
>
> This part is not unusual for the balance-alb mode; the traffic is
> periodically rebalanced, and in this case the peer HOST2 was likely assigned
> to a different slave than it was previously. I'm not sure why the packets don't
> reach their destination, but they shouldn't be dropped due to the slave being
> "inactive," as I explained above.
>
> >The wireshark protocol trace is attached to this note.
> >
> >3. The behavior was independent of the network adapter models.
> >
> >4. Also, I added a few debug prints in 'eth_type_trans' and it seemed that the
> >'inactive slave' was not receiving any frames destined to it (00:21:9b:9d:a5:74)
> >except ARP broadcasts.
> >Setting the 'inactive slave' to 'promisc' mode made bond0 see the responses.
>
> This seems very strange, since the MAC information shown later
> suggests that the slaves all are using their original MAC addresses, so the
> packets ought to be delivered.
>
> I'm out of the office until next week, so I won't have an opportunity
> to try and reproduce this myself until then. I wonder if something in the
> rx_handler changes over the last few months has broken this, although a
> look at the code suggests that it should be doing the right things.
Hi Jay, thanks for looking into this. I am out of the office next week.
I am copying Surya in case additional information is required
(please keep Surya in CC).
It was strange that 'eth_type_trans' showed only ARP broadcasts for
em3 and em4. Interestingly, when I set the permanent HW address of em3 manually
with
ifconfig em3 hw ether 00:21:9b:9d:a5:74
the packet drops stopped and 'eth_type_trans' showed unicast frames
destined to 00:21:9b:9d:a5:74.
I put a few debug prints in 'bnx2_set_mac_addr' to see which MAC addresses
were getting set in the hardware. When I stopped and started bond0,
all the slaves seemed to have the same MAC address
(that of em2 and bond0, 00:21:9b:9d:a5:72).
Also, the following change made the packet drops stop, and the prints in
'bnx2_set_mac_addr' seemed to indicate that each slave got a unique
MAC address set in the hardware.
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7f87568..e717267 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1620,7 +1620,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	 */
 	memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
 
-	if (!bond->params.fail_over_mac) {
+	if (!bond->params.fail_over_mac && !bond_is_lb(bond)) {
 		/*
 		 * Set slave to master's mac address. The application already
 		 * set the master's mac address to that of the first slave
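For reference, a quick way to re-check the symptom after rebuilding with this
change could be the following sketch (reusing the addresses from the scenario
above; 'ethtool -P' prints a slave's permanent address for comparison with the
currently programmed one shown by 'ip link'):
# ping -c 60 10.2.2.2       <--- was ~93% loss above; should now report 0% packet loss
# ethtool -P em3
# ip link show em3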
With regards,
Narendra K
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads: [~2011-12-30 12:22 UTC | newest]
Thread overview: 6+ messages
-- links below jump to the message on this page --
2011-12-27 14:31 bonding device in balance-alb mode shows packet loss in kernel 3.2-rc6 Narendra_K
2011-12-27 19:56 ` Nicolas de Pesloüan
2011-12-28 7:25 ` Narendra_K
2011-12-28 17:59 ` Narendra_K
2011-12-28 20:08 ` Jay Vosburgh
2011-12-30 12:22 ` Narendra_K