netdev.vger.kernel.org archive mirror
From: Jay Vosburgh <fubar@us.ibm.com>
To: Chris Friesen <chris.friesen@genband.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>, netdev@vger.kernel.org
Subject: Re: how to handle bonding failover when using a bridge over the bond?
Date: Thu, 14 Feb 2013 10:03:13 -0800
Message-ID: <23692.1360864993@death.nxdomain>
In-Reply-To: <511D141B.602@genband.com>

Chris Friesen <chris.friesen@genband.com> wrote:

>On 02/14/2013 02:01 AM, Cong Wang wrote:
>> On Wed, 13 Feb 2013 at 00:30 GMT, Chris Friesen<chris.friesen@genband.com>  wrote:
>>> On 02/12/2013 06:02 PM, Jay Vosburgh wrote:
>>>> 	The bond doesn't track all of the MACs that go through it, but
>>>> the bridge presumably does, and could respond to the FAILOVER notifier
>>>> with something to notify the switch that the port assignments for the
>>>> various MACs have changed.
>>>
>>> That would probably make sense.  I've added the bridging folks, maybe
>>> they'll have a suggestion how this sort of thing should be handled.
>>>
>>
>> It is already handled. When BONDING_FAILOVER is triggered and the MAC has
>> been changed, NETDEV_CHANGEADDR is issued too, then bridge will capture
>> it and update its fdb:
>>
>>          case NETDEV_CHANGEADDR:
>>                  spin_lock_bh(&br->lock);
>>                  br_fdb_changeaddr(p, dev->dev_addr);
>>                  changed_addr = br_stp_recalculate_bridge_id(br);
>>                  spin_unlock_bh(&br->lock);
>>
>>                  if (changed_addr)
>>                          call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev);
>>
>>                  break;
>
>I'm not familiar with the bridge code, can you elaborate on how this helps?

	I'm not sure that it does, even if you're using STP (although
I'd want to try it with STP to make sure).  This only updates the fdb
entry for the bond port's own MAC.  It won't affect the VMs' MACs (which
it shouldn't, because they don't change), and it won't send any
gratuitous updates through the bond's port that would notify the second
switch ("B" in Chris's description, below) that the switch port for the
VMs' MAC(s) has changed.

	Also, if the bond has fail_over_mac=follow, then no CHANGEADDR
is issued, because the MAC address does not change.  This is not common
(and not the case in the configuration described below), but does occur.

>The problem scenario is this:
>
>I have a host with eth0/eth1 bonded together as bond0.  eth0/eth1 are
>connected to separate L2 switches, which are interconnected.
>
>On the host there are a number of virtual machines, each with a virtual
>interface.
>
>All the virtual interfaces as well as bond0 are bridged together to allow
>the VMs, the host, and the outside world to talk to each other.
>
>Currently the host does NOT participate in STP because it is considered an
>edge node.
>
>Suppose eth0 is the active link and we pull it.  The bond will make eth1
>active and emit gratuitous arp packets for itself, so the external L2
>switches will update the location of the MAC address belonging to the
>bond.  On loss of carrier for the link to eth0 L2 switch "A" will drop the
>entries for the MAC addresses, including the ones for the virtual
>machines.
>
>The problem is that L2 switch "B" still thinks that all the virtual
>machines are accessible via L2 switch "A".  Thus any incoming packets
>destined for a virtual machine will get dropped.

	I'm trying to track down the system I tested previously to see
exactly how it is set up and why it works when yours does not.  It's
possible that it doesn't work, and the testing we did simply missed this
case.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
