Re: how to handle bonding failover when using a bridge over the bond?

From: Jay Vosburgh <fubar@us.ibm.com>
To: Chris Friesen <chris.friesen@genband.com>
Cc: bonding-devel@lists.sourceforge.net, netdev <netdev@vger.kernel.org>
Subject: Re: how to handle bonding failover when using a bridge over the bond?
Date: Tue, 12 Feb 2013 16:02:26 -0800
Message-ID: <32261.1360713746@death.nxdomain>
In-Reply-To: <511ACE16.3080906@genband.com>

Chris Friesen <chris.friesen@genband.com> wrote:

>I've got a scenario that seems not to be handled well by the current
>bonding code in Linux, but maybe I'm missing something.
>
>I have a physical host with two ethernet links that are bonded together
>(active/backup).  Each link is connected to a separate L2 switch, which
>are in turn connected with a crosslink for redundancy.
>
>The physical host is running multiple virtual machines each with a virtual
>adapter.  The virtual adapters and the bond are all bridged together to
>allow communication between the virtual machines, the host, and the
>outside world.
>
>Now suppose one of the slave links fails. The bond device will fail over to
>the other slave and send out a gratuitous ARP on the newly active slave.
>This will cause the L2 switches to update their lookup tables for the MAC
>address associated with the bond (so it now points to the newly active
>slave), but doesn't update the MAC addresses associated with the various
>virtual machines.  If someone on the network sends a packet to one of the
>virtual machines, the switch will try to send it over the failed slave.

	If the link failure is such that there is no carrier on the
switch port, the switch will drop the forwarding entry for the virtual
machine's MAC address from that port.  The traffic for the VM's MAC
would then flood to all ports, presumably including the link to the
other switch.  The other switch wouldn't have a forwarding entry for
the MAC, either (or its entry would be the switch link port), and would
also flood the traffic to all ports, one of which is the correct one.

	Now, I'm speculating a bit here, as I have not traced out
exactly how this works.  I have discussed bonding failover with people
here who have systems set up in the manner you describe (and did some
testing), and it appears to be working for them.

	On the other hand, something like a manual change of active
slave won't bring down the carrier of the previously-active slave, and
in that case there might be a problem with traffic destined for one of
the VMs, until the VM sends something that makes it to the new switch.
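
	The two cases above (carrier loss versus a manual change of
active slave) can be illustrated with a toy model of two learning
switches.  This is only a sketch: the Switch class, the port names and
the VM MAC are all made up for illustration, the second switch is
assumed to have no entry for the VM's MAC, and real switches also age
entries out, which is not modelled here.

```python
class Switch:
    """Toy learning switch: a MAC table plus flooding for unknown MACs."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.down = set()   # ports without carrier
        self.fdb = {}       # forwarding table: MAC -> port

    def learn(self, mac, port):
        self.fdb[mac] = port

    def carrier_lost(self, port):
        # Losing carrier flushes every entry learned on that port.
        self.down.add(port)
        self.fdb = {m: p for m, p in self.fdb.items() if p != port}

    def forward(self, dst, in_port):
        """Return the set of ports the frame is sent out of."""
        if dst in self.fdb:
            return {self.fdb[dst]} - {in_port}
        return self.ports - self.down - {in_port}   # unknown MAC: flood


VM = "52:54:00:00:00:01"   # hypothetical VM MAC


def build():
    # Switch A holds slave0 (initially active), switch B holds slave1.
    a = Switch({"slave0", "xlink", "client"})
    b = Switch({"slave1", "xlink"})
    a.learn(VM, "slave0")   # VM traffic has been leaving via slave0
    return a, b


# Case 1: slave0 loses carrier, so A flushes its entry and floods.
a, b = build()
a.carrier_lost("slave0")
case1_a = a.forward(VM, "client")   # {'xlink'}
case1_b = b.forward(VM, "xlink")    # {'slave1'}: reaches the new active slave

# Case 2: manual change of active slave, carrier stays up on slave0.
a, b = build()
case2_before = a.forward(VM, "client")   # {'slave0'}: the now-inactive slave
# The VM then sends a frame via slave1; B learns it and A relearns via xlink.
b.learn(VM, "slave1")
a.learn(VM, "xlink")
case2_after = a.forward(VM, "client")    # {'xlink'}
```

	In the second case traffic keeps going to the inactive slave
until the VM transmits something, which matches the concern above.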

	Is this actually failing for you, or is this a thought
experiment?

>What's the recommended solution for this?  The logical solution would seem
>to be to have something issue GARPs for each virtual machine when the bond
>device fails over, but there doesn't seem to be any way to register for
>notification (via rtnetlink for instance) when the bond fails over.  I
>could monitor for carrier loss, but that wouldn't work for the case where
>bonding is using ARP monitoring.

	There is a NETDEV_BONDING_FAILOVER notifier that is called for
active-backup mode when a new active slave is assigned.  The
rtnetlink_event function is on that chain, and will send an rtnetlink
message, although I don't see that the actual event is included in the
message.
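
	A user-space listener for those rtnetlink messages might look
something like the sketch below.  It assumes Linux and Python's socket
module; parse_link_messages and watch_links are hypothetical names, and
the struct layouts follow struct nlmsghdr and struct ifinfomsg.  Since
the message does not appear to carry the event itself, the listener can
only tell that a link message arrived for the bond, not that a failover
caused it.

```python
import socket
import struct

NLMSG_HDR = struct.Struct("=IHHII")    # struct nlmsghdr: len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")   # struct ifinfomsg: family, type, index, flags, change
RTM_NEWLINK = 16
RTMGRP_LINK = 1


def parse_link_messages(data):
    """Yield (ifindex, flags, change) for each RTM_NEWLINK in a netlink buffer."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size or offset + length > len(data):
            break   # truncated or malformed message
        if msg_type == RTM_NEWLINK and length >= NLMSG_HDR.size + IFINFOMSG.size:
            _family, _dev_type, index, if_flags, change = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size)
            yield index, if_flags, change
        offset += (length + 3) & ~3   # netlink messages are 4-byte aligned


def watch_links(bond_ifindex):
    """Print every link message seen for the given interface index (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))   # (pid 0 = let the kernel choose, link group)
    while True:
        for index, if_flags, change in parse_link_messages(sock.recv(65535)):
            if index == bond_ifindex:
                print("link message for bond: flags=%#x change=%#x" % (if_flags, change))
```

	Something like watch_links(socket.if_nametoindex("bond0")) would
start it, where "bond0" is just an example interface name.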

	The bond doesn't track all of the MACs that go through it, but
the bridge presumably does, and could respond to the FAILOVER notifier
with something to notify the switch that the port assignments for the
various MACs have changed.
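
	As a rough user-space illustration of "something to notify the
switch", the sketch below builds one gratuitous ARP frame per VM MAC,
along the lines suggested in the original question.  It is not what the
bridge code does: the function names are hypothetical, the VM's IP
address is assumed to be known, and sending needs Linux AF_PACKET
sockets and root privileges.

```python
import socket
import struct


def gratuitous_arp(mac, ip):
    """Build a broadcast gratuitous ARP request announcing ip at mac."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, VM MAC as source, EtherType ARP.
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, 6-byte and 4-byte addresses, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender and target IP are the same, which is what makes it gratuitous.
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


def send_frame(ifname, frame):
    """Send a raw Ethernet frame out of ifname (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(frame)


# Hypothetical VM addresses, for illustration only.
frame = gratuitous_arp("52:54:00:00:00:01", "192.0.2.10")
```

	The source MAC in the Ethernet header is the part the switches
learn from, so any frame carrying the VM's MAC as its source would
update their tables in the same way.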

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
