From: Chris Friesen <chris.friesen@genband.com>
To: Jay Vosburgh <fubar@us.ibm.com>
Cc: netdev <netdev@vger.kernel.org>,
Stephen Hemminger <shemminger@vyatta.com>,
bridge@lists.linux-foundation.org,
bonding-devel@lists.sourceforge.net
Subject: Re: [Bridge] how to handle bonding failover when using a bridge over the bond?
Date: Wed, 13 Feb 2013 11:14:00 -0600
Message-ID: <511BC9D8.1020200@genband.com>
In-Reply-To: <511ADEBB.1000701@genband.com>

On 02/12/2013 06:30 PM, Chris Friesen wrote:
> On 02/12/2013 06:02 PM, Jay Vosburgh wrote:
>> Chris Friesen<chris.friesen@genband.com> wrote:
>>> I have a physical host with two ethernet links that are bonded
>>> together (active/backup). Each link is connected to a separate L2
>>> switch, which are in turn connected with a crosslink for
>>> redundancy.
>>>
>>> The physical host is running multiple virtual machines each with
>>> a virtual adapter. The virtual adapters and the bond are all
>>> bridged together to allow communication between the virtual
>>> machines, the host, and the outside world.
>>>
>>> Now suppose one of the slave links fails. The bond device will
>>> failover to the other slave and send out a gratuitous arp on the
>>> newly active slave. This will cause the L2 switches to update
>>> their lookup tables for the MAC address associated with the bond
>>> (so it now points to the newly active slave), but doesn't update
>>> the MAC addresses associated with the various virtual machines.
>>> If someone on the network sends a packet to one of the virtual
>>> machines, the switch will try to send it over the failed slave.
>>
>> If the link failure is such that there is no carrier on the switch
>> port, the switch will drop the forwarding entry for the virtual
>> machine's MAC address from that port. The traffic for the VM's MAC
>> would then flood to all ports, presumably including the link to
>> the other switch, which wouldn't have a forwarding entry for the
>> MAC, either (or it would be the switch link port), and would also
>> flood it to all ports, one of which is the correct one.

I talked with our networking guy. Apparently, when we pull the link to
switch A, it drops the forwarding entries for all MACs on the downed
link, but switch B still has stale entries pointing to the inter-switch
link.

If a packet destined for the VM arrives at switch B, B will send it
across to switch A. (Which is pointless since A no longer has a working
link to the MAC in question.)

If a packet destined for the VM arrives at switch A, A will flood it to
all ports, including the inter-switch link to switch B. However, switch
B still thinks the MAC address is reachable via switch A, and since the
packet arrived on that same inter-switch port, B drops it.

Once the VMs send out packets, switch B will update its tables, but if
the VMs are event-driven and mostly only respond to incoming packets,
they could end up waiting a long time.

Chris
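
The failure sequence described above can be reproduced with a toy model
of two learning switches. This is only a sketch under stated
assumptions: the Switch class, the port names ("host", "isl",
"client"), the endpoint names and the MAC addresses are all
hypothetical, and real switches add aging timers, VLANs and spanning
tree that are ignored here. It assumes the switch flushes entries for a
port when it loses carrier, as reported above.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"
VM = "52:54:00:00:00:01"        # hypothetical VM MAC behind the bridge
CLIENT_A = "02:00:00:00:00:0a"  # hypothetical station attached to switch A
CLIENT_B = "02:00:00:00:00:0b"  # hypothetical station attached to switch B


@dataclass
class Switch:
    name: str
    fdb: dict = field(default_factory=dict)    # MAC -> egress port
    ports: dict = field(default_factory=dict)  # port -> (Switch, port) or endpoint name
    down: set = field(default_factory=set)     # ports without carrier

    def link_down(self, port):
        # Losing carrier flushes every forwarding entry learned on that port.
        self.down.add(port)
        self.fdb = {mac: p for mac, p in self.fdb.items() if p != port}

    def receive(self, src, dst, in_port, delivered):
        self.fdb[src] = in_port                # learn the source address
        out = self.fdb.get(dst)
        if out == in_port:
            return                             # entry points back at ingress: filter
        # Known destination: one port. Unknown: flood to all but ingress.
        targets = [out] if out is not None else [p for p in self.ports if p != in_port]
        for port in targets:
            if port in self.down:
                continue
            peer = self.ports[port]
            if isinstance(peer, tuple):        # another switch
                peer_switch, peer_port = peer
                peer_switch.receive(src, dst, peer_port, delivered)
            else:                              # an end station
                delivered.append(peer)


def run_scenario():
    a, b = Switch("A"), Switch("B")
    a.ports = {"host": "bond-slave0", "isl": (b, "isl"), "client": "clientA"}
    b.ports = {"host": "bond-slave1", "isl": (a, "isl"), "client": "clientB"}

    # Before the failure the active slave is on switch A; a broadcast from
    # the VM teaches A "VM -> host" and B "VM -> isl".
    a.receive(VM, BROADCAST, "host", [])

    # Pull the link to switch A; the bond fails over to the slave on B.
    a.link_down("host")

    result = {}
    for key, (switch, src) in {"from_b_before": (b, CLIENT_B),
                               "from_a_before": (a, CLIENT_A)}.items():
        result[key] = []
        switch.receive(src, VM, "client", result[key])

    # The VM finally transmits through the new active slave on B.
    b.receive(VM, CLIENT_B, "host", [])

    for key, (switch, src) in {"from_b_after": (b, CLIENT_B),
                               "from_a_after": (a, CLIENT_A)}.items():
        result[key] = []
        switch.receive(src, VM, "client", result[key])
    return result


for name, endpoints in run_scenario().items():
    print(name, endpoints)
```

In this model neither packet sent before the VM transmits reaches
bond-slave1 (the new active slave); both do once switch B has relearned
the VM's MAC on its host-facing port.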