From: Jay Vosburgh <fubar@us.ibm.com>
To: Chris Friesen <chris.friesen@genband.com>
Cc: andy@greyhouse.net, netdev <netdev@vger.kernel.org>
Subject: Re: [BUG?] bonding, slave selection, carrier loss, etc.
Date: Fri, 10 Feb 2012 17:53:53 -0800 [thread overview]
Message-ID: <28766.1328925233@death.nxdomain> (raw)
In-Reply-To: <4F35AC78.3010907@genband.com>
Chris Friesen <chris.friesen@genband.com> wrote:
>I'm resurrecting an ancient discussion I had with Jay, because I think
>the issue described below is still present and the code he talked about
>submitting to close it doesn't appear to have ever gone in.
Yah, I never got it to work quite right; I don't remember
exactly why.
>Basically in active/backup mode with mii monitoring there is a window
>between the active slave device losing carrier and calling
>netif_carrier_off() and the miimon code actually detecting the loss of
>the carrier and selecting a new active slave.
>
>The best solution would be for bonding to just register for notification
>of the link going down. Presumably most drivers should be doing that
>properly by now, and for devices that get interrupt-driven notification
>of link status changes this would allow the bonding code to react much
>quicker.
A quick look at some drivers shows that at least acenic still
doesn't do netif_carrier_off, so converting entirely to a notifier-based
failover mechanism would break drivers that work today.
Adding a notifier callback as an additional path into something
like bond_miimon_commit may be feasible.
>Barring that, I think something like the following is needed. This is
>against 2.6.27, but could easily be reworked against current.
>
>
>
>---------------------- drivers/net/bonding/bond_main.c -----------------------
>index 8499558..e4445d8 100644
>@@ -4313,20 +4313,33 @@ static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *bond_d
> read_lock(&bond->lock);
> read_lock(&bond->curr_slave_lock);
>
> if (!BOND_IS_OK(bond)) {
> goto out;
> }
>
> if (!bond->curr_active_slave)
> goto out;
>
>+ /* Verify that the active slave is actually up before
>+ * trying to send packets. If it isn't, then
>+ * trigger the selection of a new active slave.
>+ */
>+ if (!IS_UP(bond->curr_active_slave->dev)) {
>+ read_unlock(&bond->curr_slave_lock);
>+ write_lock(&bond->curr_slave_lock);
>+ bond_select_active_slave(bond);
>+ write_unlock(&bond->curr_slave_lock);
>+ read_lock(&bond->curr_slave_lock);
>+ if (!bond->curr_active_slave)
>+ goto out;
>+ }
The problem here is going to be that bond_select_active_slave()
should be called with RTNL held (because the notifier calls it makes
require RTNL), and I'm not sure it's permissible to acquire RTNL in a
driver transmit function.
-J
> res = bond_dev_queue_xmit(bond, skb, bond->curr_active_slave->dev);
>
> out:
> if (res) {
> /* no suitable interface, frame not sent */
> dev_kfree_skb(skb);
> }
> read_unlock(&bond->curr_slave_lock);
> read_unlock(&bond->lock);
> return 0;
>
>Chris
>
>
>
>
>On 03/27/2009 06:00 PM, Jay Vosburgh wrote:
>> Chris Friesen<cfriesen@nortel.com> wrote:
>>
>>> In a much earlier version of the bonding driver we ran into problems
>>> where we could have lost carrier on one of the slaves, but at the time
>>> of xmit the bonding driver hadn't yet switched to a better slave.
>>> Because of this we added a patch very much like the one below.
>>>
>>> A quick glance at the current bonding code would seem to indicate that
>>> there could still be a window between the active slave device losing
>>> carrier and calling netif_carrier_off() and the miimon code actually
>>> detecting the loss of the carrier and selecting a new active slave.
>>> Do I have this correct? If so, would the patch below be correct?
>>
>> Yes, the window is equal to whatever the monitoring interval is
>> (for miimon) or double the interval for ARP.
>>
>> Your patch, I think, would work, but it's suboptimal in that it
>> only affects one mode, and doesn't resolve any of the bigger issues with
>> the link monitoring system in bonding (see below). Trying to do the
>> equivalent in other modes may have issues; some modes require RTNL to be
>> held when changing slave states, so it's difficult to do that from the
>> transmit routine.
>>
>>> On a related note--assuming the net driver can detect link loss and
>>> is properly calling netif_carrier_off() why do we still need to poll
>>> the status in the bonding driver? Isn't there some way to hook into
>>> the network stack and get notified when the carrier goes down?
>>
>> This is actually something I'm working on now.
>>
>> There are notifier callbacks that are tied to a driver calling
>> netif_carrier_on or _off. The problem is that a bunch of older (mostly
>> 10/100, although acenic is a gigabit) drivers don't do netif_carrier_on
>> or _off, or check their link state on a ridiculously long interval, so
>> simply dropping the current miimon implementation and replacing it with
>> the event notifier may not be feasible for backwards compatibility
>> reasons. Heck, I've still got 3c59x and acenic cards in my test
>> systems, neither of which do netif_carrier correctly; I can't be the
>> only one.
>>
>> An additional goal is to permit the state change notifications
>> (or miimon) and the ARP monitor to run concurrently. Sadly, the current
>> "link state" system can't handle two things simultaneously poking at the
>> slave's link state; if, e.g., ARP says down, but MII/notifiers says up,
>> then the link state can flap, so it needs a sort of "arbitrator."
>>
>> A minor advantage of reworking all of that is that it should end
>> up being less code when all done, and should be more modular, so it'd be
>> easier if somebody wanted to add, say, an ICMP probe monitor.
>>
>> I'll probably be posting an RFC patch next week.
>>
>> -J
>>
>> ---
>> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>>
>
>
>--
>Chris Friesen
>Software Developer
>GENBAND
>chris.friesen@genband.com
>www.genband.com
>
next prev parent reply other threads:[~2012-02-11 1:54 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <49CD5B93.7010407@nortel.com>
[not found] ` <31087.1238198438@death.nxdomain.ibm.com>
2012-02-10 23:47 ` [BUG?] bonding, slave selection, carrier loss, etc Chris Friesen
2012-02-11 1:53 ` Jay Vosburgh [this message]
2012-02-11 18:52 ` Ben Hutchings
2012-02-13 18:16 ` Chris Friesen
2012-02-13 18:48 ` Stephen Hemminger
2012-02-13 19:18 ` Chris Friesen
2012-02-13 20:24 ` Ben Hutchings
2012-02-13 20:37 ` Jay Vosburgh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=28766.1328925233@death.nxdomain \
--to=fubar@us.ibm.com \
--cc=andy@greyhouse.net \
--cc=chris.friesen@genband.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.