From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: netdev@vger.kernel.org
Cc: Alex Sidorenko <alexandre.sidorenko@hpe.com>,
Mahesh Bandewar <maheshb@google.com>,
Jarod Wilson <jarod@redhat.com>,
Veaceslav Falico <vfalico@gmail.com>,
Andy Gospodarek <andy@greyhouse.net>,
"David Miller" <davem@davemloft.net>
Subject: [PATCH net] bonding: fix slave stuck in BOND_LINK_FAIL state
Date: Tue, 07 Nov 2017 19:50:07 +0900 [thread overview]
Message-ID: <15116.1510051807@nyx> (raw)
The bonding miimon logic has a flaw, in that a failure of the
rtnl_trylock can cause a slave to become permanently stuck in
BOND_LINK_FAIL state.
The sequence of events to cause this is as follows:
1) bond_miimon_inspect finds that a slave's link is down, and so
calls bond_propose_link_state, setting slave->new_link_state to
BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
non-zero.
2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
rescheduled. No change is committed.
3) bond_miimon_inspect is called again, but this time the slave
from step 1 has recovered. slave->new_link is reset to NOCHANGE, and, as
slave->link was never changed, the switch enters the BOND_LINK_UP case,
and does nothing. The pending BOND_LINK_FAIL state from step 1 remains
pending, as new_link_state is not reset.
4) The state from step 3 persists until another slave changes link
state and causes bond_miimon_inspect to return non-zero. At this point,
the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
and the slave will remain stuck in BOND_LINK_FAIL state even though it
is actually link up.
The remedy for this is to initialize new_link_state on each entry
to bond_miimon_inspect, as is already done with new_link.
Reported-by: Alex Sidorenko <alexandre.sidorenko@hpe.com>
Reviewed-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Fixes: fb9eb899a6dc ("bonding: handle link transition from FAIL to UP correctly")
---
drivers/net/bonding/bond_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c99dc59d729b..167434e952da 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2042,6 +2042,7 @@ static int bond_miimon_inspect(struct bonding *bond)
bond_for_each_slave_rcu(bond, slave, iter) {
slave->new_link = BOND_LINK_NOCHANGE;
+ slave->link_new_state = slave->link;
link_state = bond_check_dev_link(bond, slave->dev, 0);
--
2.14.1
next reply other threads:[~2017-11-07 10:50 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-07 10:50 Jay Vosburgh [this message]
2017-11-08 2:05 ` [PATCH net] bonding: fix slave stuck in BOND_LINK_FAIL state Mahesh Bandewar (महेश बंडेवार)
2017-11-08 7:08 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=15116.1510051807@nyx \
--to=jay.vosburgh@canonical.com \
--cc=alexandre.sidorenko@hpe.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=jarod@redhat.com \
--cc=maheshb@google.com \
--cc=netdev@vger.kernel.org \
--cc=vfalico@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.