netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: Alex Sidorenko <alexandre.sidorenko@hpe.com>
Cc: netdev@vger.kernel.org, Jarod Wilson <jarod@redhat.com>
Subject: Re: Bond recovery from BOND_LINK_FAIL state not working
Date: Wed, 01 Nov 2017 14:34:05 -0700	[thread overview]
Message-ID: <28118.1509572045@famine> (raw)
In-Reply-To: <f9a6457e-f575-e9c1-2519-7694e30c474f@hpe.com>

Alex Sidorenko <alexandre.sidorenko@hpe.com> wrote:

>The problem has been found while trying to deploy RHEL7 on HPE Synergy
>platform, it is seen both in customer's environment and in HPE test lab.
>
>There are several bonds configured in TLB mode and miimon=100, all other
>options are default. Slaves are connected to VirtualConnect
>modules. Rebooting a VC module should bring one bond slave (ens3f0) down
>temporarily, but not another one (ens3f1). But what we see is
>
>Oct 24 10:37:12 SYDC1LNX kernel: bond0: link status up again after 0 ms for interface ens3f1
>
>and it never recovers. When VC reboot is complete, everything goes back to normal again.
>
>Redhat has backported all recent upstream commits and instrumented the
>bonding driver. We have found the following (when VC goes down)
>
>In bond_miimon_inspect() the first slave goes to
>	bond_propose_link_state(slave, BOND_LINK_FAIL);
>		and
>	slave->new_link = BOND_LINK_DOWN;
>
>The second slave is still
>	slave->link = BOND_LINK_UP;
>		and
>        slave->new_link = BOND_LINK_NOCHANGE;
>
>This is as expected. But in bond_miimon_commit() we see that _both_ slaves
>are in BOND_LINK_FAIL.  That is, something changes the state of the second
>slave from another thread. We suspect the NetworkManager, as the problem
>is there _only_ when bonds are controlled by it, if we set
>NM_CONTROLLED=no everything starts working normally.
>
>While we still do not understand how NM affects bond state, I think that
>bonding driver needs to be made reliable enough to recover even from this
>state.

	You're suggesting that the bonding driver be able to distinguish
"false" down states set by network manager from a genuine link failure?
Am I misunderstanding?

>At this moment when we enter bond_miimon_inspect() with
>slave->link = BOND_LINK_FAIL and are in the following code
>
>                        /*FALLTHRU*/
>                case BOND_LINK_BACK:
>                        if (!link_state) {
>                                bond_propose_link_state(slave, BOND_LINK_DOWN);
>                                netdev_info(bond->dev, "link status down
>again after %d ms for interface %s\n",
>                                            (bond->params.updelay - slave->delay) *
>                                            bond->params.miimon,
>                                            slave->dev->name);
>
>                                commit++;
>                                continue;
>                        }
>

	Looking at bonding in the current net-next, if a slave enters
bond_miimon_inspect with slave->link == BOND_LINK_FAIL, it will not
proceed to the BOND_LINK_BACK block of the switch; BOND_LINK_FAIL does
not fall through that far.

	Did you actually mean the BOND_LINK_FAIL block of the switch?
I'll assume so for the rest of my reply.

>we propose a new state and do 'commit++', but we do not change
>slave->new_link from BOND_LINK_NOCHANGE. As a result, bond_miimon_commit()
>will not process this slave.
>
>The following patch fixes the issue:
>
>****
>If we enter bond_miimon_inspect() with slave_link=BOND_LINK_FAIL
>and recover, we do bond_propose_link_state(slave, BOND_LINK_UP);
>but do not change slave->new_link, so it is left in
>BOND_LINK_NOCHANGE. As a result, bond_miimon_commit() will not
>process that slave and it never recovers. We need to set
>slave->new_link = BOND_LINK_UP to make bond_miimon_commit() work

	What is your downdelay set to?  In principle,
bond_miimon_inspect should not enter with a slave in BOND_LINK_FAIL
state if downdelay is 0.

> drivers/net/bonding/bond_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index c99dc59..07aa7ba 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2072,6 +2072,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>                                            (bond->params.downdelay - slave->delay) *
>                                            bond->params.miimon,
>                                            slave->dev->name);
>+                               slave->new_link = BOND_LINK_UP;
>                                commit++;
>                                continue;
>                        }
>-- 
>2.7.4

	Your patch does not apply to net-next, so I'm not absolutely
sure where this is, but presuming that this is in the BOND_LINK_FAIL
case of the switch, it looks like both BOND_LINK_FAIL and BOND_LINK_BACK
will have the issue that if the link recovers or fails, respectively,
within the delay window (for down/updelay > 0) it won't set a
slave->new_link.

	Looks like this got lost somewhere along the line, as originally
the transition back to UP (or DOWN) happened immediately, and that has
been lost somewhere.

	I'll have to dig out when that broke, but I'll see about a test
patch this afternoon.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

  reply	other threads:[~2017-11-01 21:34 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-01 18:09 Bond recovery from BOND_LINK_FAIL state not working Alex Sidorenko
2017-11-01 21:34 ` Jay Vosburgh [this message]
2017-11-02  0:35   ` Jay Vosburgh
2017-11-02  2:37     ` Jarod Wilson
2017-11-02  4:51       ` Jay Vosburgh
2017-11-02 12:47         ` Alex Sidorenko
2017-11-03  1:11           ` Jay Vosburgh
2017-11-03 15:40             ` Alex Sidorenko
2017-11-03 18:26               ` Jay Vosburgh
2017-11-03 19:30                 ` Alex Sidorenko
2017-11-03 21:46                   ` Jarod Wilson
2017-11-06  6:06             ` Jarod Wilson
2017-11-07  2:47               ` Jay Vosburgh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=28118.1509572045@famine \
    --to=jay.vosburgh@canonical.com \
    --cc=alexandre.sidorenko@hpe.com \
    --cc=jarod@redhat.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).