From: "zhangsha (A)" <zhangsha.zhang@huawei.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>,
	"zaharov@selectel.ru" <zaharov@selectel.ru>
Cc: "vfalico@gmail.com" <vfalico@gmail.com>,
	"andy@greyhouse.net" <andy@greyhouse.net>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	yuehaibing <yuehaibing@huawei.com>,
	hunongda <hunongda@huawei.com>,
	"Chenzhendong (alex)" <alex.chen@huawei.com>
Subject: RE: [PATCH v3] bonding: force enable lacp port after link state recovery for 802.3ad
Date: Thu, 19 Sep 2019 12:46:31 +0000
Message-ID: <e9a3d1748f0641ebb2423d2121123ff3@huawei.com>
In-Reply-To: <10098.1568880711@nyx>



> -----Original Message-----
> From: Jay Vosburgh [mailto:jay.vosburgh@canonical.com]
> Sent: September 19, 2019 16:12
> To: zhangsha (A) <zhangsha.zhang@huawei.com>
> Cc: vfalico@gmail.com; andy@greyhouse.net; davem@davemloft.net;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; yuehaibing
> <yuehaibing@huawei.com>; hunongda <hunongda@huawei.com>;
> Chenzhendong (alex) <alex.chen@huawei.com>
> Subject: Re: [PATCH v3] bonding: force enable lacp port after link state
> recovery for 802.3ad
> 
> zhangsha (A) <zhangsha.zhang@huawei.com> wrote:
> 
> >> -----Original Message-----
> >> From: zhangsha (A)
> >> Sent: September 18, 2019 21:06
> >> To: jay.vosburgh@canonical.com; vfalico@gmail.com;
> >> andy@greyhouse.net; davem@davemloft.net; netdev@vger.kernel.org;
> >> linux-kernel@vger.kernel.org; yuehaibing <yuehaibing@huawei.com>;
> >> hunongda <hunongda@huawei.com>; Chenzhendong (alex)
> >> <alex.chen@huawei.com>; zhangsha (A) <zhangsha.zhang@huawei.com>
> >> Subject: [PATCH v3] bonding: force enable lacp port after link state
> >> recovery for 802.3ad
> >>
> >> From: Sha Zhang <zhangsha.zhang@huawei.com>
> >>
> >> After commit 334031219a84 ("bonding/802.3ad: fix slave link
> >> initialization transition states") was merged, the slave's link status
> >> will be changed to BOND_LINK_FAIL from BOND_LINK_DOWN in the
> >> following scenario:
> >> - Driver reports loss of carrier and
> >>   bonding driver receives NETDEV_DOWN notifier;
> >> - slave's duplex and speed are zeroed and
> >>   its port->is_enabled is cleared to 'false';
> >> - Driver reports link recovery and
> >>   bonding driver receives NETDEV_UP notifier;
> >> - If getting speed/duplex fails here, the link status
> >>   will be changed to BOND_LINK_FAIL;
> >> - The MII monitor later recovers the slave's speed/duplex
> >>   and sets link status to BOND_LINK_UP, but leaves
> >>   'port->is_enabled' set to 'false'.
> >>
> >> In this scenario, the lacp port will not be enabled even though its speed
> >> and duplex are valid. The bond will not send LACPDUs, and its state is
> >> 'AD_STATE_DEFAULTED' forever. The simplest fix I think is to call
> >> bond_3ad_handle_link_change() in bond_miimon_commit(); this function
> >> can enable lacp after the port slave speed check.
> >> Once enabled, the lacp port can run its state machine normally after
> >> link recovery.
> >>
> >> Signed-off-by: Sha Zhang <zhangsha.zhang@huawei.com>
> >> ---
> >>  drivers/net/bonding/bond_main.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index 931d9d9..76324a5 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -2206,7 +2206,8 @@ static void bond_miimon_commit(struct bonding *bond)
> >>  			 */
> >>  			if (BOND_MODE(bond) == BOND_MODE_8023AD &&
> >>  			    slave->link == BOND_LINK_UP)
> >> -				bond_3ad_adapter_speed_duplex_changed(slave);
> >> +				bond_3ad_handle_link_change(slave,
> >> +							    BOND_LINK_UP);
> >>  			continue;
> >>
> >>  		case BOND_LINK_UP:
> >
> >Hi, David,
> >I replied to your email a while ago. I guess you may have missed it, so
> >I am resending it.
> >The following link points to the last email; please review the new one
> >again, thank you.
> >https://patchwork.ozlabs.org/patch/1151915/
> >
> >Last time, you suspected this was a driver-specific problem. I prefer to
> >believe it is not, because I found commit 4d2c0cda, whose log says
> >"Some NIC drivers don't have correct speed/duplex settings at the time
> >they send NETDEV_UP notification ...".
> >
> >Anyway, I think the lacp status should be fixed here, since
> >link monitoring (miimon) sets SPEED/DUPLEX correctly at this point.
> 
> 	I suspect this is going to be related to the concurrent discussion with
> Aleksei, and would like to see the instrumentation results from his tests before
> adding another change to attempt to resolve this.
> 
> 	Also, what device are you using for your testing, and are you able to
> run the instrumentation patch that I provided to Aleksei and provide its results?
> 	
> 	-J
> 

Yes, I think it's the same problem.
I am using a Huawei hinic card with kernel 4.19, and I got the same messages and the
weird system MAC "00:00:00:00:00:00":
"link status definitely down for interface eth6, disabling it
 link status up again after 0 ms for interface eth6"

I will run your instrumentation patch, but I may need more time.
In fact, I have tried to reproduce the problem thousands of times, but never succeeded.

> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

Thread overview: 5+ messages
2019-09-18 13:06 [PATCH v3] bonding: force enable lacp port after link state recovery for 802.3ad zhangsha.zhang
2019-09-18 13:35 ` zhangsha (A)
2019-09-19  8:11   ` Jay Vosburgh
2019-09-19 12:46     ` zhangsha (A) [this message]
2019-09-24 13:56 ` David Miller
