netdev.vger.kernel.org archive mirror
From: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
To: Scott Feldman <sfeldma@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Roopa Prabhu <roopa@cumulusnetworks.com>,
	Andy Gospodarek <gospo@cumulusnetworks.com>,
	Wilson Kok <wkok@cumulusnetworks.com>
Subject: Re: [RFC PATCH net-next v3 0/4] net: Introduce IFF_PROTO_DOWN flag.
Date: Tue, 28 Apr 2015 13:04:41 -0700
Message-ID: <CACcJQnSZJGU2ohYo=NKjiyo3n0rX-0SP2Aupt74ONbvY7aE33A@mail.gmail.com>
In-Reply-To: <CAE4R7bAbe34eOqfdEUqNAp57936Eqcu54+r_Bp75CDdB4FgThw@mail.gmail.com>

On Tue, Apr 28, 2015 at 12:37 PM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Tue, Apr 28, 2015 at 8:39 AM, Anuradha Karuppiah
> <anuradhak@cumulusnetworks.com> wrote:
>>
>>
>> On Mon, Apr 27, 2015 at 10:45 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>
>>> On Mon, Apr 27, 2015 at 10:38 AM,  <anuradhak@cumulusnetworks.com> wrote:
>>> > From: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
>>> >
>>> > User space daemons can detect errors in the network that need to be
>>> > notified to the switch device drivers.
>>> >
>>> > Drivers can react to this error state by doing a phy-down on the
>>> > switch-port which would result in a carrier-off locally and on the
>>> > directly connected switch. Doing that would prevent loops and
>>> > black-holes in the network.
>>>
>>> (Sorry if this was asked earlier)
>>>
>>> Can the application simply send a SETLINK with IFF_UP clear and the
>>> port driver's ndo_stop would bring the PHY link down?
>>
>>
>> Yes, clearing IFF_UP on detecting errors (PROTO_DOWN) is possible and we
>> tried that implementation as well. Unfortunately it failed for the
>> following reasons:
>>
>> 1. There is no way to disambiguate between admin_down (!IFF_UP) and an
>> APP/driver enforced error_down (IFF_PROTO_DOWN). Administrators or
>> automation scripts that monitor the config assumed that the switch-port
>> configuration had somehow fallen out of sync (and repeatedly attempted to
>> reinstate admin_up).
>>
>> 2. Automatic error recovery was not possible. Consider the following
>> scenario, for example:
>>    a. The MLAG peer-link is down, so the MLAG app on the secondary switch
>>       has proto_down’ed all the MLAG ports (including switch-port swp1) by
>>       clearing IFF_UP.
>>    b. At the same time the administrator is in the process of making some
>>       changes on the network connected to swp1. To avoid doing it live he
>>       would admin_disable swp1 (!IFF_UP) by doing an "ip link set swp1 down"
>>       (this is a no-op as event #a has already cleared IFF_UP on swp1).
>>    c. If the MLAG peer-link recovers at this point, the MLAG app on the
>>       secondary switch would try to automatically recover the MLAG ports
>>       by clearing proto_down (i.e. setting IFF_UP), including on swp1. Doing
>>       that overrides the administrator’s directive to keep swp1 admin_down.
>>       Overriding an admin-down in a live network can be very dangerous, so
>>       it is not possible to do auto-error-recovery unless we have a way to
>>       disambiguate between the admin and error states.
>
> That makes sense.
>
> Dang, this is so close to IFF_DORMANT.  The interface can be IFF_UP
> and link mode can be DORMANT.  Can the port driver kill PHY link if
> dev->flags&IFF_DORMANT in ndo_set_rx_mode()?  Would require
> IFF_DORMANT is included in dev->flags in __dev_change_flags().

Yes, IFF_DORMANT does seem close to what is needed. In the current/standard
interpretation, however, IFF_DORMANT keeps the switch port phy-up and running
(and most PDUs are still exchanged in the dormant state). Like you said, we
could re-interpret IFF_DORMANT in this context to phy-down the switch-port;
unfortunately we are already using IFF_DORMANT (in its standard
interpretation).

We use the dormant mode (for the MLAG app itself) to hold the MLAG port in a
brief, transitional suspended state when the switch-port link/carrier comes
up. This is done to coordinate state across the MLAG peer switches and to
ensure that egress port block masks are programmed on the peer switch before
the local switch port transitions to an OPER_UP state. If we didn't do that,
the dual-connected server would see duplicate packets every time a link-down
to link-up transition happened on an MLAG port.
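
To illustrate the hold sequence described above, here is a rough sketch. It is
a toy user-space model, not the MLAG daemon or any kernel code: the class and
method names (MlagPort, on_carrier_up, on_peer_block_mask_programmed) are
hypothetical, and the only values borrowed from the real world are the RFC 2863
operational state numbers (down = 2, dormant = 5, up = 6).

```python
from enum import IntEnum


class OperState(IntEnum):
    # RFC 2863 operational states (only the three used in this sketch).
    DOWN = 2
    DORMANT = 5
    UP = 6


class MlagPort:
    """Hypothetical model of one MLAG switch port on the local switch."""

    def __init__(self) -> None:
        self.carrier = False   # PHY link state
        self.dormant = False   # dormant hold requested by the MLAG app

    @property
    def operstate(self) -> OperState:
        # No carrier always means DOWN; with carrier, the dormant hold
        # decides between DORMANT and UP.
        if not self.carrier:
            return OperState.DOWN
        return OperState.DORMANT if self.dormant else OperState.UP

    def on_carrier_up(self) -> None:
        # Link came up: hold the port dormant until the peer is ready.
        self.carrier = True
        self.dormant = True

    def on_peer_block_mask_programmed(self) -> None:
        # Peer confirmed its egress block masks: release the hold.
        self.dormant = False

    def on_carrier_down(self) -> None:
        self.carrier = False
        self.dormant = False
```

In this model the port stays phy-up for the whole time it is dormant, which is
the standard interpretation; it never reports UP until the peer has confirmed.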

So re-interpreting IFF_DORMANT would not be easy for the MLAG use case.
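
For reference, the admin-down versus error-down scenario quoted earlier (steps
a to c) can be replayed in a small model. This is only a sketch under stated
assumptions: Port, admin_up, proto_down and phy_up are hypothetical names, and
the rule "phy is up only if admin_up and not proto_down" is how I would expect
a driver to combine the two states, not something taken from the patches.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """Hypothetical switch port with two independently owned states."""
    admin_up: bool = True      # models IFF_UP, owned by the administrator
    proto_down: bool = False   # models IFF_PROTO_DOWN, owned by the app

    @property
    def phy_up(self) -> bool:
        # Assumed driver policy: either state alone keeps the PHY down.
        return self.admin_up and not self.proto_down


def scenario_single_flag() -> bool:
    """Steps a-c when the app can only clear/set IFF_UP; returns final IFF_UP."""
    iff_up = True
    iff_up = False  # a. MLAG app error-downs swp1 by clearing IFF_UP
    iff_up = False  # b. admin runs "ip link set swp1 down" (a no-op)
    iff_up = True   # c. peer-link recovers, app sets IFF_UP again
    return iff_up


def scenario_two_flags() -> Port:
    """Steps a-c when the app owns a separate proto_down state."""
    swp1 = Port()
    swp1.proto_down = True   # a. app sets proto_down
    swp1.admin_up = False    # b. admin-down is recorded separately
    swp1.proto_down = False  # c. app clears only its own state
    return swp1
```

With a single flag, step c leaves swp1 up against the administrator's wishes.
With two states, clearing proto_down leaves admin_up untouched, so swp1 stays
down until the administrator brings it back.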

