netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>,
	Scott Feldman <sfeldma@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Roopa Prabhu <roopa@cumulusnetworks.com>,
	Andy Gospodarek <gospo@cumulusnetworks.com>,
	Wilson Kok <wkok@cumulusnetworks.com>
Subject: Re: [RFC PATCH net-next v3 1/4] net core: Add IFF_PROTO_DOWN support.
Date: Wed, 29 Apr 2015 16:25:07 -0700	[thread overview]
Message-ID: <CACcJQnQyW7GWp42AuRBwKpzVRAGnXoB7ddtcURW5GxDa_4wKKQ@mail.gmail.com> (raw)
In-Reply-To: <CACcJQnTYa-TuVzOi7B7cYFqht_KPbi9EB+_L0TuaPxWjPZ6oKg@mail.gmail.com>

On Wed, Apr 29, 2015 at 4:07 PM, Anuradha Karuppiah
<anuradhak@cumulusnetworks.com> wrote:
> On Wed, Apr 29, 2015 at 3:13 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> On Mon, 27 Apr 2015 10:38:21 -0700
>> anuradhak@cumulusnetworks.com wrote:
>>
>>> From: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
>>>
>>> This patch introduces an IFF_PROTO_DOWN flag that can be used by
>>> user space applications to notify drivers that errors have been
>>> detected on the device.
>>>
>>> Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
>>> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>
>> I worry that adding another bit to an already complex state API
>> will break userspace.
>>
>> There are lots of things besides iproute2 which look at those
>> flags including routing daemons (quagga), network manager, netplugd,
>> and switch controllers.
>
> Yes, I understand your concerns here. And tried to work around introducing
> a separate error flag by clearing IFF_UP on proto_down/detecting errors (as
> Scott also brought up earlier).
>
> That implementation failed because of the following reasons -
> 1. There is no way to disambiguate between admin_down (!IFF_UP) and an
> APP/driver enforced error_down (IFF_PROTO_DOWN). Administrator or
> automation-scripts that monitor the config assumed that switch-port
> configuration had somehow fallen out of sync (and attempted to reinstate the
> admin_up repeatedly).
>
> 2. Automatic error recovery was not possible; consider the following scenario
> for e.g.
>    a. The MLAG peer-link is down so the MLAG app on the secondary switch has
>       proto_down’ed all the MLAG ports (including switch-port swp1) by clearing
>       IFF_UP.
>    b. At the same time the administrator is in the process of making some
>       changes on the network connected to swp1. To avoid doing it live he would
>       admin_disable swp1 (!IFF_UP) by doing an "ip link set swp1 down" (this
>       is a no-op as event #a has already cleared IFF_UP on swp1).
>    c. If the MLAG peer-link recovers at this point the MLAG app on the
>       secondary switch would try to automatically recover the MLAG ports
>       by clearing proto_down (i.e. setting IFF_UP); including on swp1. Doing
>       that overrides the administrator’s directive to keep swp1 admin_down.
>       Overriding an admin-down in a live network can be very dangerous so it
>       is not possible to do auto-error-recovery unless we have a way to
>       disambiguate between the admin and error states.

I have the need to disambiguate the error state but it doesn't have to be an
IFF_X attribute. Stephen, Do you think it would be more easily consumable if
it were a new/separate net_device attribute instead of being a new bit in
"&struct net_device flags"?

  reply	other threads:[~2015-04-29 23:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-27 17:38 [RFC PATCH net-next v3 1/4] net core: Add IFF_PROTO_DOWN support anuradhak
2015-04-28  6:05 ` Scott Feldman
2015-04-28 15:45   ` Anuradha Karuppiah
2015-04-29 22:13 ` Stephen Hemminger
2015-04-29 23:07   ` Anuradha Karuppiah
2015-04-29 23:25     ` Anuradha Karuppiah [this message]
2015-04-29 23:33       ` Stephen Hemminger
2015-04-30  0:16         ` Anuradha Karuppiah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACcJQnQyW7GWp42AuRBwKpzVRAGnXoB7ddtcURW5GxDa_4wKKQ@mail.gmail.com \
    --to=anuradhak@cumulusnetworks.com \
    --cc=davem@davemloft.net \
    --cc=gospo@cumulusnetworks.com \
    --cc=netdev@vger.kernel.org \
    --cc=roopa@cumulusnetworks.com \
    --cc=sfeldma@gmail.com \
    --cc=stephen@networkplumber.org \
    --cc=wkok@cumulusnetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).