netdev.vger.kernel.org archive mirror
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
To: Andy Gospodarek <gospo@cumulusnetworks.com>,
	netdev@vger.kernel.org, davem@davemloft.net
Cc: ddutt@cumulusnetworks.com
Subject: Re: [PATCH net-next] net: change fib behavior based on interface link status
Date: Wed, 03 Jun 2015 11:35:09 +0200
Message-ID: <1433324109.1444360.285555441.44C7F90D@webmail.messagingengine.com>
In-Reply-To: <1433300839-18511-1-git-send-email-gospo@cumulusnetworks.com>

On Wed, Jun 3, 2015, at 05:07, Andy Gospodarek wrote:
> This patch adds the ability to have the Linux kernel track whether or
> not a particular route should be used based on the link-status of the
> interface associated with the next-hop.
> 
> Before this patch any link-failure on an interface that was serving as a
> gateway for some systems could result in those systems being isolated
> from the rest of the network as the stack would continue to attempt to
> send frames out of an interface that is actually linked-down.  When the
> kernel is responsible for all forwarding, it should also be responsible
> for taking action when the traffic can no longer be forwarded -- there
> is no real need to outsource link-monitoring to userspace anymore.
> 
> This feature is only enabled with the new sysctl set (default is off):
> net.core.kill_routes_on_linkdown = 1
> 
> When this is set, the following behavior can be observed (interface p8p1
> is link-down):
> 
> # ip route show 
> default via 10.0.5.2 dev p9p1 
> 10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15 
> 70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1 
> 80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead 
> 90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead 
> 90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2 
> # ip route get 90.0.0.1 
> 90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1 
>     cache 
> # ip route get 80.0.0.1 
> local 80.0.0.1 dev lo  src 80.0.0.1 
>     cache <local> 
> # ip route get 80.0.0.2
> 80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15 
>     cache 
> 
> While the route does remain in the table (so it can be modified if
> needed rather than being wiped away as it would be if IFF_UP was
> cleared), the proper next-hop is chosen automatically when the link is
> down.  Now interface p8p1 is linked-up:
> 
> # ip route show 
> default via 10.0.5.2 dev p9p1 
> 10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15 
> 70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1 
> 80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 
> 90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 
> 90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2 
> 192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2 
> # ip route get 90.0.0.1 
> 90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1 
>     cache 
> # ip route get 80.0.0.1 
> local 80.0.0.1 dev lo  src 80.0.0.1 
>     cache <local> 
> # ip route get 80.0.0.2
> 80.0.0.2 dev p8p1  src 80.0.0.1 
>     cache 
> 
> and the output changes to what one would expect.
> 
> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
> Suggested-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
> 
> ---
> Though there were some that preferred not to have a configuration option
> and to make this behavior the default when it was discussed in Ottawa
> earlier this year since "it was time to do this."  I wanted to propose
> the config option to preserve the current behavior for those that desire
> it.  I'll happily remove it if Dave and Linus approve.

I raised the concern that if we don't have any other fallback route
and the kernel decides to send ICMP errors back to the end host, those
error messages could kill TCP connections. With the current behavior
the packet is silently dropped and TCP retries; no ICMP error message
is sent by intermediate routers. This is especially important when a
default route suffers only a short link-loss event.

Thus I prefer the configuration option with the default value being off.
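
To make the trade-off concrete, here is a minimal sketch of the lookup
behavior described above. It is not the kernel's fib code: Route,
ROUTES and lookup() are hypothetical names, and route selection is
simplified to longest prefix first, then lowest metric. A result of
None stands for "no usable route", which is the case where an ICMP
error could go back to the sender.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified stand-in for one routing table entry."""
    prefix: str
    dev: str
    via: Optional[str] = None
    metric: int = 0


# Routing table taken from the quoted example (local routes omitted).
ROUTES = [
    Route("0.0.0.0/0", "p9p1", via="10.0.5.2"),
    Route("10.0.5.0/24", "p9p1"),
    Route("70.0.0.0/24", "p7p1"),
    Route("80.0.0.0/24", "p8p1"),
    Route("90.0.0.0/24", "p8p1", via="80.0.0.2", metric=1),
    Route("90.0.0.0/24", "p7p1", via="70.0.0.2", metric=2),
]


def lookup(dst, routes, link_down, kill_routes_on_linkdown):
    """Return the best route for dst, or None if no route is usable.

    Assumption: with the sysctl set, routes whose device is link-down
    ("dead" in the ip route output) are skipped but stay in the table.
    """
    addr = ipaddress.ip_address(dst)
    candidates = []
    for route in routes:
        net = ipaddress.ip_network(route.prefix)
        if addr not in net:
            continue
        if kill_routes_on_linkdown and route.dev in link_down:
            continue  # dead route: ignored for lookup, not deleted
        # Sort key: longest prefix first, then lowest metric.
        candidates.append((-net.prefixlen, route.metric, route))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[:2])[2]


if __name__ == "__main__":
    down = {"p8p1"}
    # Sysctl on: traffic moves to the metric 2 route and the default route.
    print(lookup("90.0.0.1", ROUTES, down, True))
    print(lookup("80.0.0.2", ROUTES, down, True))
    # Sysctl off: the link-down route is still chosen.
    print(lookup("90.0.0.1", ROUTES, down, False))
    # No fallback route at all: sysctl on leaves nothing to use.
    only = [Route("90.0.0.0/24", "p8p1", via="80.0.0.2", metric=1)]
    print(lookup("90.0.0.1", only, down, True))
```

In the last call the lookup comes back empty as soon as the link drops,
while with the sysctl off the same table still returns the p8p1 route
and the packet is simply lost until the link returns.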

This is a great feature, thanks!
Hannes

Thread overview: 31+ messages
2015-06-03  3:07 [PATCH net-next] net: change fib behavior based on interface link status Andy Gospodarek
2015-06-03  5:03 ` Scott Feldman
2015-06-03 15:12   ` Andy Gospodarek
2015-06-03 17:40     ` Scott Feldman
2015-06-03 17:57       ` Andy Gospodarek
2015-06-03 18:12         ` Scott Feldman
2015-06-03 18:16           ` Andy Gospodarek
2015-06-03  9:35 ` Hannes Frederic Sowa [this message]
2015-06-03 13:49   ` Andy Gospodarek
2015-06-03 13:53 ` Neil Horman
2015-06-03 14:13   ` Hannes Frederic Sowa
2015-06-03 14:21     ` Neil Horman
2015-06-03 14:45       ` Hannes Frederic Sowa
2015-06-03 14:46       ` Andy Gospodarek
2015-06-03 15:02         ` Neil Horman
2015-06-03 15:16           ` Andy Gospodarek
2015-06-03 17:47             ` Neil Horman
2015-06-03 15:17           ` Hannes Frederic Sowa
2015-06-03 14:03 ` Hannes Frederic Sowa
2015-06-03 14:24   ` Andy Gospodarek
2015-06-03 18:15 ` Scott Feldman
2015-06-03 18:27   ` Andy Gospodarek
2015-06-03 19:03     ` Scott Feldman
2015-06-03 20:11       ` Andy Gospodarek
2015-06-03 20:04     ` Hannes Frederic Sowa
2015-06-03 20:34       ` Andy Gospodarek
2015-06-03 20:36         ` Hannes Frederic Sowa
2015-06-03 18:25 ` Alexander Duyck
2015-06-03 20:02   ` Andy Gospodarek
2015-06-03 21:01     ` Alexander Duyck
2015-06-05 19:05       ` Andy Gospodarek
