From mboxrd@z Thu Jan 1 00:00:00 1970
From: Robert Shearman
Subject: Re: [PATCH net-next RFC] mpls: support for dead routes
Date: Fri, 30 Oct 2015 15:06:34 +0000
Message-ID: <5633877A.4060303@brocade.com>
References: <1446133748-13738-1-git-send-email-roopa@cumulusnetworks.com> <56324F09.2060103@brocade.com> <56326980.5060605@cumulusnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Cc: , ,
To: roopa
Return-path:
Received: from mx0b-000f0801.pphosted.com ([67.231.152.113]:56848 "EHLO mx0b-000f0801.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750710AbbJ3PGt (ORCPT ); Fri, 30 Oct 2015 11:06:49 -0400
In-Reply-To: <56326980.5060605@cumulusnetworks.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 29/10/15 18:46, roopa wrote:
> On 10/29/15, 9:53 AM, Robert Shearman wrote:
>> On 29/10/15 15:49, Roopa Prabhu wrote:
>>> From: Roopa Prabhu
>>>
>>> Adds support for both RTNH_F_DEAD and RTNH_F_LINKDOWN flags.
>>> This resembles ipv4 fib code. I also picked fib_rebalance from
>>> ipv4. Enabled weights support for nexthop, just because the
>>> infrastructure is already there.
>>>
>>> Signed-off-by: Roopa Prabhu
>>> ---
>>> I want to get this in before net-next closes as promised.
>>> I have tested it for the dead/linkdown flags. The multipath selection
>>> and hash calculation in the face of dead routes needs some more
>>> work. I am short on cycles this week and thought of getting some
>>> early feedback. Hence sending this out as RFC. I will continue with some
>>> more testing. Robert, I am using your hash algo but it needs some more
>>> work with dead routes. If you already have any thoughts on this, i will
>>> take them. thanks!.
>>
>> If you were to sort the array of nexthops (and by implication via addresses) by their non-deadness keeping a count of the alive nexthops, then there's no need to resort to an O(n) algorithm for selecting the nexthop, and no need to store per-nh flags.
>>
>> E.g. before eth0 link down:
>>
>> +----------------------+
>> | rt_nhn = 3           |
>> | rt_nhn_alive = 3     |
>> +----------------------+
>> | nh 0:                |
>> |   dev = eth0, ...    |
>> +----------------------+
>> | nh 1:                |
>> |   dev = eth1, ...    |
>> +----------------------+
>> | nh 2:                |
>> |   dev = eth0, ...    |
>> +----------------------+
>> | vias ...             |
>> +----------------------+
>>
>> after eth0 link down:
>>
>> +----------------------+
>> | rt_nhn = 3           |
>> | rt_nhn_alive = 1     |
>> +----------------------+
>> | nh 0:                |
>> |   dev = eth1, ...    |
>> +----------------------+
>> | nh 1:                |
>> |   dev = eth0, ...    |
>> +----------------------+
>> | nh 2:                |
>> |   dev = eth0, ...    |
>> +----------------------+
>> | vias ...             |
>> +----------------------+
>>
>> The mpls_select_multipath algorithm just then needs to be changed to use rt_nhn_alive instead of rt_nhn and will work otherwise as-is.
>>
>> On link down you'll need to alloc a new route for RCU-safety, but you can presumably just do a kmemdup to reduce the amount of code you have to write and sort the nexthops in the copy. Link up will be similar.
> You mean sort the nexthops on every link and carrier event? I don't see a need for it.
>>
>> Then on the mpls_dump_route, if the index of the nexthop is >= rt_nhn_alive then the path is link-down. If the nh_dev is NULL then generate RTNH_F_DEAD|RTNH_F_LINKDOWN for the flags, otherwise just RTNH_F_LINKDOWN.
> I was not thinking of making nh_dev NULL on RTNH_F_DEAD. And I would prefer to store the RTNH flags instead of deriving them on every dump.
>>
>> This would use less memory and be faster for forwarding.
> Thanks for your inputs Robert. I do not see a huge advantage in sorting the nexthops on link events.
> And I will only be saving an 'int' in a nexthop.

It avoids the extra 12 bytes per nexthop and it means that you don't need to walk through every nexthop in the worst case to select a path during forwarding.

Thanks,
Rob
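
For illustration only, here is a rough userspace sketch of the sorted-nexthop idea described above. It is not kernel code and not taken from any patch: struct route_sketch, struct nh_sketch, route_link_down() and select_nexthop() are hypothetical names, malloc()/memcpy() stand in for kmemdup(), an integer ifindex stands in for nh_dev, and the "hash % rt_nhn_alive" mapping is an assumption about how the selection would be done.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the MPLS route structures. */
struct nh_sketch {
	int ifindex;	/* stands in for nh_dev */
	int label;	/* stands in for the rest of the nexthop */
};

struct route_sketch {
	unsigned int rt_nhn;		/* total number of nexthops */
	unsigned int rt_nhn_alive;	/* nh[0 .. rt_nhn_alive-1] are usable */
	struct nh_sketch nh[];
};

static size_t route_size(unsigned int nhn)
{
	return sizeof(struct route_sketch) + nhn * sizeof(struct nh_sketch);
}

/*
 * Build a copy of 'old' in which every alive nexthop using 'ifindex'
 * has been moved behind the alive ones. The old route is left untouched
 * so that readers still holding it see a consistent view.
 */
struct route_sketch *route_link_down(const struct route_sketch *old,
				     int ifindex)
{
	size_t sz = route_size(old->rt_nhn);
	struct route_sketch *rt = malloc(sz);
	unsigned int i, alive = 0, pos;

	if (!rt)
		return NULL;
	memcpy(rt, old, sz);	/* userspace analogue of kmemdup() */

	/* First pass: nexthops that stay alive keep their relative order. */
	for (i = 0; i < old->rt_nhn_alive; i++)
		if (old->nh[i].ifindex != ifindex)
			rt->nh[alive++] = old->nh[i];

	/* Second pass: newly dead nexthops follow the alive ones. */
	pos = alive;
	for (i = 0; i < old->rt_nhn_alive; i++)
		if (old->nh[i].ifindex == ifindex)
			rt->nh[pos++] = old->nh[i];

	/* Entries at index >= old->rt_nhn_alive were already dead; the
	 * memcpy() above left them in place. */
	rt->rt_nhn_alive = alive;
	return rt;
}

/* O(1) selection: only the alive prefix of the array is considered. */
const struct nh_sketch *select_nexthop(const struct route_sketch *rt,
				       unsigned int hash)
{
	if (rt->rt_nhn_alive == 0)
		return NULL;
	return &rt->nh[hash % rt->rt_nhn_alive];
}
```

In the kernel the new copy would presumably be published with rcu_assign_pointer() and the old route freed after a grace period. Link up would be the mirror image: move the matching entries from the dead tail to the end of the alive prefix and increase rt_nhn_alive. A dump routine could then report link-down for any index >= rt_nhn_alive without a stored per-nexthop flag.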