From: Wilco Baan Hofman <wilco@baanhofman.nl>
To: nicolas.dichtel@6wind.com
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: ECMP ipv6 vs ipv4
Date: Wed, 17 Apr 2013 17:22:35 +0200
Message-ID: <1366212155.31353.111.camel@localhost>
In-Reply-To: <516EAE3A.8000201@6wind.com>

On Wed, 2013-04-17 at 16:14 +0200, Nicolas Dichtel wrote:
> Le 17/04/2013 15:16, Wilco Baan Hofman a écrit :
> > On Wed, 2013-04-17 at 11:03 +0200, Nicolas Dichtel wrote:
> >
> >>>>>
> >>>>> I propose that we have a nexthop structure to an exclusive route,
> >>>>> similar to what we have for IPv4, where we store the gateway, device and
> >>>>> weight for all nexthops and the algorithm in the route. This would make
> >>>>> the netlink API symmetrical again and fix the n*n inefficiencies when
> >>>>> adding routes (all siblings need to know about all siblings).
> >>>>>
> >>>>> What are your thoughts on this?
> >> The pro of the current implementation is that you can add or delete a nexthop
> >> without removing the whole route. You don't need to list all nexthops again
> >> each time you want to modify one.
> >
> > That would also be possible using ip -6 route change. It would be more
> > efficient for insertions and more consistent with the IPv4
> > implementation. Remember that most code is in fact shared between IPv4
> > and IPv6 implementations for routing protocol suites.
> >
> > For bird it would be much more convenient to have the same API work for
> > both as the code is shared (with minor differences).
> >
> > The memory structure like below would make sense and you can expand it
> > as well:
> >
> > struct ip6_nexthop {
> > 	int               flags; /* algorithm per packet or hash, etc */
> > 	struct list_head  *hops; /* nh_via */
> > };
> > struct ip6_nh {
> > 	int              ifindex;
> > 	struct in6_addr  rt6i_gateway;
> > 	char             weight;
> > 	int              flags; /* pervasive, onlink */
> > };
> >
> > I'm not sure how to make this map correctly to the append API. I think
> > we need to make sure that all APIs are either consistent and symmetrical
> > or don't work from day 1.
> Maybe the error was to propose two APIs to insert ECMPv6 routes, but as soon as
> there are two APIs, one will not be symmetric with what is returned by the kernel ;-)

Yeah, I'm not a fan, especially when it doesn't map 1:1 with what's
going on.


> >
> > I am willing to implement this, including algorithm support using the
> > netlink nexthop API, like the IPv4 implementation, or change the IPv4
> > implementation, but either way I feel they need to be consistent.
> I'm not sure that this is a major argument. There are already differences between
> IPv4 and IPv6 (for example, IPv4 addresses are kept when an interface is down,
> but not IPv6 addresses; netlink messages are sent when routes are removed after
> putting down an interface in IPv6 but not in IPv4). But I'll let others speak about
> this.

I would prefer to have fewer differences between IPv4 and IPv6 handling
instead of more, unless the RFCs demand different behaviour.

> What is important is to avoid breaking existing API.
> 

I sort of agree, but quagga support is on hold until this is resolved,
and bird does not support it properly until we resolve this. The latter
I intend to fix myself and I am in contact with Quagga developers.
Static via iproute is a slightly different story though.


If no one else comments, I'll start writing a patch to support the
netlink nexthop API with weights and per-packet and weighted hash
algorithms on an exclusive route. I'll also see if I can support ip
route append if nexthop is specified to add a nexthop to the list, but
that will be a separate patch and it may not map well.
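
To make the weighted hash idea concrete, here is a rough user-space
sketch, not kernel code and not the patch itself. The names
ip6_nh_sketch, ip6_route_sketch and select_nexthop are hypothetical,
made up for illustration; the fields mirror the structs quoted above.
It assumes the flow hash is computed elsewhere and simply maps it onto
the cumulative weight range of the nexthop list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>

/* Hypothetical model of one nexthop (mirrors the quoted struct ip6_nh). */
struct ip6_nh_sketch {
	int             ifindex;
	struct in6_addr gateway;
	unsigned char   weight;  /* unsigned here so weights cannot go negative */
	int             flags;   /* pervasive, onlink */
};

/* Hypothetical model of the route owning all of its nexthops. */
struct ip6_route_sketch {
	int                   algo;     /* per packet, hash, etc. (unused here) */
	size_t                nh_count;
	struct ip6_nh_sketch *hops;
};

/*
 * Weighted hash selection: reduce the flow hash modulo the sum of all
 * weights, then walk the list subtracting each weight until the slot
 * falls inside a nexthop's share. Returns NULL if all weights are zero.
 */
static const struct ip6_nh_sketch *
select_nexthop(const struct ip6_route_sketch *rt, uint32_t flow_hash)
{
	unsigned int total = 0;
	unsigned int slot;

	assert(rt != NULL);

	for (size_t i = 0; i < rt->nh_count; i++)
		total += rt->hops[i].weight;
	if (total == 0)
		return NULL;

	slot = flow_hash % total;
	for (size_t i = 0; i < rt->nh_count; i++) {
		if (slot < rt->hops[i].weight)
			return &rt->hops[i];
		slot -= rt->hops[i].weight;
	}
	return NULL; /* not reached when total > 0 */
}
```

With two nexthops of weight 1 and 3, one out of every four hash values
lands on the first nexthop and the rest on the second. Because the
route holds the whole list, adding a nexthop touches one structure
instead of every sibling.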

I would like to hear some more thoughts on this though.


Wilco Baan Hofman

