From: Thomas Graf <tgraf@suug.ch>
To: Tom Herbert <tom@herbertland.com>
Cc: Jiri Benc <jbenc@redhat.com>, David Miller <davem@davemloft.net>,
Linux Kernel Network Developers <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next] route: fix breakage after moving lwtunnel state
Date: Fri, 28 Aug 2015 12:20:42 +0200
Message-ID: <20150828102042.GF32206@pox.localdomain>
In-Reply-To: <CALx6S35+g1WG9AERJ5p9Fx41fXvBAdSf9YDjYW0wgAjtgdq2XQ@mail.gmail.com>

On 08/27/15 at 02:20pm, Tom Herbert wrote:
> I'm doing:
>
> ip route add 3333:0:0:1:5555:0:2:0/128 encap ila 2001:0:0:2 via
> 2401:db00:20:911a:face:0:27:0
>
> so that 2401:db00:20:911a:face:0:27:0 is the next hop route for
> destination 2001:0:0:2:5555:0:2:0. The dst_output for lwt just calls
> the original dst_output after transforming the packet without the use
> of any additional routes. So in this way ILA LWT is just acting as a
> "pass-through" packet transformation mechanism. Such a model might
> have additional utility: LWT occurs before iptables so that iptables
> sees the translated or encapsulated packet (davem mentioned this is
> probably what we want), we may want to defer translation until IP
> fragmentation (Roopa mentioned she needs this for MPLS).
>
> > The IP metadata encap at FIB level is currently encap agnostic
> > and requires an intermediate encap device which then defines the
> > actual encap protocol:
> >
> > ip route overlay/prefix encap ip dst 10.1.1.1 dev vxlan0
> > ip route 10.1.1.1/prefix dev eth0
> >
> But then you're outputting through another device, multiple routes are
> involved, performance drops :-( Why not just set the route through
> VXLAN in that case?

The problem with having a single route is that it doesn't allow
management of the overlay and the underlay to be separated. It is
common to manage the underlay with Quagga, BIRD, or even static routes
and defer the overlay to Neutron or a fancy container orchestration
system.

Caching the 10.1.1.1 nexthop route in the overlay route would
essentially lead to the same behaviour without having to hardcode
the nexthop. I should have patches to demonstrate this in a bit.

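To make the caching idea concrete, here is a rough userspace sketch in
Python. It is not kernel code and not the patches mentioned above. The
Fib and OverlayRoute classes, the generation counter used for
invalidation, and the 192.168.0.0/16 and 10.1.1.0/24 prefixes are all
hypothetical and only serve the illustration:

```python
import ipaddress


class Fib:
    """Toy longest-prefix-match table (hypothetical, not the kernel FIB)."""

    def __init__(self):
        self.routes = {}  # ip_network -> route value
        self.gen = 0      # bumped on every change so caches can be invalidated

    def add(self, prefix, value):
        self.routes[ipaddress.ip_network(prefix)] = value
        self.gen += 1

    def lookup(self, addr):
        addr = ipaddress.ip_address(addr)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


class OverlayRoute:
    """Overlay route that names only the encap dst, never the underlay device."""

    def __init__(self, encap_dst):
        self.encap_dst = encap_dst
        self._cached = None
        self._cached_gen = -1

    def underlay_dev(self, underlay):
        # Resolve the encap dst in the underlay table only when the
        # underlay has changed since the last lookup; otherwise reuse
        # the cached result.
        if self._cached_gen != underlay.gen:
            self._cached = underlay.lookup(self.encap_dst)
            self._cached_gen = underlay.gen
        return self._cached


# Underlay, managed independently (e.g. by a routing daemon).
underlay = Fib()
underlay.add("10.1.1.0/24", "eth0")

# Overlay, managed by the orchestration system; no underlay device named.
overlay = Fib()
overlay.add("192.168.0.0/16", OverlayRoute("10.1.1.1"))

route = overlay.lookup("192.168.5.7")
dev = route.underlay_dev(underlay)
```

The overlay entry never names eth0, so the underlay can change without
the overlay being touched, and the second lookup is only paid after an
underlay change.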
> > I like it because we don't have to embed all the options as metadata
> > and can still set them through the device. An option would also be
> > to allow for both and add the following alternative:
> >
> > ip route overlay/prefix encap ip type vxlan dst 10.1.1.1 dev eth0
>
> Better, we should be able to send encapsulated packets without needing a device.
Why is the device itself bad? I understand that we want to minimize
overhead but why is a single logical device to keep common config and
stats undesirable?