From: Thomas Graf <tgraf@suug.ch>
To: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Cc: hannes@stressinduktion.org, nicolas.dichtel@6wind.com,
dsahern@gmail.com, ebiederm@xmission.com, hadi@mojatatu.com,
davem@davemloft.net, stephen@networkplumber.org,
netdev@vger.kernel.org, roopa@cumulusnetworks.com,
gospo@cumulusnetworks.com, jtoppins@cumulusnetworks.com,
nikolay@cumulusnetworks.com
Subject: Re: [RFC net-next 0/3] Proposal for VRF-lite
Date: Tue, 9 Jun 2015 12:15:50 +0200
Message-ID: <20150609101550.GA10411@pox.localdomain>
In-Reply-To: <cover.1433561681.git.shm@cumulusnetworks.com>
On 06/08/15 at 11:35am, Shrijeet Mukherjee wrote:
[...]
> model with some performance paths that need optimization. (Specifically
> the output route selector that Roopa, Robert, Thomas and EricB are
> currently discussing on the MPLS thread)
Thanks for posting these patches just in time. This explains how
you intend to deploy Roopa's patches in a scalable manner.
> High Level points
>
> 1. Simple overlay driver (minimal changes to current stack)
> * uses the existing fib tables and fib rules infrastructure
> 2. Modelled closely after the ipvlan driver
> 3. Uses current API and infrastructure.
> * Applications can use SO_BINDTODEVICE or cmsg device identifiers
> to pick VRF (ping, traceroute just work)
I like the aspect of reusing existing user interfaces. We might
need to introduce a more fine grained capability than CAP_NET_RAW
to give containers the privileges to bind to a VRF without
allowing them to inject raw frames.
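
For reference, a minimal sketch of the SO_BINDTODEVICE path discussed
here, written in Python for brevity. The helper name bind_to_vrf is
made up for illustration, and "vrf0" is simply the device name used
in the quoted example; nothing here is taken from the patches.

```python
import socket

# Linux value of SO_BINDTODEVICE; used as a fallback if the constant is not exposed.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def bind_to_vrf(sock, vrf_name):
    """Bind sock to the named device (hypothetical helper, for illustration)."""
    # The kernel expects the interface name as a NUL-terminated byte string.
    sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, vrf_name.encode() + b"\0")
    return sock
```

Calling bind_to_vrf(sock, "vrf0") on a regular UDP or TCP socket
raises OSError if the caller lacks the required privilege or the
device does not exist, which is exactly the privilege question
raised above.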
If I understand this correctly: if my intent were to run a
process in multiple VRFs, I would need to run that process in
the host network namespace, which contains both the VRF devices
and the physical devices. While I might want to grant my process
the ability to bind to VRFs, I may not want to give it the
privileges to bind to any device. So we could consider
introducing CAP_NET_VRF, which would allow a process to bind to
VRF devices.
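
To make the intended policy concrete, here is a rough model of the
check I have in mind. This is only a sketch: CAP_NET_VRF does not
exist, and the Device type and may_bind function are hypothetical
names used purely to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    is_vrf: bool


def may_bind(dev, caps):
    """Return True if a process holding the capability set caps may bind to dev."""
    if "CAP_NET_RAW" in caps:
        # Existing behaviour: CAP_NET_RAW allows binding to any device.
        return True
    # Hypothetical: CAP_NET_VRF allows binding to VRF devices only.
    return dev.is_vrf and "CAP_NET_VRF" in caps
```

With this rule, a container holding only CAP_NET_VRF could bind to
vrf0 but not to swp1.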
> * Standard IP Rules work, and since they are aggregated against the
> device, scale is manageable
> 4. Completely orthogonal to Namespaces and only provides separation in
> the routing plane (and ARP)
> 5. Debugging is built-in as tcpdump and counters on the VRF device
> works as is.
>
>                                                  N2
>    N1 (all configs here)                 +---------------+
>    +--------------+                      |               |
>    |swp1 :10.0.1.1+----------------------+swp1 :10.0.1.2 |
>    |              |                      |               |
>    |swp2 :10.0.2.1+----------------------+swp2 :10.0.2.2 |
>    |              |                      +---------------+
>    |    VRF 0     |
>    |   table 5    |
>    |              |
>    +--------------+
>    |              |
>    |    VRF 1     |                             N3
>    |   table 6    |                      +---------------+
>    |              |                      |               |
>    |swp3 :10.0.2.1+----------------------+swp1 :10.0.2.2 |
>    |              |                      |               |
>    |swp4 :10.0.3.1+----------------------+swp2 :10.0.3.2 |
>    +--------------+                      +---------------+
Do I understand correctly that swp* represent veth pairs?
Why do you have distinct addresses on each peer of the pair?
Are the addresses in N2 and N3 considered private and NATed?
[...]
> # Install the lookup rules that map table to VRF domain
> ip rule add pref 200 oif vrf0 lookup 5
> ip rule add pref 200 iif vrf0 lookup 5
> ip rule add pref 200 oif vrf1 lookup 6
> ip rule add pref 200 iif vrf1 lookup 6
I think this is a good start, but we all know the scalability
constraints of a rule list like this. Depending on the number of
L3 domains, an eBPF classifier utilizing a map to translate origin
to routing table and vice versa might address the scale
requirement long term.
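
As a rough userspace model of the difference (not eBPF code, and
not a statement about how such a classifier would be hooked in):
the rules above are matched by walking a list, while a map gives a
direct lookup in both directions. The function names are made up
for this sketch; the rule values are the ones from the quoted
commands.

```python
# Rules as installed in the quoted example: (pref, selector, device, table).
RULES = [
    (200, "oif", "vrf0", 5),
    (200, "iif", "vrf0", 5),
    (200, "oif", "vrf1", 6),
    (200, "iif", "vrf1", 6),
]


def table_from_rules(rules, selector, device):
    """Linear scan in preference order; cost grows with the number of rules."""
    for _pref, sel, dev, table in sorted(rules, key=lambda r: r[0]):
        if sel == selector and dev == device:
            return table
    return None


def build_maps(rules):
    """Build device -> table and table -> device maps for direct lookup."""
    dev_to_table = {dev: table for _pref, _sel, dev, table in rules}
    table_to_dev = {table: dev for dev, table in dev_to_table.items()}
    return dev_to_table, table_to_dev
```

Every additional L3 domain adds two more entries to the scanned
list, whereas the map lookup cost stays flat.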
[...]
I will comment on the implementation specifics once I have a
good understanding of what your desired end state looks like.