From: David Ahern <dsahern@gmail.com>
To: sowmini varadhan <sowmini05@gmail.com>,
Lorenzo Colitti <lorenzo@google.com>
Cc: netdev <netdev@vger.kernel.org>, JP Abgrall <jpa@google.com>,
David Miller <davem@davemloft.net>, Julian Anastasov <ja@ssi.bg>,
Hannes Frederic Sowa <hannes@stressinduktion.org>
Subject: Re: [PATCH 0/3] Make mark-based routing work better with multiple separate networks.
Date: Tue, 13 May 2014 11:12:03 -0600
Message-ID: <53725263.2080903@gmail.com>
In-Reply-To: <CACP96tTRQqvQM6UTuqtVonZJW3LaHkBZkkpH+zCZa=GjHAEy+Q@mail.gmail.com>

On 5/13/14, 4:49 AM, sowmini varadhan wrote:
> On Mon, May 12, 2014 at 6:53 PM, Lorenzo Colitti <lorenzo@google.com> wrote:
>> On Tue, May 13, 2014 at 6:09 AM, sowmini varadhan <sowmini05@gmail.com> wrote:
>>> http://lwn.net/Articles/407495/, a single
>>> process should be able to open sockets in different namespaces.
>>
>> Other things that you can't do with namespaces are have the same physical
>> interface (and the same IP address?) in two different namespaces, or
>> have the same listening socket in two different namespaces. Namespaces
>> are not a panacea.
>
> So this thread got unintentionally cut off by my not selecting Reply-All
> in the google gui.
>
> But to summarize a couple of private exchanges between Lorenzo and
> me, it still appears to me that the use-case here is what routers
> consider a "VRF". Thus it makes sense to add code (if/as needed)
> to fix the VRF support in linux, rather than adding yet-another-one-off
> feature with socket marking.
>
> Specifically addressing the two issues raised above:
> - yes, it is true that an interface can exist in only one netns at a time.
> But the same ip address can exist in multiple netns-es. If the
> app wants to listen to a proper-subset of networks that go in/out
> a single physical interface, you can use macvlan, and assign the
> macvlans to the desired netns.
> - "same listening socket for multiple namespaces". Clearly that problem
> also exists for the socket-marks approach. But again this can actually
> be solved (for both netns and sock-marks) by having the application
> set up separate sockets for each netns (netns or whatever) of interest,
> and build an epoll fd over that set of sockets. No need for any kernel
> code for this.

Using namespaces for VRFs has a number of problems:

1. It does not scale efficiently, e.g., to 1k VRFs.
   a. Namespaces have high memory consumption. It depends on the features
      enabled, but I see ~200kB/namespace. At 1024 namespaces that is
      roughly 200MB, a high memory hit.
   b. A service needs separate processes/threads/sockets per namespace
      to have a presence in each, i.e., the 'same listening socket for
      multiple namespaces' problem.

2. It complicates L2 apps, which should be VRF agnostic.

3. It requires root (CAP_SYS_ADMIN) to use setns. If you go the
   thread/socket per namespace route, all of those processes need the
   SYS_ADMIN capability, which is not the desired security posture.

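To make points 1 and 3 concrete, here is a rough sketch of what the
socket-per-namespace approach looks like from userspace. It is only an
illustration under some assumptions: the helper names are made up, kB is
read as 1000 bytes, the namespaces are reachable as files (for example
under /var/run/netns/), and os.setns() needs Python 3.12 or later plus
CAP_SYS_ADMIN in the calling process.

```python
import os
import select
import socket

KB = 1000  # assumption: "kB" read as 1000 bytes


def netns_memory_bytes(count, per_ns_kb=200):
    """Rough memory cost of `count` namespaces at ~200 kB each."""
    return count * per_ns_kb * KB


def listen_in_each_netns(netns_paths, port):
    """Open one TCP listener per namespace, all watched by one epoll fd.

    Hypothetical helper. Every setns() call below requires CAP_SYS_ADMIN,
    and setns() only switches the calling thread's namespace.
    """
    # Remember the namespace we started in so we can switch back.
    home = os.open("/proc/self/ns/net", os.O_RDONLY)
    ep = select.epoll()
    listeners = {}
    try:
        for path in netns_paths:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.setns(fd, os.CLONE_NEWNET)
            finally:
                os.close(fd)
            # A socket stays in the namespace it was created in.
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.bind(("0.0.0.0", port))
            s.listen()
            ep.register(s.fileno(), select.EPOLLIN)
            listeners[s.fileno()] = (path, s)
    finally:
        os.setns(home, os.CLONE_NEWNET)
        os.close(home)
    return ep, listeners
```

With the figures above, netns_memory_bytes(1024) comes to 204,800,000
bytes before a single socket is opened, and the service still ends up
holding one listener per namespace.
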
>
> Or you can optimize this by building infra in the kernel to support the
> Wildcard ALL_VRFS notion. Or add even more code to support something
> less than ALL_VRFS.
>
> My point is: what is the real networking construct that this use-case needs?
> Isn't it what routers describe as the VRF? If yes, then shouldn't
> we have one single way of supporting that in linux, instead of having
> a little-bit-here and a little-bit-there?

From a separation-of-resources perspective, why not have kernel-side
infrastructure that allows interfaces to be separated into namespaces
for isolation, and then within a namespace provide L3 abstractions that
allow separate routing tables, neighbor caches, etc., i.e., a VRF
abstraction within a network namespace? Apps could then have a listen
socket that works across the VRFs in a namespace, while connected
sockets are VRF based.
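
As a toy model of that layering (not a kernel API; every class and
method name here is invented for illustration), one namespace holds
several VRFs, each with its own routing table. A single wildcard
listener accepts from any VRF, and each accepted connection is pinned to
the VRF it arrived on:

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Vrf:
    """One routing table; prefixes may overlap with other VRFs."""
    name: str
    routes: dict = field(default_factory=dict)  # network -> next hop

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, addr):
        """Longest-prefix match within this VRF only."""
        ip = ipaddress.ip_address(addr)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


@dataclass
class Connection:
    """A connected socket: all lookups use the VRF it is pinned to."""
    vrf: Vrf
    peer: str

    def next_hop(self):
        return self.vrf.lookup(self.peer)


@dataclass
class Namespace:
    """One network namespace containing several VRFs."""
    vrfs: dict = field(default_factory=dict)

    def vrf(self, name):
        return self.vrfs.setdefault(name, Vrf(name))

    def accept(self, ingress_vrf, peer):
        """The single listener accepts from any VRF in the namespace."""
        return Connection(self.vrfs[ingress_vrf], peer)


ns = Namespace()
ns.vrf("red").add_route("10.0.0.0/8", "192.168.1.1")
ns.vrf("blue").add_route("10.0.0.0/8", "172.16.0.1")
red_conn = ns.accept("red", "10.1.2.3")
blue_conn = ns.accept("blue", "10.1.2.3")
```

The same peer address resolves through a different next hop depending on
the VRF the connection belongs to, with only one listener and one
namespace involved.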

Nested network namespaces (which do not seem to work with 3.4 and 3.10
kernels) would provide that layering but still suffer from the problems
mentioned above.

David