Re: [PATCH 0/3] Make mark-based routing work better with multiple separate networks.


From: David Ahern <dsahern@gmail.com>
To: sowmini varadhan <sowmini05@gmail.com>,
	Lorenzo Colitti <lorenzo@google.com>
Cc: netdev <netdev@vger.kernel.org>, JP Abgrall <jpa@google.com>,
	David Miller <davem@davemloft.net>, Julian Anastasov <ja@ssi.bg>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>
Subject: Re: [PATCH 0/3] Make mark-based routing work better with multiple separate networks.
Date: Tue, 13 May 2014 11:12:03 -0600
Message-ID: <53725263.2080903@gmail.com>
In-Reply-To: <CACP96tTRQqvQM6UTuqtVonZJW3LaHkBZkkpH+zCZa=GjHAEy+Q@mail.gmail.com>

On 5/13/14, 4:49 AM, sowmini varadhan wrote:
> On Mon, May 12, 2014 at 6:53 PM, Lorenzo Colitti <lorenzo@google.com> wrote:
>> On Tue, May 13, 2014 at 6:09 AM, sowmini varadhan <sowmini05@gmail.com> wrote:
>>> http://lwn.net/Articles/407495/, a single
>>> process should be able to open sockets in different namespaces.
>>
>> Other things that you can't do with namespaces are to have the same
>> physical interface (and the same IP address?) in two different
>> namespaces, or to have the same listening socket in two different
>> namespaces. Namespaces are not a panacea.
>
> So this thread got unintentionally cut off by my not selecting Reply-All
> in the google gui.
>
> But to summarize a couple of private exchanges between Lorenzo and
> me, it still appears to me that the use-case here is what routers
> consider a "VRF". Thus it makes sense to add code (if/as needed)
> to fix the VRF support in linux, rather than adding yet-another-one-off
> feature with socket marking.
>
> Specifically addressing the two issues raised above:
> - yes, it is true that an interface can exist in only one netns at a time.
>    But the same ip address can exist in multiple netns-es. If the
>    app wants to listen to a proper-subset of networks that go in/out
>    a single physical interface, you can use macvlan, and assign the
>    macvlans to the desired netns.
> - "same listening socket for multiple namespaces". Clearly that problem
>    also exists for the socket-marks approach. But again this can actually
>    be solved (for both netns and sock-marks) by having the application
>    set up separate sockets for each netns (netns or whatever) of interest,
>    and build an epoll fd over that set of sockets. No need for any kernel
>    code for this.

Using namespaces for VRFs has a number of problems:

1. It does not scale efficiently, e.g., to 1k VRFs.
    a. Namespaces have high memory consumption. It depends on the
       features enabled, but I see ~200kB/namespace. At 1024 namespaces
       that is roughly 200MB.

    b. A service needs separate processes/threads/sockets per namespace
       to have a presence in each, i.e., the 'same listening socket for
       multiple namespaces' problem.

2. It complicates L2 apps, which should be VRF agnostic.

3. It requires root (CAP_SYS_ADMIN) to use setns. If you go the
   thread/socket per namespace route, all of those processes need the
   SYS_ADMIN capability, which is not the desired security posture.
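
For reference, the per-namespace socket pattern that items 1b and 3 refer to can be sketched in user space roughly as follows. This is a minimal sketch, not code from the patch series: the helper names open_listener and ready_listeners are hypothetical, and the namespace branch assumes Python 3.12+ (for os.setns), a single-threaded process, and CAP_SYS_ADMIN.

```python
import os
import select
import socket


def _listen(host, port):
    """Create a non-blocking TCP listener in the current network namespace."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    sock.setblocking(False)
    return sock


def open_listener(netns_path=None, host="0.0.0.0", port=0):
    """Open a listener, optionally inside the namespace at netns_path.

    A socket stays in the namespace it was created in, so the caller
    switches back to its original namespace afterwards.
    """
    if netns_path is None:
        return _listen(host, port)
    # Assumption: os.setns needs Python 3.12+ and CAP_SYS_ADMIN.
    original = os.open("/proc/self/ns/net", os.O_RDONLY)
    target = os.open(netns_path, os.O_RDONLY)
    try:
        os.setns(target, os.CLONE_NEWNET)
        try:
            return _listen(host, port)
        finally:
            os.setns(original, os.CLONE_NEWNET)
    finally:
        os.close(target)
        os.close(original)


def ready_listeners(listeners, timeout=1.0):
    """Return the sorted labels of listeners with a pending connection.

    listeners maps a label (e.g. a namespace name) to a listening socket.
    """
    ep = select.epoll()
    try:
        label_by_fd = {}
        for label, sock in listeners.items():
            ep.register(sock.fileno(), select.EPOLLIN)
            label_by_fd[sock.fileno()] = label
        return sorted(label_by_fd[fd] for fd, _ in ep.poll(timeout))
    finally:
        ep.close()
```

Every namespace costs one extra socket plus a privileged setns call, which is the overhead described in the list above.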

>
>    Or you can optimize this by building infra in the kernel to support the
>    Wildcard ALL_VRFS notion. Or add even more code to support something
>    less than ALL_VRFS.
>
> My point is: what is the real networking construct that this use-case needs?
> Isn't it what routers describe as the VRF? If yes, then shouldn't
> we have one single way of supporting that in linux, instead of having
> a little-bit-here and a little-bit-there?

From a separation of resources perspective, why not have kernel-side
infrastructure that allows interfaces to be separated into namespaces
for isolation, and then within a namespace provide L3 abstractions that
allow separate routing tables, neighbor caches, etc. -- i.e., a VRF
abstraction within a network namespace? That would allow apps to have a
listen socket that works across the VRFs in a namespace; connected
sockets are VRF based.
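
To make the proposed layering concrete, here is a toy user-space model of it. It is purely illustrative and not kernel code or an existing API: the classes Vrf, Connection and Namespace, the interface names and the addresses are all hypothetical. It models one namespace holding several VRFs, a wildcard accept that works on any interface, and connections that route only through the VRF they arrived on.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Vrf:
    """One L3 domain with its own routing table (neighbor cache omitted)."""
    name: str
    routes: dict = field(default_factory=dict)  # ip_network -> next hop

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, dst):
        """Longest-prefix match within this VRF only; None if no route."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


@dataclass
class Connection:
    """A connected socket, tied to the VRF it was accepted in."""
    peer: str
    vrf: Vrf

    def next_hop(self):
        return self.vrf.lookup(self.peer)


@dataclass
class Namespace:
    """A network namespace containing several VRFs."""
    vrfs: dict = field(default_factory=dict)           # VRF name -> Vrf
    interface_vrf: dict = field(default_factory=dict)  # ifname -> VRF name

    def add_vrf(self, name, interfaces):
        self.vrfs[name] = Vrf(name)
        for ifname in interfaces:
            self.interface_vrf[ifname] = name
        return self.vrfs[name]

    def accept(self, ifname, peer):
        """Wildcard listener: accept on any interface, bind to its VRF."""
        return Connection(peer, self.vrfs[self.interface_vrf[ifname]])


# Two VRFs with overlapping address space but different next hops.
ns = Namespace()
ns.add_vrf("red", ["eth0"]).add_route("10.0.0.0/8", "192.0.2.1")
ns.add_vrf("blue", ["eth1"]).add_route("10.0.0.0/8", "198.51.100.1")
red_conn = ns.accept("eth0", "10.1.2.3")
blue_conn = ns.accept("eth1", "10.1.2.3")
```

The same peer address resolves to a different next hop depending on the ingress interface, while the application only ever deals with one listener.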

Nested network namespaces (which do not seem to work with 3.4 and 3.10
kernels) would provide that layering but would still suffer from the
problems mentioned above.

David


Thread overview: 14+ messages
2014-05-09 17:36 [PATCH 0/3] Make mark-based routing work better with multiple separate networks Lorenzo Colitti
2014-05-09 17:36 ` [PATCH 1/3] net: add a sysctl to reflect the fwmark on replies Lorenzo Colitti
2014-05-09 17:37 ` [PATCH 2/3] net: Use fwmark reflection in PMTU discovery Lorenzo Colitti
2014-05-09 17:37 ` [PATCH 3/3] net: support marking accepting TCP sockets Lorenzo Colitti
2014-05-09 18:05   ` Eric Dumazet
2014-05-12 12:21 ` [PATCH 0/3] Make mark-based routing work better with multiple separate networks sowmini varadhan
2014-05-12 19:58   ` Lorenzo Colitti
2014-05-12 21:09     ` sowmini varadhan
2014-05-12 22:53       ` Lorenzo Colitti
2014-05-13 10:49         ` sowmini varadhan
2014-05-13 15:28           ` Lorenzo Colitti
2014-05-13 15:38             ` sowmini varadhan
2014-05-13 16:09               ` Ben Greear
2014-05-13 17:12           ` David Ahern [this message]
