From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Hannes Frederic Sowa
	<hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org>,
	David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH net-next RFC v1 00/27] afnetns: new namespace type for separation on protocol level
Date: Mon, 13 Mar 2017 17:06:21 -0500
Message-ID: <874lywu90i.fsf@xmission.com>
In-Reply-To: <CAHO5Pa1s949dohzEEE68Ux=mXA7N7sR-U98Jwjvx1a_A5AhFEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> (Michael Kerrisk's message of "Mon, 13 Mar 2017 20:56:35 +0100")

Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> On Mon, Mar 13, 2017 at 12:44 AM, Hannes Frederic Sowa
> <hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org> wrote:
>> Hi,
>>
>> On Sun, 2017-03-12 at 16:26 -0700, David Miller wrote:
>>> From: Hannes Frederic Sowa <hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org>
>>> Date: Mon, 13 Mar 2017 00:01:24 +0100
>>>
>>> > afnetns behaves like ordinary namespaces: clone, unshare, setns syscalls
>>> > can work with afnetns with one limitation: one cannot cross the realm
>>> > of a network namespace while changing the afnetns compartment. To get
>>> > into a new afnetns in a different net namespace, one must first change
>>> > to the net namespace and afterwards switch to the desired afnetns.
>>>
>>> Please explain why this is useful, who wants this kind of facility,
>>> and how it will be used.
>>
>> Yes, I have to enhance the cover letter:
>>
>> The work behind all this is to provide more dense container hosting.
>> Right now we lose performance, because all packets need to be forwarded
>> through either a bridge or must be routed until they reach the
>> containers. For example, we can't make use of early demuxing for the
>> incoming packets. We basically pass the networking stack twice for
>> every packet.
>>
>> The usage is very much in line with how network namespaces are used
>> nowadays:
>>
>> ip afnetns add afns-1
>> ip address add 192.168.1.1/24 dev eth0 afnetns afns-1
>> ip afnetns exec afns-1 /usr/sbin/httpd
>>
>> this spawns a shell where all child processes will only have access to
>> the specific ip addresses, even though they do a wildcard bind. Source
>> address selection will also use only the ip addresses available to the
>> children.
>>
>> In some sense it has lots of characteristics like ipvlan, allowing a
>> single MAC address to host lots of IP addresses which will end up in
>> different namespaces. Unlike ipvlan, however, it will also solve the
>> problem around duplicate address detection and multiplexing packets to
>> the IGMP or MLD state machines.
>>
>> The resource consumption in comparison with ordinary namespaces will be
>> much lower. All in all, we will have far fewer networking subsystems to
>> cross compared to normal netns solutions.
>>
>> Some more information also in the first patch, which adds a
>> Documentation.

If the goal is one IP address per network namespace, with a network
device and MAC address on the network, I have something I was working
on that I believe is, in the end, a much simpler solution.

Add routes in the routing table between network namespaces.

That is, in the initial network namespace, which holds the network
device, have an input route that points not towards the local loopback
device but towards the other network namespace's loopback device.

Before other issues took precedence I made it halfway to implementing
that.  The IP input path won't get confused if the destination network
device is not in the same network namespace as the device.  Last I
looked, the IP output path still had a few places where confusion was
possible between the network socket and the output device.

As long as installing such routes is conditional upon having
CAP_NET_ADMIN in both network namespaces you should be fine, and things
should be very simple and very fast, because that won't take a special
case through the network stack.
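
To make the idea concrete, here is a toy userspace model of it in
Python.  It is only a sketch, not kernel code and not an existing
interface: NetNs, Task, add_cross_ns_input_route and deliver are all
hypothetical names made up for illustration, and the capability check
is reduced to a set lookup.  It shows an input route in one namespace
pointing at another namespace's loopback device, installable only with
CAP_NET_ADMIN in both namespaces.

```python
from dataclasses import dataclass, field

CAP_NET_ADMIN = "CAP_NET_ADMIN"  # stand-in for the real capability bit


@dataclass
class NetNs:
    """Toy network namespace: a name plus a table of input routes."""
    name: str
    # destination address -> (target namespace name, target device name)
    input_routes: dict = field(default_factory=dict)


@dataclass
class Task:
    """Toy caller: the (namespace name, capability) pairs it holds."""
    caps: set = field(default_factory=set)

    def capable(self, ns: NetNs, cap: str) -> bool:
        return (ns.name, cap) in self.caps


def add_cross_ns_input_route(task: Task, src: NetNs, dst: NetNs, addr: str) -> None:
    """Install an input route in src that delivers addr to dst's loopback.

    Assumed rule, taken from the text: the caller must hold
    CAP_NET_ADMIN in both namespaces, otherwise the request is refused.
    """
    for ns in (src, dst):
        if not task.capable(ns, CAP_NET_ADMIN):
            raise PermissionError(f"no {CAP_NET_ADMIN} in {ns.name}")
    src.input_routes[addr] = (dst.name, "lo")


def deliver(ns: NetNs, addr: str):
    """Return the (namespace, device) receiving a packet for addr, or None."""
    return ns.input_routes.get(addr)
```

With two namespaces "init" and "c1", a task holding CAP_NET_ADMIN in
both can install a route for 192.168.1.1, after which deliver() on
"init" resolves that address to ("c1", "lo") with a single table
lookup.  A task holding the capability in only one of the two gets
PermissionError.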

Given that performance is your primary motive, I suspect this will yield
the fastest possible path through the network stack, as no extra steps
need to be taken and it can benefit from any routing improvements to the
ordinary network stack.

Eric
