Netdev List

* Re: [RFC PATCH net-next 0/5] Ease netns management for userland
From: Nicolas Dichtel @ 2012-12-12 20:54 UTC
  To: Eric W. Biederman; +Cc: netdev, davem, aatteka
In-Reply-To: <87fw3boyxn.fsf@xmission.com>

On 12/12/2012 20:25, Eric W. Biederman wrote:
> Nicolas Dichtel <nicolas.dichtel@6wind.com> writes:
>
>> The goal of this series is to ease netns management by daemons. Some systems use
>> netns only to virtualize the network stack and don't want to multiply userland
>> daemons.  These systems may have a lot of netns, up to 2000. We don't want to
>> launch an instance of each daemon (quagga, strongswan, conntrackd, ...) for
>> each netns because it would consume a lot of resources. Having one daemon that
>> manages all netns is more efficient (mainly if there are few objects to manage:
>> one or two routes per netns for example).
>> Hence, one goal of this series is to let a daemon monitor netns activity, so
>> that it can open or close netlink sockets, allocating the structures needed
>> to manage these netns, when they are created or deleted.
>> To help identify a netns, an index has been added to each netns.
>>
>> A new setsockopt() option is also added, to help daemons open sockets in the
>> right netns. For now, a daemon that wants to open a socket in a specified netns
>> needs to call setns(CLONE_NEWNET) with an fd (not so easy to find), open the
>> socket and then call setns() again to go back to the initial netns. Having this
>> kind of setsockopt() will simplify operations. Obviously, this setsockopt()
>> should be done early enough (is a test on sk_state enough?). The first target is
>> netlink sockets, but it can be useful for other kinds of sockets, which is why
>> I add a generic socket option.
>>
>> As usual, the patch against iproute2 will be sent once the patches are included
>> and net-next merged. I can send it on demand.
>
> Short answer: you don't need to do any of this.
>
> setns with the namespace files in /proc/<pid>/ns/net gives you more than
> enough mechanism to solve this problem.  And iproute2 already supports
> all of this.
>
> And your approach creates very serious maintenance problems, to the
> point I don't even want to read your patches.  What namespace do your
> namespace ids live in?
>
> A socket option to change the namespace of a socket is nasty because
> sockets changing which network namespace they are in leads to races
> which aren't worth thinking about writing the code to handle.
>
> Longer answer.
>
> You can bind mount the /proc/<pid>/ns/net namespace files to
> give them any name you want.  This puts naming policy in userspace
> control, and nests just fine.
>
> You can open a socket in any network namespace you want just
> by calling setns before socket.  Wrapping this idiom in a library call
> or, if there is sufficient need, in a socketat system call seems
> reasonable.
Yes, I agree that this SO_NETNS may be a bad idea.
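
For reference, the setns-before-socket idiom quoted above could be wrapped in
a small helper roughly as follows. This is only a sketch under assumptions:
the helper name socket_in_netns() is hypothetical, setns() is reached through
ctypes on glibc, the process is assumed to be single-threaded, and the caller
needs enough privilege (typically CAP_SYS_ADMIN) for setns() to succeed.

```python
import ctypes
import os
import socket

CLONE_NEWNET = 0x40000000  # value from <linux/sched.h>

# Load the C library already linked into the interpreter to reach setns(2).
_libc = ctypes.CDLL(None, use_errno=True)


def _setns(fd: int) -> None:
    """Move the calling thread into the network namespace referred to by fd."""
    if _libc.setns(fd, CLONE_NEWNET) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def socket_in_netns(ns_path, family=socket.AF_INET,
                    type_=socket.SOCK_DGRAM, proto=0):
    """Hypothetical helper: open a socket inside the netns named by ns_path.

    ns_path is a namespace file such as /proc/<pid>/ns/net or a bind mount
    of one. The socket keeps the namespace it was created in, even after
    the caller has switched back.
    """
    # Keep a handle on the current namespace so we can return to it.
    # Assumption: single-threaded caller, so /proc/self is the right task.
    home_fd = os.open("/proc/self/ns/net", os.O_RDONLY)
    try:
        target_fd = os.open(ns_path, os.O_RDONLY)
        try:
            _setns(target_fd)
        finally:
            os.close(target_fd)
        try:
            return socket.socket(family, type_, proto)
        finally:
            # Always go back to the initial namespace.
            _setns(home_fd)
    finally:
        os.close(home_fd)
```

A daemon would call this once per namespace it manages, for example with
socket.AF_NETLINK and socket.SOCK_RAW to get a netlink socket in that netns.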

>
> There is a classic question of whether two network namespace files refer
> to the same network namespace, and I have code in linux-next and in my pull
> request to Linus to give those files a unique inode number.
Interesting to know.
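
With a unique inode number per namespace file, the "same namespace?" check
could presumably be done from userspace with a plain stat(). A minimal
sketch, assuming that behaviour; the function name same_namespace() is
hypothetical, and comparing the device number as well is my own precaution,
not something stated in the quoted mail:

```python
import os


def same_namespace(path_a, path_b):
    """Return True if both paths resolve to the same (device, inode) pair.

    Assumption: each network namespace file exposes an inode number that is
    unique to the namespace, so equal pairs mean the same namespace.
    """
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
```

The two arguments would typically be a /proc/<pid>/ns/net file and a bind
mount of a namespace file under whatever name userspace chose for it.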

>
> So please use the facilities already merged into the kernel.
Ok, but how can a daemon get the list of netns? Suppose that we want
quagga to manage all netns: how can it get this list in order to open the
needed netlink sockets?
For example, iproute2 is only aware of netns created with iproute2; it
will not detect other netns.
