From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751795AbcGXFE1 (ORCPT <rfc822;w@1wt.eu>);
	Sun, 24 Jul 2016 01:04:27 -0400
Received: from out02.mta.xmission.com ([166.70.13.232]:44580 "EHLO
	out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750741AbcGXFEX (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Sun, 24 Jul 2016 01:04:23 -0400
From: ebiederm@xmission.com (Eric W. Biederman)
To: "W. Trevor King" <wking@tremily.us>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
        Andrey Vagin <avagin@openvz.org>,
        Serge Hallyn <serge.hallyn@canonical.com>, linux-api@vger.kernel.org,
        containers@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
        Alexander Viro <viro@zeniv.linux.org.uk>, criu@openvz.org,
        linux-fsdevel@vger.kernel.org,
        "Michael Kerrisk \(man-pages\)" <mtk.manpages@gmail.com>
References: <1468520419-28220-1-git-send-email-avagin@openvz.org>
	<20160723211414.GA25371@odin.tremily.us>
	<1469309936.2332.35.camel@HansenPartnership.com>
	<20160723215802.GO24913@odin.tremily.us>
	<87mvl8nhlv.fsf@x220.int.ebiederm.org>
	<20160723223448.GP24913@odin.tremily.us>
Date: Sat, 23 Jul 2016 23:51:07 -0500
In-Reply-To: <20160723223448.GP24913@odin.tremily.us> (W. Trevor King's
	message of "Sat, 23 Jul 2016 15:34:48 -0700")
Message-ID: <877fcboczo.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

"W. Trevor King" <wking@tremily.us> writes:

> On Sat, Jul 23, 2016 at 04:56:44PM -0500, Eric W. Biederman wrote:
>> "W. Trevor King" <wking@tremily.us> writes:
>> > On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
>> >> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
>> >> > namespaces(7) and clone(2) both have:
>> >> > 
>> >> >   When a network namespace is freed (i.e., when the last
>> >> >   process in the namespace terminates), its physical network
>> >> >   devices are moved back to the initial network namespace (not
>> >> >   to the parent of the process).
>> >> > 
>> >> > So the initial network namespace (the head of
>> >> > net_namespace_list?)  is special [1].  To understand how
>> >> > physical network devices will be handled, it seems like we want
>> >> > to treat network devices as a depth-1 tree, with all
>> >> > non-initial net namespaces as children of the initial net
>> >> > namespace.  Can we extend this series' NS_GET_PARENT to return:
>> >> > 
>> >> > * EPERM for an unprivileged caller (like this series currently
>> >> >   does for PID namespaces),
>> >> > * ENOENT when called on net_namespace_list, and
>> >> > * net_namespace_list when called on any other net namespace.
>> >> 
>> >> What's the practical application of this?  Independent net
>> >> namespaces are managed by the ip netns command.  It pins them by
>> >> a bind mount in a flat fashion; if we make them hierarchical the
>> >> tool would probably need updating to reflect this, so we're going
>> >> to need a reason to give the network people.  Just having the
>> >> interfaces not go back to root when you do an ip netns delete
>> >> doesn't seem very compelling.
>> >
>> > I'm not suggesting we add support for deeper nesting, I'm suggesting
>> > we use NS_GET_PARENT to allow sufficiently privileged users to
>> > determine if a given net namespace is the initial net namespace.  You
>> > could do this already with something like:
>> >
>> > 1. Create a new net namespace.
>> > 2. Add a physical network device to that namespace.
>> > 3. Delete that namespace.
>> > 4. See if the physical network device shows up in your
>> >    initial-net-namespace candidate.
>> > 5. Delete the physical network device (hopefully it ended up
>> >    somewhere you can find it ;).
>> >
>> > But using an NS_GET_PARENT call seems much safer and easier.
>> 
>> Have you had the problem in practice where you can't tell which
>> network namespace is the initial network namespace?  This all seems
>> like a theoretical problem rather than a real one.
>
> I haven't had any practical problems here; I'm just trying to wrap my
> head around namespace-relationship discovery.  The special physical
> network device handling seems a lot like init re-parenting (with no
> PR_SET_CHILD_SUBREAPER analog in a 1-deep namespace tree), so calling
> the initial network namespace a parent (and all the other namespaces
> its direct children) seems natural enough.  If that doesn't sound
> convincing, I'm happy to punt this idea until someone runs into a
> practical problem ;).

Then let's punt this until someone runs into a practical problem.

For scaling and for sanity it is desirable to keep the connections
between namespaces to a minimum.  Further, the initial instance of each
kind of namespace tends to be a little bit special.

Eric
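
For anyone who does run into the practical problem discussed above, a
minimal sketch of a check that needs no new interface: two
/proc/<pid>/ns/net entries name the same namespace when their device and
inode numbers match, as described in namespaces(7).  The helper names
below are hypothetical, and the default reference of PID 1 is only an
assumption: it holds on a plain host but not inside a container, where
PID 1 usually lives in a non-initial net namespace.

```python
import os


def same_namespace(path_a, path_b):
    """Return True if both paths refer to the same namespace object.

    Per namespaces(7), two /proc/<pid>/ns/<type> entries refer to the
    same namespace when their device ID and inode number both match.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def shares_netns_with(pid, reference_pid=1):
    """Compare the net namespace of pid with that of reference_pid.

    ASSUMPTION: reference_pid (default 1) runs in the initial net
    namespace. That is not guaranteed, e.g. inside a container.

    Returns True or False, or None when either /proc entry cannot be
    read (no such process, insufficient privilege, or no /proc).
    """
    try:
        return same_namespace(
            f"/proc/{pid}/ns/net", f"/proc/{reference_pid}/ns/net"
        )
    except OSError:
        return None


if __name__ == "__main__":
    # Reading /proc/1/ns/net normally requires privilege, so an
    # unprivileged run is expected to print None.
    print(shares_netns_with(os.getpid()))
```

This only tells you whether two processes share a net namespace; it says
nothing about a parent relationship, which net namespaces do not have.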