public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Michael Kerrisk \(man-pages\)" <mtk.manpages@gmail.com>
Cc: Andrei Vagin <avagin@openvz.org>,
	Containers <containers@lists.linux-foundation.org>,
	Linux API <linux-api@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	"linux-fsdevel\@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	"W. Trevor King" <wking@tremily.us>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Serge E. Hallyn" <serge@hallyn.com>
Subject: Re: Documenting the ioctl interfaces to discover relationships between namespaces
Date: Tue, 13 Dec 2016 07:18:27 +1300
Message-ID: <87r35df1u4.fsf@xmission.com>
In-Reply-To: <6771af94-9847-0277-ec1d-62bc3649a17a@gmail.com> (Michael Kerrisk's message of "Mon, 12 Dec 2016 17:01:14 +0100")

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 12/11/2016 11:30 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
>> 
>>> [was: [PATCH 0/4 v3] Add an interface to discover relationships
>>> between namespaces]
>> 
>> One small comment below.
>> 
>>>
>>>    Introspecting namespace relationships
>>>        Since Linux 4.9, two ioctl(2) operations  are  provided  to  allow
>>>        introspection  of  namespace relationships (see user_namespaces(7)
>>>        and pid_namespaces(7)).  The form of the calls is:
>>>
>>>            ioctl(fd, request);
>>>
>>>        In each case, fd refers to a /proc/[pid]/ns/* file.
>>>
>>>        NS_GET_USERNS
>>>               Returns a file descriptor that refers to  the  owning  user
>>>               namespace for the namespace referred to by fd.
>>>
>>>        NS_GET_PARENT
>>>               Returns  a file descriptor that refers to the parent names‐
>>>               pace of the namespace referred to by fd.  This operation is
>>>               valid  only for hierarchical namespaces (i.e., PID and user
>>>               namespaces).  For user namespaces, NS_GET_PARENT is synony‐
>>>               mous with NS_GET_USERNS.
>>>
>>>        In each case, the returned file descriptor is opened with O_RDONLY
>>>        and O_CLOEXEC (close-on-exec).
>>>
>>>        By applying fstat(2) to the returned file descriptor, one  obtains
>>>        a  stat structure whose st_ino (inode number) field identifies the
>>>        owning/parent namespace.  This inode number can  be  matched  with
>>>        the  inode  number  of  another  /proc/[pid]/ns/{pid,user} file to
>>>        determine whether that is the owning/parent namespace.
>> 
>> Like all fstat inode comparisons, to be fully accurate you need to
>> compare both the st_ino and st_dev.  I reserve the right for st_dev to
>> be significant when comparing namespaces.  Otherwise I might have to
>> create a namespace of namespaces someday and that is ugly.
>> 
>>>        Either of these ioctl(2) operations can fail  with  the  following
>>>        error:
>>>
>>>        EPERM  The  requested  namespace is outside of the caller's names‐
>>>               pace scope.  This error can occur if, for example, the own‐
>>>               ing  user  namespace is an ancestor of the caller's current
>>>               user namespace.  It can also occur on  attempts  to  obtain
>>>               the parent of the initial user or PID namespace.
>>>
>>>        Additionally,  the  NS_GET_PARENT operation can fail with the fol‐
>>>        lowing error:
>>>
>>>        EINVAL fd refers to a nonhierarchical namespace.
>>>
>>>        See the EXAMPLE section for an example of the use of these  opera‐
>>>        tions.
>
> So, after playing with this a bit, I have a question. 
>
> I gather that in order to, for example, elaborate the tree of user
> namespaces on the system, one would use NS_GET_PARENT on each of
> the /proc/*/ns/user files and match up the results. Right?
>
> What happens if one of the parent user namespaces contains no
> processes? That is, the parent namespace exists by virtue of being
> pinned because a /proc/PID/ns/user file is open or bind mounted.
> (Chrome seems to do this sort of dance with user namespaces, for
> example.) How do we find the ancestor of *that* user namespace?

NS_GET_USERNS and NS_GET_PARENT each return a file descriptor, and you
can call NS_GET_PARENT on that descriptor in turn.  So you can keep
walking up to the ancestors even when a namespace contains no processes.
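
As an illustration of that walk, here is a minimal sketch in C.  It
assumes the constants come from <linux/nsfs.h> and treats EPERM as the
end of the walk (initial namespace, or a parent outside the caller's
scope, per the draft text quoted above).  The function names
walk_ancestors() and walk_ancestors_of_path() are hypothetical helpers
invented for this sketch, not part of any kernel or libc interface.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/nsfs.h>   /* NS_GET_PARENT */

/*
 * Hypothetical helper: starting from the namespace referred to by fd,
 * repeatedly apply NS_GET_PARENT to the descriptor returned by the
 * previous call.  Takes ownership of fd.  Returns the number of
 * namespaces visited, or -1 on an unexpected error.
 */
static int walk_ancestors(int fd)
{
    int visited = 0;

    for (;;) {
        struct stat sb;

        if (fstat(fd, &sb) == -1) {
            perror("fstat");
            close(fd);
            return -1;
        }
        printf("level %d: dev=%llu ino=%llu\n", visited,
               (unsigned long long)sb.st_dev,
               (unsigned long long)sb.st_ino);
        visited++;

        int parent = ioctl(fd, NS_GET_PARENT);
        int saved_errno = errno;   /* close() may overwrite errno */
        close(fd);

        if (parent == -1) {
            /* Assumption: EPERM marks the top of what we may see. */
            if (saved_errno == EPERM)
                return visited;
            /* e.g. EINVAL for a nonhierarchical namespace */
            fprintf(stderr, "NS_GET_PARENT: %s\n", strerror(saved_errno));
            return -1;
        }
        fd = parent;   /* no process needs to live in this namespace */
    }
}

/* Hypothetical convenience wrapper: open a /proc/[pid]/ns/* file and walk. */
static int walk_ancestors_of_path(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        perror(path);
        return -1;
    }
    return walk_ancestors(fd);
}
```

Calling walk_ancestors_of_path("/proc/self/ns/user") would print one
line per user namespace from the caller's own up to the last one it is
permitted to see.  Each step only needs the descriptor from the
previous step, which is why an empty, pinned namespace is not a dead
end.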

Eric
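
For the comparison point raised earlier in the thread (match on st_dev
as well as st_ino), a minimal sketch in C might look like the
following.  The names same_file_identity() and same_namespace() are
hypothetical, chosen only for this illustration; the sketch assumes
both descriptors were obtained from /proc/[pid]/ns/* files or from the
ioctl operations discussed above.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: two stat results identify the same object only
 * if both the device and the inode number match.
 */
static bool same_file_identity(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Hypothetical helper: compare the namespaces referred to by two file
 * descriptors.  Returns 1 if they match, 0 if they differ, and -1 if
 * fstat(2) fails on either descriptor.
 */
static int same_namespace(int fd_a, int fd_b)
{
    struct stat a, b;

    if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1)
        return -1;
    return same_file_identity(&a, &b) ? 1 : 0;
}
```

Comparing st_ino alone would happen to work as long as every
namespace file lives on one device, but checking both fields keeps the
test correct if st_dev ever becomes significant.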
