From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Michael Kerrisk \(man-pages\)"
Cc: Andrei Vagin, Containers, Linux API, lkml, "linux-fsdevel\@vger.kernel.org", James Bottomley, "W. Trevor King", Alexander Viro, "Serge E.
 Hallyn"
References: <87poky5ca9.fsf@xmission.com> <6771af94-9847-0277-ec1d-62bc3649a17a@gmail.com>
Date: Tue, 13 Dec 2016 07:18:27 +1300
In-Reply-To: <6771af94-9847-0277-ec1d-62bc3649a17a@gmail.com> (Michael Kerrisk's message of "Mon, 12 Dec 2016 17:01:14 +0100")
Message-ID: <87r35df1u4.fsf@xmission.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
Subject: Re: Documenting the ioctl interfaces to discover relationships between namespaces
Sender: linux-kernel-owner@vger.kernel.org
List-ID:

"Michael Kerrisk (man-pages)" writes:

> On 12/11/2016 11:30 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)" writes:
>>
>>> [was: [PATCH 0/4 v3] Add an interface to discover relationships
>>> between namespaces]
>>
>> One small comment below.
>>
>>>
>>>    Introspecting namespace relationships
>>>        Since Linux 4.9, two ioctl(2) operations are provided to allow
>>>        introspection of namespace relationships (see user_namespaces(7)
>>>        and pid_namespaces(7)).  The form of the calls is:
>>>
>>>            ioctl(fd, request);
>>>
>>>        In each case, fd refers to a /proc/[pid]/ns/* file.
>>>
>>>        NS_GET_USERNS
>>>               Returns a file descriptor that refers to the owning user
>>>               namespace for the namespace referred to by fd.
>>>
>>>        NS_GET_PARENT
>>>               Returns a file descriptor that refers to the parent
>>>               namespace of the namespace referred to by fd.  This
>>>               operation is valid only for hierarchical namespaces
>>>               (i.e., PID and user namespaces).  For user namespaces,
>>>               NS_GET_PARENT is synonymous with NS_GET_USERNS.
>>>
>>>        In each case, the returned file descriptor is opened with O_RDONLY
>>>        and O_CLOEXEC (close-on-exec).
>>>
>>>        By applying fstat(2) to the returned file descriptor, one obtains
>>>        a stat structure whose st_ino (inode number) field identifies the
>>>        owning/parent namespace.  This inode number can be matched with
>>>        the inode number of another /proc/[pid]/ns/{pid,user} file to
>>>        determine whether that is the owning/parent namespace.
>>
>> Like all fstat inode comparisons, to be fully accurate you need to
>> compare both the st_ino and st_dev.  I reserve the right for st_dev to
>> be significant when comparing namespaces.  Otherwise I might have to
>> create a namespace of namespaces someday and that is ugly.
>>
>>>        Either of these ioctl(2) operations can fail with the following
>>>        error:
>>>
>>>        EPERM  The requested namespace is outside of the caller's
>>>               namespace scope.  This error can occur if, for example,
>>>               the owning user namespace is an ancestor of the caller's
>>>               current user namespace.  It can also occur on attempts to
>>>               obtain the parent of the initial user or PID namespace.
>>>
>>>        Additionally, the NS_GET_PARENT operation can fail with the
>>>        following error:
>>>
>>>        EINVAL fd refers to a nonhierarchical namespace.
>>>
>>>        See the EXAMPLE section for an example of the use of these
>>>        operations.
>
> So, after playing with this a bit, I have a question.
>
> I gather that in order to, for example, elaborate the tree of user
> namespaces on the system, one would use NS_GET_PARENT on each of
> the /proc/*/ns/user files and match up the results. Right?
>
> What happens if one of the parent user namespaces contains no
> processes? That is, the parent namespace exists by virtue of being
> pinned because a /proc/PID/ns/user file is open or bind mounted.
> (Chrome seems to do this sort of dance with user namespaces, for
> example.) How do we find the ancestor of *that* user namespace?

What is returned from NS_GET_USERNS and NS_GET_PARENT is a file
descriptor that you can in turn call NS_GET_PARENT on.  So you can
walk up the hierarchy one descriptor at a time, whether or not the
intermediate namespaces contain any processes.

Eric
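
For illustration, a minimal sketch in C of the two points made in this thread: comparing namespaces by both st_dev and st_ino, and repeatedly calling NS_GET_PARENT on the descriptor returned by the previous call. The helper names same_ns() and walk_ancestors() are hypothetical and not part of any kernel or libc interface. The sketch assumes Linux 4.9 or later with <linux/nsfs.h> available, and it treats EPERM as the normal end of the walk (initial namespace reached, or the parent is outside the caller's scope).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/nsfs.h>   /* NS_GET_USERNS, NS_GET_PARENT (Linux 4.9+) */

/*
 * Hypothetical helper: returns 1 if both descriptors refer to the same
 * namespace, 0 if they do not, and -1 if fstat(2) fails.  Both st_dev
 * and st_ino are compared, as recommended in the thread.
 */
int same_ns(int fd1, int fd2)
{
    struct stat a, b;

    if (fstat(fd1, &a) == -1 || fstat(fd2, &b) == -1)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/*
 * Hypothetical helper: starting from a /proc/[pid]/ns/{pid,user}
 * descriptor, call NS_GET_PARENT on each returned descriptor in turn,
 * printing the device and inode of every ancestor reached.  Returns the
 * number of ancestors visited, or -1 on an unexpected error (for
 * example EINVAL for a nonhierarchical namespace).  The caller keeps
 * ownership of start_fd.
 */
int walk_ancestors(int start_fd)
{
    int depth = 0;
    int fd = dup(start_fd);

    if (fd == -1)
        return -1;

    for (;;) {
        struct stat sb;
        int parent = ioctl(fd, NS_GET_PARENT);

        if (parent == -1) {
            int saved = errno;

            close(fd);
            /* EPERM: initial namespace, or parent outside our scope. */
            return saved == EPERM ? depth : -1;
        }

        if (fstat(parent, &sb) == 0)
            printf("ancestor %d: dev=%llu ino=%llu\n", depth + 1,
                   (unsigned long long) sb.st_dev,
                   (unsigned long long) sb.st_ino);

        close(fd);
        fd = parent;    /* the returned fd pins the parent namespace */
        depth++;
    }
}
```

A caller would open, say, /proc/self/ns/user with O_RDONLY and pass the descriptor to walk_ancestors(). Because each returned descriptor keeps its namespace alive, the walk does not depend on any process being a member of the intermediate namespaces.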