From: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
To: "Eric W. Biederman"
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>,
Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
Cc: criu-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org,
Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Containers
<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
lkml <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"Michael Kerrisk (man-pages)"
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [CRIU] Introspecting userns relationships to other namespaces?
Date: Fri, 08 Jul 2016 07:35:33 -0700
Message-ID: <1467988533.2322.118.camel@HansenPartnership.com>
In-Reply-To: <87vb0gy3nr.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
On Fri, 2016-07-08 at 02:44 -0500, Eric W. Biederman wrote:
> Andrew Vagin <avagin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org> writes:
>
> > On Wed, Jul 06, 2016 at 10:46:33AM -0500, Eric W. Biederman wrote:
> > > "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> > >
> > > > > On Wed, Jul 06, 2016 at 10:41:48AM +0200, Michael Kerrisk
> > > > > (man-pages) wrote:
> > > > > > [Rats! Doing now what I should have done to start with.
> > > > > Looping some
> > > > > lists and CRIU and other possibly relevant people into this
> > > > > conversation]
> > > > >
> > > > > Hi Eric,
> > > > >
> > > > > On 5 July 2016 at 23:47, Eric W. Biederman <
> > > > > ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote:
> > > > > > "Michael Kerrisk (man-pages)" <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > > > > > writes:
> > > > > >
> > > > > > > Hi Eric,
> > > > > > >
> > > > > > > I have a question. Is there any way currently to discover
> > > > > > > which user namespace a particular nonuser namespace is
> > > > > > > governed by? Maybe I am missing something, but there does
> > > > > > > not seem to be a way to do this. Also, can one discover
> > > > > > > which userns is the parent of a given userns? Again, I
> > > > > > > can't see a way to do this.
> > > > > > >
> > > > > > > The point here is introspecting so that a process might
> > > > > > > determine what its capabilities are when operating on
> > > > > > > some resource governed by a (nonuser) namespace.
> > > > > >
> > > > > > > To the best of my knowledge there is not an interface
> > > > > > to get that information. It would be good to have such an
> > > > > > interface for no other reason than the CRIU folks are going
> > > > > > to need it at some point. I am a bit surprised they have
> > > > > > not complained yet.
> > > >
> > > > I don't think they need it. They do in fact have what they
> > > > need. Assume you have tasks T1, T2, T1_1 and T2_1; T1 and T2
> > > > are in init_user_ns; T1 spawned T1_1 in a new userns; T2
> > > > spawned T2_1 which setns()d to T1_1's ns. There's some
> > > > {handwave} uid mapping, does not matter.
> > > >
> > > > At restart, it doesn't matter which task originally created the
> > > > new userns. criu knows T1_1 and T2_1 are in the same userns;
> > > > it creates the userns, sets up the mapping, and T1_1 and T2_1
> > > > setns() to it.
> > >
> > > Given that the simple cases are so easy it probably doesn't
> > > matter in that sense.
> > >
> > > However we now have the case where user namespaces own pid
> > > namespaces, and uts namespaces, and network namespaces, and ipc
> > > namespaces, and filesystems. Throw in some mount propagation and
> > > use of setns and things could get confusing. It is something
> > > that will need to be figured out if CRIU is going to properly
> > > checkpoint containers containing containers containing containers
> > > containing containers.
> >
> > It isn't a joke:). We have a few requests to support CR of
> > containers with Docker containers inside. And we are going to start
> > this task in a near future, so we would like to have interface to
> > get dependencies between namespaces too.
> >
> > BTW: CRIU already supports nested mount namespaces, because systemd
> > creates them for services.
>
> The tricky part about this and what messes up James' proposed plan is
> that the interface needs to be something that returns a namespace
> file descriptor. So we can't print something out in a simple text
> file.

I actually described two problems. The first was how we get the
information in the first place: currently the owning or parent user_ns
is tucked inside an opaque structure. I think we need to move that to
ns_common, where it would be the owning userns for all non-user
namespaces and the parent for the userns.

Once we actually have the information, we can also add a set of proc
links, say either

/proc/<pid>/ns/X-userns

which might be a bit messy since it doubles the number of files, or
perhaps put them in a simple directory.

> Well I suppose we could print a device number and inode number pair.
> But then someone would still have to scour processes looking for a
> user namespace so that is likely less than ideal.

There's no reason any of the methods proposed so far has to be
mutually exclusive: nsfs.c has a lot of flexibility.

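To make the device/inode idea quoted above concrete, here is a rough userspace sketch in Python. It relies only on behaviour that exists today: stat() on a /proc/<pid>/ns/ link yields a (device, inode) pair that identifies the namespace, and the link target reads as type:[inode]. The helper names are made up for illustration, and nothing here answers the owning-userns question, which is exactly the missing piece under discussion.

```python
import os
import re

# Link targets under /proc/<pid>/ns/ look like "user:[4026531837]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inum>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("not a namespace link target: %r" % (target,))
    return match.group("kind"), int(match.group("inum"))


def ns_identity(pid, kind):
    """Return the (device, inode) pair identifying one namespace of a process."""
    st = os.stat("/proc/%s/ns/%s" % (pid, kind))
    return st.st_dev, st.st_ino


def same_namespace(pid_a, pid_b, kind):
    """True if both processes are in the same namespace of the given type."""
    return ns_identity(pid_a, kind) == ns_identity(pid_b, kind)
```

A caller could use same_namespace("self", 1, "user") to check whether it shares a user namespace with init, permissions allowing. As Eric notes, a bare pair like this still leaves the reader scouring processes to find a matching namespace, so it complements rather than replaces a file-descriptor interface.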
> Starting with 4.8 we are also going to need to be able to retrieve
> the user namespace owner of filesystems. That will be an interesting
> mix.

This is per mount point, isn't it? So it can't be in /proc/fs/; it
would have to be per local mount tree. Yes, that is a bit nasty.
Sounds like we might need to unfold mount or mountinfo into something
that has one directory per entry?

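For reference, this is roughly what "one entry per mount" looks like when unfolded from the existing text format. It is a minimal Python sketch that parses one line of /proc/<pid>/mountinfo using the field layout documented in proc(5); the function name and dictionary keys are invented for illustration. Note that none of these fields carries the owning user namespace, which is the information that would have to be added somewhere.

```python
def parse_mountinfo_line(line):
    """Parse one /proc/<pid>/mountinfo line into a dict (layout per proc(5))."""
    fields = line.split()
    # Optional fields run from index 6 up to a lone "-" separator.
    sep = fields.index("-", 6)
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "dev": fields[2],                  # major:minor of the filesystem
        "root": fields[3],                 # root of the mount within the filesystem
        "mount_point": fields[4],          # relative to the process's root
        "options": fields[5].split(","),   # per-mount options
        "optional": fields[6:sep],         # e.g. propagation tags like "master:1"
        "fstype": fields[sep + 1],
        "source": fields[sep + 2],
        "super_options": fields[sep + 3].split(","),
    }


def read_mountinfo(pid="self"):
    """Return one parsed entry per mount visible to the given process."""
    with open("/proc/%s/mountinfo" % pid) as f:
        return [parse_mountinfo_line(line) for line in f if line.strip()]
```

Each returned dict corresponds to what would be one directory in the unfolded layout, keyed by mount_id.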
James
> Eric