From: Oren Laadan <orenl@librato.com>
To: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Matt Helsley <matthltc@us.ibm.com>,
arnd@arndb.de, Containers <containers@lists.linux-foundation.org>,
linux-kernel@vger.kernel.org,
"Eric W. Biederman" <ebiederm@xmission.com>,
hpa@zytor.com, Alexey Dobriyan <adobriyan@gmail.com>,
roland@redhat.com, Pavel Emelyanov <xemul@openvz.org>
Subject: Re: [v11][PATCH 9/9] Document clone_with_pids() syscall
Date: Sat, 07 Nov 2009 16:56:13 -0500
Message-ID: <4AF5ECFD.3000509@librato.com>
In-Reply-To: <20091107022612.GA18039@suka>

Sukadev Bhattiprolu wrote:
> Matt Helsley [matthltc@us.ibm.com] wrote:
> | > If userspace passes an array with n pids and there are k namespace levels
> | > then clone_with_pids() makes sure that the kernel sees a pid array like:
> | >
> | > index   0   ...   k - (n + 1)         ...         k - 1
> | >       +-----------------------+-------------------------+
> | > pid_t | 0 ..................0 | <copied from userspace> |
> | >       +-----------------------+-------------------------+
> |
> | (diagram assumes n != k. If n == k then pids[0] is the pid desired
> | in the initial namespace..)
>
> True.
>
> Also, I was not sure if we should prevent choosing pids in ancestor containers,
> since a process is not even supposed to know of ancestor namespaces. Is there
> a need for choosing pids in those namespaces?

IMHO this is a bit confusing.

A process observes a single namespace - the one in which it "lives".
There is no such thing as descendant namespaces for that process.
There may be ancestor namespaces.

The clone occurs in the context of the process. So the process that
is forking _must_ indicate pids in _ancestor_ namespaces if it wishes
to select pids in those (as is the case in c/r).

>
> |
> | >
> | > So even though the order is different from choosepid() the calling
> | > task still doesn't need to know its pidns level. Of course, just
> | > like choosepid(), n <= k or userspace will get EINVAL.
> |
> | Forgot to mention that I prefer the way choosepid orders the pids.
> | It's not inspired by the way that the kernel implements pid namespaces
> | and has more to do with the way userspace sees things (IMHO).
>
> Hmm, in general we C/R a descendant container. So the way userspace
> sees it at that point is "what are the pids of this process in my current
> and in any descendant namespaces". IOW, the pid of the container from which
> we checkpoint seems more interesting first - right? If so, the pids[]
> are better ordered from older namespace to younger namespace?

When we checkpoint, we use an external process to record the state of
(current or) descendant namespaces.

When we restart, we run in the context of the restarting process, so
we select a pid in the current and _ancestor_ namespaces.

So the order of pids as they (will) appear in the checkpoint image for
a given process will be from an ancestor down to descendant namespaces.
And this is how we (will) hand them over to eclone().
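
To make the layout concrete, here is a minimal userspace sketch of the
padding shown in the diagram quoted earlier, assuming the ancestor-first
order described above. It is not code from the patch set:
build_target_pids() and its parameters are hypothetical names chosen for
illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only, not from the patches).
 *
 * Expand n caller-supplied pids into an array with one slot per pid
 * namespace level, k slots in total. The caller's pids are assumed to
 * be ordered from the oldest chosen ancestor namespace down to the
 * namespace the new process will live in.
 *
 *   out[0 .. k-n-1]  = 0            (no specific pid requested)
 *   out[k-n .. k-1]  = in[0 .. n-1]
 *
 * Returns 0 on success, or -EINVAL if n > k.
 */
static int build_target_pids(pid_t *out, size_t k, const pid_t *in, size_t n)
{
	size_t i;

	if (n > k)
		return -EINVAL;

	/* Levels the caller did not specify are left for normal allocation. */
	for (i = 0; i < k - n; i++)
		out[i] = 0;

	/* The caller's pids land in the last n slots, order preserved. */
	for (i = 0; i < n; i++)
		out[k - n + i] = in[i];

	return 0;
}
```

With k = 4 and in = {77, 3}, the result is {0, 0, 77, 3}: pid 77 in the
parent namespace and pid 3 in the process's own namespace. The caller
never has to know its own nesting level, only how many levels it wants
to pin.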
>
> | I don't know if it makes more sense to change clone_with_pids() or have
> | [e]glibc wrappers swap the array contents.

I prefer to decide now on an order and stick to it in the kernel and
in glibc.

Oren