From: Oren Laadan <orenl@cs.columbia.edu>
To: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Linux Containers <containers@lists.osdl.org>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH linux-cr] nested pid namespaces (v2)
Date: Tue, 23 Mar 2010 01:39:38 -0400 [thread overview]
Message-ID: <4BA8541A.3090306@cs.columbia.edu> (raw)
In-Reply-To: <20100323042653.GB9014@us.ibm.com>
Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl@cs.columbia.edu):
>>
>> Serge E. Hallyn wrote:
>>> Support checkpoint and restart of tasks in nested pid namespaces. At
>>> Oren's request here is an alternative to my previous implementation. In
>>> this one, we keep the original single pids_array to minimize memory
>>> allocations. The pids array entries are augmented with a pidns depth
>> Thanks for adapting the patch.
>>
>> FWIW, not only minimize memory allocations, but also permit a more
>> regular structure of the image data (array of fixed size elements
>> followed by an array of vpids), which simplifies the code that needs
>> to read/write/access this data.
>>
>>> (relative to the container init's pidns, and an "rpid" which is the pid
>>> in the checkpointer's pidns (or 0 if no valid pid exists). The rpid
>>> will be used by userspace to gather more information (like
>>> /proc/$$/mountinfo) after the kernel sys_checkpoint. If any tasks are
>>> in nested pid namespace, another single array holds all of the vpids.
>>> At restart those are used by userspace to determine how to call
>>> eclone(). Kernel ignores them.
>>>
>>> All cr_tests including the new pid_ns testcase pass.
>>>
>>> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
>>> ---
>> [...]
>
> Thanks, Oren - all other input is taken into what I'm about to post,
> except:
>
>>> @@ -293,10 +295,15 @@ static int may_checkpoint_task(struct ckpt_ctx *ctx, struct task_struct *t)
>>> _ckpt_err(ctx, -EPERM, "%(T)Nested net_ns unsupported\n");
>>> ret = -EPERM;
>>> }
>>> - /* no support for >1 private pidns */
>>> - if (nsproxy->pid_ns != ctx->root_nsproxy->pid_ns) {
>>> - _ckpt_err(ctx, -EPERM, "%(T)Nested pid_ns unsupported\n");
>>> - ret = -EPERM;
>>> + /* pidns must be descendent of root_nsproxy */
>>> + pidns = nsproxy->pid_ns;
>>> + while (pidns != ctx->root_nsproxy->pid_ns) {
>>> + if (pidns == &init_pid_ns) {
>>> + ret = -EPERM;
>>> + _ckpt_err(ctx, ret, "%(T)stranger pid_ns\n");
>>> + break;
>>> + }
>>> + pidns = pidns->parent;
>> Currently we do this while() loop twice - once here and once when
>> we collect the vpids. While I doubt if this has any performance
>> impact, is there an advantage to doing it also here ? (a violation
>> will be observed there too).
>
> With the new logic (ripped verbatim from Louis' email) such a move
> would make the checkpoint_vpids() code a bit uglier. I'm about to
> resend, please let me know if you still want the code moved.
>
If you think it's simpler this way, then so be it.
> ...
>
>>> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
>>> index 0da0d83..6d86240 100644
>>> --- a/kernel/nsproxy.c
>>> +++ b/kernel/nsproxy.c
>>> @@ -364,8 +364,13 @@ static struct nsproxy *do_restore_ns(struct ckpt_ctx *ctx)
>>> get_net(net_ns);
>>> nsproxy->net_ns = net_ns;
>>> - get_pid_ns(current->nsproxy->pid_ns);
>>> - nsproxy->pid_ns = current->nsproxy->pid_ns;
>>> + /*
>>> + * The pid_ns will get assigned the first time that we
>>> + * assign the nsproxy to a task. The task had unshared
>>> + * its pid_ns in userspace before calling restart, and
>>> + * we want to keep using that pid_ns.
>>> + */
>>> + nsproxy->pid_ns = NULL;
>> This doesn't look healthy.
>>
>> If it is (or will be) possible for another process to look at the
>> restarting process, not having a pid-ns may confuse other code in
>> the kernel ?
>
> No task will have this nproxy attached before we assign a valid
> pid_ns. The NULL pid_ns is only while it is in the objhash but
> not attached to a task.
Ahh .. I see, ok then.
Oren.
prev parent reply other threads:[~2010-03-23 5:39 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-19 21:39 [PATCH linux-cr] nested pid namespaces (v2) Serge E. Hallyn
2010-03-19 21:41 ` [PATCH] user-ns: Nested pidns support (v2) Serge E. Hallyn
[not found] ` <20100319213955.GA17912-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-22 10:55 ` [PATCH linux-cr] nested pid namespaces (v2) Louis Rilling
2010-03-22 10:55 ` Louis Rilling
2010-03-22 14:38 ` Serge E. Hallyn
2010-03-22 19:57 ` Matt Helsley
2010-03-23 6:41 ` Louis Rilling
2010-03-22 23:17 ` Oren Laadan
2010-03-23 4:26 ` Serge E. Hallyn
2010-03-23 5:39 ` Oren Laadan [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4BA8541A.3090306@cs.columbia.edu \
--to=orenl@cs.columbia.edu \
--cc=containers@lists.osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=serue@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.