From: sukadev@us.ibm.com
To: Pavel Emelianov <xemul@openvz.org>
Cc: Andrew Morton <akpm@osdl.org>, Serge Hallyn <serue@us.ibm.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Linux Containers <containers@lists.osdl.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Kirill Korotaev <dev@openvz.org>
Subject: Re: [PATCH 0/16] Pid namespaces
Date: Mon, 9 Jul 2007 14:42:44 -0700
Message-ID: <20070709214244.GA11185@us.ibm.com>
In-Reply-To: <468DF6F7.1010906@openvz.org>
Pavel Emelianov [xemul@openvz.org] wrote:
| This is "submission for inclusion" of hierarchical, not kconfig
| configurable, zero-overhead ;) pid namespaces.
|
| The overall idea is the following:
|
| The namespaces are organized as a tree - once a task is cloned
| with CLONE_NEWPIDS (yes, I've also switched to it :) the new
| namespace becomes the parent's child and tasks living in the
| parent namespace see the tasks from the new one. The numerical
| ids are used on the kernel-user boundary, i.e. when we export
| pid to user we show the id, that should be used to address the
| task in question from the namespace we're exporting this id to.
|
| The main differences from Suka's patches are the following:
|
| 0. Suka's patches change the kernel/pid.c code too heavily.
| This set keeps the kernel code look like it was without
| the patches. However, this is a minor issue. The major is:
|
| 1. Suka's approach is to remove the notion of the task's
| numerical pid from the kernel at all. The numbers are
| used on the kernel-user boundary or within the kernel but
| with the namespace this nr belongs to. This results in
| massive changes of struct members from int pid to struct
| pid *pid, task->pid becomes the virtual id and so on and
| so forth.
Your basic design is similar to what our patchset has been for
a while, with a few changes.

My patchset does not remove task->pid. It still uses it,
with the caveat that with multiple namespaces it is not unique.
The getpid() implementation does not change, for instance.

Basically our patchset has init_pid_ns as the last element in the
pid->numbers[] array while yours has it as the first. How
big a difference that makes, I am not sure.
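
To make the ordering question concrete, here is a minimal user-space
sketch. It is not code from either patchset: the names sketch_ns,
sketch_upid, sketch_pid, nr_in_ns() and the bound SKETCH_MAX_LEVEL are
hypothetical, and only the idea of a per-namespace numbers[] array comes
from the discussion above. The sketch stores the init namespace's id at
index 0 (the "first element" layout); the "last element" layout would
index the array in the opposite direction.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LEVEL 4          /* hypothetical bound, for this sketch only */

struct sketch_ns {
    int level;                      /* 0 for the initial namespace */
    struct sketch_ns *parent;       /* NULL for the initial namespace */
};

struct sketch_upid {
    int nr;                         /* id as seen from inside ns */
    struct sketch_ns *ns;
};

struct sketch_pid {
    int level;                      /* level of the namespace the task lives in */
    /* numbers[0] holds the init-namespace id, numbers[level] the innermost id */
    struct sketch_upid numbers[SKETCH_MAX_LEVEL];
};

/* Return the id of pid as seen from ns, or 0 if pid is not visible from ns. */
static int nr_in_ns(const struct sketch_pid *pid, const struct sketch_ns *ns)
{
    assert(pid->level >= 0 && pid->level < SKETCH_MAX_LEVEL);

    if (ns->level > pid->level)
        return 0;                   /* ns is nested deeper than the task */
    if (pid->numbers[ns->level].ns != ns)
        return 0;                   /* same depth, but a different branch */
    return pid->numbers[ns->level].nr;
}
```

With this layout the lookup is a single array index by namespace level,
and a task is invisible from any namespace that is not one of its
ancestors.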
|
| My approach is to keep the good old logic in the kernel.
| The task->pid is a global and unique pid, find_pid() finds
| the pid by its global id and so on. The virtual ids appear
| on the user-kernel boundary only. Thus drivers and other
| kernel code may still be unaware of pids as long as they do not
| communicate with the userspace and get/put numerical pids.
Even in my patchset, drivers or other kernel code have no need
to know anything about namespaces.
Actually you seem to introduce a new function find_vpid() that
is used in a driver. So a driver-writer needs to know whether
to call find_pid() or find_vpid().
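
The choice a driver writer faces can be illustrated with a small
user-space sketch. This is an assumption-laden model, not the code from
the patches: sk_task, sk_find_pid() and sk_find_vpid() are hypothetical
stand-ins, the namespace is reduced to an integer ns_id, and the
hierarchy (a parent namespace seeing its children's tasks) is ignored.

```c
#include <assert.h>
#include <stddef.h>

struct sk_task {
    int global_nr;      /* unique across the whole system */
    int ns_id;          /* hypothetical identifier of the task's namespace */
    int virtual_nr;     /* id as seen from inside that namespace */
};

/* Lookup by global id: the number never crossed the user-kernel boundary. */
static const struct sk_task *sk_find_pid(const struct sk_task *tasks,
                                         size_t n, int global_nr)
{
    assert(tasks != NULL);
    for (size_t i = 0; i < n; i++)
        if (tasks[i].global_nr == global_nr)
            return &tasks[i];
    return NULL;
}

/* Lookup of an id supplied by userspace: interpret it in the caller's namespace. */
static const struct sk_task *sk_find_vpid(const struct sk_task *tasks,
                                          size_t n, int caller_ns,
                                          int virtual_nr)
{
    assert(tasks != NULL);
    for (size_t i = 0; i < n; i++)
        if (tasks[i].ns_id == caller_ns && tasks[i].virtual_nr == virtual_nr)
            return &tasks[i];
    return NULL;
}
```

In this model, a number that arrived from userspace can name different
tasks depending on the caller's namespace, so passing it to the global
lookup may return the wrong task.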
|
| And some more minor differences:
|
| 2. Suka's patches have the limit of pid namespace nesting.
| My patches do not.
Yes - it's a compile-time constant (MAX_NESTED_PID_NS) that I
introduced just in the last version to simplify allocation.
Especially after you argued against arbitrary depth before :-)

The basic design of your new 'struct pid' data structure is very
similar to what we have had for the last couple of rounds and we
could just as easily remove MAX_NESTED_PID_NS.
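
The allocation trade-off can be sketched in user-space C as follows.
Everything here is hypothetical: SK_MAX_NESTED_PID_NS and its value of 4
are placeholders (the real constant's value is not stated in this
thread), and sk_pid_fixed, sk_pid_flex, sk_fixed_can_nest(),
sk_alloc_flex() and sk_set_nr() are illustrative names only. The fixed
variant has one object size but caps the nesting depth; the flexible
variant sizes each object by its own depth.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_MAX_NESTED_PID_NS 4      /* hypothetical value, for this sketch only */

/* Fixed depth: one object size, simple allocation, hard nesting limit. */
struct sk_pid_fixed {
    int level;
    int numbers[SK_MAX_NESTED_PID_NS];
};

/* Arbitrary depth: the object carries level + 1 ids in a flexible array. */
struct sk_pid_flex {
    int level;
    int numbers[];
};

/* With the fixed layout, a namespace at this level is only possible below the cap. */
static int sk_fixed_can_nest(int level)
{
    return level >= 0 && level < SK_MAX_NESTED_PID_NS;
}

/* Allocate a zeroed pid with room for one id per namespace level 0..level. */
static struct sk_pid_flex *sk_alloc_flex(int level)
{
    struct sk_pid_flex *pid;

    if (level < 0)
        return NULL;
    pid = calloc(1, sizeof(*pid) + (size_t)(level + 1) * sizeof(pid->numbers[0]));
    if (pid)
        pid->level = level;
    return pid;
}

/* Record the id this pid has in the namespace at at_level. */
static void sk_set_nr(struct sk_pid_flex *pid, int at_level, int nr)
{
    assert(at_level >= 0 && at_level <= pid->level);
    pid->numbers[at_level] = nr;
}
```

The caller owns the object returned by sk_alloc_flex() and releases it
with free(). Variable-sized objects are the price of removing the cap,
since a single fixed-size allocation no longer fits every depth.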
|
| 3. Suka assumes that pid namespace can live without proc mount
| and tries to make the code work with pid_ns->proc_mnt change
| from NULL to not-NULL from time to time.
| My code calls the kern_mount() at the namespace creation and
| thus the pid_namespace always works with proc.
Yes, we are still debating the better approach for this.
We have been considering doing the kern_mount, as we do in
init_pid_ns at present.
|
| There are some small issues that I can describe if someone is
| interested.
|
| The tests like nptl perf, unixbench spawn, getpid and others
| didn't reveal any performance degradation in init_namespace
| with the RHEL5 kernel .config file. I admit, that different
| .config-s may show that patches hurt the performance, but the
| intention was *not* to make the kernel work worse with popular
| distributions.
|
| This set has some ways to move forward, but this is some kind
| of a core, that does not change the init_pid_namespace behavior
| (checked with LTP tests) and may require some hacking to do
| with the namespaces only.
|
| Patches apply to 2.6.22-rc6-mm1.