public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Pavel Emelianov <xemul@openvz.org>
To: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrew Morton <akpm@osdl.org>, Kirill Korotaev <dev@openvz.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux Containers <containers@lists.osdl.org>
Subject: Re: [PATCH 0/16] Pid namespaces
Date: Mon, 09 Jul 2007 17:16:17 +0400	[thread overview]
Message-ID: <46923521.4040109@openvz.org> (raw)
In-Reply-To: <20070709120236.GC4509@MAIL.13thfloor.at>

Herbert Poetzl wrote:
> On Fri, Jul 06, 2007 at 12:01:59PM +0400, Pavel Emelianov wrote:
>> This is "submition for inclusion" of hierarchical, not kconfig
>> configurable, zero overheaded ;) pid namespaces.
>>
>> The overall idea is the following:
>>
>> The namespace are organized as a tree - once a task is cloned
>> with CLONE_NEWPIDS (yes, I've also switched to it :) the new
>> namespace becomes the parent's child and tasks living in the
>> parent namespace see the tasks from the new one. The numerical
>> ids are used on the kernel-user boundary, i.e. when we export
>> pid to user we show the id, that should be used to address the
>> task in question from the namespace we're exporting this id to.
> 
> how does that behave when:
> 
>  a) the parent dies and gets reaped?

The children are re-parented to the namespace's init.
Surprised?

>  b) the 'spawned' init dies, but other tasks
>     inside the pid space are still active?

The init's init becomes the namespace's init.

>  c) what visibility rules do apply for the
>     various spaces (including the default host space)?

Each task sees tasks from its namespace and all its children
namespaces. Yes, each task can see itself as well ;)

>> The main difference from Suka's patches are the following:
>>
>> 0. Suka's patches change the kernel/pid.c code too heavy.
>>    This set keeps the kernel code look like it was without
>>    the patches. However, this is a minor issue. The major is:
>>
>> 1. Suka's approach is to remove the notion of the task's 
>>    numerical pid from the kernel at all. The numbers are 
>>    used on the kernel-user boundary or within the kernel but
>>    with the namespace this nr belongs to. This results in 
>>    massive changes of struct's members fro int pid to struct
>>    pid *pid, task->pid becomes the virtual id and so on and
>>    so forth.
>>    My approach is to keep the good old logic in the kernel. 
>>    The task->pid is a global and unique pid, find_pid() finds
>>    the pid by its global id and so on. The virtual ids appear
>>    on the user-kernel boundary only. Thus drivers and other 
>>    kernel code may still be unaware of pids unless they do not
>>    communicate with the userspace and get/put numerical pids.
> 
> interesting ... not sure that is what kernel folks
> have in mind though (IIRC, the struct pid change was
> considered a kernel side cleanup)

That's why I'm sending the patches - to make "kernel folks" make
a decision. Will we see some patches from VServer team?

>> And some more minor differences:
>>
>> 2. Suka's patches have the limit of pid namespace nesting. 
>>    My patches do not.
>>
>> 3. Suka assumes that pid namespace can live without proc mount
>>    and tries to make the code work with pid_ns->proc_mnt change
>>    from NULL to not-NULL from times to times.
>>    My code calls the kern_mount() at the namespace creation and
>>    thus the pid_namespace always works with proc.
> 
> shouldn't that be done by userspace instead?

It can be. But when the namespace is being created there's no
any userspace in it yet.

>> There are some small issues that I can describe if someone is
>> interested.
>>
>> The tests like nptl perf, unixbench spawn, getpid and others
>> didn't reveal any performance degradation in init_namespace
>> with the RHEL5 kernel .config file. I admit, that different
>> .config-s may show that patches hurt the performance, but the
>> intention was *not* to make the kernel work worse with popular
>> distributions.
>>
>> This set has some ways to move forward, but this is some kind
>> of a core, that do not change the init_pid_namespace behavior
>> (checked with LTP tests) and may require some hacking to do 
>> with the namespaces only.
> 
> TIA,
> Herbert
> 

BTW, why did you remove Suka and Serge from Cc?

Pavel

  reply	other threads:[~2007-07-09 13:54 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-06  8:01 [PATCH 0/16] Pid namespaces Pavel Emelianov
2007-07-06  8:03 ` [PATCH 1/16] Round up the API Pavel Emelianov
2007-07-09 20:18   ` Cedric Le Goater
2007-07-10  6:40     ` Pavel Emelianov
2007-07-10  7:34       ` Andrew Morton
2007-07-06  8:03 ` [PATCH 2/16] Miscelaneous preparations for namespaces Pavel Emelianov
2007-07-09 20:22   ` Cedric Le Goater
2007-07-10  6:42     ` Pavel Emelianov
2007-07-06  8:04 ` [PATCH 3/16] Introduce MS_KERNMOUNT flag Pavel Emelianov
2007-07-06  8:05 ` [PATCH 4/16] Change data structures for pid namespaces Pavel Emelianov
2007-07-09 20:25   ` Cedric Le Goater
2007-07-10  4:32     ` sukadev
2007-07-10  7:04       ` Pavel Emelianov
2007-07-10 12:07         ` Cedric Le Goater
2007-07-06  8:05 ` [PATCH 5/16] Make proc be mountable from different " Pavel Emelianov
2007-07-06  8:06 ` [PATCH 6/16] Helpers to obtain pid numbers Pavel Emelianov
2007-07-10  5:18   ` sukadev
2007-07-10  6:49     ` Pavel Emelianov
2007-07-06  8:07 ` [PATCH 7/16] Helpers to find the task by its numerical ids Pavel Emelianov
2007-07-10  4:00   ` sukadev
2007-07-10  6:47     ` Pavel Emelianov
2007-07-06  8:07 ` [PATCH 8/16] Masquerade the siginfo when sending a pid to a foreign namespace Pavel Emelianov
2007-07-10  4:18   ` sukadev
2007-07-10  6:56     ` Pavel Emelianov
2007-07-06  8:08 ` [PATCH 9/16] Make proc_flust_task to flush entries from multiple proc trees Pavel Emelianov
2007-07-06  8:08 ` [PATCH 10/16] Changes in copy_process() to work with pid namespaces Pavel Emelianov
2007-07-12  0:21   ` sukadev
2007-07-06  8:09 ` [PATCH 11/16] Add support for multiple kmem caches for pids Pavel Emelianov
2007-07-06  8:10 ` [PATCH 12/16] Reference counting of pid naspaces by pids Pavel Emelianov
2007-07-06  8:10 ` [PATCH 13/16] Switch to operating with pid_numbers instead of pids Pavel Emelianov
2007-07-25  0:36   ` sukadev
2007-07-25 10:07     ` Pavel Emelyanov
2007-07-25 19:13       ` sukadev
2007-07-26  6:42         ` Pavel Emelyanov
2007-07-06  8:11 ` [PATCH 14/16] Make pid namespaces clonnable Pavel Emelianov
2007-07-06  8:13 ` [PATCH 15/16] Changes to show virtual ids to user Pavel Emelianov
2007-07-06  8:16 ` [PATCH 16/16] Remove already unneeded memners from struct pid Pavel Emelianov
2007-07-06 16:26 ` [PATCH 0/16] Pid namespaces Dave Hansen
2007-07-09  5:58   ` Pavel Emelianov
2007-07-09 19:58     ` Dave Hansen
2007-07-09 12:02 ` Herbert Poetzl
2007-07-09 13:16   ` Pavel Emelianov [this message]
2007-07-09 19:52     ` Herbert Poetzl
2007-07-09 20:12       ` Cedric Le Goater
2007-07-10  6:59         ` Pavel Emelianov
2007-07-09 17:46 ` Badari Pulavarty
2007-07-09 20:06   ` Cedric Le Goater
2007-07-09 23:00     ` Badari Pulavarty
2007-07-10  7:05       ` Pavel Emelianov
2007-07-10 11:30     ` Pavel Emelianov
2007-07-10 12:05       ` Daniel Lezcano
2007-07-10 13:03         ` Pavel Emelianov
2007-07-10 20:34       ` Badari Pulavarty
2007-07-10 13:06   ` Pavel Emelianov
2007-07-10 20:33     ` Badari Pulavarty
2007-07-09 21:42 ` sukadev
2007-07-10  0:29 ` sukadev
2007-07-10  9:41   ` Pavel Emelianov
2007-07-10 13:08   ` Pavel Emelianov
2007-07-10  4:26 ` sukadev
2007-07-10  7:02   ` Pavel Emelianov
2007-07-11  1:16 ` Matt Mackall
2007-07-11  6:39   ` Pavel Emelianov
2007-07-11 15:14     ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46923521.4040109@openvz.org \
    --to=xemul@openvz.org \
    --cc=akpm@osdl.org \
    --cc=containers@lists.osdl.org \
    --cc=dev@openvz.org \
    --cc=ebiederm@xmission.com \
    --cc=herbert@13thfloor.at \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox