public inbox for linux-kernel@vger.kernel.org
From: Pavel Emelianov <xemul@openvz.org>
To: Dave Hansen <haveblue@us.ibm.com>
Cc: Andrew Morton <akpm@osdl.org>, Kirill Korotaev <dev@openvz.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux Containers <containers@lists.osdl.org>
Subject: Re: [PATCH 0/16] Pid namespaces
Date: Mon, 09 Jul 2007 09:58:49 +0400
Message-ID: <4691CE99.7030902@openvz.org>
In-Reply-To: <1183739214.10287.127.camel@localhost>

Dave Hansen wrote:
> On Fri, 2007-07-06 at 12:01 +0400, Pavel Emelianov wrote:
>> This is a "submission for inclusion" of hierarchical, not kconfig
>> configurable, zero-overhead ;) pid namespaces.
> 
> Pavel, I'm a bit disappointed that you went ahead and sent this.  I
> thought that, perhaps, you might have brought up how displeased you were
> with Suka's patches when we discussed them at OLS.  
> 
> Hold your horses there a bit.  This has "little" overhead for the common
> case, which is a single level of pid namespaces.  That means that it is
> quick to access the "global" pid which would be the one that the "host
> container" sees.  It also provides quick access to the pid which a
> containerized task gets when the task itself calls getpid().  This quick
> access is provided by storing the values directly in the task struct.
> 
> However, when there is more than one level in the container hierarchy,
> the optimization breaks down.  A process which exists in a three-level
> hierarchy has slow access to the middle level pid.  Your approach stores
> this information in a linked list, and surely *that* is going to have

No. This approach stores the numerical values in an array. I have
removed the lists altogether.
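
To illustrate the difference between a list and an array here, below
is a minimal userspace sketch. It is not the code from the patch set:
every name in it (pid_ns_sketch, pid_sketch, alloc_pid_sketch,
pid_nr_at_level) is hypothetical, and the real layout is in the
patches themselves. The only point is that with one array slot per
nesting level, the number seen from any level is a single indexed
load, and no level is special.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names; a userspace sketch, not the kernel structures. */
struct pid_ns_sketch {
	int level;                    /* 0 for the init namespace */
	struct pid_ns_sketch *parent; /* NULL for the init namespace */
	int last_nr;                  /* last number handed out here */
};

struct pid_sketch {
	int level; /* depth of the namespace the task was created in */
	int nrs[]; /* nrs[i] is the number as seen from level i */
};

/* Allocate one number in the given namespace and in each ancestor. */
static struct pid_sketch *alloc_pid_sketch(struct pid_ns_sketch *ns)
{
	struct pid_ns_sketch *cur;
	struct pid_sketch *pid;

	assert(ns->level >= 0);
	pid = malloc(sizeof(*pid) + (ns->level + 1) * sizeof(int));
	if (!pid)
		return NULL;

	pid->level = ns->level;
	for (cur = ns; cur; cur = cur->parent)
		pid->nrs[cur->level] = ++cur->last_nr;
	return pid;
}

/* Indexed lookup; returns 0 if the task is not visible at that level. */
static int pid_nr_at_level(const struct pid_sketch *pid, int level)
{
	if (level < 0 || level > pid->level)
		return 0;
	return pid->nrs[level];
}
```

In this sketch the cost of the allocation grows with the nesting
depth, but a lookup at the middle level of a three-level hierarchy
costs the same as a lookup at the top or the bottom.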

> overhead in fork().
> 
>> 2. Suka's patches have the limit of pid namespace nesting. 
>>    My patches do not.
> 
> I wouldn't say it that bluntly.  Suka's patches have a configurable
> limit simply because it makes the implementation simpler and faster.

I didn't say that this difference is crucial, either. I just pointed
out all the major differences. The main one (you dropped it without
any comment, but it is the main reason I sent my patches) is that
the *approaches* differ.

> There was also a version which dynamically allocated structures and had
> no inherent limits, but this was _much_ simpler.  We could add dynamic
> allocation to this in the future and only overflow into that case if we
> overrun the static buffers.
> 
> That is, in effect, what your patches do.  They hard-code for a
> two-level container, and dynamically allocate the levels after that.

Nope. This approach treats all the levels in the same way. My
previous version of the patches had configurable flat/multilevel
models, but this patch set has no Kconfig options and makes no
distinction between the 2nd and the 5th levels. However, there are
some lightweight optimizations concerning the init ns. They are
there so that the kernel is not affected for people who do not need
namespaces at all.

> Suka's patches allow for arbitrary (but, config-time fixed) depth to be
> optimized for, and don't disallow a future dynamically-allocated
> completely arbitrary depth. 
> 
> All of that said, I think that your approach would probably _work_ for
> our needs.  I agree with Eric Biederman that your approach is a bit of a
> hack (with the hard-coded optimization for two levels), but it would

Wrong again. As I have said, this set makes no distinction between
levels of namespace nesting. I have reimplemented my whole set.

> certainly _work_, or we can make it work.  That said, is it possible for
> Suka's to work for you?

It may work, but as I have said, there are (currently) two approaches
to implementing pid namespaces. The difference is described in detail
in my original [PATCH 0/16] letter.

>> 3. Suka assumes that a pid namespace can live without a proc mount
>>    and tries to make the code work with pid_ns->proc_mnt changing
>>    from NULL to non-NULL from time to time.
>>    My code calls kern_mount() at namespace creation, and
>>    thus the pid_namespace always works with proc.
> 
> Have you run this by Al Viro and the other fs guys?  /proc is a weird
> beast :)

The proc changes are trivial. The main difference is that different
proc mounts can have different superblocks. However, you are right
about one thing: I should have Cc'ed Al Viro on the proc patch...
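
To make the invariant from point 3 concrete, here is a small
userspace sketch, again with hypothetical names only
(proc_mount_sketch, mount_proc_sketch, create_ns_sketch). It does not
call kern_mount() or touch any real kernel API; it only models the
idea that a namespace is never handed out without its proc mount, and
that each mount gets its own superblock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; nothing here is a real kernel interface. */
struct proc_mount_sketch {
	int sb_id; /* models a per-mount superblock */
};

struct ns_sketch {
	struct proc_mount_sketch *proc_mnt; /* never NULL once created */
};

static int next_sb_id = 1;

/* Models the internal proc mount done at namespace creation. */
static struct proc_mount_sketch *mount_proc_sketch(void)
{
	struct proc_mount_sketch *mnt = malloc(sizeof(*mnt));

	if (mnt)
		mnt->sb_id = next_sb_id++;
	return mnt;
}

/* Creation either yields a namespace with proc mounted, or fails. */
static struct ns_sketch *create_ns_sketch(void)
{
	struct ns_sketch *ns = malloc(sizeof(*ns));

	if (!ns)
		return NULL;
	ns->proc_mnt = mount_proc_sketch();
	if (!ns->proc_mnt) {
		free(ns);
		return NULL;
	}
	return ns;
}

static void destroy_ns_sketch(struct ns_sketch *ns)
{
	assert(ns && ns->proc_mnt);
	free(ns->proc_mnt);
	free(ns);
}
```

With this shape, code that uses the namespace does not need to handle
a NULL proc_mnt, because that state cannot be observed after a
successful creation.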

> -- Dave

Thanks,
Pavel
