Re: (pspace,pid) vs true pid virtualization

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Hubertus Franke <frankeh@watson.ibm.com>
To: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Kirill Korotaev <dev@sw.ru>,
	linux-kernel@vger.kernel.org, vserver@list.linux-vserver.org,
	Herbert Poetzl <herbert@13thfloor.at>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Dave Hansen <haveblue@us.ibm.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Suleiman Souhlal <ssouhlal@FreeBSD.org>,
	Cedric Le Goater <clg@fr.ibm.com>,
	Kyle Moffett <mrmacman_g4@mac.com>, Greg <gkurz@fr.ibm.com>,
	Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>,
	Greg KH <greg@kroah.com>, Rik van Riel <riel@redhat.com>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Andrey Savochkin <saw@sawoct.com>,
	Kirill Korotaev <dev@openvz.org>, Andi Kleen <ak@suse.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Jeff Garzik <jgarzik@pobox.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Jes Sorensen <jes@sgi.com>
Subject: Re: (pspace,pid) vs true pid virtualization
Date: Thu, 16 Feb 2006 22:35:38 -0500	[thread overview]
Message-ID: <43F5448A.6020501@watson.ibm.com> (raw)
In-Reply-To: <20060216142928.GA22358@sergelap.austin.ibm.com>

Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
>>"Serge E. Hallyn" <serue@us.ibm.com> writes:
>>With respect to pids lets not get caught up in the implementation
>>details.  Let's first get clear on what the semantics should be.
>>
>>- Should the first pid in a pid space have pid 1?
that should only be required if you have system containers
or if there are tools or requirements that if I wonderup my process tree
that I ultimately must end up at 1.
>>
>>- Should pid == 1 ignore signals, it doesn't have a handler for?

Yes
>>
>>- Should any children of pid 1 be allowed to live when pid == 1 is killed?
No  .. it has init semantics !

> 
> 
> But doesn't that depend on whether we use (pspace,pid) or vpids?  If
> vpids, then init isn't really a problem, since from kernelspace
> processes in a comtainer stil have a global pid and global parent, and
> init knows them.  If (pspace,pid), then we need a fakeinit bc the real
> init doesn't know about the processes in the container...
> 
> 
>>- Should a process have some sort of global (on the machine identifier)?

First establish whether that global ID has to be persistent ...
I don't see why ! In which case the TASK_REF is the perfect global ID.
> 
> 
> I think to satisfy openvz existing customers this must be a yes.  With
> vpid the answer is simple.  With (pspace,pid), there are three anwers i've
> heard, namely
> 
> 	1. just use pspaceid, pid
> 	2. make pspaceid small and use (pspaceid << SOMEBITS | pid)
> 	3. use pid1/pid2/pid3 where pid1 is creator of pid and its
> 	pspace, etc...
This implies that pid2 can be looked up in the context of pid1.
In OpenVZ approach that's possible, In pspaces.. isn't that the wpid ?

> 
> But the openvz guys also don't want userspace tool changes, making (2)
> the most likely option.  Any other ideas?
> 
> 
>>- Should the pids in a pid space be visible from the outside?
> 
> 
> Again, the openvz guys say yes.
> 
> I think it should be acceptable if a pidspace is visible in all it's
> ancestor pidspaces.  I.e. if I create pspace2 and pspace3 from pid 234
> in pspace1, then pspace2 doesn't need to be able to address pspace3
> and vice versa.

That means you need to do a more complicated lookup ! for instance let's say you have hierarchy
pspace1
    |--->pspace2
    |       |--->pspace2a
    |--->pspace3	
            |--->pspace3a

let's assume we use the (pspaceid<<BITS | pid ) global id. To verify I have to
ensure that the target pid can reach the originating pid in its ancestor path.
Not a biggy as these pspace trees probably won't get much deeper then 3 or 4.

> 
> Kirill, is that acceptable?
> 
> 
>>- Should the parent of pid 1 be able to wait for it for it's 
>>  children?
> 
> Yes.

Yes ... VPID does that and wpid in pspace does that as well.
> 
> 
>>- Is a completely disjoin pid space acceptable to anyone?
> 
> 
> To anyone?  yes  :)
> 
> TO everyone, I don't think so.
> 
hehh...   yes they should be disjoined other then at the top
where we want to wait ..
> 
>>- What should the parent of pid == 1 see?
>>
>>- Should a process not in the default pid space be able to create 
>>  another pid space?
> 
> 
> Yes.

How else do you get hierarchy ....

> 
> This is to support using pidspaces for vservers, and creating
> migrateable sub-pidspaces in each vserver.
> 
> 
>>- Should we be able to monitor a pid space from the outside?
> 
> 
> To some extent, yes.
> 
> 
>>- Should we be able to have processes enter a pid space?
> 
> 
> IMO that is crucial.

Existing ones .. now that is potentially difficult to do. Particular
if you want to enter a pidspace that has already been migrated.
Because ones assigned pid might already been taken in the target pspace.

> 
> 
>>- Do we need to be able to be able to ptrace/kill individual processes
>>  in a pid space, from the outside, and why?
> 
> 
> I think this is completely unnecessary so long as a process can enter a
> pidspace.
> 
> 
>>- After migration what identifiers should the tasks have?
> 
> 
> It must be possible to retain the same pids, at least from inside the
> container.

Absolutely .. otherwise all cashed pids in userspace are meaningless.

> 
> So this is irrelevant, as the openvz approach can just virtualize the
> old pid, while (pspace, pid) will be able to create a new container and
> use the old pid values, which are then guaranteed to not be in use.

Exactly .. mute issue, this is "trivial" as long as you can fork with
a particular pid used.

> 
> 
>>If we can answer these kinds of questions we can likely focus in
>>on what the implementation should look like.  So far I have not
>>seen a question that could not be implemented with a (pspace, pid)/pid
>>or a vpid/pid implementation.
> 
> 
> But you have, haven't you?  Namely, how can openvz provide it's
> customers with a global view of all processes without putting 5 years of
> work into a new sysadmin interface?
> 
> -serge
>

next prev parent reply	other threads:[~2006-02-17  3:35 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-15 14:59 (pspace,pid) vs true pid virtualization Serge E. Hallyn
2006-02-15 22:12 ` Eric W. Biederman
2006-02-16 14:29   ` Serge E. Hallyn
2006-02-16 16:37     ` Eric W. Biederman
2006-02-16 17:53       ` Serge E. Hallyn
2006-02-16 18:19         ` Eric W. Biederman
2006-02-16 18:44           ` Serge E. Hallyn
2006-02-16 18:52             ` Dave Hansen
2006-02-17 10:57               ` Eric W. Biederman
2006-02-17 11:44                 ` Herbert Poetzl
2006-02-17 12:16                   ` Eric W. Biederman
2006-02-17 12:44                     ` Herbert Poetzl
2006-02-17 13:15                       ` Eric W. Biederman
2006-02-17 13:39                       ` Hubertus Franke
2006-02-17 21:40                         ` Herbert Poetzl
2006-02-17 11:04             ` Eric W. Biederman
2006-02-20 10:06       ` Kirill Korotaev
2006-02-17  3:35     ` Hubertus Franke [this message]
2006-02-17 14:53       ` Serge E. Hallyn
2006-02-20  9:37     ` Kirill Korotaev
2006-02-20 12:47       ` Herbert Poetzl
2006-02-20 14:34         ` Kirill Korotaev
2006-02-20 15:27           ` Herbert Poetzl
2006-02-16 14:30   ` Herbert Poetzl
2006-02-16 15:37     ` Serge E. Hallyn
2006-02-16 17:13       ` Eric W. Biederman
2006-02-16 17:57         ` Serge E. Hallyn
2006-02-20  9:54       ` Kirill Korotaev
2006-02-20 18:19         ` Dave Hansen
2006-02-16 16:59     ` Eric W. Biederman
2006-02-16 17:41     ` Dave Hansen
2006-02-16 19:12       ` Herbert Poetzl
2006-02-16 19:38         ` Dave Hansen
2006-02-16 21:11           ` Sam Vilain
2006-02-20 10:10       ` Kirill Korotaev
2006-02-20  9:50     ` Kirill Korotaev
2006-02-20 13:00       ` Herbert Poetzl
2006-02-20 14:44         ` Kirill Korotaev
2006-02-20 15:36           ` Herbert Poetzl
2006-02-20  9:13   ` Kirill Korotaev
2006-02-20 18:07     ` Dave Hansen
2006-02-15 23:24 ` Sam Vilain
2006-02-16  5:50   ` Eric W. Biederman
2006-02-20  9:17   ` Kirill Korotaev
2006-02-20 20:01     ` Sam Vilain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43F5448A.6020501@watson.ibm.com \
    --to=frankeh@watson.ibm.com \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=arjan@infradead.org \
    --cc=benh@kernel.crashing.org \
    --cc=clg@fr.ibm.com \
    --cc=dev@openvz.org \
    --cc=dev@sw.ru \
    --cc=ebiederm@xmission.com \
    --cc=gkurz@fr.ibm.com \
    --cc=greg@kroah.com \
    --cc=haveblue@us.ibm.com \
    --cc=herbert@13thfloor.at \
    --cc=jes@sgi.com \
    --cc=jgarzik@pobox.com \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mrmacman_g4@mac.com \
    --cc=riel@redhat.com \
    --cc=saw@sawoct.com \
    --cc=serue@us.ibm.com \
    --cc=ssouhlal@FreeBSD.org \
    --cc=torvalds@osdl.org \
    --cc=trond.myklebust@fys.uio.no \
    --cc=vserver@list.linux-vserver.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox