From: Herbert Poetzl <herbert@13thfloor.at>
To: Hubertus Franke <frankeh@watson.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Dave Hansen <haveblue@us.ibm.com>,
	"Serge E. Hallyn" <serue@us.ibm.com>, Kirill Korotaev <dev@sw.ru>,
	linux-kernel@vger.kernel.org, vserver@list.linux-vserver.org,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Arjan van de Ven <arjan@infradead.org>,
	Suleiman Souhlal <ssouhlal@FreeBSD.org>,
	Cedric Le Goater <clg@fr.ibm.com>,
	Kyle Moffett <mrmacman_g4@mac.com>, Greg <gkurz@fr.ibm.com>,
	Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>,
	Greg KH <greg@kroah.com>, Rik van Riel <riel@redhat.com>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Andrey Savochkin <saw@sawoct.com>,
	Kirill Korotaev <dev@openvz.org>, Andi Kleen <ak@suse.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Jeff Garzik <jgarzik@pobox.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Jes Sorensen <jes@sgi.com>
Subject: Re: (pspace,pid) vs true pid virtualization
Date: Fri, 17 Feb 2006 22:40:27 +0100
Message-ID: <20060217214027.GA30682@MAIL.13thfloor.at>
In-Reply-To: <43F5D227.8020105@watson.ibm.com>

On Fri, Feb 17, 2006 at 08:39:51AM -0500, Hubertus Franke wrote:
> Herbert Poetzl wrote:
> >On Fri, Feb 17, 2006 at 05:16:06AM -0700, Eric W. Biederman wrote:
> >
> >>Herbert Poetzl <herbert@13thfloor.at> writes:
> >>
> >>
> >>>On Fri, Feb 17, 2006 at 03:57:26AM -0700, Eric W. Biederman wrote:
> >>>
> >>>>As for that.  When I made that suggestion to Herbert Poetzl
> >>>>his only concern was that a smart init might be too heavy weight 
> >>>>for lightweight vserver.  Generally I like the idea.
> >>>
> >>>well, may I remind that this solution would require _two_
> >>>init processes for each guest, which could easily make up
> >>>300-400 unnecessary processes in a lightweight server
> >>>setup?
> >>
> >>I take it seriously enough that I remembered the concern,
> >>and I think it is legitimate.  Figuring out how to safely
> >>set the policy is a challenge.  That is something a
> >>user space daemon trivially gets right.  
> >>
> >>The kernel side of a process is about 10K; if the user space
> >>side were also lightweight, we could have the entire
> >>per-process cost in the 30K range.  30K*400 = 12000K = 12M.
> >
> >
> >that's something I'm not so worried about, but a statically
> >compiled userspace process with 20K sounds unusual in the
> >time of 2M *libcs :)
> >
> >
> >>That is significant but we are still cheap enough that it
> >>isn't necessarily a show stopper.
> >>
> >>I think the cost was only one extra process, for the case where you
> >>have fakeinit now it would be init, for other cases it would be a
> >>daemon that gets setup when you initialize the vserver.
> >
> 
> Eric, Herbert.. why do we need an extra process in each and every
> pspace?
> 
> Why not have a single global pspace-init daemon that acts as the reaper
> for all pspace-top processes? It's only at the boundaries of pspaces
> and with signals where we seem to have trouble.

that would probably work, but I think it adds some
complications and might require certain design changes

just to give some ideas:

 - how to reach the guest space if there is no 'handle'?
 - how to handle hierarchical contexts?

> The "pspace-init" reaps the signal of all its sub-pspace's top
> processes and then "forwards" the signal to processes actually
> waiting. Kind of an interposer. Same way from the other side.
>
> You allocate a pid on behalf of the process you spawn in your
> pidspace. You mark in the pid hash of the lookup that this is merely
> a proxy and you forward that to the pspace-init where you have a
> separate lookup with <pspace-caller,pspace,pid>.
> 
> Same with signals: once the signal is reaped by pspace-init and it has
> looked up the parent pspace and the pid in there, we forward it..

yup, could work ...
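
to make the quoted proposal a bit more concrete, here is a rough
userspace sketch of the bookkeeping it implies: a proxy pid marked
in the caller's pid hash, plus a separate lookup keyed by
<pspace-caller,pspace,pid> that a single global pspace-init uses to
forward an exit back across the boundary. all names (Pspace,
PspaceInit, spawn, forward_exit) are hypothetical and only model the
idea; this is not kernel code and not an existing interface.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Pspace:
    """One pid space with its own pid counter and pid hash (hypothetical model)."""
    name: str
    next_pid: count = field(default_factory=lambda: count(1))
    # pid -> "local" for ordinary tasks,
    #        ("proxy", child_pspace_name, child_pid) for proxy entries
    pid_hash: dict = field(default_factory=dict)

    def alloc_pid(self):
        return next(self.next_pid)


class PspaceInit:
    """Assumed single global reaper that sits between pspaces."""

    def __init__(self):
        # (caller pspace, child pspace, pid in child) -> proxy pid in caller
        self.lookup = {}
        # (child pspace, pid in child) -> caller pspace
        self.parent_of = {}

    def spawn(self, caller, child):
        """Create a top-level task in `child` on behalf of `caller`."""
        real_pid = child.alloc_pid()
        child.pid_hash[real_pid] = "local"
        # The caller only ever sees a proxy pid from its own pid space.
        proxy_pid = caller.alloc_pid()
        caller.pid_hash[proxy_pid] = ("proxy", child.name, real_pid)
        self.lookup[(caller.name, child.name, real_pid)] = proxy_pid
        self.parent_of[(child.name, real_pid)] = caller.name
        return proxy_pid, real_pid

    def forward_exit(self, child, real_pid):
        """Reap a top-level task and report which (pspace, pid) to notify."""
        caller_name = self.parent_of.pop((child.name, real_pid))
        proxy_pid = self.lookup.pop((caller_name, child.name, real_pid))
        del child.pid_hash[real_pid]
        return caller_name, proxy_pid


host, guest = Pspace("host"), Pspace("guest")
pspace_init = PspaceInit()
proxy_pid, real_pid = pspace_init.spawn(host, guest)
notify = pspace_init.forward_exit(guest, real_pid)
```

the sketch leaves both questions above open: it assumes the caller
already holds a reference to the guest pspace, and it tracks only one
level of parent/child, so hierarchical contexts would need the
forwarding step to be repeated per level.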

best,
Herbert

> Is something like that workable, idiotic (be kind), too intrusive ?
> 
> -- Hubertus
> 
> 
> >
> >well, depends, currently we do not need a parent to handle
> >the guest, so there is _no_ waiting process in the light-
> >weight case either, which makes that two processes for each
> >guest, no?
> >
> >anyway, I'm not strictly against having an init process
> >inside a guest, as long as it is not an essential part
> >of the overall design, because that would make it much
> >harder to rip it out later :)
> >
> >best,
> >Herbert
> >
> >
