From mboxrd@z Thu Jan 1 00:00:00 1970
From: Oren Laadan
Subject: Re: [PATCH] [RFC] c/r: Add UTS support
Date: Fri, 20 Mar 2009 16:53:42 -0400
Message-ID: <49C40256.4030408@cs.columbia.edu>
References: <49C0B069.6060300@cs.columbia.edu>
 <20090318134932.GC22636@us.ibm.com>
 <878wn353mf.fsf@caffeine.danplanet.com>
 <49C1175F.9060600@free.fr>
 <49C1506C.1080500@google.com>
 <49C195CF.1080506@google.com>
 <49C1A52D.4000503@google.com>
 <20090320172616.GA7203@us.ibm.com>
 <49C3F3C0.30100@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <49C3F3C0.30100-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: Mike Waychison
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Dan Smith, Nathan Lynch, "Eric W. Biederman"
List-Id: containers.vger.kernel.org

Mike Waychison wrote:
> Serge E. Hallyn wrote:
>> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>>> Ok. I see what you are trying to accomplish with this and honestly I
>>> think it is silly.
>>>
>>> We should start the threads we need in the kernel, and if we need to
>>> run clone_pid fine. I am not comfortable exporting clone_with_pid to
>>> user space.
>> Even if we create the task tree in userspace, I don't see why we
>> can't have the parent of each nested pid_ns pass CLONE_NEWPID to
>> clone_with_pid() instead of doing clone first and then unsharing
>> the pidns?
>>
>> As for clone_with_pid(), I don't particularly like the semantics,
>> but as was discussed over IRC, we could have clone_with_pid()
>> return -EINVAL unless it is called from a task inside a restarting
>> container. (and -EPERM if setting a pid in a pid_ns which was not
>> created as part of the container) Eric, do you dislike that any
>> less?
>
> Wouldn't this mean the kernel would have to track which namespaces are
> part of a restart and which aren't? Seems a little kludgy to me.

We are talking only about pidns, right? So we need to keep track of a
single pidns: the top-level one for the restarting container, i.e. the
pidns of the container init. We can, e.g., mark it with a flag.

We can then walk up the pidns hierarchy until we reach the marked one,
or the topmost, and grant (or deny) permission.

Oren.
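
To make the flag-and-walk idea concrete, here is a rough userspace
model of the check, not kernel code. Everything in it is an assumption
for illustration: struct pidns_model, the restart_root flag, and both
helper names are hypothetical and do not correspond to existing kernel
symbols. It combines the walk described above with the -EINVAL/-EPERM
split Serge proposed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid namespace (NOT struct pid_namespace). */
struct pidns_model {
	struct pidns_model *parent;	/* NULL for the topmost namespace */
	bool restart_root;		/* assumed flag: top-level pidns of a
					 * restarting container */
};

/*
 * Walk up from ns until a namespace marked restart_root is found.
 * Returns that namespace, or NULL if the topmost one is passed
 * without meeting a marked namespace.
 */
static const struct pidns_model *
find_restart_root(const struct pidns_model *ns)
{
	for (; ns; ns = ns->parent)
		if (ns->restart_root)
			return ns;
	return NULL;
}

/*
 * Hypothetical permission check for a clone_with_pid()-style call:
 *   -EINVAL if the caller is not inside a restarting container,
 *   -EPERM  if the target pidns is not under that container's marked pidns,
 *   0       otherwise.
 */
static int clone_with_pid_check(const struct pidns_model *caller,
				const struct pidns_model *target)
{
	const struct pidns_model *root;

	assert(caller && target);

	root = find_restart_root(caller);
	if (!root)
		return -EINVAL;
	if (find_restart_root(target) != root)
		return -EPERM;
	return 0;
}
```

Only one flag per restarting container is needed in this model; the
cost of a check is bounded by the nesting depth of the pidns
hierarchy.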