From mboxrd@z Thu Jan  1 00:00:00 1970
From: Matt Helsley
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Date: Thu, 25 Feb 2010 17:09:15 -0800
Message-ID: <20100226010915.GA20106@count0.beaverton.ibm.com>
References: <1263568754.23480.142.camel@bigi>
	<1266875729.3673.12.camel@bigi>
	<1266931623.3973.643.camel@bigi>
	<1266934817.3973.654.camel@bigi>
	<1266966581.3973.675.camel@bigi>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: hadi@cyberus.ca, Daniel Lezcano, Patrick McHardy,
	Linux Netdev List, containers@lists.linux-foundation.org,
	Netfilter Development Mailinglist, Ben Greear, Serge Hallyn,
	Matt Helsley
To: "Eric W. Biederman"
Return-path:
Received: from e6.ny.us.ibm.com ([32.97.182.146]:43614 "EHLO
	e6.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934832Ab0BZBJU (ORCPT );
	Thu, 25 Feb 2010 20:09:20 -0500
Content-Disposition: inline
In-Reply-To:
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

On Thu, Feb 25, 2010 at 12:57:02PM -0800, Eric W. Biederman wrote:
>
> Introduce two new system calls:
> int nsfd(pid_t pid, unsigned long nstype);
> int setns(unsigned long nstype, int fd);
>
> These two new system calls address three specific problems that can
> make namespaces hard to work with.
> - Namespaces require a dedicated process to pin them in memory.
> - It is not possible to use a namespace unless you are the
>   child of the original creator.
> - Namespaces don't have names that userspace can use to talk
>   about them.
>
> The nsfd() system call returns a file descriptor that can
> be used to talk about a specific namespace, and to keep
> the specified namespace alive.
>
> The fd returned by nsfd() can be bind mounted as:
> mount --bind /proc/self/fd/N /some/filesystem/path
> to keep the namespace alive indefinitely as long as
> it is mounted.
>
> open works on the fd returned by nsfd() so another
> process can get a hold of it and do interesting things.
>
> Overall that allows for persistent naming of namespaces
> according to userspace policy.
>
> setns() allows changing the namespace of the current process
> to a namespace that originates with nsfd().
>
> Signed-off-by: Eric W. Biederman
> ---
>
> This is just my first pass at this, and not yet compiled tested.
> I was pleasantly surprised at how easy all of this was to implement.

> +SYSCALL_DEFINE2(setns, unsigned long, nstype, int, fd)
> +{
> +	struct file *file;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return -EPERM;

Is this check preliminary? In the future would we check against the owner
of the target namespace too? Naturally that will require tagging each
namespace with an owner but I thought that was already part of the plan...

Cheers,
	-Matt Helsley
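
To make the question above concrete, here is a rough userspace model of
what "checking against the owner too" might look like. It is only a sketch
of one possible reading (capability check plus an owner match), not code
from the patch and not a proposal for the final semantics. The types
struct ns_model and struct task_model, the owner field, and the helper
setns_may_enter() are all hypothetical names made up for illustration;
cap_sys_admin merely stands in for capable(CAP_SYS_ADMIN).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical userspace model; none of these types exist in the kernel. */
struct ns_model {
	uid_t owner;		/* assumed per-namespace owner tag */
};

struct task_model {
	uid_t euid;		/* effective uid of the calling task */
	bool cap_sys_admin;	/* stands in for capable(CAP_SYS_ADMIN) */
};

/* Returns 0 if the task may enter the namespace, -EPERM otherwise. */
static int setns_may_enter(const struct task_model *task,
			   const struct ns_model *ns)
{
	assert(task && ns);

	/* The check present in the posted patch: a global capability test. */
	if (!task->cap_sys_admin)
		return -EPERM;

	/* The additional check asked about (assumed reading): the caller
	 * must also match the owner recorded on the target namespace. */
	if (task->euid != ns->owner)
		return -EPERM;

	return 0;
}
```

Under this reading a privileged task that does not own the namespace is
refused. Whether the owner check should instead relax the capability
requirement is exactly the kind of policy question still open here.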