From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Serge E. Hallyn"
Subject: Re: [PATCH 4/6] user namespaces: add user_ns to super block
Date: Tue, 29 Jul 2008 13:09:13 -0500
Message-ID: <20080729180913.GC365@us.ibm.com>
References: <20080726002700.GA29686@us.ibm.com>
 <20080726002754.GD29874@us.ibm.com>
 <1217285230.25300.19.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: "Eric W. Biederman"
Cc: Linux Containers
List-Id: containers.vger.kernel.org

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Matt Helsley writes:
>
> > Would this require passing the vfsmount to the filesystems themselves,
> > or would they be within the VFS code only?
>
> The interesting bit is the user_namespace contained in the vfsmount. We
> can pass that down. I think semantically it makes sense for a filesystem
> mount to only operate in a single mount namespace.
>
> > If not wholly within the VFS
> > I wonder if Al Viro would object to this. He's resisted past attempts to
> > pass the vfsmount structs into more filesystem code paths and I'm
> > guessing that could affect whether or not this approach can be
> > implemented.
>
> Dave Hansen raised that concern when we were talking about it earlier. Since
> we just care about a property of the mount it isn't a big deal.
>
> Actually thinking about this a little farther it may be simplest to have the
> mnt_namespace capture the user_namespace, although that doesn't seem to map
> semantically very well with cloning of the filesystem.

Interesting idea. I'm going to pursue that.

So at a do_new_mount(), mnt->user_ns = current->user_ns.
At do_loopback(), we ask the fs whether the new_mnt->user_ns can be set
to current->user_ns. If not, it keeps the original, meaning that
current will always receive user nobody access to the fs. Otherwise,
the fs is saying that it knows how to properly convert userids from
current->user->user_ns to ones which make sense in the
original_mnt->user_ns.

> This is very much a question of how do we map the uid/gids store in the filesystem
> into the uids/gids in the kernel. Which user namespace do they belong in.
>
> Especially in the case of read only mounts we can safely share a filesystem between
> user_namespaces with no changes to the filesystem. Which I suspect is the
> first case we want to allow as that is a tremendous savings in space if you have
> lots of instances of the same distro, and people have been doing it with /usr
> for years.
>
> Eric
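
For illustration only, the do_new_mount()/do_loopback() rule proposed in
the reply above could be modeled in user space roughly as follows. This
is a sketch, not kernel code: struct user_ns, struct fs_type, struct
mount, the can_map_user_ns hook and every sketch_* function are
hypothetical stand-ins invented for this example, and "treated as
nobody" is reduced to a simple pointer comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects named in the mail. */
struct user_ns { int id; };

struct fs_type {
    /* Assumed hook: returns true if the fs knows how to convert ids
     * from namespace 'cur' into ones valid in namespace 'orig'.
     * NULL means the fs has no such support. */
    bool (*can_map_user_ns)(const struct user_ns *orig,
                            const struct user_ns *cur);
};

struct mount {
    const struct fs_type *fs;
    const struct user_ns *user_ns;
};

/* Models do_new_mount(): mnt->user_ns = current->user_ns. */
static struct mount sketch_new_mount(const struct fs_type *fs,
                                     const struct user_ns *current_ns)
{
    assert(fs != NULL && current_ns != NULL);
    struct mount m = { fs, current_ns };
    return m;
}

/* Models do_loopback(): ask the fs whether the new mount may take
 * current's namespace; if not, keep the original mount's namespace. */
static struct mount sketch_loopback(const struct mount *orig,
                                    const struct user_ns *current_ns)
{
    assert(orig != NULL && current_ns != NULL);
    struct mount m = *orig;
    if (orig->fs->can_map_user_ns &&
        orig->fs->can_map_user_ns(orig->user_ns, current_ns))
        m.user_ns = current_ns;
    return m;
}

/* A task whose namespace differs from the mount's gets "user nobody"
 * access in this model. */
static bool sketch_treated_as_nobody(const struct mount *m,
                                     const struct user_ns *current_ns)
{
    return m->user_ns != current_ns;
}

/* Example hook for a filesystem that claims it can always convert ids. */
static bool sketch_always_yes(const struct user_ns *orig,
                              const struct user_ns *cur)
{
    (void)orig;
    (void)cur;
    return true;
}
```

In this model, a bind mount of a filesystem without the hook keeps the
original namespace, so a task from another namespace is treated as
nobody; a filesystem that supplies the hook lets the bind mount adopt
the caller's namespace.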