From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Waychison Subject: Re: [RFC][PATCH] rbind across namespaces Date: Tue, 24 May 2005 10:44:39 -0700 Message-ID: <42936807.2000807@google.com> References: <1116627099.4397.43.camel@localhost> <1116660380.4397.66.camel@localhost> <20050521134615.GB4274@mail.shareable.org> <429277CA.9050300@google.com> <4292D416.5070001@waychison.com> <42935FCB.1010809@google.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: jamie@shareable.org, linuxram@us.ibm.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, akpm@osdl.org, viro@parcelfarce.linux.theplanet.co.uk Return-path: Received: from 216-239-45-4.google.com ([216.239.45.4]:20281 "EHLO 216-239-45-4.google.com") by vger.kernel.org with ESMTP id S261165AbVEXRpk (ORCPT ); Tue, 24 May 2005 13:45:40 -0400 To: Miklos Szeredi In-Reply-To: Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Miklos Szeredi wrote: >>>>In what sense? readlink of /proc/PID/fd/* will provide a pathname >>>>relative to current's root: useless for any paths not in current's >>>>namespace. >>> >>> >>>Not readlink, but actual dereference of link will give you exactly the >>>vfsmount/dentry the file was opened on. If you want to bind/move >>>whatever on that mount, that's possible, even if it's a "detached >>>tree". >> >>Removing proc_check_root essentially removes any protection that >>namespaces provided in the first place. >> >>Think of it like virtual memory. You can fork off and create your own >>COW instance, and no one can see what you are doing unless they ptrace >>you or explicitly ask you. The mountfd model allows for explicit >>handing off of vfsmounts between namespaces without allowing arbitrary >>access. > > > Note: I didn't say we should remove proc_check_root(). I said _if_ > you remove it, _then_ mounting on foreign namespace will work, which > has actually been confirmed experimentally. > OK. > So your follow_link argument doesn't hold. The proc code does the > follow_link in some clever way, that the looked up object will end up > with the same vfsmount/dentry pair as the file. > Yes. I hadn't realized that the only safe-guard in place was proc_check_root. I had made the assumption that follow_link was using vfs_follow_link with the readlink info. > >>Yet even this doesn't allow userspace to define it's own policy for >>inter-namespace manipulation. > > > OK. Let's keep the kernel policy simple: just allow a process to > access it's _own_ file descriptors in proc. I.e. allow access to > /proc/self/fd/* even if it comes from a foreign namespace. > > Then the policy (who passes whom the file descriptors) is entirely up > to userspace (just as with your scheme). > > /proc/self/fd/FD doesn't give any extra rights to the process, it just > makes mounting from/to detached mounts, and foreign namespaces > possible without new interfaces. So you'd say 'mount /dev/foo /proc/self/fd/4' if 4 was an fd pointing to a directory in another namespace? That does require proc_check_root be removed. :\ > > >>Beware that due to the detached-subtrees bit, the locking became a bit >>ugly, requiring a global rw_lock for mntget/mntput. I still haven't >>figured out a better way to keep per-vfsmounts counts and per-subtree >>counts in sync. > > > Did you measure the effect on performace? Maybe it isn't so bad. As far as I could tell without doing high-smp benchmarks, it doesn't slow anything down. In this case, we aren't on the fastest path anyway, as we are usually a) walking a path across a mountpoint, requiring the global vfsmount_lock anyway. b) closing a file, which isn't fast either. You could micro-optimize the pivoting of from one vfsmount to another during a walk to only grab the lock once, but there may be no profound effect and any gains would be lost in the noise. Mike Waychison