From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S266854AbUHZAQs (ORCPT ); Wed, 25 Aug 2004 20:16:48 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S266877AbUHZAQs (ORCPT ); Wed, 25 Aug 2004 20:16:48 -0400 Received: from brmea-mail-3.Sun.COM ([192.18.98.34]:6800 "EHLO brmea-mail-3.sun.com") by vger.kernel.org with ESMTP id S266854AbUHZAQk (ORCPT ); Wed, 25 Aug 2004 20:16:40 -0400 Date: Wed, 25 Aug 2004 20:16:18 -0400 From: Mike Waychison Subject: Re: Using fs views to isolate untrusted processes: I need an assistant architect in the USA for Phase I of a DARPA funded linux kernel project In-reply-to: <30958D95-F6ED-11D8-A7C9-000393ACC76E@mac.com> To: Kyle Moffett Cc: Tim Hockin , LKML , Rik van Riel , ReiserFS List , Hans Reiser Message-id: <412D2BD2.2090408@sun.com> MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7bit X-Accept-Language: en-us, en User-Agent: Mozilla Thunderbird 0.5 (X11/20040208) X-Enigmail-Version: 0.83.3.0 X-Enigmail-Supports: pgp-inline, pgp-mime References: <410D96DC.1060405@namesys.com> <20040825205618.GA7992@hockin.org> <30958D95-F6ED-11D8-A7C9-000393ACC76E@mac.com> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Kyle Moffett wrote: > On Aug 25, 2004, at 16:56, Tim Hockin wrote: > >> On Wed, Aug 25, 2004 at 04:25:24PM -0400, Rik van Riel wrote: >> >>>> You can think of this as chroot on steroids. >>> >>> >>> Sounds like what you want is pretty much the namespace stuff >>> that has been in the kernel since the early 2.4 days. >>> >>> No need to replicate VFS functionality inside the filesystem. >> >> >> When I was at Sun, we talked a lot about this. Mike, does Sun have any >> iterest in this? >> >> We found a lot of shortcomings in implementing various namespace-ish >> things. > > > Here's a simple way to do what you want in userspace: > 1) Apply the kernel bind mount options fix (*) > 2) Run the following shell script > > cat <<'EOF' >fsviews.bash > #! /bin/bash > # First make the subdirectories > mkdir /fsviews_orig > mount -t tmpfs tmpfs /fsviews_rw > mkdir /fsviews_orig/dir1 > mkdir /fsviews_orig/dir2 > mkdir /fsviews_orig/old > > # Now make it read-only with a copy in /fsviews > mkdir /fsviews > mount --bind /fsviews_orig /fsviews > > # Put directories in /fsviews > mount --bind /somewhere/dir1 /fsviews/dir1 > mount --bind -o ro /otherplace/dir2 /fsviews/dir2 > > # Start the process in a new namespace > clone_prog bash <<'BACK_TO_OLD_NAMESPACE' > > mount -o ro,remount /fsviews_orig > pivot_root /fsviews /fsviews/old > umount -l /fsviews/old > /dir1/myscript & > > BACK_TO_OLD_NAMESPACE > > # Remove the extra dirs in this namespace > umount -l /fsviews > umount -l /fsviews_orig > rmdir /fsviews > rmdir /fsviews_orig > > EOF > > This assumes that clone_prog is a short C program that does a clone() > syscall > with the CLONE_NEWNS flag and executes a new process. > > Once this is done, "/dir2/script" is running in a _completely_ new > namespace > with a read-only root directory and two directories from other parts of > the vfs. > > (*) IIRC currently bind-mount rw/ro options are those of the underlying > mount, > the bind-mount options fix provides a separate set of options for each > bound > copy. There is only one minimal security implication without said > patch, that > root can still 'mount -o rw,remount /' to get root writeable again, but > since it's > on tmpfs, that doesn't matter much. You could also just take away some > capabilities, but otherwise except for the shared process tables this > acts very > much like a completely new, separate computer. I've used this to > thoroughly > secure minimally trusted daemons before. :-D > > Cheers, > Kyle Moffett This provides minimal protection if any: the user may remount any block devices on any given tree in his 'namespace' (in the sense of "that is what we call a mount-table in Linux"). * If I understand what Hans is looking to get done, he's asking for someone to architect a system where any given process can be restricted to seeing/accessing a subset of the namespace (in the sense of "a tree of directories/files"). Eg: process Foo is allowed access to write to /etc/group, but _not_ allowed access to /etc/shadow, under any circumstances && Foo will be run as root. Hell, maybe Foo is never able to even _see_ /etc/shadow (making it a true shadow file :). Hans, correct me if I misunderstood. [*] Somebody really should s/struct namespace/struct mounttable/g (or even mounttree) on the kernel sources. 'Namespace' isn't very descriptive and it leads to confusion :( - -- Mike Waychison Sun Microsystems, Inc. 1 (650) 352-5299 voice 1 (416) 202-8336 voice http://www.sun.com ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTICE: The opinions expressed in this email are held by me, and may not represent the views of Sun Microsystems, Inc. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQFBLSvRdQs4kOxk3/MRAnopAJ91xpTEqf1I/jaRdqbjbgfnNuPpugCfbkvz VeJUBr2UuagZ5UGMGC1nebw= =XuQT -----END PGP SIGNATURE-----