linux-fsdevel.vger.kernel.org archive mirror
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Djalal Harouni <tixxdz@gmail.com>, Chris Mason <clm@fb.com>,
	Theodore Tso <tytso@mit.edu>,
	Josh Triplett <josh@joshtriplett.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Seth Forshee <seth.forshee@canonical.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	Dongsu Park <dongsu@endocode.com>,
	David Herrmann <dh.herrmann@googlemail.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Alban Crequy <alban.crequy@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Phil Estes <estesp@gmail.com>
Subject: Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount
Date: Mon, 06 Feb 2017 06:41:22 -0800
Message-ID: <1486392082.2474.27.camel@HansenPartnership.com>
In-Reply-To: <CAOQ4uxhQE6y5pRont7ejobU+fzQSiTQaQub8gZT=-K7UAZEbkA@mail.gmail.com>

On Mon, 2017-02-06 at 08:59 +0200, Amir Goldstein wrote:
> On Mon, Feb 6, 2017 at 3:18 AM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Sun, 2017-02-05 at 09:51 +0200, Amir Goldstein wrote:
> > > On Sat, Feb 4, 2017 at 9:19 PM, James Bottomley
> > > <James.Bottomley@hansenpartnership.com> wrote:
> > > > This allows any subtree to be uid/gid shifted and bound 
> > > > elsewhere.  It does this by operating similarly to overlayfs. 
> > > >  Its primary use is for shifting the underlying uids of 
> > > > filesystems used to support unprivileged (uid shifted) 
> > > > containers.  The usual use case here is that the container is 
> > > > operating with a uid shifted unprivileged root but sometimes 
> > > > needs to make use of or work with a filesystem image that has
> > > > root at real uid 0.
> > > > 
> > > > The mechanism is to allow any subordinate mount namespace to 
> > > > mount a shiftfs filesystem (by marking it FS_USERNS_MOUNT) but 
> > > > only allowing it to mount marked subtrees (using the -o mark 
> > > > option as root).   Once mounted, the subtree is mapped via the 
> > > > super block user namespace so that the interior ids of the 
> > > > mounting user namespace are the ids written to the filesystem.
> > > > 
> > > > Signed-off-by: James Bottomley <
> > > > James.Bottomley@HansenPartnership.com>
> > > > 
> > > 
> > > James,
> > > 
> > > Allow me to point out some problems in this patch and offer a
> > > slightly different approach.
> > > 
> > > First of all, the subject says "uid/gid shifting bind mount", but
> > > it's not really a bind mount. What it is is a stackable mount and 
> > > 2 levels of stack no less.
> > 
> > The reason for the description is to have it behave exactly like a 
> > bind mount.  You can assert that a bind mount is, in fact, a 
> > stacked mount, but we don't currently.  I'm also not sure where you 
> > get your 2 levels from?
> > 
> 
> A bind mount does not incur recursion into VFS code, a stacked fs 
> does. And there is a programmable limit of stack depth of 2, which 
> stacked fs need to comply with. Your proposed setup has 2 stacked fs, 
> the mark shiftfs by admin and the uid shiftfs by container user. Or
> maybe I misunderstood.

Oh, right.  Actually, it wouldn't be 2, because once the unprivileged
mount uses the marked filesystem, what it uses is the mnt and dentry
from the underlying filesystem (what you would have got from a path
lookup on it), not those of the marking shiftfs mount.

That said, it does perform recursive calls into the underlying
filesystem, unlike a true bind mount, so I can add the s_stack_depth
accounting easily enough.
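
The depth accounting could look roughly like the following.  This is a
minimal userspace sketch, not code from the patch: struct model_sb and
model_set_stack_depth() are hypothetical names, and MAX_STACK_DEPTH is
assumed to mirror the kernel's FILESYSTEM_MAX_STACK_DEPTH of 2.  It only
shows the "lower depth plus one, refuse if over the limit" rule that
stacked filesystems follow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed to mirror the kernel's FILESYSTEM_MAX_STACK_DEPTH. */
#define MAX_STACK_DEPTH 2

/* Hypothetical stand-in for the one super_block field used here. */
struct model_sb {
	int s_stack_depth;
};

/*
 * Give a new stacked superblock a depth one greater than the
 * filesystem it sits on.  Returns 0 on success, or -EINVAL (leaving
 * sb untouched) if the resulting stack would exceed the limit.
 */
int model_set_stack_depth(struct model_sb *sb, const struct model_sb *lower)
{
	int depth;

	assert(sb != NULL && lower != NULL);

	depth = lower->s_stack_depth + 1;
	if (depth > MAX_STACK_DEPTH)
		return -EINVAL;

	sb->s_stack_depth = depth;
	return 0;
}
```

Because the unprivileged mount stacks on the underlying filesystem
rather than on the marking mount, "lower" in this model would be the
real filesystem's superblock, so a shiftfs over a plain filesystem
would end up at depth 1.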

> > >  So one thing that is missing is increasing of sb->s_stack_depth 
> > > and that also means that shiftfs cannot be used to recursively 
> > > shift uids in child userns if that was ever the intention.
> > 
> > I can't think of a use case that would ever need that, but perhaps
> > other container people can.
> > 
> > > The other problem is that by forking overlayfs functionality,
> > 
> > So this wouldn't really be the right way to look at it: shiftfs 
> > shares no code with overlayfs at all, so is definitely not a fork. 
> >  The only piece of functionality it has which is similar to 
> > overlayfs is the way it does lookups via a new dentry cache. 
> >  However, that functionality is not unique to overlayfs and if you 
> > look, you'll see that shiftfs_lookup() actually has far more in 
> > common with ecryptfs_lookup().
> 
> That's a good point. All stackable file systems may share similar 
> problems and solutions (e.g. consistent st_ino/st_dev). Perhaps it 
> calls for shared library code or more generic VFS code. At the moment 
> ecryptfs is not seeing much development, so everything happens in 
> overlayfs. If there is going to be more than 1 actively developed
> stackable fs, we need to see about that.

I believe we already do ... if you look at the lookup functions of each
of them, you see the only common thing is encapsulated in a variant of
the lookup_one_len() functions.  After that, even simple things like
our negative dentry handling differs.

> > >  shiftfs is going to miss out on overlayfs bug fixes related to 
> > > user credentials differing from mounter credentials, like fd3220d 
> > > ("ovl: update S_ISGID when setting posix ACLs"). I am not sure 
> > > that this specific case is relevant to shiftfs, but there could
> > > be others.
> > 
> > OK, so shiftfs doesn't have this bug and the reason why is
> > illustrative: basically shiftfs does three things
> > 
> >    1. lookups via a uid/gid shifted dentry cache
> >    2. shifted credential inode operations permission checks on the
> >       underlying filesystem
> >    3. location marking for unprivileged mount
> > 
> > I think we've already seen that 1. isn't from overlayfs but the
> > functionality could be added to overlayfs, I suppose.  The big 
> > problem is 2.  The overlayfs code emulates the permission checks, 
> > which makes it rather complex (this is where you get your bugs like 
> > the above from).  I did actually look at adding 2. to overlayfs on 
> > the theory that a single layer overlay might be closest to what 
> > this is, but eventually concluded I'd have to take the special 
> > cases and add a whole lot more to them ... it really would increase 
> > the maintenance burden substantially and make the code an
> > unreadable rat's nest.
> > 
> 
> The use cases for uid shifting are still overwhelming for me.
> I take your word for it that it's going to be a maintenance burden
> to add this functionality to overlayfs.
> 
> > When you think about it this way, it becomes obvious that the clean
> > separation is if shiftfs functionality is layered on top of 
> > overlayfs and when you do that, doing it as its own filesystem is 
> > more logical.
> > 
> 
> Yes, I agree with that statement. This is in line with the solution I 
> outlined at the end of my previous email, where single layer 
> overlayfs is used for the host "mark" mount, although I wonder if the 
> same cannot be achieved with a bind mount?

I understand, but once I can't consume overlayfs to construct it, the
idea of trying to use it becomes a negative not a positive.

We could achieve the same thing using bind mounts, if the vfsmount
structure carried a private field, but it doesn't.  I think given the
prevalence of this structure throughout the mount tree, that's a
deliberate decision to keep it thin.

> in host:
> mount -t overlay -o noexec,upper=<origin> container_visible <mark
> location>
> 
> in container:
> mount -t shiftfs -o <mark location> <somewhere in my local mount ns>

So I'm not sure it's a more widespread problem: mount --bind is usable
inside an unprivileged container, which means you can bridge filesystem
subtrees even as only a local container admin.  The problem is mounting
other filesystem types.  Marking a type safe for mounting is done by
the FS_USERNS_MOUNT flag.  For things like shiftfs, that means you do
have to restrict the source location.  For most filesystem types,
though, the source will be a device, so they will need some check other
than a mount mark.
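
To make the two levels of checking concrete, here is a loose userspace
model of it.  It is a sketch under stated assumptions rather than the
patch's code: every model_* name is hypothetical, the flag value is
illustrative, and "host admin" stands for a caller privileged in the
initial user namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit standing in for the kernel's FS_USERNS_MOUNT. */
#define MODEL_FS_USERNS_MOUNT 0x8

/* Hypothetical stand-in for struct file_system_type. */
struct model_fs_type {
	int fs_flags;
};

/* Hypothetical description of a mount request. */
struct model_mount_req {
	bool caller_is_host_admin; /* privileged in the initial userns */
	bool source_is_marked;     /* subtree was marked by the host admin */
};

/* Generic gate: may this caller mount this filesystem type at all? */
bool model_type_allowed(const struct model_fs_type *type,
			const struct model_mount_req *req)
{
	assert(type != NULL && req != NULL);
	return req->caller_is_host_admin ||
	       (type->fs_flags & MODEL_FS_USERNS_MOUNT) != 0;
}

/* shiftfs-style extra gate: unprivileged callers need a marked source. */
bool model_shiftfs_mount_allowed(const struct model_fs_type *type,
				 const struct model_mount_req *req)
{
	if (!model_type_allowed(type, req))
		return false;
	return req->caller_is_host_admin || req->source_is_marked;
}
```

In this model a container admin can mount a flagged type only from a
marked subtree, while a type without the flag stays off limits to the
container regardless of the source.  A device-backed type would need a
different second gate in place of the mark test.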

James

