Re: [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems

All of lore.kernel.org
 help / color / mirror / Atom feed

From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Djalal Harouni <tixxdz@gmail.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Chris Mason <clm@fb.com>,
	tytso@mit.edu, Serge Hallyn <serge.hallyn@canonical.com>,
	Josh Triplett <josh@joshtriplett.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@kernel.org>,
	Seth Forshee <seth.forshee@canonical.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	Dongsu Park <dongsu@endocode.com>,
	David Herrmann <dh.herrmann@googlemail.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Alban Crequy <alban.crequy@gmail.com>
Subject: Re: [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems
Date: Wed, 04 May 2016 17:06:19 -0400	[thread overview]
Message-ID: <1462395979.14310.133.camel@HansenPartnership.com> (raw)
In-Reply-To: <1462372014-3786-1-git-send-email-tixxdz@gmail.com>

On Wed, 2016-05-04 at 16:26 +0200, Djalal Harouni wrote:
> This is version 2 of the VFS:userns support portable root filesystems
> RFC. Changes since version 1:
> 
> * Update documentation and remove some ambiguity about the feature.
>   Based on Josh Triplett comments.
> * Use a new email address to send the RFC :-)
> 
> 
> This RFC tries to explore how to support filesystem operations inside
> user namespace using only VFS and a per mount namespace solution.
> This
> allows to take advantage of user namespace separations without
> introducing any change at the filesystems level. All this is handled
> with the virtual view of mount namespaces.
> 
> 
> 1) Presentation:
> ================
> 
> The main aim is to support portable root filesystems and allow 
> containers, virtual machines and other cases to use the same root 
> filesystem. Due to security reasons, filesystems can't be mounted 
> inside user namespaces, and mounting them outside will not solve the 
> problem since they will show up with the wrong UIDs/GIDs. Read and 
> write operations will also fail and so on.
> 
> The current userspace solution is to automatically chown the whole 
> root filesystem before starting a container, example:
> (host) init_user_ns  1000000:1065536  => (container) user_ns_X1
> 0:65535
> (host) init_user_ns  2000000:2065536  => (container) user_ns_Y1
> 0:65535
> (host) init_user_ns  3000000:3065536  => (container) user_ns_Z1
> 0:65535
> ...
> 
> Every time a chown is called, files are changed and so on... This
> prevents to have portable filesystems where you can throw anywhere
> and boot. Having an extra step to adapt the filesystem to the current
> mapping and persist it will not allow to verify its integrity, it 
> makes snapshots and migration a bit harder, and probably other
> limitations...
> 
> It seems that there are multiple ways to allow user namespaces 
> combine nicely with filesystems, but none of them is that easy. The 
> bind mount and pin the user namespace during mount time will not 
> work, bind mounts share the same super block, hence you may endup 
> working on the wrong vfsmount context and there is no easy way to get
> out of that...

So this option was discussed at the recent LSF/MM summit.  The most
supported suggestion was that you'd use a new internal fs type that had
a struct mount with a new superblock and would copy the underlying
inodes but substitute it's own with modified ->getatrr/->setattr calls
that did the uid shift.  In many ways it would be a remapping bind
which would look similar to overlayfs but be a lot simpler.

> Using the user namespace in the super block seems the way to go, and
> there is the "Support fuse mounts in user namespaces" [1] patches 
> which seem nice but perhaps too complex!?

So I don't think that does what you want.  The fuse project I've used
before to do uid/gid shifts for build containers is bindfs

https://github.com/mpartel/bindfs/

It allows a --map argument where you specify pairs of uids/gids to map
(tedious for large ranges, but the map can be fixed to use uid:range
instead of individual).

>  there is also the overlayfs solution, and finaly the VFS layer 
> solution.
> 
> We present here a simple VFS solution, everything is packed inside 
> VFS, filesystems don't need to know anything (except probably XFS, 
> and special operations inside union filesystems). Currently it 
> supports ext4, btrfs and overlayfs. Changes into filesystems are 
> small, just parse the vfs_shift_uids and vfs_shift_gids options 
> during mount and set the appropriate flags into the super_block
> structure.

So this looks a little daunting.  It sprays the VFS with knowledge
about the shifts and requires support from every underlying filesystem.
 A simple remapping bind filesystem would be a lot simpler and require
no underlying filesystem support.

James

next prev parent reply	other threads:[~2016-05-04 21:06 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-04 14:26 [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 1/8] VFS: add CLONE_MNTNS_SHIFT_UIDGID flag to allow mounts to shift their UIDs/GIDs Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 2/8] VFS:uidshift: add flags and helpers to shift UIDs and GIDs to virtual view Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 3/8] fs: Treat foreign mounts as nosuid Djalal Harouni
2016-05-04 23:19   ` Serge Hallyn
2016-05-05 13:05     ` Seth Forshee
2016-05-05 22:40       ` Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 4/8] VFS:userns: shift UID/GID to virtual view during permission access Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 5/8] VFS:userns: add helpers to shift UIDs and GIDs into on-disk view Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 6/8] VFS:userns: shift UID/GID to on-disk view before any write to disk Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 7/8] ext4: add support for vfs_shift_uids and vfs_shift_gids mount options Djalal Harouni
2016-05-04 14:26 ` [RFC v2 PATCH 8/8] btrfs: " Djalal Harouni
2016-05-04 16:34 ` [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems Josh Triplett
2016-05-04 21:06 ` James Bottomley [this message]
2016-05-05  7:36   ` Djalal Harouni
2016-05-05 11:56     ` James Bottomley
2016-05-05 21:49       ` Djalal Harouni
2016-05-05 22:08         ` James Bottomley
2016-05-10 23:36           ` James Bottomley
2016-05-11  0:38             ` Al Viro
2016-05-11  0:53             ` Al Viro
2016-05-11  3:47               ` James Bottomley
2016-05-11 16:42             ` Djalal Harouni
2016-05-11 18:33               ` James Bottomley
2016-05-12 19:55                 ` Djalal Harouni
2016-05-12 22:24                   ` James Bottomley
2016-05-14  9:53                     ` Djalal Harouni
2016-05-14 13:46                       ` James Bottomley
2016-05-15  2:21                         ` Eric W. Biederman
2016-05-15 15:04                           ` James Bottomley
2016-05-16 14:12                           ` Seth Forshee
2016-05-16 16:42                             ` Eric W. Biederman
2016-05-16 18:25                               ` Seth Forshee
2016-05-16 19:13                           ` James Bottomley
2016-05-17 22:40                             ` Eric W. Biederman
2016-05-17 11:42                           ` Djalal Harouni
2016-05-17 15:42                         ` Djalal Harouni
2016-05-04 23:30 ` Serge Hallyn
2016-05-06 14:38   ` Djalal Harouni
2016-05-09 16:26     ` Serge Hallyn
2016-05-10 10:33       ` Djalal Harouni
2016-05-05  0:23 ` Dave Chinner
2016-05-05  1:44   ` Andy Lutomirski
2016-05-05  2:25     ` Dave Chinner
2016-05-05  3:29       ` Andy Lutomirski
2016-05-05 22:34     ` Djalal Harouni
2016-05-05 22:24   ` Djalal Harouni
2016-05-06  2:50     ` Dave Chinner
2016-05-12 19:47       ` Djalal Harouni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1462395979.14310.133.camel@HansenPartnership.com \
    --to=james.bottomley@hansenpartnership.com \
    --cc=alban.crequy@gmail.com \
    --cc=clm@fb.com \
    --cc=dh.herrmann@googlemail.com \
    --cc=dongsu@endocode.com \
    --cc=ebiederm@xmission.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mszeredi@redhat.com \
    --cc=serge.hallyn@canonical.com \
    --cc=seth.forshee@canonical.com \
    --cc=tixxdz@gmail.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.