From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Djalal Harouni <tixxdz@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Chris Mason <clm@fb.com>,
tytso@mit.edu, Serge Hallyn <serge.hallyn@canonical.com>,
Josh Triplett <josh@joshtriplett.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Andy Lutomirski <luto@kernel.org>,
Seth Forshee <seth.forshee@canonical.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org,
Dongsu Park <dongsu@endocode.com>,
David Herrmann <dh.herrmann@googlemail.com>,
Miklos Szeredi <mszeredi@redhat.com>,
Alban Crequy <alban.crequy@gmail.com>
Subject: Re: [RFC v2 PATCH 0/8] VFS:userns: support portable root filesystems
Date: Wed, 04 May 2016 17:06:19 -0400
Message-ID: <1462395979.14310.133.camel@HansenPartnership.com>
In-Reply-To: <1462372014-3786-1-git-send-email-tixxdz@gmail.com>
On Wed, 2016-05-04 at 16:26 +0200, Djalal Harouni wrote:
> This is version 2 of the VFS:userns support portable root filesystems
> RFC. Changes since version 1:
>
> * Update documentation and remove some ambiguity about the feature.
> Based on Josh Triplett comments.
> * Use a new email address to send the RFC :-)
>
>
> This RFC tries to explore how to support filesystem operations inside
> user namespace using only VFS and a per mount namespace solution.
> This allows to take advantage of user namespace separations without
> introducing any change at the filesystems level. All this is handled
> with the virtual view of mount namespaces.
>
>
> 1) Presentation:
> ================
>
> The main aim is to support portable root filesystems and allow
> containers, virtual machines and other cases to use the same root
> filesystem. Due to security reasons, filesystems can't be mounted
> inside user namespaces, and mounting them outside will not solve the
> problem since they will show up with the wrong UIDs/GIDs. Read and
> write operations will also fail and so on.
>
> The current userspace solution is to automatically chown the whole
> root filesystem before starting a container, example:
> (host) init_user_ns 1000000:1065536 => (container) user_ns_X1 0:65535
> (host) init_user_ns 2000000:2065536 => (container) user_ns_Y1 0:65535
> (host) init_user_ns 3000000:3065536 => (container) user_ns_Z1 0:65535
> ...
>
> Every time a chown is called, files are changed and so on... This
> prevents to have portable filesystems where you can throw anywhere
> and boot. Having an extra step to adapt the filesystem to the current
> mapping and persist it will not allow to verify its integrity, it
> makes snapshots and migration a bit harder, and probably other
> limitations...
>
> It seems that there are multiple ways to allow user namespaces
> combine nicely with filesystems, but none of them is that easy. The
> bind mount and pin the user namespace during mount time will not
> work, bind mounts share the same super block, hence you may end up
> working on the wrong vfsmount context and there is no easy way to get
> out of that...
So this option was discussed at the recent LSF/MM summit. The most
supported suggestion was that you'd use a new internal fs type that had
a struct mount with a new superblock and would copy the underlying
inodes but substitute its own with modified ->getattr/->setattr calls
that did the uid shift. In many ways it would be a remapping bind
which would look similar to overlayfs but be a lot simpler.
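
To make the uid shift concrete, here is a minimal userspace sketch of the
arithmetic such ->getattr/->setattr hooks would need. It is not kernel
code and not taken from any patch in this thread: struct id_shift,
shift_to_view() and shift_to_disk() are hypothetical names, and a single
contiguous range is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical mapping (assumption, not from the patch set): on-disk IDs
 * [disk_base, disk_base + count) appear through the remapping mount as
 * [view_base, view_base + count).
 */
struct id_shift {
    uint32_t disk_base;
    uint32_t view_base;
    uint32_t count;
};

/*
 * getattr direction: translate an on-disk ID into the ID reported to the
 * caller. Returns false when the ID lies outside the mapped range.
 */
static bool shift_to_view(const struct id_shift *s, uint32_t disk_id,
                          uint32_t *view_id)
{
    if (disk_id < s->disk_base || disk_id - s->disk_base >= s->count)
        return false;
    *view_id = s->view_base + (disk_id - s->disk_base);
    return true;
}

/*
 * setattr direction: translate an ID supplied by the caller back into
 * the ID that would be stored on disk. Returns false when unmapped.
 */
static bool shift_to_disk(const struct id_shift *s, uint32_t view_id,
                          uint32_t *disk_id)
{
    if (view_id < s->view_base || view_id - s->view_base >= s->count)
        return false;
    *disk_id = s->disk_base + (view_id - s->view_base);
    return true;
}
```

With disk_base = 0, view_base = 1000000 and count = 65536 (the first
mapping in the quoted example), an on-disk uid 0 is reported as 1000000
and a chown to 1000000 stores uid 0, so the image itself never has to be
rewritten. What to do with IDs outside the range is a policy decision the
sketch leaves to the caller.
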
> Using the user namespace in the super block seems the way to go, and
> there is the "Support fuse mounts in user namespaces" [1] patches
> which seem nice but perhaps too complex!?
So I don't think that does what you want. The fuse project I've used
before to do uid/gid shifts for build containers is bindfs:
https://github.com/mpartel/bindfs/
It allows a --map argument where you specify pairs of uids/gids to map
(tedious for large ranges, but the map can be fixed to use uid:range
instead of individual pairs).
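
As an illustration of why a range-based map scales better than individual
pairs, the lookup could be sketched as below. This is not bindfs code and
does not reflect its option syntax: struct id_range and map_id() are
hypothetical, and passing unmapped IDs through unchanged is an assumption
of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* One hypothetical map entry: IDs [from, from + count) become
 * [to, to + count). */
struct id_range {
    uint32_t from;
    uint32_t to;
    uint32_t count;
};

/*
 * Look up id in a table of n ranges and return the mapped value.
 * IDs not covered by any range are returned unchanged (an assumption of
 * this sketch; a real implementation might reject them instead).
 */
static uint32_t map_id(const struct id_range *map, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].from && id - map[i].from < map[i].count)
            return map[i].to + (id - map[i].from);
    }
    return id;
}
```

A single entry { 1000000, 0, 65536 } then covers what would otherwise be
65536 separate pairs.
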
> there is also the overlayfs solution, and finally the VFS layer
> solution.
>
> We present here a simple VFS solution, everything is packed inside
> VFS, filesystems don't need to know anything (except probably XFS,
> and special operations inside union filesystems). Currently it
> supports ext4, btrfs and overlayfs. Changes into filesystems are
> small, just parse the vfs_shift_uids and vfs_shift_gids options
> during mount and set the appropriate flags into the super_block
> structure.
So this looks a little daunting. It sprays the VFS with knowledge
about the shifts and requires support from every underlying filesystem.
A simple remapping bind filesystem would be a lot simpler and require
no underlying filesystem support.
James