From: Dwight Engen <dwight.engen@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>,
Serge Hallyn <serge.hallyn@ubuntu.com>,
"Eric W. Biederman" <ebiederm@gmail.com>,
xfs@oss.sgi.com
Subject: Re: [PATCH v2 RFC] userns: Convert xfs to use kuid/kgid where appropriate
Date: Fri, 28 Jun 2013 10:23:22 -0400 [thread overview]
Message-ID: <20130628102322.74eef70f@oracle.com> (raw)
In-Reply-To: <20130626020924.GD29376@dastard>
On Wed, 26 Jun 2013 12:09:24 +1000
Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Jun 24, 2013 at 09:10:35AM -0400, Dwight Engen wrote:
> > Hi Dave, all. Here is a v2 patch that I believe addresses the
> > previous comments, but I expect there to be more :) I think there
> > are a few more issues to sort out before this is ready, and I want
> > to add some tests to xfstests also.
> >
> > I added permission checks for eofblocks in the ioctl code, but
> > I don't think they are enough. Just because an unprivileged
> > caller is in a group doesn't mean he can write to a file of that
> > group, and I don't know how we can check for that till we get the
> > inode in hand. Brian, if you or anyone else could comment on how
> > this should work for the regular user and write() ENOSPC cases
> > that'd be great.
> >
> > The xfs code now uses inode->i_uid where possible instead of di_uid.
> > The remaining uses of di_uid are where the inode is being setup,
> > conversion to/from disk endianess, in dealing with quotas, and
> > bulkstat.
>
> Hi Dwight,
>
> I haven't looked at the code in detail, I'll just mention that I
> agree with Brain that we need to split this up into more patches.
> The conversions are subtle and easy to misunderstand, so small
> patches are good. I'd say at minimum separate patches are needed
> for:
>
> - permissions check on eof blocks
> - splitting eof blocks structure (I made comments on that to
> Brain's posted patch)
> - introduce XFS conversion helpers
> - implement VFS-XFS boundary conversion using helpers
> - XFS ioctl conversions (maybe multiple patches)
> - final patch to modify kconfig once everything is done
>
> > We do need to decide on the di_uid that comes back from bulkstat.
> > Right now it is returning on disk (== init_user_ns) uids. It looks
> > to me like xfsrestore is using the normal vfs routines (chown,
> > fchown, lchown) when restoring so that won't line up if the
> > xfsrestore is run in !init_user_ns.
>
> Right. This is a major problem because there's no guarantee that you
> are running restore in the same namespace as dump was run in. If you
> take the dump in the init_user_ns, then you can't restore it from
> inside the namespace container, and vice versa.
>
> > We could possibly convert to userns values
> > before returning them from the kernel, but I doubt that will work
> > well with the xfs quotas.
>
> Well, quotas are already converted to/from userns specific values
> before they get to XFS. Including project quotas, which I think at
> this point is wrong. We have no test cases for it, though, so I
> can't validate that it actually works as it is supposed to and that
> we don't break it inadvertantly in the future.
>
> [ I'm still waiting on Eric to provide any information or scripts
> for how he tested this all worked when he did the conversions.... ]
>
> > Should we just require that callers of bulkstat
> > be in init_user_ns? Thoughts?
>
> This is one of the reasons why I want Eric to give us some idea of
> how this is supposed to work - exactly how is backup and restore
> supposed to be managed on a shared filesystem that is segmented up
> into multiple namespace containers? We can talk about the
> implementation all we like, but none of us have a clue to the policy
> decisions that users will make that we need to support. Until we
> have a clear idea on what policies we are supposed to be supporting,
> the implementation will be ambiguous and compromised.
>
> e.g. If users are responsible for it, then bulkstat needs to filter
> based on the current namespace. If management is responsible (i.e.
> init_user_ns does backup/restore of ns-specific subtrees), then
> bulkstat cannot filter and needs to reject calls from outside the
> init_user_ns().
Hi, in thinking about this a bit more, I'm not sure why bulkstat should
be any different than stat(2). If you stat a file not covered by your
current_user_ns(), it doesn't "filter" and return ENOENT, it just
returns it as overflowuid (65534). Filtering bulkstat also has other
(performance) problems as you pointed out, and we cannot easily filter
XFS_IOC_FSINUMBERS anyway.
I don't think the user namespace is intended to subpartition the
set of inodes available from a filesystem, but only to remap the id
values. It then seems like the only policy that can reliably be
supported today is to backup/restore from init_user_ns, or if you want
to backup/restore in a sub user ns, you must know ahead of time that you
will not encounter ids outside your mapping in the fs/backup media.
> But if we have to allow both configurations - which I think we have to
> because both cases are valid choices for a hosting provider to give
> users - then how are we supposed to implement/handle this?
>
> The same goes for the file handle interfaces - it's perfectly valid
> for a user to run a userspace NFS server (e.g. ganesha) inside a
> namespace container, but that will allow that process to provide
> unchecked remote access to the entire underlying filesystem, not
> just the namespace being exported. i.e. fs/export.c needs to be made
> namespace aware in some way so that there is a kernel wide policy
> for handle to file translations in namespace containers....
So I looked at open_by_handle_at(2) and it looks to me like it is no
different than regular open(2) wrt the user namespace checks. The code
paths become common at path_openat(). Note that there is no check for
kuid_has_mapping(current_user_ns(), inode->i_uid) so an inode doesn't
have to be covered by the current user ns in order to get it opened,
the caller just has to pass the inode_permission() checks. I guess I
don't see the "unchecked" part, could you elaborate on that?
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-06-28 14:23 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-19 15:09 [PATCH] userns: Convert xfs to use kuid/kgid where appropriate Dwight Engen
2013-06-19 20:35 ` Eric W. Biederman
2013-06-20 1:41 ` Dave Chinner
2013-06-20 13:54 ` Dwight Engen
2013-06-20 21:10 ` Dave Chinner
2013-06-20 0:13 ` Dave Chinner
2013-06-20 13:54 ` Dwight Engen
2013-06-20 15:27 ` Brian Foster
2013-06-20 17:39 ` Dwight Engen
2013-06-20 19:12 ` Brian Foster
2013-06-20 22:12 ` Dave Chinner
2013-06-20 22:45 ` Eric W. Biederman
2013-06-20 23:35 ` Dave Chinner
2013-06-20 22:03 ` Dave Chinner
2013-06-21 15:14 ` Dwight Engen
2013-06-24 0:33 ` Dave Chinner
2013-06-24 13:10 ` [PATCH v2 RFC] " Dwight Engen
2013-06-25 16:46 ` Brian Foster
2013-06-25 20:08 ` Dwight Engen
2013-06-25 21:04 ` Brian Foster
2013-06-26 2:09 ` Dave Chinner
2013-06-26 21:30 ` Dwight Engen
2013-06-26 22:44 ` Dave Chinner
2013-06-27 13:02 ` Serge Hallyn
2013-06-28 1:54 ` Dave Chinner
2013-06-28 15:25 ` Serge Hallyn
2013-06-28 16:16 ` Dwight Engen
2013-06-27 20:57 ` Ben Myers
2013-06-28 1:46 ` Dave Chinner
2013-06-28 15:15 ` Serge Hallyn
2013-06-28 14:23 ` Dwight Engen [this message]
2013-06-28 15:11 ` [PATCH v3 0/6] " Dwight Engen
2013-06-28 15:11 ` [PATCH 1/6] create wrappers for converting kuid_t to/from uid_t Dwight Engen
2013-06-28 15:11 ` [PATCH 2/6] convert kuid_t to/from uid_t in ACLs Dwight Engen
2013-06-28 15:11 ` [PATCH 3/6] ioctl: check for capabilities in the current user namespace Dwight Engen
2013-06-28 15:11 ` [PATCH 4/6] convert kuid_t to/from uid_t for xfs internal structures Dwight Engen
2013-06-28 15:11 ` [PATCH 5/6] create internal eofblocks structure with kuid_t types Dwight Engen
2013-06-28 18:09 ` Brian Foster
2013-06-28 15:11 ` [PATCH 6/6] ioctl eofblocks: require non-privileged users to specify uid/gid match Dwight Engen
2013-06-28 18:50 ` Brian Foster
2013-06-28 20:28 ` Dwight Engen
2013-06-28 21:39 ` Brian Foster
2013-06-28 23:22 ` Dwight Engen
2013-07-01 12:21 ` Brian Foster
2013-07-06 4:44 ` [PATCH 1/1] export inode_capable Serge Hallyn
2013-07-08 13:09 ` [PATCH v2 RFC] userns: Convert xfs to use kuid/kgid where appropriate Serge Hallyn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130628102322.74eef70f@oracle.com \
--to=dwight.engen@oracle.com \
--cc=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=ebiederm@gmail.com \
--cc=serge.hallyn@ubuntu.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox