From: Dwight Engen <dwight.engen@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>,
Serge Hallyn <serge.hallyn@ubuntu.com>,
"Eric W. Biederman" <ebiederm@gmail.com>,
xfs@oss.sgi.com
Subject: Re: [PATCH v2 RFC] userns: Convert xfs to use kuid/kgid where appropriate
Date: Wed, 26 Jun 2013 17:30:17 -0400 [thread overview]
Message-ID: <20130626173017.5100327a@oracle.com> (raw)
In-Reply-To: <20130626020924.GD29376@dastard>
On Wed, 26 Jun 2013 12:09:24 +1000
Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Jun 24, 2013 at 09:10:35AM -0400, Dwight Engen wrote:
> > Hi Dave, all. Here is a v2 patch that I believe addresses the
> > previous comments, but I expect there to be more :) I think there
> > are a few more issues to sort out before this is ready, and I want
> > to add some tests to xfstests also.
> >
> > I added permission checks for eofblocks in the ioctl code, but
> > I don't think they are enough. Just because an unprivileged
> > caller is in a group doesn't mean he can write to a file of that
> > group, and I don't know how we can check for that till we get the
> > inode in hand. Brian, if you or anyone else could comment on how
> > this should work for the regular user and write() ENOSPC cases
> > that'd be great.
> >
> > The xfs code now uses inode->i_uid where possible instead of di_uid.
> > The remaining uses of di_uid are where the inode is being setup,
> > conversion to/from disk endianess, in dealing with quotas, and
> > bulkstat.
>
> Hi Dwight,
>
> I haven't looked at the code in detail, I'll just mention that I
> agree with Brain that we need to split this up into more patches.
> The conversions are subtle and easy to misunderstand, so small
> patches are good. I'd say at minimum separate patches are needed
> for:
>
> - permissions check on eof blocks
> - splitting eof blocks structure (I made comments on that to
> Brain's posted patch)
> - introduce XFS conversion helpers
> - implement VFS-XFS boundary conversion using helpers
> - XFS ioctl conversions (maybe multiple patches)
> - final patch to modify kconfig once everything is done
Sure, I'll split it up and integrate the comments from the other
(eof blocks) thread as well.
> > We do need to decide on the di_uid that comes back from bulkstat.
> > Right now it is returning on disk (== init_user_ns) uids. It looks
> > to me like xfsrestore is using the normal vfs routines (chown,
> > fchown, lchown) when restoring so that won't line up if the
> > xfsrestore is run in !init_user_ns.
>
> Right. This is a major problem because there's no guarantee that you
> are running restore in the same namespace as dump was run in. If you
> take the dump in the init_user_ns, then you can't restore it from
> inside the namespace container, and vice versa.
Yep. In thinking just a bit about this, assuming we did convert the
ids that came back from bulkstat, how is this much different than today
where you take a backup on one machine and try to restore it on
another? It seems like today the onus is on the user to ensure the uids
will align correctly. AFAICS tar --numeric-owner would have the same
issue, and it looks like man xfsrestore is getting at a similar
thing in the Quotas section.
> > We could possibly convert to userns values
> > before returning them from the kernel, but I doubt that will work
> > well with the xfs quotas.
>
> Well, quotas are already converted to/from userns specific values
> before they get to XFS. Including project quotas, which I think at
> this point is wrong. We have no test cases for it, though, so I
> can't validate that it actually works as it is supposed to and that
> we don't break it inadvertantly in the future.
>
> [ I'm still waiting on Eric to provide any information or scripts
> for how he tested this all worked when he did the conversions.... ]
>
> > Should we just require that callers of bulkstat
> > be in init_user_ns? Thoughts?
>
> This is one of the reasons why I want Eric to give us some idea of
> how this is supposed to work - exactly how is backup and restore
> supposed to be managed on a shared filesystem that is segmented up
> into multiple namespace containers? We can talk about the
> implementation all we like, but none of us have a clue to the policy
> decisions that users will make that we need to support. Until we
> have a clear idea on what policies we are supposed to be supporting,
> the implementation will be ambiguous and compromised.
>
> e.g. If users are responsible for it, then bulkstat needs to filter
> based on the current namespace. If management is responsible (i.e.
> init_user_ns does backup/restore of ns-specific subtrees), then
> bulkstat cannot filter and needs to reject calls from outside the
> init_user_ns().
Maybe we can have bulkstat always filter based on if the caller
kuid_has_mapping(current_user_ns(), inode->i_uid)? That way a caller
from init_user_ns can see them all, but callers from inside a userns
will get a subset of inodes returned?
This doesn't solve the save from one uesrns, restore from a different
one use case, not sure if that was the scenario you were getting at. To
allow for this use case I guess we could have an "id offset" argument
for xfsrestore that gets applied before chown() but that seems icky.
> But if we have to allow both configurations - which I think we have to
> because both cases are valid choices for a hosting provider to give
> users - then how are we supposed to implement/handle this?
>
> The same goes for the file handle interfaces - it's perfectly valid
> for a user to run a userspace NFS server (e.g. ganesha) inside a
> namespace container, but that will allow that process to provide
> unchecked remote access to the entire underlying filesystem, not
> just the namespace being exported. i.e. fs/export.c needs to be made
> namespace aware in some way so that there is a kernel wide policy
> for handle to file translations in namespace containers....
I haven't looked at the handle stuff, I'll take a look and get
familiarized.
> ---
>
> FWIW, one comment on the wrappers now that I've quickly browsed the
> code:
>
> > @@ -68,14 +68,15 @@ xfs_acl_from_disk(
> >
> > switch (acl_e->e_tag) {
> > case ACL_USER:
> > + acl_e->e_uid =
> > xfs_kuid_from_disk(be32_to_cpu(ace->ae_id));
> > + break;
> > case ACL_GROUP:
> > - acl_e->e_id = be32_to_cpu(ace->ae_id);
> > + acl_e->e_gid =
> > xfs_kgid_from_disk(be32_to_cpu(ace->ae_id));
>
> I know I suggested it, but I have to say, that does look a little
> weird. Normally the to/from disk routines do endian conversion
> internally, so perhaps the conversion routines coul dbe better
> named.
>
> These:
>
> acl_e->e_uid = xfs_uid_to_kuid(be32_to_cpu(ace->ae_id));
> acl_e->e_gid = xfs_gid_to_kgid(be32_to_cpu(ace->ae_id));
>
> I think read a whole lot better, and the endian conversion now makes
> sense as that's converting the on-disk value first...
>
> > @@ -101,7 +102,18 @@ xfs_acl_to_disk(struct xfs_acl *aclp, const
> > struct posix_acl *acl) acl_e = &acl->a_entries[i];
> >
> > ace->ae_tag = cpu_to_be32(acl_e->e_tag);
> > - ace->ae_id = cpu_to_be32(acl_e->e_id);
> > + switch(acl_e->e_tag) {
> > + case ACL_USER:
> > + ace->ae_id =
> > cpu_to_be32(xfs_kuid_to_disk(acl_e->e_uid));
> > + break;
> > + case ACL_GROUP:
> > + ace->ae_id =
> > cpu_to_be32(xfs_kgid_to_disk(acl_e->e_gid));
>
> and:
> ace->ae_id =
> cpu_to_be32(xfs_kuid_to_uid(acl_e->e_uid)); ace->ae_id =
> cpu_to_be32(xfs_kgid_to_gid(acl_e->e_uid));
>
> Sorry for asking you to redo this - sometimes an idea I have doesn't
> quite work out the first time :/
Heh, no problem, I agree these names are even better. Makes me wonder
if the return type of xfs_k[ug]id_to_[ug]id should be [ug]id_t instead
of __uint32_t? I'll use these new names in a split out v3 series to
follow. Thanks.
> Cheers,
>
> Dave.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-06-26 21:30 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-19 15:09 [PATCH] userns: Convert xfs to use kuid/kgid where appropriate Dwight Engen
2013-06-19 20:35 ` Eric W. Biederman
2013-06-20 1:41 ` Dave Chinner
2013-06-20 13:54 ` Dwight Engen
2013-06-20 21:10 ` Dave Chinner
2013-06-20 0:13 ` Dave Chinner
2013-06-20 13:54 ` Dwight Engen
2013-06-20 15:27 ` Brian Foster
2013-06-20 17:39 ` Dwight Engen
2013-06-20 19:12 ` Brian Foster
2013-06-20 22:12 ` Dave Chinner
2013-06-20 22:45 ` Eric W. Biederman
2013-06-20 23:35 ` Dave Chinner
2013-06-20 22:03 ` Dave Chinner
2013-06-21 15:14 ` Dwight Engen
2013-06-24 0:33 ` Dave Chinner
2013-06-24 13:10 ` [PATCH v2 RFC] " Dwight Engen
2013-06-25 16:46 ` Brian Foster
2013-06-25 20:08 ` Dwight Engen
2013-06-25 21:04 ` Brian Foster
2013-06-26 2:09 ` Dave Chinner
2013-06-26 21:30 ` Dwight Engen [this message]
2013-06-26 22:44 ` Dave Chinner
2013-06-27 13:02 ` Serge Hallyn
2013-06-28 1:54 ` Dave Chinner
2013-06-28 15:25 ` Serge Hallyn
2013-06-28 16:16 ` Dwight Engen
2013-06-27 20:57 ` Ben Myers
2013-06-28 1:46 ` Dave Chinner
2013-06-28 15:15 ` Serge Hallyn
2013-06-28 14:23 ` Dwight Engen
2013-06-28 15:11 ` [PATCH v3 0/6] " Dwight Engen
2013-06-28 15:11 ` [PATCH 1/6] create wrappers for converting kuid_t to/from uid_t Dwight Engen
2013-06-28 15:11 ` [PATCH 2/6] convert kuid_t to/from uid_t in ACLs Dwight Engen
2013-06-28 15:11 ` [PATCH 3/6] ioctl: check for capabilities in the current user namespace Dwight Engen
2013-06-28 15:11 ` [PATCH 4/6] convert kuid_t to/from uid_t for xfs internal structures Dwight Engen
2013-06-28 15:11 ` [PATCH 5/6] create internal eofblocks structure with kuid_t types Dwight Engen
2013-06-28 18:09 ` Brian Foster
2013-06-28 15:11 ` [PATCH 6/6] ioctl eofblocks: require non-privileged users to specify uid/gid match Dwight Engen
2013-06-28 18:50 ` Brian Foster
2013-06-28 20:28 ` Dwight Engen
2013-06-28 21:39 ` Brian Foster
2013-06-28 23:22 ` Dwight Engen
2013-07-01 12:21 ` Brian Foster
2013-07-06 4:44 ` [PATCH 1/1] export inode_capable Serge Hallyn
2013-07-08 13:09 ` [PATCH v2 RFC] userns: Convert xfs to use kuid/kgid where appropriate Serge Hallyn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130626173017.5100327a@oracle.com \
--to=dwight.engen@oracle.com \
--cc=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=ebiederm@gmail.com \
--cc=serge.hallyn@ubuntu.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox