* [PATCH v4 00/21] Support fuse mounts in user namespaces @ 2016-04-26 19:36 Seth Forshee 2016-04-26 19:36 ` [PATCH v4 01/21] fs: fix a posible leak of allocated superblock Seth Forshee ` (11 more replies) 0 siblings, 12 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, linux-bcache-u79uwXL29TY76Z2rM5mHXA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-raid-u79uwXL29TY76Z2rM5mHXA, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-security-module-u79uwXL29TY76Z2rM5mHXA, selinux-+05T5uksL2qpZYMLLGbcSA Cc: Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Seth Forshee Hi Eric, Here's another update to my patches for mouning with fuse from unpivileged user namespaces. The main change here is a fix for a build failure when fuse is built as a module. As usual the series is also available at: git://git.kernel.org/pub/scm/linux/kernel/git/sforshee/linux.git fuse-userns Changes since v3: * Export current_in_userns() to fix an error when fuse is built as a module. * Add comment explaining the conditions for allowing CAP_CHOWN in s_user_ns to change the owner or group of an inode. * Added acks from Serge. Thanks, Seth --- Andy Lutomirski (1): fs: Treat foreign mounts as nosuid Pavel Tikhomirov (1): fs: fix a posible leak of allocated superblock Seth Forshee (19): fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() fs: Allow sysfs and cgroupfs to share super blocks between user namespaces block_dev: Support checking inode permissions in lookup_bdev() block_dev: Check permissions towards block device inode when mounting selinux: Add support for unprivileged mounts from user namespaces userns: Replace in_userns with current_in_userns Smack: Handle labels consistently in untrusted mounts fs: Check for invalid i_uid in may_follow_link() cred: Reject inodes with invalid ids in set_create_file_as() fs: Refuse uid/gid changes which don't map into s_user_ns fs: Update posix_acl support to handle user namespace mounts fs: Allow superblock owner to change ownership of inodes with unmappable ids fs: Don't remove suid for CAP_FSETID in s_user_ns fs: Allow superblock owner to access do_remount_sb() capabilities: Allow privileged user in s_user_ns to set security.* xattrs fuse: Add support for pid namespaces fuse: Support fuse filesystems outside of init_user_ns fuse: Restrict allow_other to the superblock's namespace or a descendant fuse: Allow user namespace mounts drivers/md/bcache/super.c | 2 +- drivers/md/dm-table.c | 2 +- drivers/mtd/mtdsuper.c | 2 +- fs/attr.c | 73 ++++++++++++++++++++++++++++++++++++----- fs/block_dev.c | 18 ++++++++-- fs/exec.c | 2 +- fs/fuse/cuse.c | 3 +- fs/fuse/dev.c | 26 +++++++++++---- fs/fuse/dir.c | 16 ++++----- fs/fuse/file.c | 22 ++++++++++--- fs/fuse/fuse_i.h | 10 +++++- fs/fuse/inode.c | 40 ++++++++++++++-------- fs/inode.c | 3 +- fs/kernfs/inode.c | 2 ++ fs/namei.c | 2 +- fs/namespace.c | 20 ++++++++--- fs/posix_acl.c | 67 +++++++++++++++++++++++-------------- fs/proc/base.c | 2 ++ fs/proc/generic.c | 3 ++ fs/proc/proc_sysctl.c | 2 ++ fs/quota/quota.c | 2 +- fs/super.c | 7 +++- fs/sysfs/mount.c | 3 +- fs/xattr.c | 19 ++++++++--- include/linux/fs.h | 3 +- include/linux/mount.h | 1 + include/linux/posix_acl_xattr.h | 17 +++++++--- include/linux/uidgid.h | 10 ++++++ include/linux/user_namespace.h | 6 ++-- kernel/cgroup.c | 4 +-- kernel/cred.c | 2 ++ kernel/user_namespace.c | 7 ++-- security/commoncap.c | 22 +++++++++---- security/selinux/hooks.c | 25 +++++++++++++- security/smack/smack_lsm.c | 29 ++++++++++------ 35 files changed, 355 insertions(+), 119 deletions(-) ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 01/21] fs: fix a posible leak of allocated superblock 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 04/21] block_dev: Support checking inode permissions in lookup_bdev() Seth Forshee ` (10 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups, Seth Forshee From: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> We probably need to fix superblock leak in patch (v4 "fs: Add user namesapace member to struct super_block"): Imagine posible code path in sget_userns: we iterate through type->fs_supers and do not find suitable sb, we drop sb_lock to allocate s and go to retry. After we dropped sb_lock some other task from different userns takes sb_lock, it is already in retry stage and has s allocated, so it puts its s in type->fs_supers list. So in retry we will find these sb in list and check it has a different userns, and finally we will return without freeing s. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> --- fs/super.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/super.c b/fs/super.c index 829841e0ae7e..092a7828442e 100644 --- a/fs/super.c +++ b/fs/super.c @@ -474,6 +474,10 @@ retry: continue; if (user_ns != old->s_user_ns) { spin_unlock(&sb_lock); + if (s) { + up_write(&s->s_umount); + destroy_super(s); + } return ERR_PTR(-EBUSY); } if (!grab_super(old)) -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 04/21] block_dev: Support checking inode permissions in lookup_bdev() 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee 2016-04-26 19:36 ` [PATCH v4 01/21] fs: fix a posible leak of allocated superblock Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 05/21] block_dev: Check permissions towards block device inode when mounting Seth Forshee ` (9 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Kent Overstreet, Alasdair Kergon, Mike Snitzer, dm-devel, Shaohua Li, David Woodhouse, Brian Norris, Alexander Viro, Jan Kara, Jeff Layton, J. Bruce Fields Cc: linux-bcache, Serge Hallyn, Seth Forshee, Miklos Szeredi, Richard Weinberger, linux-security-module, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, selinux, linux-fsdevel, cgroups, Pavel Tikhomirov When looking up a block device by path no permission check is done to verify that the user has access to the block device inode at the specified path. In some cases it may be necessary to check permissions towards the inode, such as allowing unprivileged users to mount block devices in user namespaces. Add an argument to lookup_bdev() to optionally perform this permission check. A value of 0 skips the permission check and behaves the same as before. A non-zero value specifies the mask of access rights required towards the inode at the specified path. The check is always skipped if the user has CAP_SYS_ADMIN. All callers of lookup_bdev() currently pass a mask of 0, so this patch results in no functional change. Subsequent patches will add permission checks where appropriate. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- drivers/md/bcache/super.c | 2 +- drivers/md/dm-table.c | 2 +- drivers/mtd/mtdsuper.c | 2 +- fs/block_dev.c | 13 ++++++++++--- fs/quota/quota.c | 2 +- include/linux/fs.h | 2 +- 6 files changed, 15 insertions(+), 8 deletions(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index a296425a7270..e169739a0253 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1950,7 +1950,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr, sb); if (IS_ERR(bdev)) { if (bdev == ERR_PTR(-EBUSY)) { - bdev = lookup_bdev(strim(path)); + bdev = lookup_bdev(strim(path), 0); mutex_lock(&bch_register_lock); if (!IS_ERR(bdev) && bch_is_open(bdev)) err = "device already registered"; diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index f9e8f0bef332..13f568d527b5 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -372,7 +372,7 @@ dev_t dm_get_dev_t(const char *path) dev_t uninitialized_var(dev); struct block_device *bdev; - bdev = lookup_bdev(path); + bdev = lookup_bdev(path, 0); if (IS_ERR(bdev)) dev = name_to_dev_t(path); else { diff --git a/drivers/mtd/mtdsuper.c b/drivers/mtd/mtdsuper.c index 20c02a3b7417..b5b60e1af31c 100644 --- a/drivers/mtd/mtdsuper.c +++ b/drivers/mtd/mtdsuper.c @@ -176,7 +176,7 @@ struct dentry *mount_mtd(struct file_system_type *fs_type, int flags, /* try the old way - the hack where we allowed users to mount * /dev/mtdblock$(n) but didn't actually _use_ the blockdev */ - bdev = lookup_bdev(dev_name); + bdev = lookup_bdev(dev_name, 0); if (IS_ERR(bdev)) { ret = PTR_ERR(bdev); pr_debug("MTDSB: lookup_bdev() returned %d\n", ret); diff --git a/fs/block_dev.c b/fs/block_dev.c index 3e84d62d0a25..e9b937845bdb 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -1431,7 +1431,7 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode, struct block_device *bdev; int err; - bdev = lookup_bdev(path); + bdev = lookup_bdev(path, 0); if (IS_ERR(bdev)) return bdev; @@ -1821,12 +1821,14 @@ EXPORT_SYMBOL(ioctl_by_bdev); /** * lookup_bdev - lookup a struct block_device by name * @pathname: special file representing the block device + * @mask: rights to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC) * * Get a reference to the blockdevice at @pathname in the current * namespace if possible and return it. Return ERR_PTR(error) - * otherwise. + * otherwise. If @mask is non-zero, check for access rights to the + * inode at @pathname. */ -struct block_device *lookup_bdev(const char *pathname) +struct block_device *lookup_bdev(const char *pathname, int mask) { struct block_device *bdev; struct inode *inode; @@ -1841,6 +1843,11 @@ struct block_device *lookup_bdev(const char *pathname) return ERR_PTR(error); inode = d_backing_inode(path.dentry); + if (mask != 0 && !capable(CAP_SYS_ADMIN)) { + error = __inode_permission(inode, mask); + if (error) + goto fail; + } error = -ENOTBLK; if (!S_ISBLK(inode->i_mode)) goto fail; diff --git a/fs/quota/quota.c b/fs/quota/quota.c index 0f10ee9892ce..59223384b22e 100644 --- a/fs/quota/quota.c +++ b/fs/quota/quota.c @@ -799,7 +799,7 @@ static struct super_block *quotactl_block(const char __user *special, int cmd) if (IS_ERR(tmp)) return ERR_CAST(tmp); - bdev = lookup_bdev(tmp->name); + bdev = lookup_bdev(tmp->name, 0); putname(tmp); if (IS_ERR(bdev)) return ERR_CAST(bdev); diff --git a/include/linux/fs.h b/include/linux/fs.h index 66a639ec1bc4..173b8adc6131 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2438,7 +2438,7 @@ static inline void unregister_chrdev(unsigned int major, const char *name) #define BLKDEV_MAJOR_HASH_SIZE 255 extern const char *__bdevname(dev_t, char *buffer); extern const char *bdevname(struct block_device *bdev, char *buffer); -extern struct block_device *lookup_bdev(const char *); +extern struct block_device *lookup_bdev(const char *, int mask); extern void blkdev_show(struct seq_file *,off_t); #else -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 05/21] block_dev: Check permissions towards block device inode when mounting 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee 2016-04-26 19:36 ` [PATCH v4 01/21] fs: fix a posible leak of allocated superblock Seth Forshee 2016-04-26 19:36 ` [PATCH v4 04/21] block_dev: Support checking inode permissions in lookup_bdev() Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 06/21] fs: Treat foreign mounts as nosuid Seth Forshee ` (8 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups, Seth Forshee Unprivileged users should not be able to mount block devices when they lack sufficient privileges towards the block device inode. Update blkdev_get_by_path() to validate that the user has the required access to the inode at the specified path. The check will be skipped for CAP_SYS_ADMIN, so privileged mounts will continue working as before. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- fs/block_dev.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index e9b937845bdb..2007040afb7b 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -1429,9 +1429,14 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode, void *holder) { struct block_device *bdev; + int perm = 0; int err; - bdev = lookup_bdev(path, 0); + if (mode & FMODE_READ) + perm |= MAY_READ; + if (mode & FMODE_WRITE) + perm |= MAY_WRITE; + bdev = lookup_bdev(path, perm); if (IS_ERR(bdev)) return bdev; -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 06/21] fs: Treat foreign mounts as nosuid 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (2 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 05/21] block_dev: Check permissions towards block device inode when mounting Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 08/21] userns: Replace in_userns with current_in_userns Seth Forshee ` (7 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro, Serge Hallyn, James Morris, Serge E. Hallyn, Paul Moore, Stephen Smalley, Eric Paris Cc: Miklos Szeredi, Seth Forshee, dm-devel, linux-security-module, Richard Weinberger, linux-bcache, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, selinux, linux-fsdevel, cgroups, Andy Lutomirski, Pavel Tikhomirov From: Andy Lutomirski <luto@amacapital.net> If a process gets access to a mount from a different user namespace, that process should not be able to take advantage of setuid files or selinux entrypoints from that filesystem. Prevent this by treating mounts from other mount namespaces and those not owned by current_user_ns() or an ancestor as nosuid. This will make it safer to allow more complex filesystems to be mounted in non-root user namespaces. This does not remove the need for MNT_LOCK_NOSUID. The setuid, setgid, and file capability bits can no longer be abused if code in a user namespace were to clear nosuid on an untrusted filesystem, but this patch, by itself, is insufficient to protect the system from abuse of files that, when execed, would increase MAC privilege. As a more concrete explanation, any task that can manipulate a vfsmount associated with a given user namespace already has capabilities in that namespace and all of its descendents. If they can cause a malicious setuid, setgid, or file-caps executable to appear in that mount, then that executable will only allow them to elevate privileges in exactly the set of namespaces in which they are already privileges. On the other hand, if they can cause a malicious executable to appear with a dangerous MAC label, running it could change the caller's security context in a way that should not have been possible, even inside the namespace in which the task is confined. As a hardening measure, this would have made CVE-2014-5207 much more difficult to exploit. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- fs/exec.c | 2 +- fs/namespace.c | 13 +++++++++++++ include/linux/mount.h | 1 + security/commoncap.c | 8 +++++++- security/selinux/hooks.c | 2 +- 5 files changed, 23 insertions(+), 3 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index c4010b8207a1..706088dd0cb1 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1391,7 +1391,7 @@ static void bprm_fill_uid(struct linux_binprm *bprm) bprm->cred->euid = current_euid(); bprm->cred->egid = current_egid(); - if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) + if (!mnt_may_suid(bprm->file->f_path.mnt)) return; if (task_no_new_privs(current)) diff --git a/fs/namespace.c b/fs/namespace.c index c133318bec35..6e9db4c166b4 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3284,6 +3284,19 @@ found: return visible; } +bool mnt_may_suid(struct vfsmount *mnt) +{ + /* + * Foreign mounts (accessed via fchdir or through /proc + * symlinks) are always treated as if they are nosuid. This + * prevents namespaces from trusting potentially unsafe + * suid/sgid bits, file caps, or security labels that originate + * in other namespaces. + */ + return !(mnt->mnt_flags & MNT_NOSUID) && check_mnt(real_mount(mnt)) && + in_userns(current_user_ns(), mnt->mnt_sb->s_user_ns); +} + static struct ns_common *mntns_get(struct task_struct *task) { struct ns_common *ns = NULL; diff --git a/include/linux/mount.h b/include/linux/mount.h index f822c3c11377..54a594d49733 100644 --- a/include/linux/mount.h +++ b/include/linux/mount.h @@ -81,6 +81,7 @@ extern void mntput(struct vfsmount *mnt); extern struct vfsmount *mntget(struct vfsmount *mnt); extern struct vfsmount *mnt_clone_internal(struct path *path); extern int __mnt_is_readonly(struct vfsmount *mnt); +extern bool mnt_may_suid(struct vfsmount *mnt); struct path; extern struct vfsmount *clone_private_mount(struct path *path); diff --git a/security/commoncap.c b/security/commoncap.c index 8912ef117faa..a306c5d90709 100644 --- a/security/commoncap.c +++ b/security/commoncap.c @@ -453,8 +453,14 @@ static int get_file_caps(struct linux_binprm *bprm, bool *effective, bool *has_c if (!file_caps_enabled) return 0; - if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) + if (!mnt_may_suid(bprm->file->f_path.mnt)) return 0; + + /* + * This check is redundant with mnt_may_suid() but is kept to make + * explicit that capability bits are limited to s_user_ns and its + * descendants. + */ if (!in_userns(current_user_ns(), bprm->file->f_path.mnt->mnt_sb->s_user_ns)) return 0; diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 912deee3f01e..1350167635cb 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2234,7 +2234,7 @@ static int check_nnp_nosuid(const struct linux_binprm *bprm, const struct task_security_struct *new_tsec) { int nnp = (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS); - int nosuid = (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID); + int nosuid = !mnt_may_suid(bprm->file->f_path.mnt); int rc; if (!nnp && !nosuid) -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 08/21] userns: Replace in_userns with current_in_userns 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (3 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 06/21] fs: Treat foreign mounts as nosuid Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 09/21] Smack: Handle labels consistently in untrusted mounts Seth Forshee ` (6 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro, Serge Hallyn, James Morris, Serge E. Hallyn Cc: Miklos Szeredi, Seth Forshee, dm-devel, linux-security-module, Richard Weinberger, linux-bcache, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, selinux, linux-fsdevel, cgroups, Pavel Tikhomirov All current callers of in_userns pass current_user_ns as the first argument. Simplify by replacing in_userns with current_in_userns which checks whether current_user_ns is in the namespace supplied as an argument. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- fs/namespace.c | 2 +- include/linux/user_namespace.h | 6 ++---- kernel/user_namespace.c | 6 +++--- security/commoncap.c | 2 +- 4 files changed, 7 insertions(+), 9 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 6e9db4c166b4..0ad8e4a4f50b 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3294,7 +3294,7 @@ bool mnt_may_suid(struct vfsmount *mnt) * in other namespaces. */ return !(mnt->mnt_flags & MNT_NOSUID) && check_mnt(real_mount(mnt)) && - in_userns(current_user_ns(), mnt->mnt_sb->s_user_ns); + current_in_userns(mnt->mnt_sb->s_user_ns); } static struct ns_common *mntns_get(struct task_struct *task) diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h index a43faa727124..9217169c64cb 100644 --- a/include/linux/user_namespace.h +++ b/include/linux/user_namespace.h @@ -72,8 +72,7 @@ extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t, extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t, loff_t *); extern int proc_setgroups_show(struct seq_file *m, void *v); extern bool userns_may_setgroups(const struct user_namespace *ns); -extern bool in_userns(const struct user_namespace *ns, - const struct user_namespace *target_ns); +extern bool current_in_userns(const struct user_namespace *target_ns); #else static inline struct user_namespace *get_user_ns(struct user_namespace *ns) @@ -103,8 +102,7 @@ static inline bool userns_may_setgroups(const struct user_namespace *ns) return true; } -static inline bool in_userns(const struct user_namespace *ns, - const struct user_namespace *target_ns) +static inline bool current_in_userns(const struct user_namespace *target_ns) { return true; } diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index dee3be5445da..68f594212759 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -942,10 +942,10 @@ bool userns_may_setgroups(const struct user_namespace *ns) * Returns true if @ns is the same namespace as or a descendant of * @target_ns. */ -bool in_userns(const struct user_namespace *ns, - const struct user_namespace *target_ns) +bool current_in_userns(const struct user_namespace *target_ns) { - for (; ns; ns = ns->parent) { + struct user_namespace *ns; + for (ns = current_user_ns(); ns; ns = ns->parent) { if (ns == target_ns) return true; } diff --git a/security/commoncap.c b/security/commoncap.c index a306c5d90709..e657227d221e 100644 --- a/security/commoncap.c +++ b/security/commoncap.c @@ -461,7 +461,7 @@ static int get_file_caps(struct linux_binprm *bprm, bool *effective, bool *has_c * explicit that capability bits are limited to s_user_ns and its * descendants. */ - if (!in_userns(current_user_ns(), bprm->file->f_path.mnt->mnt_sb->s_user_ns)) + if (!current_in_userns(bprm->file->f_path.mnt->mnt_sb->s_user_ns)) return 0; rc = get_vfs_caps_from_disk(bprm->file->f_path.dentry, &vcaps); -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 09/21] Smack: Handle labels consistently in untrusted mounts 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (4 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 08/21] userns: Replace in_userns with current_in_userns Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() Seth Forshee ` (5 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Casey Schaufler Cc: Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups, Seth Forshee, James Morris, Serge E. Hallyn The SMACK64, SMACK64EXEC, and SMACK64MMAP labels are all handled differently in untrusted mounts. This is confusing and potentically problematic. Change this to handle them all the same way that SMACK64 is currently handled; that is, read the label from disk and check it at use time. For SMACK64 and SMACK64MMAP access is denied if the label does not match smk_root. To be consistent with suid, a SMACK64EXEC label which does not match smk_root will still allow execution of the file but will not run with the label supplied in the xattr. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> --- security/smack/smack_lsm.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index aa17198cd5f2..ca564590cc1b 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -919,6 +919,7 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm) struct inode *inode = file_inode(bprm->file); struct task_smack *bsp = bprm->cred->security; struct inode_smack *isp; + struct superblock_smack *sbsp; int rc; if (bprm->cred_prepared) @@ -928,6 +929,11 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm) if (isp->smk_task == NULL || isp->smk_task == bsp->smk_task) return 0; + sbsp = inode->i_sb->s_security; + if ((sbsp->smk_flags & SMK_SB_UNTRUSTED) && + isp->smk_task != sbsp->smk_root) + return 0; + if (bprm->unsafe & (LSM_UNSAFE_PTRACE | LSM_UNSAFE_PTRACE_CAP)) { struct task_struct *tracer; rc = 0; @@ -1725,6 +1731,7 @@ static int smack_mmap_file(struct file *file, struct task_smack *tsp; struct smack_known *okp; struct inode_smack *isp; + struct superblock_smack *sbsp; int may; int mmay; int tmay; @@ -1736,6 +1743,10 @@ static int smack_mmap_file(struct file *file, isp = file_inode(file)->i_security; if (isp->smk_mmap == NULL) return 0; + sbsp = file_inode(file)->i_sb->s_security; + if (sbsp->smk_flags & SMK_SB_UNTRUSTED && + isp->smk_mmap != sbsp->smk_root) + return -EACCES; mkp = isp->smk_mmap; tsp = current_security(); @@ -3546,16 +3557,14 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode) if (rc >= 0) transflag = SMK_INODE_TRANSMUTE; } - if (!(sbsp->smk_flags & SMK_SB_UNTRUSTED)) { - /* - * Don't let the exec or mmap label be "*" or "@". - */ - skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); - if (IS_ERR(skp) || skp == &smack_known_star || - skp == &smack_known_web) - skp = NULL; - isp->smk_task = skp; - } + /* + * Don't let the exec or mmap label be "*" or "@". + */ + skp = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp); + if (IS_ERR(skp) || skp == &smack_known_star || + skp == &smack_known_web) + skp = NULL; + isp->smk_task = skp; skp = smk_fetch(XATTR_NAME_SMACKMMAP, inode, dp); if (IS_ERR(skp) || skp == &smack_known_star || -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (5 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 09/21] Smack: Handle labels consistently in untrusted mounts Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-05-24 15:55 ` Djalal Harouni 2016-04-26 19:36 ` [PATCH v4 11/21] cred: Reject inodes with invalid ids in set_create_file_as() Seth Forshee ` (4 subsequent siblings) 11 siblings, 1 reply; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache, Serge Hallyn, Seth Forshee, dm-devel, Miklos Szeredi, Richard Weinberger, linux-security-module, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, selinux, linux-fsdevel, cgroups, Pavel Tikhomirov Filesystem uids which don't map into a user namespace may result in inode->i_uid being INVALID_UID. A symlink and its parent could have different owners in the filesystem can both get mapped to INVALID_UID, which may result in following a symlink when this would not have otherwise been permitted when protected symlinks are enabled. Add a new helper function, uid_valid_eq(), and use this to validate that the ids in may_follow_link() are both equal and valid. Also add an equivalent helper for gids, which is currently unused. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- fs/namei.c | 2 +- include/linux/uidgid.h | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/fs/namei.c b/fs/namei.c index a29094c6f4a1..6fe8b0d8ca90 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -915,7 +915,7 @@ static inline int may_follow_link(struct nameidata *nd) return 0; /* Allowed if parent directory and link owner match. */ - if (uid_eq(parent->i_uid, inode->i_uid)) + if (uid_valid_eq(parent->i_uid, inode->i_uid)) return 0; if (nd->flags & LOOKUP_RCU) diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h index 03835522dfcb..e09529fe2668 100644 --- a/include/linux/uidgid.h +++ b/include/linux/uidgid.h @@ -117,6 +117,16 @@ static inline bool gid_valid(kgid_t gid) return __kgid_val(gid) != (gid_t) -1; } +static inline bool uid_valid_eq(kuid_t left, kuid_t right) +{ + return uid_eq(left, right) && uid_valid(left); +} + +static inline bool gid_valid_eq(kgid_t left, kgid_t right) +{ + return gid_eq(left, right) && gid_valid(left); +} + #ifdef CONFIG_USER_NS extern kuid_t make_kuid(struct user_namespace *from, uid_t uid); -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() 2016-04-26 19:36 ` [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() Seth Forshee @ 2016-05-24 15:55 ` Djalal Harouni 0 siblings, 0 replies; 29+ messages in thread From: Djalal Harouni @ 2016-05-24 15:55 UTC (permalink / raw) To: Seth Forshee Cc: Eric W. Biederman, Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups On Tue, Apr 26, 2016 at 02:36:23PM -0500, Seth Forshee wrote: > Filesystem uids which don't map into a user namespace may result > in inode->i_uid being INVALID_UID. A symlink and its parent > could have different owners in the filesystem can both get > mapped to INVALID_UID, which may result in following a symlink > when this would not have otherwise been permitted when protected > symlinks are enabled. > > Add a new helper function, uid_valid_eq(), and use this to > validate that the ids in may_follow_link() are both equal and > valid. Also add an equivalent helper for gids, which is > currently unused. > > Signed-off-by: Seth Forshee <seth.forshee@canonical.com> > Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Reviewed-by: Djalal Harouni <tixxdz@opendz.org> > --- > fs/namei.c | 2 +- > include/linux/uidgid.h | 10 ++++++++++ > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/fs/namei.c b/fs/namei.c > index a29094c6f4a1..6fe8b0d8ca90 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -915,7 +915,7 @@ static inline int may_follow_link(struct nameidata *nd) > return 0; > > /* Allowed if parent directory and link owner match. */ > - if (uid_eq(parent->i_uid, inode->i_uid)) > + if (uid_valid_eq(parent->i_uid, inode->i_uid)) > return 0; > > if (nd->flags & LOOKUP_RCU) > diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h > index 03835522dfcb..e09529fe2668 100644 > --- a/include/linux/uidgid.h > +++ b/include/linux/uidgid.h > @@ -117,6 +117,16 @@ static inline bool gid_valid(kgid_t gid) > return __kgid_val(gid) != (gid_t) -1; > } > > +static inline bool uid_valid_eq(kuid_t left, kuid_t right) > +{ > + return uid_eq(left, right) && uid_valid(left); > +} > + > +static inline bool gid_valid_eq(kgid_t left, kgid_t right) > +{ > + return gid_eq(left, right) && gid_valid(left); > +} > + > #ifdef CONFIG_USER_NS > > extern kuid_t make_kuid(struct user_namespace *from, uid_t uid); > -- > 2.7.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Djalal Harouni http://opendz.org ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 11/21] cred: Reject inodes with invalid ids in set_create_file_as() 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (6 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 15/21] fs: Don't remove suid for CAP_FSETID in s_user_ns Seth Forshee ` (3 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman Cc: linux-bcache, Serge Hallyn, Seth Forshee, dm-devel, Miklos Szeredi, Richard Weinberger, linux-security-module, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, Alexander Viro, selinux, linux-fsdevel, cgroups, Pavel Tikhomirov Using INVALID_[UG]ID for the LSM file creation context doesn't make sense, so return an error if the inode passed to set_create_file_as() has an invalid id. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- kernel/cred.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/cred.c b/kernel/cred.c index 0c0cd8a62285..5f264fb5737d 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -689,6 +689,8 @@ EXPORT_SYMBOL(set_security_override_from_ctx); */ int set_create_files_as(struct cred *new, struct inode *inode) { + if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid)) + return -EINVAL; new->fsuid = inode->i_uid; new->fsgid = inode->i_gid; return security_kernel_create_files_as(new, inode); -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 15/21] fs: Don't remove suid for CAP_FSETID in s_user_ns 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (7 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 11/21] cred: Reject inodes with invalid ids in set_create_file_as() Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (2 subsequent siblings) 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache, Serge Hallyn, Seth Forshee, dm-devel, Miklos Szeredi, Richard Weinberger, linux-security-module, linux-kernel, linux-raid, fuse-devel, Austin S Hemmelgarn, linux-mtd, selinux, linux-fsdevel, cgroups, Pavel Tikhomirov Expand the check in should_remove_suid() to keep privileges for CAP_FSETID in s_user_ns rather than init_user_ns. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> --- fs/inode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/inode.c b/fs/inode.c index 69b8b526c194..cd52170f9117 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1690,7 +1690,8 @@ int should_remove_suid(struct dentry *dentry) if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) kill |= ATTR_KILL_SGID; - if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) + if (unlikely(kill && !ns_capable(dentry->d_sb->s_user_ns, CAP_FSETID) && + S_ISREG(mode))) return kill; return 0; -- 2.7.4 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>]
* [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces Seth Forshee ` (8 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov fs_fully_visible() ignores MNT_LOCK_NODEV when FS_USERS_DEV_MOUNT is not set for the filesystem, but there is a bug in the logic that may cause mounting to fail. It is doing this only when the existing mount is not in init_user_ns but should check the new mount instead. But the new mount is always in a non-init namespace when fs_fully_visible() is called, so that condition can simply be removed. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/namespace.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index f20c82f91ecb..c133318bec35 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3234,8 +3234,7 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags) mnt_flags = mnt->mnt.mnt_flags; if (mnt->mnt.mnt_sb->s_iflags & SB_I_NOEXEC) mnt_flags &= ~(MNT_LOCK_NOSUID | MNT_LOCK_NOEXEC); - if (mnt->mnt.mnt_sb->s_user_ns != &init_user_ns && - !(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT)) + if (!(mnt->mnt.mnt_sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT)) mnt_flags &= ~(MNT_LOCK_NODEV); /* Verify the mount flags are equal to or more permissive -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-26 19:36 ` [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 07/21] selinux: Add support for unprivileged mounts from " Seth Forshee ` (7 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman, Jeff Layton, J. Bruce Fields, Tejun Heo, Li Zefan, Johannes Weiner Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov Both of these filesystems already have use cases for mounting the same super block from multiple user namespaces. For sysfs this happens when using criu for snapshotting a container, where sysfs is mounted in the containers network ns but the hosts user ns. The cgroup filesystem shares the same super block for all mounts of the same hierarchy regardless of the namespace. As a result, the restriction on mounting a super block from a single user namespace creates regressions for existing uses of these filesystems. For these specific filesystems this restriction isn't really necessary since the backing store is objects in kernel memory and thus the ids assigned from inodes is not subject to translation relative to s_user_ns. Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set causes sget_userns() to skip the check of s_user_ns. Set this flag for the sysfs and cgroup filesystems to fix the regressions. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> --- fs/super.c | 3 ++- fs/sysfs/mount.c | 3 ++- include/linux/fs.h | 1 + kernel/cgroup.c | 4 ++-- 4 files changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/super.c b/fs/super.c index 092a7828442e..ead156b44bf8 100644 --- a/fs/super.c +++ b/fs/super.c @@ -472,7 +472,8 @@ retry: hlist_for_each_entry(old, &type->fs_supers, s_instances) { if (!test(old, data)) continue; - if (user_ns != old->s_user_ns) { + if (!(type->fs_flags & FS_USERNS_SHARE_SB) && + user_ns != old->s_user_ns) { spin_unlock(&sb_lock); if (s) { up_write(&s->s_umount); diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c index f3db82071cfb..9555accd4322 100644 --- a/fs/sysfs/mount.c +++ b/fs/sysfs/mount.c @@ -59,7 +59,8 @@ static struct file_system_type sysfs_fs_type = { .name = "sysfs", .mount = sysfs_mount, .kill_sb = sysfs_kill_sb, - .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT, + .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT | + FS_USERNS_SHARE_SB, }; int __init sysfs_init(void) diff --git a/include/linux/fs.h b/include/linux/fs.h index be0f8023e28c..66a639ec1bc4 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1988,6 +1988,7 @@ struct file_system_type { #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ #define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */ #define FS_USERNS_VISIBLE 32 /* FS must already be visible */ +#define FS_USERNS_SHARE_SB 64 /* Allow sharing sb between userns-es */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 671dc05c0b0f..9c9aa27e531a 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2247,14 +2247,14 @@ static struct file_system_type cgroup_fs_type = { .name = "cgroup", .mount = cgroup_mount, .kill_sb = cgroup_kill_sb, - .fs_flags = FS_USERNS_MOUNT, + .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB, }; static struct file_system_type cgroup2_fs_type = { .name = "cgroup2", .mount = cgroup_mount, .kill_sb = cgroup_kill_sb, - .fs_flags = FS_USERNS_MOUNT, + .fs_flags = FS_USERNS_MOUNT | FS_USERNS_SHARE_SB, }; static char *cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen, -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 07/21] selinux: Add support for unprivileged mounts from user namespaces [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-26 19:36 ` [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() Seth Forshee 2016-04-26 19:36 ` [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns Seth Forshee ` (6 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Paul Moore, Stephen Smalley, Eric Paris Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, James Morris, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov, Serge E. Hallyn Security labels from unprivileged mounts in user namespaces must be ignored. Force superblocks from user namespaces whose labeling behavior is to use xattrs to use mountpoint labeling instead. For the mountpoint label, default to converting the current task context into a form suitable for file objects, but also allow the policy writer to specify a different label through policy transition rules. Pieced together from code snippets provided by Stephen Smalley. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Stephen Smalley <sds-+05T5uksL2qpZYMLLGbcSA@public.gmane.org> Acked-by: James Morris <james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> --- security/selinux/hooks.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 1350167635cb..33beed3ac589 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -820,6 +820,28 @@ static int selinux_set_mnt_opts(struct super_block *sb, goto out; } } + + /* + * If this is a user namespace mount, no contexts are allowed + * on the command line and security labels must be ignored. + */ + if (sb->s_user_ns != &init_user_ns) { + if (context_sid || fscontext_sid || rootcontext_sid || + defcontext_sid) { + rc = -EACCES; + goto out; + } + if (sbsec->behavior == SECURITY_FS_USE_XATTR) { + sbsec->behavior = SECURITY_FS_USE_MNTPOINT; + rc = security_transition_sid(current_sid(), current_sid(), + SECCLASS_FILE, NULL, + &sbsec->mntpoint_sid); + if (rc) + goto out; + } + goto out_set_opts; + } + /* sets the context of the superblock for the fs being mounted. */ if (fscontext_sid) { rc = may_context_mount_sb_relabel(fscontext_sid, sbsec, cred); @@ -888,6 +910,7 @@ static int selinux_set_mnt_opts(struct super_block *sb, sbsec->def_sid = defcontext_sid; } +out_set_opts: rc = sb_finish_set_opts(sb); out: mutex_unlock(&sbsec->lock); -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (2 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 07/21] selinux: Add support for unprivileged mounts from " Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts Seth Forshee ` (5 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov Add checks to inode_change_ok to verify that uid and gid changes will map into the superblock's user namespace. If they do not fail with -EOVERFLOW. This cannot be overriden with ATTR_FORCE. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/attr.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/fs/attr.c b/fs/attr.c index 25b24d0f6c88..3cfaaac4a18e 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -42,6 +42,17 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr) return error; } + /* + * Verify that uid/gid changes are valid in the target namespace + * of the superblock. This cannot be overriden using ATTR_FORCE. + */ + if (ia_valid & ATTR_UID && + from_kuid(inode->i_sb->s_user_ns, attr->ia_uid) == (uid_t)-1) + return -EOVERFLOW; + if (ia_valid & ATTR_GID && + from_kgid(inode->i_sb->s_user_ns, attr->ia_gid) == (gid_t)-1) + return -EOVERFLOW; + /* If force is set do it anyway. */ if (ia_valid & ATTR_FORCE) return 0; -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (3 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids Seth Forshee ` (4 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov ids in on-disk ACLs should be converted to s_user_ns instead of init_user_ns as is done now. This introduces the possibility for id mappings to fail, and when this happens syscalls will return EOVERFLOW. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/posix_acl.c | 67 ++++++++++++++++++++++++++--------------- fs/xattr.c | 19 +++++++++--- include/linux/posix_acl_xattr.h | 17 ++++++++--- 3 files changed, 70 insertions(+), 33 deletions(-) diff --git a/fs/posix_acl.c b/fs/posix_acl.c index 711dd5170376..dac2842dd4cb 100644 --- a/fs/posix_acl.c +++ b/fs/posix_acl.c @@ -595,59 +595,77 @@ EXPORT_SYMBOL_GPL(posix_acl_create); /* * Fix up the uids and gids in posix acl extended attributes in place. */ -static void posix_acl_fix_xattr_userns( +static int posix_acl_fix_xattr_userns( struct user_namespace *to, struct user_namespace *from, void *value, size_t size) { posix_acl_xattr_header *header = (posix_acl_xattr_header *)value; posix_acl_xattr_entry *entry = (posix_acl_xattr_entry *)(header+1), *end; int count; - kuid_t uid; - kgid_t gid; + kuid_t kuid; + uid_t uid; + kgid_t kgid; + gid_t gid; if (!value) - return; + return 0; if (size < sizeof(posix_acl_xattr_header)) - return; + return 0; if (header->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION)) - return; + return 0; count = posix_acl_xattr_count(size); if (count < 0) - return; + return 0; if (count == 0) - return; + return 0; for (end = entry + count; entry != end; entry++) { switch(le16_to_cpu(entry->e_tag)) { case ACL_USER: - uid = make_kuid(from, le32_to_cpu(entry->e_id)); - entry->e_id = cpu_to_le32(from_kuid(to, uid)); + kuid = make_kuid(from, le32_to_cpu(entry->e_id)); + if (!uid_valid(kuid)) + return -EOVERFLOW; + uid = from_kuid(to, kuid); + if (uid == (uid_t)-1) + return -EOVERFLOW; + entry->e_id = cpu_to_le32(uid); break; case ACL_GROUP: - gid = make_kgid(from, le32_to_cpu(entry->e_id)); - entry->e_id = cpu_to_le32(from_kgid(to, gid)); + kgid = make_kgid(from, le32_to_cpu(entry->e_id)); + if (!gid_valid(kgid)) + return -EOVERFLOW; + gid = from_kgid(to, kgid); + if (gid == (gid_t)-1) + return -EOVERFLOW; + entry->e_id = cpu_to_le32(gid); break; default: break; } } + + return 0; } -void posix_acl_fix_xattr_from_user(void *value, size_t size) +int +posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value, + size_t size) { - struct user_namespace *user_ns = current_user_ns(); - if (user_ns == &init_user_ns) - return; - posix_acl_fix_xattr_userns(&init_user_ns, user_ns, value, size); + struct user_namespace *source_ns = current_user_ns(); + if (source_ns == target_ns) + return 0; + return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size); } -void posix_acl_fix_xattr_to_user(void *value, size_t size) +int +posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value, + size_t size) { - struct user_namespace *user_ns = current_user_ns(); - if (user_ns == &init_user_ns) - return; - posix_acl_fix_xattr_userns(user_ns, &init_user_ns, value, size); + struct user_namespace *target_ns = current_user_ns(); + if (target_ns == source_ns) + return 0; + return posix_acl_fix_xattr_userns(target_ns, source_ns, value, size); } /* @@ -780,7 +798,7 @@ posix_acl_xattr_get(const struct xattr_handler *handler, if (acl == NULL) return -ENODATA; - error = posix_acl_to_xattr(&init_user_ns, acl, value, size); + error = posix_acl_to_xattr(dentry->d_sb->s_user_ns, acl, value, size); posix_acl_release(acl); return error; @@ -806,7 +824,8 @@ posix_acl_xattr_set(const struct xattr_handler *handler, return -EPERM; if (value) { - acl = posix_acl_from_xattr(&init_user_ns, value, size); + acl = posix_acl_from_xattr(dentry->d_sb->s_user_ns, value, + size); if (IS_ERR(acl)) return PTR_ERR(acl); diff --git a/fs/xattr.c b/fs/xattr.c index 4861322e28e8..c541121945cd 100644 --- a/fs/xattr.c +++ b/fs/xattr.c @@ -330,8 +330,12 @@ setxattr(struct dentry *d, const char __user *name, const void __user *value, goto out; } if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) || - (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) - posix_acl_fix_xattr_from_user(kvalue, size); + (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) { + error = posix_acl_fix_xattr_from_user(d->d_sb->s_user_ns, + kvalue, size); + if (error) + goto out; + } } error = vfs_setxattr(d, kname, kvalue, size, flags); @@ -427,9 +431,14 @@ getxattr(struct dentry *d, const char __user *name, void __user *value, error = vfs_getxattr(d, kname, kvalue, size); if (error > 0) { if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) || - (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) - posix_acl_fix_xattr_to_user(kvalue, size); - if (size && copy_to_user(value, kvalue, error)) + (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) { + int ret; + ret = posix_acl_fix_xattr_to_user(d->d_sb->s_user_ns, + kvalue, size); + if (ret) + error = ret; + } + if (error > 0 && size && copy_to_user(value, kvalue, error)) error = -EFAULT; } else if (error == -ERANGE && size >= XATTR_SIZE_MAX) { /* The file system tried to returned a value bigger diff --git a/include/linux/posix_acl_xattr.h b/include/linux/posix_acl_xattr.h index e5e8ec40278d..5dec6b10951a 100644 --- a/include/linux/posix_acl_xattr.h +++ b/include/linux/posix_acl_xattr.h @@ -49,14 +49,23 @@ posix_acl_xattr_count(size_t size) } #ifdef CONFIG_FS_POSIX_ACL -void posix_acl_fix_xattr_from_user(void *value, size_t size); -void posix_acl_fix_xattr_to_user(void *value, size_t size); +int posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, + void *value, size_t size); +int posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value, + size_t size); #else -static inline void posix_acl_fix_xattr_from_user(void *value, size_t size) +static inline int +posix_acl_fix_xattr_from_user(struct user_namespace *target_ns, void *value, + size_t size) { + return 0; } -static inline void posix_acl_fix_xattr_to_user(void *value, size_t size) + +static inline int +posix_acl_fix_xattr_to_user(struct user_namespace *source_ns, void *value, + size_t size) { + return 0; } #endif -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (4 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb() Seth Forshee ` (3 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro, Greg Kroah-Hartman Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov In a userns mount some on-disk inodes may have ids which do not map into s_user_ns, in which case the in-kernel inodes are owned by invalid users. The superblock owner should be able to change attributes of these inodes but cannot. However it is unsafe to grant the superblock owner privileged access to all inodes in the superblock since proc, sysfs, etc. use DAC to protect files which may not belong to s_user_ns. The problem is restricted to only inodes where the owner or group is an invalid user. We can work around this by allowing users with CAP_CHOWN in s_user_ns to change an invalid owner or group id, so long as the other id is either invalid or mappable in s_user_ns. After changing ownership the user will be privileged towards the inode and thus able to change other attributes. As an precaution, checks for invalid ids are added to the proc and kernfs setattr interfaces. These filesystems are not expected to have inodes with invalid ids, but if it does happen any setattr operations will return -EPERM. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/attr.c | 62 ++++++++++++++++++++++++++++++++++++++++++++------- fs/kernfs/inode.c | 2 ++ fs/proc/base.c | 2 ++ fs/proc/generic.c | 3 +++ fs/proc/proc_sysctl.c | 2 ++ 5 files changed, 63 insertions(+), 8 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 3cfaaac4a18e..06bb3f401559 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -16,6 +16,58 @@ #include <linux/evm.h> #include <linux/ima.h> +static bool chown_ok(const struct inode *inode, kuid_t uid) +{ + struct user_namespace *user_ns; + + if (uid_eq(current_fsuid(), inode->i_uid) && uid_eq(uid, inode->i_uid)) + return true; + if (capable_wrt_inode_uidgid(inode, CAP_CHOWN)) + return true; + + /* + * Inode uids/gids are of type kuid_t/kgid_t. As such, they can be + * a) INVALID_UID/INVALID_GID, b) valid and mappable into + * i_sb->s_user_ns, or c) valid but not mappable into + * i_sb->s_user_ns. + * + * For filesystems on user-supplied media ids will either be (a) or + * (b), so we permit CAP_CHOWN in s_user_ns to change INVALID_UID if + * the gid meets these conditions (and vice versa for INVALID_GID). + * + * For psuedo-filesystems like proc or sysfs ids will be either (b) + * or (c), so these conditions do not permit namespace-root to chown + * in those filesystems. + */ + user_ns = inode->i_sb->s_user_ns; + if (!uid_valid(inode->i_uid) && + (!gid_valid(inode->i_gid) || kgid_has_mapping(user_ns, inode->i_gid)) && + ns_capable(user_ns, CAP_CHOWN)) + return true; + + return false; +} + +static bool chgrp_ok(const struct inode *inode, kgid_t gid) +{ + struct user_namespace *user_ns; + + if (uid_eq(current_fsuid(), inode->i_uid) && + (in_group_p(gid) || gid_eq(gid, inode->i_gid))) + return true; + if (capable_wrt_inode_uidgid(inode, CAP_CHOWN)) + return true; + + /* Logic here is the same as in chown_ok(); see comment there. */ + user_ns = inode->i_sb->s_user_ns; + if (!gid_valid(inode->i_gid) && + (!uid_valid(inode->i_uid) || kuid_has_mapping(user_ns, inode->i_uid)) && + ns_capable(user_ns, CAP_CHOWN)) + return true; + + return false; +} + /** * inode_change_ok - check if attribute changes to an inode are allowed * @inode: inode to check @@ -58,17 +110,11 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr) return 0; /* Make sure a caller can chown. */ - if ((ia_valid & ATTR_UID) && - (!uid_eq(current_fsuid(), inode->i_uid) || - !uid_eq(attr->ia_uid, inode->i_uid)) && - !capable_wrt_inode_uidgid(inode, CAP_CHOWN)) + if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid)) return -EPERM; /* Make sure caller can chgrp. */ - if ((ia_valid & ATTR_GID) && - (!uid_eq(current_fsuid(), inode->i_uid) || - (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) && - !capable_wrt_inode_uidgid(inode, CAP_CHOWN)) + if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid)) return -EPERM; /* Make sure a caller can chmod. */ diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c index 16405ae88d2d..2e97a337ee5f 100644 --- a/fs/kernfs/inode.c +++ b/fs/kernfs/inode.c @@ -117,6 +117,8 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr) if (!kn) return -EINVAL; + if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid)) + return -EPERM; mutex_lock(&kernfs_mutex); error = inode_change_ok(inode, iattr); diff --git a/fs/proc/base.c b/fs/proc/base.c index b1755b23893e..648d623e2158 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -711,6 +711,8 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr) if (attr->ia_valid & ATTR_MODE) return -EPERM; + if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid)) + return -EPERM; error = inode_change_ok(inode, attr); if (error) diff --git a/fs/proc/generic.c b/fs/proc/generic.c index ff3ffc76a937..1461570c552c 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -105,6 +105,9 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr) struct proc_dir_entry *de = PDE(inode); int error; + if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid)) + return -EPERM; + error = inode_change_ok(inode, iattr); if (error) return error; diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index fe5b6e6c4671..f5d575157194 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -752,6 +752,8 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr) if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) return -EPERM; + if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid)) + return -EPERM; error = inode_change_ok(inode, attr); if (error) -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb() [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (5 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs Seth Forshee ` (2 subsequent siblings) 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Alexander Viro Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov Superblock level remounts are currently restricted to global CAP_SYS_ADMIN, as is the path for changing the root mount to read only on umount. Loosen both of these permission checks to also allow CAP_SYS_ADMIN in any namespace which is privileged towards the userns which originally mounted the filesystem. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/namespace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 0ad8e4a4f50b..575e3f8b34fd 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1510,7 +1510,7 @@ static int do_umount(struct mount *mnt, int flags) * Special case for "unmounting" root ... * we just try to remount it readonly. */ - if (!capable(CAP_SYS_ADMIN)) + if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN)) return -EPERM; down_write(&sb->s_umount); if (!(sb->s_flags & MS_RDONLY)) @@ -2207,7 +2207,7 @@ static int do_remount(struct path *path, int flags, int mnt_flags, down_write(&sb->s_umount); if (flags & MS_BIND) err = change_mount_flags(path->mnt, flags); - else if (!capable(CAP_SYS_ADMIN)) + else if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN)) err = -EPERM; else err = do_remount_sb(sb, flags, data, 0); -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (6 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb() Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee [not found] ` <1461699396-33000-18-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-26 19:36 ` [PATCH v4 18/21] fuse: Add support for pid namespaces Seth Forshee 2016-04-26 19:36 ` [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns Seth Forshee 9 siblings, 1 reply; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov A privileged user in s_user_ns will generally have the ability to manipulate the backing store and insert security.* xattrs into the filesystem directly. Therefore the kernel must be prepared to handle these xattrs from unprivileged mounts, and it makes little sense for commoncap to prevent writing these xattrs to the filesystem. The capability and LSM code have already been updated to appropriately handle xattrs from unprivileged mounts, so it is safe to loosen this restriction on setting xattrs. The exception to this logic is that writing xattrs to a mounted filesystem may also cause the LSM inode_post_setxattr or inode_setsecurity callbacks to be invoked. SELinux will deny the xattr update by virtue of applying mountpoint labeling to unprivileged userns mounts, and Smack will deny the writes for any user without global CAP_MAC_ADMIN, so loosening the capability check in commoncap is safe in this respect as well. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- security/commoncap.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/security/commoncap.c b/security/commoncap.c index e657227d221e..12477afaa8ed 100644 --- a/security/commoncap.c +++ b/security/commoncap.c @@ -664,15 +664,17 @@ int cap_bprm_secureexec(struct linux_binprm *bprm) int cap_inode_setxattr(struct dentry *dentry, const char *name, const void *value, size_t size, int flags) { + struct user_namespace *user_ns = dentry->d_sb->s_user_ns; + if (!strcmp(name, XATTR_NAME_CAPS)) { - if (!capable(CAP_SETFCAP)) + if (!ns_capable(user_ns, CAP_SETFCAP)) return -EPERM; return 0; } if (!strncmp(name, XATTR_SECURITY_PREFIX, sizeof(XATTR_SECURITY_PREFIX) - 1) && - !capable(CAP_SYS_ADMIN)) + !ns_capable(user_ns, CAP_SYS_ADMIN)) return -EPERM; return 0; } @@ -690,15 +692,17 @@ int cap_inode_setxattr(struct dentry *dentry, const char *name, */ int cap_inode_removexattr(struct dentry *dentry, const char *name) { + struct user_namespace *user_ns = dentry->d_sb->s_user_ns; + if (!strcmp(name, XATTR_NAME_CAPS)) { - if (!capable(CAP_SETFCAP)) + if (!ns_capable(user_ns, CAP_SETFCAP)) return -EPERM; return 0; } if (!strncmp(name, XATTR_SECURITY_PREFIX, sizeof(XATTR_SECURITY_PREFIX) - 1) && - !capable(CAP_SYS_ADMIN)) + !ns_capable(user_ns, CAP_SYS_ADMIN)) return -EPERM; return 0; } -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <1461699396-33000-18-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>]
* Re: [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs [not found] ` <1461699396-33000-18-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> @ 2016-04-27 7:22 ` James Morris 0 siblings, 0 replies; 29+ messages in thread From: James Morris @ 2016-04-27 7:22 UTC (permalink / raw) To: Seth Forshee Cc: Eric W. Biederman, Serge Hallyn, James Morris, Serge E. Hallyn, Alexander Viro, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-bcache-u79uwXL29TY76Z2rM5mHXA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-raid-u79uwXL29TY76Z2rM5mHXA, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-security-module-u79uwXL29TY76Z2rM5mHXA, selinux-+05T5uksL2qpZYMLLGbcSA, cgroups-u79uwXL29TY76Z2rM5mHXA On Tue, 26 Apr 2016, Seth Forshee wrote: > A privileged user in s_user_ns will generally have the ability to > manipulate the backing store and insert security.* xattrs into > the filesystem directly. Therefore the kernel must be prepared to > handle these xattrs from unprivileged mounts, and it makes little > sense for commoncap to prevent writing these xattrs to the > filesystem. The capability and LSM code have already been updated > to appropriately handle xattrs from unprivileged mounts, so it > is safe to loosen this restriction on setting xattrs. > > The exception to this logic is that writing xattrs to a mounted > filesystem may also cause the LSM inode_post_setxattr or > inode_setsecurity callbacks to be invoked. SELinux will deny the > xattr update by virtue of applying mountpoint labeling to > unprivileged userns mounts, and Smack will deny the writes for > any user without global CAP_MAC_ADMIN, so loosening the > capability check in commoncap is safe in this respect as well. > > Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> > Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: James Morris <james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> -- James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org> ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 18/21] fuse: Add support for pid namespaces [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (7 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 2016-07-20 2:44 ` Sheng Yang 2016-04-26 19:36 ` [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns Seth Forshee 9 siblings, 1 reply; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Miklos Szeredi Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov When the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd are not being translated into that process' namespace. Translation is necessary for the pid to be useful to that process. Since no use case currently exists for changing namespaces all translations can be done relative to the pid namespace in use when fuse_conn_init() is called. For fuse this translates to mount time, and for cuse this is when /dev/cuse is opened. IO for this connection from another namespace will return errors. Requests from processes whose pid cannot be translated into the target namespace are not permitted, except for requests allocated via fuse_get_req_nofail_nopages. For no-fail requests in.h.pid will be 0 if the pid translation fails. File locking changes based on previous work done by Eric Biederman. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Miklos Szeredi <mszeredi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/fuse/dev.c | 19 +++++++++++++++---- fs/fuse/file.c | 22 +++++++++++++++++----- fs/fuse/fuse_i.h | 4 ++++ fs/fuse/inode.c | 3 +++ 4 files changed, 39 insertions(+), 9 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index cbece1221417..4e91b2ac25a7 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -19,6 +19,7 @@ #include <linux/pipe_fs_i.h> #include <linux/swap.h> #include <linux/splice.h> +#include <linux/sched.h> MODULE_ALIAS_MISCDEV(FUSE_MINOR); MODULE_ALIAS("devname:fuse"); @@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req) atomic_dec(&req->count); } -static void fuse_req_init_context(struct fuse_req *req) +static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req) { req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid()); req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid()); - req->in.h.pid = current->pid; + req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns); } void fuse_set_initialized(struct fuse_conn *fc) @@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages, goto out; } - fuse_req_init_context(req); + fuse_req_init_context(fc, req); __set_bit(FR_WAITING, &req->flags); if (for_background) __set_bit(FR_BACKGROUND, &req->flags); + if (req->in.h.pid == 0) { + fuse_put_request(fc, req); + return ERR_PTR(-EOVERFLOW); + } return req; @@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc, if (!req) req = get_reserved_req(fc, file); - fuse_req_init_context(req); + fuse_req_init_context(fc, req); __set_bit(FR_WAITING, &req->flags); __clear_bit(FR_BACKGROUND, &req->flags); return req; @@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, struct fuse_in *in; unsigned reqsize; + if (task_active_pid_ns(current) != fc->pid_ns) + return -EIO; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; @@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud, struct fuse_req *req; struct fuse_out_header oh; + if (task_active_pid_ns(current) != fc->pid_ns) + return -EIO; + if (nbytes < sizeof(struct fuse_out_header)) return -EINVAL; diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 719924d6c706..b5c616c5ec98 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma) return generic_file_mmap(file, vma); } -static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, +static int convert_fuse_file_lock(struct fuse_conn *fc, + const struct fuse_file_lock *ffl, struct file_lock *fl) { switch (ffl->type) { @@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, fl->fl_start = ffl->start; fl->fl_end = ffl->end; - fl->fl_pid = ffl->pid; + + /* + * Convert pid into the caller's pid namespace. If the pid + * does not map into the namespace fl_pid will get set to 0. + */ + rcu_read_lock(); + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns)); + rcu_read_unlock(); break; default: @@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl) args.out.args[0].value = &outarg; err = fuse_simple_request(fc, &args); if (!err) - err = convert_fuse_file_lock(&outarg.lk, fl); + err = convert_fuse_file_lock(fc, &outarg.lk, fl); return err; } @@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) FUSE_ARGS(args); struct fuse_lk_in inarg; int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK; - pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0; + struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL; + pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns); int err; if (fl->fl_lmops && fl->fl_lmops->lm_grant) { @@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) if (fl->fl_flags & FL_CLOSE) return 0; - fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg); + if (pid && pid_nr == 0) + return -EOVERFLOW; + + fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg); err = fuse_simple_request(fc, &args); /* locking is restartable */ diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index eddbe02c4028..9145445a759a 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -23,6 +23,7 @@ #include <linux/poll.h> #include <linux/workqueue.h> #include <linux/kref.h> +#include <linux/pid_namespace.h> /** Max number of pages that can be used in a single read request */ #define FUSE_MAX_PAGES_PER_REQ 32 @@ -465,6 +466,9 @@ struct fuse_conn { /** The group id for this mount */ kgid_t group_id; + /** The pid namespace for this mount */ + struct pid_namespace *pid_ns; + /** The fuse mount flags for this mount */ unsigned flags; diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 1ce67668a8e1..eade0bfa4488 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -20,6 +20,7 @@ #include <linux/random.h> #include <linux/sched.h> #include <linux/exportfs.h> +#include <linux/pid_namespace.h> MODULE_AUTHOR("Miklos Szeredi <miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org>"); MODULE_DESCRIPTION("Filesystem in Userspace"); @@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc) fc->connected = 1; fc->attr_version = 1; get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key)); + fc->pid_ns = get_pid_ns(task_active_pid_ns(current)); } EXPORT_SYMBOL_GPL(fuse_conn_init); @@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc) if (atomic_dec_and_test(&fc->count)) { if (fc->destroy_req) fuse_request_free(fc->destroy_req); + put_pid_ns(fc->pid_ns); fc->release(fc); } } -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH v4 18/21] fuse: Add support for pid namespaces 2016-04-26 19:36 ` [PATCH v4 18/21] fuse: Add support for pid namespaces Seth Forshee @ 2016-07-20 2:44 ` Sheng Yang [not found] ` <CA+2rt426_pshAauQizcxkfAq16vmEpB4sJ4genW_ucosH3j=zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Sheng Yang @ 2016-07-20 2:44 UTC (permalink / raw) To: Seth Forshee Cc: Eric W. Biederman, Miklos Szeredi, Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, kernel list, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups On Tue, Apr 26, 2016 at 12:36 PM, Seth Forshee <seth.forshee@canonical.com> wrote: > When the userspace process servicing fuse requests is running in > a pid namespace then pids passed via the fuse fd are not being > translated into that process' namespace. Translation is necessary > for the pid to be useful to that process. > > Since no use case currently exists for changing namespaces all > translations can be done relative to the pid namespace in use > when fuse_conn_init() is called. For fuse this translates to > mount time, and for cuse this is when /dev/cuse is opened. IO for > this connection from another namespace will return errors. > > Requests from processes whose pid cannot be translated into the > target namespace are not permitted, except for requests > allocated via fuse_get_req_nofail_nopages. For no-fail requests > in.h.pid will be 0 if the pid translation fails. Hi Seth, This patch caused a regression in our major container use case with FUSE in Ubuntu 16.04, as patch was checked in as Ubuntu Sauce in Ubuntu 4.4.0-6.21 kernel. The use case is: 1. Create a Docker container. 2. Inside the container, start the FUSE backend, and mounted fs. 3. Following step 2 in the container, create a loopback device to map a file in the mounted fuse to create a block device, which will be available to the whole system. It works well before this commit. The use case is broken because no matter which namespace losetup runs, the real request from loopback device seems always come from init ns, thus it will be in different ns running fuse backend. So the request will got denied, because the ns running fuse won't able to see the things from higher level(level 0 in fact) pid namespace. I think since init pid ns has ability to access any process in the system, it should able to access the fuse mounted by any pid namespace process as well. What you think? --Sheng > > File locking changes based on previous work done by Eric > Biederman. > > Signed-off-by: Seth Forshee <seth.forshee@canonical.com> > Acked-by: Miklos Szeredi <mszeredi@redhat.com> > --- > fs/fuse/dev.c | 19 +++++++++++++++---- > fs/fuse/file.c | 22 +++++++++++++++++----- > fs/fuse/fuse_i.h | 4 ++++ > fs/fuse/inode.c | 3 +++ > 4 files changed, 39 insertions(+), 9 deletions(-) > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c > index cbece1221417..4e91b2ac25a7 100644 > --- a/fs/fuse/dev.c > +++ b/fs/fuse/dev.c > @@ -19,6 +19,7 @@ > #include <linux/pipe_fs_i.h> > #include <linux/swap.h> > #include <linux/splice.h> > +#include <linux/sched.h> > > MODULE_ALIAS_MISCDEV(FUSE_MINOR); > MODULE_ALIAS("devname:fuse"); > @@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req) > atomic_dec(&req->count); > } > > -static void fuse_req_init_context(struct fuse_req *req) > +static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req) > { > req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid()); > req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid()); > - req->in.h.pid = current->pid; > + req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns); > } > > void fuse_set_initialized(struct fuse_conn *fc) > @@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages, > goto out; > } > > - fuse_req_init_context(req); > + fuse_req_init_context(fc, req); > __set_bit(FR_WAITING, &req->flags); > if (for_background) > __set_bit(FR_BACKGROUND, &req->flags); > + if (req->in.h.pid == 0) { > + fuse_put_request(fc, req); > + return ERR_PTR(-EOVERFLOW); > + } > > return req; > > @@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc, > if (!req) > req = get_reserved_req(fc, file); > > - fuse_req_init_context(req); > + fuse_req_init_context(fc, req); > __set_bit(FR_WAITING, &req->flags); > __clear_bit(FR_BACKGROUND, &req->flags); > return req; > @@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, > struct fuse_in *in; > unsigned reqsize; > > + if (task_active_pid_ns(current) != fc->pid_ns) > + return -EIO; > + > restart: > spin_lock(&fiq->waitq.lock); > err = -EAGAIN; > @@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud, > struct fuse_req *req; > struct fuse_out_header oh; > > + if (task_active_pid_ns(current) != fc->pid_ns) > + return -EIO; > + > if (nbytes < sizeof(struct fuse_out_header)) > return -EINVAL; > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 719924d6c706..b5c616c5ec98 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma) > return generic_file_mmap(file, vma); > } > > -static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, > +static int convert_fuse_file_lock(struct fuse_conn *fc, > + const struct fuse_file_lock *ffl, > struct file_lock *fl) > { > switch (ffl->type) { > @@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, > > fl->fl_start = ffl->start; > fl->fl_end = ffl->end; > - fl->fl_pid = ffl->pid; > + > + /* > + * Convert pid into the caller's pid namespace. If the pid > + * does not map into the namespace fl_pid will get set to 0. > + */ > + rcu_read_lock(); > + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns)); > + rcu_read_unlock(); > break; > > default: > @@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl) > args.out.args[0].value = &outarg; > err = fuse_simple_request(fc, &args); > if (!err) > - err = convert_fuse_file_lock(&outarg.lk, fl); > + err = convert_fuse_file_lock(fc, &outarg.lk, fl); > > return err; > } > @@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) > FUSE_ARGS(args); > struct fuse_lk_in inarg; > int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK; > - pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0; > + struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL; > + pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns); > int err; > > if (fl->fl_lmops && fl->fl_lmops->lm_grant) { > @@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) > if (fl->fl_flags & FL_CLOSE) > return 0; > > - fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg); > + if (pid && pid_nr == 0) > + return -EOVERFLOW; > + > + fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg); > err = fuse_simple_request(fc, &args); > > /* locking is restartable */ > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > index eddbe02c4028..9145445a759a 100644 > --- a/fs/fuse/fuse_i.h > +++ b/fs/fuse/fuse_i.h > @@ -23,6 +23,7 @@ > #include <linux/poll.h> > #include <linux/workqueue.h> > #include <linux/kref.h> > +#include <linux/pid_namespace.h> > > /** Max number of pages that can be used in a single read request */ > #define FUSE_MAX_PAGES_PER_REQ 32 > @@ -465,6 +466,9 @@ struct fuse_conn { > /** The group id for this mount */ > kgid_t group_id; > > + /** The pid namespace for this mount */ > + struct pid_namespace *pid_ns; > + > /** The fuse mount flags for this mount */ > unsigned flags; > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index 1ce67668a8e1..eade0bfa4488 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -20,6 +20,7 @@ > #include <linux/random.h> > #include <linux/sched.h> > #include <linux/exportfs.h> > +#include <linux/pid_namespace.h> > > MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>"); > MODULE_DESCRIPTION("Filesystem in Userspace"); > @@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc) > fc->connected = 1; > fc->attr_version = 1; > get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key)); > + fc->pid_ns = get_pid_ns(task_active_pid_ns(current)); > } > EXPORT_SYMBOL_GPL(fuse_conn_init); > > @@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc) > if (atomic_dec_and_test(&fc->count)) { > if (fc->destroy_req) > fuse_request_free(fc->destroy_req); > + put_pid_ns(fc->pid_ns); > fc->release(fc); > } > } > -- > 2.7.4 > ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CA+2rt426_pshAauQizcxkfAq16vmEpB4sJ4genW_ucosH3j=zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v4 18/21] fuse: Add support for pid namespaces [not found] ` <CA+2rt426_pshAauQizcxkfAq16vmEpB4sJ4genW_ucosH3j=zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-07-20 12:52 ` Seth Forshee 2016-07-20 22:28 ` Sheng Yang 2016-07-21 7:25 ` Miklos Szeredi 0 siblings, 2 replies; 29+ messages in thread From: Seth Forshee @ 2016-07-20 12:52 UTC (permalink / raw) To: Sheng Yang Cc: Eric W. Biederman, Miklos Szeredi, Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, kernel list, linux-bcache-u79uwXL29TY76Z2rM5mHXA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-raid-u79uwXL29TY76Z2rM5mHXA, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-security-module-u79uwXL29TY76Z2rM5mHXA, selinux-+05T5uksL2qpZYMLLGbcSA, cgroups-u79uwXL29TY76Z2rM5mHXA On Tue, Jul 19, 2016 at 07:44:11PM -0700, Sheng Yang wrote: > On Tue, Apr 26, 2016 at 12:36 PM, Seth Forshee > <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> wrote: > > When the userspace process servicing fuse requests is running in > > a pid namespace then pids passed via the fuse fd are not being > > translated into that process' namespace. Translation is necessary > > for the pid to be useful to that process. > > > > Since no use case currently exists for changing namespaces all > > translations can be done relative to the pid namespace in use > > when fuse_conn_init() is called. For fuse this translates to > > mount time, and for cuse this is when /dev/cuse is opened. IO for > > this connection from another namespace will return errors. > > > > Requests from processes whose pid cannot be translated into the > > target namespace are not permitted, except for requests > > allocated via fuse_get_req_nofail_nopages. For no-fail requests > > in.h.pid will be 0 if the pid translation fails. > > Hi Seth, > > This patch caused a regression in our major container use case with > FUSE in Ubuntu 16.04, as patch was checked in as Ubuntu Sauce in > Ubuntu 4.4.0-6.21 kernel. > > The use case is: > 1. Create a Docker container. > 2. Inside the container, start the FUSE backend, and mounted fs. > 3. Following step 2 in the container, create a loopback device to map > a file in the mounted fuse to create a block device, which will be > available to the whole system. > > It works well before this commit. > > The use case is broken because no matter which namespace losetup runs, > the real request from loopback device seems always come from init ns, > thus it will be in different ns running fuse backend. So the request > will got denied, because the ns running fuse won't able to see the > things from higher level(level 0 in fact) pid namespace. > > I think since init pid ns has ability to access any process in the > system, it should able to access the fuse mounted by any pid namespace > process as well. > > What you think? It sounds like we need to remove the restriction on accessing the filesystem from a different pid namespace. I don't think this poses a security problem. However there's no pid mapping that is usable by the userspace fuse process, so what do we put in the fuse request? Probably the only candidates are 0 and 0xffffffff. So a question for the fuse developers - is one value or the other preferrable for fuse_in_header.pid when the pid cannot be mapped, and is this going to cause problems for any fuse filesystems? I suspect that few filesystems actually look at the pid anyway, and already for a filesystem mounted in a pid namespace the values being given to userspace won't be correct for the namespace of the fuse process. Seth ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v4 18/21] fuse: Add support for pid namespaces 2016-07-20 12:52 ` Seth Forshee @ 2016-07-20 22:28 ` Sheng Yang 2016-07-21 7:25 ` Miklos Szeredi 1 sibling, 0 replies; 29+ messages in thread From: Sheng Yang @ 2016-07-20 22:28 UTC (permalink / raw) To: Seth Forshee Cc: Eric W. Biederman, Miklos Szeredi, Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, kernel list, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups On Wed, Jul 20, 2016 at 5:52 AM, Seth Forshee <seth.forshee@canonical.com> wrote: > On Tue, Jul 19, 2016 at 07:44:11PM -0700, Sheng Yang wrote: >> On Tue, Apr 26, 2016 at 12:36 PM, Seth Forshee >> <seth.forshee@canonical.com> wrote: >> > When the userspace process servicing fuse requests is running in >> > a pid namespace then pids passed via the fuse fd are not being >> > translated into that process' namespace. Translation is necessary >> > for the pid to be useful to that process. >> > >> > Since no use case currently exists for changing namespaces all >> > translations can be done relative to the pid namespace in use >> > when fuse_conn_init() is called. For fuse this translates to >> > mount time, and for cuse this is when /dev/cuse is opened. IO for >> > this connection from another namespace will return errors. >> > >> > Requests from processes whose pid cannot be translated into the >> > target namespace are not permitted, except for requests >> > allocated via fuse_get_req_nofail_nopages. For no-fail requests >> > in.h.pid will be 0 if the pid translation fails. >> >> Hi Seth, >> >> This patch caused a regression in our major container use case with >> FUSE in Ubuntu 16.04, as patch was checked in as Ubuntu Sauce in >> Ubuntu 4.4.0-6.21 kernel. >> >> The use case is: >> 1. Create a Docker container. >> 2. Inside the container, start the FUSE backend, and mounted fs. >> 3. Following step 2 in the container, create a loopback device to map >> a file in the mounted fuse to create a block device, which will be >> available to the whole system. >> >> It works well before this commit. >> >> The use case is broken because no matter which namespace losetup runs, >> the real request from loopback device seems always come from init ns, >> thus it will be in different ns running fuse backend. So the request >> will got denied, because the ns running fuse won't able to see the >> things from higher level(level 0 in fact) pid namespace. >> >> I think since init pid ns has ability to access any process in the >> system, it should able to access the fuse mounted by any pid namespace >> process as well. >> >> What you think? > > It sounds like we need to remove the restriction on accessing the > filesystem from a different pid namespace. I don't think this poses a > security problem. However there's no pid mapping that is usable by the > userspace fuse process, so what do we put in the fuse request? Probably > the only candidates are 0 and 0xffffffff. Thanks Seth, I don't think it will be a security problem either, if we remove the restriction. > > So a question for the fuse developers - is one value or the other > preferrable for fuse_in_header.pid when the pid cannot be mapped, and is > this going to cause problems for any fuse filesystems? I suspect that > few filesystems actually look at the pid anyway, and already for a > filesystem mounted in a pid namespace the values being given to > userspace won't be correct for the namespace of the fuse process. > At least in our system we're not looking into the pid at all. --Sheng > Seth > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v4 18/21] fuse: Add support for pid namespaces 2016-07-20 12:52 ` Seth Forshee 2016-07-20 22:28 ` Sheng Yang @ 2016-07-21 7:25 ` Miklos Szeredi 1 sibling, 0 replies; 29+ messages in thread From: Miklos Szeredi @ 2016-07-21 7:25 UTC (permalink / raw) To: Seth Forshee Cc: Sheng Yang, Eric W. Biederman, Miklos Szeredi, Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Pavel Tikhomirov, kernel list, linux-bcache-u79uwXL29TY76Z2rM5mHXA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-raid-u79uwXL29TY76Z2rM5mHXA, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-fsdevel, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-security-module-u79uwXL29TY76Z2rM5mHXA, selinux-+05T5uksL2qpZYMLLGbcSA, cgroups-u79uwXL29TY76Z2rM5mHXA On Wed, Jul 20, 2016 at 2:52 PM, Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> wrote: > It sounds like we need to remove the restriction on accessing the > filesystem from a different pid namespace. I don't think this poses a > security problem. However there's no pid mapping that is usable by the > userspace fuse process, so what do we put in the fuse request? Probably > the only candidates are 0 and 0xffffffff. > > So a question for the fuse developers - is one value or the other > preferrable for fuse_in_header.pid when the pid cannot be mapped, and is > this going to cause problems for any fuse filesystems? I suspect that > few filesystems actually look at the pid anyway, and already for a > filesystem mounted in a pid namespace the values being given to > userspace won't be correct for the namespace of the fuse process. pid = 0 sounds good. The pid from the request is used for example to get the auxiliary group list by libfuse (fuse_req_getgroups()). That's not used by all filesystems and it will return an error in case it can't find the proc entry (which it won't for pid == 0). It would be nice if we could transfer the group list through the userspace/kernel protocol, since then it wouldn't depend on proc and on being in the same pid namespace. But that's another story. Thanks, Miklos ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> ` (8 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 18/21] fuse: Add support for pid namespaces Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 9 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Miklos Szeredi Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov In order to support mounts from namespaces other than init_user_ns, fuse must translate uids and gids to/from the userns of the process servicing requests on /dev/fuse. This patch does that, with a couple of restrictions on the namespace: - The userns for the fuse connection is fixed to the namespace from which /dev/fuse is opened. - The namespace must be the same as s_user_ns. These restrictions simplify the implementation by avoiding the need to pass around userns references and by allowing fuse to rely on the checks in inode_change_ok for ownership changes. Either restriction could be relaxed in the future if needed. For cuse the namespace used for the connection is also simply current_user_ns() at the time /dev/cuse is opened. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> --- fs/fuse/cuse.c | 3 ++- fs/fuse/dev.c | 13 ++++++++----- fs/fuse/dir.c | 14 +++++++------- fs/fuse/fuse_i.h | 6 +++++- fs/fuse/inode.c | 33 +++++++++++++++++++++------------ 5 files changed, 43 insertions(+), 26 deletions(-) diff --git a/fs/fuse/cuse.c b/fs/fuse/cuse.c index c5b6b7165489..98ebd0f4fd4c 100644 --- a/fs/fuse/cuse.c +++ b/fs/fuse/cuse.c @@ -48,6 +48,7 @@ #include <linux/stat.h> #include <linux/module.h> #include <linux/uio.h> +#include <linux/user_namespace.h> #include "fuse_i.h" @@ -498,7 +499,7 @@ static int cuse_channel_open(struct inode *inode, struct file *file) if (!cc) return -ENOMEM; - fuse_conn_init(&cc->fc); + fuse_conn_init(&cc->fc, current_user_ns()); fud = fuse_dev_alloc(&cc->fc); if (!fud) { diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 4e91b2ac25a7..8fa1ce934df3 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -127,8 +127,8 @@ static void __fuse_put_request(struct fuse_req *req) static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req) { - req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid()); - req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid()); + req->in.h.uid = from_kuid(fc->user_ns, current_fsuid()); + req->in.h.gid = from_kgid(fc->user_ns, current_fsgid()); req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns); } @@ -186,7 +186,8 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages, __set_bit(FR_WAITING, &req->flags); if (for_background) __set_bit(FR_BACKGROUND, &req->flags); - if (req->in.h.pid == 0) { + if (req->in.h.pid == 0 || req->in.h.uid == (uid_t)-1 || + req->in.h.gid == (gid_t)-1) { fuse_put_request(fc, req); return ERR_PTR(-EOVERFLOW); } @@ -1248,7 +1249,8 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, struct fuse_in *in; unsigned reqsize; - if (task_active_pid_ns(current) != fc->pid_ns) + if (task_active_pid_ns(current) != fc->pid_ns || + current_user_ns() != fc->user_ns) return -EIO; restart: @@ -1880,7 +1882,8 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud, struct fuse_req *req; struct fuse_out_header oh; - if (task_active_pid_ns(current) != fc->pid_ns) + if (task_active_pid_ns(current) != fc->pid_ns || + current_user_ns() != fc->user_ns) return -EIO; if (nbytes < sizeof(struct fuse_out_header)) diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 4b855b65d457..ecba75bf6640 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -841,8 +841,8 @@ static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr, stat->ino = attr->ino; stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777); stat->nlink = attr->nlink; - stat->uid = make_kuid(&init_user_ns, attr->uid); - stat->gid = make_kgid(&init_user_ns, attr->gid); + stat->uid = make_kuid(fc->user_ns, attr->uid); + stat->gid = make_kgid(fc->user_ns, attr->gid); stat->rdev = inode->i_rdev; stat->atime.tv_sec = attr->atime; stat->atime.tv_nsec = attr->atimensec; @@ -1459,17 +1459,17 @@ static bool update_mtime(unsigned ivalid, bool trust_local_mtime) return true; } -static void iattr_to_fattr(struct iattr *iattr, struct fuse_setattr_in *arg, - bool trust_local_cmtime) +static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr, + struct fuse_setattr_in *arg, bool trust_local_cmtime) { unsigned ivalid = iattr->ia_valid; if (ivalid & ATTR_MODE) arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode; if (ivalid & ATTR_UID) - arg->valid |= FATTR_UID, arg->uid = from_kuid(&init_user_ns, iattr->ia_uid); + arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid); if (ivalid & ATTR_GID) - arg->valid |= FATTR_GID, arg->gid = from_kgid(&init_user_ns, iattr->ia_gid); + arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid); if (ivalid & ATTR_SIZE) arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size; if (ivalid & ATTR_ATIME) { @@ -1629,7 +1629,7 @@ int fuse_do_setattr(struct inode *inode, struct iattr *attr, memset(&inarg, 0, sizeof(inarg)); memset(&outarg, 0, sizeof(outarg)); - iattr_to_fattr(attr, &inarg, trust_local_cmtime); + iattr_to_fattr(fc, attr, &inarg, trust_local_cmtime); if (file) { struct fuse_file *ff = file->private_data; inarg.valid |= FATTR_FH; diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 9145445a759a..9f4c3c82edd6 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -24,6 +24,7 @@ #include <linux/workqueue.h> #include <linux/kref.h> #include <linux/pid_namespace.h> +#include <linux/user_namespace.h> /** Max number of pages that can be used in a single read request */ #define FUSE_MAX_PAGES_PER_REQ 32 @@ -469,6 +470,9 @@ struct fuse_conn { /** The pid namespace for this mount */ struct pid_namespace *pid_ns; + /** The user namespace for this mount */ + struct user_namespace *user_ns; + /** The fuse mount flags for this mount */ unsigned flags; @@ -867,7 +871,7 @@ struct fuse_conn *fuse_conn_get(struct fuse_conn *fc); /** * Initialize fuse_conn */ -void fuse_conn_init(struct fuse_conn *fc); +void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns); /** * Release reference to fuse_conn diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index eade0bfa4488..0a771145d853 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -167,8 +167,8 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr, inode->i_ino = fuse_squash_ino(attr->ino); inode->i_mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777); set_nlink(inode, attr->nlink); - inode->i_uid = make_kuid(&init_user_ns, attr->uid); - inode->i_gid = make_kgid(&init_user_ns, attr->gid); + inode->i_uid = make_kuid(fc->user_ns, attr->uid); + inode->i_gid = make_kgid(fc->user_ns, attr->gid); inode->i_blocks = attr->blocks; inode->i_atime.tv_sec = attr->atime; inode->i_atime.tv_nsec = attr->atimensec; @@ -467,7 +467,8 @@ static int fuse_match_uint(substring_t *s, unsigned int *res) return err; } -static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev) +static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev, + struct user_namespace *user_ns) { char *p; memset(d, 0, sizeof(struct fuse_mount_data)); @@ -503,7 +504,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev) case OPT_USER_ID: if (fuse_match_uint(&args[0], &uv)) return 0; - d->user_id = make_kuid(current_user_ns(), uv); + d->user_id = make_kuid(user_ns, uv); if (!uid_valid(d->user_id)) return 0; d->user_id_present = 1; @@ -512,7 +513,7 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev) case OPT_GROUP_ID: if (fuse_match_uint(&args[0], &uv)) return 0; - d->group_id = make_kgid(current_user_ns(), uv); + d->group_id = make_kgid(user_ns, uv); if (!gid_valid(d->group_id)) return 0; d->group_id_present = 1; @@ -555,8 +556,10 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root) struct super_block *sb = root->d_sb; struct fuse_conn *fc = get_fuse_conn_super(sb); - seq_printf(m, ",user_id=%u", from_kuid_munged(&init_user_ns, fc->user_id)); - seq_printf(m, ",group_id=%u", from_kgid_munged(&init_user_ns, fc->group_id)); + seq_printf(m, ",user_id=%u", + from_kuid_munged(fc->user_ns, fc->user_id)); + seq_printf(m, ",group_id=%u", + from_kgid_munged(fc->user_ns, fc->group_id)); if (fc->flags & FUSE_DEFAULT_PERMISSIONS) seq_puts(m, ",default_permissions"); if (fc->flags & FUSE_ALLOW_OTHER) @@ -587,7 +590,7 @@ static void fuse_pqueue_init(struct fuse_pqueue *fpq) fpq->connected = 1; } -void fuse_conn_init(struct fuse_conn *fc) +void fuse_conn_init(struct fuse_conn *fc, struct user_namespace *user_ns) { memset(fc, 0, sizeof(*fc)); spin_lock_init(&fc->lock); @@ -611,6 +614,7 @@ void fuse_conn_init(struct fuse_conn *fc) fc->attr_version = 1; get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key)); fc->pid_ns = get_pid_ns(task_active_pid_ns(current)); + fc->user_ns = get_user_ns(user_ns); } EXPORT_SYMBOL_GPL(fuse_conn_init); @@ -620,6 +624,7 @@ void fuse_conn_put(struct fuse_conn *fc) if (fc->destroy_req) fuse_request_free(fc->destroy_req); put_pid_ns(fc->pid_ns); + put_user_ns(fc->user_ns); fc->release(fc); } } @@ -1046,7 +1051,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent) sb->s_flags &= ~(MS_NOSEC | MS_I_VERSION); - if (!parse_fuse_opt(data, &d, is_bdev)) + if (!parse_fuse_opt(data, &d, is_bdev, sb->s_user_ns)) goto err; if (is_bdev) { @@ -1070,8 +1075,12 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent) if (!file) goto err; - if ((file->f_op != &fuse_dev_operations) || - (file->f_cred->user_ns != &init_user_ns)) + /* + * Require mount to happen from the same user namespace which + * opened /dev/fuse to prevent potential attacks. + */ + if (file->f_op != &fuse_dev_operations || + file->f_cred->user_ns != sb->s_user_ns) goto err_fput; fc = kmalloc(sizeof(*fc), GFP_KERNEL); @@ -1079,7 +1088,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent) if (!fc) goto err_fput; - fuse_conn_init(fc); + fuse_conn_init(fc, sb->s_user_ns); fc->release = fuse_free_conn; fud = fuse_dev_alloc(fc); -- 2.7.4 ------------------------------------------------------------------------------ Find and fix application performance issues faster with Applications Manager Applications Manager provides deep performance insights into multiple tiers of your business applications. It resolves application problems quickly and reduces your MTTR. Get your free trial! https://ad.doubleclick.net/ddm/clk/302982198;130105516;z -- fuse-devel mailing list To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 20/21] fuse: Restrict allow_other to the superblock's namespace or a descendant 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (9 preceding siblings ...) [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> @ 2016-04-26 19:36 ` Seth Forshee 2016-04-26 19:36 ` [PATCH v4 21/21] fuse: Allow user namespace mounts Seth Forshee 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Miklos Szeredi Cc: Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups, Seth Forshee Unprivileged users are normally restricted from mounting with the allow_other option by system policy, but this could be bypassed for a mount done with user namespace root permissions. In such cases allow_other should not allow users outside the userns to access the mount as doing so would give the unprivileged user the ability to manipulate processes it would otherwise be unable to manipulate. Restrict allow_other to apply to users in the same userns used at mount or a descendant of that namespace. Also export current_in_userns() for use by fuse when built as a module. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Miklos Szeredi <mszeredi@redhat.com> --- fs/fuse/dir.c | 2 +- kernel/user_namespace.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index ecba75bf6640..1a6c5af49608 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -1015,7 +1015,7 @@ int fuse_allow_current_process(struct fuse_conn *fc) const struct cred *cred; if (fc->flags & FUSE_ALLOW_OTHER) - return 1; + return current_in_userns(fc->user_ns); cred = current_cred(); if (uid_eq(cred->euid, fc->user_id) && diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 68f594212759..fa2294e14b77 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -951,6 +951,7 @@ bool current_in_userns(const struct user_namespace *target_ns) } return false; } +EXPORT_SYMBOL(current_in_userns); static inline struct user_namespace *to_user_ns(struct ns_common *ns) { -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 21/21] fuse: Allow user namespace mounts 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee ` (10 preceding siblings ...) 2016-04-26 19:36 ` [PATCH v4 20/21] fuse: Restrict allow_other to the superblock's namespace or a descendant Seth Forshee @ 2016-04-26 19:36 ` Seth Forshee 11 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:36 UTC (permalink / raw) To: Eric W. Biederman, Miklos Szeredi Cc: Alexander Viro, Serge Hallyn, Richard Weinberger, Austin S Hemmelgarn, Miklos Szeredi, Pavel Tikhomirov, linux-kernel, linux-bcache, dm-devel, linux-raid, linux-mtd, linux-fsdevel, fuse-devel, linux-security-module, selinux, cgroups, Seth Forshee Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Miklos Szeredi <mszeredi@redhat.com> --- fs/fuse/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 0a771145d853..254f1944ee98 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1199,7 +1199,7 @@ static void fuse_kill_sb_anon(struct super_block *sb) static struct file_system_type fuse_fs_type = { .owner = THIS_MODULE, .name = "fuse", - .fs_flags = FS_HAS_SUBTYPE, + .fs_flags = FS_HAS_SUBTYPE | FS_USERNS_MOUNT, .mount = fuse_mount, .kill_sb = fuse_kill_sb_anon, }; @@ -1231,7 +1231,7 @@ static struct file_system_type fuseblk_fs_type = { .name = "fuseblk", .mount = fuse_mount_blk, .kill_sb = fuse_kill_sb_blk, - .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE, + .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE | FS_USERNS_MOUNT, }; MODULE_ALIAS_FS("fuseblk"); -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v4 00/21] Support fuse mounts in user namespaces @ 2016-04-26 19:30 Seth Forshee [not found] ` <1461699046-30485-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:30 UTC (permalink / raw) To: Eric W. Biederman, linux-bcache-u79uwXL29TY76Z2rM5mHXA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, linux-raid-u79uwXL29TY76Z2rM5mHXA, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, cgroups-u79uwXL29TY76Z2rM5mHXA, linux-security-module-u79uwXL29TY76Z2rM5mHXA, selinux-+05T5uksL2qpZYMLLGbcSA Cc: Serge Hallyn, Miklos Szeredi, Richard Weinberger, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Seth Forshee, Austin S Hemmelgarn, Alexander Viro, Pavel Tikhomirov Hi Eric, Here's another update to my patches for mouning with fuse from unpivileged user namespaces. The main change here is a fix for a build failure when fuse is built as a module. As usual the series is also available at: git://git.kernel.org/pub/scm/linux/kernel/git/sforshee/linux.git fuse-userns Changes since v3: * Export current_in_userns() to fix an error when fuse is built as a module. * Add comment explaining the conditions for allowing CAP_CHOWN in s_user_ns to change the owner or group of an inode. * Added acks from Serge. Thanks, Seth --- Andy Lutomirski (1): fs: Treat foreign mounts as nosuid Pavel Tikhomirov (1): fs: fix a posible leak of allocated superblock Seth Forshee (19): fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() fs: Allow sysfs and cgroupfs to share super blocks between user namespaces block_dev: Support checking inode permissions in lookup_bdev() block_dev: Check permissions towards block device inode when mounting selinux: Add support for unprivileged mounts from user namespaces userns: Replace in_userns with current_in_userns Smack: Handle labels consistently in untrusted mounts fs: Check for invalid i_uid in may_follow_link() cred: Reject inodes with invalid ids in set_create_file_as() fs: Refuse uid/gid changes which don't map into s_user_ns fs: Update posix_acl support to handle user namespace mounts fs: Allow superblock owner to change ownership of inodes with unmappable ids fs: Don't remove suid for CAP_FSETID in s_user_ns fs: Allow superblock owner to access do_remount_sb() capabilities: Allow privileged user in s_user_ns to set security.* xattrs fuse: Add support for pid namespaces fuse: Support fuse filesystems outside of init_user_ns fuse: Restrict allow_other to the superblock's namespace or a descendant fuse: Allow user namespace mounts drivers/md/bcache/super.c | 2 +- drivers/md/dm-table.c | 2 +- drivers/mtd/mtdsuper.c | 2 +- fs/attr.c | 73 ++++++++++++++++++++++++++++++++++++----- fs/block_dev.c | 18 ++++++++-- fs/exec.c | 2 +- fs/fuse/cuse.c | 3 +- fs/fuse/dev.c | 26 +++++++++++---- fs/fuse/dir.c | 16 ++++----- fs/fuse/file.c | 22 ++++++++++--- fs/fuse/fuse_i.h | 10 +++++- fs/fuse/inode.c | 40 ++++++++++++++-------- fs/inode.c | 3 +- fs/kernfs/inode.c | 2 ++ fs/namei.c | 2 +- fs/namespace.c | 20 ++++++++--- fs/posix_acl.c | 67 +++++++++++++++++++++++-------------- fs/proc/base.c | 2 ++ fs/proc/generic.c | 3 ++ fs/proc/proc_sysctl.c | 2 ++ fs/quota/quota.c | 2 +- fs/super.c | 7 +++- fs/sysfs/mount.c | 3 +- fs/xattr.c | 19 ++++++++--- include/linux/fs.h | 3 +- include/linux/mount.h | 1 + include/linux/posix_acl_xattr.h | 17 +++++++--- include/linux/uidgid.h | 10 ++++++ include/linux/user_namespace.h | 6 ++-- kernel/cgroup.c | 4 +-- kernel/cred.c | 2 ++ kernel/user_namespace.c | 7 ++-- security/commoncap.c | 22 +++++++++---- security/selinux/hooks.c | 25 +++++++++++++- security/smack/smack_lsm.c | 29 ++++++++++------ 35 files changed, 355 insertions(+), 119 deletions(-) _______________________________________________ Selinux mailing list Selinux-+05T5uksL2qpZYMLLGbcSA@public.gmane.org To unsubscribe, send email to Selinux-leave-+05T5uksL2pAGbPMOrvdOA@public.gmane.org To get help, send an email containing "help" to Selinux-request-+05T5uksL2pAGbPMOrvdOA@public.gmane.org ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <1461699046-30485-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>]
* [PATCH v4 18/21] fuse: Add support for pid namespaces [not found] ` <1461699046-30485-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> @ 2016-04-26 19:30 ` Seth Forshee 0 siblings, 0 replies; 29+ messages in thread From: Seth Forshee @ 2016-04-26 19:30 UTC (permalink / raw) To: Eric W. Biederman, Miklos Szeredi Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA, Serge Hallyn, Seth Forshee, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, Miklos Szeredi, Richard Weinberger, linux-security-module-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-raid-u79uwXL29TY76Z2rM5mHXA, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Austin S Hemmelgarn, linux-mtd-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Alexander Viro, selinux-+05T5uksL2qpZYMLLGbcSA, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA, Pavel Tikhomirov When the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd are not being translated into that process' namespace. Translation is necessary for the pid to be useful to that process. Since no use case currently exists for changing namespaces all translations can be done relative to the pid namespace in use when fuse_conn_init() is called. For fuse this translates to mount time, and for cuse this is when /dev/cuse is opened. IO for this connection from another namespace will return errors. Requests from processes whose pid cannot be translated into the target namespace are not permitted, except for requests allocated via fuse_get_req_nofail_nopages. For no-fail requests in.h.pid will be 0 if the pid translation fails. File locking changes based on previous work done by Eric Biederman. Signed-off-by: Seth Forshee <seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> Acked-by: Miklos Szeredi <mszeredi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/fuse/dev.c | 19 +++++++++++++++---- fs/fuse/file.c | 22 +++++++++++++++++----- fs/fuse/fuse_i.h | 4 ++++ fs/fuse/inode.c | 3 +++ 4 files changed, 39 insertions(+), 9 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index cbece1221417..4e91b2ac25a7 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -19,6 +19,7 @@ #include <linux/pipe_fs_i.h> #include <linux/swap.h> #include <linux/splice.h> +#include <linux/sched.h> MODULE_ALIAS_MISCDEV(FUSE_MINOR); MODULE_ALIAS("devname:fuse"); @@ -124,11 +125,11 @@ static void __fuse_put_request(struct fuse_req *req) atomic_dec(&req->count); } -static void fuse_req_init_context(struct fuse_req *req) +static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req) { req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid()); req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid()); - req->in.h.pid = current->pid; + req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns); } void fuse_set_initialized(struct fuse_conn *fc) @@ -181,10 +182,14 @@ static struct fuse_req *__fuse_get_req(struct fuse_conn *fc, unsigned npages, goto out; } - fuse_req_init_context(req); + fuse_req_init_context(fc, req); __set_bit(FR_WAITING, &req->flags); if (for_background) __set_bit(FR_BACKGROUND, &req->flags); + if (req->in.h.pid == 0) { + fuse_put_request(fc, req); + return ERR_PTR(-EOVERFLOW); + } return req; @@ -274,7 +279,7 @@ struct fuse_req *fuse_get_req_nofail_nopages(struct fuse_conn *fc, if (!req) req = get_reserved_req(fc, file); - fuse_req_init_context(req); + fuse_req_init_context(fc, req); __set_bit(FR_WAITING, &req->flags); __clear_bit(FR_BACKGROUND, &req->flags); return req; @@ -1243,6 +1248,9 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, struct fuse_in *in; unsigned reqsize; + if (task_active_pid_ns(current) != fc->pid_ns) + return -EIO; + restart: spin_lock(&fiq->waitq.lock); err = -EAGAIN; @@ -1872,6 +1880,9 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud, struct fuse_req *req; struct fuse_out_header oh; + if (task_active_pid_ns(current) != fc->pid_ns) + return -EIO; + if (nbytes < sizeof(struct fuse_out_header)) return -EINVAL; diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 719924d6c706..b5c616c5ec98 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2067,7 +2067,8 @@ static int fuse_direct_mmap(struct file *file, struct vm_area_struct *vma) return generic_file_mmap(file, vma); } -static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, +static int convert_fuse_file_lock(struct fuse_conn *fc, + const struct fuse_file_lock *ffl, struct file_lock *fl) { switch (ffl->type) { @@ -2082,7 +2083,14 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, fl->fl_start = ffl->start; fl->fl_end = ffl->end; - fl->fl_pid = ffl->pid; + + /* + * Convert pid into the caller's pid namespace. If the pid + * does not map into the namespace fl_pid will get set to 0. + */ + rcu_read_lock(); + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns)); + rcu_read_unlock(); break; default: @@ -2131,7 +2139,7 @@ static int fuse_getlk(struct file *file, struct file_lock *fl) args.out.args[0].value = &outarg; err = fuse_simple_request(fc, &args); if (!err) - err = convert_fuse_file_lock(&outarg.lk, fl); + err = convert_fuse_file_lock(fc, &outarg.lk, fl); return err; } @@ -2143,7 +2151,8 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) FUSE_ARGS(args); struct fuse_lk_in inarg; int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK; - pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0; + struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL; + pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns); int err; if (fl->fl_lmops && fl->fl_lmops->lm_grant) { @@ -2155,7 +2164,10 @@ static int fuse_setlk(struct file *file, struct file_lock *fl, int flock) if (fl->fl_flags & FL_CLOSE) return 0; - fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg); + if (pid && pid_nr == 0) + return -EOVERFLOW; + + fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg); err = fuse_simple_request(fc, &args); /* locking is restartable */ diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index eddbe02c4028..9145445a759a 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -23,6 +23,7 @@ #include <linux/poll.h> #include <linux/workqueue.h> #include <linux/kref.h> +#include <linux/pid_namespace.h> /** Max number of pages that can be used in a single read request */ #define FUSE_MAX_PAGES_PER_REQ 32 @@ -465,6 +466,9 @@ struct fuse_conn { /** The group id for this mount */ kgid_t group_id; + /** The pid namespace for this mount */ + struct pid_namespace *pid_ns; + /** The fuse mount flags for this mount */ unsigned flags; diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 1ce67668a8e1..eade0bfa4488 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -20,6 +20,7 @@ #include <linux/random.h> #include <linux/sched.h> #include <linux/exportfs.h> +#include <linux/pid_namespace.h> MODULE_AUTHOR("Miklos Szeredi <miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org>"); MODULE_DESCRIPTION("Filesystem in Userspace"); @@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc) fc->connected = 1; fc->attr_version = 1; get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key)); + fc->pid_ns = get_pid_ns(task_active_pid_ns(current)); } EXPORT_SYMBOL_GPL(fuse_conn_init); @@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc) if (atomic_dec_and_test(&fc->count)) { if (fc->destroy_req) fuse_request_free(fc->destroy_req); + put_pid_ns(fc->pid_ns); fc->release(fc); } } -- 2.7.4 _______________________________________________ Selinux mailing list Selinux-+05T5uksL2qpZYMLLGbcSA@public.gmane.org To unsubscribe, send email to Selinux-leave-+05T5uksL2pAGbPMOrvdOA@public.gmane.org To get help, send an email containing "help" to Selinux-request-+05T5uksL2pAGbPMOrvdOA@public.gmane.org ^ permalink raw reply related [flat|nested] 29+ messages in thread
end of thread, other threads:[~2016-07-21 7:25 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-04-26 19:36 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee 2016-04-26 19:36 ` [PATCH v4 01/21] fs: fix a posible leak of allocated superblock Seth Forshee 2016-04-26 19:36 ` [PATCH v4 04/21] block_dev: Support checking inode permissions in lookup_bdev() Seth Forshee 2016-04-26 19:36 ` [PATCH v4 05/21] block_dev: Check permissions towards block device inode when mounting Seth Forshee 2016-04-26 19:36 ` [PATCH v4 06/21] fs: Treat foreign mounts as nosuid Seth Forshee 2016-04-26 19:36 ` [PATCH v4 08/21] userns: Replace in_userns with current_in_userns Seth Forshee 2016-04-26 19:36 ` [PATCH v4 09/21] Smack: Handle labels consistently in untrusted mounts Seth Forshee 2016-04-26 19:36 ` [PATCH v4 10/21] fs: Check for invalid i_uid in may_follow_link() Seth Forshee 2016-05-24 15:55 ` Djalal Harouni 2016-04-26 19:36 ` [PATCH v4 11/21] cred: Reject inodes with invalid ids in set_create_file_as() Seth Forshee 2016-04-26 19:36 ` [PATCH v4 15/21] fs: Don't remove suid for CAP_FSETID in s_user_ns Seth Forshee [not found] ` <1461699396-33000-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-26 19:36 ` [PATCH v4 02/21] fs: Remove check of s_user_ns for existing mounts in fs_fully_visible() Seth Forshee 2016-04-26 19:36 ` [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces Seth Forshee 2016-04-26 19:36 ` [PATCH v4 07/21] selinux: Add support for unprivileged mounts from " Seth Forshee 2016-04-26 19:36 ` [PATCH v4 12/21] fs: Refuse uid/gid changes which don't map into s_user_ns Seth Forshee 2016-04-26 19:36 ` [PATCH v4 13/21] fs: Update posix_acl support to handle user namespace mounts Seth Forshee 2016-04-26 19:36 ` [PATCH v4 14/21] fs: Allow superblock owner to change ownership of inodes with unmappable ids Seth Forshee 2016-04-26 19:36 ` [PATCH v4 16/21] fs: Allow superblock owner to access do_remount_sb() Seth Forshee 2016-04-26 19:36 ` [PATCH v4 17/21] capabilities: Allow privileged user in s_user_ns to set security.* xattrs Seth Forshee [not found] ` <1461699396-33000-18-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-27 7:22 ` James Morris 2016-04-26 19:36 ` [PATCH v4 18/21] fuse: Add support for pid namespaces Seth Forshee 2016-07-20 2:44 ` Sheng Yang [not found] ` <CA+2rt426_pshAauQizcxkfAq16vmEpB4sJ4genW_ucosH3j=zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-07-20 12:52 ` Seth Forshee 2016-07-20 22:28 ` Sheng Yang 2016-07-21 7:25 ` Miklos Szeredi 2016-04-26 19:36 ` [PATCH v4 19/21] fuse: Support fuse filesystems outside of init_user_ns Seth Forshee 2016-04-26 19:36 ` [PATCH v4 20/21] fuse: Restrict allow_other to the superblock's namespace or a descendant Seth Forshee 2016-04-26 19:36 ` [PATCH v4 21/21] fuse: Allow user namespace mounts Seth Forshee -- strict thread matches above, loose matches on Subject: below -- 2016-04-26 19:30 [PATCH v4 00/21] Support fuse mounts in user namespaces Seth Forshee [not found] ` <1461699046-30485-1-git-send-email-seth.forshee-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> 2016-04-26 19:30 ` [PATCH v4 18/21] fuse: Add support for pid namespaces Seth Forshee
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).