From: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linux Containers
<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted
Date: Sat, 02 Nov 2013 14:06:27 +0800
Message-ID: <52749663.2000701@cn.fujitsu.com>
In-Reply-To: <878uzmhkqg.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Hi Eric,
On 08/28/2013 05:44 AM, Eric W. Biederman wrote:
>
> Rely on the fact that another flavor of the filesystem is already
> mounted and do not rely on state in the user namespace.
>
> Verify that the mounted filesystem is not covered in any significant
> way. I would love to verify that the previously mounted filesystem
> has no mounts on top but there are at least the directories
> /proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
> for other filesystems to mount on top of.
>
> Refactor the test into a function named fs_fully_visible and call that
> function from the mount routines of proc and sysfs. This makes this
> test local to the filesystems involved and the results current of when
> the mounts take place, removing a weird threading of the user
> namespace, the mount namespace and the filesystems themselves.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
> fs/namespace.c | 37 +++++++++++++++++++++++++------------
> fs/proc/root.c | 7 +++++--
> fs/sysfs/mount.c | 3 ++-
> include/linux/fs.h | 1 +
> include/linux/user_namespace.h | 4 ----
> kernel/user.c | 2 --
> kernel/user_namespace.c | 2 --
> 7 files changed, 33 insertions(+), 23 deletions(-)
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 64627f8..877e427 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2867,25 +2867,38 @@ bool current_chrooted(void)
> return chrooted;
> }
>
> -void update_mnt_policy(struct user_namespace *userns)
> +bool fs_fully_visible(struct file_system_type *type)
> {
> struct mnt_namespace *ns = current->nsproxy->mnt_ns;
> struct mount *mnt;
> + bool visible = false;
>
> - down_read(&namespace_sem);
> + if (unlikely(!ns))
> + return false;
> +
> + namespace_lock();
> list_for_each_entry(mnt, &ns->list, mnt_list) {
> - switch (mnt->mnt.mnt_sb->s_magic) {
> - case SYSFS_MAGIC:
> - userns->may_mount_sysfs = true;
> - break;
> - case PROC_SUPER_MAGIC:
> - userns->may_mount_proc = true;
> - break;
> + struct mount *child;
> + if (mnt->mnt.mnt_sb->s_type != type)
> + continue;
> +
> + /* This mount is not fully visible if there are any child mounts
> + * that cover anything except for empty directories.
> + */
> + list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
> + struct inode *inode = child->mnt_mountpoint->d_inode;
> + if (!S_ISDIR(inode->i_mode))
> + goto next;
> + if (inode->i_nlink != 2)
> + goto next;
I hit a problem where the proc filesystem fails to mount in a user namespace.
The cause is that the i_nlink of sysctl directories under proc is not 2: it is
always 1, even for a directory (see proc_sys_make_inode). On btrfs the i_nlink
of an empty directory is 1 too. So the value seems to depend on the filesystem
itself, not on the VFS. On my system binfmt_misc is mounted on
/proc/sys/fs/binfmt_misc, and the i_nlink of that directory's inode is 1.

BTW, I don't quite understand what inode->i_nlink != 2 is meant to check here.
Is it testing whether the directory is empty? As far as I know, creating a
regular file (not a directory) under a directory does not increase that
directory's i_nlink.

One more question: it looks like if proc or sysfs is not already mounted, then
mounting proc or sysfs will fail?
Thanks!
> }
> - if (userns->may_mount_sysfs && userns->may_mount_proc)
> - break;
> + visible = true;
> + goto found;
> + next: ;
> }
> - up_read(&namespace_sem);
> +found:
> + namespace_unlock();
> + return visible;
> }
>
> static void *mntns_get(struct task_struct *task)
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 38bd5d4..45e5fb7 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -110,8 +110,11 @@ static struct dentry *proc_mount(struct file_system_type *fs_type,
> ns = task_active_pid_ns(current);
> options = data;
>
> - if (!current_user_ns()->may_mount_proc ||
> - !ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> + if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type))
> + return ERR_PTR(-EPERM);
> +
> + /* Does the mounter have privilege over the pid namespace? */
> + if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
> return ERR_PTR(-EPERM);
> }
>
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index afd8327..4a2da3a 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -112,7 +112,8 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
> struct super_block *sb;
> int error;
>
> - if (!(flags & MS_KERNMOUNT) && !current_user_ns()->may_mount_sysfs)
> + if (!(flags & MS_KERNMOUNT) && !capable(CAP_SYS_ADMIN) &&
> + !fs_fully_visible(fs_type))
> return ERR_PTR(-EPERM);
>
> info = kzalloc(sizeof(*info), GFP_KERNEL);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9818747..3050c62 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1897,6 +1897,7 @@ extern int vfs_ustat(dev_t, struct kstatfs *);
> extern int freeze_super(struct super_block *super);
> extern int thaw_super(struct super_block *super);
> extern bool our_mnt(struct vfsmount *mnt);
> +extern bool fs_fully_visible(struct file_system_type *);
>
> extern int current_umask(void);
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index b6b215f..4ce0093 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -26,8 +26,6 @@ struct user_namespace {
> kuid_t owner;
> kgid_t group;
> unsigned int proc_inum;
> - bool may_mount_sysfs;
> - bool may_mount_proc;
> };
>
> extern struct user_namespace init_user_ns;
> @@ -84,6 +82,4 @@ static inline void put_user_ns(struct user_namespace *ns)
>
> #endif
>
> -void update_mnt_policy(struct user_namespace *userns);
> -
> #endif /* _LINUX_USER_H */
> diff --git a/kernel/user.c b/kernel/user.c
> index 69b4c3d..5bbb919 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -51,8 +51,6 @@ struct user_namespace init_user_ns = {
> .owner = GLOBAL_ROOT_UID,
> .group = GLOBAL_ROOT_GID,
> .proc_inum = PROC_USER_INIT_INO,
> - .may_mount_sysfs = true,
> - .may_mount_proc = true,
> };
> EXPORT_SYMBOL_GPL(init_user_ns);
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index d8c30db..d58ad1e 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -97,8 +97,6 @@ int create_user_ns(struct cred *new)
>
> set_cred_user_ns(new, ns);
>
> - update_mnt_policy(ns);
> -
> return 0;
> }
>
>