From: Gao feng <gaofeng@cn.fujitsu.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>,
"Serge E. Hallyn" <serge@hallyn.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Andy Lutomirski <luto@amacapital.net>
Subject: Re: [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted
Date: Fri, 08 Nov 2013 10:33:44 +0800
Message-ID: <527C4D88.10907@cn.fujitsu.com>
In-Reply-To: <52749663.2000701@cn.fujitsu.com>
On 11/02/2013 02:06 PM, Gao feng wrote:
> Hi Eric,
>
> On 08/28/2013 05:44 AM, Eric W. Biederman wrote:
>>
>> Rely on the fact that another flavor of the filesystem is already
>> mounted and do not rely on state in the user namespace.
>>
>> Verify that the mounted filesystem is not covered in any significant
>> way. I would love to verify that the previously mounted filesystem
>> has no mounts on top but there are at least the directories
>> /proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
>> for other filesystems to mount on top of.
>>
>> Refactor the test into a function named fs_fully_visible and call that
>> function from the mount routines of proc and sysfs. This makes this
>> test local to the filesystems involved and the results current of when
>> the mounts take place, removing a weird threading of the user
>> namespace, the mount namespace and the filesystems themselves.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>> fs/namespace.c | 37 +++++++++++++++++++++++++------------
>> fs/proc/root.c | 7 +++++--
>> fs/sysfs/mount.c | 3 ++-
>> include/linux/fs.h | 1 +
>> include/linux/user_namespace.h | 4 ----
>> kernel/user.c | 2 --
>> kernel/user_namespace.c | 2 --
>> 7 files changed, 33 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/namespace.c b/fs/namespace.c
>> index 64627f8..877e427 100644
>> --- a/fs/namespace.c
>> +++ b/fs/namespace.c
>> @@ -2867,25 +2867,38 @@ bool current_chrooted(void)
>>  	return chrooted;
>>  }
>>
>> -void update_mnt_policy(struct user_namespace *userns)
>> +bool fs_fully_visible(struct file_system_type *type)
>>  {
>>  	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
>>  	struct mount *mnt;
>> +	bool visible = false;
>>
>> -	down_read(&namespace_sem);
>> +	if (unlikely(!ns))
>> +		return false;
>> +
>> +	namespace_lock();
>>  	list_for_each_entry(mnt, &ns->list, mnt_list) {
>> -		switch (mnt->mnt.mnt_sb->s_magic) {
>> -		case SYSFS_MAGIC:
>> -			userns->may_mount_sysfs = true;
>> -			break;
>> -		case PROC_SUPER_MAGIC:
>> -			userns->may_mount_proc = true;
>> -			break;
>> +		struct mount *child;
>> +		if (mnt->mnt.mnt_sb->s_type != type)
>> +			continue;
>> +
>> +		/* This mount is not fully visible if there are any child mounts
>> +		 * that cover anything except for empty directories.
>> +		 */
>> +		list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
>> +			struct inode *inode = child->mnt_mountpoint->d_inode;
>> +			if (!S_ISDIR(inode->i_mode))
>> +				goto next;
>> +			if (inode->i_nlink != 2)
>> +				goto next;
>
>
> I ran into a problem where the proc filesystem fails to mount in a user
> namespace. The i_nlink of sysctl entries under proc is not 2; it is
> always 1, even for a directory (see proc_sys_make_inode). On btrfs, the
> i_nlink of an empty directory is 1 too. So the value seems to depend on
> the filesystem itself, not on the VFS. On my system binfmt_misc is
> mounted on /proc/sys/fs/binfmt_misc, and the i_nlink of that directory's
> inode is 1.
>
> By the way, I don't quite understand what the inode->i_nlink != 2 check
> here means. Is it testing whether the directory is empty? As far as I
> know, creating a regular file (not a directory) under a directory does
> not increase the directory's i_nlink.
>
> Another question: it looks like if proc/sysfs is not already mounted,
> then mounting proc/sysfs will fail?
>
Any ideas? Or should we revert this patch?
Thread overview: 22+ messages
2013-08-27 21:44 [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted Eric W. Biederman
2013-08-27 21:46 ` [REVIEW][PATCH 2/2] sysfs: Restrict mounting sysfs Eric W. Biederman
2013-08-28 19:00 ` Greg Kroah-Hartman
2013-09-23 10:33 ` James Hogan
2013-09-23 21:41 ` [PATCH] sysfs: Allow mounting without CONFIG_NET Eric W. Biederman
2013-09-24 11:25 ` James Hogan
2013-08-27 21:47 ` [REVIEW][PATCH 1/2] userns: Better restrictions on when proc and sysfs can be mounted Andy Lutomirski
2013-08-27 21:57 ` Eric W. Biederman
2013-09-01 4:45 ` Eric W. Biederman
2013-09-03 17:40 ` Andy Lutomirski
2013-11-02 6:06 ` Gao feng
2013-11-04 7:00 ` Janne Karhunen
2013-11-09 5:22 ` Eric W. Biederman
2013-11-08 2:33 ` Gao feng [this message]
2013-11-09 5:42 ` Eric W. Biederman
2013-11-13 7:26 ` Gao feng
2013-11-14 11:10 ` Gao feng
2013-11-14 16:54 ` Andy Lutomirski
2013-11-15 1:16 ` Gao feng
2013-11-15 4:54 ` Eric W. Biederman
2013-11-15 6:14 ` Gao feng
2013-11-15 8:37 ` Eric W. Biederman