From: ebiederm@xmission.com (Eric W. Biederman)
To: Benjamin Thery <benjamin.thery@bull.net>
Cc: linux-kernel@vger.kernel.org, Tejun Heo <htejun@gmail.com>,
Greg Kroah-Hartman <gregkh@suse.de>,
Daniel Lezcano <dlezcano@fr.ibm.com>,
"Serge E. Hallyn" <serue@us.ibm.com>,
Pavel Emelyanov <xemul@openvz.org>,
netdev@vger.kernel.org
Subject: Re: [PATCH 10/10] sysfs: user namespaces: add ns to user_struct
Date: Tue, 29 Apr 2008 12:51:29 -0700 [thread overview]
Message-ID: <m17ieg1ltq.fsf@frodo.ebiederm.org> (raw)
In-Reply-To: <20080429171703.529721456@theryb.frec.bull.fr> (Benjamin Thery's message of "Tue, 29 Apr 2008 19:11:55 +0200")
Benjamin Thery <benjamin.thery@bull.net> writes:
> Add the user_namespace to user_struct.
>
> Add a user_namespace to the sysfs_tag struct.
>
> Mark the /sys/kernel/uids directory to be tagged so that processes in
> different user namespaces can remount /sys and see their own uid
> listings.
>
> Without this patch, having CONFIG_FAIR_SCHED=y makes user namespaces
> unusable, because when you
> clone(CLONE_NEWUSER)
> it will auto-create the root userid and try to create
> /sys/kernel/uids/0. Since that already exists from the parent user
> namespace, the create fails, and the clone misleadingly ends up
> returning -ENOMEM.
>
> This patch fixes the issue by allowing each user namespace to remount
> /sys, and having /sys filter the /sys/kernel/uid/ entries by user
> namespace.
There is a lot of promising code in this patch. This patch appears to
be building up a lot of userns infrastructure that is only indirectly
related to sysfs, and it still appears to have some debugging code
present.
As a proof of concept patch (which I asked for) this looks good.
I do think this patch should get split into logical changes
and cleaned up a bit before it gets merged into mainline.
More comments inline below.
Eric
> Index: linux-mm/fs/sysfs/mount.c
> ===================================================================
> --- linux-mm.orig/fs/sysfs/mount.c
> +++ linux-mm/fs/sysfs/mount.c
> @@ -17,6 +17,7 @@
> #include <linux/pagemap.h>
> #include <linux/init.h>
> #include <linux/nsproxy.h>
> +#include <linux/user_namespace.h>
> #include <net/net_namespace.h>
>
> #include "sysfs.h"
> @@ -81,6 +82,7 @@ static int sysfs_fill_super(struct super
> sb->s_root = root;
> sb->s_fs_info = info;
> info->tag.net_ns = hold_net(current->nsproxy->net_ns);
> + info->tag.user_ns = current->nsproxy->user_ns;
> return 0;
>
> out_err:
> @@ -100,6 +102,8 @@ static int sysfs_test_super(struct super
>
> if (task->nsproxy->net_ns != info->tag.net_ns)
> found = 0;
> + if (task->nsproxy->user_ns != info->tag.user_ns)
> + found = 0;
>
> return found;
> }
> @@ -214,6 +218,27 @@ static struct pernet_operations sysfs_ne
> };
> #endif
>
> +#ifdef CONFIG_USER_NS
> +void sysfs_userns_exit(struct user_namespace *user_ns)
> +{
> + /* Allow the net namespace to go away while sysfs is still mounted. */
> + struct super_block *sb;
> + printk(KERN_NOTICE "sysfs: user namespace exiting\n");
Debugging?
> + mutex_lock(&sysfs_rename_mutex);
> + sysfs_grab_supers();
> + mutex_lock(&sysfs_mutex);
> + list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
> + struct sysfs_super_info *info = sysfs_info(sb);
> + if (info->tag.user_ns != user_ns)
> + continue;
> + info->tag.user_ns = NULL;
> + }
> + mutex_unlock(&sysfs_mutex);
> + sysfs_release_supers();
> + mutex_unlock(&sysfs_rename_mutex);
> +}
> +#endif
> +
> int __init sysfs_init(void)
> {
> int err = -ENOMEM;
> Index: linux-mm/include/linux/sched.h
> ===================================================================
> --- linux-mm.orig/include/linux/sched.h
> +++ linux-mm/include/linux/sched.h
> @@ -598,7 +598,7 @@ struct user_struct {
>
> /* Hash table maintenance information */
> struct hlist_node uidhash_node;
> - uid_t uid;
> + struct k_uid_t uid;
If we are going to go this direction my inclination
is to include an array of a single element in user_struct.
Maybe that makes sense. I just know we need to talk about
how a user maps into different user namespaces. As that
is a real concept that really occurs in real filesystems
like nfsv4 and p9fs, and having infrastructure that can
deal with the concept (even if it doesn't support it yet) would be
useful.
> #endif /* _SYSFS_H_ */
> Index: linux-mm/include/linux/types.h
> ===================================================================
> --- linux-mm.orig/include/linux/types.h
> +++ linux-mm/include/linux/types.h
> @@ -37,6 +37,11 @@ typedef __kernel_gid32_t gid_t;
> typedef __kernel_uid16_t uid16_t;
> typedef __kernel_gid16_t gid16_t;
>
> +struct k_uid_t {
> + uid_t uid;
> + struct user_namespace *ns;
> +};
Weird a _t suffix in a structure name. Usually _t indicates
that something is a typedef.
> +
> typedef unsigned long uintptr_t;
>
> #ifdef CONFIG_UID16
> Index: linux-mm/include/linux/user_namespace.h
> ===================================================================
> --- linux-mm.orig/include/linux/user_namespace.h
> +++ linux-mm/include/linux/user_namespace.h
> @@ -12,10 +12,15 @@
> struct user_namespace {
> struct kref kref;
> struct hlist_head uidhash_table[UIDHASH_SZ];
> + struct kset *kset;
> struct user_struct *root_user;
> };
>
> extern struct user_namespace init_user_ns;
> +extern int register_user_ns_kobj(struct user_namespace *ns);
> +extern void unregister_user_ns_kobj(struct user_namespace *ns);
> +extern int register_user_ns_kobj(struct user_namespace *ns);
> +extern void unregister_user_ns_kobj(struct user_namespace *ns);
>
> #ifdef CONFIG_USER_NS
>
> Index: linux-mm/kernel/user.c
> ===================================================================
> --- linux-mm.orig/kernel/user.c
> +++ linux-mm/kernel/user.c
> @@ -53,6 +53,10 @@ struct user_struct root_user = {
> .files = ATOMIC_INIT(0),
> .sigpending = ATOMIC_INIT(0),
> .locked_shm = 0,
> + .uid = {
> + .uid = 0,
> + .ns = &init_user_ns,
> + },
> #ifdef CONFIG_USER_SCHED
> .tg = &init_task_group,
> #endif
> @@ -71,13 +75,23 @@ static void uid_hash_remove(struct user_
> hlist_del_init(&up->uidhash_node);
> }
>
> -static struct user_struct *uid_hash_find(uid_t uid, struct hlist_head *hashent)
> +int task_user_equiv(struct task_struct *tsk, struct user_struct *u)
> +{
> + if (tsk->uid != u->uid.uid)
> + return 0;
> + if (tsk->nsproxy->user_ns != u->uid.ns)
> + return 0;
> + return 1;
> +}
> +
> +static struct user_struct *uid_hash_find(uid_t uid,
> + struct hlist_head *hashent)
> {
> struct user_struct *user;
> struct hlist_node *h;
>
> hlist_for_each_entry(user, h, hashent, uidhash_node) {
> - if (user->uid == uid) {
> + if (user->uid.uid == uid) {
> atomic_inc(&user->__count);
> return user;
> }
> @@ -236,6 +250,23 @@ static void uids_release(struct kobject
> return;
> }
>
> +static const void *userns_sb_tag(struct sysfs_tag_info *info)
> +{
> + return info->user_ns;
> +}
> +
> +static const void *userns_kobject_tag(struct kobject *kobj)
> +{
> + struct user_struct *up;
> + up = container_of(kobj, struct user_struct, kobj);
> + return up->uid.ns;
> +}
> +
> +static struct sysfs_tagged_dir_operations userns_tagged_dir_operations = {
> + .sb_tag = userns_sb_tag,
> + .kobject_tag = userns_kobject_tag,
> +};
> +
> static struct kobj_type uids_ktype = {
> .sysfs_ops = &kobj_sysfs_ops,
> .default_attrs = uids_attributes,
> @@ -246,19 +277,19 @@ static struct kobj_type uids_ktype = {
> static int uids_user_create(struct user_struct *up)
> {
> struct kobject *kobj = &up->kobj;
> - int error;
> + int err;
>
> memset(kobj, 0, sizeof(struct kobject));
> kobj->kset = uids_kset;
> - error = kobject_init_and_add(kobj, &uids_ktype, NULL, "%d", up->uid);
> - if (error) {
> + err = kobject_init_and_add(kobj, &uids_ktype, NULL, "%d", up->uid.uid);
> + if (err) {
> kobject_put(kobj);
> goto done;
> }
>
> kobject_uevent(kobj, KOBJ_ADD);
> done:
> - return error;
> + return err;
> }
>
> /* create these entries in sysfs:
> @@ -271,7 +302,7 @@ int __init uids_sysfs_init(void)
> uids_kset = kset_create_and_add("uids", NULL, kernel_kobj);
> if (!uids_kset)
> return -ENOMEM;
> -
> + sysfs_enable_tagging(&uids_kset->kobj, &userns_tagged_dir_operations);
> return uids_user_create(&root_user);
> }
>
> @@ -403,7 +434,8 @@ struct user_struct *alloc_uid(struct use
> if (!new)
> goto out_unlock;
>
> - new->uid = uid;
> + new->uid.uid = uid;
> + new->uid.ns = ns;
> atomic_set(&new->__count, 1);
>
> if (sched_create_user(new) < 0)
> Index: linux-mm/kernel/user_namespace.c
> ===================================================================
> --- linux-mm.orig/kernel/user_namespace.c
> +++ linux-mm/kernel/user_namespace.c
> @@ -22,7 +22,7 @@ static struct user_namespace *clone_user
> struct user_struct *new_user;
> int n;
>
> - ns = kmalloc(sizeof(struct user_namespace), GFP_KERNEL);
> + ns = kzalloc(sizeof(struct user_namespace), GFP_KERNEL);
> if (!ns)
> return ERR_PTR(-ENOMEM);
>
> @@ -71,6 +71,7 @@ void free_user_ns(struct kref *kref)
> struct user_namespace *ns;
>
> ns = container_of(kref, struct user_namespace, kref);
> + sysfs_userns_exit(ns);
> release_uids(ns);
> kfree(ns);
> }
next prev parent reply other threads:[~2008-04-29 19:59 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-29 17:10 [PATCH 00/10] sysfs tagged directories Benjamin Thery
2008-04-29 17:10 ` [PATCH 01/10] sysfs: Support for preventing unmounts Benjamin Thery
2008-04-29 17:10 ` [PATCH 02/10] sysfs: sysfs_get_dentry add a sb parameter Benjamin Thery
2008-04-29 17:10 ` [PATCH 03/10] sysfs: Implement __sysfs_get_dentry Benjamin Thery
2008-04-29 17:10 ` [PATCH 04/10] sysfs: Rename Support multiple superblocks Benjamin Thery
2008-04-29 17:11 ` [PATCH 05/10] sysfs: sysfs_chmod_file handle " Benjamin Thery
2008-04-29 17:11 ` [PATCH 06/10] sysfs: Implement sysfs tagged directory support Benjamin Thery
2008-04-29 17:11 ` [PATCH 07/10] sysfs: Implement sysfs_delete_link and sysfs_rename_link Benjamin Thery
2008-04-29 17:11 ` [PATCH 08/10] driver core: Implement tagged directory support for device classes Benjamin Thery
2008-04-29 17:11 ` [PATCH 09/10] net: netns: Enable tagging for net_class directories in sysfs Benjamin Thery
2008-04-29 17:11 ` [PATCH 10/10] sysfs: user namespaces: add ns to user_struct Benjamin Thery
2008-04-29 19:51 ` Eric W. Biederman [this message]
2008-04-29 23:18 ` Serge E. Hallyn
2008-04-29 17:36 ` [PATCH 00/10] sysfs tagged directories Greg KH
2008-04-29 18:04 ` Serge E. Hallyn
2008-04-29 18:41 ` Greg KH
2008-04-29 19:34 ` Serge E. Hallyn
2008-04-29 20:10 ` Eric W. Biederman
2008-05-01 3:12 ` Greg KH
2008-05-01 3:13 ` Greg KH
2008-05-01 15:10 ` Serge E. Hallyn
2008-05-01 18:34 ` Eric W. Biederman
2008-05-01 21:05 ` Serge E. Hallyn
2008-05-01 21:58 ` Eric W. Biederman
2008-05-02 17:42 ` Serge E. Hallyn
2008-05-04 23:13 ` Daniel Lezcano
2008-05-05 16:18 ` Serge E. Hallyn
2008-05-06 16:53 ` Benjamin Thery
2008-04-29 19:35 ` Eric W. Biederman
2008-04-29 19:14 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m17ieg1ltq.fsf@frodo.ebiederm.org \
--to=ebiederm@xmission.com \
--cc=benjamin.thery@bull.net \
--cc=dlezcano@fr.ibm.com \
--cc=gregkh@suse.de \
--cc=htejun@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=serue@us.ibm.com \
--cc=xemul@openvz.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox