All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
Date: Fri, 25 Jan 2013 18:19:18 -0800	[thread overview]
Message-ID: <877gn0it3t.fsf@xmission.com> (raw)
In-Reply-To: <87ehh8it9s.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> (Eric W. Biederman's message of "Fri, 25 Jan 2013 18:15:43 -0800")


When freeing a deeply nested user namespace free_user_ns calls
put_user_ns on it's parent which may in turn call free_user_ns again.
When -fno-optimize-sibling-calls is passed to gcc one stack frame per
user namespace is left on the stack, potentially overflowing the
kernel stack.  CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
so we can't count on gcc to optimize this code.

Remove struct kref and use a plain atomic_t.  Making the code more
flexible and easier to comprehend.  Make the loop in free_user_ns
explict to guarantee that the stack does not overflow with
CONFIG_FRAME_POINTER enabled.

I have tested this fix with a simple program that uses unshare to
create a deeply nested user namespace structure and then calls exit.
With 1000 nesteuser namespaces before this change running my test
program causes the kernel to die a horrible death.  With 10,000,000
nested user namespaces after this change my test program runs to
completion and causes no harm.

Pointed-out-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
 include/linux/user_namespace.h |   10 +++++-----
 kernel/user.c                  |    4 +---
 kernel/user_namespace.c        |   17 +++++++++--------
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index b9bd2e6..4ce0093 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -21,7 +21,7 @@ struct user_namespace {
 	struct uid_gid_map	uid_map;
 	struct uid_gid_map	gid_map;
 	struct uid_gid_map	projid_map;
-	struct kref		kref;
+	atomic_t		count;
 	struct user_namespace	*parent;
 	kuid_t			owner;
 	kgid_t			group;
@@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 {
 	if (ns)
-		kref_get(&ns->kref);
+		atomic_inc(&ns->count);
 	return ns;
 }
 
 extern int create_user_ns(struct cred *new);
 extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
-extern void free_user_ns(struct kref *kref);
+extern void free_user_ns(struct user_namespace *ns);
 
 static inline void put_user_ns(struct user_namespace *ns)
 {
-	if (ns)
-		kref_put(&ns->kref, free_user_ns);
+	if (ns && atomic_dec_and_test(&ns->count))
+		free_user_ns(ns);
 }
 
 struct seq_operations;
diff --git a/kernel/user.c b/kernel/user.c
index 33acb5e..57ebfd4 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
 			.count = 4294967295U,
 		},
 	},
-	.kref = {
-		.refcount	= ATOMIC_INIT(3),
-	},
+	.count = ATOMIC_INIT(3),
 	.owner = GLOBAL_ROOT_UID,
 	.group = GLOBAL_ROOT_GID,
 	.proc_inum = PROC_USER_INIT_INO,
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2b042c4..24f8ec3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
 		return ret;
 	}
 
-	kref_init(&ns->kref);
+	atomic_set(&ns->count, 1);
 	/* Leave the new->user_ns reference with the new user namespace. */
 	ns->parent = parent_ns;
 	ns->owner = owner;
@@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
 	return create_user_ns(cred);
 }
 
-void free_user_ns(struct kref *kref)
+void free_user_ns(struct user_namespace *ns)
 {
-	struct user_namespace *parent, *ns =
-		container_of(kref, struct user_namespace, kref);
+	struct user_namespace *parent;
 
-	parent = ns->parent;
-	proc_free_inum(ns->proc_inum);
-	kmem_cache_free(user_ns_cachep, ns);
-	put_user_ns(parent);
+	do {
+		parent = ns->parent;
+		proc_free_inum(ns->proc_inum);
+		kmem_cache_free(user_ns_cachep, ns);
+		ns = parent;
+	} while (atomic_dec_and_test(&parent->count));
 }
 EXPORT_SYMBOL(free_user_ns);
 
-- 
1.7.5.4

WARNING: multiple messages have this Message-ID (diff)
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linux Containers <containers@lists.linux-foundation.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	<linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	Vasily Kulikov <segoon@openwall.com>
Subject: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
Date: Fri, 25 Jan 2013 18:19:18 -0800	[thread overview]
Message-ID: <877gn0it3t.fsf@xmission.com> (raw)
In-Reply-To: <87ehh8it9s.fsf@xmission.com> (Eric W. Biederman's message of "Fri, 25 Jan 2013 18:15:43 -0800")


When freeing a deeply nested user namespace free_user_ns calls
put_user_ns on it's parent which may in turn call free_user_ns again.
When -fno-optimize-sibling-calls is passed to gcc one stack frame per
user namespace is left on the stack, potentially overflowing the
kernel stack.  CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
so we can't count on gcc to optimize this code.

Remove struct kref and use a plain atomic_t.  Making the code more
flexible and easier to comprehend.  Make the loop in free_user_ns
explict to guarantee that the stack does not overflow with
CONFIG_FRAME_POINTER enabled.

I have tested this fix with a simple program that uses unshare to
create a deeply nested user namespace structure and then calls exit.
With 1000 nesteuser namespaces before this change running my test
program causes the kernel to die a horrible death.  With 10,000,000
nested user namespaces after this change my test program runs to
completion and causes no harm.

Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/user_namespace.h |   10 +++++-----
 kernel/user.c                  |    4 +---
 kernel/user_namespace.c        |   17 +++++++++--------
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index b9bd2e6..4ce0093 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -21,7 +21,7 @@ struct user_namespace {
 	struct uid_gid_map	uid_map;
 	struct uid_gid_map	gid_map;
 	struct uid_gid_map	projid_map;
-	struct kref		kref;
+	atomic_t		count;
 	struct user_namespace	*parent;
 	kuid_t			owner;
 	kgid_t			group;
@@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 {
 	if (ns)
-		kref_get(&ns->kref);
+		atomic_inc(&ns->count);
 	return ns;
 }
 
 extern int create_user_ns(struct cred *new);
 extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
-extern void free_user_ns(struct kref *kref);
+extern void free_user_ns(struct user_namespace *ns);
 
 static inline void put_user_ns(struct user_namespace *ns)
 {
-	if (ns)
-		kref_put(&ns->kref, free_user_ns);
+	if (ns && atomic_dec_and_test(&ns->count))
+		free_user_ns(ns);
 }
 
 struct seq_operations;
diff --git a/kernel/user.c b/kernel/user.c
index 33acb5e..57ebfd4 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
 			.count = 4294967295U,
 		},
 	},
-	.kref = {
-		.refcount	= ATOMIC_INIT(3),
-	},
+	.count = ATOMIC_INIT(3),
 	.owner = GLOBAL_ROOT_UID,
 	.group = GLOBAL_ROOT_GID,
 	.proc_inum = PROC_USER_INIT_INO,
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2b042c4..24f8ec3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
 		return ret;
 	}
 
-	kref_init(&ns->kref);
+	atomic_set(&ns->count, 1);
 	/* Leave the new->user_ns reference with the new user namespace. */
 	ns->parent = parent_ns;
 	ns->owner = owner;
@@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
 	return create_user_ns(cred);
 }
 
-void free_user_ns(struct kref *kref)
+void free_user_ns(struct user_namespace *ns)
 {
-	struct user_namespace *parent, *ns =
-		container_of(kref, struct user_namespace, kref);
+	struct user_namespace *parent;
 
-	parent = ns->parent;
-	proc_free_inum(ns->proc_inum);
-	kmem_cache_free(user_ns_cachep, ns);
-	put_user_ns(parent);
+	do {
+		parent = ns->parent;
+		proc_free_inum(ns->proc_inum);
+		kmem_cache_free(user_ns_cachep, ns);
+		ns = parent;
+	} while (atomic_dec_and_test(&parent->count));
 }
 EXPORT_SYMBOL(free_user_ns);
 
-- 
1.7.5.4


  parent reply	other threads:[~2013-01-26  2:19 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-26  2:15 [PATCH review 0/6] miscelaneous user namespace patches Eric W. Biederman
2013-01-26  2:15 ` Eric W. Biederman
     [not found] ` <87ehh8it9s.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26  2:19   ` Eric W. Biederman [this message]
2013-01-26  2:19     ` [PATCH review 1/6] userns: Avoid recursion in put_user_ns Eric W. Biederman
     [not found]     ` <877gn0it3t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 20:58       ` Serge E. Hallyn
2013-01-28 14:51       ` Vasily Kulikov
2013-01-28 14:51         ` Vasily Kulikov
2013-01-28 16:34         ` Eric W. Biederman
2013-01-28 16:34           ` Eric W. Biederman
2013-01-26 20:58     ` Serge E. Hallyn
2013-01-26  2:21   ` [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap Eric W. Biederman
2013-01-26  2:21     ` Eric W. Biederman
     [not found]     ` <87zjzwhegj.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:08       ` Serge E. Hallyn
2013-01-26 21:08         ` Serge E. Hallyn
2013-01-28 14:28       ` Aristeu Rozanski
2013-01-28 14:28     ` Aristeu Rozanski
     [not found]       ` <20130128142816.GU17632-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-28 14:41         ` Lord Glauber Costa of Sealand
2013-01-28 14:41       ` Lord Glauber Costa of Sealand
     [not found]         ` <51068E23.5040000-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-01-28 15:12           ` Aristeu Rozanski
2013-01-28 15:12             ` Aristeu Rozanski
2013-01-26  2:22   ` [PATCH review 3/6] userns: Recommend use of memory control groups Eric W. Biederman
2013-01-26  2:22     ` Eric W. Biederman
     [not found]     ` <87txq4hedl.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:13       ` Serge E. Hallyn
2013-01-26 21:13         ` Serge E. Hallyn
     [not found]         ` <20130126211312.GD11274-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-01-27  6:19           ` Eric W. Biederman
2013-01-27  6:19             ` Eric W. Biederman
2013-01-28  7:37       ` Lord Glauber Costa of Sealand
2013-01-28  7:37     ` Lord Glauber Costa of Sealand
     [not found]       ` <51062AB5.9060203-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-01-28  7:50         ` Lord Glauber Costa of Sealand
2013-01-28  7:50           ` Lord Glauber Costa of Sealand
     [not found]           ` <51062DA8.1060804-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-01-28  8:14             ` Eric W. Biederman
2013-01-28  8:14           ` Eric W. Biederman
     [not found]             ` <87k3qxu3kp.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-28  8:22               ` Lord Glauber Costa of Sealand
2013-01-28  8:22                 ` Lord Glauber Costa of Sealand
     [not found]                 ` <51063558.1010402-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-01-28 16:19                   ` Eric W. Biederman
2013-01-28 16:19                     ` Eric W. Biederman
     [not found]                     ` <87k3qxs2ko.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-28 16:37                       ` Lord Glauber Costa of Sealand
2013-01-28 16:37                         ` Lord Glauber Costa of Sealand
     [not found]                         ` <5106A941.6060403-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-01-28 17:18                           ` Eric W. Biederman
2013-01-28 17:18                         ` Eric W. Biederman
2013-01-28  8:05         ` Eric W. Biederman
2013-01-28  8:05           ` Eric W. Biederman
2013-01-26  2:23   ` [PATCH review 4/6] userns: Allow the userns root to mount of devpts Eric W. Biederman
2013-01-26  2:23     ` Eric W. Biederman
     [not found]     ` <87obgchecv.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:22       ` Serge E. Hallyn
2013-01-26 21:22         ` Serge E. Hallyn
2013-01-26  2:26   ` [PATCH review 5/6] userns: Allow the userns root to mount ramfs Eric W. Biederman
2013-01-26  2:26     ` Eric W. Biederman
2013-01-27 18:23     ` Serge E. Hallyn
     [not found]     ` <87ip6khe7w.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:29       ` Serge E. Hallyn
2013-01-26 21:29         ` Serge E. Hallyn
     [not found]         ` <20130126212918.GG11274-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2013-01-27  6:09           ` Eric W. Biederman
2013-01-27  6:09             ` Eric W. Biederman
     [not found]             ` <87bocb5f8a.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-27 18:23               ` Serge E. Hallyn
2013-01-27 18:23                 ` Serge E. Hallyn
2013-01-27 18:23       ` Serge E. Hallyn
2013-01-26  2:26   ` [PATCH review 6/6] userns: Allow the userns root to mount tmpfs Eric W. Biederman
2013-01-26  2:26     ` Eric W. Biederman
     [not found]     ` <87d2wshe6v.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-27 18:23       ` Serge E. Hallyn
2013-01-27 18:23         ` Serge E. Hallyn
2013-01-28  1:28       ` Gao feng
2013-01-28  1:28     ` Gao feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877gn0it3t.fsf@xmission.com \
    --to=ebiederm-as9lmozglivwk0htik3j/w@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.