linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linux Containers <containers@lists.linux-foundation.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	<linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
	Vasily Kulikov <segoon@openwall.com>
Subject: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
Date: Fri, 25 Jan 2013 18:19:18 -0800	[thread overview]
Message-ID: <877gn0it3t.fsf@xmission.com> (raw)
In-Reply-To: <87ehh8it9s.fsf@xmission.com> (Eric W. Biederman's message of "Fri, 25 Jan 2013 18:15:43 -0800")


When freeing a deeply nested user namespace free_user_ns calls
put_user_ns on it's parent which may in turn call free_user_ns again.
When -fno-optimize-sibling-calls is passed to gcc one stack frame per
user namespace is left on the stack, potentially overflowing the
kernel stack.  CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
so we can't count on gcc to optimize this code.

Remove struct kref and use a plain atomic_t.  Making the code more
flexible and easier to comprehend.  Make the loop in free_user_ns
explict to guarantee that the stack does not overflow with
CONFIG_FRAME_POINTER enabled.

I have tested this fix with a simple program that uses unshare to
create a deeply nested user namespace structure and then calls exit.
With 1000 nesteuser namespaces before this change running my test
program causes the kernel to die a horrible death.  With 10,000,000
nested user namespaces after this change my test program runs to
completion and causes no harm.

Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/user_namespace.h |   10 +++++-----
 kernel/user.c                  |    4 +---
 kernel/user_namespace.c        |   17 +++++++++--------
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index b9bd2e6..4ce0093 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -21,7 +21,7 @@ struct user_namespace {
 	struct uid_gid_map	uid_map;
 	struct uid_gid_map	gid_map;
 	struct uid_gid_map	projid_map;
-	struct kref		kref;
+	atomic_t		count;
 	struct user_namespace	*parent;
 	kuid_t			owner;
 	kgid_t			group;
@@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 {
 	if (ns)
-		kref_get(&ns->kref);
+		atomic_inc(&ns->count);
 	return ns;
 }
 
 extern int create_user_ns(struct cred *new);
 extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
-extern void free_user_ns(struct kref *kref);
+extern void free_user_ns(struct user_namespace *ns);
 
 static inline void put_user_ns(struct user_namespace *ns)
 {
-	if (ns)
-		kref_put(&ns->kref, free_user_ns);
+	if (ns && atomic_dec_and_test(&ns->count))
+		free_user_ns(ns);
 }
 
 struct seq_operations;
diff --git a/kernel/user.c b/kernel/user.c
index 33acb5e..57ebfd4 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
 			.count = 4294967295U,
 		},
 	},
-	.kref = {
-		.refcount	= ATOMIC_INIT(3),
-	},
+	.count = ATOMIC_INIT(3),
 	.owner = GLOBAL_ROOT_UID,
 	.group = GLOBAL_ROOT_GID,
 	.proc_inum = PROC_USER_INIT_INO,
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2b042c4..24f8ec3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
 		return ret;
 	}
 
-	kref_init(&ns->kref);
+	atomic_set(&ns->count, 1);
 	/* Leave the new->user_ns reference with the new user namespace. */
 	ns->parent = parent_ns;
 	ns->owner = owner;
@@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
 	return create_user_ns(cred);
 }
 
-void free_user_ns(struct kref *kref)
+void free_user_ns(struct user_namespace *ns)
 {
-	struct user_namespace *parent, *ns =
-		container_of(kref, struct user_namespace, kref);
+	struct user_namespace *parent;
 
-	parent = ns->parent;
-	proc_free_inum(ns->proc_inum);
-	kmem_cache_free(user_ns_cachep, ns);
-	put_user_ns(parent);
+	do {
+		parent = ns->parent;
+		proc_free_inum(ns->proc_inum);
+		kmem_cache_free(user_ns_cachep, ns);
+		ns = parent;
+	} while (atomic_dec_and_test(&parent->count));
 }
 EXPORT_SYMBOL(free_user_ns);
 
-- 
1.7.5.4


  reply	other threads:[~2013-01-26  2:19 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-26  2:15 [PATCH review 0/6] miscelaneous user namespace patches Eric W. Biederman
2013-01-26  2:19 ` Eric W. Biederman [this message]
2013-01-26 20:58   ` [PATCH review 1/6] userns: Avoid recursion in put_user_ns Serge E. Hallyn
2013-01-28 14:51   ` Vasily Kulikov
2013-01-28 16:34     ` Eric W. Biederman
2013-01-26  2:21 ` [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap Eric W. Biederman
2013-01-26 21:08   ` Serge E. Hallyn
2013-01-28 14:28   ` Aristeu Rozanski
2013-01-28 14:41     ` Lord Glauber Costa of Sealand
2013-01-28 15:12       ` Aristeu Rozanski
2013-01-26  2:22 ` [PATCH review 3/6] userns: Recommend use of memory control groups Eric W. Biederman
2013-01-26 21:13   ` Serge E. Hallyn
2013-01-27  6:19     ` Eric W. Biederman
2013-01-28  7:37   ` Lord Glauber Costa of Sealand
2013-01-28  7:50     ` Lord Glauber Costa of Sealand
2013-01-28  8:14       ` Eric W. Biederman
2013-01-28  8:22         ` Lord Glauber Costa of Sealand
2013-01-28 16:19           ` Eric W. Biederman
2013-01-28 16:37             ` Lord Glauber Costa of Sealand
2013-01-28 17:18               ` Eric W. Biederman
2013-01-28  8:05     ` Eric W. Biederman
2013-01-26  2:23 ` [PATCH review 4/6] userns: Allow the userns root to mount of devpts Eric W. Biederman
2013-01-26 21:22   ` Serge E. Hallyn
2013-01-26  2:26 ` [PATCH review 5/6] userns: Allow the userns root to mount ramfs Eric W. Biederman
2013-01-26 21:29   ` Serge E. Hallyn
2013-01-27  6:09     ` Eric W. Biederman
2013-01-27 18:23       ` Serge E. Hallyn
2013-01-27 18:23   ` Serge E. Hallyn
2013-01-26  2:26 ` [PATCH review 6/6] userns: Allow the userns root to mount tmpfs Eric W. Biederman
2013-01-27 18:23   ` Serge E. Hallyn
2013-01-28  1:28   ` Gao feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877gn0it3t.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=segoon@openwall.com \
    --cc=serge@hallyn.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).