* [PATCH review 1/6] userns: Avoid recursion in put_user_ns
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:19 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:19 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Vasily Kulikov,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
When freeing a deeply nested user namespace free_user_ns calls
put_user_ns on it's parent which may in turn call free_user_ns again.
When -fno-optimize-sibling-calls is passed to gcc one stack frame per
user namespace is left on the stack, potentially overflowing the
kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
so we can't count on gcc to optimize this code.
Remove struct kref and use a plain atomic_t. Making the code more
flexible and easier to comprehend. Make the loop in free_user_ns
explict to guarantee that the stack does not overflow with
CONFIG_FRAME_POINTER enabled.
I have tested this fix with a simple program that uses unshare to
create a deeply nested user namespace structure and then calls exit.
With 1000 nesteuser namespaces before this change running my test
program causes the kernel to die a horrible death. With 10,000,000
nested user namespaces after this change my test program runs to
completion and causes no harm.
Pointed-out-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
include/linux/user_namespace.h | 10 +++++-----
kernel/user.c | 4 +---
kernel/user_namespace.c | 17 +++++++++--------
3 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index b9bd2e6..4ce0093 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -21,7 +21,7 @@ struct user_namespace {
struct uid_gid_map uid_map;
struct uid_gid_map gid_map;
struct uid_gid_map projid_map;
- struct kref kref;
+ atomic_t count;
struct user_namespace *parent;
kuid_t owner;
kgid_t group;
@@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
{
if (ns)
- kref_get(&ns->kref);
+ atomic_inc(&ns->count);
return ns;
}
extern int create_user_ns(struct cred *new);
extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
-extern void free_user_ns(struct kref *kref);
+extern void free_user_ns(struct user_namespace *ns);
static inline void put_user_ns(struct user_namespace *ns)
{
- if (ns)
- kref_put(&ns->kref, free_user_ns);
+ if (ns && atomic_dec_and_test(&ns->count))
+ free_user_ns(ns);
}
struct seq_operations;
diff --git a/kernel/user.c b/kernel/user.c
index 33acb5e..57ebfd4 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
.count = 4294967295U,
},
},
- .kref = {
- .refcount = ATOMIC_INIT(3),
- },
+ .count = ATOMIC_INIT(3),
.owner = GLOBAL_ROOT_UID,
.group = GLOBAL_ROOT_GID,
.proc_inum = PROC_USER_INIT_INO,
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2b042c4..24f8ec3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
return ret;
}
- kref_init(&ns->kref);
+ atomic_set(&ns->count, 1);
/* Leave the new->user_ns reference with the new user namespace. */
ns->parent = parent_ns;
ns->owner = owner;
@@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
return create_user_ns(cred);
}
-void free_user_ns(struct kref *kref)
+void free_user_ns(struct user_namespace *ns)
{
- struct user_namespace *parent, *ns =
- container_of(kref, struct user_namespace, kref);
+ struct user_namespace *parent;
- parent = ns->parent;
- proc_free_inum(ns->proc_inum);
- kmem_cache_free(user_ns_cachep, ns);
- put_user_ns(parent);
+ do {
+ parent = ns->parent;
+ proc_free_inum(ns->proc_inum);
+ kmem_cache_free(user_ns_cachep, ns);
+ ns = parent;
+ } while (atomic_dec_and_test(&parent->count));
}
EXPORT_SYMBOL(free_user_ns);
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* [PATCH review 1/6] userns: Avoid recursion in put_user_ns
@ 2013-01-26 2:19 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:19 UTC (permalink / raw)
To: Linux Containers
Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel, Vasily Kulikov
When freeing a deeply nested user namespace free_user_ns calls
put_user_ns on it's parent which may in turn call free_user_ns again.
When -fno-optimize-sibling-calls is passed to gcc one stack frame per
user namespace is left on the stack, potentially overflowing the
kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
so we can't count on gcc to optimize this code.
Remove struct kref and use a plain atomic_t. Making the code more
flexible and easier to comprehend. Make the loop in free_user_ns
explict to guarantee that the stack does not overflow with
CONFIG_FRAME_POINTER enabled.
I have tested this fix with a simple program that uses unshare to
create a deeply nested user namespace structure and then calls exit.
With 1000 nesteuser namespaces before this change running my test
program causes the kernel to die a horrible death. With 10,000,000
nested user namespaces after this change my test program runs to
completion and causes no harm.
Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
include/linux/user_namespace.h | 10 +++++-----
kernel/user.c | 4 +---
kernel/user_namespace.c | 17 +++++++++--------
3 files changed, 15 insertions(+), 16 deletions(-)
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index b9bd2e6..4ce0093 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -21,7 +21,7 @@ struct user_namespace {
struct uid_gid_map uid_map;
struct uid_gid_map gid_map;
struct uid_gid_map projid_map;
- struct kref kref;
+ atomic_t count;
struct user_namespace *parent;
kuid_t owner;
kgid_t group;
@@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
{
if (ns)
- kref_get(&ns->kref);
+ atomic_inc(&ns->count);
return ns;
}
extern int create_user_ns(struct cred *new);
extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
-extern void free_user_ns(struct kref *kref);
+extern void free_user_ns(struct user_namespace *ns);
static inline void put_user_ns(struct user_namespace *ns)
{
- if (ns)
- kref_put(&ns->kref, free_user_ns);
+ if (ns && atomic_dec_and_test(&ns->count))
+ free_user_ns(ns);
}
struct seq_operations;
diff --git a/kernel/user.c b/kernel/user.c
index 33acb5e..57ebfd4 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
.count = 4294967295U,
},
},
- .kref = {
- .refcount = ATOMIC_INIT(3),
- },
+ .count = ATOMIC_INIT(3),
.owner = GLOBAL_ROOT_UID,
.group = GLOBAL_ROOT_GID,
.proc_inum = PROC_USER_INIT_INO,
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 2b042c4..24f8ec3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
return ret;
}
- kref_init(&ns->kref);
+ atomic_set(&ns->count, 1);
/* Leave the new->user_ns reference with the new user namespace. */
ns->parent = parent_ns;
ns->owner = owner;
@@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
return create_user_ns(cred);
}
-void free_user_ns(struct kref *kref)
+void free_user_ns(struct user_namespace *ns)
{
- struct user_namespace *parent, *ns =
- container_of(kref, struct user_namespace, kref);
+ struct user_namespace *parent;
- parent = ns->parent;
- proc_free_inum(ns->proc_inum);
- kmem_cache_free(user_ns_cachep, ns);
- put_user_ns(parent);
+ do {
+ parent = ns->parent;
+ proc_free_inum(ns->proc_inum);
+ kmem_cache_free(user_ns_cachep, ns);
+ ns = parent;
+ } while (atomic_dec_and_test(&parent->count));
}
EXPORT_SYMBOL(free_user_ns);
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
2013-01-26 2:19 ` Eric W. Biederman
(?)
@ 2013-01-26 20:58 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 20:58 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel,
Vasily Kulikov
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> When freeing a deeply nested user namespace free_user_ns calls
> put_user_ns on it's parent which may in turn call free_user_ns again.
> When -fno-optimize-sibling-calls is passed to gcc one stack frame per
> user namespace is left on the stack, potentially overflowing the
> kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
> so we can't count on gcc to optimize this code.
>
> Remove struct kref and use a plain atomic_t. Making the code more
> flexible and easier to comprehend. Make the loop in free_user_ns
> explict to guarantee that the stack does not overflow with
> CONFIG_FRAME_POINTER enabled.
>
> I have tested this fix with a simple program that uses unshare to
> create a deeply nested user namespace structure and then calls exit.
> With 1000 nesteuser namespaces before this change running my test
> program causes the kernel to die a horrible death. With 10,000,000
> nested user namespaces after this change my test program runs to
> completion and causes no harm.
>
> Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> include/linux/user_namespace.h | 10 +++++-----
> kernel/user.c | 4 +---
> kernel/user_namespace.c | 17 +++++++++--------
> 3 files changed, 15 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index b9bd2e6..4ce0093 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -21,7 +21,7 @@ struct user_namespace {
> struct uid_gid_map uid_map;
> struct uid_gid_map gid_map;
> struct uid_gid_map projid_map;
> - struct kref kref;
> + atomic_t count;
> struct user_namespace *parent;
> kuid_t owner;
> kgid_t group;
> @@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
> static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> {
> if (ns)
> - kref_get(&ns->kref);
> + atomic_inc(&ns->count);
> return ns;
> }
>
> extern int create_user_ns(struct cred *new);
> extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
> -extern void free_user_ns(struct kref *kref);
> +extern void free_user_ns(struct user_namespace *ns);
>
> static inline void put_user_ns(struct user_namespace *ns)
> {
> - if (ns)
> - kref_put(&ns->kref, free_user_ns);
> + if (ns && atomic_dec_and_test(&ns->count))
> + free_user_ns(ns);
> }
>
> struct seq_operations;
> diff --git a/kernel/user.c b/kernel/user.c
> index 33acb5e..57ebfd4 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
> .count = 4294967295U,
> },
> },
> - .kref = {
> - .refcount = ATOMIC_INIT(3),
> - },
> + .count = ATOMIC_INIT(3),
> .owner = GLOBAL_ROOT_UID,
> .group = GLOBAL_ROOT_GID,
> .proc_inum = PROC_USER_INIT_INO,
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 2b042c4..24f8ec3 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
> return ret;
> }
>
> - kref_init(&ns->kref);
> + atomic_set(&ns->count, 1);
> /* Leave the new->user_ns reference with the new user namespace. */
> ns->parent = parent_ns;
> ns->owner = owner;
> @@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
> return create_user_ns(cred);
> }
>
> -void free_user_ns(struct kref *kref)
> +void free_user_ns(struct user_namespace *ns)
> {
> - struct user_namespace *parent, *ns =
> - container_of(kref, struct user_namespace, kref);
> + struct user_namespace *parent;
>
> - parent = ns->parent;
> - proc_free_inum(ns->proc_inum);
> - kmem_cache_free(user_ns_cachep, ns);
> - put_user_ns(parent);
> + do {
> + parent = ns->parent;
> + proc_free_inum(ns->proc_inum);
> + kmem_cache_free(user_ns_cachep, ns);
> + ns = parent;
> + } while (atomic_dec_and_test(&parent->count));
> }
> EXPORT_SYMBOL(free_user_ns);
>
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread[parent not found: <877gn0it3t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
[not found] ` <877gn0it3t.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-01-26 20:58 ` Serge E. Hallyn
2013-01-28 14:51 ` Vasily Kulikov
1 sibling, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 20:58 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Vasily Kulikov,
Linux Containers, linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> When freeing a deeply nested user namespace free_user_ns calls
> put_user_ns on it's parent which may in turn call free_user_ns again.
> When -fno-optimize-sibling-calls is passed to gcc one stack frame per
> user namespace is left on the stack, potentially overflowing the
> kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
> so we can't count on gcc to optimize this code.
>
> Remove struct kref and use a plain atomic_t. Making the code more
> flexible and easier to comprehend. Make the loop in free_user_ns
> explict to guarantee that the stack does not overflow with
> CONFIG_FRAME_POINTER enabled.
>
> I have tested this fix with a simple program that uses unshare to
> create a deeply nested user namespace structure and then calls exit.
> With 1000 nesteuser namespaces before this change running my test
> program causes the kernel to die a horrible death. With 10,000,000
> nested user namespaces after this change my test program runs to
> completion and causes no harm.
>
> Pointed-out-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
> include/linux/user_namespace.h | 10 +++++-----
> kernel/user.c | 4 +---
> kernel/user_namespace.c | 17 +++++++++--------
> 3 files changed, 15 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index b9bd2e6..4ce0093 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -21,7 +21,7 @@ struct user_namespace {
> struct uid_gid_map uid_map;
> struct uid_gid_map gid_map;
> struct uid_gid_map projid_map;
> - struct kref kref;
> + atomic_t count;
> struct user_namespace *parent;
> kuid_t owner;
> kgid_t group;
> @@ -35,18 +35,18 @@ extern struct user_namespace init_user_ns;
> static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> {
> if (ns)
> - kref_get(&ns->kref);
> + atomic_inc(&ns->count);
> return ns;
> }
>
> extern int create_user_ns(struct cred *new);
> extern int unshare_userns(unsigned long unshare_flags, struct cred **new_cred);
> -extern void free_user_ns(struct kref *kref);
> +extern void free_user_ns(struct user_namespace *ns);
>
> static inline void put_user_ns(struct user_namespace *ns)
> {
> - if (ns)
> - kref_put(&ns->kref, free_user_ns);
> + if (ns && atomic_dec_and_test(&ns->count))
> + free_user_ns(ns);
> }
>
> struct seq_operations;
> diff --git a/kernel/user.c b/kernel/user.c
> index 33acb5e..57ebfd4 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -47,9 +47,7 @@ struct user_namespace init_user_ns = {
> .count = 4294967295U,
> },
> },
> - .kref = {
> - .refcount = ATOMIC_INIT(3),
> - },
> + .count = ATOMIC_INIT(3),
> .owner = GLOBAL_ROOT_UID,
> .group = GLOBAL_ROOT_GID,
> .proc_inum = PROC_USER_INIT_INO,
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 2b042c4..24f8ec3 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -78,7 +78,7 @@ int create_user_ns(struct cred *new)
> return ret;
> }
>
> - kref_init(&ns->kref);
> + atomic_set(&ns->count, 1);
> /* Leave the new->user_ns reference with the new user namespace. */
> ns->parent = parent_ns;
> ns->owner = owner;
> @@ -104,15 +104,16 @@ int unshare_userns(unsigned long unshare_flags, struct cred **new_cred)
> return create_user_ns(cred);
> }
>
> -void free_user_ns(struct kref *kref)
> +void free_user_ns(struct user_namespace *ns)
> {
> - struct user_namespace *parent, *ns =
> - container_of(kref, struct user_namespace, kref);
> + struct user_namespace *parent;
>
> - parent = ns->parent;
> - proc_free_inum(ns->proc_inum);
> - kmem_cache_free(user_ns_cachep, ns);
> - put_user_ns(parent);
> + do {
> + parent = ns->parent;
> + proc_free_inum(ns->proc_inum);
> + kmem_cache_free(user_ns_cachep, ns);
> + ns = parent;
> + } while (atomic_dec_and_test(&parent->count));
> }
> EXPORT_SYMBOL(free_user_ns);
>
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
2013-01-26 2:19 ` Eric W. Biederman
@ 2013-01-28 14:51 ` Vasily Kulikov
-1 siblings, 0 replies; 62+ messages in thread
From: Vasily Kulikov @ 2013-01-28 14:51 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Fri, Jan 25, 2013 at 18:19 -0800, Eric W. Biederman wrote:
>
> When freeing a deeply nested user namespace free_user_ns calls
> put_user_ns on it's parent which may in turn call free_user_ns again.
> When -fno-optimize-sibling-calls is passed to gcc one stack frame per
> user namespace is left on the stack, potentially overflowing the
> kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
> so we can't count on gcc to optimize this code.
>
> Remove struct kref and use a plain atomic_t. Making the code more
> flexible and easier to comprehend. Make the loop in free_user_ns
> explict to guarantee that the stack does not overflow with
> CONFIG_FRAME_POINTER enabled.
>
> I have tested this fix with a simple program that uses unshare to
> create a deeply nested user namespace structure and then calls exit.
> With 1000 nesteuser namespaces before this change running my test
> program causes the kernel to die a horrible death. With 10,000,000
> nested user namespaces after this change my test program runs to
> completion and causes no harm.
>
> Pointed-out-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Looks sane, thanks.
Acked-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
The second bug I've noted in the same email (OOM) looks like should be
"fixed" by using memcg to limit kernel memory. So, I'm fine with this
side of user_ns :)
--
Vasily Kulikov
http://www.openwall.com - bringing security into open computing environments
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
@ 2013-01-28 14:51 ` Vasily Kulikov
0 siblings, 0 replies; 62+ messages in thread
From: Vasily Kulikov @ 2013-01-28 14:51 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
On Fri, Jan 25, 2013 at 18:19 -0800, Eric W. Biederman wrote:
>
> When freeing a deeply nested user namespace free_user_ns calls
> put_user_ns on it's parent which may in turn call free_user_ns again.
> When -fno-optimize-sibling-calls is passed to gcc one stack frame per
> user namespace is left on the stack, potentially overflowing the
> kernel stack. CONFIG_FRAME_POINTER forces -fno-optimize-sibling-calls
> so we can't count on gcc to optimize this code.
>
> Remove struct kref and use a plain atomic_t. Making the code more
> flexible and easier to comprehend. Make the loop in free_user_ns
> explict to guarantee that the stack does not overflow with
> CONFIG_FRAME_POINTER enabled.
>
> I have tested this fix with a simple program that uses unshare to
> create a deeply nested user namespace structure and then calls exit.
> With 1000 nesteuser namespaces before this change running my test
> program causes the kernel to die a horrible death. With 10,000,000
> nested user namespaces after this change my test program runs to
> completion and causes no harm.
>
> Pointed-out-by: Vasily Kulikov <segoon@openwall.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Looks sane, thanks.
Acked-by: Vasily Kulikov <segoon@openwall.com>
The second bug I've noted in the same email (OOM) looks like should be
"fixed" by using memcg to limit kernel memory. So, I'm fine with this
side of user_ns :)
--
Vasily Kulikov
http://www.openwall.com - bringing security into open computing environments
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
2013-01-28 14:51 ` Vasily Kulikov
@ 2013-01-28 16:34 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-28 16:34 UTC (permalink / raw)
To: Vasily Kulikov
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org> writes:
> Acked-by: Vasily Kulikov <segoon-cxoSlKxDwOJWk0Htik3J/w@public.gmane.org>
>
> The second bug I've noted in the same email (OOM) looks like should be
> "fixed" by using memcg to limit kernel memory. So, I'm fine with this
> side of user_ns :)
Good. That is what it looked like from here.
You pointed out one or two other things that I am still thinking about.
Eric
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 1/6] userns: Avoid recursion in put_user_ns
@ 2013-01-28 16:34 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-28 16:34 UTC (permalink / raw)
To: Vasily Kulikov
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Vasily Kulikov <segoon@openwall.com> writes:
> Acked-by: Vasily Kulikov <segoon@openwall.com>
>
> The second bug I've noted in the same email (OOM) looks like should be
> "fixed" by using memcg to limit kernel memory. So, I'm fine with this
> side of user_ns :)
Good. That is what it looked like from here.
You pointed out one or two other things that I am still thinking about.
Eric
^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:21 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:21 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
Aristeu Rozanski, linux-kernel-u79uwXL29TY76Z2rM5mHXA
When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
and avoided duplicate mappings by the simple expedient of ensuring the
first number in a new extent was greater than any number in the
previous extent.
Unfortunately that precludes a number of valid mappings, and someone
noticed and complained. So use a simple check to ensure that ranges
in the mapping extents don't overlap.
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
1 files changed, 39 insertions(+), 6 deletions(-)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 24f8ec3..8b65083 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
.show = projid_m_show,
};
+static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
+{
+ u32 upper_first, lower_first, upper_last, lower_last;
+ unsigned idx;
+
+ upper_first = extent->first;
+ lower_first = extent->lower_first;
+ upper_last = upper_first + extent->count - 1;
+ lower_last = lower_first + extent->count - 1;
+
+ for (idx = 0; idx < new_map->nr_extents; idx++) {
+ u32 prev_upper_first, prev_lower_first;
+ u32 prev_upper_last, prev_lower_last;
+ struct uid_gid_extent *prev;
+
+ prev = &new_map->extent[idx];
+
+ prev_upper_first = prev->first;
+ prev_lower_first = prev->lower_first;
+ prev_upper_last = prev_upper_first + prev->count - 1;
+ prev_lower_last = prev_lower_first + prev->count - 1;
+
+ /* Does the upper range intersect a previous extent? */
+ if ((prev_upper_first <= upper_last) &&
+ (prev_upper_last >= upper_first))
+ return true;
+
+ /* Does the lower range intersect a previous extent? */
+ if ((prev_lower_first <= lower_last) &&
+ (prev_lower_last >= lower_first))
+ return true;
+ }
+ return false;
+}
+
+
static DEFINE_MUTEX(id_map_mutex);
static ssize_t map_write(struct file *file, const char __user *buf,
@@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
struct user_namespace *ns = seq->private;
struct uid_gid_map new_map;
unsigned idx;
- struct uid_gid_extent *extent, *last = NULL;
+ struct uid_gid_extent *extent = NULL;
unsigned long page = 0;
char *kbuf, *pos, *next_line;
ssize_t ret = -EINVAL;
@@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
if ((extent->lower_first + extent->count) <= extent->lower_first)
goto out;
- /* For now only accept extents that are strictly in order */
- if (last &&
- (((last->first + last->count) > extent->first) ||
- ((last->lower_first + last->count) > extent->lower_first)))
+ /* Do the ranges in extent overlap any previous extents? */
+ if (mappings_overlap(&new_map, extent))
goto out;
new_map.nr_extents++;
- last = extent;
/* Fail if the file contains too many extents */
if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
@ 2013-01-26 2:21 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:21 UTC (permalink / raw)
To: Linux Containers
Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel, Aristeu Rozanski,
Andrew Morton
When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
and avoided duplicate mappings by the simple expedient of ensuring the
first number in a new extent was greater than any number in the
previous extent.
Unfortunately that precludes a number of valid mappings, and someone
noticed and complained. So use a simple check to ensure that ranges
in the mapping extents don't overlap.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
1 files changed, 39 insertions(+), 6 deletions(-)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 24f8ec3..8b65083 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
.show = projid_m_show,
};
+static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
+{
+ u32 upper_first, lower_first, upper_last, lower_last;
+ unsigned idx;
+
+ upper_first = extent->first;
+ lower_first = extent->lower_first;
+ upper_last = upper_first + extent->count - 1;
+ lower_last = lower_first + extent->count - 1;
+
+ for (idx = 0; idx < new_map->nr_extents; idx++) {
+ u32 prev_upper_first, prev_lower_first;
+ u32 prev_upper_last, prev_lower_last;
+ struct uid_gid_extent *prev;
+
+ prev = &new_map->extent[idx];
+
+ prev_upper_first = prev->first;
+ prev_lower_first = prev->lower_first;
+ prev_upper_last = prev_upper_first + prev->count - 1;
+ prev_lower_last = prev_lower_first + prev->count - 1;
+
+ /* Does the upper range intersect a previous extent? */
+ if ((prev_upper_first <= upper_last) &&
+ (prev_upper_last >= upper_first))
+ return true;
+
+ /* Does the lower range intersect a previous extent? */
+ if ((prev_lower_first <= lower_last) &&
+ (prev_lower_last >= lower_first))
+ return true;
+ }
+ return false;
+}
+
+
static DEFINE_MUTEX(id_map_mutex);
static ssize_t map_write(struct file *file, const char __user *buf,
@@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
struct user_namespace *ns = seq->private;
struct uid_gid_map new_map;
unsigned idx;
- struct uid_gid_extent *extent, *last = NULL;
+ struct uid_gid_extent *extent = NULL;
unsigned long page = 0;
char *kbuf, *pos, *next_line;
ssize_t ret = -EINVAL;
@@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
if ((extent->lower_first + extent->count) <= extent->lower_first)
goto out;
- /* For now only accept extents that are strictly in order */
- if (last &&
- (((last->first + last->count) > extent->first) ||
- ((last->lower_first + last->count) > extent->lower_first)))
+ /* Do the ranges in extent overlap any previous extents? */
+ if (mappings_overlap(&new_map, extent))
goto out;
new_map.nr_extents++;
- last = extent;
/* Fail if the file contains too many extents */
if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread[parent not found: <87zjzwhegj.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
2013-01-26 2:21 ` Eric W. Biederman
@ 2013-01-26 21:08 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:08 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
Aristeu Rozanski, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
Andrew Morton
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
> and avoided duplicate mappings by the simple expedient of ensuring the
> first number in a new extent was greater than any number in the
> previous extent.
>
> Unfortunately that precludes a number of valid mappings, and someone
> noticed and complained. So use a simple check to ensure that ranges
> in the mapping extents don't overlap.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> ---
> kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
> 1 files changed, 39 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 24f8ec3..8b65083 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
> .show = projid_m_show,
> };
>
> +static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
> +{
> + u32 upper_first, lower_first, upper_last, lower_last;
> + unsigned idx;
> +
> + upper_first = extent->first;
> + lower_first = extent->lower_first;
> + upper_last = upper_first + extent->count - 1;
> + lower_last = lower_first + extent->count - 1;
> +
> + for (idx = 0; idx < new_map->nr_extents; idx++) {
> + u32 prev_upper_first, prev_lower_first;
> + u32 prev_upper_last, prev_lower_last;
> + struct uid_gid_extent *prev;
> +
> + prev = &new_map->extent[idx];
> +
> + prev_upper_first = prev->first;
> + prev_lower_first = prev->lower_first;
> + prev_upper_last = prev_upper_first + prev->count - 1;
> + prev_lower_last = prev_lower_first + prev->count - 1;
> +
> + /* Does the upper range intersect a previous extent? */
> + if ((prev_upper_first <= upper_last) &&
> + (prev_upper_last >= upper_first))
> + return true;
> +
> + /* Does the lower range intersect a previous extent? */
> + if ((prev_lower_first <= lower_last) &&
> + (prev_lower_last >= lower_first))
> + return true;
> + }
> + return false;
> +}
> +
> +
> static DEFINE_MUTEX(id_map_mutex);
>
> static ssize_t map_write(struct file *file, const char __user *buf,
> @@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> struct user_namespace *ns = seq->private;
> struct uid_gid_map new_map;
> unsigned idx;
> - struct uid_gid_extent *extent, *last = NULL;
> + struct uid_gid_extent *extent = NULL;
> unsigned long page = 0;
> char *kbuf, *pos, *next_line;
> ssize_t ret = -EINVAL;
> @@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> if ((extent->lower_first + extent->count) <= extent->lower_first)
> goto out;
>
> - /* For now only accept extents that are strictly in order */
> - if (last &&
> - (((last->first + last->count) > extent->first) ||
> - ((last->lower_first + last->count) > extent->lower_first)))
> + /* Do the ranges in extent overlap any previous extents? */
> + if (mappings_overlap(&new_map, extent))
> goto out;
>
> new_map.nr_extents++;
> - last = extent;
>
> /* Fail if the file contains too many extents */
> if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
@ 2013-01-26 21:08 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:08 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel,
Aristeu Rozanski, Andrew Morton
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
> and avoided duplicate mappings by the simple expedient of ensuring the
> first number in a new extent was greater than any number in the
> previous extent.
>
> Unfortunately that precludes a number of valid mappings, and someone
> noticed and complained. So use a simple check to ensure that ranges
> in the mapping extents don't overlap.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
> kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
> 1 files changed, 39 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 24f8ec3..8b65083 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
> .show = projid_m_show,
> };
>
> +static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
> +{
> + u32 upper_first, lower_first, upper_last, lower_last;
> + unsigned idx;
> +
> + upper_first = extent->first;
> + lower_first = extent->lower_first;
> + upper_last = upper_first + extent->count - 1;
> + lower_last = lower_first + extent->count - 1;
> +
> + for (idx = 0; idx < new_map->nr_extents; idx++) {
> + u32 prev_upper_first, prev_lower_first;
> + u32 prev_upper_last, prev_lower_last;
> + struct uid_gid_extent *prev;
> +
> + prev = &new_map->extent[idx];
> +
> + prev_upper_first = prev->first;
> + prev_lower_first = prev->lower_first;
> + prev_upper_last = prev_upper_first + prev->count - 1;
> + prev_lower_last = prev_lower_first + prev->count - 1;
> +
> + /* Does the upper range intersect a previous extent? */
> + if ((prev_upper_first <= upper_last) &&
> + (prev_upper_last >= upper_first))
> + return true;
> +
> + /* Does the lower range intersect a previous extent? */
> + if ((prev_lower_first <= lower_last) &&
> + (prev_lower_last >= lower_first))
> + return true;
> + }
> + return false;
> +}
> +
> +
> static DEFINE_MUTEX(id_map_mutex);
>
> static ssize_t map_write(struct file *file, const char __user *buf,
> @@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> struct user_namespace *ns = seq->private;
> struct uid_gid_map new_map;
> unsigned idx;
> - struct uid_gid_extent *extent, *last = NULL;
> + struct uid_gid_extent *extent = NULL;
> unsigned long page = 0;
> char *kbuf, *pos, *next_line;
> ssize_t ret = -EINVAL;
> @@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> if ((extent->lower_first + extent->count) <= extent->lower_first)
> goto out;
>
> - /* For now only accept extents that are strictly in order */
> - if (last &&
> - (((last->first + last->count) > extent->first) ||
> - ((last->lower_first + last->count) > extent->lower_first)))
> + /* Do the ranges in extent overlap any previous extents? */
> + if (mappings_overlap(&new_map, extent))
> goto out;
>
> new_map.nr_extents++;
> - last = extent;
>
> /* Fail if the file contains too many extents */
> if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
[not found] ` <87zjzwhegj.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:08 ` Serge E. Hallyn
@ 2013-01-28 14:28 ` Aristeu Rozanski
1 sibling, 0 replies; 62+ messages in thread
From: Aristeu Rozanski @ 2013-01-28 14:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton
On Fri, Jan 25, 2013 at 06:21:00PM -0800, Eric W. Biederman wrote:
> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
> and avoided duplicate mappings by the simple expedient of ensuring the
> first number in a new extent was greater than any number in the
> previous extent.
>
> Unfortunately that precludes a number of valid mappings, and someone
> noticed and complained. So use a simple check to ensure that ranges
> in the mapping extents don't overlap.
Acked-by: Someone <aris-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
> kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
> 1 files changed, 39 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 24f8ec3..8b65083 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
> .show = projid_m_show,
> };
>
> +static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
> +{
> + u32 upper_first, lower_first, upper_last, lower_last;
> + unsigned idx;
> +
> + upper_first = extent->first;
> + lower_first = extent->lower_first;
> + upper_last = upper_first + extent->count - 1;
> + lower_last = lower_first + extent->count - 1;
> +
> + for (idx = 0; idx < new_map->nr_extents; idx++) {
> + u32 prev_upper_first, prev_lower_first;
> + u32 prev_upper_last, prev_lower_last;
> + struct uid_gid_extent *prev;
> +
> + prev = &new_map->extent[idx];
> +
> + prev_upper_first = prev->first;
> + prev_lower_first = prev->lower_first;
> + prev_upper_last = prev_upper_first + prev->count - 1;
> + prev_lower_last = prev_lower_first + prev->count - 1;
> +
> + /* Does the upper range intersect a previous extent? */
> + if ((prev_upper_first <= upper_last) &&
> + (prev_upper_last >= upper_first))
> + return true;
> +
> + /* Does the lower range intersect a previous extent? */
> + if ((prev_lower_first <= lower_last) &&
> + (prev_lower_last >= lower_first))
> + return true;
> + }
> + return false;
> +}
> +
> +
> static DEFINE_MUTEX(id_map_mutex);
>
> static ssize_t map_write(struct file *file, const char __user *buf,
> @@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> struct user_namespace *ns = seq->private;
> struct uid_gid_map new_map;
> unsigned idx;
> - struct uid_gid_extent *extent, *last = NULL;
> + struct uid_gid_extent *extent = NULL;
> unsigned long page = 0;
> char *kbuf, *pos, *next_line;
> ssize_t ret = -EINVAL;
> @@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> if ((extent->lower_first + extent->count) <= extent->lower_first)
> goto out;
>
> - /* For now only accept extents that are strictly in order */
> - if (last &&
> - (((last->first + last->count) > extent->first) ||
> - ((last->lower_first + last->count) > extent->lower_first)))
> + /* Do the ranges in extent overlap any previous extents? */
> + if (mappings_overlap(&new_map, extent))
> goto out;
>
> new_map.nr_extents++;
> - last = extent;
>
> /* Fail if the file contains too many extents */
> if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
> --
> 1.7.5.4
>
--
Aristeu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
2013-01-26 2:21 ` Eric W. Biederman
(?)
(?)
@ 2013-01-28 14:28 ` Aristeu Rozanski
[not found] ` <20130128142816.GU17632-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-01-28 14:41 ` Lord Glauber Costa of Sealand
-1 siblings, 2 replies; 62+ messages in thread
From: Aristeu Rozanski @ 2013-01-28 14:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel,
Andrew Morton
On Fri, Jan 25, 2013 at 06:21:00PM -0800, Eric W. Biederman wrote:
> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
> and avoided duplicate mappings by the simple expedient of ensuring the
> first number in a new extent was greater than any number in the
> previous extent.
>
> Unfortunately that precludes a number of valid mappings, and someone
> noticed and complained. So use a simple check to ensure that ranges
> in the mapping extents don't overlap.
Acked-by: Someone <aris@redhat.com>
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> kernel/user_namespace.c | 45 +++++++++++++++++++++++++++++++++++++++------
> 1 files changed, 39 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 24f8ec3..8b65083 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -520,6 +520,42 @@ struct seq_operations proc_projid_seq_operations = {
> .show = projid_m_show,
> };
>
> +static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent *extent)
> +{
> + u32 upper_first, lower_first, upper_last, lower_last;
> + unsigned idx;
> +
> + upper_first = extent->first;
> + lower_first = extent->lower_first;
> + upper_last = upper_first + extent->count - 1;
> + lower_last = lower_first + extent->count - 1;
> +
> + for (idx = 0; idx < new_map->nr_extents; idx++) {
> + u32 prev_upper_first, prev_lower_first;
> + u32 prev_upper_last, prev_lower_last;
> + struct uid_gid_extent *prev;
> +
> + prev = &new_map->extent[idx];
> +
> + prev_upper_first = prev->first;
> + prev_lower_first = prev->lower_first;
> + prev_upper_last = prev_upper_first + prev->count - 1;
> + prev_lower_last = prev_lower_first + prev->count - 1;
> +
> + /* Does the upper range intersect a previous extent? */
> + if ((prev_upper_first <= upper_last) &&
> + (prev_upper_last >= upper_first))
> + return true;
> +
> + /* Does the lower range intersect a previous extent? */
> + if ((prev_lower_first <= lower_last) &&
> + (prev_lower_last >= lower_first))
> + return true;
> + }
> + return false;
> +}
> +
> +
> static DEFINE_MUTEX(id_map_mutex);
>
> static ssize_t map_write(struct file *file, const char __user *buf,
> @@ -532,7 +568,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> struct user_namespace *ns = seq->private;
> struct uid_gid_map new_map;
> unsigned idx;
> - struct uid_gid_extent *extent, *last = NULL;
> + struct uid_gid_extent *extent = NULL;
> unsigned long page = 0;
> char *kbuf, *pos, *next_line;
> ssize_t ret = -EINVAL;
> @@ -635,14 +671,11 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> if ((extent->lower_first + extent->count) <= extent->lower_first)
> goto out;
>
> - /* For now only accept extents that are strictly in order */
> - if (last &&
> - (((last->first + last->count) > extent->first) ||
> - ((last->lower_first + last->count) > extent->lower_first)))
> + /* Do the ranges in extent overlap any previous extents? */
> + if (mappings_overlap(&new_map, extent))
> goto out;
>
> new_map.nr_extents++;
> - last = extent;
>
> /* Fail if the file contains too many extents */
> if ((new_map.nr_extents == UID_GID_MAP_MAX_EXTENTS) &&
> --
> 1.7.5.4
>
--
Aristeu
^ permalink raw reply [flat|nested] 62+ messages in thread[parent not found: <20130128142816.GU17632-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
[not found] ` <20130128142816.GU17632-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-01-28 14:41 ` Lord Glauber Costa of Sealand
0 siblings, 0 replies; 62+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-28 14:41 UTC (permalink / raw)
To: Aristeu Rozanski
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
Eric W. Biederman, Andrew Morton,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Hello Mr. Someone.
On 01/28/2013 06:28 PM, Aristeu Rozanski wrote:
> On Fri, Jan 25, 2013 at 06:21:00PM -0800, Eric W. Biederman wrote:
>> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
>> and avoided duplicate mappings by the simple expedient of ensuring the
>> first number in a new extent was greater than any number in the
>> previous extent.
>>
>> Unfortunately that precludes a number of valid mappings, and someone
>> noticed and complained. So use a simple check to ensure that ranges
>> in the mapping extents don't overlap.
>
> Acked-by: Someone <aris-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
Documentation/SubmittingPatches:
"then you just add a line saying
Signed-off-by: Random J Developer <random-ld4jwAGwUXQYGaZWVHDzw80vGNN6ct63@public.gmane.org>
using your real name (sorry, no pseudonyms or anonymous contributions.)"
I know how it feels, but that is how it goes. You'll have to change that.
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 2/6] userns: Allow any uid or gid mappings that don't overlap.
2013-01-28 14:28 ` Aristeu Rozanski
[not found] ` <20130128142816.GU17632-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-01-28 14:41 ` Lord Glauber Costa of Sealand
[not found] ` <51068E23.5040000-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
1 sibling, 1 reply; 62+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-28 14:41 UTC (permalink / raw)
To: Aristeu Rozanski
Cc: Eric W. Biederman, linux-fsdevel, Linux Containers, linux-kernel,
Andrew Morton
Hello Mr. Someone.
On 01/28/2013 06:28 PM, Aristeu Rozanski wrote:
> On Fri, Jan 25, 2013 at 06:21:00PM -0800, Eric W. Biederman wrote:
>> When I initially wrote the code for /proc/<pid>/uid_map. I was lazy
>> and avoided duplicate mappings by the simple expedient of ensuring the
>> first number in a new extent was greater than any number in the
>> previous extent.
>>
>> Unfortunately that precludes a number of valid mappings, and someone
>> noticed and complained. So use a simple check to ensure that ranges
>> in the mapping extents don't overlap.
>
> Acked-by: Someone <aris@redhat.com>
>
Documentation/SubmittingPatches:
"then you just add a line saying
Signed-off-by: Random J Developer <random@developer.example.org>
using your real name (sorry, no pseudonyms or anonymous contributions.)"
I know how it feels, but that is how it goes. You'll have to change that.
^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH review 3/6] userns: Recommend use of memory control groups.
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:22 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:22 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In the help text describing user namespaces recommend use of memory
control groups. In many cases memory control groups are the only
mechanism there is to limit how much memory a user who can create
user namespaces can use.
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
Documentation/namespaces/resource-control.txt | 10 ++++++++++
init/Kconfig | 7 +++++++
2 files changed, 17 insertions(+), 0 deletions(-)
create mode 100644 Documentation/namespaces/resource-control.txt
diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
new file mode 100644
index 0000000..3d8178a
--- /dev/null
+++ b/Documentation/namespaces/resource-control.txt
@@ -0,0 +1,10 @@
+There are a lot of kinds of objects in the kernel that don't have
+individual limits or that have limits that are ineffective when a set
+of processes is allowed to switch user ids. With user namespaces
+enabled in a kernel for people who don't trust their users or their
+users programs to play nice this problems becomes more acute.
+
+Therefore it is recommended that memory control groups be enabled in
+kernels that enable user namespaces, and it is further recommended
+that userspace configure memory control groups to limit how much
+memory users they don't trust to play nice can use.
diff --git a/init/Kconfig b/init/Kconfig
index 7d30240..c8c58bd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1035,6 +1035,13 @@ config USER_NS
help
This allows containers, i.e. vservers, to use user namespaces
to provide different user info for different servers.
+
+ When user namespaces are enabled in the kernel it is
+ recommended that the MEMCG and MEMCG_KMEM options also be
+ enabled and that user-space use the memory control groups to
+ limit the amount of memory a memory unprivileged users can
+ use.
+
If unsure, say N.
config PID_NS
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread
* [PATCH review 3/6] userns: Recommend use of memory control groups.
@ 2013-01-26 2:22 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:22 UTC (permalink / raw)
To: Linux Containers; +Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel
In the help text describing user namespaces recommend use of memory
control groups. In many cases memory control groups are the only
mechanism there is to limit how much memory a user who can create
user namespaces can use.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
Documentation/namespaces/resource-control.txt | 10 ++++++++++
init/Kconfig | 7 +++++++
2 files changed, 17 insertions(+), 0 deletions(-)
create mode 100644 Documentation/namespaces/resource-control.txt
diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
new file mode 100644
index 0000000..3d8178a
--- /dev/null
+++ b/Documentation/namespaces/resource-control.txt
@@ -0,0 +1,10 @@
+There are a lot of kinds of objects in the kernel that don't have
+individual limits or that have limits that are ineffective when a set
+of processes is allowed to switch user ids. With user namespaces
+enabled in a kernel for people who don't trust their users or their
+users programs to play nice this problems becomes more acute.
+
+Therefore it is recommended that memory control groups be enabled in
+kernels that enable user namespaces, and it is further recommended
+that userspace configure memory control groups to limit how much
+memory users they don't trust to play nice can use.
diff --git a/init/Kconfig b/init/Kconfig
index 7d30240..c8c58bd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1035,6 +1035,13 @@ config USER_NS
help
This allows containers, i.e. vservers, to use user namespaces
to provide different user info for different servers.
+
+ When user namespaces are enabled in the kernel it is
+ recommended that the MEMCG and MEMCG_KMEM options also be
+ enabled and that user-space use the memory control groups to
+ limit the amount of memory a memory unprivileged users can
+ use.
+
If unsure, say N.
config PID_NS
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread
[parent not found: <87txq4hedl.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
2013-01-26 2:22 ` Eric W. Biederman
@ 2013-01-26 21:13 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:13 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> In the help text describing user namespaces recommend use of memory
> control groups. In many cases memory control groups are the only
> mechanism there is to limit how much memory a user who can create
> user namespaces can use.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
nit:
> ---
> Documentation/namespaces/resource-control.txt | 10 ++++++++++
> init/Kconfig | 7 +++++++
> 2 files changed, 17 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/namespaces/resource-control.txt
>
> diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
> new file mode 100644
> index 0000000..3d8178a
> --- /dev/null
> +++ b/Documentation/namespaces/resource-control.txt
> @@ -0,0 +1,10 @@
> +There are a lot of kinds of objects in the kernel that don't have
> +individual limits or that have limits that are ineffective when a set
> +of processes is allowed to switch user ids. With user namespaces
> +enabled in a kernel for people who don't trust their users or their
> +users programs to play nice this problems becomes more acute.
users' programs
> +
> +Therefore it is recommended that memory control groups be enabled in
> +kernels that enable user namespaces, and it is further recommended
> +that userspace configure memory control groups to limit how much
> +memory users they don't trust to play nice can use.
> diff --git a/init/Kconfig b/init/Kconfig
> index 7d30240..c8c58bd 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1035,6 +1035,13 @@ config USER_NS
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
> +
> + When user namespaces are enabled in the kernel it is
> + recommended that the MEMCG and MEMCG_KMEM options also be
> + enabled and that user-space use the memory control groups to
> + limit the amount of memory a memory unprivileged users can
> + use.
> +
> If unsure, say N.
>
> config PID_NS
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
@ 2013-01-26 21:13 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:13 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> In the help text describing user namespaces recommend use of memory
> control groups. In many cases memory control groups are the only
> mechanism there is to limit how much memory a user who can create
> user namespaces can use.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
nit:
> ---
> Documentation/namespaces/resource-control.txt | 10 ++++++++++
> init/Kconfig | 7 +++++++
> 2 files changed, 17 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/namespaces/resource-control.txt
>
> diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
> new file mode 100644
> index 0000000..3d8178a
> --- /dev/null
> +++ b/Documentation/namespaces/resource-control.txt
> @@ -0,0 +1,10 @@
> +There are a lot of kinds of objects in the kernel that don't have
> +individual limits or that have limits that are ineffective when a set
> +of processes is allowed to switch user ids. With user namespaces
> +enabled in a kernel for people who don't trust their users or their
> +users programs to play nice this problems becomes more acute.
users' programs
> +
> +Therefore it is recommended that memory control groups be enabled in
> +kernels that enable user namespaces, and it is further recommended
> +that userspace configure memory control groups to limit how much
> +memory users they don't trust to play nice can use.
> diff --git a/init/Kconfig b/init/Kconfig
> index 7d30240..c8c58bd 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1035,6 +1035,13 @@ config USER_NS
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
> +
> + When user namespaces are enabled in the kernel it is
> + recommended that the MEMCG and MEMCG_KMEM options also be
> + enabled and that user-space use the memory control groups to
> + limit the amount of memory a memory unprivileged users can
> + use.
> +
> If unsure, say N.
>
> config PID_NS
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
[parent not found: <20130126211312.GD11274-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
2013-01-26 21:13 ` Serge E. Hallyn
@ 2013-01-27 6:19 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-27 6:19 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>>
>> In the help text describing user namespaces recommend use of memory
>> control groups. In many cases memory control groups are the only
>> mechanism there is to limit how much memory a user who can create
>> user namespaces can use.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
>
> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
>
> nit:
>
I have fixed you nit and added the following text, so people know
have a clue where to look to configure cgroups in userspace.
diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
index 3d8178a..abc13c3 100644
--- a/Documentation/namespaces/resource-control.txt
+++ b/Documentation/namespaces/resource-control.txt
@@ -7,4 +7,8 @@ users programs to play nice this problems becomes more acute.
Therefore it is recommended that memory control groups be enabled in
kernels that enable user namespaces, and it is further recommended
that userspace configure memory control groups to limit how much
-memory users they don't trust to play nice can use.
+memory user's they don't trust to play nice can use.
+
+Memory control groups can be configured by installing the libcgroup
+package present on most distros editing /etc/cgrules.conf,
+/etc/cgconfig.conf and setting up libpam-cgroup.
^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
@ 2013-01-27 6:19 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-27 6:19 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Linux Containers, linux-kernel, linux-fsdevel
"Serge E. Hallyn" <serge@hallyn.com> writes:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
>>
>> In the help text describing user namespaces recommend use of memory
>> control groups. In many cases memory control groups are the only
>> mechanism there is to limit how much memory a user who can create
>> user namespaces can use.
>>
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
>
> nit:
>
I have fixed you nit and added the following text, so people know
have a clue where to look to configure cgroups in userspace.
diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
index 3d8178a..abc13c3 100644
--- a/Documentation/namespaces/resource-control.txt
+++ b/Documentation/namespaces/resource-control.txt
@@ -7,4 +7,8 @@ users programs to play nice this problems becomes more acute.
Therefore it is recommended that memory control groups be enabled in
kernels that enable user namespaces, and it is further recommended
that userspace configure memory control groups to limit how much
-memory users they don't trust to play nice can use.
+memory user's they don't trust to play nice can use.
+
+Memory control groups can be configured by installing the libcgroup
+package present on most distros editing /etc/cgrules.conf,
+/etc/cgconfig.conf and setting up libpam-cgroup.
^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
[not found] ` <87txq4hedl.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:13 ` Serge E. Hallyn
@ 2013-01-28 7:37 ` Lord Glauber Costa of Sealand
1 sibling, 0 replies; 62+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-28 7:37 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On 01/26/2013 06:22 AM, Eric W. Biederman wrote:
>
> In the help text describing user namespaces recommend use of memory
> control groups. In many cases memory control groups are the only
> mechanism there is to limit how much memory a user who can create
> user namespaces can use.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
> Documentation/namespaces/resource-control.txt | 10 ++++++++++
> init/Kconfig | 7 +++++++
> 2 files changed, 17 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/namespaces/resource-control.txt
>
> diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
> new file mode 100644
> index 0000000..3d8178a
> --- /dev/null
> +++ b/Documentation/namespaces/resource-control.txt
> @@ -0,0 +1,10 @@
> +There are a lot of kinds of objects in the kernel that don't have
> +individual limits or that have limits that are ineffective when a set
> +of processes is allowed to switch user ids. With user namespaces
> +enabled in a kernel for people who don't trust their users or their
> +users programs to play nice this problems becomes more acute.
> +
> +Therefore it is recommended that memory control groups be enabled in
> +kernels that enable user namespaces, and it is further recommended
> +that userspace configure memory control groups to limit how much
> +memory users they don't trust to play nice can use.
> diff --git a/init/Kconfig b/init/Kconfig
> index 7d30240..c8c58bd 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1035,6 +1035,13 @@ config USER_NS
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
> +
> + When user namespaces are enabled in the kernel it is
> + recommended that the MEMCG and MEMCG_KMEM options also be
> + enabled and that user-space use the memory control groups to
> + limit the amount of memory a memory unprivileged users can
> + use.
> +
> If unsure, say N.
Since this becomes an official recommendation that people will likely
follow, are we really that much concerned about the types of abuses the
MEMCG_KMEM will prevent? Those are mostly metadata-based abuses users
could do in their own local disks without mounting anything extra (and
things that look like that)
Unless there is a specific concern here, shouldn't we say "... that the
MEMCG (and possibly MEMCG_KMEM) options..." ?
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 3/6] userns: Recommend use of memory control groups.
2013-01-26 2:22 ` Eric W. Biederman
(?)
(?)
@ 2013-01-28 7:37 ` Lord Glauber Costa of Sealand
[not found] ` <51062AB5.9060203-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
-1 siblings, 1 reply; 62+ messages in thread
From: Lord Glauber Costa of Sealand @ 2013-01-28 7:37 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Linux Containers, linux-fsdevel, linux-kernel
On 01/26/2013 06:22 AM, Eric W. Biederman wrote:
>
> In the help text describing user namespaces recommend use of memory
> control groups. In many cases memory control groups are the only
> mechanism there is to limit how much memory a user who can create
> user namespaces can use.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> Documentation/namespaces/resource-control.txt | 10 ++++++++++
> init/Kconfig | 7 +++++++
> 2 files changed, 17 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/namespaces/resource-control.txt
>
> diff --git a/Documentation/namespaces/resource-control.txt b/Documentation/namespaces/resource-control.txt
> new file mode 100644
> index 0000000..3d8178a
> --- /dev/null
> +++ b/Documentation/namespaces/resource-control.txt
> @@ -0,0 +1,10 @@
> +There are a lot of kinds of objects in the kernel that don't have
> +individual limits or that have limits that are ineffective when a set
> +of processes is allowed to switch user ids. With user namespaces
> +enabled in a kernel for people who don't trust their users or their
> +users programs to play nice this problems becomes more acute.
> +
> +Therefore it is recommended that memory control groups be enabled in
> +kernels that enable user namespaces, and it is further recommended
> +that userspace configure memory control groups to limit how much
> +memory users they don't trust to play nice can use.
> diff --git a/init/Kconfig b/init/Kconfig
> index 7d30240..c8c58bd 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1035,6 +1035,13 @@ config USER_NS
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
> +
> + When user namespaces are enabled in the kernel it is
> + recommended that the MEMCG and MEMCG_KMEM options also be
> + enabled and that user-space use the memory control groups to
> + limit the amount of memory a memory unprivileged users can
> + use.
> +
> If unsure, say N.
Since this becomes an official recommendation that people will likely
follow, are we really that much concerned about the types of abuses the
MEMCG_KMEM will prevent? Those are mostly metadata-based abuses users
could do in their own local disks without mounting anything extra (and
things that look like that)
Unless there is a specific concern here, shouldn't we say "... that the
MEMCG (and possibly MEMCG_KMEM) options..." ?
^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH review 4/6] userns: Allow the userns root to mount of devpts
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:23 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:23 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
- The context in which devpts is mounted has no effect on the creation
of ptys as the /dev/ptmx interface has been used by unprivileged
users for many years.
- Only support unprivileged mounts in combination with the newinstance
option to ensure that mounting of /dev/pts in a user namespace will
not allow the options of an existing mount of devpts to be modified.
- Create /dev/pts/ptmx as the root user in the user namespace that
mounts devpts so that it's permissions to be changed.
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
fs/devpts/inode.c | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 472e6be..073d30b 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -243,6 +243,13 @@ static int mknod_ptmx(struct super_block *sb)
struct dentry *root = sb->s_root;
struct pts_fs_info *fsi = DEVPTS_SB(sb);
struct pts_mount_opts *opts = &fsi->mount_opts;
+ kuid_t root_uid;
+ kgid_t root_gid;
+
+ root_uid = make_kuid(current_user_ns(), 0);
+ root_gid = make_kgid(current_user_ns(), 0);
+ if (!uid_valid(root_uid) || !gid_valid(root_gid))
+ return -EINVAL;
mutex_lock(&root->d_inode->i_mutex);
@@ -273,6 +280,8 @@ static int mknod_ptmx(struct super_block *sb)
mode = S_IFCHR|opts->ptmxmode;
init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
+ inode->i_uid = root_uid;
+ inode->i_gid = root_gid;
d_add(dentry, inode);
@@ -438,6 +447,12 @@ static struct dentry *devpts_mount(struct file_system_type *fs_type,
if (error)
return ERR_PTR(error);
+ /* Require newinstance for all user namespace mounts to ensure
+ * the mount options are not changed.
+ */
+ if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
+ return ERR_PTR(-EINVAL);
+
if (opts.newinstance)
s = sget(fs_type, NULL, set_anon_super, flags, NULL);
else
@@ -491,6 +506,9 @@ static struct file_system_type devpts_fs_type = {
.name = "devpts",
.mount = devpts_mount,
.kill_sb = devpts_kill_sb,
+#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
+#endif
};
/*
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* [PATCH review 4/6] userns: Allow the userns root to mount of devpts
@ 2013-01-26 2:23 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:23 UTC (permalink / raw)
To: Linux Containers; +Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel
- The context in which devpts is mounted has no effect on the creation
of ptys as the /dev/ptmx interface has been used by unprivileged
users for many years.
- Only support unprivileged mounts in combination with the newinstance
option to ensure that mounting of /dev/pts in a user namespace will
not allow the options of an existing mount of devpts to be modified.
- Create /dev/pts/ptmx as the root user in the user namespace that
mounts devpts so that it's permissions to be changed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
fs/devpts/inode.c | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 472e6be..073d30b 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -243,6 +243,13 @@ static int mknod_ptmx(struct super_block *sb)
struct dentry *root = sb->s_root;
struct pts_fs_info *fsi = DEVPTS_SB(sb);
struct pts_mount_opts *opts = &fsi->mount_opts;
+ kuid_t root_uid;
+ kgid_t root_gid;
+
+ root_uid = make_kuid(current_user_ns(), 0);
+ root_gid = make_kgid(current_user_ns(), 0);
+ if (!uid_valid(root_uid) || !gid_valid(root_gid))
+ return -EINVAL;
mutex_lock(&root->d_inode->i_mutex);
@@ -273,6 +280,8 @@ static int mknod_ptmx(struct super_block *sb)
mode = S_IFCHR|opts->ptmxmode;
init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
+ inode->i_uid = root_uid;
+ inode->i_gid = root_gid;
d_add(dentry, inode);
@@ -438,6 +447,12 @@ static struct dentry *devpts_mount(struct file_system_type *fs_type,
if (error)
return ERR_PTR(error);
+ /* Require newinstance for all user namespace mounts to ensure
+ * the mount options are not changed.
+ */
+ if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
+ return ERR_PTR(-EINVAL);
+
if (opts.newinstance)
s = sget(fs_type, NULL, set_anon_super, flags, NULL);
else
@@ -491,6 +506,9 @@ static struct file_system_type devpts_fs_type = {
.name = "devpts",
.mount = devpts_mount,
.kill_sb = devpts_kill_sb,
+#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
+ .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
+#endif
};
/*
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread[parent not found: <87obgchecv.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 4/6] userns: Allow the userns root to mount of devpts
2013-01-26 2:23 ` Eric W. Biederman
@ 2013-01-26 21:22 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:22 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> - The context in which devpts is mounted has no effect on the creation
> of ptys as the /dev/ptmx interface has been used by unprivileged
> users for many years.
>
> - Only support unprivileged mounts in combination with the newinstance
> option to ensure that mounting of /dev/pts in a user namespace will
> not allow the options of an existing mount of devpts to be modified.
>
> - Create /dev/pts/ptmx as the root user in the user namespace that
> mounts devpts so that it's permissions to be changed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> ---
> fs/devpts/inode.c | 18 ++++++++++++++++++
> 1 files changed, 18 insertions(+), 0 deletions(-)
>
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index 472e6be..073d30b 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -243,6 +243,13 @@ static int mknod_ptmx(struct super_block *sb)
> struct dentry *root = sb->s_root;
> struct pts_fs_info *fsi = DEVPTS_SB(sb);
> struct pts_mount_opts *opts = &fsi->mount_opts;
> + kuid_t root_uid;
> + kgid_t root_gid;
> +
> + root_uid = make_kuid(current_user_ns(), 0);
> + root_gid = make_kgid(current_user_ns(), 0);
> + if (!uid_valid(root_uid) || !gid_valid(root_gid))
> + return -EINVAL;
>
> mutex_lock(&root->d_inode->i_mutex);
>
> @@ -273,6 +280,8 @@ static int mknod_ptmx(struct super_block *sb)
>
> mode = S_IFCHR|opts->ptmxmode;
> init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
> + inode->i_uid = root_uid;
> + inode->i_gid = root_gid;
>
> d_add(dentry, inode);
>
> @@ -438,6 +447,12 @@ static struct dentry *devpts_mount(struct file_system_type *fs_type,
> if (error)
> return ERR_PTR(error);
>
> + /* Require newinstance for all user namespace mounts to ensure
> + * the mount options are not changed.
> + */
> + if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
> + return ERR_PTR(-EINVAL);
> +
> if (opts.newinstance)
> s = sget(fs_type, NULL, set_anon_super, flags, NULL);
> else
> @@ -491,6 +506,9 @@ static struct file_system_type devpts_fs_type = {
> .name = "devpts",
> .mount = devpts_mount,
> .kill_sb = devpts_kill_sb,
> +#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
> + .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
> +#endif
> };
>
> /*
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 4/6] userns: Allow the userns root to mount of devpts
@ 2013-01-26 21:22 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:22 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> - The context in which devpts is mounted has no effect on the creation
> of ptys as the /dev/ptmx interface has been used by unprivileged
> users for many years.
>
> - Only support unprivileged mounts in combination with the newinstance
> option to ensure that mounting of /dev/pts in a user namespace will
> not allow the options of an existing mount of devpts to be modified.
>
> - Create /dev/pts/ptmx as the root user in the user namespace that
> mounts devpts so that it's permissions to be changed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
> fs/devpts/inode.c | 18 ++++++++++++++++++
> 1 files changed, 18 insertions(+), 0 deletions(-)
>
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index 472e6be..073d30b 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -243,6 +243,13 @@ static int mknod_ptmx(struct super_block *sb)
> struct dentry *root = sb->s_root;
> struct pts_fs_info *fsi = DEVPTS_SB(sb);
> struct pts_mount_opts *opts = &fsi->mount_opts;
> + kuid_t root_uid;
> + kgid_t root_gid;
> +
> + root_uid = make_kuid(current_user_ns(), 0);
> + root_gid = make_kgid(current_user_ns(), 0);
> + if (!uid_valid(root_uid) || !gid_valid(root_gid))
> + return -EINVAL;
>
> mutex_lock(&root->d_inode->i_mutex);
>
> @@ -273,6 +280,8 @@ static int mknod_ptmx(struct super_block *sb)
>
> mode = S_IFCHR|opts->ptmxmode;
> init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
> + inode->i_uid = root_uid;
> + inode->i_gid = root_gid;
>
> d_add(dentry, inode);
>
> @@ -438,6 +447,12 @@ static struct dentry *devpts_mount(struct file_system_type *fs_type,
> if (error)
> return ERR_PTR(error);
>
> + /* Require newinstance for all user namespace mounts to ensure
> + * the mount options are not changed.
> + */
> + if ((current_user_ns() != &init_user_ns) && !opts.newinstance)
> + return ERR_PTR(-EINVAL);
> +
> if (opts.newinstance)
> s = sget(fs_type, NULL, set_anon_super, flags, NULL);
> else
> @@ -491,6 +506,9 @@ static struct file_system_type devpts_fs_type = {
> .name = "devpts",
> .mount = devpts_mount,
> .kill_sb = devpts_kill_sb,
> +#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
> + .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
> +#endif
> };
>
> /*
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:26 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:26 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
There is no backing store to ramfs and file creation
rules are the same as for any other filesystem so
it is semantically safe to allow unprivileged users
to mount it.
The memory control group successfully limits how much
memory ramfs can consume on any system that cares about
a user namespace root using ramfs to exhaust memory
the memory control group can be deployed.
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
fs/ramfs/inode.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index eab8c09..c24f1e1 100644
--- a/fs/ramfs/inode.c
+++ b/fs/ramfs/inode.c
@@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
.name = "ramfs",
.mount = ramfs_mount,
.kill_sb = ramfs_kill_sb,
+ .fs_flags = FS_USERNS_MOUNT,
};
static struct file_system_type rootfs_fs_type = {
.name = "rootfs",
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
@ 2013-01-26 2:26 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:26 UTC (permalink / raw)
To: Linux Containers; +Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel
There is no backing store to ramfs and file creation
rules are the same as for any other filesystem so
it is semantically safe to allow unprivileged users
to mount it.
The memory control group successfully limits how much
memory ramfs can consume on any system that cares about
a user namespace root using ramfs to exhaust memory
the memory control group can be deployed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
fs/ramfs/inode.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index eab8c09..c24f1e1 100644
--- a/fs/ramfs/inode.c
+++ b/fs/ramfs/inode.c
@@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
.name = "ramfs",
.mount = ramfs_mount,
.kill_sb = ramfs_kill_sb,
+ .fs_flags = FS_USERNS_MOUNT,
};
static struct file_system_type rootfs_fs_type = {
.name = "rootfs",
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread[parent not found: <87ip6khe7w.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
2013-01-26 2:26 ` Eric W. Biederman
@ 2013-01-26 21:29 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:29 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> There is no backing store to ramfs and file creation
> rules are the same as for any other filesystem so
> it is semantically safe to allow unprivileged users
> to mount it.
>
> The memory control group successfully limits how much
> memory ramfs can consume on any system that cares about
> a user namespace root using ramfs to exhaust memory
> the memory control group can be deployed.
But that does mean that to avoid this new type of attack, when handed a
new kernel (i.e. by one's distro) one has to explicitly (know about and)
configure those limits. The "your distro should do this for you"
argument doesn't seem right. And I'd really prefer there not be
barriers to user namespaces being compiled in when there don't have to
be.
What was your thought on the suggestion to only allow FS_USERNS_MOUNT
mounts by users confined in a non-init memory cgroup?
Alternatively, what about a simple sysctl knob to turn on
FS_USERNS_MOUNTs? Then if I've got no untrusted users I can just turn
that on without the system second-guessing me for not having extra
configuration...
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
> fs/ramfs/inode.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
> index eab8c09..c24f1e1 100644
> --- a/fs/ramfs/inode.c
> +++ b/fs/ramfs/inode.c
> @@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
> .name = "ramfs",
> .mount = ramfs_mount,
> .kill_sb = ramfs_kill_sb,
> + .fs_flags = FS_USERNS_MOUNT,
> };
> static struct file_system_type rootfs_fs_type = {
> .name = "rootfs",
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
@ 2013-01-26 21:29 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-26 21:29 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> There is no backing store to ramfs and file creation
> rules are the same as for any other filesystem so
> it is semantically safe to allow unprivileged users
> to mount it.
>
> The memory control group successfully limits how much
> memory ramfs can consume on any system that cares about
> a user namespace root using ramfs to exhaust memory
> the memory control group can be deployed.
But that does mean that to avoid this new type of attack, when handed a
new kernel (i.e. by one's distro) one has to explicitly (know about and)
configure those limits. The "your distro should do this for you"
argument doesn't seem right. And I'd really prefer there not be
barriers to user namespaces being compiled in when there don't have to
be.
What was your thought on the suggestion to only allow FS_USERNS_MOUNT
mounts by users confined in a non-init memory cgroup?
Alternatively, what about a simple sysctl knob to turn on
FS_USERNS_MOUNTs? Then if I've got no untrusted users I can just turn
that on without the system second-guessing me for not having extra
configuration...
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> fs/ramfs/inode.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
> index eab8c09..c24f1e1 100644
> --- a/fs/ramfs/inode.c
> +++ b/fs/ramfs/inode.c
> @@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
> .name = "ramfs",
> .mount = ramfs_mount,
> .kill_sb = ramfs_kill_sb,
> + .fs_flags = FS_USERNS_MOUNT,
> };
> static struct file_system_type rootfs_fs_type = {
> .name = "rootfs",
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread[parent not found: <20130126212918.GG11274-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
2013-01-26 21:29 ` Serge E. Hallyn
@ 2013-01-27 6:09 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-27 6:09 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>>
>> There is no backing store to ramfs and file creation
>> rules are the same as for any other filesystem so
>> it is semantically safe to allow unprivileged users
>> to mount it.
>>
>> The memory control group successfully limits how much
>> memory ramfs can consume on any system that cares about
>> a user namespace root using ramfs to exhaust memory
>> the memory control group can be deployed.
>
> But that does mean that to avoid this new type of attack, when handed a
> new kernel (i.e. by one's distro) one has to explicitly (know about and)
> configure those limits. The "your distro should do this for you"
> argument doesn't seem right. And I'd really prefer there not be
> barriers to user namespaces being compiled in when there don't have to
> be.
The thing is this really isn't a new type of attack. There are a lot of
existing methods to exhaust memory with the default configuration on
most distros. All this is is a new method to method to implement such
an attack.
Most distros allow a large number or processes and allow those processes
to consume a large if not unlimited amount of ram.
The OOM killer still will recover your system from a ramfs or a tmpfs
mounted in a mount namespace created with user namespace permissions.
It works because the OOM killer will kill all of the processes in the
mount namespace. At which point all of the mounts have their reference
counts go to 0 the filesystems are unmounted. When a ramfs or
tmpfs is unmounted all of the files in a ramfs or tmpfs are freed.
On the flip side every resource has historically come with it's own new
knob. The new knob in this case is memory control groups. It isn't an
rlimit, and it isn't global limit tunable with a sysctl. It is a much
more general knob than that.
> What was your thought on the suggestion to only allow FS_USERNS_MOUNT
> mounts by users confined in a non-init memory cgroup?
Over design.
But more than that there are a lot of other ways to get into trouble if
you don't enable memory control groups with user namespaces. tmpfs is
just the first one I identified.
for (;;) unshare(CLONE_NEWUSER) is equally as bad, and if I look I can
find a bunch of others.
The practical fact is that allowing userspace to exhaust memory and get
the system into an OOM condition happens today. There are lots of lots
of resources that it would take a lot of time to individually limit, or
put a knob on and even then we would miss some. The memory control group
limits all of those now, and isn't particularly hard to configure.
So for the people who care I recommend using the tools that are
available now and work now the memory control group.
Personally I don't think distros care.
> Alternatively, what about a simple sysctl knob to turn on
> FS_USERNS_MOUNTs? Then if I've got no untrusted users I can just turn
> that on without the system second-guessing me for not having extra
> configuration...
I suppose we could do something like what happens on terminals where
scheduler control groups are automatically created by the kernel. Or
perhaps have an on/off sysctl knob for user namespaces themselves. I
don't think anything more fine grained is worth it at this point.
Not that I will oppose more fine grained patches if someone writes else
writes them, I just don't see the bang for the buck.
I understand about not wanting to introduce limits on people enabling
user namespaces. Most distro's don't appear to limit users memory today
so enabling user namespaces won't change anything. For people who do
want to limit a users memory consumption it looks like all you need
to do is something like:
$ apt-get install cgroup-bin libcgroup1 libpam-cgroup
$ cat >> /etc/cgconfig <<EOF
group eric {
perm {
task {
uid = root;
gid = root;
}
admin {
uid = root;
gid = root;
}
}
memory {
memory.limit_in_bytes = 1073741824;
memory.kmem.limit_in_bytes = 1073741824;
}
}
mount {
memory = /mnt/cgroups/memory;
}
EOF
$ cat >> /etc/cgrules <<EOF
eric memory eric/
EOF
So shrug. The mechanisms that I am suggesting people use already exist,
and appear to have been present long enough to have made it into debian
stable release February of 2011.
My apologies for not having done that part of my homework earlier to
know that libpam-cgroup and friends are well established and
have existed for quite a long time.
Eric
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
@ 2013-01-27 6:09 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-27 6:09 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Linux Containers, linux-kernel, linux-fsdevel
"Serge E. Hallyn" <serge@hallyn.com> writes:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
>>
>> There is no backing store to ramfs and file creation
>> rules are the same as for any other filesystem so
>> it is semantically safe to allow unprivileged users
>> to mount it.
>>
>> The memory control group successfully limits how much
>> memory ramfs can consume on any system that cares about
>> a user namespace root using ramfs to exhaust memory
>> the memory control group can be deployed.
>
> But that does mean that to avoid this new type of attack, when handed a
> new kernel (i.e. by one's distro) one has to explicitly (know about and)
> configure those limits. The "your distro should do this for you"
> argument doesn't seem right. And I'd really prefer there not be
> barriers to user namespaces being compiled in when there don't have to
> be.
The thing is this really isn't a new type of attack. There are a lot of
existing methods to exhaust memory with the default configuration on
most distros. All this is is a new method to method to implement such
an attack.
Most distros allow a large number or processes and allow those processes
to consume a large if not unlimited amount of ram.
The OOM killer still will recover your system from a ramfs or a tmpfs
mounted in a mount namespace created with user namespace permissions.
It works because the OOM killer will kill all of the processes in the
mount namespace. At which point all of the mounts have their reference
counts go to 0 the filesystems are unmounted. When a ramfs or
tmpfs is unmounted all of the files in a ramfs or tmpfs are freed.
On the flip side every resource has historically come with it's own new
knob. The new knob in this case is memory control groups. It isn't an
rlimit, and it isn't global limit tunable with a sysctl. It is a much
more general knob than that.
> What was your thought on the suggestion to only allow FS_USERNS_MOUNT
> mounts by users confined in a non-init memory cgroup?
Over design.
But more than that there are a lot of other ways to get into trouble if
you don't enable memory control groups with user namespaces. tmpfs is
just the first one I identified.
for (;;) unshare(CLONE_NEWUSER) is equally as bad, and if I look I can
find a bunch of others.
The practical fact is that allowing userspace to exhaust memory and get
the system into an OOM condition happens today. There are lots of lots
of resources that it would take a lot of time to individually limit, or
put a knob on and even then we would miss some. The memory control group
limits all of those now, and isn't particularly hard to configure.
So for the people who care I recommend using the tools that are
available now and work now the memory control group.
Personally I don't think distros care.
> Alternatively, what about a simple sysctl knob to turn on
> FS_USERNS_MOUNTs? Then if I've got no untrusted users I can just turn
> that on without the system second-guessing me for not having extra
> configuration...
I suppose we could do something like what happens on terminals where
scheduler control groups are automatically created by the kernel. Or
perhaps have an on/off sysctl knob for user namespaces themselves. I
don't think anything more fine grained is worth it at this point.
Not that I will oppose more fine grained patches if someone writes else
writes them, I just don't see the bang for the buck.
I understand about not wanting to introduce limits on people enabling
user namespaces. Most distro's don't appear to limit users memory today
so enabling user namespaces won't change anything. For people who do
want to limit a users memory consumption it looks like all you need
to do is something like:
$ apt-get install cgroup-bin libcgroup1 libpam-cgroup
$ cat >> /etc/cgconfig <<EOF
group eric {
perm {
task {
uid = root;
gid = root;
}
admin {
uid = root;
gid = root;
}
}
memory {
memory.limit_in_bytes = 1073741824;
memory.kmem.limit_in_bytes = 1073741824;
}
}
mount {
memory = /mnt/cgroups/memory;
}
EOF
$ cat >> /etc/cgrules <<EOF
eric memory eric/
EOF
So shrug. The mechanisms that I am suggesting people use already exist,
and appear to have been present long enough to have made it into debian
stable release February of 2011.
My apologies for not having done that part of my homework earlier to
know that libpam-cgroup and friends are well established and
have existed for quite a long time.
Eric
^ permalink raw reply [flat|nested] 62+ messages in thread[parent not found: <87bocb5f8a.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
2013-01-27 6:09 ` Eric W. Biederman
@ 2013-01-27 18:23 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
>
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >>
> >> There is no backing store to ramfs and file creation
> >> rules are the same as for any other filesystem so
> >> it is semantically safe to allow unprivileged users
> >> to mount it.
> >>
> >> The memory control group successfully limits how much
> >> memory ramfs can consume on any system that cares about
> >> a user namespace root using ramfs to exhaust memory
> >> the memory control group can be deployed.
> >
> > But that does mean that to avoid this new type of attack, when handed a
> > new kernel (i.e. by one's distro) one has to explicitly (know about and)
> > configure those limits. The "your distro should do this for you"
> > argument doesn't seem right. And I'd really prefer there not be
> > barriers to user namespaces being compiled in when there don't have to
> > be.
>
> The thing is this really isn't a new type of attack. There are a lot of
Of course.
> existing methods to exhaust memory with the default configuration on
> most distros. All this is is a new method to method to implement such
> an attack.
Right.
...
> > What was your thought on the suggestion to only allow FS_USERNS_MOUNT
> > mounts by users confined in a non-init memory cgroup?
>
> Over design.
Ok. Fair.
> So shrug. The mechanisms that I am suggesting people use already exist,
> and appear to have been present long enough to have made it into debian
> stable release February of 2011.
Heh - right, libcgroup does have its problems, but I don't think there
are any problems with the pam module actually. I'm meant to talk with
the debian maintainer for them soon, will test.
thanks,
-serge
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
@ 2013-01-27 18:23 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Serge E. Hallyn, Linux Containers, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
> "Serge E. Hallyn" <serge@hallyn.com> writes:
>
> > Quoting Eric W. Biederman (ebiederm@xmission.com):
> >>
> >> There is no backing store to ramfs and file creation
> >> rules are the same as for any other filesystem so
> >> it is semantically safe to allow unprivileged users
> >> to mount it.
> >>
> >> The memory control group successfully limits how much
> >> memory ramfs can consume on any system that cares about
> >> a user namespace root using ramfs to exhaust memory
> >> the memory control group can be deployed.
> >
> > But that does mean that to avoid this new type of attack, when handed a
> > new kernel (i.e. by one's distro) one has to explicitly (know about and)
> > configure those limits. The "your distro should do this for you"
> > argument doesn't seem right. And I'd really prefer there not be
> > barriers to user namespaces being compiled in when there don't have to
> > be.
>
> The thing is this really isn't a new type of attack. There are a lot of
Of course.
> existing methods to exhaust memory with the default configuration on
> most distros. All this is is a new method to method to implement such
> an attack.
Right.
...
> > What was your thought on the suggestion to only allow FS_USERNS_MOUNT
> > mounts by users confined in a non-init memory cgroup?
>
> Over design.
Ok. Fair.
> So shrug. The mechanisms that I am suggesting people use already exist,
> and appear to have been present long enough to have made it into debian
> stable release February of 2011.
Heh - right, libcgroup does have its problems, but I don't think there
are any problems with the pam module actually. I'm meant to talk with
the debian maintainer for them soon, will test.
thanks,
-serge
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
[not found] ` <87ip6khe7w.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-26 21:29 ` Serge E. Hallyn
@ 2013-01-27 18:23 ` Serge E. Hallyn
1 sibling, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> There is no backing store to ramfs and file creation
> rules are the same as for any other filesystem so
> it is semantically safe to allow unprivileged users
> to mount it.
>
> The memory control group successfully limits how much
> memory ramfs can consume on any system that cares about
> a user namespace root using ramfs to exhaust memory
> the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> ---
> fs/ramfs/inode.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
> index eab8c09..c24f1e1 100644
> --- a/fs/ramfs/inode.c
> +++ b/fs/ramfs/inode.c
> @@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
> .name = "ramfs",
> .mount = ramfs_mount,
> .kill_sb = ramfs_kill_sb,
> + .fs_flags = FS_USERNS_MOUNT,
> };
> static struct file_system_type rootfs_fs_type = {
> .name = "rootfs",
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 5/6] userns: Allow the userns root to mount ramfs.
2013-01-26 2:26 ` Eric W. Biederman
(?)
(?)
@ 2013-01-27 18:23 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> There is no backing store to ramfs and file creation
> rules are the same as for any other filesystem so
> it is semantically safe to allow unprivileged users
> to mount it.
>
> The memory control group successfully limits how much
> memory ramfs can consume on any system that cares about
> a user namespace root using ramfs to exhaust memory
> the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
> fs/ramfs/inode.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
> index eab8c09..c24f1e1 100644
> --- a/fs/ramfs/inode.c
> +++ b/fs/ramfs/inode.c
> @@ -260,6 +260,7 @@ static struct file_system_type ramfs_fs_type = {
> .name = "ramfs",
> .mount = ramfs_mount,
> .kill_sb = ramfs_kill_sb,
> + .fs_flags = FS_USERNS_MOUNT,
> };
> static struct file_system_type rootfs_fs_type = {
> .name = "rootfs",
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
2013-01-26 2:15 ` Eric W. Biederman
@ 2013-01-26 2:26 ` Eric W. Biederman
-1 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:26 UTC (permalink / raw)
To: Linux Containers
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
There is no backing store to tmpfs and file creation rules are the
same as for any other filesystem so it is semantically safe to allow
unprivileged users to mount it. ramfs is safe for the same reasons so
allow either flavor of tmpfs to be mounted by a user namespace root
user.
The memory control group successfully limits how much memory tmpfs can
consume on any system that cares about a user namespace root using
tmpfs to exhaust memory the memory control group can be deployed.
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
mm/shmem.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5c90d84..197ca5e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
.name = "tmpfs",
.mount = shmem_mount,
.kill_sb = kill_litter_super,
+ .fs_flags = FS_USERNS_MOUNT,
};
int __init shmem_init(void)
@@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
.name = "tmpfs",
.mount = ramfs_mount,
.kill_sb = kill_litter_super,
+ .fs_flags = FS_USERNS_MOUNT,
};
int __init shmem_init(void)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread* [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
@ 2013-01-26 2:26 ` Eric W. Biederman
0 siblings, 0 replies; 62+ messages in thread
From: Eric W. Biederman @ 2013-01-26 2:26 UTC (permalink / raw)
To: Linux Containers; +Cc: Serge E. Hallyn, linux-kernel, linux-fsdevel
There is no backing store to tmpfs and file creation rules are the
same as for any other filesystem so it is semantically safe to allow
unprivileged users to mount it. ramfs is safe for the same reasons so
allow either flavor of tmpfs to be mounted by a user namespace root
user.
The memory control group successfully limits how much memory tmpfs can
consume on any system that cares about a user namespace root using
tmpfs to exhaust memory the memory control group can be deployed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
mm/shmem.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5c90d84..197ca5e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
.name = "tmpfs",
.mount = shmem_mount,
.kill_sb = kill_litter_super,
+ .fs_flags = FS_USERNS_MOUNT,
};
int __init shmem_init(void)
@@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
.name = "tmpfs",
.mount = ramfs_mount,
.kill_sb = kill_litter_super,
+ .fs_flags = FS_USERNS_MOUNT,
};
int __init shmem_init(void)
--
1.7.5.4
^ permalink raw reply related [flat|nested] 62+ messages in thread[parent not found: <87d2wshe6v.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
2013-01-26 2:26 ` Eric W. Biederman
@ 2013-01-27 18:23 ` Serge E. Hallyn
-1 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
>
> There is no backing store to tmpfs and file creation rules are the
> same as for any other filesystem so it is semantically safe to allow
> unprivileged users to mount it. ramfs is safe for the same reasons so
> allow either flavor of tmpfs to be mounted by a user namespace root
> user.
>
> The memory control group successfully limits how much memory tmpfs can
> consume on any system that cares about a user namespace root using
> tmpfs to exhaust memory the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> ---
> mm/shmem.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5c90d84..197ca5e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = shmem_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> @@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = ramfs_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread* Re: [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
@ 2013-01-27 18:23 ` Serge E. Hallyn
0 siblings, 0 replies; 62+ messages in thread
From: Serge E. Hallyn @ 2013-01-27 18:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
Quoting Eric W. Biederman (ebiederm@xmission.com):
>
> There is no backing store to tmpfs and file creation rules are the
> same as for any other filesystem so it is semantically safe to allow
> unprivileged users to mount it. ramfs is safe for the same reasons so
> allow either flavor of tmpfs to be mounted by a user namespace root
> user.
>
> The memory control group successfully limits how much memory tmpfs can
> consume on any system that cares about a user namespace root using
> tmpfs to exhaust memory the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
> ---
> mm/shmem.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5c90d84..197ca5e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = shmem_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> @@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = ramfs_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> --
> 1.7.5.4
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
[not found] ` <87d2wshe6v.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-01-27 18:23 ` Serge E. Hallyn
@ 2013-01-28 1:28 ` Gao feng
1 sibling, 0 replies; 62+ messages in thread
From: Gao feng @ 2013-01-28 1:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On 2013/01/26 10:26, Eric W. Biederman wrote:
>
> There is no backing store to tmpfs and file creation rules are the
> same as for any other filesystem so it is semantically safe to allow
> unprivileged users to mount it. ramfs is safe for the same reasons so
> allow either flavor of tmpfs to be mounted by a user namespace root
> user.
>
> The memory control group successfully limits how much memory tmpfs can
> consume on any system that cares about a user namespace root using
> tmpfs to exhaust memory the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
useful to me,thanks Eric & Serge.
Acked-by: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> mm/shmem.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5c90d84..197ca5e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = shmem_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> @@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = ramfs_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
>
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH review 6/6] userns: Allow the userns root to mount tmpfs.
2013-01-26 2:26 ` Eric W. Biederman
(?)
(?)
@ 2013-01-28 1:28 ` Gao feng
-1 siblings, 0 replies; 62+ messages in thread
From: Gao feng @ 2013-01-28 1:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge E. Hallyn, linux-kernel, linux-fsdevel
On 2013/01/26 10:26, Eric W. Biederman wrote:
>
> There is no backing store to tmpfs and file creation rules are the
> same as for any other filesystem so it is semantically safe to allow
> unprivileged users to mount it. ramfs is safe for the same reasons so
> allow either flavor of tmpfs to be mounted by a user namespace root
> user.
>
> The memory control group successfully limits how much memory tmpfs can
> consume on any system that cares about a user namespace root using
> tmpfs to exhaust memory the memory control group can be deployed.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
useful to me,thanks Eric & Serge.
Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
> mm/shmem.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5c90d84..197ca5e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2766,6 +2766,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = shmem_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
> @@ -2823,6 +2824,7 @@ static struct file_system_type shmem_fs_type = {
> .name = "tmpfs",
> .mount = ramfs_mount,
> .kill_sb = kill_litter_super,
> + .fs_flags = FS_USERNS_MOUNT,
> };
>
> int __init shmem_init(void)
>
^ permalink raw reply [flat|nested] 62+ messages in thread