* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070604194024.GA21703-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
@ 2007-07-16 1:31 ` Andrew Morton
[not found] ` <20070715183132.e31a2064.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2007-07-16 1:31 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, David Howells,
xemul-3ImXcnM4P+0
On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> Add the user namespace struct and framework
>
> Basically, it will allow a process to unshare its user_struct table, resetting
> at the same time its own user_struct and all the associated accounting.
>
> A new root user (uid == 0) is added to the user namespace upon creation. Such
> root users have full privileges and it seems that theses privileges should be
> controlled through some means (process capabilities ?)
The whole magical-uid-0-user thing in this patch seem just wrong to
me.
I'll merge it anyway, mainly because I want to merge _something_ (why oh
why do the git-tree guys leave everything to the last minute?) but it strikes
me that there's something fundamentally wrong whenever the kernel starts
"knowing" about the significance of UIDs in this fashion.
It worries me.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070715183132.e31a2064.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
@ 2007-07-16 14:34 ` Serge E. Hallyn
[not found] ` <20070716143443.GA7393-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Serge E. Hallyn @ 2007-07-16 14:34 UTC (permalink / raw)
To: Andrew Morton
Cc: containers-qjLDD68F18O7TbgM5vRIOg, David Howells,
xemul-3ImXcnM4P+0
Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
>
> > Add the user namespace struct and framework
> >
> > Basically, it will allow a process to unshare its user_struct table, resetting
> > at the same time its own user_struct and all the associated accounting.
> >
> > A new root user (uid == 0) is added to the user namespace upon creation. Such
> > root users have full privileges and it seems that theses privileges should be
> > controlled through some means (process capabilities ?)
>
> The whole magical-uid-0-user thing in this patch seem just wrong to
> me.
>
> I'll merge it anyway, mainly because I want to merge _something_ (why oh
> why do the git-tree guys leave everything to the last minute?) but it strikes
> me that there's something fundamentally wrong whenever the kernel starts
> "knowing" about the significance of UIDs in this fashion.
$(&(%
I thought I disagreed, but now I'm pretty sure I completely agree.
'root_user' exists in the kernel right now, but the root_user checks
which exist (in fork.c and sys.c) shouldn't actually be applied for root
in a container, since the container may not be trusted.
There is the root_user_keyring stuff - David, is that only special cased
so that root's keys can be statically allocated?
> It worries me.
Cedric, you've probably mentioned this before, but what is wrong with
the following patch? User namespaces still seem to work for me with
this, but maybe I'm testing wrong. Can you give it a shot?
thanks,
-serge
From 743e4f5c15ff9ec4110adc9d06e39a3fb0541512 Mon Sep 17 00:00:00 2001
From: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Date: Mon, 16 Jul 2007 10:26:25 -0400
Subject: [PATCH 1/1] userns: remove uid0 logic
While 'root_user' is hard_coded in the kernel, as are uid 0
checks, removing the 'root_user' from a user namespace
does not appear to break user namespaces.
Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
include/linux/user_namespace.h | 1 -
kernel/fork.c | 2 +-
kernel/sys.c | 2 +-
kernel/user_namespace.c | 9 ---------
4 files changed, 2 insertions(+), 12 deletions(-)
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index bb32057..fcb4f30 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -12,7 +12,6 @@
struct user_namespace {
struct kref kref;
struct list_head uidhash_table[UIDHASH_SZ];
- struct user_struct *root_user;
};
extern struct user_namespace init_user_ns;
diff --git a/kernel/fork.c b/kernel/fork.c
index 8b6ba70..f5c7f49 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1004,7 +1004,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
if (atomic_read(&p->user->processes) >=
p->signal->rlim[RLIMIT_NPROC].rlim_cur) {
if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RESOURCE) &&
- p->user != current->nsproxy->user_ns->root_user)
+ p->user != &root_user)
goto bad_fork_free;
}
diff --git a/kernel/sys.c b/kernel/sys.c
index e01b5d1..809e416 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1085,7 +1085,7 @@ static int set_user(uid_t new_ruid, int dumpclear)
if (atomic_read(&new_user->processes) >=
current->signal->rlim[RLIMIT_NPROC].rlim_cur &&
- new_user != current->nsproxy->user_ns->root_user) {
+ new_user != &root_user) {
free_uid(new_user);
return -EAGAIN;
}
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 89a27e8..df11a27 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -14,7 +14,6 @@ struct user_namespace init_user_ns = {
.kref = {
.refcount = ATOMIC_INIT(2),
},
- .root_user = &root_user,
};
EXPORT_SYMBOL_GPL(init_user_ns);
@@ -41,17 +40,9 @@ static struct user_namespace *clone_user_ns(struct user_namespace *old_ns)
for (n = 0; n < UIDHASH_SZ; ++n)
INIT_LIST_HEAD(ns->uidhash_table + n);
- /* Insert new root user. */
- ns->root_user = alloc_uid(ns, 0);
- if (!ns->root_user) {
- kfree(ns);
- return NULL;
- }
-
/* Reset current->user with a new one */
new_user = alloc_uid(ns, current->uid);
if (!new_user) {
- free_uid(ns->root_user);
kfree(ns);
return NULL;
}
--
1.5.1.1.GIT
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070716143443.GA7393-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
@ 2007-07-16 14:38 ` Serge E. Hallyn
2007-07-16 14:54 ` David Howells
2007-07-16 14:54 ` Kirill Korotaev
2 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2007-07-16 14:38 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, David Howells, Andrew Morton,
xemul-3ImXcnM4P+0
Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> > On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> >
> > > Add the user namespace struct and framework
> > >
> > > Basically, it will allow a process to unshare its user_struct table, resetting
> > > at the same time its own user_struct and all the associated accounting.
> > >
> > > A new root user (uid == 0) is added to the user namespace upon creation. Such
> > > root users have full privileges and it seems that theses privileges should be
> > > controlled through some means (process capabilities ?)
> >
> > The whole magical-uid-0-user thing in this patch seem just wrong to
> > me.
> >
> > I'll merge it anyway, mainly because I want to merge _something_ (why oh
> > why do the git-tree guys leave everything to the last minute?) but it strikes
> > me that there's something fundamentally wrong whenever the kernel starts
> > "knowing" about the significance of UIDs in this fashion.
>
> $(&(%
>
> I thought I disagreed, but now I'm pretty sure I completely agree.
>
> 'root_user' exists in the kernel right now, but the root_user checks
> which exist (in fork.c and sys.c) shouldn't actually be applied for root
> in a container, since the container may not be trusted.
By the way, I don't think these two uses of 'user == &root_user' are
legitimate. CAP_SYS_RESOURCE should probably be checked for both. Hah,
it already is in fork.c, in addition to root_user check.
But I guess I should take that up elsewhere.
thanks,
-serge
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070716143443.GA7393-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2007-07-16 14:38 ` Serge E. Hallyn
@ 2007-07-16 14:54 ` David Howells
[not found] ` <28037.1184597651-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2007-07-16 14:54 ` Kirill Korotaev
2 siblings, 1 reply; 9+ messages in thread
From: David Howells @ 2007-07-16 14:54 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton,
xemul-3ImXcnM4P+0
Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> There is the root_user_keyring stuff - David, is that only special cased
> so that root's keys can be statically allocated?
It's because the boot processes start up with UID 0, and the user_struct for
UID 0 is statically allocated (root_user).
David
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070716143443.GA7393-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2007-07-16 14:38 ` Serge E. Hallyn
2007-07-16 14:54 ` David Howells
@ 2007-07-16 14:54 ` Kirill Korotaev
[not found] ` <469B86A4.3050006-3ImXcnM4P+0@public.gmane.org>
2 siblings, 1 reply; 9+ messages in thread
From: Kirill Korotaev @ 2007-07-16 14:54 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, David Howells, Andrew Morton,
xemul-3ImXcnM4P+0
Serge E. Hallyn wrote:
> Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
>
>>On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
>>
>>
>>>Add the user namespace struct and framework
>>>
>>>Basically, it will allow a process to unshare its user_struct table, resetting
>>>at the same time its own user_struct and all the associated accounting.
>>>
>>>A new root user (uid == 0) is added to the user namespace upon creation. Such
>>>root users have full privileges and it seems that theses privileges should be
>>>controlled through some means (process capabilities ?)
>>
>>The whole magical-uid-0-user thing in this patch seem just wrong to
>>me.
>>
>>I'll merge it anyway, mainly because I want to merge _something_ (why oh
>>why do the git-tree guys leave everything to the last minute?) but it strikes
>>me that there's something fundamentally wrong whenever the kernel starts
>>"knowing" about the significance of UIDs in this fashion.
>
>
> $(&(%
>
> I thought I disagreed, but now I'm pretty sure I completely agree.
>
> 'root_user' exists in the kernel right now, but the root_user checks
> which exist (in fork.c and sys.c) shouldn't actually be applied for root
> in a container, since the container may not be trusted.
This rlimit check doesn't help *untrusted* containers, so your logic is wrong here.
Instead, it allows root of the container to operate in any situation.
E.g. consider root user hit the limit. After that you won't be able to login/ssh to fix anything.
NOTE: container root can have no CAP_SYS_RESOURCE and CAP_SYS_ADMIN as it is in OpenVZ.
But in general I'm not against the patch, since in OpenVZ we can replace the check
with another capability we use for VE admin - CAP_VE_SYS_ADMIN.
Kirill
> There is the root_user_keyring stuff - David, is that only special cased
> so that root's keys can be statically allocated?
>
>
>>It worries me.
>
>
> Cedric, you've probably mentioned this before, but what is wrong with
> the following patch? User namespaces still seem to work for me with
> this, but maybe I'm testing wrong. Can you give it a shot?
>
> thanks,
> -serge
>
>>From 743e4f5c15ff9ec4110adc9d06e39a3fb0541512 Mon Sep 17 00:00:00 2001
> From: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> Date: Mon, 16 Jul 2007 10:26:25 -0400
> Subject: [PATCH 1/1] userns: remove uid0 logic
>
> While 'root_user' is hard_coded in the kernel, as are uid 0
> checks, removing the 'root_user' from a user namespace
> does not appear to break user namespaces.
>
> Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> include/linux/user_namespace.h | 1 -
> kernel/fork.c | 2 +-
> kernel/sys.c | 2 +-
> kernel/user_namespace.c | 9 ---------
> 4 files changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> index bb32057..fcb4f30 100644
> --- a/include/linux/user_namespace.h
> +++ b/include/linux/user_namespace.h
> @@ -12,7 +12,6 @@
> struct user_namespace {
> struct kref kref;
> struct list_head uidhash_table[UIDHASH_SZ];
> - struct user_struct *root_user;
> };
>
> extern struct user_namespace init_user_ns;
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 8b6ba70..f5c7f49 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1004,7 +1004,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> if (atomic_read(&p->user->processes) >=
> p->signal->rlim[RLIMIT_NPROC].rlim_cur) {
> if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RESOURCE) &&
> - p->user != current->nsproxy->user_ns->root_user)
> + p->user != &root_user)
> goto bad_fork_free;
> }
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index e01b5d1..809e416 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1085,7 +1085,7 @@ static int set_user(uid_t new_ruid, int dumpclear)
>
> if (atomic_read(&new_user->processes) >=
> current->signal->rlim[RLIMIT_NPROC].rlim_cur &&
> - new_user != current->nsproxy->user_ns->root_user) {
> + new_user != &root_user) {
> free_uid(new_user);
> return -EAGAIN;
> }
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index 89a27e8..df11a27 100644ldn't actually be applied for root
in a container, since the container may not be trus
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -14,7 +14,6 @@ struct user_namespace init_user_ns = {
> .kref = {
> .refcount = ATOMIC_INIT(2),
> },
> - .root_user = &root_user,
> };
>
> EXPORT_SYMBOL_GPL(init_user_ns);
> @@ -41,17 +40,9 @@ static struct user_namespace *clone_user_ns(struct user_namespace *old_ns)
> for (n = 0; n < UIDHASH_SZ; ++n)
> INIT_LIST_HEAD(ns->uidhash_table + n);
>
> - /* Insert new root user. */
> - ns->root_user = alloc_uid(ns, 0);
> - if (!ns->root_user) {
> - kfree(ns);
> - return NULL;
> - }
> -
> /* Reset current->user with a new one */
> new_user = alloc_uid(ns, current->uid);
> if (!new_user) {
> - free_uid(ns->root_user);
> kfree(ns);
> return NULL;
> }
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <469B86A4.3050006-3ImXcnM4P+0@public.gmane.org>
@ 2007-07-16 15:08 ` Serge E. Hallyn
[not found] ` <20070716150800.GA31369-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Serge E. Hallyn @ 2007-07-16 15:08 UTC (permalink / raw)
To: Kirill Korotaev
Cc: containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton,
xemul-3ImXcnM4P+0, David Howells
Quoting Kirill Korotaev (dev-3ImXcnM4P+0@public.gmane.org):
> Serge E. Hallyn wrote:
> > Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> >
> >>On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> >>
> >>
> >>>Add the user namespace struct and framework
> >>>
> >>>Basically, it will allow a process to unshare its user_struct table, resetting
> >>>at the same time its own user_struct and all the associated accounting.
> >>>
> >>>A new root user (uid == 0) is added to the user namespace upon creation. Such
> >>>root users have full privileges and it seems that theses privileges should be
> >>>controlled through some means (process capabilities ?)
> >>
> >>The whole magical-uid-0-user thing in this patch seem just wrong to
> >>me.
> >>
> >>I'll merge it anyway, mainly because I want to merge _something_ (why oh
> >>why do the git-tree guys leave everything to the last minute?) but it strikes
> >>me that there's something fundamentally wrong whenever the kernel starts
> >>"knowing" about the significance of UIDs in this fashion.
> >
> >
> > $(&(%
> >
> > I thought I disagreed, but now I'm pretty sure I completely agree.
> >
> > 'root_user' exists in the kernel right now, but the root_user checks
> > which exist (in fork.c and sys.c) shouldn't actually be applied for root
> > in a container, since the container may not be trusted.
>
> This rlimit check doesn't help *untrusted* containers, so your logic is wrong here.
> Instead, it allows root of the container to operate in any situation.
And I'm not sure that should be the case.
In my view, root of a container is equivalent to a normal user on the
host system, just like root in a qemu process.
> E.g. consider root user hit the limit. After that you won't be able to login/ssh to fix anything.
That's fine in the container.
> NOTE: container root can have no CAP_SYS_RESOURCE and CAP_SYS_ADMIN as it is in OpenVZ.
And eventually we'll want that to be the default in upstream containers.
But it's not the case upstream right now. Before we can do that, we
need an answer to per-container capabilities.
Do you (either you specifically, or anyone at openvz) have plans to
address the per-container capabilities problem? Herbert? Eric?
I'm interested, but would like to get the file capabilites squared away
before I consider coding on it.
> But in general I'm not against the patch, since in OpenVZ we can replace the check
> with another capability we use for VE admin - CAP_VE_SYS_ADMIN.
If that truly sufficies then great. If not, then in order to support
openvz in the meantime I say we drop my patch, but we remember that when
we straighten out the security issues this will need to be addressed.
thanks,
-serge
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <28037.1184597651-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2007-07-16 15:27 ` Serge E. Hallyn
0 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2007-07-16 15:27 UTC (permalink / raw)
To: David Howells
Cc: containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton,
xemul-3ImXcnM4P+0
Quoting David Howells (dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
>
> > There is the root_user_keyring stuff - David, is that only special cased
> > so that root's keys can be statically allocated?
>
> It's because the boot processes start up with UID 0, and the user_struct for
> UID 0 is statically allocated (root_user).
Right - thanks David.
-serge
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070716150800.GA31369-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
@ 2007-07-18 0:11 ` Herbert Poetzl
[not found] ` <20070718001135.GA27495-ZD0Mn47LIGX0Pe/G4T7+5F6hYfS7NtTn@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Poetzl @ 2007-07-18 0:11 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, David Howells, Andrew Morton,
xemul-3ImXcnM4P+0
On Mon, Jul 16, 2007 at 10:08:00AM -0500, Serge E. Hallyn wrote:
> Quoting Kirill Korotaev (dev-3ImXcnM4P+0@public.gmane.org):
> > Serge E. Hallyn wrote:
> > > Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> > >
> > >>On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> > >>
> > >>
> > >>>Add the user namespace struct and framework
> > >>>
> > >>>Basically, it will allow a process to unshare its user_struct
> > >>>table, resetting at the same time its own user_struct and all the
> > >>>associated accounting.
> > >>>
> > >>>A new root user (uid == 0) is added to the user namespace upon
> > >>>creation. Such root users have full privileges and it seems
> > >>>that theses privileges should be controlled through some means
> > >>>(process capabilities ?)
> > >>
> > >>The whole magical-uid-0-user thing in this patch seem just wrong
> > >>to me.
> > >>
> > >>I'll merge it anyway, mainly because I want to merge _something_
> > >>(why oh why do the git-tree guys leave everything to the last
> > >>minute?) but it strikes me that there's something fundamentally
> > >>wrong whenever the kernel starts "knowing" about the significance
> > >>of UIDs in this fashion.
> > >
> > >
> > > $(&(%
> > >
> > > I thought I disagreed, but now I'm pretty sure I completely agree.
> > >
> > > 'root_user' exists in the kernel right now, but the root_user
> > > checks which exist (in fork.c and sys.c) shouldn't actually be
> > > applied for root in a container, since the container may not be
> > > trusted.
> >
> > This rlimit check doesn't help *untrusted* containers, so your logic
> > is wrong here. Instead, it allows root of the container to operate
> > in any situation.
>
> And I'm not sure that should be the case.
>
> In my view, root of a container is equivalent to a normal user on the
> host system, just like root in a qemu process.
>
> > E.g. consider root user hit the limit. After that you won't be able
> > to login/ssh to fix anything.
>
> That's fine in the container.
>
> > NOTE: container root can have no CAP_SYS_RESOURCE and CAP_SYS_ADMIN
> > as it is in OpenVZ.
> And eventually we'll want that to be the default in upstream containers.
> But it's not the case upstream right now. Before we can do that, we
> need an answer to per-container capabilities.
>
> Do you (either you specifically, or anyone at openvz) have plans to
> address the per-container capabilities problem? Herbert? Eric?
it is already addressed in Linux-VServer and OpenVZ
Linux-VServer adds a so called 'capabilitiy mask',
which is applied to the 'normal' capability system,
thus a guest cannot utilize/exercise capabilities
not included in that mask (which makes the guest
root 'secure')
> I'm interested, but would like to get the file capabilites squared
> away before I consider coding on it.
>
> > But in general I'm not against the patch, since in OpenVZ we can
> > replace the check with another capability we use for VE admin -
> > CAP_VE_SYS_ADMIN.
>
> If that truly sufficies then great. If not, then in order to support
> openvz in the meantime I say we drop my patch, but we remember that
> when we straighten out the security issues this will need to be
> addressed.
I'm not very fond of handling guest or host root special
and I think the capability system was designed to exactly
handle the guest root case properly ...
will look through the patches shortly and comment ...
best,
Herbert
> thanks,
> -serge
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/6] user namespace : add the framework
[not found] ` <20070718001135.GA27495-ZD0Mn47LIGX0Pe/G4T7+5F6hYfS7NtTn@public.gmane.org>
@ 2007-07-18 14:21 ` Serge E. Hallyn
0 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2007-07-18 14:21 UTC (permalink / raw)
To: Herbert Poetzl
Cc: xemul-3ImXcnM4P+0, David Howells,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton
Quoting Herbert Poetzl (herbert-dBHVzrDq9nF4Lj/PQRBjDg@public.gmane.org):
> On Mon, Jul 16, 2007 at 10:08:00AM -0500, Serge E. Hallyn wrote:
> > Quoting Kirill Korotaev (dev-3ImXcnM4P+0@public.gmane.org):
> > > Serge E. Hallyn wrote:
> > > > Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> > > >
> > > >>On Mon, 4 Jun 2007 14:40:24 -0500 "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> > > >>
> > > >>
> > > >>>Add the user namespace struct and framework
> > > >>>
> > > >>>Basically, it will allow a process to unshare its user_struct
> > > >>>table, resetting at the same time its own user_struct and all the
> > > >>>associated accounting.
> > > >>>
> > > >>>A new root user (uid == 0) is added to the user namespace upon
> > > >>>creation. Such root users have full privileges and it seems
> > > >>>that theses privileges should be controlled through some means
> > > >>>(process capabilities ?)
> > > >>
> > > >>The whole magical-uid-0-user thing in this patch seem just wrong
> > > >>to me.
> > > >>
> > > >>I'll merge it anyway, mainly because I want to merge _something_
> > > >>(why oh why do the git-tree guys leave everything to the last
> > > >>minute?) but it strikes me that there's something fundamentally
> > > >>wrong whenever the kernel starts "knowing" about the significance
> > > >>of UIDs in this fashion.
> > > >
> > > >
> > > > $(&(%
> > > >
> > > > I thought I disagreed, but now I'm pretty sure I completely agree.
> > > >
> > > > 'root_user' exists in the kernel right now, but the root_user
> > > > checks which exist (in fork.c and sys.c) shouldn't actually be
> > > > applied for root in a container, since the container may not be
> > > > trusted.
> > >
> > > This rlimit check doesn't help *untrusted* containers, so your logic
> > > is wrong here. Instead, it allows root of the container to operate
> > > in any situation.
> >
> > And I'm not sure that should be the case.
> >
> > In my view, root of a container is equivalent to a normal user on the
> > host system, just like root in a qemu process.
> >
> > > E.g. consider root user hit the limit. After that you won't be able
> > > to login/ssh to fix anything.
> >
> > That's fine in the container.
> >
> > > NOTE: container root can have no CAP_SYS_RESOURCE and CAP_SYS_ADMIN
> > > as it is in OpenVZ.
>
> > And eventually we'll want that to be the default in upstream containers.
> > But it's not the case upstream right now. Before we can do that, we
> > need an answer to per-container capabilities.
> >
> > Do you (either you specifically, or anyone at openvz) have plans to
> > address the per-container capabilities problem? Herbert? Eric?
>
> it is already addressed in Linux-VServer and OpenVZ
> Linux-VServer adds a so called 'capabilitiy mask',
> which is applied to the 'normal' capability system,
> thus a guest cannot utilize/exercise capabilities
> not included in that mask (which makes the guest
> root 'secure')
Are you special-casing some capabilities? For instance, cap_sys_admin
in a container obviously shouldn't be fully granted, but some of it's
abilities (setting hostname, setting enc key on loopback, etc) should
be allowed.
Anyway, there unfortunately are several problems with a plain capability
mask now.
First, there is the fact that we don't have just 'the host' and 'virtual
hosts'. We have a set of namespaces, with usually no notion of one
master namespace. So do we just mask out the masked capabilities as
soon as any process does an unshare with any of the key flags? Or do
we have a separate way to 'unshare capabilities'?
We could make cap-bset a per-process thing, where any process can take
flags out (but not add them back in).
We could make a cap-bset container, where when creating a new container,
we can take capabilities out of cap-bset.
Back to shortcomings,
Second, we wanted to implement user equivalence across namespaces. And
to do that for root likely requires cross-namespace capabilities.
For instance, let's say I, user hallyn, unshare all my namespaces to
create a virtual host vh.
Maybe I want user serge in vh to be equivalent to user hallyn
in the real host. So I want to give it a key (host-uidns,
hallyn). Now it can read all my files.
Maybe I want user hallyn in the host to have acess to all files
in the vh, whether owned by root or any user. So I want to
give user hallyn in the host uidns to have a key (vh-uidns,
root).
That second example is nice and simple, but of course we prefer
to think about capabilities. So I might just want to be able
to give user hallyn in the host uid namespace a key
(vh-uidns, CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_PTRACE).
Maybe all of this can be addressed with a posix capability container,
a custom ns_capability LSM, and a uidns keyring.
thanks,
-serge
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-07-18 14:21 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20070604193957.GA19331@sergelap.austin.ibm.com>
[not found] ` <20070604194024.GA21703@sergelap.austin.ibm.com>
[not found] ` <20070604194024.GA21703-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2007-07-16 1:31 ` [PATCH 1/6] user namespace : add the framework Andrew Morton
[not found] ` <20070715183132.e31a2064.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2007-07-16 14:34 ` Serge E. Hallyn
[not found] ` <20070716143443.GA7393-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2007-07-16 14:38 ` Serge E. Hallyn
2007-07-16 14:54 ` David Howells
[not found] ` <28037.1184597651-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2007-07-16 15:27 ` Serge E. Hallyn
2007-07-16 14:54 ` Kirill Korotaev
[not found] ` <469B86A4.3050006-3ImXcnM4P+0@public.gmane.org>
2007-07-16 15:08 ` Serge E. Hallyn
[not found] ` <20070716150800.GA31369-6s5zFf/epYLPQpwDFJZrxKsjOiXwFzmk@public.gmane.org>
2007-07-18 0:11 ` Herbert Poetzl
[not found] ` <20070718001135.GA27495-ZD0Mn47LIGX0Pe/G4T7+5F6hYfS7NtTn@public.gmane.org>
2007-07-18 14:21 ` Serge E. Hallyn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox