* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20081219120954.GB27472-yp6mvK3Bdd2rDJvtcaxF/A@public.gmane.org>
@ 2008-12-19 21:16 ` Andrew Morton
[not found] ` <20081219131641.cafc65ac.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2008-12-19 21:16 UTC (permalink / raw)
To: Grzegorz Nosek
Cc: containers-qjLDD68F18O7TbgM5vRIOg, menage-hpIqsD4AKlfQT0dZR+AlfA
(cc containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org)
Please don't send patches via private email!
From: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
The ns_proxy cgroup allows moving processes to child cgroups only one
level deep at a time. This commit relaxes this restriction and makes it
possible to attach tasks directly to grandchild cgroups, e.g.:
($pid is in the root cgroup)
echo $pid > /cgroup/CG1/CG2/tasks
Previously this operation would fail with -EPERM and would have to be
performed as two steps:
echo $pid > /cgroup/CG1/tasks
echo $pid > /cgroup/CG1/CG2/tasks
Signed-off-by: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
Reviewed-by: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Balbir Singh <balbir-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Signed-off-by: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
---
include/linux/cgroup.h | 4 ++--
kernel/cgroup.c | 12 +++++++-----
kernel/ns_cgroup.c | 9 +++------
3 files changed, 12 insertions(+), 13 deletions(-)
diff -puN include/linux/cgroup.h~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups include/linux/cgroup.h
--- a/include/linux/cgroup.h~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
+++ a/include/linux/cgroup.h
@@ -344,8 +344,8 @@ int cgroup_path(const struct cgroup *cgr
int cgroup_task_count(const struct cgroup *cgrp);
-/* Return true if the cgroup is a descendant of the current cgroup */
-int cgroup_is_descendant(const struct cgroup *cgrp);
+/* Return true if cgrp is a descendant of the task's cgroup */
+int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task);
/* Control Group subsystem type. See Documentation/cgroups.txt for details */
diff -puN kernel/cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups kernel/cgroup.c
--- a/kernel/cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
+++ a/kernel/cgroup.c
@@ -3072,18 +3072,19 @@ int cgroup_clone(struct task_struct *tsk
}
/**
- * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
+ * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
* @cgrp: the cgroup in question
+ * @task: the task in question
*
- * See if @cgrp is a descendant of the current task's cgroup in
- * the appropriate hierarchy.
+ * See if @cgrp is a descendant of @task's cgroup in the appropriate
+ * hierarchy.
*
* If we are sending in dummytop, then presumably we are creating
* the top cgroup in the subsystem.
*
* Called only by the ns (nsproxy) cgroup.
*/
-int cgroup_is_descendant(const struct cgroup *cgrp)
+int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
{
int ret;
struct cgroup *target;
@@ -3093,7 +3094,8 @@ int cgroup_is_descendant(const struct cg
return 1;
get_first_subsys(cgrp, NULL, &subsys_id);
- target = task_cgroup(current, subsys_id);
+ target = task_cgroup(task, subsys_id);
+
while (cgrp != target && cgrp!= cgrp->top_cgroup)
cgrp = cgrp->parent;
ret = (cgrp == target);
diff -puN kernel/ns_cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups kernel/ns_cgroup.c
--- a/kernel/ns_cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
+++ a/kernel/ns_cgroup.c
@@ -45,21 +45,18 @@ int ns_cgroup_clone(struct task_struct *
static int ns_can_attach(struct cgroup_subsys *ss,
struct cgroup *new_cgroup, struct task_struct *task)
{
- struct cgroup *orig;
-
if (current != task) {
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- if (!cgroup_is_descendant(new_cgroup))
+ if (!cgroup_is_descendant(new_cgroup, current))
return -EPERM;
}
if (atomic_read(&new_cgroup->count) != 0)
return -EPERM;
- orig = task_cgroup(task, ns_subsys_id);
- if (orig && orig != new_cgroup->parent)
+ if (!cgroup_is_descendant(new_cgroup, task))
return -EPERM;
return 0;
@@ -77,7 +74,7 @@ static struct cgroup_subsys_state *ns_cr
if (!capable(CAP_SYS_ADMIN))
return ERR_PTR(-EPERM);
- if (!cgroup_is_descendant(cgroup))
+ if (!cgroup_is_descendant(cgroup, current))
return ERR_PTR(-EPERM);
ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
_
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20081219131641.cafc65ac.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
@ 2008-12-19 22:23 ` Serge E. Hallyn
[not found] ` <20081219222304.GA25828-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Serge E. Hallyn @ 2008-12-19 22:23 UTC (permalink / raw)
To: Andrew Morton
Cc: containers-qjLDD68F18O7TbgM5vRIOg, menage-hpIqsD4AKlfQT0dZR+AlfA,
Grzegorz Nosek
Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
> (cc containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org)
>
> Please don't send patches via private email!
>
>
>
> From: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
>
> The ns_proxy cgroup allows moving processes to child cgroups only one
> level deep at a time. This commit relaxes this restriction and makes it
> possible to attach tasks directly to grandchild cgroups, e.g.:
>
> ($pid is in the root cgroup)
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Previously this operation would fail with -EPERM and would have to be
> performed as two steps:
> echo $pid > /cgroup/CG1/tasks
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Signed-off-by: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
> Reviewed-by: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Balbir Singh <balbir-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Pavel Emelyanov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
> Signed-off-by: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
I trust (since you're not removing it) that the restriction that
the target cgroup be empty is not a problem?
Also, 'rule 1' in the comment above ns_can_attach should be modified
accordingly (s/child/descendant).
thanks,
-serge
> ---
>
> include/linux/cgroup.h | 4 ++--
> kernel/cgroup.c | 12 +++++++-----
> kernel/ns_cgroup.c | 9 +++------
> 3 files changed, 12 insertions(+), 13 deletions(-)
>
> diff -puN include/linux/cgroup.h~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups include/linux/cgroup.h
> --- a/include/linux/cgroup.h~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
> +++ a/include/linux/cgroup.h
> @@ -344,8 +344,8 @@ int cgroup_path(const struct cgroup *cgr
>
> int cgroup_task_count(const struct cgroup *cgrp);
>
> -/* Return true if the cgroup is a descendant of the current cgroup */
> -int cgroup_is_descendant(const struct cgroup *cgrp);
> +/* Return true if cgrp is a descendant of the task's cgroup */
> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task);
>
> /* Control Group subsystem type. See Documentation/cgroups.txt for details */
>
> diff -puN kernel/cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups kernel/cgroup.c
> --- a/kernel/cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
> +++ a/kernel/cgroup.c
> @@ -3072,18 +3072,19 @@ int cgroup_clone(struct task_struct *tsk
> }
>
> /**
> - * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
> + * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
> * @cgrp: the cgroup in question
> + * @task: the task in question
> *
> - * See if @cgrp is a descendant of the current task's cgroup in
> - * the appropriate hierarchy.
> + * See if @cgrp is a descendant of @task's cgroup in the appropriate
> + * hierarchy.
> *
> * If we are sending in dummytop, then presumably we are creating
> * the top cgroup in the subsystem.
> *
> * Called only by the ns (nsproxy) cgroup.
> */
> -int cgroup_is_descendant(const struct cgroup *cgrp)
> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
> {
> int ret;
> struct cgroup *target;
> @@ -3093,7 +3094,8 @@ int cgroup_is_descendant(const struct cg
> return 1;
>
> get_first_subsys(cgrp, NULL, &subsys_id);
> - target = task_cgroup(current, subsys_id);
> + target = task_cgroup(task, subsys_id);
> +
> while (cgrp != target && cgrp!= cgrp->top_cgroup)
> cgrp = cgrp->parent;
> ret = (cgrp == target);
> diff -puN kernel/ns_cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups kernel/ns_cgroup.c
> --- a/kernel/ns_cgroup.c~relax-ns_can_attach-checks-to-allow-attaching-to-grandchild-cgroups
> +++ a/kernel/ns_cgroup.c
> @@ -45,21 +45,18 @@ int ns_cgroup_clone(struct task_struct *
> static int ns_can_attach(struct cgroup_subsys *ss,
> struct cgroup *new_cgroup, struct task_struct *task)
> {
> - struct cgroup *orig;
> -
> if (current != task) {
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
>
> - if (!cgroup_is_descendant(new_cgroup))
> + if (!cgroup_is_descendant(new_cgroup, current))
> return -EPERM;
> }
>
> if (atomic_read(&new_cgroup->count) != 0)
> return -EPERM;
>
> - orig = task_cgroup(task, ns_subsys_id);
> - if (orig && orig != new_cgroup->parent)
> + if (!cgroup_is_descendant(new_cgroup, task))
> return -EPERM;
>
> return 0;
> @@ -77,7 +74,7 @@ static struct cgroup_subsys_state *ns_cr
>
> if (!capable(CAP_SYS_ADMIN))
> return ERR_PTR(-EPERM);
> - if (!cgroup_is_descendant(cgroup))
> + if (!cgroup_is_descendant(cgroup, current))
> return ERR_PTR(-EPERM);
>
> ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
> _
>
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20081219222304.GA25828-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2008-12-19 23:03 ` Grzegorz Nosek
[not found] ` <20081219230330.GA30623-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Grzegorz Nosek @ 2008-12-19 23:03 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton,
menage-hpIqsD4AKlfQT0dZR+AlfA, Grzegorz Nosek
On pią, gru 19, 2008 at 04:23:04 -0600, Serge E. Hallyn wrote:
> Quoting Andrew Morton (akpm@linux-foundation.org):
> > (cc containers@lists.osdl.org)
> >
> > Please don't send patches via private email!
My apologies.
> I trust (since you're not removing it) that the restriction that
> the target cgroup be empty is not a problem?
Sigh, good catch. I'm building my lxc-based environment slowly and I'm
only testing the most basic stuff currently, so I'd bug you about it
eventually.
Frankly, I don't understand the reason behind these restrictions and
feel like I'm missing some important piece of a puzzle. In my tests all
the tasks in question are living in the same namespace (though it won't
always be so), so I'd guess I should be able to move the tasks freely
between cgroups. Why exactly does the target cgroup have to be empty?
Also, should we remember the task->nsproxy pointer in the cgroup data
and ignore hierarchy if it matches? I guess it would be safe to store
the raw pointer without refcounting it in any way as we'd never
dereference it (could keep it as uintptr_t to reinforce the idea) but
only compare with another pointer.
Does that make any sense? Or should I simply mount the cgroup fs without
the ns subsystem and forget the whole thing? What exactly do I lose by
doing so?
> Also, 'rule 1' in the comment above ns_can_attach should be modified
> accordingly (s/child/descendant).
Indeed. Will resend after receiving some enlightenment about the above.
Thank you for your comments.
Best regards,
Grzegorz Nosek
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20081219230330.GA30623-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
@ 2008-12-19 23:34 ` Serge E. Hallyn
0 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2008-12-19 23:34 UTC (permalink / raw)
To: Grzegorz Nosek
Cc: containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton,
menage-hpIqsD4AKlfQT0dZR+AlfA
Quoting Grzegorz Nosek (root@localdomain.pl):
> On pią, gru 19, 2008 at 04:23:04 -0600, Serge E. Hallyn wrote:
> > Quoting Andrew Morton (akpm@linux-foundation.org):
> > > (cc containers@lists.osdl.org)
> > >
> > > Please don't send patches via private email!
>
> My apologies.
>
> > I trust (since you're not removing it) that the restriction that
> > the target cgroup be empty is not a problem?
>
> Sigh, good catch. I'm building my lxc-based environment slowly and I'm
> only testing the most basic stuff currently, so I'd bug you about it
> eventually.
>
> Frankly, I don't understand the reason behind these restrictions and
> feel like I'm missing some important piece of a puzzle. In my tests all
> the tasks in question are living in the same namespace (though it won't
> always be so), so I'd guess I should be able to move the tasks freely
> between cgroups. Why exactly does the target cgroup have to be empty?
The reasoning goes back to one motivation of the ns cgroup being
to facilitate actual moves between namespaces. Since that is no
longer being considered, easing the restrictions is ok.
On the other hand, the only remaining use for the ns cgroup is to
provide some locking of tasks/containers into cgroups. So the main
restriction I'd like to keep in place is that you can only go
downward in cgroup hierarchy. (Think devices whitelist cgroup).
Now maybe it makes sense to split the two things ns_cgroup does
into 2+ cgroups: one (nstrack) would simply clone a new child
cgroup every time a task does an unshare, another (if mounted)
prevents a task from moving to a cgroup which isn't a desendent,
while a third could do more complicated controls, i.e. a task
could be locked into cgroup:/a/b, after which it could freely
move up and down under cgroup:/a/b, (i.e. to switch between
cgroup:/a/b/c1 and cgroup:/a/b/c2), but could never escape
cgroup:/a/b. Then you could choose which if any movement-controlling
and movement-tracking cgroups to compose.
I actually rather like that idea, but I think we'd have to
keep the ns cgroup the way it is, while using new cgroups to
offer the new functionality.
> Also, should we remember the task->nsproxy pointer in the cgroup data
> and ignore hierarchy if it matches? I guess it would be safe to store
> the raw pointer without refcounting it in any way as we'd never
> dereference it (could keep it as uintptr_t to reinforce the idea) but
> only compare with another pointer.
No, I'm thinking that despite the name, since we wont' use it to
actually enter namespaces, we should keep it decoupled from nsproxy.
> Does that make any sense? Or should I simply mount the cgroup fs without
> the ns subsystem and forget the whole thing? What exactly do I lose by
> doing so?
With liblxc I think you might lose the way that it keeps track of
containers. Not sure though - give it a shot.
> > Also, 'rule 1' in the comment above ns_can_attach should be modified
> > accordingly (s/child/descendant).
>
> Indeed. Will resend after receiving some enlightenment about the above.
>
> Thank you for your comments.
>
> Best regards,
> Grzegorz Nosek
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
@ 2009-01-06 22:25 Grzegorz Nosek
[not found] ` <20090106222536.GA25228-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Grzegorz Nosek @ 2009-01-06 22:25 UTC (permalink / raw)
To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
The ns_proxy cgroup allows moving processes to child cgroups only one
level deep at a time. This commit relaxes this restriction and makes it
possible to attach tasks directly to grandchild cgroups, e.g.:
($pid is in the root cgroup)
echo $pid > /cgroup/CG1/CG2/tasks
Previously this operation would fail with -EPERM and would have to be
performed as two steps:
echo $pid > /cgroup/CG1/tasks
echo $pid > /cgroup/CG1/CG2/tasks
Also, the target cgroup no longer needs to be empty to move a task there.
Signed-off-by: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
---
include/linux/cgroup.h | 4 ++--
kernel/cgroup.c | 11 ++++++-----
kernel/ns_cgroup.c | 14 ++++----------
3 files changed, 12 insertions(+), 17 deletions(-)
Changes since previous version:
- removed the restriction of empty target cgroup when moving tasks
- updated missed comment (s/child/descendant/)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 1ba3ded..53188f5 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -301,8 +301,8 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen);
int cgroup_task_count(const struct cgroup *cgrp);
-/* Return true if the cgroup is a descendant of the current cgroup */
-int cgroup_is_descendant(const struct cgroup *cgrp);
+/* Return true if cgrp is a descendant of the task's cgroup */
+int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task);
/* Control Group subsystem type. See Documentation/cgroups.txt for details */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index ee91952..d881222 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3104,18 +3104,19 @@ int cgroup_clone(struct task_struct *tsk, struct cgroup_subsys *subsys,
}
/**
- * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
+ * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
* @cgrp: the cgroup in question
+ * @task: the task in question
*
- * See if @cgrp is a descendant of the current task's cgroup in
- * the appropriate hierarchy.
+ * See if @cgrp is a descendant of @task's cgroup in the appropriate
+ * hierarchy.
*
* If we are sending in dummytop, then presumably we are creating
* the top cgroup in the subsystem.
*
* Called only by the ns (nsproxy) cgroup.
*/
-int cgroup_is_descendant(const struct cgroup *cgrp)
+int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
{
int ret;
struct cgroup *target;
@@ -3125,7 +3126,7 @@ int cgroup_is_descendant(const struct cgroup *cgrp)
return 1;
get_first_subsys(cgrp, NULL, &subsys_id);
- target = task_cgroup(current, subsys_id);
+ target = task_cgroup(task, subsys_id);
while (cgrp != target && cgrp!= cgrp->top_cgroup)
cgrp = cgrp->parent;
ret = (cgrp == target);
diff --git a/kernel/ns_cgroup.c b/kernel/ns_cgroup.c
index 43c2111..890691a 100644
--- a/kernel/ns_cgroup.c
+++ b/kernel/ns_cgroup.c
@@ -35,7 +35,7 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
/*
* Rules:
- * 1. you can only enter a cgroup which is a child of your current
+ * 1. you can only enter a cgroup which is a descendant of your current
* cgroup
* 2. you can only place another process into a cgroup if
* a. you have CAP_SYS_ADMIN
@@ -46,21 +46,15 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
static int ns_can_attach(struct cgroup_subsys *ss,
struct cgroup *new_cgroup, struct task_struct *task)
{
- struct cgroup *orig;
-
if (current != task) {
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- if (!cgroup_is_descendant(new_cgroup))
+ if (!cgroup_is_descendant(new_cgroup, current))
return -EPERM;
}
- if (atomic_read(&new_cgroup->count) != 0)
- return -EPERM;
-
- orig = task_cgroup(task, ns_subsys_id);
- if (orig && orig != new_cgroup->parent)
+ if (!cgroup_is_descendant(new_cgroup, task))
return -EPERM;
return 0;
@@ -78,7 +72,7 @@ static struct cgroup_subsys_state *ns_create(struct cgroup_subsys *ss,
if (!capable(CAP_SYS_ADMIN))
return ERR_PTR(-EPERM);
- if (!cgroup_is_descendant(cgroup))
+ if (!cgroup_is_descendant(cgroup, current))
return ERR_PTR(-EPERM);
ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
--
1.5.4.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20090106222536.GA25228-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
@ 2009-01-07 0:18 ` Serge E. Hallyn
2009-01-07 1:58 ` Li Zefan
2009-01-13 0:37 ` Andrew Morton
2 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2009-01-07 0:18 UTC (permalink / raw)
To: Grzegorz Nosek
Cc: containers-qjLDD68F18O7TbgM5vRIOg,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
Quoting Grzegorz Nosek (root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org):
> The ns_proxy cgroup allows moving processes to child cgroups only one
> level deep at a time. This commit relaxes this restriction and makes it
> possible to attach tasks directly to grandchild cgroups, e.g.:
>
> ($pid is in the root cgroup)
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Previously this operation would fail with -EPERM and would have to be
> performed as two steps:
> echo $pid > /cgroup/CG1/tasks
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Also, the target cgroup no longer needs to be empty to move a task there.
>
> Signed-off-by: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
I'm ok with these semantics.
Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
-serge
> ---
> include/linux/cgroup.h | 4 ++--
> kernel/cgroup.c | 11 ++++++-----
> kernel/ns_cgroup.c | 14 ++++----------
> 3 files changed, 12 insertions(+), 17 deletions(-)
>
> Changes since previous version:
> - removed the restriction of empty target cgroup when moving tasks
> - updated missed comment (s/child/descendant/)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 1ba3ded..53188f5 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -301,8 +301,8 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen);
>
> int cgroup_task_count(const struct cgroup *cgrp);
>
> -/* Return true if the cgroup is a descendant of the current cgroup */
> -int cgroup_is_descendant(const struct cgroup *cgrp);
> +/* Return true if cgrp is a descendant of the task's cgroup */
> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task);
>
> /* Control Group subsystem type. See Documentation/cgroups.txt for details */
>
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index ee91952..d881222 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -3104,18 +3104,19 @@ int cgroup_clone(struct task_struct *tsk, struct cgroup_subsys *subsys,
> }
>
> /**
> - * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
> + * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
> * @cgrp: the cgroup in question
> + * @task: the task in question
> *
> - * See if @cgrp is a descendant of the current task's cgroup in
> - * the appropriate hierarchy.
> + * See if @cgrp is a descendant of @task's cgroup in the appropriate
> + * hierarchy.
> *
> * If we are sending in dummytop, then presumably we are creating
> * the top cgroup in the subsystem.
> *
> * Called only by the ns (nsproxy) cgroup.
> */
> -int cgroup_is_descendant(const struct cgroup *cgrp)
> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
> {
> int ret;
> struct cgroup *target;
> @@ -3125,7 +3126,7 @@ int cgroup_is_descendant(const struct cgroup *cgrp)
> return 1;
>
> get_first_subsys(cgrp, NULL, &subsys_id);
> - target = task_cgroup(current, subsys_id);
> + target = task_cgroup(task, subsys_id);
> while (cgrp != target && cgrp!= cgrp->top_cgroup)
> cgrp = cgrp->parent;
> ret = (cgrp == target);
> diff --git a/kernel/ns_cgroup.c b/kernel/ns_cgroup.c
> index 43c2111..890691a 100644
> --- a/kernel/ns_cgroup.c
> +++ b/kernel/ns_cgroup.c
> @@ -35,7 +35,7 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
>
> /*
> * Rules:
> - * 1. you can only enter a cgroup which is a child of your current
> + * 1. you can only enter a cgroup which is a descendant of your current
> * cgroup
> * 2. you can only place another process into a cgroup if
> * a. you have CAP_SYS_ADMIN
> @@ -46,21 +46,15 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
> static int ns_can_attach(struct cgroup_subsys *ss,
> struct cgroup *new_cgroup, struct task_struct *task)
> {
> - struct cgroup *orig;
> -
> if (current != task) {
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
>
> - if (!cgroup_is_descendant(new_cgroup))
> + if (!cgroup_is_descendant(new_cgroup, current))
> return -EPERM;
> }
>
> - if (atomic_read(&new_cgroup->count) != 0)
> - return -EPERM;
> -
> - orig = task_cgroup(task, ns_subsys_id);
> - if (orig && orig != new_cgroup->parent)
> + if (!cgroup_is_descendant(new_cgroup, task))
> return -EPERM;
>
> return 0;
> @@ -78,7 +72,7 @@ static struct cgroup_subsys_state *ns_create(struct cgroup_subsys *ss,
>
> if (!capable(CAP_SYS_ADMIN))
> return ERR_PTR(-EPERM);
> - if (!cgroup_is_descendant(cgroup))
> + if (!cgroup_is_descendant(cgroup, current))
> return ERR_PTR(-EPERM);
>
> ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
> --
> 1.5.4.3
>
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20090106222536.GA25228-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
2009-01-07 0:18 ` Serge E. Hallyn
@ 2009-01-07 1:58 ` Li Zefan
2009-01-13 0:37 ` Andrew Morton
2 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2009-01-07 1:58 UTC (permalink / raw)
To: Grzegorz Nosek
Cc: containers-qjLDD68F18O7TbgM5vRIOg,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
Grzegorz Nosek wrote:
> The ns_proxy cgroup allows moving processes to child cgroups only one
> level deep at a time. This commit relaxes this restriction and makes it
> possible to attach tasks directly to grandchild cgroups, e.g.:
>
> ($pid is in the root cgroup)
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Previously this operation would fail with -EPERM and would have to be
> performed as two steps:
> echo $pid > /cgroup/CG1/tasks
> echo $pid > /cgroup/CG1/CG2/tasks
>
> Also, the target cgroup no longer needs to be empty to move a task there.
>
> Signed-off-by: Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org>
Reviewed-by: Li Zefan <lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> ---
> include/linux/cgroup.h | 4 ++--
> kernel/cgroup.c | 11 ++++++-----
> kernel/ns_cgroup.c | 14 ++++----------
> 3 files changed, 12 insertions(+), 17 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20090106222536.GA25228-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
2009-01-07 0:18 ` Serge E. Hallyn
2009-01-07 1:58 ` Li Zefan
@ 2009-01-13 0:37 ` Andrew Morton
[not found] ` <20090112163729.774b5b50.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2009-01-13 0:37 UTC (permalink / raw)
To: Grzegorz Nosek; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
On Tue, 6 Jan 2009 23:25:36 +0100
Grzegorz Nosek <root-AfQBxy1nhrQ00sYp1HPQUA@public.gmane.org> wrote:
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -3104,18 +3104,19 @@ int cgroup_clone(struct task_struct *tsk, struct cgroup_subsys *subsys,
> }
>
> /**
> - * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
> + * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
> * @cgrp: the cgroup in question
> + * @task: the task in question
> *
> - * See if @cgrp is a descendant of the current task's cgroup in
> - * the appropriate hierarchy.
> + * See if @cgrp is a descendant of @task's cgroup in the appropriate
> + * hierarchy.
> *
> * If we are sending in dummytop, then presumably we are creating
> * the top cgroup in the subsystem.
> *
> * Called only by the ns (nsproxy) cgroup.
> */
> -int cgroup_is_descendant(const struct cgroup *cgrp)
> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
> {
> int ret;
> struct cgroup *target;
> @@ -3125,7 +3126,7 @@ int cgroup_is_descendant(const struct cgroup *cgrp)
> return 1;
>
> get_first_subsys(cgrp, NULL, &subsys_id);
> - target = task_cgroup(current, subsys_id);
> + target = task_cgroup(task, subsys_id);
What locking prevents *task from vanishing here?
> while (cgrp != target && cgrp!= cgrp->top_cgroup)
> cgrp = cgrp->parent;
> ret = (cgrp == target);
> diff --git a/kernel/ns_cgroup.c b/kernel/ns_cgroup.c
> index 43c2111..890691a 100644
> --- a/kernel/ns_cgroup.c
> +++ b/kernel/ns_cgroup.c
> @@ -35,7 +35,7 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
>
> /*
> * Rules:
> - * 1. you can only enter a cgroup which is a child of your current
> + * 1. you can only enter a cgroup which is a descendant of your current
> * cgroup
> * 2. you can only place another process into a cgroup if
> * a. you have CAP_SYS_ADMIN
> @@ -46,21 +46,15 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
> static int ns_can_attach(struct cgroup_subsys *ss,
> struct cgroup *new_cgroup, struct task_struct *task)
> {
> - struct cgroup *orig;
> -
> if (current != task) {
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
>
> - if (!cgroup_is_descendant(new_cgroup))
> + if (!cgroup_is_descendant(new_cgroup, current))
> return -EPERM;
> }
>
> - if (atomic_read(&new_cgroup->count) != 0)
> - return -EPERM;
> -
> - orig = task_cgroup(task, ns_subsys_id);
Same question applied to the previous code.
> - if (orig && orig != new_cgroup->parent)
> + if (!cgroup_is_descendant(new_cgroup, task))
> return -EPERM;
>
> return 0;
> @@ -78,7 +72,7 @@ static struct cgroup_subsys_state *ns_create(struct cgroup_subsys *ss,
>
> if (!capable(CAP_SYS_ADMIN))
> return ERR_PTR(-EPERM);
> - if (!cgroup_is_descendant(cgroup))
> + if (!cgroup_is_descendant(cgroup, current))
> return ERR_PTR(-EPERM);
>
> ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups
[not found] ` <20090112163729.774b5b50.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
@ 2009-01-13 1:44 ` Li Zefan
0 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2009-01-13 1:44 UTC (permalink / raw)
To: Andrew Morton; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
>> /**
>> - * cgroup_is_descendant - see if @cgrp is a descendant of current task's cgrp
>> + * cgroup_is_descendant - see if @cgrp is a descendant of @task's cgrp
>> * @cgrp: the cgroup in question
>> + * @task: the task in question
>> *
>> - * See if @cgrp is a descendant of the current task's cgroup in
>> - * the appropriate hierarchy.
>> + * See if @cgrp is a descendant of @task's cgroup in the appropriate
>> + * hierarchy.
>> *
>> * If we are sending in dummytop, then presumably we are creating
>> * the top cgroup in the subsystem.
>> *
>> * Called only by the ns (nsproxy) cgroup.
>> */
>> -int cgroup_is_descendant(const struct cgroup *cgrp)
>> +int cgroup_is_descendant(const struct cgroup *cgrp, struct task_struct *task)
>> {
>> int ret;
>> struct cgroup *target;
>> @@ -3125,7 +3126,7 @@ int cgroup_is_descendant(const struct cgroup *cgrp)
>> return 1;
>>
>> get_first_subsys(cgrp, NULL, &subsys_id);
>> - target = task_cgroup(current, subsys_id);
>> + target = task_cgroup(task, subsys_id);
>
> What locking prevents *task from vanishing here?
>
As *task is passed to this function, I think it's the caller's responsibility to
insure this ?
>> while (cgrp != target && cgrp!= cgrp->top_cgroup)
>> cgrp = cgrp->parent;
>> ret = (cgrp == target);
>> diff --git a/kernel/ns_cgroup.c b/kernel/ns_cgroup.c
>> index 43c2111..890691a 100644
>> --- a/kernel/ns_cgroup.c
>> +++ b/kernel/ns_cgroup.c
>> @@ -35,7 +35,7 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
>>
>> /*
>> * Rules:
>> - * 1. you can only enter a cgroup which is a child of your current
>> + * 1. you can only enter a cgroup which is a descendant of your current
>> * cgroup
>> * 2. you can only place another process into a cgroup if
>> * a. you have CAP_SYS_ADMIN
>> @@ -46,21 +46,15 @@ int ns_cgroup_clone(struct task_struct *task, struct pid *pid)
>> static int ns_can_attach(struct cgroup_subsys *ss,
>> struct cgroup *new_cgroup, struct task_struct *task)
>> {
>> - struct cgroup *orig;
>> -
>> if (current != task) {
>> if (!capable(CAP_SYS_ADMIN))
>> return -EPERM;
>>
>> - if (!cgroup_is_descendant(new_cgroup))
>> + if (!cgroup_is_descendant(new_cgroup, current))
>> return -EPERM;
>> }
>>
>> - if (atomic_read(&new_cgroup->count) != 0)
>> - return -EPERM;
>> -
>> - orig = task_cgroup(task, ns_subsys_id);
>
> Same question applied to the previous code.
>
It's called by cgroup core, and cgroup guarantees the task won't vanish. (by get_task_struct)
>> - if (orig && orig != new_cgroup->parent)
>> + if (!cgroup_is_descendant(new_cgroup, task))
>> return -EPERM;
>>
>> return 0;
>> @@ -78,7 +72,7 @@ static struct cgroup_subsys_state *ns_create(struct cgroup_subsys *ss,
>>
>> if (!capable(CAP_SYS_ADMIN))
>> return ERR_PTR(-EPERM);
>> - if (!cgroup_is_descendant(cgroup))
>> + if (!cgroup_is_descendant(cgroup, current))
>> return ERR_PTR(-EPERM);
>>
>> ns_cgroup = kzalloc(sizeof(*ns_cgroup), GFP_KERNEL);
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-01-13 1:44 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-06 22:25 [PATCH] Relax ns_can_attach checks to allow attaching to grandchild cgroups Grzegorz Nosek
[not found] ` <20090106222536.GA25228-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
2009-01-07 0:18 ` Serge E. Hallyn
2009-01-07 1:58 ` Li Zefan
2009-01-13 0:37 ` Andrew Morton
[not found] ` <20090112163729.774b5b50.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-01-13 1:44 ` Li Zefan
[not found] <20081219120954.GB27472@megiteam.pl>
[not found] ` <20081219120954.GB27472-yp6mvK3Bdd2rDJvtcaxF/A@public.gmane.org>
2008-12-19 21:16 ` Andrew Morton
[not found] ` <20081219131641.cafc65ac.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2008-12-19 22:23 ` Serge E. Hallyn
[not found] ` <20081219222304.GA25828-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-12-19 23:03 ` Grzegorz Nosek
[not found] ` <20081219230330.GA30623-IaEwMO9oKu/77SC2UrCW1JJg/dWx8T/9@public.gmane.org>
2008-12-19 23:34 ` Serge E. Hallyn
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.