* [PATCH 1/3] cgroup: make sure parent won't be destroyed before its children
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-04-09 2:23 ` Tejun Heo
2013-04-09 2:23 ` [PATCH 2/3] cgroup: implement cgroup_is_descendant() Tejun Heo
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2013-04-09 2:23 UTC (permalink / raw)
To: mingo-H+wXaHxf7aLQT0dZR+AlfA
Cc: namhyung.kim-Hm3cg6mZ9cc, a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
eranian-hpIqsD4AKlfQT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe, Tejun Heo,
cgroups-u79uwXL29TY76Z2rM5mHXA
From: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Suppose we rmdir a cgroup and there're still css refs, this cgroup won't
be freed. Then we rmdir the parent cgroup, and the parent is freed
immediately due to css ref draining to 0. Now it would be a disaster if
the still-alive child cgroup tries to access its parent.
Make sure this won't happen.
Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Reviewed-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
kernel/cgroup.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index ba3e24a..d7dfa64 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -876,6 +876,13 @@ static void cgroup_free_fn(struct work_struct *work)
mutex_unlock(&cgroup_mutex);
/*
+ * We get a ref to the parent's dentry, and put the ref when
+ * this cgroup is being freed, so it's guaranteed that the
+ * parent won't be destroyed before its children.
+ */
+ dput(cgrp->parent->dentry);
+
+ /*
* Drop the active superblock reference that we took when we
* created the cgroup
*/
@@ -4168,6 +4175,9 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
for_each_subsys(root, ss)
dget(dentry);
+ /* hold a ref to the parent's dentry */
+ dget(parent->dentry);
+
/* creation succeeded, notify subsystems */
for_each_subsys(root, ss) {
err = online_css(ss, cgrp);
--
1.8.1.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] cgroup: implement cgroup_is_descendant()
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-04-09 2:23 ` [PATCH 1/3] cgroup: make sure parent won't be destroyed before its children Tejun Heo
@ 2013-04-09 2:23 ` Tejun Heo
2013-04-09 2:23 ` [PATCH 3/3] perf: make perf_event cgroup hierarchical Tejun Heo
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2013-04-09 2:23 UTC (permalink / raw)
To: mingo-H+wXaHxf7aLQT0dZR+AlfA
Cc: namhyung.kim-Hm3cg6mZ9cc, a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
eranian-hpIqsD4AKlfQT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe, Tejun Heo,
cgroups-u79uwXL29TY76Z2rM5mHXA
From: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
A couple controllers want to determine whether two cgroups are in
ancestor/descendant relationship. As it's more likely that the
descendant is the primary subject of interest and there are other
operations focusing on the descendants, let's ask is_descendent rather
than is_ancestor.
Implementation is trivial as the previous patch guarantees that all
ancestors of a cgroup stay accessible as long as the cgroup is
accessible.
tj: Removed depth optimization, renamed from cgroup_is_ancestor(),
rewrote descriptions.
Signed-off-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
include/linux/cgroup.h | 1 +
kernel/cgroup.c | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 515927e..45aee0f 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -439,6 +439,7 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
int cgroup_rm_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
int cgroup_is_removed(const struct cgroup *cgrp);
+bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor);
int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d7dfa64..678a22c 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -276,6 +276,26 @@ inline int cgroup_is_removed(const struct cgroup *cgrp)
return test_bit(CGRP_REMOVED, &cgrp->flags);
}
+/**
+ * cgroup_is_descendant - test ancestry
+ * @cgrp: the cgroup to be tested
+ * @ancestor: possible ancestor of @cgrp
+ *
+ * Test whether @cgrp is a descendant of @ancestor. It also returns %true
+ * if @cgrp == @ancestor. This function is safe to call as long as @cgrp
+ * and @ancestor are accessible.
+ */
+bool cgroup_is_descendant(struct cgroup *cgrp, struct cgroup *ancestor)
+{
+ while (cgrp) {
+ if (cgrp == ancestor)
+ return true;
+ cgrp = cgrp->parent;
+ }
+ return false;
+}
+EXPORT_SYMBOL_GPL(cgroup_is_descendant);
+
/* bits in struct cgroupfs_root flags field */
enum {
ROOT_NOPREFIX, /* mounted subsystems have no named prefix */
--
1.8.1.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] perf: make perf_event cgroup hierarchical
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-04-09 2:23 ` [PATCH 1/3] cgroup: make sure parent won't be destroyed before its children Tejun Heo
2013-04-09 2:23 ` [PATCH 2/3] cgroup: implement cgroup_is_descendant() Tejun Heo
@ 2013-04-09 2:23 ` Tejun Heo
2013-04-09 7:07 ` [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller Li Zefan
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2013-04-09 2:23 UTC (permalink / raw)
To: mingo-H+wXaHxf7aLQT0dZR+AlfA
Cc: namhyung.kim-Hm3cg6mZ9cc, a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
eranian-hpIqsD4AKlfQT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe, Tejun Heo,
cgroups-u79uwXL29TY76Z2rM5mHXA
perf_event is one of a couple remaining cgroup controllers with broken
hierarchy support. Converting it to support hierarchy is almost
trivial. The only thing necessary is to consider a task belonging to
a descendant cgroup as a match. IOW, if the cgroup of the currently
executing task (@cpuctx->cgrp) equals or is a descendant of the
event's cgroup (@event->cgrp), then the event should be enabled.
Implement hierarchy support and remove .broken_hierarchy tag along
with the incorrect comment on what needs to be done for hierarchy
support.
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
Cc: Paul Mackerras <paulus-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Arnaldo Carvalho de Melo <acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
Cc: Stephane Eranian <eranian-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Namhyung Kim <namhyung.kim-Hm3cg6mZ9cc@public.gmane.org>
---
kernel/events/core.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b0cd865..310ec19 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -251,7 +251,22 @@ perf_cgroup_match(struct perf_event *event)
struct perf_event_context *ctx = event->ctx;
struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
- return !event->cgrp || event->cgrp == cpuctx->cgrp;
+ /* @event doesn't care about cgroup */
+ if (!event->cgrp)
+ return true;
+
+ /* wants specific cgroup scope but @cpuctx isn't associated with any */
+ if (!cpuctx->cgrp)
+ return false;
+
+ /*
+ * Cgroup scoping is recursive. An event enabled for a cgroup is
+ * also enabled for all its descendant cgroups. If @cpuctx's
+ * cgroup is a descendant of @event's (the test covers identity
+ * case), it's a match.
+ */
+ return cgroup_is_descendant(cpuctx->cgrp->css.cgroup,
+ event->cgrp->css.cgroup);
}
static inline bool perf_tryget_cgroup(struct perf_event *event)
@@ -7509,12 +7524,5 @@ struct cgroup_subsys perf_subsys = {
.css_free = perf_cgroup_css_free,
.exit = perf_cgroup_exit,
.attach = perf_cgroup_attach,
-
- /*
- * perf_event cgroup doesn't handle nesting correctly.
- * ctx->nr_cgroups adjustments should be propagated through the
- * cgroup hierarchy. Fix it and remove the following.
- */
- .broken_hierarchy = true,
};
#endif /* CONFIG_CGROUP_PERF */
--
1.8.1.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (2 preceding siblings ...)
2013-04-09 2:23 ` [PATCH 3/3] perf: make perf_event cgroup hierarchical Tejun Heo
@ 2013-04-09 7:07 ` Li Zefan
2013-04-10 7:32 ` Peter Zijlstra
2013-04-10 18:08 ` Tejun Heo
5 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2013-04-09 7:07 UTC (permalink / raw)
To: Tejun Heo
Cc: namhyung.kim-Hm3cg6mZ9cc, a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
eranian-hpIqsD4AKlfQT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
mingo-H+wXaHxf7aLQT0dZR+AlfA, paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe,
cgroups-u79uwXL29TY76Z2rM5mHXA
On 2013/4/9 10:23, Tejun Heo wrote:
> perf_event cgroup controller is one of the remaining few with broken
> hierarchy support. It turns out it's pretty easy to implement - the
> only thing necessary is making perf_cgroup_match() return %true also
> when the cgroup of the current task is a descendant of the event's
> cgroup. This patchset implements cgroup_is_descendant() and uses it
> to implement hierarchy support in perf_event controller.
>
> This patchset contains the following three patches.
>
> 0001-cgroup-make-sure-parent-won-t-be-destroyed-before-it.patch
> 0002-cgroup-implement-cgroup_is_descendant.patch
> 0003-perf-make-perf_event-cgroup-hierarchical.patch
>
> The patches are also available in the following git branch, which is
> based on top of cgroup/for-3.10. It's currently based on top of
> cgroup/for-3.10 as the first patch causes non-trivial conflict with it
> otherwise, which is not difficult to resolve but still nice to avoid
> anyway.
>
> Li, Michal, I picked the first two patches from Li's memcg patchset.
> Can we push the first two through cgroup/for-3.10 and put the rest in
> -mm?
>
Sure. Andrew asked me to resend the memcg patchset when we are at 3.10-rc1,
because 3.9-rc6 is a bit too late and he's too busy.
> Ingo, how should these be routed?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (3 preceding siblings ...)
2013-04-09 7:07 ` [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller Li Zefan
@ 2013-04-10 7:32 ` Peter Zijlstra
2013-04-10 9:37 ` Ingo Molnar
2013-04-10 18:08 ` Tejun Heo
5 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2013-04-10 7:32 UTC (permalink / raw)
To: Tejun Heo
Cc: mingo-DgEjT+Ai2ygdnm+yROfE0A, paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe,
eranian-hpIqsD4AKlfQT0dZR+AlfA, namhyung.kim-Hm3cg6mZ9cc,
lizefan-hv44wF8Li93QT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Mon, 2013-04-08 at 19:23 -0700, Tejun Heo wrote:
> perf_event cgroup controller is one of the remaining few with broken
> hierarchy support. It turns out it's pretty easy to implement - the
> only thing necessary is making perf_cgroup_match() return %true also
> when the cgroup of the current task is a descendant of the event's
> cgroup. This patchset implements cgroup_is_descendant() and uses it
> to implement hierarchy support in perf_event controller.
>
> This patchset contains the following three patches.
>
> 0001-cgroup-make-sure-parent-won-t-be-destroyed-before-it.patch
> 0002-cgroup-implement-cgroup_is_descendant.patch
> 0003-perf-make-perf_event-cgroup-hierarchical.patch
Thanks, looks simple and straight fwd.
> Ingo, how should these be routed?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git perf_event-hierarchy-support
Ingo typically likes multi-maintainer sets to be pulled from a common
tree into all relevant maintainer trees so that git merges afterwards
just-work (tm). But I'll let him expand.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller
2013-04-10 7:32 ` Peter Zijlstra
@ 2013-04-10 9:37 ` Ingo Molnar
[not found] ` <20130410093755.GC24443-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2013-04-10 9:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: namhyung.kim-Hm3cg6mZ9cc,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
eranian-hpIqsD4AKlfQT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe, Tejun Heo,
cgroups-u79uwXL29TY76Z2rM5mHXA
* Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org> wrote:
> On Mon, 2013-04-08 at 19:23 -0700, Tejun Heo wrote:
> > perf_event cgroup controller is one of the remaining few with broken
> > hierarchy support. It turns out it's pretty easy to implement - the
> > only thing necessary is making perf_cgroup_match() return %true also
> > when the cgroup of the current task is a descendant of the event's
> > cgroup. This patchset implements cgroup_is_descendant() and uses it
> > to implement hierarchy support in perf_event controller.
> >
> > This patchset contains the following three patches.
> >
> > 0001-cgroup-make-sure-parent-won-t-be-destroyed-before-it.patch
> > 0002-cgroup-implement-cgroup_is_descendant.patch
> > 0003-perf-make-perf_event-cgroup-hierarchical.patch
>
> Thanks, looks simple and straight fwd.
>
> > Ingo, how should these be routed?
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git perf_event-hierarchy-support
>
> Ingo typically likes multi-maintainer sets to be pulled from a common
> tree into all relevant maintainer trees so that git merges afterwards
> just-work (tm). But I'll let him expand.
Yeah - at least for larger changes that's a good workflow.
For smaller changes we can pick one or the other tree. Tejun, do these changes
create any conflicts with the current tip:master tree? If not then you could carry
these changes in your tree. If there's significant conflicts then it might be
better to rebase this on top of perf/core and pull them into perf/core. (assuming
there's no other cgroups prereq patches beyond the ones in this series.)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCHSET] perf, cgroup: implement hierarchy support for perf_event controller
[not found] ` <1365474213-13354-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
` (4 preceding siblings ...)
2013-04-10 7:32 ` Peter Zijlstra
@ 2013-04-10 18:08 ` Tejun Heo
5 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2013-04-10 18:08 UTC (permalink / raw)
To: mingo-H+wXaHxf7aLQT0dZR+AlfA
Cc: a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw,
paulus-eUNUBHrolfbYtjvyW6yDsg,
acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe,
eranian-hpIqsD4AKlfQT0dZR+AlfA, namhyung.kim-Hm3cg6mZ9cc,
lizefan-hv44wF8Li93QT0dZR+AlfA, mhocko-AlSwsSmVLrQ,
kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
cgroups-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Mon, Apr 08, 2013 at 07:23:30PM -0700, Tejun Heo wrote:
> perf_event cgroup controller is one of the remaining few with broken
> hierarchy support. It turns out it's pretty easy to implement - the
> only thing necessary is making perf_cgroup_match() return %true also
> when the cgroup of the current task is a descendant of the event's
> cgroup. This patchset implements cgroup_is_descendant() and uses it
> to implement hierarchy support in perf_event controller.
Applied to cgroup/for-3.10.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 9+ messages in thread