From: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>,
Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Thomas Graf <tgraf-G/eBtMaohhA@public.gmane.org>,
"Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Paul Mackerras <paulus-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Arnaldo Carvalho de Melo
<acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>,
Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>,
"Aneesh Kumar K.V"
<aneesh.kumar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Subject: Re: [PATCH v4 cgroup/for-3.7] cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them
Date: Thu, 13 Sep 2012 21:27:26 +0200 [thread overview]
Message-ID: <20120913192726.GC22529@dhcp22.suse.cz> (raw)
In-Reply-To: <20120913192058.GH7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
On Thu 13-09-12 12:20:58, Tejun Heo wrote:
> Currently, cgroup hierarchy support is a mess. cpu related subsystems
> behave correctly - configuration, accounting and control on a parent
> properly cover its children. blkio and freezer completely ignore
> hierarchy and treat all cgroups as if they're directly under the root
> cgroup. Others show yet different behaviors.
>
> These differing interpretations of cgroup hierarchy make using cgroup
> confusing and it impossible to co-mount controllers into the same
> hierarchy and obtain sane behavior.
>
> Eventually, we want full hierarchy support from all subsystems and
> probably a unified hierarchy. Users using separate hierarchies
> expecting completely different behaviors depending on the mounted
> subsystem is deterimental to making any progress on this front.
>
> This patch adds cgroup_subsys.broken_hierarchy and sets it to %true
> for controllers which are lacking in hierarchy support. The goal of
> this patch is two-fold.
>
> * Move users away from using hierarchy on currently non-hierarchical
> subsystems, so that implementing proper hierarchy support on those
> doesn't surprise them.
>
> * Keep track of which controllers are broken how and nudge the
> subsystems to implement proper hierarchy support.
>
> For now, start with a single warning message. We can whine louder
> later on.
>
> v2: Fixed a typo spotted by Michal. Warning message updated.
>
> v3: Updated memcg part so that it doesn't generate warning in the
> cases where .use_hierarchy=false doesn't make the behavior
> different from root.use_hierarchy=true. Fixed a typo spotted by
> Glauber.
>
> v4: Check ->broken_hierarchy after cgroup creation is complete so that
> ->create() can affect the result per Michal. Dropped unnecessary
> memcg root handling per Michal.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Thomas Graf <tgraf-G/eBtMaohhA@public.gmane.org>
> Cc: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Paul Mackerras <paulus-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
> Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Arnaldo Carvalho de Melo <acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Aneesh Kumar K.V <aneesh.kumar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Thanks!
> ---
> block/blk-cgroup.c | 8 ++++++++
> include/linux/cgroup.h | 15 +++++++++++++++
> kernel/cgroup.c | 12 +++++++++++-
> kernel/cgroup_freezer.c | 8 ++++++++
> kernel/events/core.c | 7 +++++++
> mm/memcontrol.c | 7 +++++++
> net/core/netprio_cgroup.c | 12 +++++++++++-
> net/sched/cls_cgroup.c | 9 +++++++++
> security/device_cgroup.c | 9 +++++++++
> 9 files changed, 85 insertions(+), 2 deletions(-)
>
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -737,6 +737,14 @@ struct cgroup_subsys blkio_subsys = {
> .subsys_id = blkio_subsys_id,
> .base_cftypes = blkcg_files,
> .module = THIS_MODULE,
> +
> + /*
> + * blkio subsystem is utterly broken in terms of hierarchy support.
> + * It treats all cgroups equally regardless of where they're
> + * located in the hierarchy - all cgroups are treated as if they're
> + * right below the root. Fix it and remove the following.
> + */
> + .broken_hierarchy = true,
> };
> EXPORT_SYMBOL_GPL(blkio_subsys);
>
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -503,6 +503,21 @@ struct cgroup_subsys {
> */
> bool __DEPRECATED_clear_css_refs;
>
> + /*
> + * If %false, this subsystem is properly hierarchical -
> + * configuration, resource accounting and restriction on a parent
> + * cgroup cover those of its children. If %true, hierarchy support
> + * is broken in some ways - some subsystems ignore hierarchy
> + * completely while others are only implemented half-way.
> + *
> + * It's now disallowed to create nested cgroups if the subsystem is
> + * broken and cgroup core will emit a warning message on such
> + * cases. Eventually, all subsystems will be made properly
> + * hierarchical and this will go away.
> + */
> + bool broken_hierarchy;
> + bool warned_broken_hierarchy;
> +
> #define MAX_CGROUP_TYPE_NAMELEN 32
> const char *name;
>
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4075,8 +4075,9 @@ static long cgroup_create(struct cgroup
> set_bit(CGRP_CLONE_CHILDREN, &cgrp->flags);
>
> for_each_subsys(root, ss) {
> - struct cgroup_subsys_state *css = ss->create(cgrp);
> + struct cgroup_subsys_state *css;
>
> + css = ss->create(cgrp);
> if (IS_ERR(css)) {
> err = PTR_ERR(css);
> goto err_destroy;
> @@ -4090,6 +4091,15 @@ static long cgroup_create(struct cgroup
> /* At error, ->destroy() callback has to free assigned ID. */
> if (clone_children(parent) && ss->post_clone)
> ss->post_clone(cgrp);
> +
> + if (ss->broken_hierarchy && !ss->warned_broken_hierarchy &&
> + parent->parent) {
> + pr_warning("cgroup: %s (%d) created nested cgroup for controller \"%s\" which has incomplete hierarchy support. Nested cgroups may change behavior in the future.\n",
> + current->comm, current->pid, ss->name);
> + if (!strcmp(ss->name, "memory"))
> + pr_warning("cgroup: \"memory\" requires setting use_hierarchy to 1 on the root.\n");
> + ss->warned_broken_hierarchy = true;
> + }
> }
>
> list_add(&cgrp->sibling, &cgrp->parent->children);
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -373,4 +373,12 @@ struct cgroup_subsys freezer_subsys = {
> .can_attach = freezer_can_attach,
> .fork = freezer_fork,
> .base_cftypes = files,
> +
> + /*
> + * freezer subsys doesn't handle hierarchy at all. Frozen state
> + * should be inherited through the hierarchy - if a parent is
> + * frozen, all its children should be frozen. Fix it and remove
> + * the following.
> + */
> + .broken_hierarchy = true,
> };
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7285,5 +7285,12 @@ struct cgroup_subsys perf_subsys = {
> .destroy = perf_cgroup_destroy,
> .exit = perf_cgroup_exit,
> .attach = perf_cgroup_attach,
> +
> + /*
> + * perf_event cgroup doesn't handle nesting correctly.
> + * ctx->nr_cgroups adjustments should be propagated through the
> + * cgroup hierarchy. Fix it and remove the following.
> + */
> + .broken_hierarchy = true,
> };
> #endif /* CONFIG_CGROUP_PERF */
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4973,6 +4973,13 @@ mem_cgroup_create(struct cgroup *cont)
> } else {
> res_counter_init(&memcg->res, NULL);
> res_counter_init(&memcg->memsw, NULL);
> + /*
> + * Deeper hierachy with use_hierarchy == false doesn't make
> + * much sense so let cgroup subsystem know about this
> + * unfortunate state in our controller.
> + */
> + if (parent && parent != root_mem_cgroup)
> + mem_cgroup_subsys.broken_hierarchy = true;
> }
> memcg->last_scanned_node = MAX_NUMNODES;
> INIT_LIST_HEAD(&memcg->oom_notify);
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -330,7 +330,17 @@ struct cgroup_subsys net_prio_subsys = {
> .subsys_id = net_prio_subsys_id,
> #endif
> .base_cftypes = ss_files,
> - .module = THIS_MODULE
> + .module = THIS_MODULE,
> +
> + /*
> + * net_prio has artificial limit on the number of cgroups and
> + * disallows nesting making it impossible to co-mount it with other
> + * hierarchical subsystems. Remove the artificially low PRIOIDX_SZ
> + * limit and properly nest configuration such that children follow
> + * their parents' configurations by default and are allowed to
> + * override and remove the following.
> + */
> + .broken_hierarchy = true,
> };
>
> static int netprio_device_event(struct notifier_block *unused,
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -82,6 +82,15 @@ struct cgroup_subsys net_cls_subsys = {
> #endif
> .base_cftypes = ss_files,
> .module = THIS_MODULE,
> +
> + /*
> + * While net_cls cgroup has the rudimentary hierarchy support of
> + * inheriting the parent's classid on cgroup creation, it doesn't
> + * properly propagates config changes in ancestors to their
> + * descendents. A child should follow the parent's configuration
> + * but be allowed to override it. Fix it and remove the following.
> + */
> + .broken_hierarchy = true,
> };
>
> struct cls_cgroup_head {
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -457,6 +457,15 @@ struct cgroup_subsys devices_subsys = {
> .destroy = devcgroup_destroy,
> .subsys_id = devices_subsys_id,
> .base_cftypes = dev_cgroup_files,
> +
> + /*
> + * While devices cgroup has the rudimentary hierarchy support which
> + * checks the parent's restriction, it doesn't properly propagates
> + * config changes in ancestors to their descendents. A child
> + * should only be allowed to add more restrictions to the parent's
> + * configuration. Fix it and remove the following.
> + */
> + .broken_hierarchy = true,
> };
>
> int __devcgroup_inode_permission(struct inode *inode, int mask)
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2012-09-13 19:27 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-13 18:34 [PATCH cgroup/for-3.7] cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them Tejun Heo
[not found] ` <20120913183402.GG7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-13 18:46 ` Michal Hocko
2012-09-13 19:20 ` [PATCH v4 " Tejun Heo
[not found] ` <20120913192058.GH7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-13 19:27 ` Michal Hocko [this message]
2012-09-13 19:41 ` Vivek Goyal
[not found] ` <20120913194111.GM4396-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2012-09-13 20:03 ` Tejun Heo
[not found] ` <20120913200354.GN7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-13 20:17 ` Vivek Goyal
2012-09-14 9:14 ` Glauber Costa
[not found] ` <5052F58E.4050808-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-09-14 9:17 ` Glauber Costa
[not found] ` <5052F61A.3040602-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-09-14 21:06 ` Tejun Heo
2012-09-14 4:55 ` Serge Hallyn
2012-09-14 7:48 ` Li Zefan
2012-09-14 9:11 ` Glauber Costa
2012-09-14 19:06 ` Tejun Heo
[not found] ` <20120914190608.GM17747-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-18 15:04 ` Michal Hocko
[not found] ` <20120918150404.GC19840-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2012-09-18 16:27 ` Tejun Heo
[not found] ` <CAOS58YPjEE-+Kak_XidAd5qT84pZKt_z=4_MXadHYvzP+yZ5+Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-09-18 16:33 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120913192726.GC22529@dhcp22.suse.cz \
--to=mhocko-alswssmvlrq@public.gmane.org \
--cc=acme-f8uhVLnGfZaxAyOMLChx1axOck334EZe@public.gmane.org \
--cc=aneesh.kumar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
--cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
--cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
--cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
--cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org \
--cc=paulus-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org \
--cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
--cc=pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
--cc=tgraf-G/eBtMaohhA@public.gmane.org \
--cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
--cc=vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox