From: Waiman Long <longman@redhat.com>
To: Tejun Heo <tj@kernel.org>, Li Zefan <lizefan@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel-team@fb.com, pjt@google.com, luto@amacapital.net,
efault@gmx.de, torvalds@linux-foundation.org, guro@fb.com,
Waiman Long <longman@redhat.com>
Subject: [PATCH v2 3/4] cgroup: Allow reenabling of controller in bypass mode
Date: Fri, 21 Jul 2017 16:34:52 -0400 [thread overview]
Message-ID: <1500669293-21792-4-git-send-email-longman@redhat.com> (raw)
In-Reply-To: <1500669293-21792-1-git-send-email-longman@redhat.com>
Controllers set to bypass mode in the parent's "cgroup.subtree_control"
can now be optionally enabled by writing the controller name with the
'+' prefix to "cgroup.controllers". Using the '#' prefix will reset it
back to the bypass state.
This capability increases the flexibility each controller has in
shaping the effective cgroup hierarchy to best suit its need.
Signed-off-by: Waiman Long <longman@redhat.com>
---
Documentation/cgroup-v2.txt | 23 +++++++++-
include/linux/cgroup-defs.h | 7 +++
kernel/cgroup/cgroup.c | 109 ++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 134 insertions(+), 5 deletions(-)
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index f17a74b..efb69c4 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -395,6 +395,18 @@ prefixed controller interface files from C and D. This means that the
controller interface files - anything which doesn't start with
"cgroup." are owned by the parent rather than the cgroup itself.
+Once a controller is put into bypass mode in "cgroup.subtree_control",
+the cgroup's children can optionally enable this controller by writing
+the controller name with the '+ prefix into "cgroup.controllers".
+In this case, the controller interface files are considered to be
+owned by the child cgroup itself, not by its parent. Therefore,
+setting the bypass mode in "cgroup.subtree_control" means delegating
+the authority of enabling or disabling the controller interface files
+to its children. Writing the controller name with the '#' prefix into
+"cgroup.controllers" resets the state back to bypass mode. The state
+of a controller cannot be changed if it is enabled or bypassed in its
+"cgroup.subtree_control".
+
Cgroup Hierarchy
~~~~~~~~~~~~~~~~
@@ -859,11 +871,18 @@ All cgroup core files are prefixed with "cgroup."
should be granted along with the containing directory.
cgroup.controllers
- A read-only space separated values file which exists on all
+ A read-write space separated values file which exists on all
cgroups.
It shows space separated list of all controllers available to
- the cgroup. The controllers are not ordered.
+ the cgroup. Controller names with '#' prefix are in bypass
+ mode. The controllers are not ordered.
+
+ When a controller is set into bypass mode in its parent's
+ "cgroup.subtree_control", its name prefixed with '+' or '#'
+ can be written to enable it or reset it back to bypass mode
+ respectively. Controllers not in bypass mode are not allowed
+ to be written.
cgroup.subtree_control
A read-write space separated values file which exists on all
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 3cac6d0..25c2ac8 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -308,6 +308,13 @@ struct cgroup {
u16 old_subtree_ss_mask;
u16 old_subtree_bypass;
+ /*
+ * The bitmask of subsystems that are set in its parent's
+ * ->subtree_bypass and explictly enabled in this cgroup.
+ */
+ u16 enable_ss_mask;
+ u16 old_enable_ss_mask;
+
/* Private pointers for each registered subsystem */
struct cgroup_subsys_state __rcu *subsys[CGROUP_SUBSYS_COUNT];
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1e7feae..358d8b3 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -420,7 +420,7 @@ static u16 cgroup_control(struct cgroup *cgrp, bool show_bypass)
u16 root_ss_mask = cgrp->root->subsys_mask;
if (parent) {
- u16 ss_mask = parent->subtree_control;
+ u16 ss_mask = parent->subtree_control|cgrp->enable_ss_mask;
if (show_bypass)
ss_mask |= parent->subtree_bypass;
@@ -443,7 +443,7 @@ static u16 cgroup_ss_mask(struct cgroup *cgrp, bool show_bypass)
struct cgroup *parent = cgroup_parent(cgrp);
if (parent) {
- u16 ss_mask = parent->subtree_ss_mask;
+ u16 ss_mask = parent->subtree_ss_mask|cgrp->enable_ss_mask;
if (show_bypass)
@@ -2811,6 +2811,7 @@ static void cgroup_save_control(struct cgroup *cgrp)
dsct->old_subtree_control = dsct->subtree_control;
dsct->old_subtree_ss_mask = dsct->subtree_ss_mask;
dsct->old_subtree_bypass = dsct->subtree_bypass;
+ dsct->old_enable_ss_mask = dsct->enable_ss_mask;
}
}
@@ -2854,6 +2855,7 @@ static void cgroup_restore_control(struct cgroup *cgrp)
dsct->subtree_control = dsct->old_subtree_control;
dsct->subtree_ss_mask = dsct->old_subtree_ss_mask;
dsct->subtree_bypass = dsct->old_subtree_bypass;
+ dsct->enable_ss_mask = dsct->old_enable_ss_mask;
}
}
@@ -3124,7 +3126,8 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
cgroup_for_each_live_child(child, cgrp)
- child_enable |= child->subtree_control|child->subtree_bypass;
+ child_enable |= child->subtree_control|child->subtree_bypass|
+ child->enable_ss_mask;
/*
* Cannot change the state of a controller if enabled in children.
@@ -3157,6 +3160,105 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
return ret ?: nbytes;
}
+/*
+ * Change bypass status of controllers for a cgroup in the default hierarchy.
+ */
+static ssize_t cgroup_controllers_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ u16 enable = 0, bypass = 0;
+ struct cgroup *cgrp, *parent;
+ struct cgroup_subsys *ss;
+ char *tok;
+ int ssid, ret;
+
+ /*
+ * Parse input - space separated list of subsystem names prefixed
+ * with either + or #.
+ */
+ buf = strstrip(buf);
+ while ((tok = strsep(&buf, " "))) {
+ if (tok[0] == '\0')
+ continue;
+ do_each_subsys_mask(ss, ssid, ~cgrp_dfl_inhibit_ss_mask) {
+ if (!cgroup_ssid_enabled(ssid) ||
+ strcmp(tok + 1, ss->name))
+ continue;
+
+ if (*tok == '+') {
+ enable |= 1 << ssid;
+ bypass &= ~(1 << ssid);
+ } else if (*tok == '#') {
+ bypass |= 1 << ssid;
+ enable &= ~(1 << ssid);
+ } else {
+ return -EINVAL;
+ }
+ break;
+ } while_each_subsys_mask();
+ if (ssid == CGROUP_SUBSYS_COUNT)
+ return -EINVAL;
+ }
+
+ cgrp = cgroup_kn_lock_live(of->kn, true);
+ if (!cgrp)
+ return -ENODEV;
+
+ /*
+ * Write to root cgroup's controllers file is not allowed.
+ */
+ parent = cgroup_parent(cgrp);
+ if (!parent) {
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ /*
+ * Only controllers set into bypass mode in the parent cgroup
+ * can be specified here.
+ */
+ if (~parent->subtree_bypass & (enable|bypass)) {
+ ret = -ENOENT;
+ goto out_unlock;
+ }
+
+ /*
+ * Mask off irrelevant bits.
+ */
+ enable &= ~cgrp->enable_ss_mask;
+ bypass &= cgrp->enable_ss_mask;
+
+ if (!(enable|bypass)) {
+ ret = 0;
+ goto out_unlock;
+ }
+
+ /*
+ * We cannot change the bypass state of a controller that is enabled
+ * in subtree_control.
+ */
+ if ((cgrp->subtree_control|cgrp->subtree_bypass) & (enable|bypass)) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+
+ /* Save and update control masks and prepare csses */
+ cgroup_save_control(cgrp);
+
+ cgrp->enable_ss_mask |= enable;
+ cgrp->enable_ss_mask &= ~bypass;
+
+ ret = cgroup_apply_control(cgrp);
+ cgroup_finalize_control(cgrp, ret);
+ kernfs_activate(cgrp->kn);
+ ret = 0;
+
+out_unlock:
+ cgroup_kn_unlock(of->kn);
+ return ret ?: nbytes;
+}
+
static int cgroup_enable_threaded(struct cgroup *cgrp)
{
struct cgroup *parent = cgroup_parent(cgrp);
@@ -4322,6 +4424,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of,
{
.name = "cgroup.controllers",
.seq_show = cgroup_controllers_show,
+ .write = cgroup_controllers_write,
},
{
.name = "cgroup.subtree_control",
--
1.8.3.1
next prev parent reply other threads:[~2017-07-21 20:34 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-21 20:34 [PATCH v2 0/4] cgroup: Introducing bypass mode Waiman Long
2017-07-21 20:34 ` Waiman Long
[not found] ` <1500669293-21792-1-git-send-email-longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-07-21 20:34 ` [PATCH v2 1/4] cgroup: Child cgroup creation not allowed on invalid domain Waiman Long
2017-07-21 20:34 ` Waiman Long
2017-07-22 13:43 ` Tejun Heo
[not found] ` <20170722134358.GB3329631-4dN5La/x3IkLX0oZNxdnEQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2017-07-23 12:18 ` [PATCH] cgroup: remove unnecessary empty check when enabling threaded mode Tejun Heo
2017-07-23 12:18 ` Tejun Heo
[not found] ` <20170723121826.GC1498614-4dN5La/x3IkLX0oZNxdnEQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2017-07-24 19:14 ` Waiman Long
2017-07-24 19:14 ` Waiman Long
[not found] ` <c2637f47-3e90-a322-f84b-0c913da14977-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-07-25 17:22 ` [PATCH cgroup/for-4.14] cgroup: add comment to cgroup_enable_threaded() Tejun Heo
2017-07-25 17:22 ` Tejun Heo
2017-07-25 17:16 ` [PATCH] cgroup: remove unnecessary empty check when enabling threaded mode Tejun Heo
2017-07-24 17:48 ` [PATCH v2 1/4] cgroup: Child cgroup creation not allowed on invalid domain Waiman Long
2017-07-21 20:34 ` [PATCH v2 2/4] cgroup: Allow bypass mode in subtree_control Waiman Long
[not found] ` <1500669293-21792-3-git-send-email-longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-07-22 13:50 ` Tejun Heo
2017-07-22 13:50 ` Tejun Heo
[not found] ` <20170722135030.GC3329631-4dN5La/x3IkLX0oZNxdnEQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2017-07-24 18:20 ` Waiman Long
2017-07-24 18:20 ` Waiman Long
2017-07-25 17:13 ` Tejun Heo
[not found] ` <20170725171343.GB3216015-4dN5La/x3IkLX0oZNxdnEQ2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2017-07-25 19:10 ` Waiman Long
2017-07-25 19:10 ` Waiman Long
2017-08-01 14:29 ` Waiman Long
2017-07-21 20:34 ` Waiman Long [this message]
2017-07-21 20:34 ` [PATCH v2 4/4] cgroup: Make debug controller report new controller masks Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1500669293-21792-4-git-send-email-longman@redhat.com \
--to=longman@redhat.com \
--cc=cgroups@vger.kernel.org \
--cc=efault@gmx.de \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=luto@amacapital.net \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.