[PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex
@ 2014-05-06 20:19 Tejun Heo
  2014-05-06 20:19 ` [PATCH 1/8] cgroup: reorganize cgroup_create() Tejun Heo
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel

Hello,

cgroup_tree_mutex was introduced during kernfs conversion to work
around the cyclic locking dependency between kernfs active protection
and cgroup_mutex.  Some file and directory operations need to acquire
cgroup_mutex which puts the mutex under the kernfs active protection;
however, cgroup also needs to access the hierarchy and the registered
cftypes to detemine which files to remove, which obviously can't be
done while holding cgroup_mutex anymore.

cgroup_tree_mutex nests above both cgroup_mutex and kernfs active
protection and protects hierarchy and cftypes so that those file
operations can be performed while holding it without cgroup_mutex.
This works but is kinda cumbersome as most places end up taking both
cgroup_tree_mutex and cgroup_mutex and there's on-going friction on
what needs to be protected by which combination.

Furthermore, due to new requirements from subtree_control
implementations, kernfs ended up growing full-blown mechanism to
bypass active protection instead of just supporting self-removal and
cgroup ended up using both mechanisms - two layered mutexes and active
protection bypss - on different areas, which is totally unncessary.

This patchset converts everything over to kernfs active protection
bypass and drops cgroup_tree_mutex making cgroup locking noticeably
simpler.  It contains the following eight patches.

 0001-cgroup-reorganize-cgroup_create.patch
 0002-cgroup-collapse-cgroup_create-into-croup_mkdir.patch
 0003-cgroup-grab-cgroup_mutex-earlier-in-cgroup_subtree_c.patch
 0004-cgroup-move-cgroup-kn-priv-clearing-to-cgroup_rmdir.patch
 0005-cgroup-factor-out-cgroup_kn_lock_live-and-cgroup_kn_.patch
 0006-cgroup-use-cgroup_kn_lock_live-in-other-cgroup-kernf.patch
 0007-cgroup-nest-kernfs-active-protection-under-cgroup_mu.patch
 0008-cgroup-remove-cgroup_tree_mutex.patch

0001-0004 reorganize various kernfs handling paths so that they are
more uniform in terms of active protection handling.

0005 factors out two locking helpers - cgroup_kn_lock_live() and
cgroup_kn_unlock() - which handle both kernfs active protection bypass
and locking.

0006 applies it to other kernfs method implementations which were
grabbing cgroup_mutex under active protection.

0007 reverses the locking dependency between cgroup_mutex and kernfs
active protection so that the latter nests under the former, making
cgroup_mutex equivalent to cgroup_tree_mutex.

0008 removes cgroup_tree_mutex.

This patchset is on top of

  cgroup/for-3.16 12d3089c192c ("kernel/cpuset.c: convert printk to pr_foo()")
+ [1] [PATCHSET cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates
+ [2] [PATCHSET cgroup/for-3.16] cgroup: implement cftype->write()

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex

diffstat follows.  Thanks.

 kernel/cgroup.c |  385 +++++++++++++++++++++++---------------------------------
 1 file changed, 163 insertions(+), 222 deletions(-)

--
tejun

[1] http://lkml.kernel.org/g/1399377044-29873-1-git-send-email-tj@kernel.org
[2] http://lkml.kernel.org/g/1399380266-3324-1-git-send-email-tj@kernel.org

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/8] cgroup: reorganize cgroup_create()
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 2/8] cgroup: collapse cgroup_create() into croup_mkdir() Tejun Heo
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

Reorganize cgroup_create() so that all paths share unlock out path.

* All err_* labels are renamed to out_* as they're now shared by both
  success and failure paths.

* @err renamed to @ret for the similar reason as above and so that
  it's more consistent with other functions.

* cgroup memory allocation moved after locking so that freeing failed
  cgroup happens before unlocking.  While this moves more code inside
  critical section, memory allocations inside cgroup locking are
  already pretty common and this is unlikely to make any noticeable
  difference.

* While at it, replace a stray @parent->root dereference with @root.

This reorganization will help simplifying locking.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 69 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 34 insertions(+), 35 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index da493a4..4d58229 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4247,15 +4247,10 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 {
 	struct cgroup *cgrp;
 	struct cgroup_root *root = parent->root;
-	int ssid, err;
+	int ssid, ret;
 	struct cgroup_subsys *ss;
 	struct kernfs_node *kn;
 
-	/* allocate the cgroup and its ID, 0 is reserved for the root */
-	cgrp = kzalloc(sizeof(*cgrp), GFP_KERNEL);
-	if (!cgrp)
-		return -ENOMEM;
-
 	mutex_lock(&cgroup_tree_mutex);
 
 	/*
@@ -4266,8 +4261,15 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 	 * don't get nasty surprises if we ever grow another caller.
 	 */
 	if (!cgroup_lock_live_group(parent)) {
-		err = -ENODEV;
-		goto err_unlock_tree;
+		ret = -ENODEV;
+		goto out_unlock_tree;
+	}
+
+	/* allocate the cgroup and its ID, 0 is reserved for the root */
+	cgrp = kzalloc(sizeof(*cgrp), GFP_KERNEL);
+	if (!cgrp) {
+		ret = -ENOMEM;
+		goto out_unlock;
 	}
 
 	/*
@@ -4276,15 +4278,15 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 	 */
 	cgrp->id = cgroup_idr_alloc(&root->cgroup_idr, NULL, 2, 0, GFP_NOWAIT);
 	if (cgrp->id < 0) {
-		err = -ENOMEM;
-		goto err_unlock;
+		ret = -ENOMEM;
+		goto out_free_cgrp;
 	}
 
 	init_cgroup_housekeeping(cgrp);
 
 	cgrp->parent = parent;
 	cgrp->dummy_css.parent = &parent->dummy_css;
-	cgrp->root = parent->root;
+	cgrp->root = root;
 
 	if (notify_on_release(parent))
 		set_bit(CGRP_NOTIFY_ON_RELEASE, &cgrp->flags);
@@ -4295,8 +4297,8 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 	/* create the directory */
 	kn = kernfs_create_dir(parent->kn, name, mode, cgrp);
 	if (IS_ERR(kn)) {
-		err = PTR_ERR(kn);
-		goto err_free_id;
+		ret = PTR_ERR(kn);
+		goto out_free_id;
 	}
 	cgrp->kn = kn;
 
@@ -4319,20 +4321,20 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 	 */
 	cgroup_idr_replace(&root->cgroup_idr, cgrp, cgrp->id);
 
-	err = cgroup_kn_set_ugid(kn);
-	if (err)
-		goto err_destroy;
+	ret = cgroup_kn_set_ugid(kn);
+	if (ret)
+		goto out_destroy;
 
-	err = cgroup_addrm_files(cgrp, cgroup_base_files, true);
-	if (err)
-		goto err_destroy;
+	ret = cgroup_addrm_files(cgrp, cgroup_base_files, true);
+	if (ret)
+		goto out_destroy;
 
 	/* let's create and online css's */
 	for_each_subsys(ss, ssid) {
 		if (parent->child_subsys_mask & (1 << ssid)) {
-			err = create_css(cgrp, ss);
-			if (err)
-				goto err_destroy;
+			ret = create_css(cgrp, ss);
+			if (ret)
+				goto out_destroy;
 		}
 	}
 
@@ -4345,25 +4347,22 @@ static long cgroup_create(struct cgroup *parent, const char *name,
 
 	kernfs_activate(kn);
 
-	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
-
-	return 0;
+	ret = 0;
+	goto out_unlock;
 
-err_free_id:
+out_free_id:
 	cgroup_idr_remove(&root->cgroup_idr, cgrp->id);
-err_unlock:
+out_free_cgrp:
+	kfree(cgrp);
+out_unlock:
 	mutex_unlock(&cgroup_mutex);
-err_unlock_tree:
+out_unlock_tree:
 	mutex_unlock(&cgroup_tree_mutex);
-	kfree(cgrp);
-	return err;
+	return ret;
 
-err_destroy:
+out_destroy:
 	cgroup_destroy_locked(cgrp);
-	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
-	return err;
+	goto out_unlock;
 }
 
 static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/8] cgroup: collapse cgroup_create() into croup_mkdir()
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
  2014-05-06 20:19 ` [PATCH 1/8] cgroup: reorganize cgroup_create() Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 3/8] cgroup: grab cgroup_mutex earlier in cgroup_subtree_control_write() Tejun Heo
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

cgroup_mkdir() is the sole user of cgroup_create().  Let's collapse
the latter into the former.  This will help simplifying locking.
While at it, remove now stale comment about inode locking.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 52 +++++++++++++---------------------------------------
 1 file changed, 13 insertions(+), 39 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 4d58229..69c0ed9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4236,30 +4236,24 @@ err_free_css:
 	return err;
 }
 
-/**
- * cgroup_create - create a cgroup
- * @parent: cgroup that will be parent of the new cgroup
- * @name: name of the new cgroup
- * @mode: mode to set on new cgroup
- */
-static long cgroup_create(struct cgroup *parent, const char *name,
-			  umode_t mode)
+static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
+			umode_t mode)
 {
-	struct cgroup *cgrp;
+	struct cgroup *parent = parent_kn->priv, *cgrp;
 	struct cgroup_root *root = parent->root;
-	int ssid, ret;
 	struct cgroup_subsys *ss;
 	struct kernfs_node *kn;
-
-	mutex_lock(&cgroup_tree_mutex);
+	int ssid, ret;
 
 	/*
-	 * Only live parents can have children.  Note that the liveliness
-	 * check isn't strictly necessary because cgroup_mkdir() and
-	 * cgroup_rmdir() are fully synchronized by i_mutex; however, do it
-	 * anyway so that locking is contained inside cgroup proper and we
-	 * don't get nasty surprises if we ever grow another caller.
+	 * cgroup_mkdir() grabs cgroup_tree_mutex which nests outside
+	 * kernfs active_ref and cgroup_create() already synchronizes
+	 * properly against removal through cgroup_lock_live_group().
+	 * Break it before calling cgroup_create().
 	 */
+	cgroup_get(parent);
+	kernfs_break_active_protection(parent_kn);
+	mutex_lock(&cgroup_tree_mutex);
 	if (!cgroup_lock_live_group(parent)) {
 		ret = -ENODEV;
 		goto out_unlock_tree;
@@ -4358,6 +4352,8 @@ out_unlock:
 	mutex_unlock(&cgroup_mutex);
 out_unlock_tree:
 	mutex_unlock(&cgroup_tree_mutex);
+	kernfs_unbreak_active_protection(parent_kn);
+	cgroup_put(parent);
 	return ret;
 
 out_destroy:
@@ -4365,28 +4361,6 @@ out_destroy:
 	goto out_unlock;
 }
 
-static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
-			umode_t mode)
-{
-	struct cgroup *parent = parent_kn->priv;
-	int ret;
-
-	/*
-	 * cgroup_create() grabs cgroup_tree_mutex which nests outside
-	 * kernfs active_ref and cgroup_create() already synchronizes
-	 * properly against removal through cgroup_lock_live_group().
-	 * Break it before calling cgroup_create().
-	 */
-	cgroup_get(parent);
-	kernfs_break_active_protection(parent_kn);
-
-	ret = cgroup_create(parent, name, mode);
-
-	kernfs_unbreak_active_protection(parent_kn);
-	cgroup_put(parent);
-	return ret;
-}
-
 /*
  * This is called when the refcnt of a css is confirmed to be killed.
  * css_tryget() is now guaranteed to fail.
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/8] cgroup: grab cgroup_mutex earlier in cgroup_subtree_control_write()
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
  2014-05-06 20:19 ` [PATCH 1/8] cgroup: reorganize cgroup_create() Tejun Heo
  2014-05-06 20:19 ` [PATCH 2/8] cgroup: collapse cgroup_create() into croup_mkdir() Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 4/8] cgroup: move cgroup->kn->priv clearing to cgroup_rmdir() Tejun Heo
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

Move cgroup_lock_live_group() invocation upwards to right below
cgroup_tree_mutex in cgroup_subtree_control_write().  This is to help
the planned locking simplification.

This doesn't make any userland-visible behavioral changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 69c0ed9..f544e46 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2583,6 +2583,10 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	kernfs_break_active_protection(of->kn);
 
 	mutex_lock(&cgroup_tree_mutex);
+	if (!cgroup_lock_live_group(cgrp)) {
+		ret = -ENODEV;
+		goto out_unlock_tree;
+	}
 
 	for_each_subsys(ss, ssid) {
 		if (enable & (1 << ssid)) {
@@ -2606,6 +2610,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 				cgroup_get(child);
 				prepare_to_wait(&child->offline_waitq, &wait,
 						TASK_UNINTERRUPTIBLE);
+				mutex_unlock(&cgroup_mutex);
 				mutex_unlock(&cgroup_tree_mutex);
 				schedule();
 				finish_wait(&child->offline_waitq, &wait);
@@ -2620,7 +2625,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			    (cgrp->parent &&
 			     !(cgrp->parent->child_subsys_mask & (1 << ssid)))) {
 				ret = -ENOENT;
-				goto out_unlock_tree;
+				goto out_unlock;
 			}
 		} else if (disable & (1 << ssid)) {
 			if (!(cgrp->child_subsys_mask & (1 << ssid))) {
@@ -2632,7 +2637,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			cgroup_for_each_live_child(child, cgrp) {
 				if (child->child_subsys_mask & (1 << ssid)) {
 					ret = -EBUSY;
-					goto out_unlock_tree;
+					goto out_unlock;
 				}
 			}
 		}
@@ -2640,12 +2645,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 
 	if (!enable && !disable) {
 		ret = 0;
-		goto out_unlock_tree;
-	}
-
-	if (!cgroup_lock_live_group(cgrp)) {
-		ret = -ENODEV;
-		goto out_unlock_tree;
+		goto out_unlock;
 	}
 
 	/*
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/8] cgroup: move cgroup->kn->priv clearing to cgroup_rmdir()
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (2 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 3/8] cgroup: grab cgroup_mutex earlier in cgroup_subtree_control_write() Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 5/8] cgroup: factor out cgroup_kn_lock_live() and cgroup_kn_unlock() Tejun Heo
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

The ->priv field of a cgroup directory kernfs_node points back to the
cgroup.  This field is RCU cleared in cgroup_destroy_locked() for
non-kernfs accesses from css_tryget_from_dir() and
cgroupstats_build().

As these are only applicable to cgroups which finished creation
successfully and fully initialized cgroups are always removed by
cgroup_rmdir(), this can be safely moved to the end of cgroup_rmdir().

This will help simplifying cgroup locking and shouldn't introduce any
behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f544e46..3118a63 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4547,17 +4547,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 
 	/* remove @cgrp directory along with the base files */
 	mutex_unlock(&cgroup_mutex);
-
-	/*
-	 * There are two control paths which try to determine cgroup from
-	 * dentry without going through kernfs - cgroupstats_build() and
-	 * css_tryget_from_dir().  Those are supported by RCU protecting
-	 * clearing of cgrp->kn->priv backpointer, which should happen
-	 * after all files under it have been removed.
-	 */
 	kernfs_remove(cgrp->kn);	/* @cgrp has an extra ref on its kn */
-	RCU_INIT_POINTER(*(void __rcu __force **)&cgrp->kn->priv, NULL);
-
 	mutex_lock(&cgroup_mutex);
 
 	return 0;
@@ -4616,6 +4606,17 @@ static int cgroup_rmdir(struct kernfs_node *kn)
 	mutex_unlock(&cgroup_tree_mutex);
 
 	kernfs_unbreak_active_protection(kn);
+
+	/*
+	 * There are two control paths which try to determine cgroup from
+	 * dentry without going through kernfs - cgroupstats_build() and
+	 * css_tryget_from_dir().  Those are supported by RCU protecting
+	 * clearing of kn->priv backpointer, which should happen after all
+	 * files under it have been removed.
+	 */
+	if (!ret)
+		RCU_INIT_POINTER(*(void __rcu __force **)&kn->priv, NULL);
+
 	cgroup_put(cgrp);
 	return ret;
 }
@@ -5175,7 +5176,7 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry,
 	/*
 	 * This path doesn't originate from kernfs and @kn could already
 	 * have been or be removed at any point.  @kn->priv is RCU
-	 * protected for this access.  See destroy_locked() for details.
+	 * protected for this access.  See cgroup_rmdir() for details.
 	 */
 	cgrp = rcu_dereference(kn->priv);
 	if (cgrp)
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/8] cgroup: factor out cgroup_kn_lock_live() and cgroup_kn_unlock()
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (3 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 4/8] cgroup: move cgroup->kn->priv clearing to cgroup_rmdir() Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 6/8] cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods Tejun Heo
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

cgroup_mkdir(), cgroup_rmdir() and cgroup_subtree_control_write()
share the logic to break active protection so that they can grab
cgroup_tree_mutex which nests above active protection and/or remove
self.  Factor out this logic into cgroup_kn_lock_live() and
cgroup_kn_unlock().

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 157 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 90 insertions(+), 67 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3118a63..1d0d5f7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1093,6 +1093,75 @@ static void cgroup_put(struct cgroup *cgrp)
 	call_rcu(&cgrp->rcu_head, cgroup_free_rcu);
 }
 
+/**
+ * cgroup_kn_unlock - unlocking helper for cgroup kernfs methods
+ * @kn: the kernfs_node being serviced
+ *
+ * This helper undoes cgroup_kn_lock_live() and should be invoked before
+ * the method finishes if locking succeeded.  Note that once this function
+ * returns the cgroup returned by cgroup_kn_lock_live() may become
+ * inaccessible any time.  If the caller intends to continue to access the
+ * cgroup, it should pin it before invoking this function.
+ */
+static void cgroup_kn_unlock(struct kernfs_node *kn)
+{
+	struct cgroup *cgrp;
+
+	if (kernfs_type(kn) == KERNFS_DIR)
+		cgrp = kn->priv;
+	else
+		cgrp = kn->parent->priv;
+
+	mutex_unlock(&cgroup_mutex);
+	mutex_unlock(&cgroup_tree_mutex);
+
+	kernfs_unbreak_active_protection(kn);
+	cgroup_put(cgrp);
+}
+
+/**
+ * cgroup_kn_lock_live - locking helper for cgroup kernfs methods
+ * @kn: the kernfs_node being serviced
+ *
+ * This helper is to be used by a cgroup kernfs method currently servicing
+ * @kn.  It breaks the active protection, performs cgroup locking and
+ * verifies that the associated cgroup is alive.  Returns the cgroup if
+ * alive; otherwise, %NULL.  A successful return should be undone by a
+ * matching cgroup_kn_unlock() invocation.
+ *
+ * Any cgroup kernfs method implementation which requires locking the
+ * associated cgroup should use this helper.  It avoids nesting cgroup
+ * locking under kernfs active protection and allows all kernfs operations
+ * including self-removal.
+ */
+static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
+{
+	struct cgroup *cgrp;
+
+	if (kernfs_type(kn) == KERNFS_DIR)
+		cgrp = kn->priv;
+	else
+		cgrp = kn->parent->priv;
+
+	/*
+	 * We're gonna grab cgroup_tree_mutex which nests outside kernfs
+	 * active_ref.  cgroup liveliness check alone provides enough
+	 * protection against removal.  Ensure @cgrp stays accessible and
+	 * break the active_ref protection.
+	 */
+	cgroup_get(cgrp);
+	kernfs_break_active_protection(kn);
+
+	mutex_lock(&cgroup_tree_mutex);
+	mutex_lock(&cgroup_mutex);
+
+	if (!cgroup_is_dead(cgrp))
+		return cgrp;
+
+	cgroup_kn_unlock(kn);
+	return NULL;
+}
+
 static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 {
 	char name[CGROUP_FILE_NAME_MAX];
@@ -2541,7 +2610,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 					    loff_t off)
 {
 	unsigned int enable = 0, disable = 0;
-	struct cgroup *cgrp = of_css(of)->cgroup, *child;
+	struct cgroup *cgrp, *child;
 	struct cgroup_subsys *ss;
 	char *tok;
 	int ssid, ret;
@@ -2573,20 +2642,9 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 			return -EINVAL;
 	}
 
-	/*
-	 * We're gonna grab cgroup_tree_mutex which nests outside kernfs
-	 * active_ref.  cgroup_lock_live_group() already provides enough
-	 * protection.  Ensure @cgrp stays accessible and break the
-	 * active_ref protection.
-	 */
-	cgroup_get(cgrp);
-	kernfs_break_active_protection(of->kn);
-
-	mutex_lock(&cgroup_tree_mutex);
-	if (!cgroup_lock_live_group(cgrp)) {
-		ret = -ENODEV;
-		goto out_unlock_tree;
-	}
+	cgrp = cgroup_kn_lock_live(of->kn);
+	if (!cgrp)
+		return -ENODEV;
 
 	for_each_subsys(ss, ssid) {
 		if (enable & (1 << ssid)) {
@@ -2610,14 +2668,12 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 				cgroup_get(child);
 				prepare_to_wait(&child->offline_waitq, &wait,
 						TASK_UNINTERRUPTIBLE);
-				mutex_unlock(&cgroup_mutex);
-				mutex_unlock(&cgroup_tree_mutex);
+				cgroup_kn_unlock(of->kn);
 				schedule();
 				finish_wait(&child->offline_waitq, &wait);
 				cgroup_put(child);
 
-				ret = restart_syscall();
-				goto out_unbreak;
+				return restart_syscall();
 			}
 
 			/* unavailable or not enabled on the parent? */
@@ -2693,12 +2749,7 @@ static ssize_t cgroup_subtree_control_write(struct kernfs_open_file *of,
 	kernfs_activate(cgrp->kn);
 	ret = 0;
 out_unlock:
-	mutex_unlock(&cgroup_mutex);
-out_unlock_tree:
-	mutex_unlock(&cgroup_tree_mutex);
-out_unbreak:
-	kernfs_unbreak_active_protection(of->kn);
-	cgroup_put(cgrp);
+	cgroup_kn_unlock(of->kn);
 	return ret ?: nbytes;
 
 err_undo_css:
@@ -4239,25 +4290,16 @@ err_free_css:
 static int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 			umode_t mode)
 {
-	struct cgroup *parent = parent_kn->priv, *cgrp;
-	struct cgroup_root *root = parent->root;
+	struct cgroup *parent, *cgrp;
+	struct cgroup_root *root;
 	struct cgroup_subsys *ss;
 	struct kernfs_node *kn;
 	int ssid, ret;
 
-	/*
-	 * cgroup_mkdir() grabs cgroup_tree_mutex which nests outside
-	 * kernfs active_ref and cgroup_create() already synchronizes
-	 * properly against removal through cgroup_lock_live_group().
-	 * Break it before calling cgroup_create().
-	 */
-	cgroup_get(parent);
-	kernfs_break_active_protection(parent_kn);
-	mutex_lock(&cgroup_tree_mutex);
-	if (!cgroup_lock_live_group(parent)) {
-		ret = -ENODEV;
-		goto out_unlock_tree;
-	}
+	parent = cgroup_kn_lock_live(parent_kn);
+	if (!parent)
+		return -ENODEV;
+	root = parent->root;
 
 	/* allocate the cgroup and its ID, 0 is reserved for the root */
 	cgrp = kzalloc(sizeof(*cgrp), GFP_KERNEL);
@@ -4349,11 +4391,7 @@ out_free_id:
 out_free_cgrp:
 	kfree(cgrp);
 out_unlock:
-	mutex_unlock(&cgroup_mutex);
-out_unlock_tree:
-	mutex_unlock(&cgroup_tree_mutex);
-	kernfs_unbreak_active_protection(parent_kn);
-	cgroup_put(parent);
+	cgroup_kn_unlock(parent_kn);
 	return ret;
 
 out_destroy:
@@ -4580,32 +4618,17 @@ static void cgroup_destroy_css_killed(struct cgroup *cgrp)
 
 static int cgroup_rmdir(struct kernfs_node *kn)
 {
-	struct cgroup *cgrp = kn->priv;
+	struct cgroup *cgrp;
 	int ret = 0;
 
-	/*
-	 * This is self-destruction but @kn can't be removed while this
-	 * callback is in progress.  Let's break active protection.  Once
-	 * the protection is broken, @cgrp can be destroyed at any point.
-	 * Pin it so that it stays accessible.
-	 */
-	cgroup_get(cgrp);
-	kernfs_break_active_protection(kn);
-
-	mutex_lock(&cgroup_tree_mutex);
-	mutex_lock(&cgroup_mutex);
-
-	/*
-	 * @cgrp might already have been destroyed while we're trying to
-	 * grab the mutexes.
-	 */
-	if (!cgroup_is_dead(cgrp))
-		ret = cgroup_destroy_locked(cgrp);
+	cgrp = cgroup_kn_lock_live(kn);
+	if (!cgrp)
+		return 0;
+	cgroup_get(cgrp);	/* for @kn->priv clearing */
 
-	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
+	ret = cgroup_destroy_locked(cgrp);
 
-	kernfs_unbreak_active_protection(kn);
+	cgroup_kn_unlock(kn);
 
 	/*
 	 * There are two control paths which try to determine cgroup from
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/8] cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (4 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 5/8] cgroup: factor out cgroup_kn_lock_live() and cgroup_kn_unlock() Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 7/8] cgroup: nest kernfs active protection under cgroup_mutex Tejun Heo
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

Make __cgroup_procs_write() and cgroup_release_agent_write() use
cgroup_kn_lock_live() and cgroup_kn_unlock() instead of
cgroup_lock_live_group().  This puts the operations under both
cgroup_tree_mutex and cgroup_mutex protection without circular
dependency from kernfs active protection.  Also, this means that
cgroup_mutex is no longer nested below kernfs active protection.
There is no longer any place where the two locks interact.

This leaves cgroup_lock_live_group() without any user.  Removed.

This will help simplifying cgroup locking.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1d0d5f7..281fce5 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -386,23 +386,6 @@ static int notify_on_release(const struct cgroup *cgrp)
 			;						\
 		else
 
-/**
- * cgroup_lock_live_group - take cgroup_mutex and check that cgrp is alive.
- * @cgrp: the cgroup to be checked for liveness
- *
- * On success, returns true; the mutex should be later unlocked.  On
- * failure returns false with no lock held.
- */
-static bool cgroup_lock_live_group(struct cgroup *cgrp)
-{
-	mutex_lock(&cgroup_mutex);
-	if (cgroup_is_dead(cgrp)) {
-		mutex_unlock(&cgroup_mutex);
-		return false;
-	}
-	return true;
-}
-
 /* the list of cgroups eligible for automatic release. Protected by
  * release_list_lock */
 static LIST_HEAD(release_list);
@@ -2306,14 +2289,15 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf,
 {
 	struct task_struct *tsk;
 	const struct cred *cred = current_cred(), *tcred;
-	struct cgroup *cgrp = of_css(of)->cgroup;
+	struct cgroup *cgrp;
 	pid_t pid;
 	int ret;
 
 	if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
 		return -EINVAL;
 
-	if (!cgroup_lock_live_group(cgrp))
+	cgrp = cgroup_kn_lock_live(of->kn);
+	if (!cgrp)
 		return -ENODEV;
 
 retry_find_task:
@@ -2379,7 +2363,7 @@ retry_find_task:
 
 	put_task_struct(tsk);
 out_unlock_cgroup:
-	mutex_unlock(&cgroup_mutex);
+	cgroup_kn_unlock(of->kn);
 	return ret ?: nbytes;
 }
 
@@ -2429,17 +2413,18 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of,
 static ssize_t cgroup_release_agent_write(struct kernfs_open_file *of,
 					  char *buf, size_t nbytes, loff_t off)
 {
-	struct cgroup *cgrp = of_css(of)->cgroup;
-	struct cgroup_root *root = cgrp->root;
+	struct cgroup *cgrp;
 
-	BUILD_BUG_ON(sizeof(root->release_agent_path) < PATH_MAX);
-	if (!cgroup_lock_live_group(cgrp))
+	BUILD_BUG_ON(sizeof(cgrp->root->release_agent_path) < PATH_MAX);
+
+	cgrp = cgroup_kn_lock_live(of->kn);
+	if (!cgrp)
 		return -ENODEV;
 	spin_lock(&release_agent_path_lock);
-	strlcpy(root->release_agent_path, strstrip(buf),
-		sizeof(root->release_agent_path));
+	strlcpy(cgrp->root->release_agent_path, strstrip(buf),
+		sizeof(cgrp->root->release_agent_path));
 	spin_unlock(&release_agent_path_lock);
-	mutex_unlock(&cgroup_mutex);
+	cgroup_kn_unlock(of->kn);
 	return nbytes;
 }
 
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/8] cgroup: nest kernfs active protection under cgroup_mutex
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (5 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 6/8] cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-06 20:19 ` [PATCH 8/8] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

After the recent cgroup_kn_lock_live() changes, cgroup_mutex is no
longer nested below kernfs active protection.  The two don't have any
relationship now.

This patch nests kernfs active protection under cgroup_mutex.  All
cftype operations now require both cgroup_tree_mutex and cgroup_mutex,
temporary cgroup_mutex releases over kernfs operations are removed,
and cgroup_add/rm_cftypes() grab both mutexes.

This makes cgroup_tree_mutex redundant, which will be removed by the
next patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 281fce5..601a35d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1127,7 +1127,7 @@ static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
 		cgrp = kn->parent->priv;
 
 	/*
-	 * We're gonna grab cgroup_tree_mutex which nests outside kernfs
+	 * We're gonna grab cgroup_mutex which nests outside kernfs
 	 * active_ref.  cgroup liveliness check alone provides enough
 	 * protection against removal.  Ensure @cgrp stays accessible and
 	 * break the active_ref protection.
@@ -1150,6 +1150,7 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 	char name[CGROUP_FILE_NAME_MAX];
 
 	lockdep_assert_held(&cgroup_tree_mutex);
+	lockdep_assert_held(&cgroup_mutex);
 	kernfs_remove_by_name(cgrp->kn, cgroup_file_name(cgrp, cft, name));
 }
 
@@ -1216,11 +1217,9 @@ static int rebind_subsystems(struct cgroup_root *dst_root, unsigned int ss_mask)
 	 * Nothing can fail from this point on.  Remove files for the
 	 * removed subsystems and rebind each subsystem.
 	 */
-	mutex_unlock(&cgroup_mutex);
 	for_each_subsys(ss, ssid)
 		if (ss_mask & (1 << ssid))
 			cgroup_clear_dir(&ss->root->cgrp, 1 << ssid);
-	mutex_lock(&cgroup_mutex);
 
 	for_each_subsys(ss, ssid) {
 		struct cgroup_root *src_root;
@@ -2946,6 +2945,7 @@ static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 	int ret;
 
 	lockdep_assert_held(&cgroup_tree_mutex);
+	lockdep_assert_held(&cgroup_mutex);
 
 	for (cft = cfts; cft->name[0] != '\0'; cft++) {
 		/* does cft->flags tell us to skip this file on @cgrp? */
@@ -2981,6 +2981,7 @@ static int cgroup_apply_cftypes(struct cftype *cfts, bool is_add)
 	int ret = 0;
 
 	lockdep_assert_held(&cgroup_tree_mutex);
+	lockdep_assert_held(&cgroup_mutex);
 
 	/* add/rm files for all cgroups created before */
 	css_for_each_descendant_pre(css, cgroup_css(root, ss)) {
@@ -3049,6 +3050,7 @@ static int cgroup_init_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 static int cgroup_rm_cftypes_locked(struct cftype *cfts)
 {
 	lockdep_assert_held(&cgroup_tree_mutex);
+	lockdep_assert_held(&cgroup_mutex);
 
 	if (!cfts || !cfts[0].ss)
 		return -ENOENT;
@@ -3075,7 +3077,9 @@ int cgroup_rm_cftypes(struct cftype *cfts)
 	int ret;
 
 	mutex_lock(&cgroup_tree_mutex);
+	mutex_lock(&cgroup_mutex);
 	ret = cgroup_rm_cftypes_locked(cfts);
+	mutex_unlock(&cgroup_mutex);
 	mutex_unlock(&cgroup_tree_mutex);
 	return ret;
 }
@@ -3106,12 +3110,14 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 		return ret;
 
 	mutex_lock(&cgroup_tree_mutex);
+	mutex_lock(&cgroup_mutex);
 
 	list_add_tail(&cfts->node, &ss->cfts);
 	ret = cgroup_apply_cftypes(cfts, true);
 	if (ret)
 		cgroup_rm_cftypes_locked(cfts);
 
+	mutex_unlock(&cgroup_mutex);
 	mutex_unlock(&cgroup_tree_mutex);
 	return ret;
 }
@@ -4446,6 +4452,7 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
 static void kill_css(struct cgroup_subsys_state *css)
 {
 	lockdep_assert_held(&cgroup_tree_mutex);
+	lockdep_assert_held(&cgroup_mutex);
 
 	/*
 	 * This must happen before css is disassociated with its cgroup.
@@ -4545,13 +4552,10 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	/*
 	 * Initiate massacre of all css's.  cgroup_destroy_css_killed()
 	 * will be invoked to perform the rest of destruction once the
-	 * percpu refs of all css's are confirmed to be killed.  This
-	 * involves removing the subsystem's files, drop cgroup_mutex.
+	 * percpu refs of all css's are confirmed to be killed.
 	 */
-	mutex_unlock(&cgroup_mutex);
 	for_each_css(css, ssid, cgrp)
 		kill_css(css);
-	mutex_lock(&cgroup_mutex);
 
 	/* CGRP_DEAD is set, remove from ->release_list for the last time */
 	raw_spin_lock(&release_list_lock);
@@ -4568,10 +4572,11 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	if (!cgrp->nr_css)
 		cgroup_destroy_css_killed(cgrp);
 
-	/* remove @cgrp directory along with the base files */
-	mutex_unlock(&cgroup_mutex);
-	kernfs_remove(cgrp->kn);	/* @cgrp has an extra ref on its kn */
-	mutex_lock(&cgroup_mutex);
+	/*
+	 * Remove @cgrp directory along with the base files.  @cgrp has an
+	 * extra ref on its kn.
+	 */
+	kernfs_remove(cgrp->kn);
 
 	return 0;
 };
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/8] cgroup: remove cgroup_tree_mutex
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (6 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 7/8] cgroup: nest kernfs active protection under cgroup_mutex Tejun Heo
@ 2014-05-06 20:19 ` Tejun Heo
  2014-05-09 19:58 ` [PATCHSET cgroup/for-3.16] " Tejun Heo
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-06 20:19 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel, Tejun Heo

cgroup_tree_mutex was introduced to work around the circular
dependency between cgroup_mutex and kernfs active protection - some
kernfs file and directory operations needed cgroup_mutex putting
cgroup_mutex under active protection but cgroup also needs to be able
to access cgroup hierarchies and cftypes to determine which
kernfs_nodes need to be removed.  cgroup_tree_mutex nested above both
cgroup_mutex and kernfs active protection and used to protect the
hierarchy and cftypes.  While this worked, it added a lot of double
lockings and was generally cumbersome.

kernfs provides a mechanism to opt out of active protection and cgroup
was already using it for removal and subtree_control.  There's no
reason to mix both methods of avoiding circular locking dependency and
the preceding cgroup_kn_lock_live() changes applied it to all relevant
cgroup kernfs operations making it unnecessary to nest cgroup_mutex
under kernfs active protection.  The previous patch reversed the
original lock ordering and put cgroup_mutex above kernfs active
protection.

After these changes, all cgroup_tree_mutex usages are now accompanied
by cgroup_mutex making the former completely redundant.  This patch
removes cgroup_tree_mutex and all its usages.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/cgroup.c | 64 ++++++++-------------------------------------------------
 1 file changed, 9 insertions(+), 55 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 601a35d..4f218f3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -71,15 +71,6 @@
 					 MAX_CFTYPE_NAME + 2)
 
 /*
- * cgroup_tree_mutex nests above cgroup_mutex and protects cftypes, file
- * creation/removal and hierarchy changing operations including cgroup
- * creation, removal, css association and controller rebinding.  This outer
- * lock is needed mainly to resolve the circular dependency between kernfs
- * active ref and cgroup_mutex.  cgroup_tree_mutex nests above both.
- */
-static DEFINE_MUTEX(cgroup_tree_mutex);
-
-/*
  * cgroup_mutex is the master lock.  Any modification to cgroup or its
  * hierarchy must be performed while holding it.
  *
@@ -111,11 +102,10 @@ static DEFINE_SPINLOCK(cgroup_idr_lock);
  */
 static DEFINE_SPINLOCK(release_agent_path_lock);
 
-#define cgroup_assert_mutexes_or_rcu_locked()				\
+#define cgroup_assert_mutex_or_rcu_locked()				\
 	rcu_lockdep_assert(rcu_read_lock_held() ||			\
-			   lockdep_is_held(&cgroup_tree_mutex) ||	\
 			   lockdep_is_held(&cgroup_mutex),		\
-			   "cgroup_[tree_]mutex or RCU read lock required");
+			   "cgroup_mutex or RCU read lock required");
 
 /*
  * cgroup destruction makes heavy use of work items and there can be a lot
@@ -243,7 +233,6 @@ static struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
 {
 	if (ss)
 		return rcu_dereference_check(cgrp->subsys[ss->id],
-					lockdep_is_held(&cgroup_tree_mutex) ||
 					lockdep_is_held(&cgroup_mutex));
 	else
 		return &cgrp->dummy_css;
@@ -347,7 +336,6 @@ static int notify_on_release(const struct cgroup *cgrp)
 	for ((ssid) = 0; (ssid) < CGROUP_SUBSYS_COUNT; (ssid)++)	\
 		if (!((css) = rcu_dereference_check(			\
 				(cgrp)->subsys[(ssid)],			\
-				lockdep_is_held(&cgroup_tree_mutex) ||	\
 				lockdep_is_held(&cgroup_mutex)))) { }	\
 		else
 
@@ -381,7 +369,7 @@ static int notify_on_release(const struct cgroup *cgrp)
 /* iterate over child cgrps, lock should be held throughout iteration */
 #define cgroup_for_each_live_child(child, cgrp)				\
 	list_for_each_entry((child), &(cgrp)->children, sibling)	\
-		if (({ lockdep_assert_held(&cgroup_tree_mutex);		\
+		if (({ lockdep_assert_held(&cgroup_mutex);		\
 		       cgroup_is_dead(child); }))			\
 			;						\
 		else
@@ -869,7 +857,6 @@ static void cgroup_destroy_root(struct cgroup_root *root)
 	struct cgroup *cgrp = &root->cgrp;
 	struct cgrp_cset_link *link, *tmp_link;
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	BUG_ON(atomic_read(&root->nr_cgrps));
@@ -899,7 +886,6 @@ static void cgroup_destroy_root(struct cgroup_root *root)
 	cgroup_exit_root_id(root);
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	kernfs_destroy_root(root->kf_root);
 	cgroup_free_root(root);
@@ -1096,7 +1082,6 @@ static void cgroup_kn_unlock(struct kernfs_node *kn)
 		cgrp = kn->parent->priv;
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	kernfs_unbreak_active_protection(kn);
 	cgroup_put(cgrp);
@@ -1135,7 +1120,6 @@ static struct cgroup *cgroup_kn_lock_live(struct kernfs_node *kn)
 	cgroup_get(cgrp);
 	kernfs_break_active_protection(kn);
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	if (!cgroup_is_dead(cgrp))
@@ -1149,7 +1133,6 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
 {
 	char name[CGROUP_FILE_NAME_MAX];
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 	kernfs_remove_by_name(cgrp->kn, cgroup_file_name(cgrp, cft, name));
 }
@@ -1179,7 +1162,6 @@ static int rebind_subsystems(struct cgroup_root *dst_root, unsigned int ss_mask)
 	struct cgroup_subsys *ss;
 	int ssid, i, ret;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	for_each_subsys(ss, ssid) {
@@ -1457,7 +1439,6 @@ static int cgroup_remount(struct kernfs_root *kf_root, int *flags, char *data)
 		return -EINVAL;
 	}
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	/* See what subsystems are wanted */
@@ -1503,7 +1484,6 @@ static int cgroup_remount(struct kernfs_root *kf_root, int *flags, char *data)
 	kfree(opts.release_agent);
 	kfree(opts.name);
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 	return ret;
 }
 
@@ -1606,7 +1586,6 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned int ss_mask)
 	struct css_set *cset;
 	int i, ret;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	ret = cgroup_idr_alloc(&root->cgroup_idr, root_cgrp, 1, 2, GFP_NOWAIT);
@@ -1696,7 +1675,6 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 	if (!use_task_css_set_links)
 		cgroup_enable_task_cg_lists();
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	/* First find the desired set of subsystems */
@@ -1761,9 +1739,7 @@ retry:
 		 */
 		if (!atomic_inc_not_zero(&root->cgrp.refcnt)) {
 			mutex_unlock(&cgroup_mutex);
-			mutex_unlock(&cgroup_tree_mutex);
 			msleep(10);
-			mutex_lock(&cgroup_tree_mutex);
 			mutex_lock(&cgroup_mutex);
 			goto retry;
 		}
@@ -1796,7 +1772,6 @@ retry:
 
 out_unlock:
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	kfree(opts.release_agent);
 	kfree(opts.name);
@@ -2507,7 +2482,6 @@ static int cgroup_update_dfl_csses(struct cgroup *cgrp)
 	struct css_set *src_cset;
 	int ret;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	/* look up all csses currently attached to @cgrp's subtree */
@@ -2866,20 +2840,18 @@ static int cgroup_rename(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		return -EPERM;
 
 	/*
-	 * We're gonna grab cgroup_tree_mutex which nests outside kernfs
+	 * We're gonna grab cgroup_mutex which nests outside kernfs
 	 * active_ref.  kernfs_rename() doesn't require active_ref
-	 * protection.  Break them before grabbing cgroup_tree_mutex.
+	 * protection.  Break them before grabbing cgroup_mutex.
 	 */
 	kernfs_break_active_protection(new_parent);
 	kernfs_break_active_protection(kn);
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	ret = kernfs_rename(kn, new_parent, new_name_str);
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	kernfs_unbreak_active_protection(kn);
 	kernfs_unbreak_active_protection(new_parent);
@@ -2944,7 +2916,6 @@ static int cgroup_addrm_files(struct cgroup *cgrp, struct cftype cfts[],
 	struct cftype *cft;
 	int ret;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	for (cft = cfts; cft->name[0] != '\0'; cft++) {
@@ -2980,7 +2951,6 @@ static int cgroup_apply_cftypes(struct cftype *cfts, bool is_add)
 	struct cgroup_subsys_state *css;
 	int ret = 0;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	/* add/rm files for all cgroups created before */
@@ -3049,7 +3019,6 @@ static int cgroup_init_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 
 static int cgroup_rm_cftypes_locked(struct cftype *cfts)
 {
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	if (!cfts || !cfts[0].ss)
@@ -3076,11 +3045,9 @@ int cgroup_rm_cftypes(struct cftype *cfts)
 {
 	int ret;
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 	ret = cgroup_rm_cftypes_locked(cfts);
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 	return ret;
 }
 
@@ -3109,7 +3076,6 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 	if (ret)
 		return ret;
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	list_add_tail(&cfts->node, &ss->cfts);
@@ -3118,7 +3084,6 @@ int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
 		cgroup_rm_cftypes_locked(cfts);
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 	return ret;
 }
 
@@ -3158,7 +3123,7 @@ css_next_child(struct cgroup_subsys_state *pos_css,
 	struct cgroup *cgrp = parent_css->cgroup;
 	struct cgroup *next;
 
-	cgroup_assert_mutexes_or_rcu_locked();
+	cgroup_assert_mutex_or_rcu_locked();
 
 	/*
 	 * @pos could already have been removed.  Once a cgroup is removed,
@@ -3224,7 +3189,7 @@ css_next_descendant_pre(struct cgroup_subsys_state *pos,
 {
 	struct cgroup_subsys_state *next;
 
-	cgroup_assert_mutexes_or_rcu_locked();
+	cgroup_assert_mutex_or_rcu_locked();
 
 	/* if first iteration, visit @root */
 	if (!pos)
@@ -3264,7 +3229,7 @@ css_rightmost_descendant(struct cgroup_subsys_state *pos)
 {
 	struct cgroup_subsys_state *last, *tmp;
 
-	cgroup_assert_mutexes_or_rcu_locked();
+	cgroup_assert_mutex_or_rcu_locked();
 
 	do {
 		last = pos;
@@ -3311,7 +3276,7 @@ css_next_descendant_post(struct cgroup_subsys_state *pos,
 {
 	struct cgroup_subsys_state *next;
 
-	cgroup_assert_mutexes_or_rcu_locked();
+	cgroup_assert_mutex_or_rcu_locked();
 
 	/* if first iteration, visit leftmost descendant which may be @root */
 	if (!pos)
@@ -4179,7 +4144,6 @@ static int online_css(struct cgroup_subsys_state *css)
 	struct cgroup_subsys *ss = css->ss;
 	int ret = 0;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	if (ss->css_online)
@@ -4197,7 +4161,6 @@ static void offline_css(struct cgroup_subsys_state *css)
 {
 	struct cgroup_subsys *ss = css->ss;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	if (!(css->flags & CSS_ONLINE))
@@ -4400,7 +4363,6 @@ static void css_killed_work_fn(struct work_struct *work)
 		container_of(work, struct cgroup_subsys_state, destroy_work);
 	struct cgroup *cgrp = css->cgroup;
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	/*
@@ -4418,7 +4380,6 @@ static void css_killed_work_fn(struct work_struct *work)
 		cgroup_destroy_css_killed(cgrp);
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	/*
 	 * Put the css refs from kill_css().  Each css holds an extra
@@ -4451,7 +4412,6 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
  */
 static void kill_css(struct cgroup_subsys_state *css)
 {
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	/*
@@ -4511,7 +4471,6 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	bool empty;
 	int ssid;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	/*
@@ -4594,7 +4553,6 @@ static void cgroup_destroy_css_killed(struct cgroup *cgrp)
 {
 	struct cgroup *parent = cgrp->parent;
 
-	lockdep_assert_held(&cgroup_tree_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
 	/* delete this cgroup from parent->children */
@@ -4648,7 +4606,6 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
 
 	printk(KERN_INFO "Initializing cgroup subsys %s\n", ss->name);
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	idr_init(&ss->css_idr);
@@ -4686,7 +4643,6 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
 	cgrp_dfl_root.subsys_mask |= 1 << ss->id;
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 }
 
 /**
@@ -4736,7 +4692,6 @@ int __init cgroup_init(void)
 
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_base_files));
 
-	mutex_lock(&cgroup_tree_mutex);
 	mutex_lock(&cgroup_mutex);
 
 	/* Add init_css_set to the hash table */
@@ -4746,7 +4701,6 @@ int __init cgroup_init(void)
 	BUG_ON(cgroup_setup_root(&cgrp_dfl_root, 0));
 
 	mutex_unlock(&cgroup_mutex);
-	mutex_unlock(&cgroup_tree_mutex);
 
 	for_each_subsys(ss, ssid) {
 		if (ss->early_init) {
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (7 preceding siblings ...)
  2014-05-06 20:19 ` [PATCH 8/8] cgroup: remove cgroup_tree_mutex Tejun Heo
@ 2014-05-09 19:58 ` Tejun Heo
  2014-05-13  7:35 ` Li Zefan
  2014-05-13 16:22 ` Tejun Heo
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-09 19:58 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel

On Tue, May 06, 2014 at 04:19:26PM -0400, Tejun Heo wrote:
> This patchset is on top of
> 
>   cgroup/for-3.16 12d3089c192c ("kernel/cpuset.c: convert printk to pr_foo()")
> + [1] [PATCHSET cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates
> + [2] [PATCHSET cgroup/for-3.16] cgroup: implement cftype->write()
> 
> and is available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex

Refreshed without content change on top of

 b9a63d0116e8 ("Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu into for-3.16")
 + [1] [PATCHSET v2 cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates
 + [2] (REFRESHED) [PATCHSET cgroup/for-3.16] cgroup: implement cftype->write()

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex-v2

Thanks.

-- 
tejun

[1] http://lkml.kernel.org/g/1399663975-315-1-git-send-email-tj@kernel.org
[2] http://lkml.kernel.org/g/20140509195059.GE4486@htj.dyndns.org

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (8 preceding siblings ...)
  2014-05-09 19:58 ` [PATCHSET cgroup/for-3.16] " Tejun Heo
@ 2014-05-13  7:35 ` Li Zefan
  2014-05-13 16:22 ` Tejun Heo
  10 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2014-05-13  7:35 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups, linux-kernel

On 2014/5/7 4:19, Tejun Heo wrote:
> Hello,
> 
> cgroup_tree_mutex was introduced during kernfs conversion to work
> around the cyclic locking dependency between kernfs active protection
> and cgroup_mutex.  Some file and directory operations need to acquire
> cgroup_mutex which puts the mutex under the kernfs active protection;
> however, cgroup also needs to access the hierarchy and the registered
> cftypes to detemine which files to remove, which obviously can't be
> done while holding cgroup_mutex anymore.
> 
> cgroup_tree_mutex nests above both cgroup_mutex and kernfs active
> protection and protects hierarchy and cftypes so that those file
> operations can be performed while holding it without cgroup_mutex.
> This works but is kinda cumbersome as most places end up taking both
> cgroup_tree_mutex and cgroup_mutex and there's on-going friction on
> what needs to be protected by which combination.
> 
> Furthermore, due to new requirements from subtree_control
> implementations, kernfs ended up growing full-blown mechanism to
> bypass active protection instead of just supporting self-removal and
> cgroup ended up using both mechanisms - two layered mutexes and active
> protection bypss - on different areas, which is totally unncessary.
> 
> This patchset converts everything over to kernfs active protection
> bypass and drops cgroup_tree_mutex making cgroup locking noticeably
> simpler.  It contains the following eight patches.
> 
>  0001-cgroup-reorganize-cgroup_create.patch
>  0002-cgroup-collapse-cgroup_create-into-croup_mkdir.patch
>  0003-cgroup-grab-cgroup_mutex-earlier-in-cgroup_subtree_c.patch
>  0004-cgroup-move-cgroup-kn-priv-clearing-to-cgroup_rmdir.patch
>  0005-cgroup-factor-out-cgroup_kn_lock_live-and-cgroup_kn_.patch
>  0006-cgroup-use-cgroup_kn_lock_live-in-other-cgroup-kernf.patch
>  0007-cgroup-nest-kernfs-active-protection-under-cgroup_mu.patch
>  0008-cgroup-remove-cgroup_tree_mutex.patch
> 
> 0001-0004 reorganize various kernfs handling paths so that they are
> more uniform in terms of active protection handling.
> 
> 0005 factors out two locking helpers - cgroup_kn_lock_live() and
> cgroup_kn_unlock() - which handle both kernfs active protection bypass
> and locking.
> 
> 0006 applies it to other kernfs method implementations which were
> grabbing cgroup_mutex under active protection.
> 
> 0007 reverses the locking dependency between cgroup_mutex and kernfs
> active protection so that the latter nests under the former, making
> cgroup_mutex equivalent to cgroup_tree_mutex.
> 
> 0008 removes cgroup_tree_mutex.
> 
> This patchset is on top of
> 
>   cgroup/for-3.16 12d3089c192c ("kernel/cpuset.c: convert printk to pr_foo()")
> + [1] [PATCHSET cgroup/for-3.16] cgroup: post unified hierarchy fixes and updates
> + [2] [PATCHSET cgroup/for-3.16] cgroup: implement cftype->write()
> 
> and is available in the following git branch.
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex
> 
> diffstat follows.  Thanks.
> 
>  kernel/cgroup.c |  385 +++++++++++++++++++++++---------------------------------
>  1 file changed, 163 insertions(+), 222 deletions(-)
> 

Acked-by: Li Zefan <lizefan@huawei.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex
  2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
                   ` (9 preceding siblings ...)
  2014-05-13  7:35 ` Li Zefan
@ 2014-05-13 16:22 ` Tejun Heo
  10 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2014-05-13 16:22 UTC (permalink / raw)
  To: lizefan; +Cc: cgroups, linux-kernel

On Tue, May 06, 2014 at 04:19:26PM -0400, Tejun Heo wrote:
> Hello,
> 
> cgroup_tree_mutex was introduced during kernfs conversion to work
> around the cyclic locking dependency between kernfs active protection
> and cgroup_mutex.  Some file and directory operations need to acquire
> cgroup_mutex which puts the mutex under the kernfs active protection;
> however, cgroup also needs to access the hierarchy and the registered
> cftypes to detemine which files to remove, which obviously can't be
> done while holding cgroup_mutex anymore.
> 
> cgroup_tree_mutex nests above both cgroup_mutex and kernfs active
> protection and protects hierarchy and cftypes so that those file
> operations can be performed while holding it without cgroup_mutex.
> This works but is kinda cumbersome as most places end up taking both
> cgroup_tree_mutex and cgroup_mutex and there's on-going friction on
> what needs to be protected by which combination.
> 
> Furthermore, due to new requirements from subtree_control
> implementations, kernfs ended up growing full-blown mechanism to
> bypass active protection instead of just supporting self-removal and
> cgroup ended up using both mechanisms - two layered mutexes and active
> protection bypss - on different areas, which is totally unncessary.
> 
> This patchset converts everything over to kernfs active protection
> bypass and drops cgroup_tree_mutex making cgroup locking noticeably
> simpler.  It contains the following eight patches.
> 
>  0001-cgroup-reorganize-cgroup_create.patch
>  0002-cgroup-collapse-cgroup_create-into-croup_mkdir.patch
>  0003-cgroup-grab-cgroup_mutex-earlier-in-cgroup_subtree_c.patch
>  0004-cgroup-move-cgroup-kn-priv-clearing-to-cgroup_rmdir.patch
>  0005-cgroup-factor-out-cgroup_kn_lock_live-and-cgroup_kn_.patch
>  0006-cgroup-use-cgroup_kn_lock_live-in-other-cgroup-kernf.patch
>  0007-cgroup-nest-kernfs-active-protection-under-cgroup_mu.patch
>  0008-cgroup-remove-cgroup_tree_mutex.patch

Applied to cgroup/for-3.16.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-05-13 16:22 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-06 20:19 [PATCHSET cgroup/for-3.16] cgroup: remove cgroup_tree_mutex Tejun Heo
2014-05-06 20:19 ` [PATCH 1/8] cgroup: reorganize cgroup_create() Tejun Heo
2014-05-06 20:19 ` [PATCH 2/8] cgroup: collapse cgroup_create() into croup_mkdir() Tejun Heo
2014-05-06 20:19 ` [PATCH 3/8] cgroup: grab cgroup_mutex earlier in cgroup_subtree_control_write() Tejun Heo
2014-05-06 20:19 ` [PATCH 4/8] cgroup: move cgroup->kn->priv clearing to cgroup_rmdir() Tejun Heo
2014-05-06 20:19 ` [PATCH 5/8] cgroup: factor out cgroup_kn_lock_live() and cgroup_kn_unlock() Tejun Heo
2014-05-06 20:19 ` [PATCH 6/8] cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods Tejun Heo
2014-05-06 20:19 ` [PATCH 7/8] cgroup: nest kernfs active protection under cgroup_mutex Tejun Heo
2014-05-06 20:19 ` [PATCH 8/8] cgroup: remove cgroup_tree_mutex Tejun Heo
2014-05-09 19:58 ` [PATCHSET cgroup/for-3.16] " Tejun Heo
2014-05-13  7:35 ` Li Zefan
2014-05-13 16:22 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).