cgroups.vger.kernel.org archive mirror
* [PATCH] cgroup: fix rmdir EBUSY regression in 3.11
@ 2013-08-28 23:31 Hugh Dickins
From: Hugh Dickins @ 2013-08-28 23:31 UTC
  To: Tejun Heo
  Cc: Linus Torvalds, Li Zefan, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 3.11-rc we are seeing cgroup directories left behind when they should
have been removed.  Here's a trivial reproducer:

cd /sys/fs/cgroup/memory
mkdir parent parent/child; rmdir parent/child parent
rmdir: failed to remove `parent': Device or resource busy

It's because cgroup_destroy_locked() (step 1 of destruction) leaves
cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
destruction) remove it; but step 2 is run by work queue, which may not
yet have removed the children when parent destruction checks the list.

Fix that by checking through a non-empty list of children: if every one
of them has already been marked CGRP_DEAD, then it's safe to proceed:
those children are invisible to userspace, and should not obstruct rmdir.

(I didn't see any reason to keep the cgrp->children checks under the
unrelated css_set_lock, so moved them out.)
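
To make the difference between the old and new checks concrete, here is a
minimal userspace sketch. It is not kernel code: struct model_cgroup and the
model_* helpers are hypothetical names invented for illustration, a plain
singly linked list stands in for the RCU-protected ->children list, and a
bool stands in for the CGRP_DEAD flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cgroup (not the kernel type). */
struct model_cgroup {
	bool dead;                      /* models CGRP_DEAD: step 1 has run */
	struct model_cgroup *children;  /* first child, or NULL */
	struct model_cgroup *sibling;   /* next child of the same parent */
};

/* Link @child at the head of @parent's children list. */
static void model_add_child(struct model_cgroup *parent,
			    struct model_cgroup *child)
{
	assert(parent && child);
	child->sibling = parent->children;
	parent->children = child;
}

/* Old behaviour: any child still on the list blocks rmdir. */
static bool model_busy_old(const struct model_cgroup *cgrp)
{
	return cgrp->children != NULL;
}

/*
 * New behaviour: only a live child blocks rmdir.  Children that are
 * marked dead but not yet unlinked (step 2 still pending) are ignored.
 */
static bool model_busy_new(const struct model_cgroup *cgrp)
{
	const struct model_cgroup *child;

	for (child = cgrp->children; child; child = child->sibling)
		if (!child->dead)
			return true;
	return false;
}
```

In this model, a parent whose only children are marked dead but still linked
is busy under the old check and removable under the new one, which is the
window the reproducer above hits.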

Signed-off-by: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---

 kernel/cgroup.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- 3.11-rc7/kernel/cgroup.c	2013-08-11 23:38:33.096579188 -0700
+++ linux/kernel/cgroup.c	2013-08-28 15:44:06.952564071 -0700
@@ -4490,8 +4490,23 @@ static int cgroup_destroy_locked(struct
 	 * @cgrp from being removed while __put_css_set() is in progress.
 	 */
 	read_lock(&css_set_lock);
-	empty = list_empty(&cgrp->cset_links) && list_empty(&cgrp->children);
+	empty = list_empty(&cgrp->cset_links);
 	read_unlock(&css_set_lock);
+
+	if (empty && !list_empty(&cgrp->children)) {
+		struct cgroup *child;
+		/*
+		 * We need "rmdir parent/child parent" to succeed,
+		 * but our children may not have reached step 2 yet.
+		 */
+		rcu_read_lock();
+		list_for_each_entry_rcu(child, &cgrp->children, sibling) {
+			empty = cgroup_is_dead(child);
+			if (!empty)
+				break;
+		}
+		rcu_read_unlock();
+	}
 	if (!empty)
 		return -EBUSY;
 


* [PATCH cgroup/for-3.11-fixes] cgroup: fix rmdir EBUSY regression in 3.11
@ 2013-08-29 15:42 Tejun Heo
From: Tejun Heo @ 2013-08-29 15:42 UTC
  To: Hugh Dickins
  Cc: Linus Torvalds, Li Zefan, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hello, Hugh.

Oops, sorry about that.  I massaged the patch a bit and applied it to
cgroup/for-3.11-fixes.  Will send pull request to Linus right away.

Thanks.
------ 8< ------
From bb78a92f47696b2da49f2692b6a9fa56d07c444a Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Date: Wed, 28 Aug 2013 16:31:23 -0700

On 3.11-rc we are seeing cgroup directories left behind when they should
have been removed.  Here's a trivial reproducer:

cd /sys/fs/cgroup/memory
mkdir parent parent/child; rmdir parent/child parent
rmdir: failed to remove `parent': Device or resource busy

It's because cgroup_destroy_locked() (step 1 of destruction) leaves
cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
destruction) remove it; but step 2 is run by work queue, which may not
yet have removed the children when parent destruction checks the list.

Fix that by checking through a non-empty list of children: if every one
of them has already been marked CGRP_DEAD, then it's safe to proceed:
those children are invisible to userspace, and should not obstruct rmdir.

(I didn't see any reason to keep the cgrp->children checks under the
unrelated css_set_lock, so moved them out.)

tj: Flattened nested ifs a bit and updated comment so that it's
    correct on both for-3.11-fixes and for-3.12.

Signed-off-by: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 kernel/cgroup.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 781845a..e919633 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4480,6 +4480,7 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	struct dentry *d = cgrp->dentry;
 	struct cgroup_event *event, *tmp;
 	struct cgroup_subsys *ss;
+	struct cgroup *child;
 	bool empty;
 
 	lockdep_assert_held(&d->d_inode->i_mutex);
@@ -4490,12 +4491,28 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
 	 * @cgrp from being removed while __put_css_set() is in progress.
 	 */
 	read_lock(&css_set_lock);
-	empty = list_empty(&cgrp->cset_links) && list_empty(&cgrp->children);
+	empty = list_empty(&cgrp->cset_links);
 	read_unlock(&css_set_lock);
 	if (!empty)
 		return -EBUSY;
 
 	/*
+	 * Make sure there's no live children.  We can't test ->children
+	 * emptiness as dead children linger on it while being destroyed;
+	 * otherwise, "rmdir parent/child parent" may fail with -EBUSY.
+	 */
+	empty = true;
+	rcu_read_lock();
+	list_for_each_entry_rcu(child, &cgrp->children, sibling) {
+		empty = cgroup_is_dead(child);
+		if (!empty)
+			break;
+	}
+	rcu_read_unlock();
+	if (!empty)
+		return -EBUSY;
+
+	/*
 	 * Block new css_tryget() by killing css refcnts.  cgroup core
 	 * guarantees that, by the time ->css_offline() is invoked, no new
 	 * css reference will be given out via css_tryget().  We can't
-- 
1.8.3.1
