From: Tejun Heo <tj@kernel.org>
To: lizefan@huawei.com
Cc: cl@linux-foundation.org,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>,
	cgroups@vger.kernel.org,
	koverstreet@google.com
Subject: [PATCH 09/11] cgroup: reorder the operations in cgroup_destroy_locked()
Date: Wed, 12 Jun 2013 21:04:56 -0700
Message-ID: <1371096298-24402-10-git-send-email-tj@kernel.org>
In-Reply-To: <1371096298-24402-1-git-send-email-tj@kernel.org>

This patch reorders the operations in cgroup_destroy_locked() such
that the userland visible parts happen before css offlining and
removal from the ->sibling list. This will be used to make css use
percpu refcnt.

While at it, split out CGRP_DEAD related comment from the refcnt
deactivation one and correct / clarify how different guarantees are
met.

While this patch changes the specific order of operations, it
shouldn't cause any noticeable behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/cgroup.c | 61 +++++++++++++++++++++++++++++++++------------------------
1 file changed, 35 insertions(+), 26 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 062653f..9261acc 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4382,13 +4382,8 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
/*
* Block new css_tryget() by deactivating refcnt and mark @cgrp
- * removed. This makes future css_tryget() and child creation
- * attempts fail thus maintaining the removal conditions verified
- * above.
- *
- * Note that CGRP_DEAD assertion is depended upon by
- * cgroup_next_sibling() to resume iteration after dropping RCU
- * read lock. See cgroup_next_sibling() for details.
+ * removed. This makes future css_tryget() attempts fail which we
+ * guarantee to ->css_offline() callbacks.
*/
for_each_subsys(cgrp->root, ss) {
struct cgroup_subsys_state *css = cgrp->subsys[ss->subsys_id];
@@ -4396,8 +4391,41 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
WARN_ON(atomic_read(&css->refcnt) < 0);
atomic_add(CSS_DEACT_BIAS, &css->refcnt);
}
+
+ /*
+ * Mark @cgrp dead. This prevents further task migration and child
+ * creation by disabling cgroup_lock_live_group(). Note that
+ * CGRP_DEAD assertion is depended upon by cgroup_next_sibling() to
+ * resume iteration after dropping RCU read lock. See
+ * cgroup_next_sibling() for details.
+ */
set_bit(CGRP_DEAD, &cgrp->flags);
+ /* CGRP_DEAD is set, remove from ->release_list for the last time */
+ raw_spin_lock(&release_list_lock);
+ if (!list_empty(&cgrp->release_list))
+ list_del_init(&cgrp->release_list);
+ raw_spin_unlock(&release_list_lock);
+
+ /*
+ * Remove @cgrp directory. The removal puts the base ref but we
+ * aren't quite done with @cgrp yet, so hold onto it.
+ */
+ dget(d);
+ cgroup_d_remove_dir(d);
+
+ /*
+ * Unregister events and notify userspace.
+ * Notify userspace about cgroup removing only after rmdir of cgroup
+ * directory to avoid race between userspace and kernelspace.
+ */
+ spin_lock(&cgrp->event_list_lock);
+ list_for_each_entry_safe(event, tmp, &cgrp->event_list, list) {
+ list_del_init(&event->list);
+ schedule_work(&event->remove);
+ }
+ spin_unlock(&cgrp->event_list_lock);
+
/* tell subsystems to initate destruction */
for_each_subsys(cgrp->root, ss)
offline_css(ss, cgrp);
@@ -4412,34 +4440,15 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
for_each_subsys(cgrp->root, ss)
css_put(cgrp->subsys[ss->subsys_id]);
- raw_spin_lock(&release_list_lock);
- if (!list_empty(&cgrp->release_list))
- list_del_init(&cgrp->release_list);
- raw_spin_unlock(&release_list_lock);
-
/* delete this cgroup from parent->children */
list_del_rcu(&cgrp->sibling);
list_del_init(&cgrp->allcg_node);
- dget(d);
- cgroup_d_remove_dir(d);
dput(d);
set_bit(CGRP_RELEASABLE, &parent->flags);
check_for_release(parent);
- /*
- * Unregister events and notify userspace.
- * Notify userspace about cgroup removing only after rmdir of cgroup
- * directory to avoid race between userspace and kernelspace.
- */
- spin_lock(&cgrp->event_list_lock);
- list_for_each_entry_safe(event, tmp, &cgrp->event_list, list) {
- list_del_init(&event->list);
- schedule_work(&event->remove);
- }
- spin_unlock(&cgrp->event_list_lock);
-
return 0;
}
--
1.8.2.1
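
For readers unfamiliar with the refcnt deactivation step that the
comment in the first hunk refers to, here is a minimal userspace
sketch of the idea. It is not the kernel code. It assumes C11 atomics,
and the names model_css, model_tryget, model_deactivate, model_unbias
and MODEL_DEACT_BIAS are hypothetical stand-ins chosen for this
illustration. Adding a large negative bias makes the counter negative,
so later tryget attempts fail while the number of existing references
stays recoverable.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's CSS_DEACT_BIAS. */
#define MODEL_DEACT_BIAS INT_MIN

struct model_css {
	atomic_int refcnt;	/* negative once deactivated */
};

/* Strip the bias so callers see the real number of references. */
static int model_unbias(int v)
{
	return v >= 0 ? v : v - MODEL_DEACT_BIAS;
}

/* Try to take a reference; fails once the counter has been deactivated. */
static bool model_tryget(struct model_css *css)
{
	int v = atomic_load(&css->refcnt);

	while (v >= 0) {
		/* On failure, v is reloaded with the current counter value. */
		if (atomic_compare_exchange_weak(&css->refcnt, &v, v + 1))
			return true;
	}
	return false;
}

/* Models the atomic_add(CSS_DEACT_BIAS, &css->refcnt) step in the patch. */
static void model_deactivate(struct model_css *css)
{
	/* Plays the role of the WARN_ON() check: deactivate only once. */
	assert(atomic_load(&css->refcnt) >= 0);
	atomic_fetch_add(&css->refcnt, MODEL_DEACT_BIAS);
}
```

Once model_deactivate() has run, model_tryget() returns false for every
caller, which corresponds to the guarantee the patch's comment gives to
->css_offline() callbacks. model_unbias() still reports how many
references were taken before deactivation.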