public inbox for linux-kernel@vger.kernel.org
* [PATCH] cgroups: fix probable race with put_css_set[_taskexit] and find_css_set
@ 2008-08-19  6:29 Lai Jiangshan
From: Lai Jiangshan @ 2008-08-19  6:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Paul Menage, Linux Kernel Mailing List

put_css_set_taskexit() may run on one CPU while find_css_set() runs on
another CPU, and the following race can occur:

put_css_set_taskexit side                    find_css_set side

                                        |
atomic_dec_and_test(&kref->refcount)    |
    /* kref->refcount = 0 */            |
....................................................................
                                        |  read_lock(&css_set_lock)
                                        |  find_existing_css_set
                                        |  get_css_set
                                        |  read_unlock(&css_set_lock);
....................................................................
__release_css_set                       |
....................................................................
                                        | /* use a released css_set */
                                        |


[put_css_set() has the same problem in principle. In the current code,
however, every put_css_set() call is made inside the cgroup mutex critical
region, as is find_css_set().]


Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 13932ab..003912e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -241,7 +241,6 @@ static void unlink_css_set(struct css_set *cg)
 	struct cg_cgroup_link *link;
 	struct cg_cgroup_link *saved_link;
 
-	write_lock(&css_set_lock);
 	hlist_del(&cg->hlist);
 	css_set_count--;
 
@@ -251,8 +250,6 @@ static void unlink_css_set(struct css_set *cg)
 		list_del(&link->cgrp_link_list);
 		kfree(link);
 	}
-
-	write_unlock(&css_set_lock);
 }
 
 static void __release_css_set(struct kref *k, int taskexit)
@@ -260,7 +257,13 @@ static void __release_css_set(struct kref *k, int taskexit)
 	int i;
 	struct css_set *cg = container_of(k, struct css_set, ref);
 
+	write_lock(&css_set_lock);
+	if (atomic_read(&k->refcount) > 0) {
+		write_unlock(&css_set_lock);
+		return;
+	}
 	unlink_css_set(cg);
+	write_unlock(&css_set_lock);
 
 	rcu_read_lock();
 	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
@@ -410,6 +413,20 @@ static struct css_set *find_css_set(
 	 * the desired set */
 	read_lock(&css_set_lock);
 	res = find_existing_css_set(oldcg, cgrp, template);
+	/*
+	 * put_css_set[_taskexit]() may race with find_css_set(), in that
+	 * find_css_set() got the css_set after put_css_set() had released it.
+	 *
+	 * We should put the whole put_css_set[_taskexit]() into css_set_lock's
+	 * write_lock critical section to avoid this race. But it will increase
+	 * overhead for do_exit().
+	 *
+	 * So we do not avoid this race but put it under control:
+	 * __release_css_set() will re-check the refcount
+	 * with css_set_lock held.
+	 *
+	 * This race may trigger the warning in kref_get().
+	 */
 	if (res)
 		get_css_set(res);
 	read_unlock(&css_set_lock);

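For readers who want to follow the interleaving outside the kernel, here is a
minimal user-space sketch of the re-check idea in the patch. It is an
illustration under stated assumptions, not the kernel code: struct obj,
table_lock, table_slot, lookup_get(), put_dec(), release_recheck() and
replay_race() are all hypothetical names, a C11 atomic_flag spinlock stands in
for css_set_lock, and a one-entry slot stands in for the css_set hash table.
It does not model freeing the object or the kref_get() warning mentioned in
the patch comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct css_set; not the kernel layout. */
struct obj {
	atomic_int refcount;
	bool linked;		/* true while reachable via table_slot */
};

/* Hypothetical stand-ins for css_set_lock and the css_set hash table. */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;
static struct obj *table_slot;

static void lock_table(void)
{
	while (atomic_flag_test_and_set(&table_lock))
		;	/* spin until the flag is clear */
}

static void unlock_table(void)
{
	atomic_flag_clear(&table_lock);
}

/* Lookup side: find the object and take a reference under the lock. */
static struct obj *lookup_get(void)
{
	struct obj *o;

	lock_table();
	o = table_slot;
	if (o)
		atomic_fetch_add(&o->refcount, 1);
	unlock_table();
	return o;
}

/* Put side, step 1: drop a reference WITHOUT the lock (the cheap path).
 * Returns true when the count reached zero. */
static bool put_dec(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}

/* Put side, step 2: re-check the count under the lock before unlinking.
 * Returns true only if the object was really unlinked. */
static bool release_recheck(struct obj *o)
{
	lock_table();
	if (atomic_load(&o->refcount) > 0) {
		/* A lookup revived the object in the window; back off. */
		unlock_table();
		return false;
	}
	table_slot = NULL;
	o->linked = false;
	unlock_table();
	return true;
}

/* Replays the interleaving from the diagram in one thread. Returns 0. */
static int replay_race(void)
{
	struct obj o;

	atomic_init(&o.refcount, 1);
	o.linked = true;
	table_slot = &o;

	assert(put_dec(&o));		/* count drops to 0, no lock held */
	assert(lookup_get() == &o);	/* lookup wins the window: count 1 */
	assert(!release_recheck(&o));	/* re-check sees count > 0, skips */
	assert(o.linked);		/* object is still in the table */
	return 0;
}
```

The point of the split between put_dec() and release_recheck() is the one the
patch comment makes: the common put path stays lock-free, and only the path
that saw the count reach zero pays for the lock and the second look.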


Thread overview: 13+ messages
2008-08-19  6:29 [PATCH] cgroups: fix probable race with put_css_set[_taskexit] and find_css_set Lai Jiangshan
2008-09-10  0:28 ` Paul Menage
2008-09-10  2:18   ` Lai Jiangshan
2008-09-10  2:40     ` Li Zefan
2008-09-10  3:11     ` Paul Menage
2008-09-10  5:01     ` Greg KH
2008-09-10  5:31       ` Paul Menage
2008-09-10  6:17         ` Greg KH
2008-09-10  6:25           ` Li Zefan
2008-09-10  6:29             ` Greg KH
2008-09-10 15:03               ` Paul Menage
2008-09-12 15:58                 ` Greg KH
2008-09-12 19:33                   ` Paul Menage
