[PATCH 01/12] memcg: take reference before releasing rcu_read_lock

From: Li Zefan <lizefan@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>, Glauber Costa <glommer@parallels.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Cgroups <cgroups@vger.kernel.org>,
	linux-mm@kvack.org
Subject: [PATCH 01/12] memcg: take reference before releasing rcu_read_lock
Date: Mon, 8 Apr 2013 14:33:03 +0800
Message-ID: <5162649F.40108@huawei.com>
In-Reply-To: <5162648B.9070802@huawei.com>

We hold no reference on the memcg, so it can be destroyed at any time
once we leave the RCU read-side critical section. Accessing it after
rcu_read_unlock() is therefore not safe.

To fix this, call css_tryget() to take a reference while we are still
inside the RCU read-side critical section.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Glauber Costa <glommer@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 63 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 33 insertions(+), 30 deletions(-)
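
Note (not part of the commit): the ordering this patch enforces can be
illustrated with a small userspace sketch. It is only a model under
stated assumptions: struct obj, obj_tryget(), obj_put() and
pin_for_deferred_work() are hypothetical names, a C11 atomic counter
stands in for the css reference count, and the RCU calls appear only as
comments. It is not the kernel implementation of css_tryget().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a css: nothing but a reference count. */
struct obj {
	atomic_int refcnt;	/* 0 means the object is being destroyed */
};

/* Take a reference only if the object is still live (refcnt > 0). */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; dropping below zero would be a caller bug. */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);
}

/*
 * Models the fixed flow: the reference is taken while the read-side
 * section is still held, so the object cannot vanish between the
 * lookup and its later use by deferred work. Returns true when the
 * caller now owns a reference that the deferred work must put.
 */
static bool pin_for_deferred_work(struct obj *o)
{
	/* rcu_read_lock() would be held from here ... */
	if (!obj_tryget(o)) {
		/* ... rcu_read_unlock(); the object is already dying */
		return false;
	}
	/* ... rcu_read_unlock(); 'o' stays valid via our reference */
	return true;
}
```

In this model, a caller that gets true back hands the object to the
deferred work, which calls obj_put() when it is done. A caller that
gets false back must not touch the object again.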

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d280016..e054ac0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3460,7 +3460,6 @@ static void memcg_create_cache_work_func(struct work_struct *w)
 
 /*
  * Enqueue the creation of a per-memcg kmem_cache.
- * Called with rcu_read_lock.
  */
 static void __memcg_create_cache_enqueue(struct mem_cgroup *memcg,
 					 struct kmem_cache *cachep)
@@ -3468,12 +3467,8 @@ static void __memcg_create_cache_enqueue(struct mem_cgroup *memcg,
 	struct create_work *cw;
 
 	cw = kmalloc(sizeof(struct create_work), GFP_NOWAIT);
-	if (cw == NULL)
-		return;
-
-	/* The corresponding put will be done in the workqueue. */
-	if (!css_tryget(&memcg->css)) {
-		kfree(cw);
+	if (cw == NULL) {
+		css_put(&memcg->css);
 		return;
 	}
 
@@ -3529,10 +3524,9 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
 
 	rcu_read_lock();
 	memcg = mem_cgroup_from_task(rcu_dereference(current->mm->owner));
-	rcu_read_unlock();
 
 	if (!memcg_can_account_kmem(memcg))
-		return cachep;
+		goto out;
 
 	idx = memcg_cache_id(memcg);
 
@@ -3541,29 +3535,38 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep,
 	 * code updating memcg_caches will issue a write barrier to match this.
 	 */
 	read_barrier_depends();
-	if (unlikely(cachep->memcg_params->memcg_caches[idx] == NULL)) {
-		/*
-		 * If we are in a safe context (can wait, and not in interrupt
-		 * context), we could be be predictable and return right away.
-		 * This would guarantee that the allocation being performed
-		 * already belongs in the new cache.
-		 *
-		 * However, there are some clashes that can arrive from locking.
-		 * For instance, because we acquire the slab_mutex while doing
-		 * kmem_cache_dup, this means no further allocation could happen
-		 * with the slab_mutex held.
-		 *
-		 * Also, because cache creation issue get_online_cpus(), this
-		 * creates a lock chain: memcg_slab_mutex -> cpu_hotplug_mutex,
-		 * that ends up reversed during cpu hotplug. (cpuset allocates
-		 * a bunch of GFP_KERNEL memory during cpuup). Due to all that,
-		 * better to defer everything.
-		 */
-		memcg_create_cache_enqueue(memcg, cachep);
-		return cachep;
+	if (likely(cachep->memcg_params->memcg_caches[idx])) {
+		cachep = cachep->memcg_params->memcg_caches[idx];
+		goto out;
 	}
 
-	return cachep->memcg_params->memcg_caches[idx];
+	/* The corresponding put will be done in the workqueue. */
+	if (!css_tryget(&memcg->css))
+		goto out;
+	rcu_read_unlock();
+
+	/*
+	 * If we are in a safe context (can wait, and not in interrupt
+	 * context), we could be be predictable and return right away.
+	 * This would guarantee that the allocation being performed
+	 * already belongs in the new cache.
+	 *
+	 * However, there are some clashes that can arrive from locking.
+	 * For instance, because we acquire the slab_mutex while doing
+	 * kmem_cache_dup, this means no further allocation could happen
+	 * with the slab_mutex held.
+	 *
+	 * Also, because cache creation issue get_online_cpus(), this
+	 * creates a lock chain: memcg_slab_mutex -> cpu_hotplug_mutex,
+	 * that ends up reversed during cpu hotplug. (cpuset allocates
+	 * a bunch of GFP_KERNEL memory during cpuup). Due to all that,
+	 * better to defer everything.
+	 */
+	memcg_create_cache_enqueue(memcg, cachep);
+	return cachep;
+out:
+	rcu_read_unlock();
+	return cachep;
 }
 EXPORT_SYMBOL(__memcg_kmem_get_cache);
 
-- 
1.8.0.2


