linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup
@ 2013-06-14  1:53 Li Zefan
  2013-06-14  1:53 ` [PATCH v4 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Hi Andrew,

All the patches in this patchset has been acked by Michal and Kamezawa-san, and
it's ready to be merged into -mm.

I have another pending patchset that kills css_id, which depends on this one.


Changes since v3:
- rebased against mmotm 2013-06-06-16-19
- changed wmb() to smp_wmb() and moved it to memcg_kmem_mark_dead() and added
  more comment.

Changes since v2:

- rebased against 3.10-rc1
- collected some acks
- the two memcg bug fixes has been merged into mainline
- the cgroup core patch has been merged into mainline

Changes since v1:

- wrote better changelog and added acked-by and reviewed-by tags
- revised some comments as suggested by Michal
- added a wmb() in kmem_cgroup_css_offline(), pointed out by Michal
- fixed a bug which causes a css_put() never be called


Now memcg has its own refcnt, so when a cgroup is destroyed, the memcg can
still be alive. This patchset converts memcg to always use css_get/put, so
memcg will have the same life cycle as its corresponding cgroup.

The historical reason that memcg didn't use css_get in some cases, is that
cgroup couldn't be removed if there're still css refs. The situation has
changed so that rmdir a cgroup will succeed regardless css refs, but won't
be freed until css refs goes down to 0.

Since the introduction of kmemcg, the memcg refcnt handling grows even more
complicated. This patchset greately simplifies memcg's life cycle management.

Also, after those changes, we can convert memcg to use cgroup->id, and then
we can kill css_id.

Li Zefan (7):
  memcg: use css_get() in sock_update_memcg()
  memcg: don't use mem_cgroup_get() when creating a kmemcg cache
  memcg: use css_get/put when charging/uncharging kmem
  memcg: use css_get/put for swap memcg
  memcg: don't need to get a reference to the parent
  memcg: kill memcg refcnt
  memcg: don't need to free memcg via RCU or workqueue

Michal Hocko (2):
  Revert "memcg: avoid dangling reference count in creation failure."
  memcg, kmem: fix reference count handling on the error path

 mm/memcontrol.c | 208 +++++++++++++++++++++-----------------------------------
 1 file changed, 77 insertions(+), 131 deletions(-)

-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/9] Revert "memcg: avoid dangling reference count in creation failure."
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
@ 2013-06-14  1:53 ` Li Zefan
  2013-06-14  1:53 ` [PATCH v4 2/9] memcg, kmem: fix reference count handling on the error path Li Zefan
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

From: Michal Hocko <mhocko@suse.cz>

This reverts commit e4715f01be697a3730c78f8ffffb595591d6a88c

mem_cgroup_put is hierarchy aware so mem_cgroup_put(memcg) already drops
an additional reference from all parents so the additional
mem_cgrroup_put(parent) potentially causes use-after-free.

Cc: <stable@vger.kernel.org> # 3.9+
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2e851f4..0bacc0d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6332,8 +6332,6 @@ mem_cgroup_css_online(struct cgroup *cont)
 		 * call __mem_cgroup_free, so return directly
 		 */
 		mem_cgroup_put(memcg);
-		if (parent->use_hierarchy)
-			mem_cgroup_put(parent);
 	}
 	return error;
 }
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/9] memcg, kmem: fix reference count handling on the error path
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
  2013-06-14  1:53 ` [PATCH v4 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
@ 2013-06-14  1:53 ` Li Zefan
  2013-06-14  1:54 ` [PATCH v4 3/9] memcg: use css_get() in sock_update_memcg() Li Zefan
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

From: Michal Hocko <mhocko@suse.cz>

mem_cgroup_css_online calls mem_cgroup_put if memcg_init_kmem
fails. This is not correct because only memcg_propagate_kmem takes an
additional reference while mem_cgroup_sockets_init is allowed to fail as
well (although no current implementation fails) but it doesn't take any
reference. This all suggests that it should be memcg_propagate_kmem that
should clean up after itself so this patch moves mem_cgroup_put over
there.

Unfortunately this is not that easy (as pointed out by Li Zefan) because
memcg_kmem_mark_dead marks the group dead (KMEM_ACCOUNTED_DEAD) if it
is marked active (KMEM_ACCOUNTED_ACTIVE) which is the case even if
memcg_propagate_kmem fails so the additional reference is dropped in
that case in kmem_cgroup_destroy which means that the reference would be
dropped two times.

The easiest way then would be to simply remove mem_cgrroup_put from
mem_cgroup_css_online and rely on kmem_cgroup_destroy doing the right
thing.

Cc: <stable@vger.kernel.org> # 3.8+
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0bacc0d..b5ec4da 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6325,14 +6325,6 @@ mem_cgroup_css_online(struct cgroup *cont)
 
 	error = memcg_init_kmem(memcg, &mem_cgroup_subsys);
 	mutex_unlock(&memcg_create_mutex);
-	if (error) {
-		/*
-		 * We call put now because our (and parent's) refcnts
-		 * are already in place. mem_cgroup_put() will internally
-		 * call __mem_cgroup_free, so return directly
-		 */
-		mem_cgroup_put(memcg);
-	}
 	return error;
 }
 
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/9] memcg: use css_get() in sock_update_memcg()
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
  2013-06-14  1:53 ` [PATCH v4 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
  2013-06-14  1:53 ` [PATCH v4 2/9] memcg, kmem: fix reference count handling on the error path Li Zefan
@ 2013-06-14  1:54 ` Li Zefan
  2013-06-14  1:54 ` [PATCH v4 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache Li Zefan
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Use css_get/css_put instead of mem_cgroup_get/put.

Note, if at the same time someone is moving @current to a different
cgroup and removing the old cgroup, css_tryget() may return false,
and sock->sk_cgrp won't be initialized, which is fine.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b5ec4da..a8e8d2a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -561,15 +561,15 @@ void sock_update_memcg(struct sock *sk)
 		 */
 		if (sk->sk_cgrp) {
 			BUG_ON(mem_cgroup_is_root(sk->sk_cgrp->memcg));
-			mem_cgroup_get(sk->sk_cgrp->memcg);
+			css_get(&sk->sk_cgrp->memcg->css);
 			return;
 		}
 
 		rcu_read_lock();
 		memcg = mem_cgroup_from_task(current);
 		cg_proto = sk->sk_prot->proto_cgroup(memcg);
-		if (!mem_cgroup_is_root(memcg) && memcg_proto_active(cg_proto)) {
-			mem_cgroup_get(memcg);
+		if (!mem_cgroup_is_root(memcg) &&
+		    memcg_proto_active(cg_proto) && css_tryget(&memcg->css)) {
 			sk->sk_cgrp = cg_proto;
 		}
 		rcu_read_unlock();
@@ -583,7 +583,7 @@ void sock_release_memcg(struct sock *sk)
 		struct mem_cgroup *memcg;
 		WARN_ON(!sk->sk_cgrp->memcg);
 		memcg = sk->sk_cgrp->memcg;
-		mem_cgroup_put(memcg);
+		css_put(&sk->sk_cgrp->memcg->css);
 	}
 }
 
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (2 preceding siblings ...)
  2013-06-14  1:54 ` [PATCH v4 3/9] memcg: use css_get() in sock_update_memcg() Li Zefan
@ 2013-06-14  1:54 ` Li Zefan
  2013-06-14  1:54 ` [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put().

There are two things being done in the current code:

First, we acquired a css_ref to make sure that the underlying cgroup
would not go away. That is a short lived reference, and it is put as
soon as the cache is created.

At this point, we acquire a long-lived per-cache memcg reference count
to guarantee that the memcg will still be alive.

so it is:

enqueue: css_get
create : memcg_get, css_put
destroy: memcg_put

So we only need to get rid of the memcg_get, change the memcg_put to
css_put, and get rid of the now extra css_put.

(This changelog is mostly written by Glauber)

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a8e8d2a..466c595 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3252,7 +3252,7 @@ void memcg_release_cache(struct kmem_cache *s)
 	list_del(&s->memcg_params->list);
 	mutex_unlock(&memcg->slab_caches_mutex);
 
-	mem_cgroup_put(memcg);
+	css_put(&memcg->css);
 out:
 	kfree(s->memcg_params);
 }
@@ -3412,16 +3412,18 @@ static struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
 
 	mutex_lock(&memcg_cache_mutex);
 	new_cachep = cachep->memcg_params->memcg_caches[idx];
-	if (new_cachep)
+	if (new_cachep) {
+		css_put(&memcg->css);
 		goto out;
+	}
 
 	new_cachep = kmem_cache_dup(memcg, cachep);
 	if (new_cachep == NULL) {
 		new_cachep = cachep;
+		css_put(&memcg->css);
 		goto out;
 	}
 
-	mem_cgroup_get(memcg);
 	atomic_set(&new_cachep->memcg_params->nr_pages , 0);
 
 	cachep->memcg_params->memcg_caches[idx] = new_cachep;
@@ -3509,8 +3511,6 @@ static void memcg_create_cache_work_func(struct work_struct *w)
 
 	cw = container_of(w, struct create_work, work);
 	memcg_create_kmem_cache(cw->memcg, cw->cachep);
-	/* Drop the reference gotten when we enqueued. */
-	css_put(&cw->memcg->css);
 	kfree(cw);
 }
 
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (3 preceding siblings ...)
  2013-06-14  1:54 ` [PATCH v4 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache Li Zefan
@ 2013-06-14  1:54 ` Li Zefan
  2013-06-28 22:59   ` Andrew Morton
  2013-06-14  1:55 ` [PATCH v4 6/9] memcg: use css_get/put for swap memcg Li Zefan
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Use css_get/put instead of mem_cgroup_get/put.

We can't do a simple replacement, because here mem_cgroup_put()
is called during mem_cgroup_css_free(), while mem_cgroup_css_free()
won't be called until css refcnt goes down to 0.

Instead we increment css refcnt in mem_cgroup_css_offline(), and
then check if there's still kmem charges. If not, css refcnt will
be decremented immediately, otherwise the refcnt will be released
after the last kmem allocation is uncahred.

v3:
- changed wmb() to smp_wmb(), and moved it to memcg_kmem_mark_dead(),
  and added comment.

v2:
- added wmb() in kmem_cgroup_css_offline(), pointed out by Michal
- revised comments as suggested by Michal
- fixed to check if kmem is activated in kmem_cgroup_css_offline()

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
---
 mm/memcontrol.c | 70 ++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 25 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 466c595..8f20a9c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -416,6 +416,11 @@ static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
 
 static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
 {
+	/*
+	 * We need to call css_get() first, because memcg_uncharge_kmem()
+	 * will call css_put() if it sees the memcg is dead.
+	 */
+	smp_wmb();
 	if (test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags))
 		set_bit(KMEM_ACCOUNTED_DEAD, &memcg->kmem_account_flags);
 }
@@ -3060,8 +3065,16 @@ static void memcg_uncharge_kmem(struct mem_cgroup *memcg, u64 size)
 	if (res_counter_uncharge(&memcg->kmem, size))
 		return;
 
+	/*
+	 * Releases a reference taken in kmem_cgroup_css_offline in case
+	 * this last uncharge is racing with the offlining code or it is
+	 * outliving the memcg existence.
+	 *
+	 * The memory barrier imposed by test&clear is paired with the
+	 * explicit one in memcg_kmem_mark_dead().
+	 */
 	if (memcg_kmem_test_and_clear_dead(memcg))
-		mem_cgroup_put(memcg);
+		css_put(&memcg->css);
 }
 
 void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
@@ -5165,14 +5178,6 @@ static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
 		 * starts accounting before all call sites are patched
 		 */
 		memcg_kmem_set_active(memcg);
-
-		/*
-		 * kmem charges can outlive the cgroup. In the case of slab
-		 * pages, for instance, a page contain objects from various
-		 * processes, so it is unfeasible to migrate them away. We
-		 * need to reference count the memcg because of that.
-		 */
-		mem_cgroup_get(memcg);
 	} else
 		ret = res_counter_set_limit(&memcg->kmem, val);
 out:
@@ -5205,12 +5210,10 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
 		goto out;
 
 	/*
-	 * destroy(), called if we fail, will issue static_key_slow_inc() and
-	 * mem_cgroup_put() if kmem is enabled. We have to either call them
-	 * unconditionally, or clear the KMEM_ACTIVE flag. I personally find
-	 * this more consistent, since it always leads to the same destroy path
+	 * __mem_cgroup_free() will issue static_key_slow_dec() because this
+	 * memcg is active already. If the later initialization fails then the
+	 * cgroup core triggers the cleanup so we do not have to do it here.
 	 */
-	mem_cgroup_get(memcg);
 	static_key_slow_inc(&memcg_kmem_enabled_key);
 
 	mutex_lock(&set_limit_mutex);
@@ -5893,23 +5896,38 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 	return mem_cgroup_sockets_init(memcg, ss);
 }
 
-static void kmem_cgroup_destroy(struct mem_cgroup *memcg)
+static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 {
-	mem_cgroup_sockets_destroy(memcg);
+	if (!memcg_kmem_is_active(memcg))
+		return;
+
+	/*
+	 * kmem charges can outlive the cgroup. In the case of slab
+	 * pages, for instance, a page contain objects from various
+	 * processes. As we prevent from taking a reference for every
+	 * such allocation we have to be careful when doing uncharge
+	 * (see memcg_uncharge_kmem) and here during offlining.
+	 *
+	 * The idea is that that only the _last_ uncharge which sees
+	 * the dead memcg will drop the last reference. An additional
+	 * reference is taken here before the group is marked dead
+	 * which is then paired with css_put during uncharge resp. here.
+	 *
+	 * Although this might sound strange as this path is called from
+	 * css_offline() when the referencemight have dropped down to 0
+	 * and shouldn't be incremented anymore (css_tryget would fail)
+	 * we do not have other options because of the kmem allocations
+	 * lifetime.
+	 */
+	css_get(&memcg->css);
 
 	memcg_kmem_mark_dead(memcg);
 
 	if (res_counter_read_u64(&memcg->kmem, RES_USAGE) != 0)
 		return;
 
-	/*
-	 * Charges already down to 0, undo mem_cgroup_get() done in the charge
-	 * path here, being careful not to race with memcg_uncharge_kmem: it is
-	 * possible that the charges went down to 0 between mark_dead and the
-	 * res_counter read, so in that case, we don't need the put
-	 */
 	if (memcg_kmem_test_and_clear_dead(memcg))
-		mem_cgroup_put(memcg);
+		css_put(&memcg->css);
 }
 #else
 static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
@@ -5917,7 +5935,7 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
 	return 0;
 }
 
-static void kmem_cgroup_destroy(struct mem_cgroup *memcg)
+static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
 {
 }
 #endif
@@ -6350,6 +6368,8 @@ static void mem_cgroup_css_offline(struct cgroup *cont)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
 
+	kmem_cgroup_css_offline(memcg);
+
 	mem_cgroup_invalidate_reclaim_iterators(memcg);
 	mem_cgroup_reparent_charges(memcg);
 	mem_cgroup_destroy_all_caches(memcg);
@@ -6359,7 +6379,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
 
-	kmem_cgroup_destroy(memcg);
+	mem_cgroup_sockets_destroy(memcg);
 
 	mem_cgroup_put(memcg);
 }
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 6/9] memcg: use css_get/put for swap memcg
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (4 preceding siblings ...)
  2013-06-14  1:54 ` [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
@ 2013-06-14  1:55 ` Li Zefan
  2013-06-14  1:55 ` [PATCH v4 7/9] memcg: don't need to get a reference to the parent Li Zefan
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Use css_get/put instead of mem_cgroup_get/put. A simple replacement
will do.

The historical reason that memcg has its own refcnt instead of always
using css_get/put, is that cgroup couldn't be removed if there're still
css refs, so css refs can't be used as long-lived reference. The
situation has changed so that rmdir a cgroup will succeed regardless
css refs, but won't be freed until css refs goes down to 0.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8f20a9c..a63d9f1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4213,12 +4213,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
 	unlock_page_cgroup(pc);
 	/*
 	 * even after unlock, we have memcg->res.usage here and this memcg
-	 * will never be freed.
+	 * will never be freed, so it's safe to call css_get().
 	 */
 	memcg_check_events(memcg, page);
 	if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) {
 		mem_cgroup_swap_statistics(memcg, true);
-		mem_cgroup_get(memcg);
+		css_get(&memcg->css);
 	}
 	/*
 	 * Migration does not charge the res_counter for the
@@ -4330,7 +4330,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
 
 	/*
 	 * record memcg information,  if swapout && memcg != NULL,
-	 * mem_cgroup_get() was called in uncharge().
+	 * css_get() was called in uncharge().
 	 */
 	if (do_swap_account && swapout && memcg)
 		swap_cgroup_record(ent, css_id(&memcg->css));
@@ -4361,7 +4361,7 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent)
 		if (!mem_cgroup_is_root(memcg))
 			res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
 		mem_cgroup_swap_statistics(memcg, false);
-		mem_cgroup_put(memcg);
+		css_put(&memcg->css);
 	}
 	rcu_read_unlock();
 }
@@ -4395,11 +4395,14 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
 		 * This function is only called from task migration context now.
 		 * It postpones res_counter and refcount handling till the end
 		 * of task migration(mem_cgroup_clear_mc()) for performance
-		 * improvement. But we cannot postpone mem_cgroup_get(to)
-		 * because if the process that has been moved to @to does
-		 * swap-in, the refcount of @to might be decreased to 0.
+		 * improvement. But we cannot postpone css_get(to)  because if
+		 * the process that has been moved to @to does swap-in, the
+		 * refcount of @to might be decreased to 0.
+		 *
+		 * We are in attach() phase, so the cgroup is guaranteed to be
+		 * alive, so we can just call css_get().
 		 */
-		mem_cgroup_get(to);
+		css_get(&to->css);
 		return 0;
 	}
 	return -EINVAL;
@@ -6690,6 +6693,7 @@ static void __mem_cgroup_clear_mc(void)
 {
 	struct mem_cgroup *from = mc.from;
 	struct mem_cgroup *to = mc.to;
+	int i;
 
 	/* we must uncharge all the leftover precharges from mc.to */
 	if (mc.precharge) {
@@ -6710,7 +6714,9 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.from))
 			res_counter_uncharge(&mc.from->memsw,
 						PAGE_SIZE * mc.moved_swap);
-		__mem_cgroup_put(mc.from, mc.moved_swap);
+
+		for (i = 0; i < mc.moved_swap; i++)
+			css_put(&mc.from->css);
 
 		if (!mem_cgroup_is_root(mc.to)) {
 			/*
@@ -6720,7 +6726,7 @@ static void __mem_cgroup_clear_mc(void)
 			res_counter_uncharge(&mc.to->res,
 						PAGE_SIZE * mc.moved_swap);
 		}
-		/* we've already done mem_cgroup_get(mc.to) */
+		/* we've already done css_get(mc.to) */
 		mc.moved_swap = 0;
 	}
 	memcg_oom_recover(from);
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 7/9] memcg: don't need to get a reference to the parent
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (5 preceding siblings ...)
  2013-06-14  1:55 ` [PATCH v4 6/9] memcg: use css_get/put for swap memcg Li Zefan
@ 2013-06-14  1:55 ` Li Zefan
  2013-06-14  1:55 ` [PATCH v4 8/9] memcg: kill memcg refcnt Li Zefan
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

The cgroup core guarantees it's always safe to access the parent.

v2:
- added a comment in mem_cgroup_put() as suggested by Michal
- removed mem_cgroup_get(), otherwise gcc will warn that it's not used

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a63d9f1..443fb45 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -513,7 +513,6 @@ enum res_type {
  */
 static DEFINE_MUTEX(memcg_create_mutex);
 
-static void mem_cgroup_get(struct mem_cgroup *memcg);
 static void mem_cgroup_put(struct mem_cgroup *memcg);
 
 static inline
@@ -6210,19 +6209,10 @@ static void free_rcu(struct rcu_head *rcu_head)
 	schedule_work(&memcg->work_freeing);
 }
 
-static void mem_cgroup_get(struct mem_cgroup *memcg)
-{
-	atomic_inc(&memcg->refcnt);
-}
-
 static void __mem_cgroup_put(struct mem_cgroup *memcg, int count)
 {
-	if (atomic_sub_and_test(count, &memcg->refcnt)) {
-		struct mem_cgroup *parent = parent_mem_cgroup(memcg);
+	if (atomic_sub_and_test(count, &memcg->refcnt))
 		call_rcu(&memcg->rcu_freeing, free_rcu);
-		if (parent)
-			mem_cgroup_put(parent);
-	}
 }
 
 static void mem_cgroup_put(struct mem_cgroup *memcg)
@@ -6325,12 +6315,9 @@ mem_cgroup_css_online(struct cgroup *cont)
 		res_counter_init(&memcg->kmem, &parent->kmem);
 
 		/*
-		 * We increment refcnt of the parent to ensure that we can
-		 * safely access it on res_counter_charge/uncharge.
-		 * This refcnt will be decremented when freeing this
-		 * mem_cgroup(see mem_cgroup_put).
+		 * No need to take a reference to the parent because cgroup
+		 * core guarantees its existence.
 		 */
-		mem_cgroup_get(parent);
 	} else {
 		res_counter_init(&memcg->res, NULL);
 		res_counter_init(&memcg->memsw, NULL);
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 8/9] memcg: kill memcg refcnt
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (6 preceding siblings ...)
  2013-06-14  1:55 ` [PATCH v4 7/9] memcg: don't need to get a reference to the parent Li Zefan
@ 2013-06-14  1:55 ` Li Zefan
  2013-06-14  1:56 ` [PATCH v4 9/9] memcg: don't need to free memcg via RCU or workqueue Li Zefan
  2013-06-19  1:29 ` [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Now memcg has the same life cycle as its corresponding cgroup.
Kill the useless refcnt.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 443fb45..7b622c3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -303,8 +303,6 @@ struct mem_cgroup {
 	bool		oom_lock;
 	atomic_t	under_oom;
 
-	atomic_t	refcnt;
-
 	int	swappiness;
 	/* OOM-Killer disable */
 	int		oom_kill_disable;
@@ -513,8 +511,6 @@ enum res_type {
  */
 static DEFINE_MUTEX(memcg_create_mutex);
 
-static void mem_cgroup_put(struct mem_cgroup *memcg);
-
 static inline
 struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
 {
@@ -6209,17 +6205,6 @@ static void free_rcu(struct rcu_head *rcu_head)
 	schedule_work(&memcg->work_freeing);
 }
 
-static void __mem_cgroup_put(struct mem_cgroup *memcg, int count)
-{
-	if (atomic_sub_and_test(count, &memcg->refcnt))
-		call_rcu(&memcg->rcu_freeing, free_rcu);
-}
-
-static void mem_cgroup_put(struct mem_cgroup *memcg)
-{
-	__mem_cgroup_put(memcg, 1);
-}
-
 /*
  * Returns the parent mem_cgroup in memcgroup hierarchy with hierarchy enabled.
  */
@@ -6279,7 +6264,6 @@ mem_cgroup_css_alloc(struct cgroup *cont)
 
 	memcg->last_scanned_node = MAX_NUMNODES;
 	INIT_LIST_HEAD(&memcg->oom_notify);
-	atomic_set(&memcg->refcnt, 1);
 	memcg->move_charge_at_immigrate = 0;
 	mutex_init(&memcg->thresholds_lock);
 	spin_lock_init(&memcg->move_lock);
@@ -6371,7 +6355,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
 
 	mem_cgroup_sockets_destroy(memcg);
 
-	mem_cgroup_put(memcg);
+	call_rcu(&memcg->rcu_freeing, free_rcu);
 }
 
 #ifdef CONFIG_MMU
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 9/9] memcg: don't need to free memcg via RCU or workqueue
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (7 preceding siblings ...)
  2013-06-14  1:55 ` [PATCH v4 8/9] memcg: kill memcg refcnt Li Zefan
@ 2013-06-14  1:56 ` Li Zefan
  2013-06-19  1:29 ` [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-14  1:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko, Hugh Dickins

Now memcg has the same life cycle with its corresponding cgroup, and
a cgroup is freed via RCU and then mem_cgroup_css_free() will be
called in a work function, so we can simply call __mem_cgroup_free()
in mem_cgroup_css_free().

This actually reverts 59927fb984de1703c67bc640c3e522d8b5276c73
("memcg: free mem_cgroup by RCU to fix oops").

Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c | 51 +++++----------------------------------------------
 1 file changed, 5 insertions(+), 46 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b622c3..234f311 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -267,28 +267,10 @@ struct mem_cgroup {
 	/* vmpressure notifications */
 	struct vmpressure vmpressure;
 
-	union {
-		/*
-		 * the counter to account for mem+swap usage.
-		 */
-		struct res_counter memsw;
-
-		/*
-		 * rcu_freeing is used only when freeing struct mem_cgroup,
-		 * so put it into a union to avoid wasting more memory.
-		 * It must be disjoint from the css field.  It could be
-		 * in a union with the res field, but res plays a much
-		 * larger part in mem_cgroup life than memsw, and might
-		 * be of interest, even at time of free, when debugging.
-		 * So share rcu_head with the less interesting memsw.
-		 */
-		struct rcu_head rcu_freeing;
-		/*
-		 * We also need some space for a worker in deferred freeing.
-		 * By the time we call it, rcu_freeing is no longer in use.
-		 */
-		struct work_struct work_freeing;
-	};
+	/*
+	 * the counter to account for mem+swap usage.
+	 */
+	struct res_counter memsw;
 
 	/*
 	 * the counter to account for kernel memory usage.
@@ -6182,29 +6164,6 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
 		vfree(memcg);
 }
 
-
-/*
- * Helpers for freeing a kmalloc()ed/vzalloc()ed mem_cgroup by RCU,
- * but in process context.  The work_freeing structure is overlaid
- * on the rcu_freeing structure, which itself is overlaid on memsw.
- */
-static void free_work(struct work_struct *work)
-{
-	struct mem_cgroup *memcg;
-
-	memcg = container_of(work, struct mem_cgroup, work_freeing);
-	__mem_cgroup_free(memcg);
-}
-
-static void free_rcu(struct rcu_head *rcu_head)
-{
-	struct mem_cgroup *memcg;
-
-	memcg = container_of(rcu_head, struct mem_cgroup, rcu_freeing);
-	INIT_WORK(&memcg->work_freeing, free_work);
-	schedule_work(&memcg->work_freeing);
-}
-
 /*
  * Returns the parent mem_cgroup in memcgroup hierarchy with hierarchy enabled.
  */
@@ -6355,7 +6314,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
 
 	mem_cgroup_sockets_destroy(memcg);
 
-	call_rcu(&memcg->rcu_freeing, free_rcu);
+	__mem_cgroup_free(memcg);
 }
 
 #ifdef CONFIG_MMU
-- 
1.8.0.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup
  2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
                   ` (8 preceding siblings ...)
  2013-06-14  1:56 ` [PATCH v4 9/9] memcg: don't need to free memcg via RCU or workqueue Li Zefan
@ 2013-06-19  1:29 ` Li Zefan
  9 siblings, 0 replies; 12+ messages in thread
From: Li Zefan @ 2013-06-19  1:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

Hi Andrew, any chance for this patchset to be queued for 3.11?

On 2013/6/14 9:53, Li Zefan wrote:
> Hi Andrew,
> 
> All the patches in this patchset has been acked by Michal and Kamezawa-san, and
> it's ready to be merged into -mm.
> 
> I have another pending patchset that kills css_id, which depends on this one.
> 
> 
> Changes since v3:
> - rebased against mmotm 2013-06-06-16-19
> - changed wmb() to smp_wmb() and moved it to memcg_kmem_mark_dead() and added
>   more comment.
> 
> Changes since v2:
> 
> - rebased against 3.10-rc1
> - collected some acks
> - the two memcg bug fixes has been merged into mainline
> - the cgroup core patch has been merged into mainline
> 
> Changes since v1:
> 
> - wrote better changelog and added acked-by and reviewed-by tags
> - revised some comments as suggested by Michal
> - added a wmb() in kmem_cgroup_css_offline(), pointed out by Michal
> - fixed a bug which causes a css_put() never be called
> 
> 
> Now memcg has its own refcnt, so when a cgroup is destroyed, the memcg can
> still be alive. This patchset converts memcg to always use css_get/put, so
> memcg will have the same life cycle as its corresponding cgroup.
> 
> The historical reason that memcg didn't use css_get in some cases, is that
> cgroup couldn't be removed if there're still css refs. The situation has
> changed so that rmdir a cgroup will succeed regardless css refs, but won't
> be freed until css refs goes down to 0.
> 
> Since the introduction of kmemcg, the memcg refcnt handling grows even more
> complicated. This patchset greately simplifies memcg's life cycle management.
> 
> Also, after those changes, we can convert memcg to use cgroup->id, and then
> we can kill css_id.
> 
> Li Zefan (7):
>   memcg: use css_get() in sock_update_memcg()
>   memcg: don't use mem_cgroup_get() when creating a kmemcg cache
>   memcg: use css_get/put when charging/uncharging kmem
>   memcg: use css_get/put for swap memcg
>   memcg: don't need to get a reference to the parent
>   memcg: kill memcg refcnt
>   memcg: don't need to free memcg via RCU or workqueue
> 
> Michal Hocko (2):
>   Revert "memcg: avoid dangling reference count in creation failure."
>   memcg, kmem: fix reference count handling on the error path
> 
>  mm/memcontrol.c | 208 +++++++++++++++++++++-----------------------------------
>  1 file changed, 77 insertions(+), 131 deletions(-)
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem
  2013-06-14  1:54 ` [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
@ 2013-06-28 22:59   ` Andrew Morton
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2013-06-28 22:59 UTC (permalink / raw)
  To: Li Zefan
  Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
	LKML, Cgroups, linux-mm, Michal Hocko

On Fri, 14 Jun 2013 09:54:57 +0800 Li Zefan <lizefan@huawei.com> wrote:

> Use css_get/put instead of mem_cgroup_get/put.
> 
> We can't do a simple replacement, because here mem_cgroup_put()
> is called during mem_cgroup_css_free(), while mem_cgroup_css_free()
> won't be called until css refcnt goes down to 0.
> 
> Instead we increment css refcnt in mem_cgroup_css_offline(), and
> then check if there's still kmem charges. If not, css refcnt will
> be decremented immediately, otherwise the refcnt will be released
> after the last kmem allocation is uncahred.
> 
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -416,6 +416,11 @@ static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
>  
>  static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
>  {
> +	/*
> +	 * We need to call css_get() first, because memcg_uncharge_kmem()
> +	 * will call css_put() if it sees the memcg is dead.
> +	 */
> +	smp_wmb();
>  	if (test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags))
>  		set_bit(KMEM_ACCOUNTED_DEAD, &memcg->kmem_account_flags);
>  }

That comment is rather confusing and unhelpful.  "We need to call
css_get", but we clearly *don't* call css_get().  I guess we want

--- a/mm/memcontrol.c~memcg-use-css_get-put-when-charging-uncharging-kmem-fix
+++ a/mm/memcontrol.c
@@ -407,7 +407,7 @@ static void memcg_kmem_clear_activated(s
 static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
 {
 	/*
-	 * We need to call css_get() first, because memcg_uncharge_kmem()
+	 * Our caller must use css_get() first, because memcg_uncharge_kmem()
 	 * will call css_put() if it sees the memcg is dead.
 	 */
 	smp_wmb();
_


But it's still not good.

- What is the smp_wmb() for?  These barriers should always be
  documented so readers can see what we're barriering against but this
  one is floating around unexplained.

- What does memcg_uncharge_kmem() have to do with all this? 
  memcg_kmem_mark_dead() just does a set_bit() - what has that to do
  with memcg_kmem_mark_dead().

So I dunno, it's all clear as mud and I hope we can do better.


> @@ -3060,8 +3065,16 @@ static void memcg_uncharge_kmem(struct mem_cgroup *memcg, u64 size)
>  	if (res_counter_uncharge(&memcg->kmem, size))
>  		return;
>  
> +	/*
> +	 * Releases a reference taken in kmem_cgroup_css_offline in case
> +	 * this last uncharge is racing with the offlining code or it is
> +	 * outliving the memcg existence.
> +	 *
> +	 * The memory barrier imposed by test&clear is paired with the
> +	 * explicit one in memcg_kmem_mark_dead().
> +	 */

OK, that clears things up a bit.  A small bit.


This code is far too tricky :(

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-06-28 22:59 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-14  1:53 [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan
2013-06-14  1:53 ` [PATCH v4 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
2013-06-14  1:53 ` [PATCH v4 2/9] memcg, kmem: fix reference count handling on the error path Li Zefan
2013-06-14  1:54 ` [PATCH v4 3/9] memcg: use css_get() in sock_update_memcg() Li Zefan
2013-06-14  1:54 ` [PATCH v4 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache Li Zefan
2013-06-14  1:54 ` [PATCH v4 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
2013-06-28 22:59   ` Andrew Morton
2013-06-14  1:55 ` [PATCH v4 6/9] memcg: use css_get/put for swap memcg Li Zefan
2013-06-14  1:55 ` [PATCH v4 7/9] memcg: don't need to get a reference to the parent Li Zefan
2013-06-14  1:55 ` [PATCH v4 8/9] memcg: kill memcg refcnt Li Zefan
2013-06-14  1:56 ` [PATCH v4 9/9] memcg: don't need to free memcg via RCU or workqueue Li Zefan
2013-06-19  1:29 ` [PATCH v4 0/9] memcg: make memcg's life cycle the same as cgroup Li Zefan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).