* [PATCH 1/9] Revert "memcg: avoid dangling reference count in creation failure."
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
@ 2013-05-17 7:03 ` Li Zefan
2013-05-17 7:03 ` [PATCH 2/9] memcg, kmem: fix reference count handling on the error path Li Zefan
` (7 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
This reverts commit e4715f01be697a3730c78f8ffffb595591d6a88c.
mem_cgroup_put() is hierarchy-aware, so mem_cgroup_put(memcg) already
drops an additional reference from all parents; the additional
mem_cgroup_put(parent) therefore potentially causes a use-after-free.
Cc: <stable@vger.kernel.org> # 3.9.x
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cb1c9de..5918e90 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6297,8 +6297,6 @@ mem_cgroup_css_online(struct cgroup *cont)
* call __mem_cgroup_free, so return directly
*/
mem_cgroup_put(memcg);
- if (parent->use_hierarchy)
- mem_cgroup_put(parent);
}
return error;
}
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 2/9] memcg, kmem: fix reference count handling on the error path
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
2013-05-17 7:03 ` [PATCH 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
@ 2013-05-17 7:03 ` Li Zefan
2013-05-17 7:03 ` [PATCH 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache Li Zefan
` (6 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
mem_cgroup_css_online() calls mem_cgroup_put() if memcg_init_kmem()
fails. This is not correct, because only memcg_propagate_kmem() takes
an additional reference, while mem_cgroup_sockets_init() is allowed to
fail as well (although no current implementation fails) without taking
any reference. This all suggests that memcg_propagate_kmem() should
clean up after itself, so this patch moves mem_cgroup_put() over there.
Unfortunately this is not that easy (as pointed out by Li Zefan),
because memcg_kmem_mark_dead() marks the group dead
(KMEM_ACCOUNTED_DEAD) if it is marked active (KMEM_ACCOUNTED_ACTIVE),
which is the case even if memcg_propagate_kmem() fails, so the
additional reference is dropped in that case in kmem_cgroup_destroy(),
which means the reference would be dropped twice.
The easiest way, then, is to simply remove mem_cgroup_put() from
mem_cgroup_css_online() and rely on kmem_cgroup_destroy() doing the
right thing.
Cc: <stable@vger.kernel.org> # 3.8+
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5918e90..4d0458d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6290,14 +6290,6 @@ mem_cgroup_css_online(struct cgroup *cont)
error = memcg_init_kmem(memcg, &mem_cgroup_subsys);
mutex_unlock(&memcg_create_mutex);
- if (error) {
- /*
- * We call put now because our (and parent's) refcnts
- * are already in place. mem_cgroup_put() will internally
- * call __mem_cgroup_free, so return directly
- */
- mem_cgroup_put(memcg);
- }
return error;
}
--
1.8.0.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
2013-05-17 7:03 ` [PATCH 1/9] Revert "memcg: avoid dangling reference count in creation failure." Li Zefan
2013-05-17 7:03 ` [PATCH 2/9] memcg, kmem: fix reference count handling on the error path Li Zefan
@ 2013-05-17 7:03 ` Li Zefan
2013-05-17 7:04 ` [PATCH 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
` (5 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put().
There are two things being done in the current code:
First, we acquire a css ref to make sure that the underlying cgroup
does not go away. That is a short-lived reference, and it is put as
soon as the cache is created.
At that point, we acquire a long-lived per-cache memcg reference count
to guarantee that the memcg will still be alive.
So it is:
enqueue: css_get
create : memcg_get, css_put
destroy: memcg_put
So we only need to get rid of the memcg_get, change the memcg_put to
css_put, and get rid of the now-extra css_put.
(This changelog is mostly written by Glauber)
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1320d3..63526f9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3227,7 +3227,7 @@ void memcg_release_cache(struct kmem_cache *s)
list_del(&s->memcg_params->list);
mutex_unlock(&memcg->slab_caches_mutex);
- mem_cgroup_put(memcg);
+ css_put(&memcg->css);
out:
kfree(s->memcg_params);
}
@@ -3387,16 +3387,18 @@ static struct kmem_cache *memcg_create_kmem_cache(struct mem_cgroup *memcg,
mutex_lock(&memcg_cache_mutex);
new_cachep = cachep->memcg_params->memcg_caches[idx];
- if (new_cachep)
+ if (new_cachep) {
+ css_put(&memcg->css);
goto out;
+ }
new_cachep = kmem_cache_dup(memcg, cachep);
if (new_cachep == NULL) {
new_cachep = cachep;
+ css_put(&memcg->css);
goto out;
}
- mem_cgroup_get(memcg);
atomic_set(&new_cachep->memcg_params->nr_pages , 0);
cachep->memcg_params->memcg_caches[idx] = new_cachep;
@@ -3484,8 +3486,6 @@ static void memcg_create_cache_work_func(struct work_struct *w)
cw = container_of(w, struct create_work, work);
memcg_create_kmem_cache(cw->memcg, cw->cachep);
- /* Drop the reference gotten when we enqueued. */
- css_put(&cw->memcg->css);
kfree(cw);
}
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 5/9] memcg: use css_get/put when charging/uncharging kmem
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
` (2 preceding siblings ...)
2013-05-17 7:03 ` [PATCH 4/9] memcg: don't use mem_cgroup_get() when creating a kmemcg cache Li Zefan
@ 2013-05-17 7:04 ` Li Zefan
[not found] ` <5195D666.6030408-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-05-17 7:04 ` [PATCH 6/9] memcg: use css_get/put for swap memcg Li Zefan
` (4 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put().
We can't do a simple replacement, because here mem_cgroup_put()
is called during mem_cgroup_css_free(), while mem_cgroup_css_free()
won't be called until the css refcnt goes down to 0.
Instead we increment the css refcnt in mem_cgroup_css_offline() and
then check whether there are still kmem charges. If not, the css
refcnt is decremented immediately; otherwise it is decremented when
the kmem charges go down to 0.
v2:
- added wmb() in kmem_cgroup_css_offline(), pointed out by Michal
- revised comments as suggested by Michal
- fixed to check if kmem is activated in kmem_cgroup_css_offline()
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 66 +++++++++++++++++++++++++++++++++++----------------------
1 file changed, 41 insertions(+), 25 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 63526f9..2f0d117 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3033,8 +3033,16 @@ static void memcg_uncharge_kmem(struct mem_cgroup *memcg, u64 size)
if (res_counter_uncharge(&memcg->kmem, size))
return;
+ /*
+ * Releases a reference taken in kmem_cgroup_css_offline in case
+ * this last uncharge is racing with the offlining code or it is
+ * outliving the memcg existence.
+ *
+ * The memory barrier imposed by test&clear is paired with the
+ * explicit one in kmem_cgroup_css_offline.
+ */
if (memcg_kmem_test_and_clear_dead(memcg))
- mem_cgroup_put(memcg);
+ css_put(&memcg->css);
}
void memcg_cache_list_add(struct mem_cgroup *memcg, struct kmem_cache *cachep)
@@ -5130,14 +5138,6 @@ static int memcg_update_kmem_limit(struct cgroup *cont, u64 val)
* starts accounting before all call sites are patched
*/
memcg_kmem_set_active(memcg);
-
- /*
- * kmem charges can outlive the cgroup. In the case of slab
- * pages, for instance, a page contain objects from various
- * processes, so it is unfeasible to migrate them away. We
- * need to reference count the memcg because of that.
- */
- mem_cgroup_get(memcg);
} else
ret = res_counter_set_limit(&memcg->kmem, val);
out:
@@ -5170,12 +5170,10 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
goto out;
/*
- * destroy(), called if we fail, will issue static_key_slow_inc() and
- * mem_cgroup_put() if kmem is enabled. We have to either call them
- * unconditionally, or clear the KMEM_ACTIVE flag. I personally find
- * this more consistent, since it always leads to the same destroy path
+ * __mem_cgroup_free() will issue static_key_slow_dec() because this
+ * memcg is active already. If the later initialization fails then the
+ * cgroup core triggers the cleanup so we do not have to do it here.
*/
- mem_cgroup_get(memcg);
static_key_slow_inc(&memcg_kmem_enabled_key);
mutex_lock(&set_limit_mutex);
@@ -5858,23 +5856,39 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
return mem_cgroup_sockets_init(memcg, ss);
}
-static void kmem_cgroup_destroy(struct mem_cgroup *memcg)
+static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
{
- mem_cgroup_sockets_destroy(memcg);
+ if (!memcg_kmem_is_active(memcg))
+ return;
+ /*
+ * kmem charges can outlive the cgroup. In the case of slab
+ * pages, for instance, a page contain objects from various
+ * processes. As we prevent from taking a reference for every
+ * such allocation we have to be careful when doing uncharge
+ * (see memcg_uncharge_kmem) and here during offlining.
+ *
+ * The idea is that that only the _last_ uncharge which sees
+ * the dead memcg will drop the last reference. An additional
+ * reference is taken here before the group is marked dead
+ * which is then paired with css_put during uncharge resp. here.
+ *
+ * Although this might sound strange as this path is called when
+ * the reference has already dropped down to 0 and shouldn't be
+ * incremented anymore (css_tryget would fail) we do not have
+ * other options because of the kmem allocations lifetime.
+ */
+ css_get(&memcg->css);
+
+ /* see comment in memcg_uncharge_kmem() */
+ wmb();
memcg_kmem_mark_dead(memcg);
if (res_counter_read_u64(&memcg->kmem, RES_USAGE) != 0)
return;
- /*
- * Charges already down to 0, undo mem_cgroup_get() done in the charge
- * path here, being careful not to race with memcg_uncharge_kmem: it is
- * possible that the charges went down to 0 between mark_dead and the
- * res_counter read, so in that case, we don't need the put
- */
if (memcg_kmem_test_and_clear_dead(memcg))
- mem_cgroup_put(memcg);
+ css_put(&memcg->css);
}
#else
static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
@@ -5882,7 +5896,7 @@ static int memcg_init_kmem(struct mem_cgroup *memcg, struct cgroup_subsys *ss)
return 0;
}
-static void kmem_cgroup_destroy(struct mem_cgroup *memcg)
+static void kmem_cgroup_css_offline(struct mem_cgroup *memcg)
{
}
#endif
@@ -6315,6 +6329,8 @@ static void mem_cgroup_css_offline(struct cgroup *cont)
{
struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+ kmem_cgroup_css_offline(memcg);
+
mem_cgroup_invalidate_reclaim_iterators(memcg);
mem_cgroup_reparent_charges(memcg);
mem_cgroup_destroy_all_caches(memcg);
@@ -6324,7 +6340,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
{
struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
- kmem_cgroup_destroy(memcg);
+ mem_cgroup_sockets_destroy(memcg);
mem_cgroup_put(memcg);
}
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 6/9] memcg: use css_get/put for swap memcg
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
` (3 preceding siblings ...)
2013-05-17 7:04 ` [PATCH 5/9] memcg: use css_get/put when charging/uncharging kmem Li Zefan
@ 2013-05-17 7:04 ` Li Zefan
2013-05-17 7:04 ` [PATCH 7/9] memcg: don't need to get a reference to the parent Li Zefan
` (3 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put().
A simple replacement will do.
The historical reason that memcg has its own refcnt, instead of always
using css_get()/css_put(), is that a cgroup couldn't be removed while
there were still css refs, so css refs couldn't be used as long-lived
references. The situation has changed: rmdir of a cgroup now succeeds
regardless of css refs, but the cgroup won't be freed until the css
refcnt goes down to 0.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2f0d117..b11ea88 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4185,12 +4185,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
unlock_page_cgroup(pc);
/*
* even after unlock, we have memcg->res.usage here and this memcg
- * will never be freed.
+ * will never be freed, so it's safe to call css_get().
*/
memcg_check_events(memcg, page);
if (do_swap_account && ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT) {
mem_cgroup_swap_statistics(memcg, true);
- mem_cgroup_get(memcg);
+ css_get(&memcg->css);
}
/*
* Migration does not charge the res_counter for the
@@ -4290,7 +4290,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
/*
* record memcg information, if swapout && memcg != NULL,
- * mem_cgroup_get() was called in uncharge().
+ * css_get() was called in uncharge().
*/
if (do_swap_account && swapout && memcg)
swap_cgroup_record(ent, css_id(&memcg->css));
@@ -4321,7 +4321,7 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent)
if (!mem_cgroup_is_root(memcg))
res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
mem_cgroup_swap_statistics(memcg, false);
- mem_cgroup_put(memcg);
+ css_put(&memcg->css);
}
rcu_read_unlock();
}
@@ -4355,11 +4355,14 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
* This function is only called from task migration context now.
* It postpones res_counter and refcount handling till the end
* of task migration(mem_cgroup_clear_mc()) for performance
- * improvement. But we cannot postpone mem_cgroup_get(to)
- * because if the process that has been moved to @to does
- * swap-in, the refcount of @to might be decreased to 0.
+ * improvement. But we cannot postpone css_get(to) because if
+ * the process that has been moved to @to does swap-in, the
+ * refcount of @to might be decreased to 0.
+ *
+ * We are in attach() phase, so the cgroup is guaranteed to be
+ * alive, so we can just call css_get().
*/
- mem_cgroup_get(to);
+ css_get(&to->css);
return 0;
}
return -EINVAL;
@@ -6651,6 +6654,7 @@ static void __mem_cgroup_clear_mc(void)
{
struct mem_cgroup *from = mc.from;
struct mem_cgroup *to = mc.to;
+ int i;
/* we must uncharge all the leftover precharges from mc.to */
if (mc.precharge) {
@@ -6671,7 +6675,9 @@ static void __mem_cgroup_clear_mc(void)
if (!mem_cgroup_is_root(mc.from))
res_counter_uncharge(&mc.from->memsw,
PAGE_SIZE * mc.moved_swap);
- __mem_cgroup_put(mc.from, mc.moved_swap);
+
+ for (i = 0; i < mc.moved_swap; i++)
+ css_put(&mc.from->css);
if (!mem_cgroup_is_root(mc.to)) {
/*
@@ -6681,7 +6687,7 @@ static void __mem_cgroup_clear_mc(void)
res_counter_uncharge(&mc.to->res,
PAGE_SIZE * mc.moved_swap);
}
- /* we've already done mem_cgroup_get(mc.to) */
+ /* we've already done css_get(mc.to) */
mc.moved_swap = 0;
}
memcg_oom_recover(from);
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 7/9] memcg: don't need to get a reference to the parent
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
` (4 preceding siblings ...)
2013-05-17 7:04 ` [PATCH 6/9] memcg: use css_get/put for swap memcg Li Zefan
@ 2013-05-17 7:04 ` Li Zefan
2013-05-17 7:05 ` [PATCH 8/9] memcg: kill memcg refcnt Li Zefan
` (2 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
The cgroup core guarantees it's always safe to access the parent.
v2:
- added a comment in mem_cgroup_put() as suggested by Michal
- removed mem_cgroup_get(), otherwise gcc will warn that it's not used
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 19 +++----------------
1 file changed, 3 insertions(+), 16 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b11ea88..c6d267a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -508,7 +508,6 @@ enum res_type {
*/
static DEFINE_MUTEX(memcg_create_mutex);
-static void mem_cgroup_get(struct mem_cgroup *memcg);
static void mem_cgroup_put(struct mem_cgroup *memcg);
static inline
@@ -6171,19 +6170,10 @@ static void free_rcu(struct rcu_head *rcu_head)
schedule_work(&memcg->work_freeing);
}
-static void mem_cgroup_get(struct mem_cgroup *memcg)
-{
- atomic_inc(&memcg->refcnt);
-}
-
static void __mem_cgroup_put(struct mem_cgroup *memcg, int count)
{
- if (atomic_sub_and_test(count, &memcg->refcnt)) {
- struct mem_cgroup *parent = parent_mem_cgroup(memcg);
+ if (atomic_sub_and_test(count, &memcg->refcnt))
call_rcu(&memcg->rcu_freeing, free_rcu);
- if (parent)
- mem_cgroup_put(parent);
- }
}
static void mem_cgroup_put(struct mem_cgroup *memcg)
@@ -6286,12 +6276,9 @@ mem_cgroup_css_online(struct cgroup *cont)
res_counter_init(&memcg->kmem, &parent->kmem);
/*
- * We increment refcnt of the parent to ensure that we can
- * safely access it on res_counter_charge/uncharge.
- * This refcnt will be decremented when freeing this
- * mem_cgroup(see mem_cgroup_put).
+ * No need to take a reference to the parent because cgroup
+ * core guarantees its existence.
*/
- mem_cgroup_get(parent);
} else {
res_counter_init(&memcg->res, NULL);
res_counter_init(&memcg->memsw, NULL);
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 8/9] memcg: kill memcg refcnt
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
` (5 preceding siblings ...)
2013-05-17 7:04 ` [PATCH 7/9] memcg: don't need to get a reference to the parent Li Zefan
@ 2013-05-17 7:05 ` Li Zefan
2013-05-17 7:05 ` [PATCH 9/9] memcg: don't need to free memcg via RCU or workqueue Li Zefan
[not found] ` <5195D5F8.7000609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:05 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Now memcg has the same life cycle as its corresponding cgroup.
Kill the useless refcnt.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c6d267a..348126a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -303,8 +303,6 @@ struct mem_cgroup {
bool oom_lock;
atomic_t under_oom;
- atomic_t refcnt;
-
int swappiness;
/* OOM-Killer disable */
int oom_kill_disable;
@@ -508,8 +506,6 @@ enum res_type {
*/
static DEFINE_MUTEX(memcg_create_mutex);
-static void mem_cgroup_put(struct mem_cgroup *memcg);
-
static inline
struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *s)
{
@@ -6170,17 +6166,6 @@ static void free_rcu(struct rcu_head *rcu_head)
schedule_work(&memcg->work_freeing);
}
-static void __mem_cgroup_put(struct mem_cgroup *memcg, int count)
-{
- if (atomic_sub_and_test(count, &memcg->refcnt))
- call_rcu(&memcg->rcu_freeing, free_rcu);
-}
-
-static void mem_cgroup_put(struct mem_cgroup *memcg)
-{
- __mem_cgroup_put(memcg, 1);
-}
-
/*
* Returns the parent mem_cgroup in memcgroup hierarchy with hierarchy enabled.
*/
@@ -6240,7 +6225,6 @@ mem_cgroup_css_alloc(struct cgroup *cont)
memcg->last_scanned_node = MAX_NUMNODES;
INIT_LIST_HEAD(&memcg->oom_notify);
- atomic_set(&memcg->refcnt, 1);
memcg->move_charge_at_immigrate = 0;
mutex_init(&memcg->thresholds_lock);
spin_lock_init(&memcg->move_lock);
@@ -6332,7 +6316,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
mem_cgroup_sockets_destroy(memcg);
- mem_cgroup_put(memcg);
+ call_rcu(&memcg->rcu_freeing, free_rcu);
}
#ifdef CONFIG_MMU
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* [PATCH 9/9] memcg: don't need to free memcg via RCU or workqueue
2013-05-17 7:02 [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
` (6 preceding siblings ...)
2013-05-17 7:05 ` [PATCH 8/9] memcg: kill memcg refcnt Li Zefan
@ 2013-05-17 7:05 ` Li Zefan
[not found] ` <5195D5F8.7000609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
8 siblings, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:05 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Now memcg has the same life cycle as its corresponding cgroup, and
a cgroup is freed via RCU and then mem_cgroup_css_free() will be
called in a work function, so we can simply call __mem_cgroup_free()
in mem_cgroup_css_free().
This actually reverts 59927fb984de1703c67bc640c3e522d8b5276c73
("memcg: free mem_cgroup by RCU to fix oops").
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 51 +++++----------------------------------------------
1 file changed, 5 insertions(+), 46 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 348126a..eb27b58 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -267,28 +267,10 @@ struct mem_cgroup {
/* vmpressure notifications */
struct vmpressure vmpressure;
- union {
- /*
- * the counter to account for mem+swap usage.
- */
- struct res_counter memsw;
-
- /*
- * rcu_freeing is used only when freeing struct mem_cgroup,
- * so put it into a union to avoid wasting more memory.
- * It must be disjoint from the css field. It could be
- * in a union with the res field, but res plays a much
- * larger part in mem_cgroup life than memsw, and might
- * be of interest, even at time of free, when debugging.
- * So share rcu_head with the less interesting memsw.
- */
- struct rcu_head rcu_freeing;
- /*
- * We also need some space for a worker in deferred freeing.
- * By the time we call it, rcu_freeing is no longer in use.
- */
- struct work_struct work_freeing;
- };
+ /*
+ * the counter to account for mem+swap usage.
+ */
+ struct res_counter memsw;
/*
* the counter to account for kernel memory usage.
@@ -6143,29 +6125,6 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
vfree(memcg);
}
-
-/*
- * Helpers for freeing a kmalloc()ed/vzalloc()ed mem_cgroup by RCU,
- * but in process context. The work_freeing structure is overlaid
- * on the rcu_freeing structure, which itself is overlaid on memsw.
- */
-static void free_work(struct work_struct *work)
-{
- struct mem_cgroup *memcg;
-
- memcg = container_of(work, struct mem_cgroup, work_freeing);
- __mem_cgroup_free(memcg);
-}
-
-static void free_rcu(struct rcu_head *rcu_head)
-{
- struct mem_cgroup *memcg;
-
- memcg = container_of(rcu_head, struct mem_cgroup, rcu_freeing);
- INIT_WORK(&memcg->work_freeing, free_work);
- schedule_work(&memcg->work_freeing);
-}
-
/*
* Returns the parent mem_cgroup in memcgroup hierarchy with hierarchy enabled.
*/
@@ -6316,7 +6275,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
mem_cgroup_sockets_destroy(memcg);
- call_rcu(&memcg->rcu_freeing, free_rcu);
+ __mem_cgroup_free(memcg);
}
#ifdef CONFIG_MMU
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

[parent not found: <5195D5F8.7000609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>]
* [PATCH 3/9] memcg: use css_get() in sock_update_memcg()
[not found] ` <5195D5F8.7000609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-05-17 7:03 ` Li Zefan
2013-05-17 7:06 ` [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup Li Zefan
1 sibling, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
Use css_get/css_put instead of mem_cgroup_get/put.
Note, if at the same time someone is moving @current to a different
cgroup and removing the old cgroup, css_tryget() may return false,
and sock->sk_cgrp won't be initialized, which is fine.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
mm/memcontrol.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4d0458d..f1320d3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -561,15 +561,15 @@ void sock_update_memcg(struct sock *sk)
*/
if (sk->sk_cgrp) {
BUG_ON(mem_cgroup_is_root(sk->sk_cgrp->memcg));
- mem_cgroup_get(sk->sk_cgrp->memcg);
+ css_get(&sk->sk_cgrp->memcg->css);
return;
}
rcu_read_lock();
memcg = mem_cgroup_from_task(current);
cg_proto = sk->sk_prot->proto_cgroup(memcg);
- if (!mem_cgroup_is_root(memcg) && memcg_proto_active(cg_proto)) {
- mem_cgroup_get(memcg);
+ if (!mem_cgroup_is_root(memcg) &&
+ memcg_proto_active(cg_proto) && css_tryget(&memcg->css)) {
sk->sk_cgrp = cg_proto;
}
rcu_read_unlock();
@@ -583,7 +583,7 @@ void sock_release_memcg(struct sock *sk)
struct mem_cgroup *memcg;
WARN_ON(!sk->sk_cgrp->memcg);
memcg = sk->sk_cgrp->memcg;
- mem_cgroup_put(memcg);
+ css_put(&sk->sk_cgrp->memcg->css);
}
}
--
1.8.0.2
^ permalink raw reply related [flat|nested] 16+ messages in thread

* Re: [PATCH 0/12][V3] memcg: make memcg's life cycle the same as cgroup
[not found] ` <5195D5F8.7000609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-05-17 7:03 ` [PATCH 3/9] memcg: use css_get() in sock_update_memcg() Li Zefan
@ 2013-05-17 7:06 ` Li Zefan
1 sibling, 0 replies; 16+ messages in thread
From: Li Zefan @ 2013-05-17 7:06 UTC (permalink / raw)
To: Andrew Morton
Cc: Tejun Heo, Glauber Costa, KAMEZAWA Hiroyuki, Johannes Weiner,
LKML, Cgroups, linux-mm
The subject should be "[PATCH 0/9][v3] ..."
On 2013/5/17 15:02, Li Zefan wrote:
> Hi Andrew,
>
^ permalink raw reply [flat|nested] 16+ messages in thread