From: Suleiman Souhlal <ssouhlal@FreeBSD.org>
To: cgroups@vger.kernel.org
Cc: suleiman@google.com, glommer@parallels.com,
kamezawa.hiroyu@jp.fujitsu.com, penberg@kernel.org,
yinghan@google.com, hughd@google.com, gthelen@google.com,
linux-mm@kvack.org, devel@openvz.org
Subject: [PATCH 07/10] memcg: Stop res_counter underflows.
Date: Mon, 27 Feb 2012 14:58:50 -0800
Message-ID: <1330383533-20711-8-git-send-email-ssouhlal@FreeBSD.org>
In-Reply-To: <1330383533-20711-1-git-send-email-ssouhlal@FreeBSD.org>

From: Hugh Dickins <hughd@google.com>

If __mem_cgroup_try_charge() takes the "bypass" route when charging slab
memory (typically because the task has been OOM-killed), the charge is
shifted to root, but the later uncharge is still applied to the memcg.
That makes res_counter_uncharge_locked() underflow, producing a stream of
warnings from kernel/res_counter.c:96.

Solve this by recording the amount in kmem_bypassed whenever we shift a
charge to root. While a memcg has any kmem_bypassed outstanding,
unaccounting kmem deducts from that first, before deducting from
kmem_bytes, so that kmem_bytes soon returns to being a fair account.

The amount of memory bypassed is shown in memory.stat as
kernel_bypassed_memory.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
---
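The accounting described above can be sketched in userspace. This is only
a rough single-threaded model, not the kernel code: struct model_memcg,
model_root, model_charge() and model_uncharge() are hypothetical names
made up for illustration, the bypass decision is passed in as a flag
instead of coming from __mem_cgroup_try_charge(), and the races that the
patch handles with atomic64_t are ignored.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for a memcg; not the kernel structure. */
struct model_memcg {
	long long kmem_bytes;		/* bytes charged to this group */
	long long kmem_bypassed;	/* bytes that were charged to root instead */
};

/* Stand-in for root_mem_cgroup. */
static struct model_memcg model_root;

/*
 * Charge delta bytes. When bypass is nonzero, model the case where the
 * charge was shifted to root: remember it in kmem_bypassed and leave the
 * group's own kmem_bytes alone.
 */
static void model_charge(struct model_memcg *cg, long long delta, int bypass)
{
	if (bypass && cg != &model_root) {
		cg->kmem_bypassed += delta;
		model_root.kmem_bytes += delta;
		return;
	}
	cg->kmem_bytes += delta;
}

/*
 * Uncharge delta bytes, paying down outstanding bypassed bytes first so
 * that kmem_bytes is never asked to give back more than it was charged.
 */
static void model_uncharge(struct model_memcg *cg, long long delta)
{
	long long from_bypass = cg->kmem_bypassed;

	if (from_bypass > delta)
		from_bypass = delta;

	cg->kmem_bypassed -= from_bypass;
	model_root.kmem_bytes -= from_bypass;
	cg->kmem_bytes -= delta - from_bypass;

	assert(cg->kmem_bytes >= 0);	/* the underflow the patch avoids */
}
```

In this model, charging 4096 bytes normally and 8192 bytes via bypass,
then uncharging 8192 and 4096, leaves both kmem_bytes and kmem_bypassed
at zero. Without the kmem_bypassed step, the 8192-byte uncharge would
push kmem_bytes to -4096.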
mm/memcontrol.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 files changed, 40 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d1c0cd7..6a475ed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -302,6 +302,9 @@ struct mem_cgroup {
 	/* Slab accounting */
 	struct kmem_cache *slabs[MAX_KMEM_CACHE_TYPES];
 #endif
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
+	atomic64_t kmem_bypassed;
+#endif
 	int independent_kmem_limit;
 };
@@ -4037,6 +4040,7 @@ enum {
 	MCS_INACTIVE_FILE,
 	MCS_ACTIVE_FILE,
 	MCS_UNEVICTABLE,
+	MCS_KMEM_BYPASSED,
 	NR_MCS_STAT,
 };
@@ -4060,7 +4064,8 @@ struct {
 	{"active_anon", "total_active_anon"},
 	{"inactive_file", "total_inactive_file"},
 	{"active_file", "total_active_file"},
-	{"unevictable", "total_unevictable"}
+	{"unevictable", "total_unevictable"},
+	{"kernel_bypassed_memory", "total_kernel_bypassed_memory"}
 };
@@ -4100,6 +4105,10 @@ mem_cgroup_get_local_stat(struct mem_cgroup *memcg, struct mcs_total_stat *s)
 	s->stat[MCS_ACTIVE_FILE] += val * PAGE_SIZE;
 	val = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE));
 	s->stat[MCS_UNEVICTABLE] += val * PAGE_SIZE;
+
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
+	s->stat[MCS_KMEM_BYPASSED] += atomic64_read(&memcg->kmem_bypassed);
+#endif
 }
 static void
@@ -5616,14 +5625,24 @@ memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, long long delta)
 	ret = 0;
 	if (memcg && !memcg->independent_kmem_limit) {
+		/*
+		 * __mem_cgroup_try_charge may decide to bypass the charge and
+		 * set _memcg to NULL, in which case we need to account to the
+		 * root.
+		 */
 		_memcg = memcg;
 		if (__mem_cgroup_try_charge(NULL, gfp, delta / PAGE_SIZE,
 		    &_memcg, may_oom) != 0)
 			return -ENOMEM;
+
+		if (!_memcg && memcg != root_mem_cgroup) {
+			atomic64_add(delta, &memcg->kmem_bypassed);
+			memcg = NULL;
+		}
 	}
-	if (_memcg)
-		ret = res_counter_charge(&_memcg->kmem_bytes, delta, &fail_res);
+	if (memcg)
+		ret = res_counter_charge(&memcg->kmem_bytes, delta, &fail_res);
 	return ret;
 }
@@ -5631,6 +5650,22 @@ memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, long long delta)
 void
 memcg_uncharge_kmem(struct mem_cgroup *memcg, long long delta)
 {
+	long long bypassed;
+
+	if (memcg) {
+		bypassed = atomic64_read(&memcg->kmem_bypassed);
+		if (bypassed > 0) {
+			if (bypassed > delta)
+				bypassed = delta;
+			do {
+				memcg_uncharge_kmem(NULL, bypassed);
+				delta -= bypassed;
+				bypassed = atomic64_sub_return(bypassed,
+				    &memcg->kmem_bypassed);
+			} while (bypassed < 0);	/* Might have raced */
+		}
+	}
+
 	if (memcg)
 		res_counter_uncharge(&memcg->kmem_bytes, delta);
@@ -5956,6 +5991,7 @@ memcg_kmem_init(struct mem_cgroup *memcg, struct mem_cgroup *parent)
 	memcg_slab_init(memcg);
+	atomic64_set(&memcg->kmem_bypassed, 0);
 	memcg->independent_kmem_limit = 0;
 }
@@ -5967,6 +6003,7 @@ memcg_kmem_move(struct mem_cgroup *memcg)
 	memcg_slab_move(memcg);
+	atomic64_set(&memcg->kmem_bypassed, 0);
 	spin_lock_irqsave(&memcg->kmem_bytes.lock, flags);
 	kmem_bytes = memcg->kmem_bytes.usage;
 	res_counter_uncharge_locked(&memcg->kmem_bytes, kmem_bytes);
--
1.7.7.3
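
For watching the new counter from userspace, a small helper along these
lines could pick kernel_bypassed_memory (or its total_ variant) out of
the contents of memory.stat. This is a sketch under assumptions:
stat_lookup() is a hypothetical name, the caller is assumed to have
already read the file into a NUL-terminated buffer, and the file is
assumed to consist of one "name value" pair per line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" lines in text and return the value stored under key,
 * or -1 if no line carries exactly that name.
 */
static long long stat_lookup(const char *text, const char *key)
{
	char name[64];
	long long value;

	while (*text) {
		const char *newline;

		/* Exact match only, so "total_foo" never satisfies "foo". */
		if (sscanf(text, "%63s %lld", name, &value) == 2 &&
		    strcmp(name, key) == 0)
			return value;

		/* Move on to the start of the next line, if there is one. */
		newline = strchr(text, '\n');
		if (!newline)
			break;
		text = newline + 1;
	}
	return -1;
}
```

A caller would pass the buffer and the string "kernel_bypassed_memory";
a result of -1 means the key was not present, for example on a kernel
built without CONFIG_CGROUP_MEM_RES_CTLR_KMEM.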