linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
	"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: [patch] memcg: prevent endless loop with huge pages and near-limit group
Date: Thu, 27 Jan 2011 11:40:14 +0100	[thread overview]
Message-ID: <20110127104014.GD2401@cmpxchg.org> (raw)
In-Reply-To: <20110127103438.GC2401@cmpxchg.org>

This is a patch I sent to Andrea ages ago in response to a RHEL
bugzilla.  Not sure why it did not reach mainline...  But it fixes one
issue you described in 4/7, namely looping around a not exceeded limit
with a huge page that won't fit anymore.

---
From: Johannes Weiner <hannes@cmpxchg.org>
Subject: [patch] memcg: prevent endless loop with huge pages and near-limit group

If reclaim after a failed charging was unsuccessful, the limits are
checked again, just in case they settled by means of other tasks.

This is all fine as long as every charge is of size PAGE_SIZE, because
in that case, being below the limit means having at least PAGE_SIZE
bytes available.

But with transparent huge pages, we may end up in an endless loop
where charging and reclaim fail, but we keep going because the limits
are not yet exceeded, although not allowing for a huge page.

Fix this up by explicitely checking for enough room, not just whether
we are within limits.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/res_counter.h |   12 ++++++++++++
 mm/memcontrol.c             |   20 +++++++++++---------
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
index fcb9884..03212e4 100644
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -182,6 +182,18 @@ static inline bool res_counter_check_under_limit(struct res_counter *cnt)
 	return ret;
 }
 
+static inline bool res_counter_check_room(struct res_counter *cnt,
+					  unsigned long room)
+{
+	bool ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	ret = cnt->limit - cnt->usage >= room;
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return ret;
+}
+
 static inline bool res_counter_check_under_soft_limit(struct res_counter *cnt)
 {
 	bool ret;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d572102..8fa4be3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1111,6 +1111,15 @@ static bool mem_cgroup_check_under_limit(struct mem_cgroup *mem)
 	return false;
 }
 
+static bool mem_cgroup_check_room(struct mem_cgroup *mem, unsigned long room)
+{
+	if (!res_counter_check_room(&mem->res, room))
+		return false;
+	if (!do_swap_account)
+		return true;
+	return res_counter_check_room(&mem->memsw, room);
+}
+
 static unsigned int get_swappiness(struct mem_cgroup *memcg)
 {
 	struct cgroup *cgrp = memcg->css.cgroup;
@@ -1844,16 +1853,9 @@ static int __mem_cgroup_do_charge(struct mem_cgroup *mem, gfp_t gfp_mask,
 	if (!(gfp_mask & __GFP_WAIT))
 		return CHARGE_WOULDBLOCK;
 
-	ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, NULL,
+	mem_cgroup_hierarchical_reclaim(mem_over_limit, NULL,
 					gfp_mask, flags);
-	/*
-	 * try_to_free_mem_cgroup_pages() might not give us a full
-	 * picture of reclaim. Some pages are reclaimed and might be
-	 * moved to swap cache or just unmapped from the cgroup.
-	 * Check the limit again to see if the reclaim reduced the
-	 * current usage of the cgroup before giving up
-	 */
-	if (ret || mem_cgroup_check_under_limit(mem_over_limit))
+	if (mem_cgroup_check_room(mem_over_limit, csize))
 		return CHARGE_RETRY;
 
 	/*
-- 
1.7.3.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-01-27 10:40 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-21  6:34 [PATCH 0/7] memcg : more fixes and clean up for 2.6.28-rc KAMEZAWA Hiroyuki
2011-01-21  6:37 ` [PATCH 1/7] memcg : comment, style fixes for recent patch of move_parent KAMEZAWA Hiroyuki
2011-01-21  7:16   ` Daisuke Nishimura
2011-01-24 10:14   ` Johannes Weiner
2011-01-24 10:15     ` KAMEZAWA Hiroyuki
2011-01-24 10:45       ` Johannes Weiner
2011-01-24 11:14         ` Hiroyuki Kamezawa
2011-01-24 11:34           ` Johannes Weiner
2011-01-21  6:39 ` [PATCH 2/7] memcg : more fixes and clean up for 2.6.28-rc KAMEZAWA Hiroyuki
2011-01-21  7:17   ` Daisuke Nishimura
2011-01-24 10:14   ` Johannes Weiner
2011-01-21  6:41 ` [PATCH 3/7] memcg : fix mem_cgroup_check_under_limit KAMEZAWA Hiroyuki
2011-01-21  7:45   ` Daisuke Nishimura
2011-01-24 10:04   ` Johannes Weiner
2011-01-24 10:03     ` KAMEZAWA Hiroyuki
2011-01-21  6:44 ` [PATCH 4/7] memcg : fix charge function of THP allocation KAMEZAWA Hiroyuki
2011-01-21  8:48   ` Daisuke Nishimura
2011-01-24  0:14     ` KAMEZAWA Hiroyuki
2011-01-27 10:34   ` Johannes Weiner
2011-01-27 10:40     ` Johannes Weiner [this message]
2011-01-27 23:40       ` [patch] memcg: prevent endless loop with huge pages and near-limit group KAMEZAWA Hiroyuki
2011-01-27 13:46     ` [patch 2/3] memcg: prevent endless loop on huge page charge Johannes Weiner
2011-01-27 14:00       ` Gleb Natapov
2011-01-27 14:14         ` Johannes Weiner
2011-01-27 23:41           ` KAMEZAWA Hiroyuki
2011-01-27 13:47     ` [patch 3/3] memcg: never OOM when charging huge pages Johannes Weiner
2011-01-27 23:44       ` KAMEZAWA Hiroyuki
2011-01-27 23:45       ` Daisuke Nishimura
2011-01-27 23:49         ` KAMEZAWA Hiroyuki
2011-01-27 14:18     ` [PATCH 4/7] memcg : fix charge function of THP allocation Johannes Weiner
2011-01-27 23:38     ` KAMEZAWA Hiroyuki
2011-01-21  6:46 ` [PATCH 5/7] memcg : fix khugepaged scan of process under buzy memcg KAMEZAWA Hiroyuki
2011-01-21  6:49 ` [PATCH 6/7] memcg : use better variable name KAMEZAWA Hiroyuki
2011-01-21  6:50 ` [PATCH 7/7] memcg : remove ugly vairable initialization by callers KAMEZAWA Hiroyuki
2011-01-21  9:17   ` Daisuke Nishimura
2011-01-24 10:19   ` Johannes Weiner
2011-01-24  0:29 ` [PATCH 0/7] memcg : more fixes and clean up for 2.6.28-rc KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110127104014.GD2401@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nishimura@mxp.nes.nec.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).