From: Suleiman Souhlal <ssouhlal@FreeBSD.org>
To: cgroups@vger.kernel.org
Cc: suleiman@google.com, glommer@parallels.com,
	kamezawa.hiroyu@jp.fujitsu.com, penberg@kernel.org,
	yinghan@google.com, hughd@google.com, gthelen@google.com,
	linux-mm@kvack.org, devel@openvz.org
Subject: [PATCH 03/10] memcg: Reclaim when more than one page needed.
Date: Mon, 27 Feb 2012 14:58:46 -0800
Message-ID: <1330383533-20711-4-git-send-email-ssouhlal@FreeBSD.org>
In-Reply-To: <1330383533-20711-1-git-send-email-ssouhlal@FreeBSD.org>

From: Hugh Dickins <hughd@google.com>

mem_cgroup_do_charge() was written before slab accounting, and expects
three cases: being called for 1 page, being called for a stock of 32
pages (CHARGE_BATCH), or being called for a hugepage.  If we call it for
2 pages (and several slabs used in process creation are that size, at
least with the debug options I had), it assumes it is being called for
stock and just retries without reclaiming.

Fix that by passing down a minimum size argument (min_pages) in addition
to the batch size (nr_pages); and pass the requested page count to
consume_stock() as well, so that it can draw on stock for higher order
slabs, instead of accumulating an increasing surplus of stock, as the
callers' "nr_pages == 1" tests previously caused.

And what to do about that (nr_pages == 1 && ret) retry?  If it's needed
at all (and presumably it is, since it's there, perhaps to handle
races), then it should be extended to more than one page, yet how far?
And should there be a retry count limit, and if so, what?  For now retry
for sizes up to PAGE_ALLOC_COSTLY_ORDER (as page_alloc.c does), stay safe
with a cond_resched(), and make sure not to do it if __GFP_NORETRY.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
---
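Note on the consume_stock() change: the sketch below is a userspace
model only, not kernel code.  struct stock_model and
consume_stock_model() are hypothetical names; the per-cpu access
(get_cpu_var/put_cpu_var) is left out, and the memcg is reduced to an
opaque pointer used for identity.  It is meant to show why a 2-page
request can now draw on stock, and why a request larger than the
remaining stock is refused rather than partially served.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-cpu struct memcg_stock_pcp. */
struct stock_model {
	const void *cached;	/* memcg that owns the stocked charge */
	unsigned int nr_pages;	/* pages already charged and held in stock */
};

/*
 * Same test as the patched consume_stock(): succeed only when the stock
 * belongs to this memcg and holds at least nr_pages.
 */
static bool consume_stock_model(struct stock_model *stock, const void *memcg,
				unsigned int nr_pages)
{
	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
		stock->nr_pages -= nr_pages;
		return true;
	}
	/* In the kernel the caller falls back to res_counter_charge(). */
	return false;
}
```

With a stock of 3 pages, a first 2-page request succeeds and leaves 1
page; a second 2-page request fails and leaves the stock untouched.
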
 mm/memcontrol.c |   35 +++++++++++++++++++----------------
 1 files changed, 19 insertions(+), 16 deletions(-)
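
Note on the retry policy: the sketch below is a userspace model of the
decisions mem_cgroup_do_charge() takes after the res_counter charge has
failed, not the kernel function itself.  All names are hypothetical.
The reclaim call is replaced by two inputs (reclaimed, margin) standing
for its result and for mem_cgroup_margin().  The retry bound is written
in pages, on the assumption that "up to COSTLY_ORDER" means at most
1 << PAGE_ALLOC_COSTLY_ORDER pages, and MODEL_COSTLY_ORDER is assumed to
be 3.

```c
#include <stdbool.h>

#define MODEL_COSTLY_ORDER 3	/* assumed value of PAGE_ALLOC_COSTLY_ORDER */

enum model_result {
	MODEL_RETRY,
	MODEL_WOULDBLOCK,
	MODEL_NOMEM,
	MODEL_FALLTHROUGH,	/* task-move and OOM handling, not modelled */
};

static enum model_result charge_failed_model(unsigned int nr_pages,
					     unsigned int min_pages,
					     bool can_wait, bool noretry,
					     unsigned long reclaimed,
					     unsigned long margin)
{
	/* Optional batching: never reclaim for it, retry with min_pages. */
	if (nr_pages > min_pages)
		return MODEL_RETRY;
	if (!can_wait)			/* no __GFP_WAIT */
		return MODEL_WOULDBLOCK;
	if (noretry)			/* __GFP_NORETRY */
		return MODEL_NOMEM;
	/* Reclaim would run here; its results arrive as arguments. */
	if (margin >= nr_pages)
		return MODEL_RETRY;
	/* Small requests retry as long as reclaim made some progress. */
	if (nr_pages <= (1u << MODEL_COSTLY_ORDER) && reclaimed)
		return MODEL_RETRY;
	return MODEL_FALLTHROUGH;
}
```

A 2-page slab charge with min_pages == 2 now reaches the reclaim step
instead of being treated as a batch refill.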

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6f44fcb..c82ca1c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1928,19 +1928,19 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 /*
- * Try to consume stocked charge on this cpu. If success, one page is consumed
- * from local stock and true is returned. If the stock is 0 or charges from a
- * cgroup which is not current target, returns false. This stock will be
- * refilled.
+ * Try to consume stocked charge on this cpu. If success, nr_pages pages are
+ * consumed from local stock and true is returned. If the stock is 0 or
+ * charges from a cgroup which is not current target, returns false.
+ * This stock will be refilled.
  */
-static bool consume_stock(struct mem_cgroup *memcg)
+static bool consume_stock(struct mem_cgroup *memcg, int nr_pages)
 {
 	struct memcg_stock_pcp *stock;
 	bool ret = true;
 
 	stock = &get_cpu_var(memcg_stock);
-	if (memcg == stock->cached && stock->nr_pages)
-		stock->nr_pages--;
+	if (memcg == stock->cached && stock->nr_pages >= nr_pages)
+		stock->nr_pages -= nr_pages;
 	else /* need to call res_counter_charge */
 		ret = false;
 	put_cpu_var(memcg_stock);
@@ -2131,7 +2131,7 @@ enum {
 };
 
 static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
-				unsigned int nr_pages, bool oom_check)
+    unsigned int nr_pages, unsigned int min_pages, bool oom_check)
 {
 	unsigned long csize = nr_pages * PAGE_SIZE;
 	struct mem_cgroup *mem_over_limit;
@@ -2154,18 +2154,18 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	} else
 		mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
 	/*
-	 * nr_pages can be either a huge page (HPAGE_PMD_NR), a batch
-	 * of regular pages (CHARGE_BATCH), or a single regular page (1).
-	 *
 	 * Never reclaim on behalf of optional batching, retry with a
 	 * single page instead.
 	 */
-	if (nr_pages == CHARGE_BATCH)
+	if (nr_pages > min_pages)
 		return CHARGE_RETRY;
 
 	if (!(gfp_mask & __GFP_WAIT))
 		return CHARGE_WOULDBLOCK;
 
+	if (gfp_mask & __GFP_NORETRY)
+		return CHARGE_NOMEM;
+
 	ret = mem_cgroup_reclaim(mem_over_limit, gfp_mask, flags);
 	if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
 		return CHARGE_RETRY;
@@ -2178,8 +2178,10 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 * unlikely to succeed so close to the limit, and we fall back
 	 * to regular pages anyway in case of failure.
 	 */
-	if (nr_pages == 1 && ret)
+	if (nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER) && ret) {
+		cond_resched();
 		return CHARGE_RETRY;
+	}
 
 	/*
 	 * At task move, charge accounts can be doubly counted. So, it's
@@ -2253,7 +2255,7 @@ again:
 		VM_BUG_ON(css_is_removed(&memcg->css));
 		if (mem_cgroup_is_root(memcg))
 			goto done;
-		if (nr_pages == 1 && consume_stock(memcg))
+		if (consume_stock(memcg, nr_pages))
 			goto done;
 		css_get(&memcg->css);
 	} else {
@@ -2278,7 +2280,7 @@ again:
 			rcu_read_unlock();
 			goto done;
 		}
-		if (nr_pages == 1 && consume_stock(memcg)) {
+		if (consume_stock(memcg, nr_pages)) {
 			/*
 			 * It seems dagerous to access memcg without css_get().
 			 * But considering how consume_stok works, it's not
@@ -2313,7 +2315,8 @@ again:
 			nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
 		}
 
-		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch, oom_check);
+		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch, nr_pages,
+		    oom_check);
 		switch (ret) {
 		case CHARGE_OK:
 			break;
-- 
1.7.7.3


