From: Johannes Weiner <jweiner@redhat.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Balbir Singh <bsingharora@gmail.com>,
Ying Han <yinghan@google.com>, Greg Thelen <gthelen@google.com>,
Michel Lespinasse <walken@google.com>,
Rik van Riel <riel@redhat.com>,
Minchan Kim <minchan.kim@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 05/11] mm: move memcg hierarchy reclaim to generic reclaim code
Date: Tue, 20 Sep 2011 15:29:28 +0200 [thread overview]
Message-ID: <20110920132928.GA16338@redhat.com> (raw)
In-Reply-To: <20110920130915.GE27675@tiehlicka.suse.cz>
On Tue, Sep 20, 2011 at 03:09:15PM +0200, Michal Hocko wrote:
> On Mon 12-09-11 12:57:22, Johannes Weiner wrote:
> > Memory cgroup limit reclaim and traditional global pressure reclaim
> > will soon share the same code to reclaim from a hierarchical tree of
> > memory cgroups.
> >
> > In preparation of this, move the two right next to each other in
> > shrink_zone().
>
> I like the way how you've split mem_cgroup_hierarchical_reclaim into
> mem_cgroup_reclaim and mem_cgroup_soft_reclaim and I guess this deserves
> a note in the patch description. Especially that mem_cgroup_reclaim is
> hierarchical even though it doesn't use mem_cgroup_iter directly but
> rather via do_try_to_free_pages and shrink_zone.
>
> I am not sure I see how shrink_mem_cgroup_zone works. See comments and
> questions bellow:
>
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > ---
> > include/linux/memcontrol.h | 25 ++++++-
> > mm/memcontrol.c | 167 ++++++++++++++++++++++----------------------
> > mm/vmscan.c | 43 ++++++++++-
> > 3 files changed, 147 insertions(+), 88 deletions(-)
> >
> [...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index f4b404e..413e1f8 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> [...]
> > @@ -783,19 +781,33 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
> > return memcg;
> > }
> >
> > -struct mem_cgroup_iter {
> > - struct zone *zone;
> > - int priority;
> > - unsigned int generation;
> > -};
> > -
> > -static struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
> > - struct mem_cgroup *prev,
> > - struct mem_cgroup_iter *iter)
> > +/**
> > + * mem_cgroup_iter - iterate over memory cgroup hierarchy
> > + * @root: hierarchy root
> > + * @prev: previously returned memcg, NULL on first invocation
> > + * @iter: token for partial walks, NULL for full walks
> > + *
> > + * Returns references to children of the hierarchy starting at @root,
>
> I guess you meant "starting at @prev"
Nope, although it is a bit ambiguous, both the hierarchy as well as
the iteration start at @root. Unless @iter is specified, but then the
hierarchy still starts at @root.
I attached a patch below that fixes the phrasing.
> > @@ -1479,6 +1496,41 @@ u64 mem_cgroup_get_limit(struct mem_cgroup *memcg)
> > return min(limit, memsw);
> > }
> >
> > +static unsigned long mem_cgroup_reclaim(struct mem_cgroup *mem,
> > + gfp_t gfp_mask,
> > + unsigned long flags)
> > +{
> > + unsigned long total = 0;
> > + bool noswap = false;
> > + int loop;
> > +
> > + if (flags & MEM_CGROUP_RECLAIM_NOSWAP)
> > + noswap = true;
> > + else if (!(flags & MEM_CGROUP_RECLAIM_SHRINK) && mem->memsw_is_minimum)
> > + noswap = true;
> > +
> > + for (loop = 0; loop < MEM_CGROUP_MAX_RECLAIM_LOOPS; loop++) {
> > + if (loop)
> > + drain_all_stock_async(mem);
> > + total += try_to_free_mem_cgroup_pages(mem, gfp_mask, noswap);
> > + /*
> > + * Avoid freeing too much when shrinking to resize the
> > + * limit. XXX: Shouldn't the margin check be enough?
>
> I guess the MEM_CGROUP_RECLAIM_SHRINK condition should help shrinkers to
> die more easily on signal even if they make some progress.
Ah, that makes sense. I'll remove the XXX. Folded in the patch
below.
> > @@ -2104,12 +2104,43 @@ restart:
> > static void shrink_zone(int priority, struct zone *zone,
> > struct scan_control *sc)
> > {
> > - struct mem_cgroup_zone mz = {
> > - .mem_cgroup = sc->target_mem_cgroup,
> > + struct mem_cgroup *root = sc->target_mem_cgroup;
> > + struct mem_cgroup_iter iter = {
> > .zone = zone,
> > + .priority = priority,
> > };
> > + struct mem_cgroup *mem;
> > +
> > + if (global_reclaim(sc)) {
> > + struct mem_cgroup_zone mz = {
> > + .mem_cgroup = NULL,
> > + .zone = zone,
> > + };
> > +
> > + shrink_mem_cgroup_zone(priority, &mz, sc);
> > + return;
> > + }
> > +
> > + mem = mem_cgroup_iter(root, NULL, &iter);
> > + do {
> > + struct mem_cgroup_zone mz = {
> > + .mem_cgroup = mem,
> > + .zone = zone,
> > + };
> >
> > - shrink_mem_cgroup_zone(priority, &mz, sc);
> > + shrink_mem_cgroup_zone(priority, &mz, sc);
> > + /*
> > + * Limit reclaim has historically picked one memcg and
> > + * scanned it with decreasing priority levels until
> > + * nr_to_reclaim had been reclaimed. This priority
> > + * cycle is thus over after a single memcg.
> > + */
> > + if (!global_reclaim(sc)) {
>
> How can we have global_reclaim(sc) == true here?
We can't yet, but will when global reclaim and limit reclaim use that
same loop a patch or two later. I added it early to have an anchor
for the comment above it, so that it's obvious how limit reclaim
behaves, and that I don't have to change limit reclaim-specific
conditions when changing how global reclaim works.
> Shouldn't we just check how much have we reclaimed from that group and
> iterate only if it wasn't sufficient (at least SWAP_CLUSTER_MAX)?
I played with various exit conditions and in the end just left the
behaviour exactly like we had it before this patch, just that the
algorithm is now inside out. If you want to make such a fundamental
change, you have to prove it works :-)
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a7d14a5..349620c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -784,8 +784,8 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
* @prev: previously returned memcg, NULL on first invocation
* @iter: token for partial walks, NULL for full walks
*
- * Returns references to children of the hierarchy starting at @root,
- * or @root itself, or %NULL after a full round-trip.
+ * Returns references to children of the hierarchy below @root, or
+ * @root itself, or %NULL after a full round-trip.
*
* Caller must pass the return value in @prev on subsequent
* invocations for reference counting, or use mem_cgroup_iter_break()
@@ -1493,7 +1493,8 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *mem,
total += try_to_free_mem_cgroup_pages(mem, gfp_mask, noswap);
/*
* Avoid freeing too much when shrinking to resize the
- * limit. XXX: Shouldn't the margin check be enough?
+ * limit, and bail easily so that signals have a
+ * chance to kill the resizing task.
*/
if (total && (flags & MEM_CGROUP_RECLAIM_SHRINK))
break;
next prev parent reply other threads:[~2011-09-20 13:30 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-12 10:57 [patch 0/11] mm: memcg naturalization -rc3 Johannes Weiner
2011-09-12 10:57 ` [patch 01/11] mm: memcg: consolidate hierarchy iteration primitives Johannes Weiner
2011-09-12 22:37 ` Kirill A. Shutemov
2011-09-13 5:40 ` Johannes Weiner
2011-09-19 13:06 ` Michal Hocko
2011-09-13 10:06 ` KAMEZAWA Hiroyuki
2011-09-19 12:53 ` Michal Hocko
2011-09-20 8:45 ` Johannes Weiner
2011-09-20 8:53 ` Michal Hocko
2011-09-12 10:57 ` [patch 02/11] mm: vmscan: distinguish global reclaim from global LRU scanning Johannes Weiner
2011-09-12 23:02 ` Kirill A. Shutemov
2011-09-13 5:48 ` Johannes Weiner
2011-09-13 10:07 ` KAMEZAWA Hiroyuki
2011-09-19 13:23 ` Michal Hocko
2011-09-19 13:46 ` Michal Hocko
2011-09-20 8:52 ` Johannes Weiner
2011-09-12 10:57 ` [patch 03/11] mm: vmscan: distinguish between memcg triggering reclaim and memcg being scanned Johannes Weiner
2011-09-13 10:23 ` KAMEZAWA Hiroyuki
2011-09-19 14:29 ` Michal Hocko
2011-09-20 8:58 ` Johannes Weiner
2011-09-20 9:17 ` Michal Hocko
2011-09-29 7:55 ` Johannes Weiner
2011-09-12 10:57 ` [patch 04/11] mm: memcg: per-priority per-zone hierarchy scan generations Johannes Weiner
2011-09-13 10:27 ` KAMEZAWA Hiroyuki
2011-09-13 11:03 ` Johannes Weiner
2011-09-14 0:55 ` KAMEZAWA Hiroyuki
2011-09-14 5:56 ` Johannes Weiner
2011-09-14 7:40 ` KAMEZAWA Hiroyuki
2011-09-20 8:15 ` Michal Hocko
2011-09-20 8:45 ` Michal Hocko
2011-09-20 9:10 ` Johannes Weiner
2011-09-20 12:37 ` Michal Hocko
2011-09-12 10:57 ` [patch 05/11] mm: move memcg hierarchy reclaim to generic reclaim code Johannes Weiner
2011-09-13 10:31 ` KAMEZAWA Hiroyuki
2011-09-20 13:09 ` Michal Hocko
2011-09-20 13:29 ` Johannes Weiner [this message]
2011-09-20 14:08 ` Michal Hocko
2011-09-12 10:57 ` [patch 06/11] mm: memcg: remove optimization of keeping the root_mem_cgroup LRU lists empty Johannes Weiner
2011-09-13 10:34 ` KAMEZAWA Hiroyuki
2011-09-20 15:02 ` Michal Hocko
2011-09-29 9:20 ` Johannes Weiner
2011-09-29 9:49 ` Michal Hocko
2011-09-12 10:57 ` [patch 07/11] mm: vmscan: convert unevictable page rescue scanner to per-memcg LRU lists Johannes Weiner
2011-09-13 10:37 ` KAMEZAWA Hiroyuki
2011-09-21 12:33 ` Michal Hocko
2011-09-21 13:47 ` Johannes Weiner
2011-09-21 14:08 ` Michal Hocko
2011-09-12 10:57 ` [patch 08/11] mm: vmscan: convert global reclaim " Johannes Weiner
2011-09-13 10:41 ` KAMEZAWA Hiroyuki
2011-09-21 13:10 ` Michal Hocko
2011-09-21 13:51 ` Johannes Weiner
2011-09-21 13:57 ` Michal Hocko
2011-09-12 10:57 ` [patch 09/11] mm: collect LRU list heads into struct lruvec Johannes Weiner
2011-09-13 10:43 ` KAMEZAWA Hiroyuki
2011-09-21 13:43 ` Michal Hocko
2011-09-21 15:15 ` Michal Hocko
2011-09-12 10:57 ` [patch 10/11] mm: make per-memcg LRU lists exclusive Johannes Weiner
2011-09-13 10:47 ` KAMEZAWA Hiroyuki
2011-09-21 15:24 ` Michal Hocko
2011-09-21 15:47 ` Johannes Weiner
2011-09-21 16:05 ` Michal Hocko
2011-09-12 10:57 ` [patch 11/11] mm: memcg: remove unused node/section info from pc->flags Johannes Weiner
2011-09-13 10:50 ` KAMEZAWA Hiroyuki
2011-09-21 15:32 ` Michal Hocko
2011-09-13 20:35 ` [patch 0/11] mm: memcg naturalization -rc3 Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110920132928.GA16338@redhat.com \
--to=jweiner@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=bsingharora@gmail.com \
--cc=gthelen@google.com \
--cc=hch@infradead.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=minchan.kim@gmail.com \
--cc=nishimura@mxp.nes.nec.co.jp \
--cc=riel@redhat.com \
--cc=walken@google.com \
--cc=yinghan@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).