From: Johannes Weiner <jweiner@redhat.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	Ying Han <yinghan@google.com>, Michal Hocko <mhocko@suse.cz>,
	Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 04/11] mm: memcg: per-priority per-zone hierarchy scan generations
Date: Wed, 14 Sep 2011 07:56:34 +0200
Message-ID: <20110914055634.GA28051@redhat.com>
In-Reply-To: <20110914095504.30fca5d0.kamezawa.hiroyu@jp.fujitsu.com>

On Wed, Sep 14, 2011 at 09:55:04AM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 13 Sep 2011 13:03:01 +0200
> Johannes Weiner <jweiner@redhat.com> wrote:
> 
> > On Tue, Sep 13, 2011 at 07:27:59PM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 12 Sep 2011 12:57:21 +0200
> > > Johannes Weiner <jweiner@redhat.com> wrote:
> > > 
> > > > Memory cgroup limit reclaim currently picks one memory cgroup out of
> > > > the target hierarchy, remembers it as the last scanned child, and
> > > > reclaims all zones in it with decreasing priority levels.
> > > > 
> > > > > Because the new hierarchy reclaim code will pick memory cgroups from
> > > > > the same hierarchy concurrently from different zones and priority
> > > > > levels, it becomes necessary that hierarchy roots not only remember
> > > > > the last scanned child, but do so for each zone and priority level.
> > > > 
> > > > Furthermore, detecting full hierarchy round-trips reliably will become
> > > > crucial, so instead of counting on one iterator site seeing a certain
> > > > memory cgroup twice, use a generation counter that is increased every
> > > > time the child with the highest ID has been visited.
> > > > 
> > > > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > > 
> > > > I cannot imagine how this works. Could you illustrate more with an easy example?
> > 
> > Previously, we did
> > 
> > 	mem = mem_cgroup_iter(root)
> > 	  for each priority level:
> > 	    for each zone in zonelist:
> > 
> > and this would reclaim memcg-1-zone-1, memcg-1-zone-2, memcg-1-zone-3
> > etc.
> > 
> yes.
> 
> > The new code does
> > 
> > 	for each priority level
> > 	  for each zone in zonelist
> > 	    mem = mem_cgroup_iter(root)
> > 
> > but with a single last_scanned_child per memcg, this would scan
> > memcg-1-zone-1, memcg-2-zone-2, memcg-3-zone-3 etc, which does not
> > make much sense.
> > 
> > Now imagine two reclaimers.  With the old code, the first reclaimer
> > would pick memcg-1 and scan all its zones, the second reclaimer would
> > pick memcg-2 and reclaim all its zones.  Without this patch, the first
> > reclaimer would pick memcg-1 and scan zone-1, the second reclaimer
> > would pick memcg-2 and scan zone-1, then the first reclaimer would
> > pick memcg-3 and scan zone-2.  If the reclaimers are concurrently
> > scanning at different priority levels, things are even worse because
> > one reclaimer may put much more force on the memcgs it gets from
> > mem_cgroup_iter() than the other reclaimer.  They must not share the
> > same iterator.
> > 
> > The generations are needed because the old algorithm did not rely too
> > much on detecting full round-trips.  After every reclaim cycle, it
> > checked the limit and broke out of the loop if enough was reclaimed,
> > no matter how many children were reclaimed from.  The new algorithm is
> > used for global reclaim, where the only exit condition of the
> > hierarchy reclaim is the full roundtrip, because equal pressure needs
> > to be applied to all zones.
> > 
> Hm, ok, maybe good for global reclaim.
> Is this used for both reclaim-by-limit and global-reclaim?

No, for limit reclaim the hierarchy iteration in shrink_zone() stops
after a single memcg, which is equivalent to the old code: scan all
zones at all priority levels from a memcg, then move on to the next
memcg.  This also works because of the per-zone per-priority
last_scanned_child:

	for each priority
	  for each zone
	    mem = mem_cgroup_iter(root)
	    scan(mem)

priority-12 + zone-1 will yield memcg-1.  priority-12 + zone-2 starts
at its own last_scanned_child, so yields memcg-1 as well, etc.  A
second reclaimer that comes in with priority-12 + zone-1 will receive
memcg-2 for scanning.  So there is no change in behaviour for limit
reclaim.
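
As a rough illustration of the paragraph above, here is a toy model in
Python.  It is not kernel code: HierarchyRoot, iter_next() and
limit_reclaim_pass() are made-up names, and the wrap-around rule is
only a sketch of "bump the generation once the child with the highest
ID has been visited", assuming each (zone, priority) pair keeps its
own position and generation.

```python
from collections import defaultdict


class HierarchyRoot:
    """Toy stand-in for a hierarchy root with one iterator per (zone, priority)."""

    def __init__(self, memcgs):
        self.memcgs = list(memcgs)          # children, in ID order
        self.position = defaultdict(int)    # (zone, priority) -> index of next child
        self.generation = defaultdict(int)  # (zone, priority) -> completed round-trips

    def iter_next(self, zone, priority):
        """Hypothetical analogue of mem_cgroup_iter(root) for one zone/priority."""
        key = (zone, priority)
        idx = self.position[key]
        memcg = self.memcgs[idx]
        if idx + 1 == len(self.memcgs):
            # The child with the highest ID was just handed out: wrap
            # around and bump the generation for this (zone, priority).
            self.position[key] = 0
            self.generation[key] += 1
        else:
            self.position[key] = idx + 1
        return memcg


def limit_reclaim_pass(root, zones, priorities):
    """One limit reclaimer: for each priority, for each zone, take one memcg."""
    return [(prio, zone, root.iter_next(zone, prio))
            for prio in priorities
            for zone in zones]


root = HierarchyRoot(["memcg-1", "memcg-2", "memcg-3"])
zones = ["zone-1", "zone-2"]
priorities = [12, 11]

first = limit_reclaim_pass(root, zones, priorities)   # first reclaimer
second = limit_reclaim_pass(root, zones, priorities)  # second reclaimer

print(sorted({memcg for _, _, memcg in first}))
print(sorted({memcg for _, _, memcg in second}))
```

In this model every (zone, priority) slot hands memcg-1 to the first
reclaimer and memcg-2 to the second, so each reclaimer still works on
a single memcg across all zones and priorities, as in the old code.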

> If so, I need to abandon node-selection-logic for reclaim-by-limit
> and nodemask-for-memcg, which show me very good results.
> I'll be sad ;)

With my clarification, do you still think so?
