linux-mm.kvack.org archive mirror
From: Johannes Weiner <jweiner@redhat.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	Ying Han <yinghan@google.com>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 04/11] mm: memcg: per-priority per-zone hierarchy scan generations
Date: Tue, 20 Sep 2011 11:10:32 +0200
Message-ID: <20110920091032.GD11489@redhat.com>
In-Reply-To: <20110920084531.GB27675@tiehlicka.suse.cz>

On Tue, Sep 20, 2011 at 10:45:32AM +0200, Michal Hocko wrote:
> On Mon 12-09-11 12:57:21, Johannes Weiner wrote:
> > Memory cgroup limit reclaim currently picks one memory cgroup out of
> > the target hierarchy, remembers it as the last scanned child, and
> > reclaims all zones in it with decreasing priority levels.
> > 
> > Because the new hierarchy reclaim code will pick memory cgroups from
> > the same hierarchy concurrently from different zones and priority
> > levels, it becomes necessary that hierarchy roots not only remember
> > the last scanned child, but do so for each zone and priority level.
> > 
> > Furthermore, detecting full hierarchy round-trips reliably will become
> > crucial, so instead of counting on one iterator site seeing a certain
> > memory cgroup twice, use a generation counter that is increased every
> > time the child with the highest ID has been visited.
> 
> In principle I think the patch is good. I have some concerns about
> locking and I would really appreciate some more description (like you
> provided in the other email in this thread).

Okay, I'll incorporate that description into the changelog.

> > @@ -131,6 +136,8 @@ struct mem_cgroup_per_zone {
> >  	struct list_head	lists[NR_LRU_LISTS];
> >  	unsigned long		count[NR_LRU_LISTS];
> >  
> > +	struct mem_cgroup_iter_state iter_state[DEF_PRIORITY + 1];
> > +
> >  	struct zone_reclaim_stat reclaim_stat;
> >  	struct rb_node		tree_node;	/* RB tree node */
> >  	unsigned long long	usage_in_excess;/* Set to the value by which */
> [...]
> > @@ -781,9 +783,15 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
> >  	return memcg;
> >  }
> >  
> > +struct mem_cgroup_iter {
> 
> Wouldn't mem_cgroup_zone_iter_state be a better name? It is true it is
> rather long, but I find mem_cgroup_iter very confusing because the actual
> position is stored in the zone's state. The other thing is that it looks
> like we have two iterators in the mem_cgroup_iter function now, but in
> fact the iter parameter is just the state with which we start iteration.

Agreed, the naming is unfortunate.  How about
mem_cgroup_reclaim_cookie or something comparable?  It's limited to
reclaim anyway: hierarchy walkers that do not age the LRU lists should
not advance the shared iterator state, so we might as well encode that
in the name.

> > +	struct zone *zone;
> > +	int priority;
> > +	unsigned int generation;
> > +};
> > +
> >  static struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
> >  					  struct mem_cgroup *prev,
> > -					  bool remember)
> > +					  struct mem_cgroup_iter *iter)
> 
> I would rather see a different name for the last parameter
> (iter_state?).

I'm with you on this.  Will think something up.

> > @@ -804,10 +812,20 @@ static struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
> >  	}
> >  
> >  	while (!mem) {
> > +		struct mem_cgroup_iter_state *uninitialized_var(is);
> >  		struct cgroup_subsys_state *css;
> >  
> > -		if (remember)
> > -			id = root->last_scanned_child;
> > +		if (iter) {
> > +			int nid = zone_to_nid(iter->zone);
> > +			int zid = zone_idx(iter->zone);
> > +			struct mem_cgroup_per_zone *mz;
> > +
> > +			mz = mem_cgroup_zoneinfo(root, nid, zid);
> > +			is = &mz->iter_state[iter->priority];
> > +			if (prev && iter->generation != is->generation)
> > +				return NULL;
> > +			id = is->position;
> 
> Do we need any kind of locking here (spin_lock(&is->lock))?
> If two parallel reclaimers start on the same zone and priority they will
> see the same position and so bang on the same cgroup.

Note that last_scanned_child wasn't lock-protected before this series,
so there is no actual difference.

I can say, though, that during development I had a lock in there for
some time and it didn't make any difference for 32 concurrent
reclaimers on a quadcore.  Feel free to evaluate with higher
concurrency :)
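
For readers following along, here is a minimal user-space sketch of the
generation scheme discussed above. It is an illustration under stated
assumptions, not the kernel code: next_group(), NR_GROUPS, zone_state and
struct reclaim_cookie are hypothetical names, the hierarchy is reduced to
a fixed array of group indices for a single zone, and, as in the patch,
the shared state is not lock-protected.

```c
#include <assert.h>

#define DEF_PRIORITY 12
#define NR_GROUPS    4	/* hypothetical: number of groups in the hierarchy */

/* Shared per-zone, per-priority state (one zone only in this sketch). */
struct iter_state {
	int position;			/* next group index to hand out */
	unsigned int generation;	/* bumped once the last group was handed out */
};

/* Per-walker state: which slot to use and which generation it started in. */
struct reclaim_cookie {
	int priority;
	unsigned int generation;
};

static struct iter_state zone_state[DEF_PRIORITY + 1];

/*
 * Return the next group index for this walker, or -1 once the shared
 * generation has moved past the one the walker started in.
 * have_prev is 0 on a walker's first call and non-zero afterwards.
 */
static int next_group(int have_prev, struct reclaim_cookie *cookie)
{
	struct iter_state *is;
	int id;

	assert(cookie->priority >= 0 && cookie->priority <= DEF_PRIORITY);
	is = &zone_state[cookie->priority];

	/* Someone completed the round-trip since we started: stop. */
	if (have_prev && cookie->generation != is->generation)
		return -1;

	id = is->position;
	if (!have_prev)
		cookie->generation = is->generation;

	/* Advance the shared position; wrapping starts a new generation. */
	if (++is->position == NR_GROUPS) {
		is->position = 0;
		is->generation++;
	}
	return id;
}
```

Two walkers that interleave on the same priority slot split the groups
between them, and both stop as soon as the last group has been handed
out. A walker on a different priority uses its own slot and is not
affected.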
