From: Johannes Weiner <jweiner@redhat.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] memcg: skip scanning active lists based on individual size
Date: Thu, 1 Sep 2011 08:15:40 +0200
Message-ID: <20110901061540.GA22561@redhat.com>
In-Reply-To: <20110901090931.c0721216.kamezawa.hiroyu@jp.fujitsu.com>

On Thu, Sep 01, 2011 at 09:09:31AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 31 Aug 2011 19:13:34 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
> 
> > On Wed, Aug 31, 2011 at 6:08 PM, Johannes Weiner <jweiner@redhat.com> wrote:
> > > Reclaim decides to skip scanning an active list when the corresponding
> > > inactive list is above a certain size in comparison to leave the
> > > assumed working set alone while there are still enough reclaim
> > > candidates around.
> > >
> > > The memcg implementation of comparing those lists instead reports
> > > whether the whole memcg is low on the requested type of inactive
> > > pages, considering all nodes and zones.
> > >
> > > This can lead to an oversized active list not being scanned because of
> > > the state of the other lists in the memcg, as well as an active list
> > > being scanned while its corresponding inactive list has enough pages.
> > >
> > > Not only is this wrong, it's also a scalability hazard, because the
> > > global memory state over all nodes and zones has to be gathered for
> > > each memcg and zone scanned.
> > >
> > > Make these calculations purely based on the size of the two LRU lists
> > > that are actually affected by the outcome of the decision.
> > >
> > > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > > Cc: Rik van Riel <riel@redhat.com>
> > > Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > Cc: Balbir Singh <bsingharora@gmail.com>
> > 
> > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> > 
> > I can't understand why memcg is designed for considering all nodes and zones.
> > Is it a mistake or on purpose?
> 
> It's on purpose. memcg just takes care of the amount of pages.

This mechanism isn't about memcg at all; it's an aging decision at a
much lower level.  Can you tell me how the old implementation is
supposed to work?

> But, hmm, this change may be good for softlimit and your work.

Yes, I noticed those paths showing up in a profile with my patches.
Lots of memcgs on a multi-node machine will trigger it too.  But that's
secondary; my primary reasoning was: this does not make sense at all.

> I'll ack when you add performance numbers in changelog.

It's not exactly a performance optimization, but I'll happily run some
workloads.  Do you have suggestions for what to test?  I.e., where
would you expect regressions?
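
For illustration only, the difference the quoted changelog describes
can be sketched in standalone C.  This is not the kernel code: struct
lru_pair, the helper names and the fixed ratio argument are all
hypothetical stand-ins, and the real decision derives its ratio from
the amount of memory involved.  The sketch only contrasts a decision
made on memcg-wide sums with one made on the single pair of LRU lists
being scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one zone's active/inactive LRU sizes, in pages. */
struct lru_pair {
	unsigned long active;
	unsigned long inactive;
};

/* The inactive list counts as "low" when it is small relative to the active list. */
static bool inactive_is_low(unsigned long active, unsigned long inactive,
			    unsigned long ratio)
{
	return inactive * ratio < active;
}

/*
 * Old-style decision (assumed shape): sum the lists of every zone in the
 * memcg, so each call walks all zones and one answer applies to all of them.
 */
static bool memcg_wide_inactive_is_low(const struct lru_pair *zones, size_t n,
				       unsigned long ratio)
{
	unsigned long active = 0, inactive = 0;
	size_t i;

	assert(zones != NULL && n > 0);
	for (i = 0; i < n; i++) {
		active += zones[i].active;
		inactive += zones[i].inactive;
	}
	return inactive_is_low(active, inactive, ratio);
}

/* Proposed-style decision: look only at the two lists affected by the outcome. */
static bool zone_inactive_is_low(const struct lru_pair *zone,
				 unsigned long ratio)
{
	return inactive_is_low(zone->active, zone->inactive, ratio);
}
```

With two zones of sizes {active 900, inactive 100} and {active 100,
inactive 900} and a ratio of 1, the memcg-wide sums are balanced, so
the oversized active list in the first zone would be skipped, while the
per-zone check flags it.  With {900, 100} and {900, 1000}, the
memcg-wide check reports "low" and the second zone's active list would
be scanned even though its own inactive list has enough pages.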
