From: Ying Han <yinghan@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Tejun Heo <tj@kernel.org>, Pavel Emelyanov <xemul@openvz.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Li Zefan <lizf@cn.fujitsu.com>, Mel Gorman <mel@csn.ul.ie>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@suse.cz>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Zhu Yanhai <zhu.yanhai@gmail.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH V6 00/10] memcg: per cgroup background reclaim
Date: Wed, 20 Apr 2011 20:05:05 -0700
Message-ID: <BANLkTi=JTGngiosgEsWEo5A-xGAOeEpVGQ@mail.gmail.com>
In-Reply-To: <20110421025107.GG2333@cmpxchg.org>

On Wed, Apr 20, 2011 at 7:51 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:

> Hello Ying,
>
> I'm sorry that I chime in so late, I was still traveling until Monday.
>

Hey, hope you had a great trip :)

>
> On Mon, Apr 18, 2011 at 08:57:36PM -0700, Ying Han wrote:
> > The current implementation of memcg supports targeting reclaim when the
> > cgroup is reaching its hard_limit and we do direct reclaim per cgroup.
> > Per cgroup background reclaim is needed which helps to spread out memory
> > pressure over a longer period of time and smooths out the cgroup
> > performance.
>
> Latency reduction makes perfect sense, the reasons kswapd exists apply
> to memory control groups as well.  But I disagree with the design
> choices you made.
>
> > If the cgroup is configured to use per cgroup background reclaim, a
> > kswapd thread is created which only scans the per-memcg LRU list.
>
> We already have direct reclaim, direct reclaim on behalf of a memcg,
> and global kswapd-reclaim.  Please don't add yet another reclaim path
> that does its own thing and interacts unpredictably with the rest of
> them.
>

Yes, we do have per-memcg direct reclaim and kswapd reclaim, but the latter
is global and only runs under global memory pressure. We don't want to wait
for global memory pressure before we start reclaiming from each memcg.

>
> As discussed on LSF, we want to get rid of the global LRU.  So the
> goal is to have each reclaim entry end up at the same core part of
> reclaim that round-robin scans a subset of zones from a subset of
> memory control groups.
>

True, but that is for a system under global memory pressure, where we would
like to do targeted reclaim instead of reclaiming from the global LRU. This
patch does something different: it does targeted reclaim proactively, per
memcg, based on each memcg's hard_limit.

>
> > Two watermarks ("high_wmark", "low_wmark") are added to trigger the
> > background reclaim and stop it. The watermarks are calculated based
> > on the cgroup's limit_in_bytes.
>
> Which brings me to the next issue: making the watermarks configurable.
>
> You argued that having them adjustable from userspace is required for
> overcommitting the hardlimits and per-memcg kswapd reclaim not kicking
> in in case of global memory pressure.  But that is only a problem
> because global kswapd reclaim is (apart from soft limit reclaim)
> unaware of memory control groups.
>
> I think the much better solution is to make global kswapd memcg aware
> (with the above mentioned round-robin reclaim scheduler), compared to
> adding new (and final!) kernel ABI to avoid an internal shortcoming.
>

We do need to make the global kswapd memcg aware, and that is what
soft_limit hierarchical reclaim does. Per-memcg background reclaim is
different: there we want to reclaim memory from each memcg before it falls
into per-memcg direct reclaim.

>
> The whole exercise of asynchronous background reclaim is to reduce
> reclaim latency.  We already have a mechanism for global memory
> pressure in place.  Per-memcg watermarks should only exist to avoid
> direct reclaim due to hitting the hardlimit, nothing else.
>

Yes, but we have per-memcg direct reclaim, which is triggered by the
hard_limit. The latency we need to reduce is that of per-memcg direct
reclaim, which is different from global memory pressure.

>
> So in summary, I think converting the reclaim core to this round-robin
> scheduler solves all these problems at once: a single code path for
> reclaim, breaking up of the global lru lock, fair soft limit reclaim,
> and a mechanism for latency reduction that just DTRT without any
> user-space configuration necessary.
>

Not exactly. There will be cases where only a few cgroups are configured and
their total hard_limit is always less than the machine capacity, so we will
never trigger global memory pressure. However, we still need to smooth out
per-memcg performance by doing background page reclaim proactively, before
the cgroups hit their hard_limit (direct reclaim).
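
To make the intent concrete, here is a minimal userspace sketch of the
start/stop hysteresis, not the code from the patch series. The struct, the
function names, and the reading of the watermarks as headroom below the hard
limit (start when headroom drops below low_wmark, stop once it is back at
high_wmark, by analogy with the zone watermarks) are all assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cgroup state; not the structures used in the patches. */
struct memcg_sketch {
	uint64_t limit;      /* hard limit in bytes (limit_in_bytes) */
	uint64_t usage;      /* bytes currently charged to the cgroup */
	uint64_t low_wmark;  /* assumed: headroom below which reclaim starts */
	uint64_t high_wmark; /* assumed: headroom at which reclaim stops */
	bool reclaiming;     /* is background reclaim currently active? */
};

/* Bytes left before the cgroup hits its hard limit. */
static uint64_t headroom(const struct memcg_sketch *cg)
{
	return cg->usage >= cg->limit ? 0 : cg->limit - cg->usage;
}

/*
 * Re-evaluate the watermarks after a charge or uncharge.
 * Returns true while background reclaim should keep running.
 */
static bool bg_reclaim_update(struct memcg_sketch *cg)
{
	uint64_t room = headroom(cg);

	if (!cg->reclaiming && room < cg->low_wmark)
		cg->reclaiming = true;   /* getting close to the hard limit */
	else if (cg->reclaiming && room >= cg->high_wmark)
		cg->reclaiming = false;  /* enough headroom has been restored */

	return cg->reclaiming;
}
```

With limit = 1000, low_wmark = 100 and high_wmark = 200, reclaim would start
once usage goes above 900 and keep running until usage is back at 800 or
below, so a cgroup whose limits never add up to global memory pressure still
gets reclaimed before it reaches direct reclaim.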

--Ying


>
>        Hannes
>


