Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd

From: Johannes Weiner <hannes@cmpxchg.org>
To: Ying Han <yinghan@google.com>
Cc: Michal Hocko <mhocko@suse.cz>, Mel Gorman <mel@csn.ul.ie>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Hillf Danton <dhillf@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Dan Magenheimer <dan.magenheimer@oracle.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
Date: Thu, 12 Apr 2012 19:44:20 +0200
Message-ID: <20120412174420.GN1787@cmpxchg.org>
In-Reply-To: <CALWz4ixVdamJX4DyaM-zWwp7enXfXLbMbAKLLVQ6FpcVPUiLsg@mail.gmail.com>

On Thu, Apr 12, 2012 at 09:45:47AM -0700, Ying Han wrote:
> On Thu, Apr 12, 2012 at 7:24 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Wed, Apr 11, 2012 at 09:06:27PM -0700, Ying Han wrote:
> >> On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> >> > On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
> >> >> Under global background reclaim, the sc->nr_to_reclaim is set to
> >> >> ULONG_MAX. Now we are iterating all memcgs under the zone and we
> >> >> shouldn't pass the pressure from kswapd for each memcg.
> >> >>
> >> >> After all, the balance_pgdat() breaks after reclaiming SWAP_CLUSTER_MAX
> >> >> pages to prevent building up reclaim priorities.
> >> >
> >> > shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
> >> > out of a priority loop, there is quite a difference.
> >> >
> >> > After this patch, kswapd no longer puts equal pressure on all zones in
> >> > the zonelist, which was a key reason why we could justify bailing
> >> > early out of individual zones in direct reclaim: kswapd will restore
> >> > fairness.
> >>
> >> Guess I see your point here.
> >>
> >> My intention is to prevent over-reclaim memcgs per-zone by having
> >> nr_to_reclaim to ULONG_MAX. Now, we scan each memcg based on
> >> get_scan_count() without bailout, do you see a problem w/o this patch?
> >
> > The fact that we iterate over each memcg does not make a difference,
> > because the target that get_scan_count() returns for each zone-memcg
> > is in sum what it would have returned for the whole zone, so the scan
> > aggressiveness did not increase.  It just distributes the zone's scan
> > target over the set of memcgs proportional to their share of pages in
> > that zone.
> >
> > So I have trouble deciding what's right.
> >
> > On the one hand, I don't see why you bother with this patch, because
> > you don't increase the risk of overreclaim.  Michal's concern for
> > overreclaim came from the fact that I had kswapd do soft limit reclaim
> > at priority 0 without ever bailing from individual zones.  But your
> > soft limit implementation is purely about selecting memcgs to reclaim,
> > you never increase the pressure put on a memcg anywhere.
> 
> I agree w/ you here.
> 
> >
> > On the other hand, I don't even agree with that aspect of your series;
> > that you no longer prioritize explicitly soft-limited groups in
> > excess over unconfigured groups, as I mentioned in the other mail.
> > But if you did, you would likely need a patch like this, I think.
> 
> Prioritize between memcg w/ default softlimit (0) and memcg exceeds
> non-default softlimit (x) ?

Yup:

	A ( soft = default, usage = 10 )
	B ( soft =       8, usage = 10 )

This is the "memory-nice this one workload" I was referring to in the
other mail.  It would have reclaimed B more aggressively than A in the
past.  After your patch, they will both be reclaimed equally, because
you change the default from "below soft limit" to "above soft limit".
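
To make the A/B example concrete, here is a minimal sketch in plain C.
It is not kernel code: struct group, in_excess() and the two
SOFT_DEFAULT_* macros are made-up names, and modelling the old default
as "unlimited" (ULONG_MAX) is an assumption. The only things taken from
this thread are the new default of 0 and the usage/soft values of A and
B.

```c
#include <stdbool.h>
#include <limits.h>

/* Illustrative model only; none of these names exist in the kernel. */
struct group {
	unsigned long soft_limit;	/* soft limit, in pages */
	unsigned long usage;		/* current usage, in pages */
};

/* Assumption: an unconfigured group used to behave as "unlimited". */
#define SOFT_DEFAULT_OLD	ULONG_MAX
/* Default discussed in this thread: 0, so any usage counts as excess. */
#define SOFT_DEFAULT_NEW	0UL

/* A group is a soft limit reclaim candidate when usage exceeds the limit. */
static bool in_excess(const struct group *g)
{
	return g->usage > g->soft_limit;
}
```

With usage = 10, A is not in excess under SOFT_DEFAULT_OLD but is under
SOFT_DEFAULT_NEW, while B (soft = 8) is in excess either way. That is
the loss of distinction between A and B described above.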

> Are you referring to the balance the reclaim between eligible memcgs
> based on different factors like softlimit_exceed, recent_scanned,
> recent_reclaimed....? If so, I am planning to make that as second step
> after this patch series.

Well, humm.  You potentially break existing setups.  It would be good
not to do that, even just temporarily.
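
For reference, the point quoted further up, that the per-memcg scan
targets add up to the zone's target, can be sketched as follows. This
is a simplified model under the assumption that a scan target is just
the LRU size shifted right by the reclaim priority. scan_target() and
zone_target_via_memcgs() are hypothetical helpers, not functions from
mm/vmscan.c.

```c
#include <stddef.h>

/* Simplified model: scan target = LRU pages >> priority. */
static unsigned long scan_target(unsigned long lru_pages, int priority)
{
	return lru_pages >> priority;
}

/* Sum of the per-memcg scan targets for one zone. */
static unsigned long zone_target_via_memcgs(const unsigned long *memcg_pages,
					    size_t nr, int priority)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		sum += scan_target(memcg_pages[i], priority);

	return sum;
}
```

In this model the sum over all memcgs never exceeds scan_target() of
the whole zone, since each shift only rounds down. Iterating memcgs
splits the zone's target by page share without making the scan more
aggressive.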

Thread overview: 8+ messages
2012-04-11 22:00 [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd Ying Han
2012-04-11 23:56 ` Johannes Weiner
2012-04-12  4:06   ` Ying Han
2012-04-12 14:24     ` Johannes Weiner
2012-04-12 16:45       ` Ying Han
2012-04-12 17:44         ` Johannes Weiner [this message]
2012-04-12 17:58           ` Ying Han
2012-04-15  1:57 ` Hillf Danton