Re: [RFC][PATCH -mm -v2 3/4] mm,vmscan: reclaim from the highest score cgroups

From: Rik van Riel <riel@redhat.com>
To: Ying Han <yinghan@google.com>
Cc: linux-mm@kvack.org, aquini@redhat.com, hannes@cmpxchg.org,
	mhocko@suse.cz, Mel Gorman <mel@csn.ul.ie>
Subject: Re: [RFC][PATCH -mm -v2 3/4] mm,vmscan: reclaim from the highest score cgroups
Date: Sat, 18 Aug 2012 00:02:34 -0400
Message-ID: <502F13DA.1090806@redhat.com>
In-Reply-To: <CALWz4ixRDG9biZrO2VXcvsCAYS5WS7CNwrw0i+3o1u+r1Ls_UQ@mail.gmail.com>

On 08/17/2012 08:26 PM, Ying Han wrote:

> Seems I should really look into the numbers, which I tried to avoid at
> the beginning... :(

It comes down to the same drawings we made on the whiteboard
back in April :)

> Here are the test cases off the top of my head, as well as the expected
> output; forget about the root cgroup for now:
>
> case 1. A & B above softlimit
>      a) score(B) > score(A), and keep reclaiming from B
>      b) as long as usage(B) > softlimit(B), no reclaim on A
>      c) until B under softlimit, reclaim from A

By reclaiming from (B), it is very possible (and likely) that
the score of (B) will be depressed below that of (A), after
which we will start reclaiming from (A).

This could happen even while both (A) and (B) are still over
their soft limits.

> case 2. A above softlimit and B under softlimit
>      a) score(A) > score(B), and keep reclaiming from A
>      b) as long as usage (A) > softlimit (A), no reclaim on B
>      c) until A under softlimit, then reclaim on both as case 3

Pretty much, yes.

If we have not scanned anything at all in (B), we might scan
SWAP_CLUSTER_MAX (32) pages in (B), but that will instantly reduce
B's score by a factor of 33 and get us to reclaim from (A) again.

That is 33 because we do a +1 in the calculation to avoid
division by zero :)
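
The arithmetic can be sketched in a few lines of C. This is a toy
model, not the code from the patch: it assumes the score is simply a
base value divided by (recent_scanned + 1). The names toy_score, base
and recent_scanned are made up for illustration.

```c
#include <stdint.h>

#define SWAP_CLUSTER_MAX 32UL

/*
 * Toy model (assumption, not the patch): the score is inversely
 * proportional to the number of recently scanned pages. The +1
 * keeps the division defined when nothing has been scanned yet.
 */
static uint64_t toy_score(uint64_t base, uint64_t recent_scanned)
{
	return base / (recent_scanned + 1);
}
```

With recent_scanned == 0 the divisor is 1; after one batch of
SWAP_CLUSTER_MAX pages it is 33, so the score in this model drops by a
factor of 33.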

> case 3. A & B under softlimit
>      a) score(B) > score(A), and keep reclaiming from B
>      b) there should be no reclaim happening on A.

Reclaiming from (B) will reduce B's score, so eventually we will
end up reclaiming from (A) again.

The more memory pressure one lruvec gets, the lower its score,
and the more likely that somebody else has a higher score.
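
A small simulation shows how that spreads the pressure around. Again
this is only a sketch under assumptions: the score formula is the toy
one (base / (recent_scanned + 1)), "base" stands in for whatever size
and soft limit weighting the real score uses, and all toy_* names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SWAP_CLUSTER_MAX 32UL

struct toy_lruvec {
	uint64_t base;           /* stand-in for size / soft limit weighting */
	uint64_t recent_scanned; /* pages scanned from this lruvec so far */
};

/* Toy score: more recent scanning means a lower score. */
static uint64_t toy_score(const struct toy_lruvec *v)
{
	return v->base / (v->recent_scanned + 1);
}

/* Index of the highest scoring lruvec; ties go to the lowest index. */
static size_t toy_pick(const struct toy_lruvec *vecs, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (toy_score(&vecs[i]) > toy_score(&vecs[best]))
			best = i;
	return best;
}

/* Each round scans one batch from whichever lruvec scores highest. */
static void toy_reclaim(struct toy_lruvec *vecs, size_t n, unsigned int rounds)
{
	while (rounds--) {
		size_t i = toy_pick(vecs, n);

		vecs[i].recent_scanned += SWAP_CLUSTER_MAX;
	}
}
```

Starting with base values of 1000 for (A) and 2000 for (B), the first
round scans (B), the second scans (A), and the third goes back to (B):
no lruvec keeps the top score for long once it is under pressure.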

> My patch delivers the functionality of case 2, but does not distribute
> the pressure across memcgs as this patch does (cases 1 & 3).  Also, in
> case 3 my patch would scan all the memcgs for nothing, whereas
> this patch will eventually pick a memcg to reclaim from. Not sure if
> that saves a lot, though.
>
> Over the three cases, I would say case 2 is the basic functionality we
> want to guarantee and the case 1 and case 3 are optimizations on top
> of that.

There is an additional optimization that becomes possible
with my approach, and not with round robin.

Some people want to run systems with hundreds, or even
thousands of memory cgroups. Having direct reclaim iterate
over all those cgroups could have a really bad impact on
direct reclaim latency.

Once we have a scoring mechanism, we can implement a further
optimization where we sort the lruvecs, adjusting their
priority as things happen (pages get allocated, freed or
scanned), instead of every time we go through the reclaim
code.

That way it will become possible to have a system that
truly scales to large numbers of cgroups.
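
One way such a sorted structure could look is a max-heap keyed on
score, where an event touches only the one lruvec whose score changed.
This is purely a sketch of the idea and not something from the patch
series: the toy_heap layout, the fixed TOY_MAX size and every function
name here are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 1024

struct toy_heap {
	uint64_t score[TOY_MAX]; /* score of the entry in heap slot i */
	size_t id[TOY_MAX];      /* which lruvec sits in heap slot i */
	size_t pos[TOY_MAX];     /* heap slot currently holding lruvec id */
	size_t n;
};

static void toy_swap(struct toy_heap *h, size_t a, size_t b)
{
	uint64_t s = h->score[a];
	size_t i = h->id[a];

	h->score[a] = h->score[b];
	h->id[a] = h->id[b];
	h->score[b] = s;
	h->id[b] = i;
	h->pos[h->id[a]] = a;
	h->pos[h->id[b]] = b;
}

static void toy_sift_up(struct toy_heap *h, size_t i)
{
	while (i > 0 && h->score[(i - 1) / 2] < h->score[i]) {
		toy_swap(h, i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
}

static void toy_sift_down(struct toy_heap *h, size_t i)
{
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, big = i;

		if (l < h->n && h->score[l] > h->score[big])
			big = l;
		if (r < h->n && h->score[r] > h->score[big])
			big = r;
		if (big == i)
			return;
		toy_swap(h, i, big);
		i = big;
	}
}

/* Add an lruvec; ids are handed out in insertion order. */
static size_t toy_add(struct toy_heap *h, uint64_t score)
{
	size_t id = h->n;

	if (id >= TOY_MAX)
		return (size_t)-1; /* heap is full */
	h->n++;
	h->score[id] = score;
	h->id[id] = id;
	h->pos[id] = id;
	toy_sift_up(h, id);
	return id;
}

/* Called when an event (alloc, free, scan) changes one lruvec's score. */
static void toy_update(struct toy_heap *h, size_t id, uint64_t score)
{
	h->score[h->pos[id]] = score;
	toy_sift_up(h, h->pos[id]);
	toy_sift_down(h, h->pos[id]);
}

/* The reclaim target is always at the root: no walk over all cgroups. */
static size_t toy_top(const struct toy_heap *h)
{
	return h->id[0];
}
```

In this sketch each score change costs O(log n) and picking the
reclaim target is O(1), instead of a walk over every cgroup on each
pass through reclaim. Locking and lruvec removal are left out.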

> I would like to run the test above and please help to clarify if they
> make sense.

The test makes sense to me.

-- 
All rights reversed
