From: Mel Gorman <mgorman@suse.de>
To: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jiri Slaby <jslaby@suse.cz>,
Valdis Kletnieks <Valdis.Kletnieks@vt.edu>,
Rik van Riel <riel@redhat.com>,
Zlatko Calusic <zcalusic@bitsync.net>,
Johannes Weiner <hannes@cmpxchg.org>,
dormando <dormando@rydia.net>,
Satoru Moriya <satoru.moriya@hds.com>,
Michal Hocko <mhocko@suse.cz>, Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 02/10] mm: vmscan: Obey proportional scanning requirements for kswapd
Date: Wed, 10 Apr 2013 15:08:24 +0100
Message-ID: <20130410140824.GC3710@suse.de>
In-Reply-To: <516511DF.5020805@jp.fujitsu.com>
On Wed, Apr 10, 2013 at 04:16:47PM +0900, Kamezawa Hiroyuki wrote:
> (2013/04/09 20:06), Mel Gorman wrote:
> > Simplistically, the anon and file LRU lists are scanned proportionally
> > depending on the value of vm.swappiness although there are other factors
> > taken into account by get_scan_count(). The patch "mm: vmscan: Limit
> > the number of pages kswapd reclaims" limits the number of pages kswapd
> > reclaims but it breaks this proportional scanning and may evenly shrink
> > anon/file LRUs regardless of vm.swappiness.
> >
> > This patch preserves the proportional scanning and reclaim. It does mean
> > that kswapd will reclaim more than requested but the number of pages will
> > be related to the high watermark.
> >
> > [mhocko@suse.cz: Correct proportional reclaim for memcg and simplify]
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > Acked-by: Rik van Riel <riel@redhat.com>
> > ---
> > mm/vmscan.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--------
> > 1 file changed, 46 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 4835a7a..0742c45 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1825,13 +1825,21 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
> > enum lru_list lru;
> > unsigned long nr_reclaimed = 0;
> > unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> > + unsigned long nr_anon_scantarget, nr_file_scantarget;
> > struct blk_plug plug;
> > + bool scan_adjusted = false;
> >
> > get_scan_count(lruvec, sc, nr);
> >
> > + /* Record the original scan target for proportional adjustments later */
> > + nr_file_scantarget = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE] + 1;
> > + nr_anon_scantarget = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON] + 1;
> > +
>
> I'm sorry I couldn't understand the calc...
>
> Assume here
> nr_file_scantarget = 100
> nr_anon_file_target = 100.
>
I think you might have meant nr_anon_scantarget here instead of
nr_anon_file_target.
>
> > blk_start_plug(&plug);
> > while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
> > nr[LRU_INACTIVE_FILE]) {
> > + unsigned long nr_anon, nr_file, percentage;
> > +
> > for_each_evictable_lru(lru) {
> > if (nr[lru]) {
> > nr_to_scan = min(nr[lru], SWAP_CLUSTER_MAX);
> > @@ -1841,17 +1849,47 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
> > lruvec, sc);
> > }
> > }
> > +
> > + if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
> > + continue;
> > +
> > /*
> > - * On large memory systems, scan >> priority can become
> > - * really large. This is fine for the starting priority;
> > - * we want to put equal scanning pressure on each zone.
> > - * However, if the VM has a harder time of freeing pages,
> > - * with multiple processes reclaiming pages, the total
> > - * freeing target can get unreasonably large.
> > + * For global direct reclaim, reclaim only the number of pages
> > + * requested. Less care is taken to scan proportionally as it
> > + * is more important to minimise direct reclaim stall latency
> > + * than it is to properly age the LRU lists.
> > */
> > - if (nr_reclaimed >= nr_to_reclaim &&
> > - sc->priority < DEF_PRIORITY)
> > + if (global_reclaim(sc) && !current_is_kswapd())
> > break;
> > +
> > + /*
> > + * For kswapd and memcg, reclaim at least the number of pages
> > + * requested. Ensure that the anon and file LRUs shrink
> > + * proportionally what was requested by get_scan_count(). We
> > + * stop reclaiming one LRU and reduce the amount scanning
> > + * proportional to the original scan target.
> > + */
> > + nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
> > + nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
> > +
>
> Then, nr_file = 80, nr_anon=70.
>
As we scan evenly in SWAP_CLUSTER_MAX groups of pages, this wouldn't happen,
but for the purposes of discussion, let's assume it did.
>
> > + if (nr_file > nr_anon) {
> > + lru = LRU_BASE;
> > + percentage = nr_anon * 100 / nr_anon_scantarget;
> > + } else {
> > + lru = LRU_FILE;
> > + percentage = nr_file * 100 / nr_file_scantarget;
> > + }
>
> the percentage will be 70.
>
Yes.
> > +
> > + /* Stop scanning the smaller of the LRU */
> > + nr[lru] = 0;
> > + nr[lru + LRU_ACTIVE] = 0;
> > +
>
> this will stop anon scan.
>
Yes.
> > + /* Reduce scanning of the other LRU proportionally */
> > + lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
> > + nr[lru] = nr[lru] * percentage / 100;
> > + nr[lru + LRU_ACTIVE] = nr[lru + LRU_ACTIVE] * percentage / 100;
> > +
>
> finally, in the next iteration,
>
> nr[file] = 80 * 0.7 = 56.
>
> After loop, anon-scan is 30 pages , file-scan is 76(20+56) pages..
>
Well spotted, this would indeed reclaim too many pages from the other
LRU. I wanted to avoid recording the original scan targets as it's an
extra 40 bytes on the stack but it's unavoidable.
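To make the arithmetic in the example concrete, here is a rough userspace
sketch, not kernel code. The helper name scaled_remainder_total() is made up
for illustration, and the 100/80/70 figures are the hypothetical ones from
the discussion above. It computes how many pages end up scanned from one LRU
type when only its remaining count is scaled by the percentage.

```c
/*
 * Illustrative userspace helper, not kernel code (hypothetical name).
 *
 * target:     original scan target for this LRU type
 * remaining:  pages still left to scan when the adjustment happens
 * percentage: scaling applied to the remaining count, as in the
 *             quoted patch (0 means scanning of this type stops)
 *
 * Returns the total pages scanned from this type: what was already
 * scanned plus the scaled remainder.
 */
static unsigned long scaled_remainder_total(unsigned long target,
					    unsigned long remaining,
					    unsigned long percentage)
{
	unsigned long already_scanned = target - remaining;

	return already_scanned + remaining * percentage / 100;
}
```

With the numbers above, scaled_remainder_total(100, 80, 70) is 20 + 56 = 76
for file, while anon stops at 100 - 70 = 30 pages scanned.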
> I think the calc here should be
>
> nr[lru] = nr_lru_scantarget * percentage / 100 - nr[lru]
>
> Here, 80-70=10 more pages to scan..should be proportional.
>
nr[lru] at the end there is the number of pages remaining to be scanned,
not the number of pages scanned already. Did you mean something like this?

	nr[lru] = scantarget[lru] * percentage / 100 - (scantarget[lru] - nr[lru])

With care taken to ensure we do not underflow? Something like

	unsigned long nr[NR_LRU_LISTS];
	unsigned long targets[NR_LRU_LISTS];
	unsigned long nr_scanned;
	...
	memcpy(targets, nr, sizeof(nr));
	...
	/* Record pages already scanned before nr[lru] is overwritten */
	nr_scanned = targets[lru] - nr[lru];
	nr[lru] = targets[lru] * percentage / 100;
	nr[lru] -= min(nr[lru], nr_scanned);

	lru += LRU_ACTIVE;
	nr_scanned = targets[lru] - nr[lru];
	nr[lru] = targets[lru] * percentage / 100;
	nr[lru] -= min(nr[lru], nr_scanned);

?
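
As a standalone illustration of the underflow concern, the expression above
can be written as a small userspace function. This is only a sketch under
stated assumptions: adjust_remaining() is a hypothetical name, percentage
keeps the meaning it has in the quoted patch, and the already-scanned count
is captured before the new value is computed so the subtraction cannot wrap.

```c
/*
 * Hypothetical userspace helper, not kernel code.
 *
 * Computes target * percentage / 100 - (target - remaining), clamped
 * at zero so the unsigned arithmetic never underflows.
 */
static unsigned long adjust_remaining(unsigned long target,
				      unsigned long remaining,
				      unsigned long percentage)
{
	/* Pages already scanned, taken from the original target. */
	unsigned long scanned = target - remaining;
	/* Scaled target, per the expression in the mail. */
	unsigned long goal = target * percentage / 100;

	/* Equivalent to goal -= min(goal, scanned). */
	return goal - (scanned < goal ? scanned : goal);
}
```

For a target of 100 with 80 pages remaining and percentage 70, this gives
70 - 20 = 50. If 90 pages had already been scanned (10 remaining), the
result is clamped to 0 rather than wrapping around.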
--
Mel Gorman
SUSE Labs