From: Minchan Kim <minchan@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
Andrea Arcangeli <aarcange@redhat.com>,
Andi Kleen <andi@firstfloor.org>, Michal Hocko <mhocko@suse.cz>,
Tim Chen <tim.c.chen@linux.intel.com>,
kernel-team@fb.com
Subject: Re: [PATCH 07/10] mm: base LRU balancing on an explicit cost model
Date: Wed, 8 Jun 2016 17:14:21 +0900
Message-ID: <20160608081421.GC28620@bbox>
In-Reply-To: <20160606194836.3624-8-hannes@cmpxchg.org>
On Mon, Jun 06, 2016 at 03:48:33PM -0400, Johannes Weiner wrote:
> Currently, scan pressure between the anon and file LRU lists is
> balanced based on a mixture of reclaim efficiency and a somewhat vague
> notion of "value" of having certain pages in memory over others. That
> concept of value is problematic, because it has caused us to count any
> event that remotely makes one LRU list more or less preferable for
> reclaim, even when these events are not directly comparable to each
> other and impose very different costs on the system - such as a
> referenced file page that we still deactivate and a referenced
> anonymous page that we actually rotate back to the head of the list.
>
> There is also conceptual overlap with the LRU algorithm itself. By
> rotating recently used pages instead of reclaiming them, the algorithm
> already biases the applied scan pressure based on page value. Thus,
> when rebalancing scan pressure due to rotations, we should think of
> reclaim cost, and leave assessing the page value to the LRU algorithm.
>
> Lastly, considering both value-increasing as well as value-decreasing
> events can sometimes cause the same type of event to be counted twice,
> i.e. how rotating a page increases the LRU value, while reclaiming it
> successfully decreases the value. In itself this will balance out fine,
> but it quietly skews the impact of events that are only recorded once.
>
> The abstract metric of "value", the murky relationship with the LRU
> algorithm, and accounting both negative and positive events make the
> current pressure balancing model hard to reason about and modify.
>
> In preparation for thrashing-based LRU balancing, this patch switches
> to a balancing model of accounting the concrete, actually observed
> cost of reclaiming one LRU over another. For now, that cost includes
> pages that are scanned but rotated back to the list head. Subsequent
> patches will add consideration for IO caused by refaulting recently
> evicted pages. The idea is to primarily scan the LRU that thrashes the
> least, and secondarily scan the LRU that needs the least amount of
> work to free memory.
>
> Rename struct zone_reclaim_stat to struct lru_cost, and move from two
> separate value ratios for the LRU lists to a relative LRU cost metric
> with a shared denominator. Then make everything that affects the cost
> go through a new lru_note_cost() function.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> include/linux/mmzone.h | 23 +++++++++++------------
> include/linux/swap.h | 2 ++
> mm/swap.c | 15 +++++----------
> mm/vmscan.c | 35 +++++++++++++++--------------------
> 4 files changed, 33 insertions(+), 42 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 02069c23486d..4d257d00fbf5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -191,22 +191,21 @@ static inline int is_active_lru(enum lru_list lru)
> return (lru == LRU_ACTIVE_ANON || lru == LRU_ACTIVE_FILE);
> }
>
> -struct zone_reclaim_stat {
> - /*
> - * The pageout code in vmscan.c keeps track of how many of the
> - * mem/swap backed and file backed pages are referenced.
> - * The higher the rotated/scanned ratio, the more valuable
> - * that cache is.
> - *
> - * The anon LRU stats live in [0], file LRU stats in [1]
> - */
> - unsigned long recent_rotated[2];
> - unsigned long recent_scanned[2];
> +/*
> + * This tracks cost of reclaiming one LRU type - file or anon - over
> + * the other. As the observed cost of pressure on one type increases,
> + * the scan balance in vmscan.c tips toward the other type.
> + *
> + * The recorded cost for anon is in numer[0], file in numer[1].
> + */
> +struct lru_cost {
> + unsigned long numer[2];
> + unsigned long denom;
> };
>
> struct lruvec {
> struct list_head lists[NR_LRU_LISTS];
> - struct zone_reclaim_stat reclaim_stat;
> + struct lru_cost balance;
> /* Evictions & activations on the inactive file list */
> atomic_long_t inactive_age;
> #ifdef CONFIG_MEMCG
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 178f084365c2..c461ce0533da 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -295,6 +295,8 @@ extern unsigned long nr_free_pagecache_pages(void);
>
>
> /* linux/mm/swap.c */
> +extern void lru_note_cost(struct lruvec *lruvec, bool file,
> + unsigned int nr_pages);
> extern void lru_cache_add(struct page *);
> extern void lru_cache_putback(struct page *page);
> extern void lru_add_page_tail(struct page *page, struct page *page_tail,
> diff --git a/mm/swap.c b/mm/swap.c
> index 814e3a2e54b4..645d21242324 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -249,15 +249,10 @@ void rotate_reclaimable_page(struct page *page)
> }
> }
>
> -static void update_page_reclaim_stat(struct lruvec *lruvec,
> - int file, int rotated,
> - unsigned int nr_pages)
> +void lru_note_cost(struct lruvec *lruvec, bool file, unsigned int nr_pages)
> {
> - struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
> -
> - reclaim_stat->recent_scanned[file] += nr_pages;
> - if (rotated)
> - reclaim_stat->recent_rotated[file] += nr_pages;
> + lruvec->balance.numer[file] += nr_pages;
> + lruvec->balance.denom += nr_pages;

balance.numer[0] + balance.numer[1] == balance.denom always holds here,
so can we remove denom for now?
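
To make the point concrete, here is a minimal userspace sketch of the two
updates in the hunk above. It assumes lru_note_cost() is the only writer of
these fields and that nothing else (for example an aging step that scales
the fields separately) touches them; the rest of the function is not shown
in the quoted hunk, so that is an assumption. lru_note_cost_model() and
lru_cost_denom_is_redundant() are hypothetical names used only for
illustration, not kernel functions.

```c
#include <stdbool.h>

/* Userspace copy of the struct from the quoted patch. */
struct lru_cost {
	unsigned long numer[2];	/* [0] = anon, [1] = file */
	unsigned long denom;
};

/*
 * Mirrors only the two additions visible in the quoted hunk.
 * Assumption: no other code path modifies numer[] or denom.
 */
static void lru_note_cost_model(struct lru_cost *balance, bool file,
				unsigned int nr_pages)
{
	balance->numer[file] += nr_pages;
	balance->denom += nr_pages;
}

/*
 * Hypothetical check: true when denom carries no information beyond
 * the sum of the two numerators. Unsigned wraparound affects both
 * sides identically, so the comparison stays valid after overflow.
 */
static bool lru_cost_denom_is_redundant(const struct lru_cost *balance)
{
	return balance->numer[0] + balance->numer[1] == balance->denom;
}
```

Starting from a zeroed struct lru_cost, any sequence of
lru_note_cost_model() calls leaves lru_cost_denom_is_redundant() true. If a
later patch in the series updates the fields through another path, the
invariant may no longer hold and denom would then be needed.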