From: Ryan Mallon <rmallon@gmail.com>
To: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>,
Rik van Riel <riel@redhat.com>, Jan Kara <jack@suse.cz>,
Vlastimil Babka <vbabka@suse.cz>,
Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
Andi Kleen <andi@firstfloor.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Greg Thelen <gthelen@google.com>,
Christoph Hellwig <hch@infradead.org>,
Hugh Dickins <hughd@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan.kim@gmail.com>,
Michel Lespinasse <walken@google.com>,
Seth Jennings <sjenning@linux.vnet.ibm.com>,
Roman Gushchin <klamm@yandex-team.ru>,
Ozgun Erdogan <ozgun@citusdata.com>,
Metin Doslu <metin@citusdata.com>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [patch 7/9] mm: thrash detection-based file cache sizing
Date: Tue, 26 Nov 2013 12:56:23 +1100 [thread overview]
Message-ID: <5293FFC7.5070907@gmail.com> (raw)
In-Reply-To: <1385336308-27121-8-git-send-email-hannes@cmpxchg.org>
On 25/11/13 10:38, Johannes Weiner wrote:
> The VM maintains cached filesystem pages on two types of lists. One
> list holds the pages recently faulted into the cache, the other list
> holds pages that have been referenced repeatedly on that first list.
> The idea is to prefer reclaiming young pages over those that have
> shown to benefit from caching in the past. We call the recently used
> list "inactive list" and the frequently used list "active list".
>
> Currently, the VM aims for a 1:1 ratio between the lists, which is the
> "perfect" trade-off between the ability to *protect* frequently used
> pages and the ability to *detect* frequently used pages. This means
> that working set changes bigger than half of cache memory go
> undetected and thrash indefinitely, whereas working sets bigger than
> half of cache memory are unprotected against used-once streams that
> don't even need caching.
>
> Historically, every reclaim scan of the inactive list also took a
> smaller number of pages from the tail of the active list and moved
> them to the head of the inactive list. This model gave established
> working sets more gracetime in the face of temporary use-once streams,
> but ultimately was not significantly better than a FIFO policy and
> still thrashed cache based on eviction speed, rather than actual
> demand for cache.
>
> This patch solves one half of the problem by decoupling the ability to
> detect working set changes from the inactive list size. By
> maintaining a history of recently evicted file pages it can detect
> frequently used pages with an arbitrarily small inactive list size,
> and subsequently apply pressure on the active list based on actual
> demand for cache, not just overall eviction speed.
>
> Every zone maintains a counter that tracks inactive list aging speed.
> When a page is evicted, a snapshot of this counter is stored in the
> now-empty page cache radix tree slot. On refault, the minimum access
> distance of the page can be assesed, to evaluate whether the page
> should be part of the active list or not.
>
> This fixes the VM's blindness towards working set changes in excess of
> the inactive list. And it's the foundation to further improve the
> protection ability and reduce the minimum inactive list size of 50%.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
<snip>
> + * fault ------------------------+
> + * |
> + * +--------------+ | +-------------+
> + * reclaim <- | inactive | <-+-- demotion | active | <--+
> + * +--------------+ +-------------+ |
> + * | |
> + * +-------------- promotion ------------------+
> + *
> + *
> + * Access frequency and refault distance
> + *
> + * A workload is trashing when its pages are frequently used but they
"thrashing".
~Ryan
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Ryan Mallon <rmallon@gmail.com>
To: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Chinner <david@fromorbit.com>,
Rik van Riel <riel@redhat.com>, Jan Kara <jack@suse.cz>,
Vlastimil Babka <vbabka@suse.cz>,
Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
Andi Kleen <andi@firstfloor.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Greg Thelen <gthelen@google.com>,
Christoph Hellwig <hch@infradead.org>,
Hugh Dickins <hughd@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan.kim@gmail.com>,
Michel Lespinasse <walken@google.com>,
Seth Jennings <sjenning@linux.vnet.ibm.com>,
Roman Gushchin <klamm@yandex-team.ru>,
Ozgun Erdogan <ozgun@citusdata.com>,
Metin Doslu <metin@citusdata.com>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [patch 7/9] mm: thrash detection-based file cache sizing
Date: Tue, 26 Nov 2013 12:56:23 +1100 [thread overview]
Message-ID: <5293FFC7.5070907@gmail.com> (raw)
In-Reply-To: <1385336308-27121-8-git-send-email-hannes@cmpxchg.org>
On 25/11/13 10:38, Johannes Weiner wrote:
> The VM maintains cached filesystem pages on two types of lists. One
> list holds the pages recently faulted into the cache, the other list
> holds pages that have been referenced repeatedly on that first list.
> The idea is to prefer reclaiming young pages over those that have
> shown to benefit from caching in the past. We call the recently used
> list "inactive list" and the frequently used list "active list".
>
> Currently, the VM aims for a 1:1 ratio between the lists, which is the
> "perfect" trade-off between the ability to *protect* frequently used
> pages and the ability to *detect* frequently used pages. This means
> that working set changes bigger than half of cache memory go
> undetected and thrash indefinitely, whereas working sets bigger than
> half of cache memory are unprotected against used-once streams that
> don't even need caching.
>
> Historically, every reclaim scan of the inactive list also took a
> smaller number of pages from the tail of the active list and moved
> them to the head of the inactive list. This model gave established
> working sets more gracetime in the face of temporary use-once streams,
> but ultimately was not significantly better than a FIFO policy and
> still thrashed cache based on eviction speed, rather than actual
> demand for cache.
>
> This patch solves one half of the problem by decoupling the ability to
> detect working set changes from the inactive list size. By
> maintaining a history of recently evicted file pages it can detect
> frequently used pages with an arbitrarily small inactive list size,
> and subsequently apply pressure on the active list based on actual
> demand for cache, not just overall eviction speed.
>
> Every zone maintains a counter that tracks inactive list aging speed.
> When a page is evicted, a snapshot of this counter is stored in the
> now-empty page cache radix tree slot. On refault, the minimum access
> distance of the page can be assesed, to evaluate whether the page
> should be part of the active list or not.
>
> This fixes the VM's blindness towards working set changes in excess of
> the inactive list. And it's the foundation to further improve the
> protection ability and reduce the minimum inactive list size of 50%.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
<snip>
> + * fault ------------------------+
> + * |
> + * +--------------+ | +-------------+
> + * reclaim <- | inactive | <-+-- demotion | active | <--+
> + * +--------------+ +-------------+ |
> + * | |
> + * +-------------- promotion ------------------+
> + *
> + *
> + * Access frequency and refault distance
> + *
> + * A workload is trashing when its pages are frequently used but they
"thrashing".
~Ryan
next prev parent reply other threads:[~2013-11-26 1:56 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-24 23:38 [patch 0/9] mm: thrash detection-based file cache sizing v6 Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-24 23:38 ` [patch 1/9] fs: cachefiles: use add_to_page_cache_lru() Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-24 23:38 ` [patch 2/9] lib: radix-tree: radix_tree_delete_item() Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-25 8:21 ` Minchan Kim
2013-11-25 8:21 ` Minchan Kim
2013-11-24 23:38 ` [patch 3/9] mm: shmem: save one radix tree lookup when truncating swapped pages Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-25 8:21 ` Minchan Kim
2013-11-25 8:21 ` Minchan Kim
2013-11-24 23:38 ` [patch 4/9] mm: filemap: move radix tree hole searching here Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-24 23:38 ` [patch 5/9] mm + fs: prepare for non-page entries in page cache radix trees Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-24 23:38 ` [patch 6/9] mm + fs: store shadow entries in page cache Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-25 23:17 ` Dave Chinner
2013-11-25 23:17 ` Dave Chinner
2013-11-26 10:20 ` Peter Zijlstra
2013-11-26 10:20 ` Peter Zijlstra
2013-11-27 16:45 ` Johannes Weiner
2013-11-27 16:45 ` Johannes Weiner
2013-11-27 17:08 ` Johannes Weiner
2013-11-27 17:08 ` Johannes Weiner
2013-11-27 23:32 ` Dave Chinner
2013-11-27 23:32 ` Dave Chinner
2013-11-24 23:38 ` [patch 7/9] mm: thrash detection-based file cache sizing Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-25 23:50 ` Andrew Morton
2013-11-25 23:50 ` Andrew Morton
2013-11-26 2:15 ` Johannes Weiner
2013-11-26 2:15 ` Johannes Weiner
2013-11-26 1:56 ` Ryan Mallon [this message]
2013-11-26 1:56 ` Ryan Mallon
2013-11-26 20:57 ` Johannes Weiner
2013-11-26 20:57 ` Johannes Weiner
2013-11-24 23:38 ` [patch 8/9] lib: radix_tree: tree node interface Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-24 23:38 ` [patch 9/9] mm: keep page cache radix tree nodes in check Johannes Weiner
2013-11-24 23:38 ` Johannes Weiner
2013-11-25 23:49 ` Dave Chinner
2013-11-25 23:49 ` Dave Chinner
2013-11-26 21:27 ` Johannes Weiner
2013-11-26 21:27 ` Johannes Weiner
2013-11-26 22:29 ` Dave Chinner
2013-11-26 22:29 ` Dave Chinner
2013-11-26 23:00 ` Johannes Weiner
2013-11-26 23:00 ` Johannes Weiner
2013-11-27 0:59 ` Dave Chinner
2013-11-27 0:59 ` Dave Chinner
2013-11-26 0:13 ` Andrew Morton
2013-11-26 0:13 ` Andrew Morton
2013-11-26 22:05 ` Johannes Weiner
2013-11-26 22:05 ` Johannes Weiner
2013-11-26 0:57 ` [patch 0/9] mm: thrash detection-based file cache sizing v6 Andrew Morton
2013-11-26 0:57 ` Andrew Morton
2013-11-26 22:30 ` Johannes Weiner
2013-11-26 22:30 ` Johannes Weiner
2013-11-28 4:40 ` Johannes Weiner
2013-11-28 4:40 ` Johannes Weiner
-- strict thread matches above, loose matches on Subject: below --
2013-12-02 19:21 [patch 0/9] mm: thrash detection-based file cache sizing v7 Johannes Weiner
2013-12-02 19:21 ` [patch 7/9] mm: thrash detection-based file cache sizing Johannes Weiner
2013-12-02 19:21 ` Johannes Weiner
2014-01-10 18:10 [patch 0/9] mm: thrash detection-based file cache sizing v8 Johannes Weiner
2014-01-10 18:10 ` [patch 7/9] mm: thrash detection-based file cache sizing Johannes Weiner
2014-01-10 18:10 ` Johannes Weiner
2014-01-10 22:51 ` Rik van Riel
2014-01-10 22:51 ` Rik van Riel
2014-01-10 22:51 ` Rik van Riel
2014-01-13 2:42 ` Minchan Kim
2014-01-13 2:42 ` Minchan Kim
2014-01-14 1:01 ` Bob Liu
2014-01-14 1:01 ` Bob Liu
2014-01-14 1:01 ` Bob Liu
2014-01-14 19:16 ` Johannes Weiner
2014-01-14 19:16 ` Johannes Weiner
2014-01-15 2:57 ` Bob Liu
2014-01-15 2:57 ` Bob Liu
2014-01-15 2:57 ` Bob Liu
2014-01-15 3:52 ` Zhang Yanfei
2014-01-15 3:52 ` Zhang Yanfei
2014-01-16 21:17 ` Johannes Weiner
2014-01-16 21:17 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5293FFC7.5070907@gmail.com \
--to=rmallon@gmail.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=david@fromorbit.com \
--cc=gthelen@google.com \
--cc=hannes@cmpxchg.org \
--cc=hch@infradead.org \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=klamm@yandex-team.ru \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=metin@citusdata.com \
--cc=mgorman@suse.de \
--cc=minchan.kim@gmail.com \
--cc=ozgun@citusdata.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=sjenning@linux.vnet.ibm.com \
--cc=tj@kernel.org \
--cc=vbabka@suse.cz \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.