From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
Christoph Lameter <christoph@lameter.com>,
Wu Fengguang <wfg@mail.ustc.edu.cn>,
Nick Piggin <npiggin@suse.de>, Marijn Meijles <marijn@bitpit.net>,
Rik van Riel <riel@redhat.com>,
Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Subject: Re: [PATCH 10/9] clockpro-document.patch
Date: Sat, 31 Dec 2005 19:59:18 +0100 [thread overview]
Message-ID: <1136055558.17853.72.camel@twins> (raw)
In-Reply-To: <20051230223952.765.21096.sendpatchset@twins.localnet>
By popular request,
I'll finish it some time next year.
Best wishes all.
--- /dev/null 2003-12-29 19:37:00.000000000 +0100
+++ linux-2.6-git/Documentation/vm/clockpro.txt 2005-12-31 19:55:45.000000000 +0100
@@ -0,0 +1,97 @@
+This document describes the page replace algorithm as implemented in the linux
+kernel. It is based on CLOCK-Pro, found here:
+ http://www.cs.wm.edu/hpcs/WWW/HTML/publications/abs05-3.html
+
+ Base Algorithm Summary
+
+The algorithm is based on reuse distance as opposed to the recency familair
+from LRU. The reuse distance is the number of pages referenced between the
+current and previous page reference.
+
+It categorizes pages with a small reuse distance as hot and those with a large
+reuse distance as cold. The threshold between hot and cold is the test period,
+that is, if a page is referenced during its test period its reuse distance is
+small, ie. it becomes hot. The test period is the largest reuse distance of a
+hot page, which in turn depends on the number of resident cold pages.
+
+The number of resident cold pages is an adaptive target which is incremented
+when a page is referenced in its test period and decremented when a test
+period expires.
+
+Reclaim looks for unreferenced cold pages, for cold pages that are still in
+their test period the metadata is kept until the test period expires.
+
+In order to be able to compare reuse distance all pages are kept on one CLOCK
+however the management of the page state requires more than one hand.
+CLOCK-Pro has three, the following table gives the actions of each hand:
+
+ res | hot | tst | ref || Hcold | Hhot | Htst || Flt
+ ----+-----+-----+-----++--------+--------+--------++------
+ 1 | 1 | 0 | 1 || = 1101 | 1100 | = 1101 ||
+ 1 | 1 | 0 | 0 || = 1100 | 1000 | = 1100 ||
+ ----+-----+-----+-----++--------+--------+--------++------
+ 1 | 0 | 1 | 1 || 1100 | 1001 | 1001 ||
+ 1 | 0 | 1 | 0 || N 0010 | 1000 | 1000 ||
+ 1 | 0 | 0 | 1 || 1010 | = 1001 | = 1001 ||
+ 1 | 0 | 0 | 0 || X 0000 | = 1000 | = 1000 ||
+ ====+=====+=====+=====++========+========+========++======
+ 0 | 0 | 1 | 1 || | | || 1100
+ 0 | 0 | 1 | 0 || = 0010 | X 0000 | X 0000 ||
+ 0 | 0 | 0 | 1 || | | || 1010
+
+Where the first four columns give the page state and the next three columns
+give the new state when the respective hand moves along. The prefixes '=', 'N'
+and 'X' are used to indicate: state unchanged, page tracked as non-resident
+and remove page. The last column gives the state on page fault.
+
+The hand dynamics is as follows, reclaim rotates the cold hand and takes
+unreferenced cold pages. When during this rotation the actual number of cold
+pages drops below the target number of cold pages the hot hand is rotated.
+
+The hot hand demotes unreferenced hot pages to cold, and terminates the test
+period of pages is passes by. If however the total number of pages tracked
+rises above twice the total available resident pages the test hand is rotated.
+
+The test hand, like the hot hand, terminates the test period of any page it
+passes by. Remember that terminating the test period of a non-resident cold
+page removes it altogether, thus limiting the total pages tracked.
+
+
+ Implementation Notes
+
+Since pages reclaimed in one zone can end up being faulted back in another
+zone it is incorrect to have per-zone non-resident page tracking. Hence the
+resident and non-resident page tracking needs to be separated.
+
+In order to accomplish this the following is done; take two CLOCKs, a two
+handed one for resident pages and a single handed one for non-resident pages.
+The resident CLOCK's hands will reflect hand cold and the resident part of hand
+hot, the non-resident CLOCK's hand will reflect the non-resident part of hand
+hot. Hence the rotation speeds of the resident hand hot and non-resident hand
+are coupled so that when one has made a full revolution so will have the
+other.
+
+The functionality of hand test is accomplished by simply limiting the number
+of entries on the non-resident clock to the number of pages on the resident
+clock. When a new entry is added to an already full non-resident clock the
+oldest entry will be removed; ie. its test period terminated.
+
+This uncoupling of non-resident and resident pages has the effect that the
+exact position of the non-resident pages relative to the resident pages is
+lost. This uncertainty propagates to the duration of the test period, some
+pages will be terminated to soon, other too late. However this is a bounded
+error with an average of 0.
+
+This scheme can then be extended to multiple zones by scaling the rotation
+speed coupling between the resident hot hand and the non-resident hand to the
+zone size. That is, when all resident hot hands have made one full revolution
+so will the non-resident hand have.
+
+Demand paging introduces yet another problem, when a page is faulted into
+memory it effectivly doesn't matter what the referenced bit is set to, it will
+be used as soon as we finish the fault anyway. Hence the first check will
+always activate the page even though we have only had a single use. The
+classic use-once problem. To tackle this the pages can be inserted one state
+lower than normal and behind hand hot instead of behind hand cold, this so
+that hand hot cannot interfere with the lowered page state and the first
+reference is lost quicker.
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
Christoph Lameter <christoph@lameter.com>,
Wu Fengguang <wfg@mail.ustc.edu.cn>,
Nick Piggin <npiggin@suse.de>, Marijn Meijles <marijn@bitpit.net>,
Rik van Riel <riel@redhat.com>,
Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Subject: Re: [PATCH 10/9] clockpro-document.patch
Date: Sat, 31 Dec 2005 19:59:18 +0100 [thread overview]
Message-ID: <1136055558.17853.72.camel@twins> (raw)
In-Reply-To: <20051230223952.765.21096.sendpatchset@twins.localnet>
By popular request,
I'll finish it some time next year.
Best wishes all.
--- /dev/null 2003-12-29 19:37:00.000000000 +0100
+++ linux-2.6-git/Documentation/vm/clockpro.txt 2005-12-31 19:55:45.000000000 +0100
@@ -0,0 +1,97 @@
+This document describes the page replace algorithm as implemented in the linux
+kernel. It is based on CLOCK-Pro, found here:
+ http://www.cs.wm.edu/hpcs/WWW/HTML/publications/abs05-3.html
+
+ Base Algorithm Summary
+
+The algorithm is based on reuse distance as opposed to the recency familair
+from LRU. The reuse distance is the number of pages referenced between the
+current and previous page reference.
+
+It categorizes pages with a small reuse distance as hot and those with a large
+reuse distance as cold. The threshold between hot and cold is the test period,
+that is, if a page is referenced during its test period its reuse distance is
+small, ie. it becomes hot. The test period is the largest reuse distance of a
+hot page, which in turn depends on the number of resident cold pages.
+
+The number of resident cold pages is an adaptive target which is incremented
+when a page is referenced in its test period and decremented when a test
+period expires.
+
+Reclaim looks for unreferenced cold pages, for cold pages that are still in
+their test period the metadata is kept until the test period expires.
+
+In order to be able to compare reuse distance all pages are kept on one CLOCK
+however the management of the page state requires more than one hand.
+CLOCK-Pro has three, the following table gives the actions of each hand:
+
+ res | hot | tst | ref || Hcold | Hhot | Htst || Flt
+ ----+-----+-----+-----++--------+--------+--------++------
+ 1 | 1 | 0 | 1 || = 1101 | 1100 | = 1101 ||
+ 1 | 1 | 0 | 0 || = 1100 | 1000 | = 1100 ||
+ ----+-----+-----+-----++--------+--------+--------++------
+ 1 | 0 | 1 | 1 || 1100 | 1001 | 1001 ||
+ 1 | 0 | 1 | 0 || N 0010 | 1000 | 1000 ||
+ 1 | 0 | 0 | 1 || 1010 | = 1001 | = 1001 ||
+ 1 | 0 | 0 | 0 || X 0000 | = 1000 | = 1000 ||
+ ====+=====+=====+=====++========+========+========++======
+ 0 | 0 | 1 | 1 || | | || 1100
+ 0 | 0 | 1 | 0 || = 0010 | X 0000 | X 0000 ||
+ 0 | 0 | 0 | 1 || | | || 1010
+
+Where the first four columns give the page state and the next three columns
+give the new state when the respective hand moves along. The prefixes '=', 'N'
+and 'X' are used to indicate: state unchanged, page tracked as non-resident
+and remove page. The last column gives the state on page fault.
+
+The hand dynamics is as follows, reclaim rotates the cold hand and takes
+unreferenced cold pages. When during this rotation the actual number of cold
+pages drops below the target number of cold pages the hot hand is rotated.
+
+The hot hand demotes unreferenced hot pages to cold, and terminates the test
+period of pages is passes by. If however the total number of pages tracked
+rises above twice the total available resident pages the test hand is rotated.
+
+The test hand, like the hot hand, terminates the test period of any page it
+passes by. Remember that terminating the test period of a non-resident cold
+page removes it altogether, thus limiting the total pages tracked.
+
+
+ Implementation Notes
+
+Since pages reclaimed in one zone can end up being faulted back in another
+zone it is incorrect to have per-zone non-resident page tracking. Hence the
+resident and non-resident page tracking needs to be separated.
+
+In order to accomplish this the following is done; take two CLOCKs, a two
+handed one for resident pages and a single handed one for non-resident pages.
+The resident CLOCK's hands will reflect hand cold and the resident part of hand
+hot, the non-resident CLOCK's hand will reflect the non-resident part of hand
+hot. Hence the rotation speeds of the resident hand hot and non-resident hand
+are coupled so that when one has made a full revolution so will have the
+other.
+
+The functionality of hand test is accomplished by simply limiting the number
+of entries on the non-resident clock to the number of pages on the resident
+clock. When a new entry is added to an already full non-resident clock the
+oldest entry will be removed; ie. its test period terminated.
+
+This uncoupling of non-resident and resident pages has the effect that the
+exact position of the non-resident pages relative to the resident pages is
+lost. This uncertainty propagates to the duration of the test period, some
+pages will be terminated to soon, other too late. However this is a bounded
+error with an average of 0.
+
+This scheme can then be extended to multiple zones by scaling the rotation
+speed coupling between the resident hot hand and the non-resident hand to the
+zone size. That is, when all resident hot hands have made one full revolution
+so will the non-resident hand have.
+
+Demand paging introduces yet another problem, when a page is faulted into
+memory it effectivly doesn't matter what the referenced bit is set to, it will
+be used as soon as we finish the fault anyway. Hence the first check will
+always activate the page even though we have only had a single use. The
+classic use-once problem. To tackle this the pages can be inserted one state
+lower than normal and behind hand hot instead of behind hand cold, this so
+that hand hot cannot interfere with the lowered page state and the first
+reference is lost quicker.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2005-12-31 18:59 UTC|newest]
Thread overview: 117+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-12-30 22:40 [PATCH] vm: page-replace and clockpro Peter Zijlstra
2005-12-30 22:40 ` Peter Zijlstra
2005-12-30 22:40 ` [PATCH 01/14] page-replace-single-batch-insert.patch Peter Zijlstra
2005-12-30 22:40 ` Peter Zijlstra
2005-12-31 7:03 ` Marcelo Tosatti
2005-12-31 7:03 ` Marcelo Tosatti
2005-12-31 9:43 ` Peter Zijlstra
2005-12-31 9:43 ` Peter Zijlstra
2005-12-31 14:44 ` Rik van Riel
2005-12-31 14:44 ` Rik van Riel
2005-12-31 22:19 ` Marcelo Tosatti
2005-12-31 22:19 ` Marcelo Tosatti
2005-12-30 22:40 ` [PATCH 02/14] page-replace-try_pageout.patch Peter Zijlstra
2005-12-30 22:40 ` Peter Zijlstra
2005-12-30 22:40 ` [PATCH 03/14] page-replace-remove-sc-from-refill.patch Peter Zijlstra
2005-12-30 22:40 ` Peter Zijlstra
2005-12-30 22:40 ` [PATCH 04/14] page-replace-activate_page.patch Peter Zijlstra
2005-12-30 22:40 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 05/14] page-replace-remove-loop.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 06/14] page-replace-move-macros.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 07/14] page-replace-move-isolate_lru_pages.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 08/14] page-replace-candidates.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 09/14] page-replace-reinsert.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:41 ` [PATCH 10/14] page-replace-remove-mm_inline.patch Peter Zijlstra
2005-12-30 22:41 ` Peter Zijlstra
2005-12-30 22:42 ` [PATCH 11/14] page-replace-move-refill.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-30 22:42 ` [PATCH 12/14] page-replace-rotate.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-30 22:42 ` [PATCH 13/14] page-replace-init.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-30 22:42 ` [PATCH 14/14] page-replace-kswapd-incmin.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-31 1:15 ` Marcelo Tosatti
2005-12-31 1:15 ` Marcelo Tosatti
2005-12-31 9:40 ` Peter Zijlstra
2005-12-31 9:40 ` Peter Zijlstra
2005-12-30 22:42 ` [PATCH 1/9] clockpro-nonresident.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-31 1:13 ` Marcelo Tosatti
2005-12-31 1:13 ` Marcelo Tosatti
2005-12-31 9:54 ` Peter Zijlstra
2005-12-31 9:54 ` Peter Zijlstra
2005-12-31 14:53 ` Rik van Riel
2005-12-31 14:53 ` Rik van Riel
2005-12-31 22:20 ` Marcelo Tosatti
2005-12-31 22:20 ` Marcelo Tosatti
2005-12-30 22:42 ` [PATCH 2/9] clockpro-nonresident-del.patch Peter Zijlstra
2005-12-30 22:42 ` Peter Zijlstra
2005-12-30 22:43 ` [PATCH 3/9] clockpro-PG_test.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-30 22:43 ` [PATCH 4/9] clockpro-use-once.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-30 22:43 ` [PATCH 5/9] clockpro-ignore_token.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-30 22:43 ` [PATCH 6/9] clockpro-clockpro.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-31 0:24 ` Marcelo Tosatti
2005-12-31 0:24 ` Marcelo Tosatti
2005-12-31 1:22 ` Rik van Riel
2005-12-31 1:22 ` Rik van Riel
2005-12-31 3:27 ` Marcelo Tosatti
2005-12-31 3:27 ` Marcelo Tosatti
2005-12-31 5:24 ` Rik van Riel
2005-12-31 5:24 ` Rik van Riel
2005-12-31 10:57 ` Peter Zijlstra
2005-12-31 10:57 ` Peter Zijlstra
2005-12-31 10:48 ` Peter Zijlstra
2005-12-31 10:48 ` Peter Zijlstra
2005-12-31 22:12 ` Marcelo Tosatti
2005-12-31 22:12 ` Marcelo Tosatti
2006-01-03 19:30 ` Christoph Lameter
2006-01-03 19:30 ` Christoph Lameter
2005-12-31 11:29 ` Peter Zijlstra
2005-12-31 11:29 ` Peter Zijlstra
2006-01-05 9:47 ` IWAMOTO Toshihiro
2006-01-05 9:47 ` IWAMOTO Toshihiro
2006-01-05 13:32 ` Rik van Riel
2006-01-05 13:32 ` Rik van Riel
2006-01-06 9:01 ` IWAMOTO Toshihiro
2006-01-06 9:01 ` IWAMOTO Toshihiro
2006-01-24 6:30 ` IWAMOTO Toshihiro
2006-01-24 6:30 ` IWAMOTO Toshihiro
2006-01-24 7:25 ` IWAMOTO Toshihiro
2006-01-25 8:00 ` Peter Zijlstra
2006-02-03 9:25 ` Peter Zijlstra
2006-02-06 9:30 ` IWAMOTO Toshihiro
2006-02-06 10:07 ` Peter Zijlstra
2006-02-08 10:05 ` IWAMOTO Toshihiro
2006-02-08 20:00 ` Peter Zijlstra
2006-02-09 6:57 ` Peter Zijlstra
2006-02-09 7:22 ` IWAMOTO Toshihiro
2006-02-09 10:07 ` IWAMOTO Toshihiro
2006-02-09 15:23 ` Rik van Riel
2006-02-08 9:53 ` IWAMOTO Toshihiro
2005-12-31 22:40 ` Marcelo Tosatti
2005-12-31 22:40 ` Marcelo Tosatti
2006-01-01 10:37 ` Peter Zijlstra
2006-01-01 10:37 ` Peter Zijlstra
2006-01-03 12:21 ` Marcelo Tosatti
2006-01-03 12:21 ` Marcelo Tosatti
2006-02-14 7:29 ` IWAMOTO Toshihiro
2006-02-15 6:35 ` Peter Zijlstra
2006-02-16 6:25 ` IWAMOTO Toshihiro
2005-12-30 22:43 ` [PATCH 7/9] clockpro-remove-old.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-30 22:43 ` [PATCH 8/9] clockpro-rename_PG_active.patch Peter Zijlstra
2005-12-30 22:43 ` Peter Zijlstra
2005-12-30 22:44 ` [PATCH 9/9] clockpro-clockpro-stats.patch Peter Zijlstra
2005-12-30 22:44 ` Peter Zijlstra
2005-12-31 18:59 ` Peter Zijlstra [this message]
2005-12-31 18:59 ` [PATCH 10/9] clockpro-document.patch Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1136055558.17853.72.camel@twins \
--to=a.p.zijlstra@chello.nl \
--cc=akpm@osdl.org \
--cc=christoph@lameter.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=marcelo.tosatti@cyclades.com \
--cc=marijn@bitpit.net \
--cc=npiggin@suse.de \
--cc=riel@redhat.com \
--cc=wfg@mail.ustc.edu.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.