All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-mm@kvack.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: [PATCH 25/39] mm: pgrep: documentation
Date: Wed, 12 Jul 2006 16:41:54 +0200	[thread overview]
Message-ID: <20060712144154.16998.1788.sendpatchset@lappy> (raw)
In-Reply-To: <20060712143659.16998.6444.sendpatchset@lappy>

From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>

Documentation for the page replace framework.

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

 Documentation/vm/pgrepment_api.txt |  216 +++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)

Index: linux-2.6/Documentation/vm/pgrepment_api.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/Documentation/vm/pgrepment_api.txt	2006-07-12 16:09:19.000000000 +0200
@@ -0,0 +1,216 @@
+		Page Replacement Policy Interface
+
+Introduction
+============
+
+This document describes the page replacement interfaces used by the
+virtual memory subsystem.
+
+When the system's free memory runs below a certain threshold, an action
+must be initiated to reclaim memory for future use. The decision of
+which memory pages to evict is called the replacement policy.
+
+There are several types of reclaimable objects which live in the
+system's memory:
+
+a) file cache pages
+b) anonymous process pages
+c) shared memory (shm) pages
+d) SLAB cache pages, used for internal kernel objects such as the inode
+and dentry caches.
+
+The policy API abstracts the replacement structure for pagecache objects
+(items a) b) and c)), separating it from the reclaim code path.
+
+This allows maintenance of different policies to deal with different
+workload requirements.
+
+Zoned VM
+========
+
+In Linux, physical memory is managed separately into zones, because
+certain types of allocations are address constrained.
+
+The operating system has to support types of hardware which cannot
+access full 32-bit addresses, but are limited to an address mask. For
+instance, ISA devices can only address the lower 24-bits (hence their
+visilibity goes up to 16MB).
+
+Additionally, pages used for internal kernel data must be restricted to
+the direct kernel mapping, which is approximately 1GB in current default
+configurations.
+
+Different zones must be managed separately from the perspective of page
+reclaim path, because particular zones might suffer more pressure than
+others.
+
+This means that the page replacement structures have to be maintained
+separately for each zone.
+
+Description
+===========
+
+The page replacement policy interface consists off a set of operations
+which are invoked from the common VM code.
+
+As mentioned before, the policy specific data has to be maintained
+separately for each zone, therefore "struct zone" embeds the following
+data structure:
+
+	struct pgrep_data policy;
+
+Which is to be defined by the policy in a separate header file.
+
+At the moment, this data structure is guarded by the "zone->lru_lock"
+spinlock, thus shared by all policies.
+
+Initialization (invoked during system bootup)
+--------------
+
+ * void __init pgrep_init(void)
+
+Policy private initialization.
+
+ * void __init pgrep_init_zone(struct zone *)
+
+Initialize zone specific policy data.
+
+
+Methods called by the VM
+------------------------
+
+ * void pgrep_hint_active(struct page *);
+ * void pgrep_hint_use_once(struct page *);
+
+Give the policy hints as to the importance of the page. These hints can
+be viewed as initial priority of page, where active is +1 and use_once -1.
+
+
+ * void fastcall pgrep_add(struct page *);
+
+Insert page into per-CPU list(s), used for batching groups of pages to
+relief zone->lru_lock contention. Called during page instantiation.
+
+
+ * void pgrep_add_drain(void);
+ * void pgrep_add_drain_cpu(unsigned int);
+
+Drain the per-CPU lists(s), pushing pages to the actual cache.
+Called in locations where it is important to not have stale data
+into the per-CPU lists.
+
+
+ * void pagevec_pgrep_add(struct pagevec *);
+ * void __pagevec_pgrep_add(struct pagevec *);
+
+Insert a whole pagevec worth of pages directly.
+
+
+ * void pgrep_get_candidates(struct zone *, int, struct list_head *);
+
+Select candidates for eviction from the specified zone.
+
+@zone: which memory zone to scan for.
+@nr_to_scan: number of pages to scan.
+@page_list: list_head to add select pages
+
+Called by mm/vmscan.c::shrink_cache(), the main function used to
+evict pagecache pages from a specific zone.
+
+
+ * reclaim_t pgrep_reclaimable(struct page *);
+
+Determines wether a page is reclaimable, used by shrink_list().
+This function encapsulates the call to page_referenced.
+
+
+ * void pgrep_activate(struct page *);
+
+Callback used to let the policy know this page was referenced.
+
+
+ * void pgrep_put_candidates(struct zone *, struct list_head *);
+
+Put unfreeable pages back into the zone's cache mgmt structures.
+
+@zone: memory zone which pages belong
+@page_list: list of pages to reinsert
+
+
+ * void pgrep_remove(struct zone *, struct page *);
+
+Remove page from cache. This function clears the page state.
+
+
+ * int pgrep_isolate(struct page *);
+
+Isolate a specified page; ie. remove it from the cache mgmt structures without
+clearing its page state (used for page migration).
+
+
+ * void pgrep_reinsert(struct list_head *);
+
+Reinsert a list of pages previously isolated by pgrep_isolate().
+Remember that these pages still have their page state; this property
+distinguishes this function from pgrep_add().
+NOTE: the pages on the list need not be in the same zone.
+
+
+ * void __pgrep_rotate_reclaimable(struct zone *, struct page *);
+
+Place this page so that it will be in the next candidate batch.
+
+
+ * void pgrep_remember(struct zone *, struct page*);
+ * void pgrep_forget(struct address_space *, unsigned long);
+
+Hooks for nonresident page management. Allows the policy to remember and
+forget about pages that are no longer resident.
+
+ * void pgrep_show(struct zone *);
+ * void pgrep_zoneinfo(struct zone *, struct seq_file *);
+
+Prints zoneinfo in the various ways.
+
+* void __pgrep_counts(unsigned long *, unsigned long *,
+			     unsigned long *, struct pglist_data *);
+
+Gives 'active', 'inactive' and free count in pages for the selected pgdat.
+Where active/inactive are open for interpretation of the policy.
+
+ * unsigned long __pgrep_nr_pages(struct zone *);
+
+Gives the total number of pages currently managed by the page replacement
+policy.
+
+
+ * unsigned long __pgrep_nr_scan(struct zone *);
+
+Gives the number of pages needed to drive the scanning.
+
+Helpers
+-------
+
+Certain helpers are shared by all policies, follows a description of them:
+
+1) int should_reclaim_mapped(struct zone *);
+
+The point of this algorithm is to decide when to start reclaiming mapped
+memory instead of clean pagecache.
+
+Returns 1 if mapped pages should be candidates for reclaim, 0 otherwise.
+
+Page flags
+----------
+
+A number of bits in page->flags are reserved for the page replacement
+policies, they are:
+
+	PG_reclaim1	/* bit 6 */
+	PG_reclaim2	/* bit 20 */
+	PG_reclaim3	/* bit 21 */
+
+The policy private semantics of this bits are to be defined in
+the policy implementation. This bits are internal to the policy and as such
+should not be interpreted in any way by external code.
+

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2006-07-12 14:41 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-07-12 14:36 [PATCH 0/39] mm: 2.6.17-pr1 - generic page-replacement framework and 4 new policies Peter Zijlstra
2006-07-12 14:37 ` [PATCH 1/39] mm: disuse activate_page() Peter Zijlstra
2006-07-12 14:37 ` [PATCH 2/39] mm: adjust blk_congestion_wait() logic Peter Zijlstra
2006-07-12 14:37 ` [PATCH 3/39] mm: pgrep: prepare for page replace framework Peter Zijlstra
2006-07-12 14:37 ` [PATCH 4/39] mm: pgrep: convert insertion Peter Zijlstra
2006-07-12 14:37 ` [PATCH 5/39] mm: pgrep: add a use-once insertion hint Peter Zijlstra
2006-07-12 14:38 ` [PATCH 6/39] mm: pgrep: generice __pagevec_*_add Peter Zijlstra
2006-07-12 14:38 ` [PATCH 7/39] mm: pgrep: abstract the activation logic Peter Zijlstra
2006-07-12 14:38 ` [PATCH 8/39] mm: pgrep: move useful macros around Peter Zijlstra
2006-07-12 14:38 ` [PATCH 9/39] mm: pgrep: move struct scan_control around Peter Zijlstra
2006-07-12 14:38 ` [PATCH 10/39] mm: pgrep: isolate the reclaim_mapped logic Peter Zijlstra
2006-07-12 14:39 ` [PATCH 11/39] mm: pgrep: replace mark_page_accessed Peter Zijlstra
2006-07-12 14:39 ` [PATCH 12/39] mm: pgrep: move the shrink logic Peter Zijlstra
2006-07-12 14:39 ` [PATCH 13/39] mm: pgrep: abstract rotate_reclaimable_page() Peter Zijlstra
2006-07-12 14:39 ` [PATCH 14/39] mm: pgrep: manage page-state Peter Zijlstra
2006-07-12 14:39 ` [PATCH 15/39] mm: pgrep: abstract page removal Peter Zijlstra
2006-07-12 14:40 ` [PATCH 16/39] mm: pgrep: remove mm_inline.h Peter Zijlstra
2006-07-12 14:40 ` [PATCH 17/39] mm: pgrep: re-insertion logic Peter Zijlstra
2006-07-12 14:40 ` [PATCH 18/39] mm: pgrep: initialisation hooks Peter Zijlstra
2006-07-12 14:40 ` [PATCH 19/39] mm: pgrep: info functions Peter Zijlstra
2006-07-12 14:40 ` [PATCH 20/39] mm: pgrep: page count functions Peter Zijlstra
2006-07-12 14:41 ` [PATCH 21/39] mm: pgrep: per policy data Peter Zijlstra
2006-07-12 14:41 ` [PATCH 22/39] mm: pgrep: per policy PG_flags Peter Zijlstra
2006-07-12 14:41 ` [PATCH 23/39] mm: pgrep: nonresident page tracking hooks Peter Zijlstra
2006-07-12 14:41 ` [PATCH 24/39] mm: pgrep: generic shrinker logic Peter Zijlstra
2006-07-12 14:41 ` Peter Zijlstra [this message]
2006-07-12 14:42 ` [PATCH 26/39] sum_cpu_var Peter Zijlstra
2006-07-12 14:42 ` [PATCH 27/39] mm: clockpro: nonresident page tracking for CLOCK-Pro Peter Zijlstra
2006-07-12 14:42 ` [PATCH 28/39] mm: clockpro: re-introduce page_referenced() ignore_token Peter Zijlstra
2006-07-12 14:42 ` [PATCH 29/39] mm: clockpro: second per policy PG_flag Peter Zijlstra
2006-07-12 14:42 ` [PATCH 30/39] mm: clockpro: CLOCK-Pro policy implementation Peter Zijlstra
2006-07-12 14:43 ` [PATCH 31/39] mm: cart: nonresident page tracking for CART Peter Zijlstra
2006-07-12 14:43 ` [PATCH 32/39] mm: cart: third per policy PG_flag Peter Zijlstra
2006-07-12 14:43 ` [PATCH 33/39] mm: cart: CART policy implementation Peter Zijlstra
2006-07-12 14:43 ` [PATCH 34/39] mm: cart: CART-r " Peter Zijlstra
2006-07-12 14:43 ` [PATCH 35/39] mm: random: random page replacement policy Peter Zijlstra
2006-07-12 14:44 ` [PATCH 36/39] mm: refault histogram for non-resident policies Peter Zijlstra
2006-07-12 14:44 ` [PATCH 37/39] mm: use-once: cleanup of the use-once logic Peter Zijlstra
2006-07-12 14:44 ` [PATCH 38/39] mm: use-once: use the generic shrinker logic Peter Zijlstra
2006-07-12 14:44 ` [PATCH 39/39] mm: use-once: cleanup of the insertion logic Peter Zijlstra
2006-07-13 15:38 ` [PATCH 0/39] mm: 2.6.17-pr1 - generic page-replacement framework and 4 new policies Christoph Lameter
2006-07-15 17:03   ` Peter Zijlstra
2006-07-16  3:50     ` Christoph Lameter
2006-07-26 10:03       ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060712144154.16998.1788.sendpatchset@lappy \
    --to=a.p.zijlstra@chello.nl \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.