linux-mm.kvack.org archive mirror
* [PATCH/RFC] Page Cache Policy V0.0 0/5 Overview
@ 2006-04-20 20:39 Lee Schermerhorn
From: Lee Schermerhorn @ 2006-04-20 20:39 UTC (permalink / raw)
  To: linux-mm; +Cc: Christoph Lameter, Andi Kleen, Eric Whitney

Resend with subject!

Page Cache Policy V0.0 0/5 Overview

Work in progress -- for comment.  Christoph wanted to see
this addressed before migrate-on-fault goes any farther.
So, here's a cut.  Series to follow...

Note:  tested atop recently posted add-shmem-migratepage-a_op
patch on 2.6.17-rc1-mm2
----------------------

Basic "problem":  currently [2.6.17-rc1], files mmap()ed SHARED
do not follow the policy applied to the mapped regions.  Instead,
shared file-backed pages are allocated using the allocating
task's task policy.  This is inconsistent with the way that anon
and shmem pages are handled.

One reason for this is that down where pages are allocated for
file backed pages, the faulting (mm, vma, address) are not 
available to compute the policy.  However, we do have the inode
[via the address space] and file index/offset available.  If the
applicable policy could be determined from just this info, the
vma and address would not be required.

The following series of patches against 2.6.17-rc1-mm2 implements
NUMA memory policy for shared, mmap()ed files.  Because files
mmap()ed SHARED are shared between tasks just like shared memory
regions, I've used the shared_policy infrastructure from shmem.
This infrastructure applies policies directly to ranges of a file
using a red-black tree.
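
To make the idea of per-range file policies concrete, here is a minimal user-space sketch.  It is not the kernel's shared_policy code:  the struct and function names (file_policy, file_policy_set, file_policy_lookup) are hypothetical, a flat array stands in for the kernel's tree, and overlapping ranges are not handled.  The point is only that a lookup needs nothing but the file's policy object and a page index, with no vma or faulting address.

```c
#include <stddef.h>

/* Hypothetical user-space model; NOT the kernel's struct shared_policy. */
enum pol_mode { POL_DEFAULT, POL_BIND, POL_PREFERRED, POL_INTERLEAVE };

struct range_policy {
	unsigned long start;	/* first page index covered (inclusive) */
	unsigned long end;	/* one past the last page index covered */
	enum pol_mode mode;
};

#define MAX_RANGES 8

/* One of these would hang off the inode/mapping (assumption). */
struct file_policy {
	struct range_policy ranges[MAX_RANGES];
	size_t nr;		/* number of ranges in use */
};

/*
 * Record a policy for page indices [start, end).
 * Returns 0 on success, -1 if the range is empty or the table is full.
 * Overlap handling is deliberately omitted in this sketch.
 */
static int file_policy_set(struct file_policy *fp, unsigned long start,
			   unsigned long end, enum pol_mode mode)
{
	if (start >= end || fp->nr == MAX_RANGES)
		return -1;
	fp->ranges[fp->nr].start = start;
	fp->ranges[fp->nr].end = end;
	fp->ranges[fp->nr].mode = mode;
	fp->nr++;
	return 0;
}

/*
 * Find the policy covering a page index.  Only the file's policy
 * object and the index are needed; indices outside every range
 * fall back to the default policy.
 */
static enum pol_mode file_policy_lookup(const struct file_policy *fp,
					unsigned long index)
{
	size_t i;

	for (i = 0; i < fp->nr; i++)
		if (index >= fp->ranges[i].start && index < fp->ranges[i].end)
			return fp->ranges[i].mode;
	return POL_DEFAULT;
}
```

Because the key is the file index rather than a virtual address, two tasks mapping the same file at different addresses would resolve to the same range entry.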

The patches break out as follows:

1 - add-offset-arg-to-migrate_pages_to

	A minor preparatory patch:  adds the page offset/index
	arg to migrate_pages_to() for properly computing nodes
	for interleaved policies.  Used by subsequent patch.

2 - move-shared-policy-to-inode

	This patch generalizes the shared_policy infrastructure
	for use by generic files.   First, it adds a shared_policy
	pointer to the struct address_space.  This pointer is
	initialized to NULL on inode allocation, indicating the
	default policy.  The shared memory subsystem is then
	modified to use the shared policy struct out of the
	address_space [a.k.a. mapping] instead of explicitly
	using one embedded in the shmem inode info struct.

	Note, however, at this point we still use the embedded
	shared_policy.  We just point the mapping spolicy pointer
	at the embedded struct at init time.

	One BIG side-effect of this patch:  we no longer split
	vm areas to apply sub-range policies if the vma has
	a set_policy vm_op.  Only shmem currently has a set_policy
	op, and it knows how to handle subranges via the red-black
	tree.
	So, I'm proposing to adopt this semantic:  if a vma has a
	set_policy() op, it must know how to handle subranges and
	must have a get_policy() op that also knows how to handle
	subranges.

	Tested to ensure shared policies still work for shmem.

	TODO:  check effects on numa_maps of not splitting vmas.

3 - alloc-shared-policies

	This patch removes the shared_policy structs embedded in
	the shmem and hugetlbfs inode info structs, and dynamically
	allocates them, from a new kmem cache, when needed.

	Shmem will allocate a shared policy at segment init if
	the superblock [mount] specifies non-default policy.
	Otherwise, the shared_policy struct will only be allocated
	if a task mbind()s a range of the segment.

	Hugetlbfs just leaves the spolicy pointer NULL [default].
	It will be allocated by the shmem set_policy() vm_op if
	a task mbind()s a range of the hugetlb segment.

4 - generic-file-policy-vm-ops

	This patch clones the shmem set/get_policy vm_ops for use
	by generic mmap()ed files.  The functions are added to the
	generic_file_vm_ops struct.  These functions operate on
	the shared_policy red-black tree associated with the
	inode, allocating one if necessary.

	Note:  these turned out to be identical in all but name to
	the shmem set/get_policy ops.  Maybe eliminate one copy
	and share?

5 - use-file-policy-for-page-cache

	This patch enhances page_cache_alloc[_cold]() to take an
	offset/index argument.  It uses this to look up the policy
	using a new function, get_file_policy(), which is just a
	wrapper around mpol_shared_policy_lookup().  If the inode's
	[mapping's] shared_policy pointer is NULL, get_file_policy()
	just returns the default policy.

	Then page_cache_alloc[_cold]() calls a new function,
	alloc_page_pol(), to evaluate the policy [at a specified
	offset] and allocate an appropriate page.  alloc_page_pol()
	shares some code with alloc_page_vma(), so this area is
	reworked to minimize duplication.

	All callers of page_cache_alloc[_cold]() are modified to
	pass the file index/offset for which a page is requested.
	The index/offset is available at all call sites as it will
	be used to insert the page into the mapping's radix tree.
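
The allocation path described in patches 1 and 5 can be modelled roughly as follows.  This is a user-space sketch under stated assumptions, not the patch code:  every name ending in _sketch is hypothetical, the types are simplified stand-ins, only the default and interleave cases are shown, and "index modulo number of nodes" is an assumed way of turning a file offset into an interleave node.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel types (assumptions, not real kernel structs). */
enum pol_mode { POL_DEFAULT, POL_INTERLEAVE };

struct policy {
	enum pol_mode mode;
	const int *nodes;	/* node ids to interleave over */
	size_t nr_nodes;
};

struct mapping {
	const struct policy *spolicy;	/* NULL means default policy */
};

static const struct policy default_policy = { POL_DEFAULT, NULL, 0 };

/*
 * Models get_file_policy():  a NULL policy pointer on the mapping
 * means "use the default policy".  A fuller model would search a
 * per-range structure by index; this sketch keeps one policy per file.
 */
static const struct policy *get_file_policy_sketch(const struct mapping *m,
						   unsigned long index)
{
	(void)index;
	return m->spolicy ? m->spolicy : &default_policy;
}

/*
 * Models the node choice inside alloc_page_pol():  interleave picks a
 * node from the file index, anything else uses the caller's local node.
 */
static int alloc_node_sketch(const struct policy *pol, unsigned long index,
			     int local_node)
{
	if (pol->mode == POL_INTERLEAVE && pol->nr_nodes > 0)
		return pol->nodes[index % pol->nr_nodes];
	return local_node;
}

/* Models page_cache_alloc() taking the file index as an extra argument. */
static int page_cache_alloc_node_sketch(const struct mapping *m,
					unsigned long index, int local_node)
{
	return alloc_node_sketch(get_file_policy_sketch(m, index), index,
				 local_node);
}
```

Since the node is derived from the file index rather than from the faulting address, the same page of the file lands on the same node no matter which task triggers the allocation.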

Cursory testing with memtoy for shm segments, shared and privately
mapped files; single task and 2 tasks mmap()ing same file.  When
the file is mmap()ed shared, either task's policy changes are seen
by both tasks.  When one maps shared and the other private, the
private mapper's policies apply only to its mapping.

Lots more testing needed.

Lee Schermerhorn



