* [RFC 0/8] Variable Order Page Cache
@ 2007-04-19 16:35 Christoph Lameter
From: Christoph Lameter @ 2007-04-19 16:35 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Nick Piggin, Christoph Lameter, Paul Jackson,
	Dave Chinner, Andi Kleen

Variable Order Page Cache Patchset

This patchset modifies the core VM so that higher order page cache pages
become possible. The higher order page cache pages are compound pages
and can be handled in the same way as regular pages.

The order of the pages is determined by the order set up in the mapping
(struct address_space). By default the order is set to zero.
This means that higher order pages are optional. There is no attempt here
to generally change the page order of the page cache. 4K pages are effective
for small files.
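
As a rough illustration of what a per-mapping order implies, the sketch
below derives the page cache page size from the order and splits a file
position into a page index and an offset within that page. This is a
user-space sketch, not code from the patchset: struct mapping_sketch, the
sketch_* helpers and the 4K base page size (BASE_PAGE_SHIFT) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4K base pages, as on most common configurations. */
#define BASE_PAGE_SHIFT 12u

/* Hypothetical stand-in for the order field added to struct address_space. */
struct mapping_sketch {
	unsigned int order;	/* 0 means regular base pages */
};

/* Number of bits covered by one page cache page of this mapping. */
static inline unsigned int sketch_cache_shift(const struct mapping_sketch *m)
{
	assert(m->order < 32);	/* keep the shift well inside 64 bits */
	return BASE_PAGE_SHIFT + m->order;
}

/* Size in bytes of one page cache page of this mapping. */
static inline uint64_t sketch_cache_size(const struct mapping_sketch *m)
{
	return (uint64_t)1 << sketch_cache_shift(m);
}

/* Index of the page cache page that holds file position pos. */
static inline uint64_t sketch_cache_index(const struct mapping_sketch *m,
					  uint64_t pos)
{
	return pos >> sketch_cache_shift(m);
}

/* Byte offset of file position pos inside its page cache page. */
static inline uint64_t sketch_cache_offset(const struct mapping_sketch *m,
					   uint64_t pos)
{
	return pos & (sketch_cache_size(m) - 1);
}
```

With order 0 the helpers reduce to the usual 4K arithmetic; with a larger
order the same file position lands in a lower-numbered, larger page.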

However, it would be good if the VM supported I/O to higher order pages
to enable efficient large scale I/O. If one wants to write a long file of
a few gigabytes, then the filesystem should be able to select a larger
page size for that file and handle larger chunks of memory at once.

The support here is only for buffered I/O and only for one filesystem (ramfs).
Modifying other filesystems to support higher order pages may require
extensive work on other components of the kernel. But I hope this shows that
there is a relatively easy way to reach that goal, one that could be taken
in steps.

Note that the higher order pages are subject to reclaim. This works in general
since we are always operating on a single page struct. Reclaim is fooled into
thinking that it is touching page sized objects (there are likely issues to be
fixed there if we want to go down this road).

What is currently not supported:
- Buffer heads for higher order pages (possible with the compound pages in mm
  that do not use page->private, but this requires an upgrade of the buffer
  cache layers).
- Higher order pages in the block layer etc.
- Mmapping higher order pages

Note that this is proof-of-concept. Lots of functionality is missing and
various issues have not been dealt with. Use of higher order pages may cause
memory fragmentation. Mel Gorman's anti-fragmentation work is probably
essential if we want to do this. We likely need actual defragmentation
support.

The main point of this patchset is to demonstrate that it is basically
possible to have higher order support with straightforward changes to the
VM.

The ramfs driver can be used to test higher order page cache functionality
(and may help troubleshoot the VM support until we get some real filesystem
and real devices supporting higher order pages).

If you apply this patchset, you can, for example, try this:

mount -tramfs -o10 none /media

	Mounts a ramfs filesystem with order 10 pages (4 MB)

cp linux-2.6.21-rc7.tar.gz /media

	Populate the ramfs. Note that we allocate 14 pages of 4M each
	instead of 13508 pages of 4K.

umount /media

	Gets rid of the large pages again
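
The page counts in the example above follow from rounding the file size up
to whole pages of the chosen order. A minimal sketch of that arithmetic,
assuming 4K base pages; sketch_pages_needed is a hypothetical helper, not a
function from the patchset:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of page cache pages of the given order needed to hold
 * `bytes` bytes, assuming a 4K base page size.
 */
static uint64_t sketch_pages_needed(uint64_t bytes, unsigned int order)
{
	assert(order < 32);	/* keep the shift well inside 64 bits */

	uint64_t page_size = (uint64_t)4096 << order;

	/* Round up: a partially filled last page still needs a whole page. */
	return (bytes + page_size - 1) / page_size;
}
```

For a file that fills 13508 base pages (at most 13508 * 4096 bytes), order 0
gives 13508 pages, while order 10 gives 14 pages of 4M, since
13508 / 1024 rounds up to 14.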

Comments appreciated.



