From: Maxim Levitsky <maximlevitsky@gmail.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Nick Piggin <nickpiggin@yahoo.com.au>, Paul Jackson <pj@sgi.com>,
Dave Chinner <dgc@sgi.com>, Andi Kleen <ak@suse.de>
Subject: Re: [RFC 0/8] Variable Order Page Cache
Date: Fri, 20 Apr 2007 02:58:24 +0300
Message-ID: <200704200258.24749.maximlevitsky@gmail.com>
In-Reply-To: <20070419163504.11948.58487.sendpatchset@schroedinger.engr.sgi.com>
On Thursday 19 April 2007 19:35:04 Christoph Lameter wrote:
> Variable Order Page Cache Patchset
>
> This patchset modifies the core VM so that higher order page cache pages
> become possible. The higher order page cache pages are compound pages
> and can be handled in the same way as regular pages.
>
> The order of the pages is determined by the order set up in the mapping
> (struct address_space). By default the order is set to zero.
> This means that higher order pages are optional. There is no attempt here
> to generally change the page order of the page cache. 4K pages are effective
> for small files.
>
> However, it would be good if the VM would support I/O to higher order pages
> to enable efficient support for large scale I/O. If one wants to write a
> long file of a few gigabytes then the filesystem should have a choice of
> selecting a larger page size for that file and handle larger chunks of
> memory at once.
>
> The support here is only for buffered I/O and only for one filesystem
> (ramfs).
> Modification of other filesystems to support higher order pages may require
> extensive work of other components of the kernel. But I hope this shows that
> there is a relatively easy way to that goal that could be taken in steps..
>
> Note that the higher order pages are subject to reclaim. This works in
> general since we are always operating on a single page struct. Reclaim is
> fooled to
> think that it is touching page sized objects (there are likely issues to be
> fixed there if we want to go down this road).
>
> What is currently not supported:
> - Buffer heads for higher order pages (possible with the compound pages in
>   mm that do not use page->private requires upgrade of the buffer cache
>   layers).
> - Higher order pages in the block layer etc.
> - Mmapping higher order pages
>
> Note that this is proof-of-concept. Lots of functionality is missing and
> various issues have not been dealt with. Use of higher order pages may cause
> memory fragmentation. Mel Gorman's anti-fragmentation work is probably
> essential if we want to do this. We likely need actual defragmentation
> support.
>
> The main point of this patchset is to demonstrates that it is basically
> possible to have higher order support with straightforward changes to the
> VM.
>
> The ramfs driver can be used to test higher order page cache functionality
> (and may help troubleshoot the VM support until we get some real filesystem
> and real devices supporting higher order pages).
>
> If you apply this patch and then you can f.e. try this:
>
> mount -tramfs -o10 none /media
>
> Mounts a ramfs filesystem with order 10 pages (4 MB)
>
> cp linux-2.6.21-rc7.tar.gz /media
>
> Populate the ramfs. Note that we allocate 14 pages of 4M each
> instead of 13508..
>
> umount /media
>
> Gets rid of the large pages again
>
> Comments appreciated.
>
Hello,

This is exactly what I wanted some time ago.
Thank you very much. I was almost thinking of doing this myself
(but decided that it is too difficult for me right now and maybe not worth
the effort).

I want to point out a number of problems that this will solve (and the
reasons I wanted to do this).

First of all, packet writing on CD/DVD doesn't work well today. It is very
slow because all filesystems are currently limited by the 4k barrier, while
CD/DVD drives can write only 32k/64k packets.
This is why pktcdvd was written: it emulates those 4k sectors by doing a
read/modify/write cycle.
This causes a lot of seeks and switches between reading and writing, and
thus it is very slow.
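
To make the cost of that cycle concrete, here is a minimal sketch of a
read/modify/write emulation. It is not pktcdvd's actual code: the
CountingDevice class, the write_block_rmw helper and the 64k packet size are
assumptions chosen only for illustration, and the "device" is just a buffer
in memory.

```python
PACKET_SIZE = 64 * 1024  # assumed packet size (the mail mentions 32k/64k)
BLOCK_SIZE = 4 * 1024    # 4k filesystem block, as in the mail


class CountingDevice:
    """Hypothetical in-memory device that only does whole-packet I/O."""

    def __init__(self, num_packets):
        self.data = bytearray(num_packets * PACKET_SIZE)
        self.bytes_read = 0
        self.bytes_written = 0

    def read_packet(self, index):
        start = index * PACKET_SIZE
        self.bytes_read += PACKET_SIZE
        return bytearray(self.data[start:start + PACKET_SIZE])

    def write_packet(self, index, packet):
        assert len(packet) == PACKET_SIZE
        start = index * PACKET_SIZE
        self.data[start:start + PACKET_SIZE] = packet
        self.bytes_written += PACKET_SIZE


def write_block_rmw(dev, block_no, payload):
    """Emulate one 4k block write on top of packet-sized I/O."""
    assert len(payload) == BLOCK_SIZE
    packet_index, offset = divmod(block_no * BLOCK_SIZE, PACKET_SIZE)
    packet = dev.read_packet(packet_index)         # read the whole packet
    packet[offset:offset + BLOCK_SIZE] = payload   # modify 4k inside it
    dev.write_packet(packet_index, packet)         # write the whole packet


dev = CountingDevice(num_packets=4)
write_block_rmw(dev, block_no=17, payload=b"\xab" * BLOCK_SIZE)
print(dev.bytes_read, dev.bytes_written)  # 65536 65536
```

Under these assumptions a single 4k write moves 64k in each direction, and
on a real drive the read and the write each involve a seek as well.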

By introducing a page cache with pages bigger than 4k, a DVD/CD can be
divided into 64k/32k blocks that can be read and written freely.
(Although a DVD can be read in 2k units, I don't think that reading a 64k
block will hurt, since most of the time the drive is busy seeking and
locating a specific sector.)
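
As a rough illustration of which page cache order such blocks would need,
the sketch below computes the smallest order whose page size covers a given
block size. It assumes 4k base pages; order_for_block is a hypothetical
helper and not something from the patchset.

```python
BASE_PAGE_SIZE = 4 * 1024  # assumed 4k base page


def order_for_block(block_size, base_page_size=BASE_PAGE_SIZE):
    """Return the smallest order with base_page_size << order >= block_size."""
    order = 0
    while (base_page_size << order) < block_size:
        order += 1
    return order


for size_kib in (4, 32, 64, 4096):
    order = order_for_block(size_kib * 1024)
    print(f"{size_kib}k block -> order {order} ({1 << order} base pages)")
```

With these assumptions a 32k packet corresponds to order 3 and a 64k packet
to order 4, while the 4 MB pages in the quoted ramfs example are order 10.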

Right now I am thinking of implementing this in another way: I want to teach
the UDF filesystem to do packet writing on its own, bypassing the disk cache
(but not the page cache).

Secondly, a 32k/64k limitation is present on flash devices too, so they can
benefit as well, and I am almost sure that future hard disks will use a
bigger block size too.

To summarize, a bigger page size will allow devices that have big hardware
sectors to work well in Linux.

Best regards,
Maxim Levitsky