public inbox for linux-kernel@vger.kernel.org
From: Maxim Levitsky <maximlevitsky@gmail.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Nick Piggin <nickpiggin@yahoo.com.au>, Paul Jackson <pj@sgi.com>,
	Dave Chinner <dgc@sgi.com>, Andi Kleen <ak@suse.de>
Subject: Re: [RFC 0/8] Variable Order Page Cache
Date: Fri, 20 Apr 2007 02:58:24 +0300
Message-ID: <200704200258.24749.maximlevitsky@gmail.com>
In-Reply-To: <20070419163504.11948.58487.sendpatchset@schroedinger.engr.sgi.com>

On Thursday 19 April 2007 19:35:04 Christoph Lameter wrote:
> Variable Order Page Cache Patchset
> 
> This patchset modifies the core VM so that higher order page cache pages
> become possible. The higher order page cache pages are compound pages
> and can be handled in the same way as regular pages.
> 
> The order of the pages is determined by the order set up in the mapping
> (struct address_space). By default the order is set to zero.
> This means that higher order pages are optional. There is no attempt here
> to generally change the page order of the page cache. 4K pages are effective
> for small files.
> 
> However, it would be good if the VM would support I/O to higher order pages
> to enable efficient support for large scale I/O. If one wants to write a
> long file of a few gigabytes then the filesystem should have a choice of
> selecting a larger page size for that file and handle larger chunks of
> memory at once.
> 
> The support here is only for buffered I/O and only for one filesystem
> (ramfs).
> Modification of other filesystems to support higher order pages may require
> extensive work of other components of the kernel. But I hope this shows that
> there is a relatively easy way to that goal that could be taken in steps.
> 
> Note that the higher order pages are subject to reclaim. This works in
> general since we are always operating on a single page struct. Reclaim is
> fooled to think that it is touching page sized objects (there are likely
> issues to be fixed there if we want to go down this road).
> 
> What is currently not supported:
> - Buffer heads for higher order pages (possible with the compound pages in
>   mm that do not use page->private requires upgrade of the buffer cache
>   layers).
> - Higher order pages in the block layer etc.
> - Mmapping higher order pages
> 
> Note that this is proof-of-concept. Lots of functionality is missing and
> various issues have not been dealt with. Use of higher order pages may cause
> memory fragmentation. Mel Gorman's anti-fragmentation work is probably
> essential if we want to do this. We likely need actual defragmentation
> support.
> 
> The main point of this patchset is to demonstrate that it is basically
> possible to have higher order support with straightforward changes to the
> VM.
> 
> The ramfs driver can be used to test higher order page cache functionality
> (and may help troubleshoot the VM support until we get some real filesystem
> and real devices supporting higher order pages).
> 
> If you apply this patch and then you can f.e. try this:
> 
> mount -tramfs -o10 none /media
> 
> 	Mounts a ramfs filesystem with order 10 pages (4 MB)
> 
> cp linux-2.6.21-rc7.tar.gz /media
> 
> 	Populate the ramfs. Note that we allocate 14 pages of 4M each
> 	instead of 13508..
> 
> umount /media
> 
> 	Gets rid of the large pages again
> 
> Comments appreciated.
> 

Hello,

This is exactly what I wanted some time ago. Thank you very much.
I was almost thinking of doing this myself (but decided that it is too
difficult for me right now and may not be worth the effort).

I want to point out a number of problems that this will solve (and the
reasons I wanted to do this).

First of all, packet writing on CD/DVD does not work well today. It is very
slow because all filesystems are currently limited to a 4k block size, while
CD/DVD drives can write only 32k/64k packets.
This is why pktcdvd was written: it emulates 4k sectors by doing a
read/modify/write cycle.
This causes a lot of seeks and switches between reading and writing, and
thus it is very slow.
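
To make the cost of that read/modify/write cycle concrete, here is a rough
sketch. It is a simplified model, not the pktcdvd code: the function names
are made up for illustration, the 64k packet size is assumed, and any
caching the real driver does is ignored.

```python
# Hypothetical model of packet read/modify/write; not the pktcdvd code.
PACKET_SIZE = 64 * 1024   # assumed fixed packet size of the medium
BLOCK_SIZE = 4 * 1024     # block size the filesystem writes in


def packets_touched(offset, length, packet_size=PACKET_SIZE):
    """Return the indices of the packets overlapped by a write."""
    first = offset // packet_size
    last = (offset + length - 1) // packet_size
    return range(first, last + 1)


def write_cost(offset, length, packet_size=PACKET_SIZE):
    """Return (packet_reads, packet_writes) for one write request.

    A packet that is only partly covered must be read first so the
    untouched bytes can be merged back in; a fully covered packet can
    be written directly.
    """
    reads = 0
    writes = 0
    end = offset + length
    for index in packets_touched(offset, length, packet_size):
        start_of_packet = index * packet_size
        end_of_packet = start_of_packet + packet_size
        fully_covered = offset <= start_of_packet and end >= end_of_packet
        if not fully_covered:
            reads += 1
        writes += 1
    return reads, writes


if __name__ == "__main__":
    print(write_cost(0, BLOCK_SIZE))    # one 4k block inside a 64k packet
    print(write_cost(0, PACKET_SIZE))   # one whole, aligned 64k packet
```

In this model a single 4k write costs one packet read plus one packet write,
while a write of a whole aligned packet needs no read at all.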

With a page cache larger than 4k, a DVD/CD can be divided into 64k/32k
blocks that can be read and written freely.
(Although a DVD can be read in 2k units, I don't think that reading a whole
64k block will hurt, since most of the time the drive is busy seeking and
locating a specific sector.)

I am now thinking of implementing this another way: I want to teach the UDF
filesystem to do packet writing on its own, bypassing the disk cache (but
not the page cache).

Secondly, the 32k/64k limitation is present on flash devices too, so they
can benefit as well, and I am almost sure that future hard disks will use a
bigger block size too.

To summarize, a bigger page size will allow devices that have big hardware
sectors to work well in Linux.

Best regards,
	Maxim Levitsky
