From: Andrew Morton <akpm@linux-foundation.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: David Chinner <dgc@sgi.com>,
linux-kernel@vger.kernel.org, Mel Gorman <mel@skynet.ie>,
William Lee Irwin III <wli@holomorphy.com>,
Jens Axboe <jens.axboe@oracle.com>,
Badari Pulavarty <pbadari@gmail.com>,
Maxim Levitsky <maximlevitsky@gmail.com>,
Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [00/17] Large Blocksize Support V3
Date: Fri, 27 Apr 2007 22:36:32 -0700
Message-ID: <20070427223632.52def99e.akpm@linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0704272158360.7342@schroedinger.engr.sgi.com>
On Fri, 27 Apr 2007 22:08:17 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote:
> On Fri, 27 Apr 2007, Andrew Morton wrote:
>
> > My (repeated) point is that if we populate pagecache with physically-contiguous 4k
> > pages in this manner then bio+block will be able to create much larger SG lists.
>
> True but the "if" becomes exceedingly rare the longer the system was in
> operation. 64k implies 16 pages in sequence. This is going to be a bit
> difficult to get.
Nonsense. We need higher-order allocations whichever scheme is used.
And lumpy reclaim in the moveable zone should be extremely reliable. It
_should_ be the case that it can only be defeated by excessive use of
mlock. But we've seen no testing to either confirm or refute that.
> Then there is the overhead of handling these pages.
> Which may be not significant given growing processor capabilities in some
> usage cases. In others like a synchronized application running on a large
> number of nodes this is likely to introduce random delays between processor
> to processor communication that will significantly impair performance.
Well, who knows.
> And then there is the long list of features that cannot be accomplished
> with such an approach like mounting a volume with large block size,
> handling CD/DVDs, getting rid of various shim layers etc.
There are disadvantages against which this must be traded off.
And if the volume which is mounted with the large page option also has a
lot of small files on it, we've gone and dramatically deoptimised the
user's machine. It would have been better to make the 4k-page
implementation faster, rather than working around existing inefficiencies.
> I'd also like to have much higher orders of allocations for scientific
> applications that require an extremely large I/O rate. For those we
> could f.e. dedicate memory nodes that will only use a very high page
> order to prevent fragmentation. E.g. 1G pages is certainly something that
> lots of our customers would find beneficial (and they are actually
> already using those types of pages in the form of huge pages but with
> limited capabilities).
>
> But then we are sadly again trying to find another workaround that
> will not get us there and will not allow the flexibility in the
> VM that would make things much easier for lots of usage scenarios.
Your patch *is* a workaround. It's a workaround for small CPU pagesize.
It's a workaround for suboptimal VFS and filesystem implementations. It's
a workaround for a disk adapter which has suboptimal readahead and
writeback caching implementations.
See? I can spin too.
Fact is, this change has *costs*. And you're completely ignoring them,
trying to spin them away. It ain't working and it never will. I'm seeing
no serious attempt to think about how we can reduce those costs while
retaining most of the benefits.