public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Paul Mackerras <paulus@samba.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <clameter@sgi.com>, David Chinner <dgc@sgi.com>,
	linux-kernel@vger.kernel.org, Mel Gorman <mel@skynet.ie>,
	William Lee Irwin III <wli@holomorphy.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	Badari Pulavarty <pbadari@gmail.com>,
	Maxim Levitsky <maximlevitsky@gmail.com>
Subject: Re: [00/17] Large Blocksize Support V3
Date: Fri, 27 Apr 2007 21:41:39 +1000
Message-ID: <4631E173.7000204@yahoo.com.au>
In-Reply-To: <17969.55533.59287.509889@cargo.ozlabs.ibm.com>

Paul Mackerras wrote:
> Andrew Morton writes:
> 
> 
>>If x86 had larger pagesize we wouldn't be seeing any of this.  It is a workaround
>>for present-generation hardware.
> 
> 
> Unfortunately, it's not really practical to increase the page size
> very much on most systems, because you end up wasting a lot of space
> in the page cache.  So there is a tension between wanting a small page
> size so your page cache uses memory efficiently, and wanting a large
> page size so the TLB covers more address space and your programs run
> faster (not to mention other benefits such as the kernel having to
> manage fewer pages, and I/O being done in bigger chunks).
> 
> Thus there is not really any single page size that suits all workloads
> and machines.  With distros wanting to just have a single kernel per
> architecture, and the fact that the page size is a compile-time
> constant, we currently end up having to pick one size and just put up
> with the fact that it will suck for some users.  We currently have
> this situation on ppc64 now that POWER5+ and POWER6 machines have
> hardware support for 64k pages as well as 4k pages.
> 
> So I can see a few different options:
> 
> (a) Keep things more or less as they are now and just wear the fact
> that we will continue to show lower performance than certain
> proprietary OSes, or
> 
> (b) Somehow manage to make the page size a variable rather than a
> compile-time constant, and pick a suitable page size at boot time
> based on how much memory the machine has, or something.  I looked at
> implementing this at one point and recoiled in horror. :)
> 
> (c) Make the page cache able to use small pages for small files and
> large pages for large files.  AIUI this is basically what Christoph is
> proposing.
> 
> Option (a) isn't very palatable to me (nor I expect, Christoph :)
> since it basically says that Linux is very much focussed on the
> embedded and desktop end of things and isn't really suitable as a
> high-performance OS for large SMP systems.  I don't want to believe
> that. ;)
> 
> Option (b) would be a bit of an ugly hack.
> 
> Which leaves option (c) - unless you have a further option.  So I have
> to say I support Christoph on this, at least as far as the general
> principle is concerned.

For the TLB issue, higher order pagecache doesn't help. If distros
ship with a 4K page size on powerpc, and use some larger pages in
the pagecache, some people are still going to get angry because
they wanted to use 64K pages... But I agree 64K pages are too big
for most things anyway, and 16K would be better as a default (which
hopefully x86-64 will get one day).
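
To put rough numbers on the trade-off Paul describes (page cache waste
versus TLB coverage), here is a minimal sketch. It is illustrative only:
the file sizes and the TLB entry count are made-up assumptions, not
measurements, and the helper names are hypothetical.

```python
def page_cache_waste(file_size: int, page_size: int) -> int:
    """Bytes left unused in the last page when a file is cached."""
    if file_size == 0:
        return 0
    remainder = file_size % page_size
    return 0 if remainder == 0 else page_size - remainder


def tlb_reach(entries: int, page_size: int) -> int:
    """Address space covered by a TLB with the given number of entries."""
    return entries * page_size


KIB = 1024
# Hypothetical workload: file sizes in bytes, chosen only for illustration.
FILE_SIZES = [100, 3 * KIB, 10 * KIB, 70 * KIB]
# Hypothetical TLB size; real hardware varies.
TLB_ENTRIES = 64

for page_size in (4 * KIB, 16 * KIB, 64 * KIB):
    waste = sum(page_cache_waste(size, page_size) for size in FILE_SIZES)
    reach = tlb_reach(TLB_ENTRIES, page_size)
    print(f"{page_size // KIB:>2}K pages: "
          f"waste={waste} bytes, TLB reach={reach // KIB} KiB")
```

Under these assumptions the waste per small file grows with the page
size while the TLB reach grows by the same factor, which is why no
single size suits every workload.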

Anyway, for IO performance, there are alternatives, despite what
some people seem to be saying. We can submit larger sglists to the
device for larger IOs, which Jens is looking at (which could help
all types of workloads, not just those with sequential large file
IO).

After that, I'd find it amusing if HBAs worth thousands of $ have
trouble looking up sglists at the relatively glacial pace that IO
requires, and/or can't spare a few more K for reasonable sglist
sizes, but if that is really the case, then we could use IOMMUs
and/or just attempt to put physically contiguous pages in pagecache,
rather than require contiguity.
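
As a rough sketch of why opportunistic contiguity helps (this is not
kernel code; `build_sglist` and the page frame numbers are hypothetical),
consecutive page frames can be merged into a single scatter-gather
segment, so the list shrinks whenever the pagecache pages happen to be
adjacent, and nothing breaks when they are not:

```python
def build_sglist(pfns):
    """Merge runs of consecutive page frame numbers into segments.

    Returns a list of (start_pfn, npages) tuples, in submission order.
    """
    segments = []
    for pfn in pfns:
        if segments and segments[-1][0] + segments[-1][1] == pfn:
            # This page directly follows the previous segment: extend it.
            start, count = segments[-1]
            segments[-1] = (start, count + 1)
        else:
            # Not contiguous with the previous page: start a new segment.
            segments.append((pfn, 1))
    return segments


# Hypothetical page frame numbers for one 8-page IO.
contiguous = [100, 101, 102, 103, 104, 105, 106, 107]
scattered = [100, 205, 17, 42, 300, 301, 9, 77]

print(len(build_sglist(contiguous)), "segment(s) when contiguous")
print(len(build_sglist(scattered)), "segment(s) when scattered")
```

The scattered case still works, it just needs more entries, which is
the point of attempting contiguity rather than depending on it.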

-- 
SUSE Labs, Novell Inc.
