From: Andrew Morton <akpm@zip.com.au>
To: dean gaudet <dean-list-linux-kernel@arctic.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 3x slower file reading oddity
Date: Mon, 17 Jun 2002 17:36:12 -0700
Message-ID: <3D0E807C.5D50C17E@zip.com.au>
In-Reply-To: Pine.LNX.4.44.0206171649270.18507-100000@twinlark.arctic.org

dean gaudet wrote:
> 
> ...
> > You'll get best throughput with a single read thread.
> 
> what if you have a disk array with lots of spindles?  it seems at some
> point that you need to give the array or some lower level driver a lot of
> i/os to choose from so that it can get better parallelism out of the
> hardware.

mm.  For that particular test, you'd get nice speedups from striping
the blockgroups across disks, so each `cat' is probably talking to
a different disk.  I don't think I've seen anything like that proposed
though.

But regardless of the disk topology, the sanest way to get good IO
scheduling is to throw a lot of requests at the block layer.  That's
simple for writes.  But for reads, it's harder.

You could fork one `cat' per file ;)  (Not so silly, really.  But if
you took this approach, you'd need many more threads than blockgroups.)
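
A minimal sketch of the one-reader-per-file idea, using a thread pool in
place of forked `cat' processes so that many reads are outstanding at once.
The names read_file and read_all and the default of 64 workers are
illustrative assumptions, not anything from the test being discussed.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path, chunk_size=1 << 20):
    """Read one file to the end, like `cat file > /dev/null`; return bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_all(paths, workers=64):
    """Read every path with up to `workers` concurrent readers.

    workers=64 is an arbitrary illustrative value, not a tuned one.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands each path to a worker thread; blocking reads release
        # the GIL, so several requests can be queued at the block layer at once.
        return sum(pool.map(read_file, paths))
```

Whether this helps depends on the disk layout and on how many readers are
in flight relative to the number of blockgroups being touched.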

Or teach `cat' to perform asynchronous (aio) reads.  You'd need async
opens, too.   But generally we get a good cache hit rate against the
data which is needed to open a small file.

hmm.  What else?  Physical readahead - read metadata into the block
device's pagecache and flip pages from there into directories and
files on-demand.  Fat chance of that happening.

Or change ext2/3 to not place directories in different block groups
at all.  That's super-effective, but does cause somewhat worse long-term
fragmentation.

You can probably lessen the seek rate by accessing the files in the correct
order: read all the files from a directory before descending into any of
its subdirectories.  Can find(1) do that?  You should be able to get close
to full disk bandwidth this way, depending on how bad the inter- and
intra-file fragmentation has become.
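
A minimal sketch of that access order, assuming a top-down os.walk is an
acceptable stand-in for whatever tool generates the file list.  The name
files_before_subdirs is hypothetical, and the sorting is only there to make
the order deterministic; it says nothing about on-disk placement.

```python
import os


def files_before_subdirs(root):
    """Yield file paths so that a directory's own files come before
    anything inside its subdirectories."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        # Sorting in place fixes the order in which os.walk descends.
        dirnames.sort()
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Feeding the resulting list to a single reader visits each directory's
files together before moving on to the next directory.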

