From: Andrew Morton <akpm@zip.com.au>
To: dean gaudet <dean-list-linux-kernel@arctic.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 3x slower file reading oddity
Date: Mon, 17 Jun 2002 16:26:57 -0700
Message-ID: <3D0E7041.860710CA@zip.com.au>
In-Reply-To: Pine.LNX.4.44.0206171246270.31265-100000@twinlark.arctic.org

dean gaudet wrote:
> 
> i was trying to study various cpu & i/o bottlenecks for a backup
> tool (rdiff-backup) and i stumbled into this oddity:
> 
> # time xargs -0 -n100 cat -- > /dev/null < /tmp/filelist
> 0.520u 5.310s 0:36.92 15.7%     0+0k 0+0io 11275pf+0w
> # time xargs -0 -n100 cat -- > /dev/null < /tmp/filelist
> 0.510u 5.090s 0:35.05 15.9%     0+0k 0+0io 11275pf+0w
> 
> # time xargs -0 -P2 -n100 cat -- > /dev/null < /tmp/filelist
> 0.500u 5.380s 1:30.51 6.4%      0+0k 0+0io 11275pf+0w
> # time xargs -0 -P2 -n100 cat -- > /dev/null < /tmp/filelist
> 0.420u 4.810s 1:36.73 5.4%      0+0k 0+0io 11275pf+0w
> 
> 3x slower with the two cats in parallel.

Note that the CPU time remained constant.  The wall time went up.
You did more seeking with the dual-thread approach.

It rather depends on what is in /tmp/filelist.  I assume it's
something like the output of `find'.  And I assume you're
using ext2 or ext3?

- ext2/3 will chop the filesystem up into 128-megabyte block groups.

- It attempts to place all the files in a directory into the same
  block group.

- It will explicitly place new directories into a different block group
  from their parent.
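
To put a number on the 128-megabyte figure, here is a rough sketch of the
arithmetic.  It assumes 4k blocks and a single block-sized bitmap per group,
which gives 8 * blocksize blocks per group.  The function names are made up
for illustration and are not part of any ext2 tool.

```shell
# Hypothetical helpers for illustration only.
# Assumption: one block-sized bitmap per group, so each group
# covers 8 * blocksize blocks.
blocks_per_group() {
    echo $(( 8 * $1 ))
}

# Size of one block group in megabytes, given the block size in bytes.
group_size_mb() {
    echo $(( $(blocks_per_group "$1") * $1 / 1024 / 1024 ))
}

# Block group that a block number falls into, given the block size.
# Simplification: ignores the first data block offset.
group_of_block() {
    echo $(( $2 / $(blocks_per_group "$1") ))
}

group_size_mb 4096          # prints 128
group_of_block 4096 40000   # prints 1
```

With 1k blocks the same arithmetic gives 8-megabyte groups, so the number of
groups your directories are spread over depends on how the filesystem was made.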

And I suspect it's that last point which has caught you out.  You have
two threads, and probably each thread's list of 100 files is from a
different directory.  And hence it lives in a different block group.
And hence your two threads are competing for the disk head.
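
One way to check that guess is to look at which directory each batch of 100
names starts in.  This is only a sketch: the function name is made up, it
assumes the list is NUL-separated as in your xargs -0 run, and it will be
confused by filenames that contain newlines.

```shell
# Hypothetical helper: for a NUL-separated file list, print the batch
# number and the parent directory of the first file in each batch of N
# names (N defaults to 100, matching xargs -n100).
first_dir_per_batch() {
    tr '\0' '\n' < "$1" | awk -v n="${2:-100}" '
        NR % n == 1 || n == 1 {
            d = $0
            # Strip the final path component; a bare filename means ".".
            if (!sub("/[^/]*$", "", d)) d = "."
            print ++batch, d
        }'
}

# Usage (not run here):
#   first_dir_per_batch /tmp/filelist 100
```

If neighbouring batches start in different directories, then with -P2 the two
cats are most likely reading from different block groups at the same time.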

Even increasing the elevator read latency won't help you here - we don't
perform inter-file readahead, so as soon as thread 1 blocks on a read,
it has *no* reads queued up and the other thread's requests are then 
serviced.
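
A toy model of that ping-pong, counting head travel in units of block groups.
The group numbers 3 and 40 are invented, and real disks, elevators and
readahead are a lot more complicated than this, so treat it purely as an
illustration of why alternating between two groups costs so much more
movement than finishing one and then doing the other.

```shell
# Toy model only: total head travel, in block groups, when requests are
# serviced in the order given.  Each argument is a block group number.
head_travel() {
    echo "$@" | tr ' ' '\n' | awk '
        NR > 1 { d = $1 - prev; total += (d < 0 ? -d : d) }
        { prev = $1 }
        END { print total + 0 }'
}

head_travel 3 3 3 3 40 40 40 40   # one reader, one group at a time: 37
head_travel 3 40 3 40 3 40 3 40   # two readers alternating: 259
```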

You'll get best throughput with a single read thread.  There are some
smarter readahead things we could do in there, but it tends to be
that device-level readahead fixes everything up anyway.

-
