linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: tytso@mit.edu
Cc: linux-ext4@vger.kernel.org
Subject: Re: e2fsck readahead speedup performance report
Date: Fri, 8 Aug 2014 20:22:40 -0700	[thread overview]
Message-ID: <20140809032240.GK11191@birch.djwong.org> (raw)
In-Reply-To: <20140809031845.GJ11191@birch.djwong.org>

On Fri, Aug 08, 2014 at 08:18:45PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> Since I this email last week, I rewrote the prefetch algorithms for pass 1 and

"Since I last replied to the e2fsck readahead patch last week..."

> 2 and separated thread support into a separate patch.  Upon discovering that
> issuing a POSIX_FADV_DONTNEED call caused a noticeable increase (of about 2-5%
> points) on fsck runtime, I dropped that part out.
> 
> In pass 1, we now walk the group descriptors looking for inode table blocks to
> read until we have found enough to issue a $readahead_kb size readahead
> command.  The patch also computes the number of the first inode of the last
> inode buffer block of the last group of the readahead group and schedules the
> next readahead to occur when we reach that inode.  This keeps the readahead
> running at closer to full speed and eliminates conflicting IOs between the
> checker thread and the readahead.
> 
> For pass 2, readahead is broken up into $readahead_kb sized chunks instead of
> issuing all of them at once.  This should increase the likelihood that a block
> is not evicted before pass2 tries to read it.
> 
> Pass 4's readahead remains unchanged.
> 
> The raw numbers from my performance evaluation of the new code live here:
> https://docs.google.com/spreadsheets/d/1hTCfr30TebXcUV8HnSatNkm4OXSyP9ezbhtMbB_UuLU
> 
> This time, I repeatedly ran e2fsck -Fnfvtt with various sizes of readahead
> buffer to see how that affected fsck runtime.  The run times are listed in the
> table at row 22, and I've created a table at row 46 to show % reduction in
> e2fsck runtime.  I tried (mostly) power-of-two buffer sizes from 1MB to 1GB; as
> you can see, even a small amount of readahead can speed things up quite a lot,
> though the returns diminish as the buffer sizes get exponentially larger.  USB
> disks suffer across the board, probably due to their slow single-issue nature.
> Hopefully UAS will eliminate that gap, though currently it just crashes my
> machines.
> 
> Note that all of these filesystems are formatted ext4 with an per-group inode
> table size of 2MB, which is probably why readahead=2MB seems to win most often.
> I think 2MB is a small enough amount that we needn't worry about thrashing
> memory in the case of parallel e2fsck, particularly because with a small
> readahead amount, e2fsck is most likely going to demand the blocks fairly soon
> anyway.  The design of the new pass1 RA code won't issue RA for a fraction of a
> block group's inode table blocks, so I propose setting RA to blocksize *
> inode_blocks_per_group.

I forgot to mention that I'll disable RA if the buffer size is greater than
1/100th of RAM.

--D
> 
> On a lark I fired up an old ext3 filesystem to see what would happen, and the
> results generally follow the ext4 results.  I haven't done much digging into
> ext3 though.  Potentially, one could prefetch the block map blocks when reading
> in another inode_buffer_block's worth of inode tables.
> 
> Will send patches soon.
> 
> --D
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2014-08-09  3:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-09  3:18 e2fsck readahead speedup performance report Darrick J. Wong
2014-08-09  3:22 ` Darrick J. Wong [this message]
2014-08-09  3:56 ` Theodore Ts'o
2014-08-09  4:06   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140809032240.GK11191@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).