From: Theodore Ts'o <tytso@mit.edu>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 04/35] e2fsck: read-ahead metadata during passes 1, 2, and 4
Date: Mon, 20 Apr 2015 23:03:52 -0400
Message-ID: <20150421030352.GE3238@thunk.org>
In-Reply-To: <20150402023427.25243.66810.stgit@birch.djwong.org>

On Wed, Apr 01, 2015 at 07:34:27PM -0700, Darrick J. Wong wrote:
> e2fsck pass1 is modified to use the block group data prefetch function
> to try to fetch the inode tables into the pagecache before it is
> needed. We iterate through the blockgroups until we have enough inode
> tables that need reading such that we can issue readahead; then we sit
> and wait until the last inode table block read of the last group to
> start fetching the next bunch.
>
> pass2 is modified to use the dirblock prefetching function to prefetch
> the list of directory blocks that are assembled in pass1. We use the
> "iterate a subset of a dblist" and avoid copying the dblist. Directory
> blocks are fetched incrementally as we walk through the directory
> block list. In previous iterations of this patch we would free the
> directory blocks after processing, but the performance hit to e2fsck
> itself wasn't worth it. Furthermore, it is anticipated that most
> users will then mount the FS and start using the directories, so they
> may as well remain in the page cache.
>
> pass4 is modified to prefetch the block and inode bitmaps in
> anticipation of pass 5, because pass4 is entirely CPU bound.
>
> In general, these mechanisms can decrease fsck time by 10-40%, if the
> host system has sufficient memory and the storage system can provide a
> lot of IOPs. Pretty much any storage system capable of handling
> multiple IOs in-flight at any time will see a fairly large performance
> boost. (Single-issue USB mass storage disks seem to suffer badly.)
>
> By default, the readahead buffer size will be set to the size of a block
> group's inode table (which is 2MiB for a regular ext4 FS). The -E
> readahead_kb= option can be given to specify the amount of memory to
> use for readahead or zero to disable it entirely; or an option can be
> given in e2fsck.conf.
>
> v2: Fix an off-by-one error in the pass1 readahead which made the
> readahead trigger one inode too late if the block groups are full.
>
> v3: Use the dblist partial iterator function to read ahead parts
> of the directory block list in pass 2, instead of making sublists.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Thanks, applied.

- Ted
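
For readers following the pass1 description quoted above, here is a rough sketch of the batching idea and of where the 2MiB default buffer size comes from. It is not the code from the patch: the function names are made up for illustration, every block group is treated as having an inode table that needs reading, and the figures of 8192 inodes per group and 256-byte inodes are assumed as typical for a 4KiB-block ext4 filesystem.

```python
def inode_table_kib(inodes_per_group, inode_size):
    """Size of one block group's inode table, in KiB."""
    return inodes_per_group * inode_size // 1024


def plan_batches(num_groups, itable_kib, readahead_kib):
    """Split block groups into consecutive readahead batches.

    Groups are added to the current batch until the inode tables
    collected so far would fill the readahead buffer; that batch is
    then issued and a new one is started. Returns a list of
    (first_group, last_group) tuples. A buffer size of zero means
    readahead is disabled, so no batches are planned.
    """
    if readahead_kib == 0:
        return []

    batches = []
    start = 0      # first group of the batch being assembled
    pending = 0    # KiB of inode table collected for this batch
    for group in range(num_groups):
        pending += itable_kib
        if pending >= readahead_kib:
            batches.append((start, group))
            start = group + 1
            pending = 0

    # Whatever is left over forms a final, partially filled batch.
    if start < num_groups:
        batches.append((start, num_groups - 1))
    return batches


if __name__ == "__main__":
    # Assumed geometry: 8192 inodes per group, 256 bytes per inode.
    itable = inode_table_kib(8192, 256)
    print("inode table per group:", itable, "KiB")

    # Hypothetical run: 10 block groups with an 8192 KiB buffer,
    # as if the user had passed -E readahead_kb=8192.
    for first, last in plan_batches(10, itable, 8192):
        print("read ahead groups", first, "through", last)
```

With the default buffer of one inode table, each batch covers a single group; a larger readahead_kb value lets several groups be fetched per batch, at the cost of more memory.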