public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: Andreas Dilger <adilger@dilger.ca>
To: Theodore Ts'o <tytso@mit.edu>,
	"Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>
Subject: e2fsck performance for large lost+found dirs
Date: Mon, 2 Nov 2015 15:49:38 -0700	[thread overview]
Message-ID: <62BDBCF1-B82A-449C-BFDD-84AF76CC829E@dilger.ca> (raw)

[-- Attachment #1: Type: text/plain, Size: 2573 bytes --]

Running e2fsck on a filesystem with a large number of disconnected inodes
can take a long time to complete.   This can happen with Lustre, since there
may be hundreds of thousands of inodes in one directory, and extent corruption
can wipe out the whole directory (more on that in a separate email).

It is a very lengthy O(n^2) operation to reattach all disconnected inodes
to lost+found, which can take a long time if there are millions of them.


It would be much more efficient to keep a cursor pointing to the last leaf
block in the directory in which an entry was inserted. Since e2fsck is not
deleting entries from lost+found, and because the entry names are always
getting longer (due to scanning in increasing inode number order) there is
no value to search earlier blocks again. This would make lost+found insertion
O(1) and significantly improve e2fsck performance in this case. This could be
very fast since there is only the inode bitmap to traverse, and filenames are
"#ino" so only leaf block allocation and writes needed.

For generic libext2fs usage (e.g. Darrick's FUSE interface) where entries
may be deleted from a directory, the cursor could be reset to the block of
any deleted entry, if it was at a lower offset.

Darrick, at one time I thought you had a patch to fix this behaviour, but
I couldn't find it.  Maybe your patch was related to a similar O(n^2) search
problem with block allocation?


Any thoughts about how to fix this?  I was originally thinking that I could
just cache this into the "file pointer", but no such thing exists in the
ext2fs_link() interface, only the directory inode number is passed, and it
has an additional level of indirection in that it calls the block iterator
with a callback to process each leaf block separately.  It also has the
problem that any cache would be local to ext2fs_link() and not visible to
ext2fs_unlink().

I was thinking to start with ext2fs_link() calling ext2fs_dir_iterate2()
directly, just to avoid the first level of confusion in this code.  That
still doesn't allow passing a starting offset for the iteration, however.

Next, cache the block number into the link_struct state and skip the leaf
block searches if block number < cursor, but that still needs iterating
over all the blocks to skip to the last one.

Should I just call ext2fs_block_iterate2() in e2fsck_reconnect_file()
and keep the mechanics local to e2fsck?  I thought it might be a good
generic optimization, but the interfaces make this difficult.

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

             reply	other threads:[~2015-11-02 22:49 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-02 22:49 Andreas Dilger [this message]
2015-11-02 23:42 ` e2fsck performance for large lost+found dirs Darrick J. Wong
2015-11-03  9:02   ` Andreas Dilger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=62BDBCF1-B82A-449C-BFDD-84AF76CC829E@dilger.ca \
    --to=adilger@dilger.ca \
    --cc=darrick.wong@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox