linux-ext4.vger.kernel.org archive mirror
From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Andreas Dilger <adilger@dilger.ca>
Cc: Rogier Wolff <R.E.Wolff@BitWizard.nl>,
	Paweł Brodacki <pawel.brodacki@googlemail.com>,
	Amir Goldstein <amir73il@gmail.com>, Ted Ts'o <tytso@mit.edu>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: fsck performance.
Date: Wed, 23 Feb 2011 03:54:35 +0100
Message-ID: <20110223025435.GL21917@bitwizard.nl>
In-Reply-To: <386B23FA-CE6E-4D9C-9799-C121B2E8C3BB@dilger.ca>

On Tue, Feb 22, 2011 at 09:32:28AM -0700, Andreas Dilger wrote:
> Roger,

> Any idea what the hash size does to memory usage?  I wonder if we
> can scale this based on the directory count, or if the memory usage
> is minimal (only needed in case of tdb) then just make it the
> default. It definitely appears to have been a major performance
> boost.

First, that hash size is passed to the tdb module, so yes it only
matters when tdb is actually used.

Second, I expect tdb's memory use to be significantly affected by the
hash size. However, tdb's memory use is dwarfed by e2fsck's own
memory use, and I have not noticed any difference in the memory use of
e2fsck (judging by "top" output; I haven't done any scientific
measurements).
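
As a back-of-the-envelope illustration of the trade-off, here is a minimal sketch. It assumes a chained hash table that keeps one fixed-size offset per bucket; the 4-byte bucket size, the function name and the key counts are assumptions for illustration, not measurements of tdb or e2fsck.

```python
def hash_table_cost(num_keys, hash_size, bucket_bytes=4):
    """Estimate the cost of a chained hash table.

    Returns (bucket_table_bytes, average_chain_length).
    bucket_bytes=4 is an assumed size for one bucket head.
    """
    table_bytes = hash_size * bucket_bytes   # grows linearly with the hash size
    avg_chain = num_keys / hash_size         # records walked per lookup, on average
    return table_bytes, avg_chain


if __name__ == "__main__":
    # Hypothetical workload: one million keys, small versus large table.
    for size in (131, 1_000_000):
        table_bytes, avg_chain = hash_table_cost(1_000_000, size)
        print(f"hash_size={size}: table={table_bytes} bytes, avg chain={avg_chain:.1f}")
```

Under these assumptions the bucket table stays in the megabyte range even for a large hash size, while the average chain length (and so the work per lookup) drops by orders of magnitude.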

> Another possible optimization is to use the in-memory icount list
> (preferably with the patch to reduce realloc size) until the
> allocations fail and only then dump the list into tdb?  That would
> allow people to run with a swapfile configured by default, but only
> pay the cost of on-disk operations if really needed.

I don't think this is a good idea. If you expect the "big"
allocations (i.e. icount) to eventually fail, you'll eventually end up
with a failing allocation somewhere else, one that you have nothing
prepared for. A program like e2fsck will be handling larger and
different filesystems "in the field" than you expected at the
outset. It should be robust.
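
For reference, the fallback scheme under discussion could be sketched roughly like this. It is a toy model, not e2fsck code: the class name is hypothetical, the allocation failure is simulated by an entry budget, and a plain dict stands in for tdb.

```python
class SpillingIcount:
    """Count references per inode in memory, then spill to a slower store.

    max_in_memory simulates the point at which the big allocation fails;
    disk_store is any dict-like object standing in for tdb.
    """

    def __init__(self, max_in_memory, disk_store):
        self.max_in_memory = max_in_memory
        self.memory = {}
        self.disk = disk_store
        self.spilled = False

    def _spill(self):
        # Dump everything collected so far into the slower store, once.
        self.disk.update(self.memory)
        self.memory.clear()
        self.spilled = True

    def increment(self, ino):
        if not self.spilled:
            is_new = ino not in self.memory
            if is_new and len(self.memory) >= self.max_in_memory:
                self._spill()          # simulated allocation failure
            else:
                self.memory[ino] = self.memory.get(ino, 0) + 1
                return self.memory[ino]
        self.disk[ino] = self.disk.get(ino, 0) + 1
        return self.disk[ino]

    def fetch(self, ino):
        store = self.disk if self.spilled else self.memory
        return store.get(ino, 0)
```

Note that the sketch only handles the one allocation it anticipates. Any other allocation that fails under the same memory pressure has no such fallback, which is the objection raised above.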

My fsck is currently walking the ridge.... 

It grew from about 1000M to over 2500M after pass 1. I was expecting it
to hit the 3G limit before the end. But luckily some memory
got released, and now it seems stable at 2001M.

It is currently again in a CPU-bound task. I think it's doing
lots of tdb lookups. 

It has asked me: 

First entry 'DSCN11194.JPG' (inode=279188586) in directory inode 277579348 (...) should be '.'
Fix<y>? yes


Which is clearly wrong. If we can find directory entries in that
directory (i.e. it actually IS a directory), then it is likely that
the file DSCN11194.JPG still exists, and that it has inode
279188586. If the entry should have been '.', it would have had inode
277579348. So instead of overwriting this "first entry" of the
directory, the fix should have been:

Directory "." is missing in directory inode 277579348. Add?

If necessary, room should be made inside the directory.
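
The repair proposed here can be sketched as follows. This is a toy model of a directory as a list of (name, inode) entries, not the on-disk ext4 format and not e2fsck's actual pass 2 code; the type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirEntry:
    name: str
    inode: int


def add_missing_dot(dir_inode, entries):
    """Return a new entry list whose first entry is '.' for dir_inode.

    An existing first entry with another name is kept and shifted down
    instead of being overwritten.
    """
    if entries and entries[0].name == ".":
        return list(entries)                      # nothing to repair
    return [DirEntry(".", dir_inode)] + list(entries)


if __name__ == "__main__":
    # Values taken from the fsck prompt quoted above.
    broken = [DirEntry("DSCN11194.JPG", 279188586)]
    for entry in add_missing_dot(277579348, broken):
        print(entry.name, entry.inode)
```

With this approach the file entry survives the repair, whereas overwriting the first entry would drop the only directory reference to inode 279188586.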

	Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
