From: Amir Goldstein <amir73il@gmail.com>
To: Rogier Wolff <R.E.Wolff@bitwizard.nl>
Cc: Andreas Dilger <adilger@dilger.ca>, linux-ext4@vger.kernel.org
Subject: Re: fsck performance.
Date: Thu, 24 Feb 2011 10:59:23 +0200
Message-ID: <AANLkTinv-JPaiPzKC4xSqh0shmzjf4rKeQShRwiOznoz@mail.gmail.com>
In-Reply-To: <20110224072945.GE16661@bitwizard.nl>

On Thu, Feb 24, 2011 at 9:29 AM, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote:
> On Wed, Feb 23, 2011 at 03:24:18PM -0700, Andreas Dilger wrote:
>
>> The dircount can be extracted from the group descriptors, which
>> count the number of allocated directories in each group. Since the
>
> OK.
>
>> superblock "free inodes" count is no longer updated except at
>> unmount time, the code would need to walk all of the group
>> descriptors to get this number anyway.
>
> No worries. It matters a bit for performance, but if that free inode
> count in the superblock is outdated, we'll just use that outdated
> one. The one case that I'm afraid of is that someone creates a new
> filesystem (superblock inodes-in-use =~= 0), then copies on millions
> of files, and then crashes his system....
>
> I'll add a minimum of 999931, causing an overhead of around 4Mb of
> disk space usage if this was totally unnecessary.
>
>> If you have the opportunity, I wonder whether the entire need for
>> tdb can be avoided in your case by using swap and the icount
>> optimization patches previously posted? I'd really like to get that
>> patch included upstream, but it needs testing in an environment like
>> yours where icount is a significant factor. This would avoid all of
>> the tdb overhead.
>
> First: I don't think it will work. The largest amount of memory that
> e2fsck had allocated was 2.5Gb. At that point it also had around 1.5G
> of disk space in use for tdb's for a total of 4G. On the other hand,
> we've established that the overhead in tdb is about 24bytes per 8
> bytes of real data.... So maybe we would only have needed 200M of
> in-memory datastructures to handle this. Two of those 400M together
> with the dircount (tdb =750M, assume same ratio) total 600M still
> above 3G.
>
> Second: e2fsck is too fragile as it is. It should be able to handle
> big filesystems on little systems. I have a puny little 2GHz Athlon
> system that currently has 3T of disk storage and 1G RAM. Embedded
> Linux systems can be running those amounts of storage with only 64
> or 128 Mb of RAM.
>
> Even if MY filesystem happens to pass, with a little less memory-use,
> then there is a slightly larger system that won't.
>
> I have a server that has 4x2T instead of the server that has 4*1T. It
> uses the same backup strategy, so it too has lots and lots of files.
> In fact it has 84M inodes in use. (I thought 96M inodes would be
> plenty... wrong! I HAVE run out of inodes on that thing!)
>
> That one too may need to fsck the filesystem...
>
> I remember hearing about a tool that would extract all the filesystem
> meta-info, so that I can make an image that I can then test e.g. fsck
> upon? Inodes, directory blocks, indirect blocks etc.?
>
That tool is e2image -r. It creates a sparse image file of your fs
(only the metadata is written; the rest is holes), so take care when
copying or transferring it to another machine: use a method that does
not expand the holes (e.g. bzip it, or dd it directly to a new HDD).
I'm not sure what you would do if fsck fixes errors on that image...
In most cases (for example, as long as it didn't clone multiply-claimed
blocks), you would be able to write the fixed image back onto your
original fs, but that would be risky.
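
To illustrate why a sparse image needs care when copied, here is a
minimal Python sketch. It does not call e2image; it only creates a
mostly-hole file and compares its apparent size with the space actually
allocated. The function name sparse_demo, the 64 MiB size, and the
payload are hypothetical, and it assumes a Unix-like system and a
filesystem that supports holes.

```python
import os
import tempfile


def sparse_demo(logical_size=64 * 1024 * 1024, payload=b"metadata block"):
    """Create a mostly-hole file; return (apparent_bytes, allocated_bytes)."""
    fd, path = tempfile.mkstemp(suffix=".img")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)          # a little real data at the start
            f.truncate(logical_size)  # extend the file; the tail is a hole
                                      # on filesystems that support them
        st = os.stat(path)
        # st_blocks counts 512-byte units actually allocated (Unix only).
        return st.st_size, st.st_blocks * 512
    finally:
        os.unlink(path)


apparent, allocated = sparse_demo()
print(f"apparent size: {apparent} bytes, allocated on disk: {allocated} bytes")
```

On a filesystem with hole support the allocated figure should be far
smaller than the apparent size. A copy tool that reads the file
naively writes out every zero and loses that saving, which is why
compressing the image or writing it straight to a disk is suggested.
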
> Then I could make an image where I could test this. I don't really
> want to put this offline again for multiple days.
>
>
> Roger.
>
>
> --
> ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
> ** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
> *-- BitWizard writes Linux device drivers for any device you may have! --*
> Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
> Does it sit on the couch all day? Is it unemployed? Please be specific!
> Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>