Re: fsck.ext4 taking months

linux-ext4.vger.kernel.org archive mirror

From: Ted Ts'o <tytso@mit.edu>
To: Christian Brandt <brandtc@psi5.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: fsck.ext4 taking months
Date: Mon, 28 Mar 2011 11:47:30 -0400
Message-ID: <20110328154730.GD21075@thunk.org>
In-Reply-To: <4D8F1F75.8010201@psi5.com>

On Sun, Mar 27, 2011 at 01:28:53PM +0200, Christian Brandt wrote:
> Situation: External 500GB drive holds lots of snapshots using lots of
> hard links made by rsync --link-dest. The controller went bad and
> destroyed superblock and directory structures. The drive contains
> roughly a million files and four complete directory-tree-snapshots with
> each roughly a million hardlinks.

As Ric said, this is a configuration that can take a long time to
fsck, mainly due to swapping (it's fairly memory intensive).  But
500GB isn't *that* big.  The larger problem is that a lot more than
just superblock and directory structures got destroyed:

> File ??? (Inode #123456, modify time Wed Jul 22 16:20:23 2009)
>   block Nr. 6144 double block(s), used with four file(s):
>     <filesystem metadata>
>     ??? (Inode #123457, mod time Wed Jul 22 16:20:23 2009)
>     ??? (Inode #123458, mod time Wed Jul 22 16:20:23 2009)
>     ...
> multiply claimed block map? Yes

This means that you have very badly damaged inode tables.  You either
have garbage written into the inode table, or inode table blocks
written to the wrong location on disk, or both.  (I'd guess most
likely both).

> Is there an adhoc method of getting my data back faster?

What's your high level goal?  If this is a backup device, how badly do
you need the old snapshots?   

> Is the slow performance with lots of hard links a known issue?

Lots of hard links will cause a large memory usage requirement.  This
is a problem primarily on 32-bit systems, particularly (ahem) "value"
NAS systems that don't have a lot of physical memory to begin with.
On 64-bit systems, you can either install enough physical memory that
this won't be a problem, or you can enable swap, in which case you
might end up swapping a lot (which will cause things to be slow) but
it should finish.

We do have a workaround for people who just can't add the physical
memory, which involves adding a [scratch_files] section to
e2fsck.conf, and that does cause slow performance.  There has been
some work on improving that lately, by tuning the use of the tdb
library we are using.  But if you haven't specifically enabled this
workaround, it's probably not an issue.
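
For reference, a minimal sketch of what that workaround looks like in
/etc/e2fsck.conf.  The directory path here is only an example; it has
to exist and be writable, and e2fsck then keeps some of its tables in
tdb files there instead of in memory:

```ini
[scratch_files]
	directory = /var/cache/e2fsck
```

If no such section is present in your e2fsck.conf, the workaround is
not in effect.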

I think what you're running into is a problem caused by very badly
corrupted inode tables, and the work to keep track of the
double-allocated blocks is slowing things down.  We've improved things
a lot in this area, so we're O(n log n) in the number of multiply
claimed blocks, instead of O(n^2), but if n is sufficiently large, this
can still be problematic.
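
To illustrate the difference in scaling, here is a rough Python
sketch.  It is not e2fsck's actual code; the function names and the
(inode, block) pair representation are made up for the example, and n
here counts block claims for simplicity.  The first version compares
every pair of claims, the second sorts by block number and scans runs
of equal blocks:

```python
from collections import defaultdict


def find_multiply_claimed_quadratic(claims):
    """Compare every pair of (inode, block) claims: O(n^2)."""
    dup = defaultdict(set)
    for i in range(len(claims)):
        for j in range(i + 1, len(claims)):
            ino_a, blk_a = claims[i]
            ino_b, blk_b = claims[j]
            if blk_a == blk_b and ino_a != ino_b:
                dup[blk_a].update((ino_a, ino_b))
    return {blk: sorted(inos) for blk, inos in dup.items()}


def find_multiply_claimed_sorted(claims):
    """Sort claims by block number, then scan each run once: O(n log n)."""
    dup = {}
    ordered = sorted(claims, key=lambda claim: claim[1])
    i = 0
    while i < len(ordered):
        block = ordered[i][1]
        owners = set()
        j = i
        # Collect every inode that claims this block.
        while j < len(ordered) and ordered[j][1] == block:
            owners.add(ordered[j][0])
            j += 1
        if len(owners) > 1:
            dup[block] = sorted(owners)
        i = j
    return dup
```

Both return a mapping from block number to the inodes claiming it; only
the second stays usable once a trashed inode table produces millions of
bogus claims.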

There are patches that I've never had time to vet and merge that will
try to use heuristics to determine if an inode table block is hopeless
garbage, and if so, to skip the inode table block entirely.  This will
speed up e2fsck's performance in these situations, at the risk of
perhaps skipping some valid data that could have otherwise been
recovered.
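
To give a flavor of what such a heuristic might look like, here is a
purely hypothetical Python sketch.  It is not taken from those patches;
the function names, the 50% threshold, and the one-day timestamp slack
are all assumptions.  It reads the file type bits (i_mode, offset 0)
and the atime/ctime/mtime fields (offsets 8, 12, 16) of each on-disk
inode, and calls a block hopeless if most of its inodes look
implausible:

```python
import struct

# File type values that can legitimately appear in the top 4 bits of i_mode.
VALID_FILE_TYPES = {0x1000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000}


def inode_is_plausible(raw, now, slack=86400):
    """Return True if one raw on-disk inode looks sane (assumed checks)."""
    if not any(raw):
        return True  # an all-zero inode is simply unused
    (i_mode,) = struct.unpack_from("<H", raw, 0)
    i_atime, i_ctime, i_mtime = struct.unpack_from("<3I", raw, 8)
    if (i_mode & 0xF000) not in VALID_FILE_TYPES:
        return False
    # Timestamps far in the future suggest garbage (ignores ext4 extra epoch bits).
    return max(i_atime, i_ctime, i_mtime) <= now + slack


def block_is_hopeless(block, inode_size, now, max_bad_fraction=0.5):
    """Return True if more than max_bad_fraction of the inodes look bogus."""
    inodes = [block[off:off + inode_size]
              for off in range(0, len(block) - inode_size + 1, inode_size)]
    bad = sum(1 for raw in inodes if not inode_is_plausible(raw, now))
    return bad > max_bad_fraction * len(inodes)
```

The threshold is exactly where the tradeoff mentioned above lives: set
it too low and you skip blocks that still hold recoverable inodes.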

So where are you at this point?  Have you completed running the fsck,
and simply wanted to let us know?  Do you need assistance in trying to
recover this disk?

							- Ted
