From: Theodore Tso <tytso@mit.edu>
To: Frank Mayhar <fmayhar@google.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: fsck infinite loop on corrupt ext4 file system
Date: Tue, 18 Aug 2009 13:03:31 -0400
Message-ID: <20090818170331.GE28560@mit.edu>
In-Reply-To: <1250613069.10195.12.camel@bobble.smo.corp.google.com>

On Tue, Aug 18, 2009 at 09:31:09AM -0700, Frank Mayhar wrote:
> 
> Will do.  I wasn't able to keep a copy of the corrupted image but I
> should be able to do _something_ with your patch.  Thanks!
> 

OK, I was hoping you had a test case handy.  I'll try to generate one,
so I can check the changes into git.  I had held off on checking them
in, just in case I had missed something that would get caught if you
still had a corrupted image to test the patch against.

> > In addition, e2fsck tries very hard not to destroy data, and so there
> > is the question of what to do if there are data blocks located where
> > the inode table "should" be.
> 
> I would think that that case would be even more rare than the one we're
> dealing with here.  In fact outside of a resize operation I can't think
> of how it might happen.

With ext3 and ext4 prior to 2.6.30 (when we added the block validity
check code), it was actually pretty easy for this to happen: all it
would take is a corrupted block allocation bitmap.  With the latest
ext4 code, I grant it's pretty unlikely to happen.

It still can happen, if the block group descriptors get corrupted
such that the block allocation bitmap pointer points to a mostly
zero-filled block, and the inode table pointer for a block group is
also corrupted to point somewhere random.  If this doesn't get
noticed for some period of time while blocks are allocated, and then
later, e2fsck recovers by reading the backup block group descriptors,
this failure mode could very much happen.  It does require multiple
simultaneous failures, though, so it's not likely, but over hundreds
of thousands or millions of deployed Linux systems, Murphy's Law has a
way of catching up with us.  :-/

Something we *could* do to further reduce the chances would be to
compare the primary and backup group descriptors, either at
mount-time, or in e2fsck.  This would add an extra level of paranoia,
although the people who are trying to do 5 second boots with HDDs
would probably complain about the extra seeks that we'd be introducing
as a result.
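
As a rough sketch of what such a comparison might look like (this is
not e2fsck or kernel code; the GroupDesc class, its field names, and
compare_descriptors() are all hypothetical, made up for illustration),
assuming we only compare the location fields, since the free counts in
the backup descriptors are expected to be stale:

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical, simplified stand-in for an on-disk group descriptor.
@dataclass(frozen=True)
class GroupDesc:
    block_bitmap: int   # block number of the block allocation bitmap
    inode_bitmap: int   # block number of the inode allocation bitmap
    inode_table: int    # first block of the inode table
    free_blocks: int = 0  # assumed stale in backups, so never compared


# Only the pointers are compared; counters are allowed to drift.
LOCATION_FIELDS = ("block_bitmap", "inode_bitmap", "inode_table")


def compare_descriptors(
    primary: List[GroupDesc], backup: List[GroupDesc]
) -> List[Tuple[int, str, int, int]]:
    """Return (group, field, primary_value, backup_value) per mismatch."""
    if len(primary) != len(backup):
        raise ValueError("primary and backup describe different group counts")
    mismatches = []
    for group, (p, b) in enumerate(zip(primary, backup)):
        for field in LOCATION_FIELDS:
            pv, bv = getattr(p, field), getattr(b, field)
            if pv != bv:
                mismatches.append((group, field, pv, bv))
    return mismatches


# Group 1's inode table pointer has been corrupted in the primary copy.
primary = [GroupDesc(100, 101, 102, 50), GroupDesc(8292, 8293, 999999, 10)]
backup = [GroupDesc(100, 101, 102, 70), GroupDesc(8292, 8293, 8294, 30)]
print(compare_descriptors(primary, backup))
# prints [(1, 'inode_table', 999999, 8294)]
```

A mismatch here only says that the two copies disagree, not which one
is right; deciding that would still need the other sanity checks.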

							- Ted
