linux-ext4.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Allison Henderson <achender@linux.vnet.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger@dilger.ca>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	"jane@us.ibm.com" <jane@us.ibm.com>,
	"marcel.dufour@ca.ibm.com" <marcel.dufour@ca.ibm.com>
Subject: Re: fs corruption recovery
Date: Fri, 20 Mar 2015 11:45:02 -0700
Message-ID: <20150320184501.GN11031@birch.djwong.org>
In-Reply-To: <550BB465.6040601@linux.vnet.ibm.com>

On Thu, Mar 19, 2015 at 10:47:17PM -0700, Allison Henderson wrote:
> On 03/19/2015 06:47 PM, Theodore Ts'o wrote:
> >On Wed, Mar 18, 2015 at 06:59:52PM -0600, Andreas Dilger wrote:
> >>I think that running a 17TB filesystem on ext3 is a recipe for disaster.  They should use ext4 for anything larger than 16TB.
> >
> >It's not *possible* to have a 17TB file system with ext3.  Something
> >must be very wrong there.  16TB is the maximum you can have before you
> >end up overflowing a 32-bit block number.  Unless this is a PowerPC
> >with a 16K block size or some such?
> >
> >If e2fsck is segfaulting, then I would certainly try getting the
> >latest version of e2fsprogs, just in case the problem isn't just that
> >it's running out of memory.  Also if recovering customer data is the
> >most important thing, the first thing they should do is make an image
> >copy of the file system, since it's possible that incorrect use of
> >e2fsck, or an old/buggy version of e2fsck could make things work.

...make things *worse*.

> >
> >In particular, if they are seeing errors with multiply claimed inodes,
> >it's likely that part of the inode table was written to the wrong
> >place, and sometimes a skilled human being can get more data than
> >simply using e2fsck -y and praying.  At the end of the day the
> >question is how much is the customer data worth and how much effort is
> >the customer / IBM willing to invest in trying to get every last bit
> >of data back?
> >
> >						- Ted
> >
> 
> Hi all,
> 
> Sorry for the delay, our email servers went down for a bit after I
> sent the email.  I will work with Marcel to find the block size,
> page size and arch.  It is my understanding that they have a

Just guessing PPC, in which case you'll really want an e2fsck released after
the giant heaps of bugfixes I've sent over the last year.  There were a lot
of bugs that only show up on bigendian systems, which probably don't get
much testing nowadays.
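
To make the byte-order point concrete: ext2/3/4 on-disk fields are stored
little-endian, so a tool running on a big-endian host has to byte-swap every
field it reads, and a forgotten swap stays invisible on x86.  A minimal Python
sketch, not e2fsprogs code: the in-memory image and the read_magic helper are
made up for illustration, while the superblock offset (1024), the s_magic
offset (56) and the magic value (0xEF53) are the standard ones.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1024 bytes into the device
S_MAGIC_OFFSET = 56        # s_magic is a 16-bit field at offset 56 in the superblock
EXT_MAGIC = 0xEF53

def read_magic(image, byteorder="<"):
    """Read s_magic from a raw image; byteorder is a struct prefix ('<' or '>')."""
    (magic,) = struct.unpack_from(byteorder + "H", image,
                                  SUPERBLOCK_OFFSET + S_MAGIC_OFFSET)
    return magic

# Hypothetical 2 KiB image holding nothing but a correctly stored magic number.
image = bytearray(2048)
struct.pack_into("<H", image, SUPERBLOCK_OFFSET + S_MAGIC_OFFSET, EXT_MAGIC)

print(hex(read_magic(image, "<")))  # explicit little-endian read: 0xef53
print(hex(read_magic(image, ">")))  # what an unswapped big-endian read sees: 0x53ef
```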

Even if it's a 17592186044416 byte (16TiB) ext3 FS on x86, you're probably still better
off with a less buggy e2fsck.  There are a number of fixes to prevent the
crosslinked file fixer and the directory fixer from doing insane things to
the FS.
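
For reference, the size limit Ted mentions is simple arithmetic: with 32-bit
block numbers the filesystem can address at most 2^32 blocks, so the ceiling
scales with the block size.  A quick Python sketch (max_fs_bytes is only an
illustrative helper, not anything from e2fsprogs):

```python
def max_fs_bytes(block_size, block_number_bits=32):
    """Largest filesystem, in bytes, addressable with the given block number width."""
    return (1 << block_number_bits) * block_size

for block_size in (1024, 4096, 16384, 65536):
    tib = max_fs_bytes(block_size) / 2**40
    print(f"{block_size:>6} byte blocks: {tib:g} TiB")
```

With 4K blocks that comes to 17592186044416 bytes, i.e. 16TiB, which is about
17.6TB in decimal units.  That may be where a "17TB" figure comes from, though
that is only a guess.  A 16K block size would raise the ceiling to 64TiB.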

> contract with this customer to maintain this data, so there is
> pressure to recover it. Unfortunately the product mirrored the fs
> corruption to the backup device before the corruption was
> discovered.  I've been told that I was the only person they could
> find left that had some background with ext3/4, so I have an inkling

Yep. ;)

> that the "skilled human being" might end up being me, even though
> it's been a while since I've worked with it. :-) Maybe I could poke
> into the inode table and see what I can figure out. We will be sure
> to make image backups though.  Thx a bunch for the feedback, we
> really appreciate the help!  I will keep folks updated when I have
> more info.  Thx!

If you have LVM or other volume management, please take a snapshot and fsck the
snapshot first, so you can capture a log of what happens without blasting away
at existing data.
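
Roughly, the snapshot-first workflow could look like the Python sketch below.
It only prints the commands unless dry_run is turned off.  The volume group,
LV name, snapshot name, size and log path (vg0, data, data_fsck, 50G,
e2fsck-snapshot.log) are placeholders, and -n keeps e2fsck read-only for the
first pass.

```python
import shlex
import subprocess

def snapshot_cmd(vg, lv, snap_name, size):
    """Build an lvcreate command that snapshots /dev/<vg>/<lv>."""
    return ["lvcreate", "--snapshot", "--size", size,
            "--name", snap_name, f"/dev/{vg}/{lv}"]

def fsck_cmd(vg, snap_name, answer="-n"):
    """Build an e2fsck command for the snapshot; -f forces a full check,
    -n opens the device read-only and answers 'no' to every question."""
    return ["e2fsck", "-f", answer, f"/dev/{vg}/{snap_name}"]

def run_logged(cmd, log_path, dry_run=True):
    """Run cmd with stdout and stderr captured in log_path.
    In dry-run mode, just print what would be executed."""
    if dry_run:
        print(shlex.join(cmd), ">", log_path)
        return None
    with open(log_path, "w") as log:
        # e2fsck returns non-zero when it finds problems, so no check=True here.
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode

# Placeholder names; substitute the real volume group and logical volume.
run_logged(snapshot_cmd("vg0", "data", "data_fsck", "50G"), "lvcreate.log")
run_logged(fsck_cmd("vg0", "data_fsck"), "e2fsck-snapshot.log")
```

Once the read-only log has been reviewed, rerunning with answer="-y" against
the snapshot shows what the repair would actually do.  The snapshot needs
enough space to absorb those writes.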

--D

> 
> Allison Henderson
> 

