From: David Chinner <dgc@sgi.com>
To: Pallai Roland <dap@mail.index.hu>
Cc: David Chinner <dgc@sgi.com>,
Linux-Raid <linux-raid@vger.kernel.org>,
xfs@oss.sgi.com
Subject: Re: raid5: I lost a XFS file system due to a minor IDE cable problem
Date: Tue, 29 May 2007 09:36:17 +1000
Message-ID: <20070528233617.GI85884050@sgi.com>
In-Reply-To: <200705281730.53343.dap@mail.index.hu>
On Mon, May 28, 2007 at 05:30:52PM +0200, Pallai Roland wrote:
>
> On Monday 28 May 2007 14:53:55 Pallai Roland wrote:
> > On Friday 25 May 2007 02:05:47 David Chinner wrote:
> > > "-o ro,norecovery" will allow you to mount the filesystem and get any
> > > uncorrupted data off it.
> > >
> > > You still may get shutdowns if you trip across corrupted metadata in
> > > the filesystem, though.
> >
> > This filesystem is completely dead.
> > [...]
>
> I tried to make a md patch to stop writes if a raid5 array got 2+ failed
> drives, but I found it's already done, oops. :) handle_stripe5() ignores
> writes in this case quietly, I tried and works.
Hmmm - it clears the uptodate bit on the bio, which is supposed to
make the bio return EIO. That looks to be doing the right thing...
> There's an another layer I used on this box between md and xfs: loop-aes. I
Oh, that's a kind of important thing to forget to mention....
> used it since years and rock stable, but now it's my first suspect, cause I
> found a bug in it today:
> I assembled my array from n-1 disks, and I failed a second disk for a test
> and I found /dev/loop1 still provides *random* data where /dev/md1 serves
> nothing, it's definitely a loop-aes bug:
.....
> It's not an explanation to my screwed up file system, but for me it's enough
> to drop loop-aes. Eh.
If you can get random data back instead of an error from the block device,
then I'm not surprised your filesystem is toast. If it's one sector in a
larger block that is corrupted, then the only thing that will protect you from
this sort of corruption causing problems is metadata checksums (yet another
thing on my list of stuff to do).
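To illustrate the idea of a metadata checksum catching one bad sector inside
a larger block, here is a minimal user-space sketch. It is not XFS code and
not an XFS on-disk format: the 4096-byte block, the 512-byte sector, the
CRC32 stored in the first four bytes, and the helper names seal_block,
verify_block and corrupt_sector are all assumptions made up for this example.

```python
import struct
import zlib

BLOCK_SIZE = 4096   # hypothetical metadata block size
SECTOR_SIZE = 512   # hypothetical sector size
CRC_FMT = "<I"      # 4-byte little-endian CRC stored at the start of the block
CRC_LEN = struct.calcsize(CRC_FMT)


def seal_block(payload: bytes) -> bytes:
    """Return a block with a CRC32 of the payload stored in its first 4 bytes."""
    if len(payload) != BLOCK_SIZE - CRC_LEN:
        raise ValueError("payload must fill the block exactly")
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.pack(CRC_FMT, crc) + payload


def verify_block(block: bytes) -> bool:
    """Recompute the CRC32 over the payload and compare it with the stored value."""
    if len(block) != BLOCK_SIZE:
        return False
    (stored,) = struct.unpack_from(CRC_FMT, block, 0)
    return stored == (zlib.crc32(block[CRC_LEN:]) & 0xFFFFFFFF)


def corrupt_sector(block: bytes, sector: int) -> bytes:
    """Flip every bit in one sector, standing in for bad data from the device."""
    start = sector * SECTOR_SIZE
    damaged = bytes(b ^ 0xFF for b in block[start:start + SECTOR_SIZE])
    return block[:start] + damaged + block[start + SECTOR_SIZE:]


if __name__ == "__main__":
    payload = (bytes(range(256)) * 16)[:BLOCK_SIZE - CRC_LEN]
    block = seal_block(payload)
    print("clean block verifies:", verify_block(block))
    print("damaged block verifies:", verify_block(corrupt_sector(block, 3)))
```

A reader that verifies the block before trusting it can return an error for
the damaged block instead of acting on whatever the lower layer handed back.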
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group