From: Pallai Roland <dap@mail.index.hu>
To: David Chinner <dgc@sgi.com>
Cc: Linux-Raid <linux-raid@vger.kernel.org>, xfs@oss.sgi.com
Subject: Re: raid5: I lost a XFS file system due to a minor IDE cable problem
Date: Mon, 28 May 2007 13:17:31 +0200
Message-ID: <200705281317.32384.dap@mail.index.hu>
In-Reply-To: <20070528021718.GZ85884050@sgi.com>


On Monday 28 May 2007 04:17:18 David Chinner wrote:
> On Mon, May 28, 2007 at 03:50:17AM +0200, Pallai Roland wrote:
> > On Monday 28 May 2007 02:30:11 David Chinner wrote:
> > > On Fri, May 25, 2007 at 04:35:36PM +0200, Pallai Roland wrote:
> > > > ...and I've been spammed with such messages. Isn't this "internal
> > > > error" a good reason to shut down the file system?
> > >
> > > Actually, that error does shut the filesystem down in most cases. When
> > > you see that output, the function is returning -EFSCORRUPTED. You've
> > > got a corrupted freespace btree.
> > >
> > > The reason why you get spammed is that this is happening during
> > > background writeback, and there is no one to return the -EFSCORRUPTED
> > > error to. The background writeback path doesn't specifically detect
> > > shut down filesystems or trigger shutdowns on errors because that
> > > happens in different layers so you just end up with failed data writes.
> > > These errors will occur on the next foreground data or metadata
> > > allocation and that will shut the filesystem down at that point.
> > >
> > > I'm not sure that we should be ignoring EFSCORRUPTED errors here; maybe
> > > in this case we should be shutting down the filesystem.  That would
> > > certainly cut down on the spamming and would not appear to change
> > > any other behaviour....
> >
> >  If I remember correctly, my file system wasn't shut down at all; it
> > was "writeable" for the whole night, and yafc slowly "wrote" files to it.
> > Maybe all the write operations failed, but yafc doesn't warn about that.
>
> So you never created new files or directories, unlinked files or
> directories, did synchronous writes, etc? Just had slowly growing files?
 I just overwrote badly downloaded files.

> >  The spamming is just annoying when we need to find out what went wrong
> > (my kernel.log is 300 MB), but for data security I think it's important
> > to react to an EFSCORRUPTED error in every case. Please consider this.
>
> The filesystem has responded correctly to the corruption in terms of
> data security (i.e. failed the data write and warned noisily about
> it), but it probably hasn't done everything it should....
>
> Hmmmm. A quick look at the linux code makes me think that background
> writeback on linux has never been able to cause a shutdown in this
> case. However, the same error on Irix will definitely cause a
> shutdown, though....
 I hope Linux will follow Irix; that would be the consistent behaviour.
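
 To make sure I understand the difference, here is how I picture the two
behaviours. This is only a toy model in Python, not XFS code: the class, its
method names, the shutdown_on_background_error switch and the numeric value
of EFSCORRUPTED are all made up for illustration.

```python
import errno

# Illustrative error number only; an assumption, not taken from the kernel.
EFSCORRUPTED = 117


class ToyFilesystem:
    """Toy model of the two error paths; none of these names exist in XFS."""

    def __init__(self, shutdown_on_background_error=False):
        self.corrupted = False      # set when the free space btree is bad
        self.is_shut_down = False
        self.log_lines = 0          # counts "internal error" messages
        self.shutdown_on_background_error = shutdown_on_background_error

    def _allocate(self):
        # A shut down filesystem refuses the request without logging again.
        if self.is_shut_down:
            return -errno.EIO
        # A corrupted btree makes every allocation fail noisily.
        if self.corrupted:
            self.log_lines += 1
            return -EFSCORRUPTED
        return 0

    def background_writeback(self):
        # Nobody is waiting for this result, so by default the data write
        # simply fails and the filesystem stays up.
        err = self._allocate()
        if err == -EFSCORRUPTED and self.shutdown_on_background_error:
            self.is_shut_down = True
        return err

    def foreground_allocation(self):
        # A foreground caller receives the error and the filesystem is
        # shut down at that point.
        err = self._allocate()
        if err == -EFSCORRUPTED:
            self.is_shut_down = True
        return err
```

 With the switch off, every background writeback attempt logs another error
and the file system stays up until the first foreground allocation. With it
on, the first failed writeback shuts the file system down and the log stops
growing.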


 David, do you plan to implement your "reporting raid5 block layer" idea?
As far as I can see, no one else cares about this silent data loss on raid5
arrays that fail temporarily (cable, power), so I really hope that you at
least do!


--
 d

