public inbox for linux-xfs@vger.kernel.org
From: Leslie Rhorer <lrhorer@mygrande.net>
To: Roger Willcocks <roger@filmlight.ltd.uk>, Sean Caron <scaron@umich.edu>
Cc: Eric Sandeen <sandeen@sandeen.net>, "xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: Corrupted files
Date: Tue, 09 Sep 2014 20:23:51 -0500
Message-ID: <540FA827.3090308@mygrande.net>
In-Reply-To: <62B15E94-1944-457F-B298-89EDEE3EC70D@filmlight.ltd.uk>

On 9/9/2014 8:00 PM, Roger Willcocks wrote:
> I normally watch quietly from the sidelines but I think it's important
> to get some balance here

	That is almost always wise advice.  Shooting from the hip often has 
regrettable consequences, yet being too cautious can have its downside, 
too.  In this case, things are working very well at the moment, and the 
apparent issues are reasonably small, so there is no need for panic.

> our customers between them run many hundreds
> of multi-terabyte arrays and when something goes badly awry it generally
> falls to me to sort it out. In my experience xfs_repair does exactly
> what it says on the tin.

	I couldn't say.  This is only the second time I have ever had an array 
drop, and the first time it was completely unrecoverable.  Less than 5 
minutes after I had started a RAID upgrade from RAID5 to RAID6, there 
was a protracted power outage.  I shut down the system cleanly and after 
the outage restarted the reshape.  The recovery had only been running a 
few minutes when the system suffered a kernel panic - I never did find 
out why.  Every single structure on the array larger than the stripe 
size (16K, I think) was garbage.

> I can recall only a couple of instances where we elected to reformat and
> reload from backups and they were both due to human error: somebody
> deleted the wrong raid unit when doing routine maintenance, and then
> tried to fix it up themselves.
>
> In theory of course xfs_repair shouldn't be needed if the write barriers
> work properly (it's a journalled filesystem), but low-level corruption
> does creep in due to power failures / kernel crashes and it's this which
> xfs_repair is intended to address; not massive data corruption due to
> failed hardware or careless users.

	Oh, yeah, like losing 3 out of 8 drives in the array after a drive 
controller replacement...

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


