From: David Brown <david.brown@hesbynett.no>
To: linux-raid@vger.kernel.org
Subject: Re:
Date: Mon, 26 Sep 2011 21:56:50 +0200
Message-ID: <j5qle2$tkq$1@dough.gmane.org>
In-Reply-To: <33a40b3d7b23e719d77e6d091064c285.squirrel@www.maxstr.com>

On 26/09/11 20:04, Kenn wrote:
>> On Mon, 26 Sep 2011 00:42:23 -0700 "Kenn"<kenn@kenn.us>  wrote:
>>
>>> Replying.  I realize and I apologize I didn't create a subject.  I hope
>>> this doesn't confuse majordomo.
>>>
>>>> On Sun, 25 Sep 2011 21:23:31 -0700 "Kenn"<kenn@kenn.us>  wrote:
>>>>
>>>>> I have a raid5 array that had a drive drop out, and resilvered the
>>>>> wrong drive when I put it back in, corrupting and destroying the
>>>>> raid.  I stopped the array at less than 1% resilvering and I'm in
>>>>> the process of making a dd-copy of the drive to recover the files.
>>>>
>>>> I don't know what you mean by "resilvered".
>>>
>>> Resilvering -- Rebuilding the array.  Lesser used term, sorry!
>>
>> I see..
>>
>> I guess that looking-glass mirrors have a silver backing and when it
>> becomes tarnished you might re-silver the mirror to make it better again.
>> So the name works as a poor pun for RAID1.  But I don't see how it applies
>> to RAID5....
>> No matter.
>>
>> Basically you have messed up badly.
>> Recreating arrays should only be done as a last-ditch attempt to get data
>> back, and preferably with expert advice...
>>
>> When you created the array with all devices present it effectively started
>> copying the corruption that you had deliberately (why??) placed on device 2
>> (sde) onto device 4 (counting from 0).
>> So now you have two devices that are corrupt in the early blocks.
>> There is not much you can do to fix that.
>>
>> There is some chance that 'fsck' could find a backup superblock somewhere
>> and try to put the pieces back together.  But the 'mkfs' probably made a
>> substantial mess of important data structures so I don't consider your
>> chances very high.
>> Keeping sde out and just working with the remaining 4 is certainly your
>> best bet.
>>
>> What made you think it would be a good idea to re-create the array when
>> all you wanted to do was trigger a resync/recovery??
>>
>> NeilBrown
>
> Originally I had failed & removed sde from the array and then added it
> back in, but no resilvering happened, it was just placed as raid device
> #5 as an active (faulty?) spare, no rebuilding.  So I thought I'd have to
> recreate the array to get it to rebuild.
>
> Because my sde disk was only questionably healthy, if the problem was the
> loose cable, I wanted to test the sde disk by having a complete rebuild
> put onto it.   I was confident in all the other drives because when I
> mounted the array without sde, I ran a complete md5sum scan and
> everything's checksum was correct.  So I wanted to force a complete
> rebuilding of the array on sde and the --zero-superblock was supposed to
> render sde "new" to the array to force the rebuild onto sde.  I just did
> the fsck and mkfs for good measure instead of spending the time using
> dd to zero every byte on the drive.  At the time I thought that if
> --zero-superblock went wrong, md would reject a blank drive as a data
> source for rebuilding and prevent resilvering.
>
> So that brings up another point -- I've been reading through your blog,
> and I acknowledge your view that checksums on every block have little
> benefit (http://neil.brown.name/blog/20110227114201), but sometimes
> people like having that extra lock on their door even though it takes
> more effort to go in and out of their home.  In my five-drive array, if
> the last five words were the checksums of the blocks on every drive, the
> checksums off each drive could vote on trusting the blocks of every other
> drive during the rebuild process, and prevent an idiot (me) from killing
> his data.  It would force wasteful sectors on the drive, perhaps harm
> performance by squeezing 2+n bytes out of each sector, but if someone
> wants to protect their data as much as possible, it would be a welcome
> option where performance is not a priority.
>
> Also, the checksums do provide some protection: first, against
> partial media failure, which is a major flaw in raid 456 design according
> to http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt , and checksum
> voting could protect against the Atomicity/write-in-place flaw outlined in
> http://en.wikipedia.org/wiki/RAID#Problems_with_RAID .
>
> What do you think?
>
> Kenn

/raid/ protects against partial media flaws.  If one disk in a raid5
stripe has a bad sector, that sector will be ignored and the missing
data will be re-created from the other disks in the stripe using the
raid recovery algorithm (for raid5, the XOR of the remaining data and
parity blocks).  If you want to keep that protection even while doing a
resync (as many people do), then use raid6 - it has two parity blocks,
so a bad sector found during the rebuild of a failed disk can still be
recovered.
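
As a rough illustration of that recovery step, here is a minimal Python
sketch of single-parity reconstruction.  It is not md's actual code: the
block contents, the four-data-plus-one-parity layout and the helper name
xor_blocks are all made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical stripe on a 5-disk raid5: four data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Pretend the disk holding data block 2 reports a bad sector for this stripe.
lost = 2
survivors = [blk for i, blk in enumerate(data) if i != lost] + [parity]

# XOR of everything that is still readable gives back the missing block.
rebuilt = xor_blocks(survivors)
```

With single parity this only works while every other block in the stripe
is readable, which is why a second bad sector during a rebuild is fatal
for raid5 but not for raid6.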

As Neil points out in his blog, it is impossible to fully recover from a
failure part way through a write - checksum voting or majority voting
/may/ give you the right answer, but it may not.  If you need protection
against that, you have to have filesystem-level control (data
journalling as well as metadata journalling), or perhaps use raid
systems with battery-backed write caches.
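
To make the interrupted-write case concrete, here is a small Python
sketch.  It rests on assumptions of mine, not on anything md does: each
block carries its own CRC32 that is written atomically with the block,
and the crash happens after a new data block reaches disk but before the
matching parity update does.  The block contents and helper names are
invented for the example.

```python
import zlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

old_data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
old_parity = xor_blocks(old_data)

# Crash part way through updating block 1: the new data is on disk,
# the parity block still describes the old data.
blocks = list(old_data) + [old_parity]
blocks[1] = b"bbbb"

# The stripe as a whole no longer XORs to zero, so the damage is detectable.
stripe_consistent = xor_blocks(blocks) == bytes(4)

# Assumed per-block checksums: every block was written whole, so each one
# still matches its own checksum and none of them looks suspicious.
stored = [(blk, zlib.crc32(blk)) for blk in blocks]
all_checksums_ok = all(zlib.crc32(blk) == crc for blk, crc in stored)

# "Repairing" any single block from the other four yields a consistent stripe,
# so consistency alone cannot say which block is the stale one.
candidates = []
for i in range(len(blocks)):
    others = blocks[:i] + blocks[i + 1:]
    candidates.append(blocks[:i] + [xor_blocks(others)] + blocks[i + 1:])
```

Here the inconsistency is detected, every block passes its own checksum,
and each of the five candidate repairs is self-consistent: rewriting the
parity keeps the new data, rewriting block 1 silently restores the old
data, and the other three invent data that was never written.  Picking
among them needs knowledge of what was in flight, which is what a
journal or a battery-backed cache supplies.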


