Re: Suggestion needed for fixing RAID6

From: MRK <mrk@shiftmail.org>
To: Janos Haar <janos.haar@netcenter.hu>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Suggestion needed for fixing RAID6
Date: Fri, 23 Apr 2010 14:34:24 +0200
Message-ID: <4BD193D0.5080003@shiftmail.org>
In-Reply-To: <695a01cae2c1$a72907d0$0400a8c0@dcccs>

On 04/23/2010 10:47 AM, Janos Haar wrote:
>
> ----- Original Message ----- From: "Luca Berra" <bluca@comedia.it>
> To: <linux-raid@vger.kernel.org>
> Sent: Friday, April 23, 2010 8:51 AM
> Subject: Re: Suggestion needed for fixing RAID6
>
>
>> another option could be using the device mapper snapshot-merge target
>> (writable snapshot), which iirc is a 2.6.33+ feature
>> look at
>> http://smorgasbord.gavagai.nl/2010/03/online-merging-of-cow-volumes-with-dm-snapshot/ 
>>
>> for hints.
>> btw i have no clue how the scsi error will travel thru the dm layer.
>> L.
>
> ...or cowloop! :-)
> This is a good idea! :-)
> Thank you.
>
> I have another one:
> re-create the array (--assume-clean) with external bitmap, then drop 
> the missing drive.
> Then manually manipulate the bitmap file to re-sync only the last 10% 
> which is good enough for me...

Cowloop is more or less deprecated in favour of DM, according to
Wikipedia, and messing with the bitmap looks complicated to me.
I think Luca's is a great suggestion. You can use 3 files on loop
devices to hold the COW stores for the 3 faulty disks, so that all
writes go there instead of to the disks and you can complete the resync.
Then you would fail the COW devices one by one from mdadm and replicate
to spares.
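
A minimal sketch of that setup for one disk, assuming the plain
(non-persistent) dm-snapshot target is used. The device names, the COW
file path, its 4G size and the chunk size of 8 sectors are hypothetical
placeholders, and /dev/loopN stands for whatever loop device losetup
reports. The commands are only printed, not executed, so they can be
reviewed first.

```shell
#!/bin/sh
# Build the device-mapper table line for a snapshot of an origin disk whose
# writes are diverted to a COW device.
# "N" = non-persistent exception store, 8 = chunk size in 512-byte sectors
# (both are assumptions for this sketch).
snapshot_table() {
    # $1 = origin size in 512-byte sectors (e.g. from: blockdev --getsz /dev/sdX)
    # $2 = origin device, $3 = COW device
    printf '0 %s snapshot %s %s N 8\n' "$1" "$2" "$3"
}

# Print (do not run) the commands for one faulty disk.
print_cow_setup() {
    # $1 = faulty disk, $2 = COW file, $3 = mapping name, $4 = size in sectors
    echo "truncate -s 4G $2"
    echo "losetup -f --show $2"
    echo "dmsetup create $3 --table '$(snapshot_table "$4" "$1" /dev/loopN)'"
}
```

For example, `print_cow_setup /dev/sdb /tmp/cow-sdb.img cow_sdb 1000`
prints the three commands for a hypothetical /dev/sdb. The array would
then be assembled from /dev/mapper/cow_sdb instead of the raw disk.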

But this will work ONLY if read errors are still reported across the 
DM-snapshot layer. Otherwise (e.g. if it returns a block of zeroes 
without an error) you will end up with data corruption when 
replacing drives.

You can check whether read errors are reported by watching dmesg 
during the resync. If you see many "read error corrected..." lines, it 
works. If dmesg stays silent, md has not received any read errors, which 
means it doesn't work. If it doesn't work DO NOT go ahead replacing 
drives, or you will get data corruption.
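
A small sketch of that check, assuming the kernel log lines contain the
literal text "read error corrected" and that the resync has already
passed over the sectors known to be bad. The function names are made up
for illustration.

```shell
#!/bin/sh
# Count kernel log lines that report a corrected read error.
# Reads the log text on stdin, so it works with dmesg or a saved file.
count_corrected() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore the status.
    grep -c 'read error corrected' || true
}

# Print a verdict; return 0 if read errors were seen, 1 otherwise.
check_read_errors() {
    n=$(count_corrected)
    if [ "$n" -gt 0 ]; then
        echo "seen $n corrected read errors: errors do pass through the snapshot"
        return 0
    fi
    echo "no corrected read errors seen: do NOT replace drives"
    return 1
}
```

Typical use would be `dmesg | check_read_errors` once the resync has
finished or is well under way.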

So you need an initial test which just performs a resync but *without* 
replicating to a spare. I suggest you first remove all the spares 
from the array, then create the COW snapshots, then assemble the array, 
perform a resync and look at dmesg. If it works: add the spares back, 
fail one drive, etc.
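
The order of that test could be written down roughly as follows. This is
only a sketch: it prints the commands instead of running them, the
device names are placeholders, stopping the array before creating the
snapshots is my assumption, and triggering the pass with "repair" via
sync_action is one possible way to start a resync, not necessarily the
right one for your array state.

```shell
#!/bin/sh
# Print the test sequence in order; nothing is executed.
# $1 = md device name (e.g. md0), $2 = spare to remove,
# remaining arguments = member devices to assemble from.
print_test_plan() {
    md=$1; spare=$2; shift 2
    echo "mdadm /dev/$md --remove $spare"
    echo "mdadm --stop /dev/$md"
    echo "# create the COW snapshots over the faulty members here"
    echo "mdadm --assemble /dev/$md $*"
    echo "echo repair > /sys/block/$md/md/sync_action"
    echo "dmesg | grep 'read error corrected'"
}
```

For example, `print_test_plan md0 /dev/sdz /dev/mapper/cow_a /dev/sdb`
lists the steps for a hypothetical md0 with one spare /dev/sdz.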

If this technique works it would be useful for everybody, so please 
keep us informed!
Thank you
