Re: Suggestion needed for fixing RAID6

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Janos Haar" <janos.haar@netcenter.hu>
To: MRK <mrk@shiftmail.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: Suggestion needed for fixing RAID6
Date: Mon, 26 Apr 2010 14:52:57 +0200	[thread overview]
Message-ID: <7a3e01cae53f$684122c0$0400a8c0@dcccs> (raw)
In-Reply-To: 4BD569E2.7010409@shiftmail.org


----- Original Message ----- 
From: "MRK" <mrk@shiftmail.org>
To: "Janos Haar" <janos.haar@netcenter.hu>
Cc: <linux-raid@vger.kernel.org>
Sent: Monday, April 26, 2010 12:24 PM
Subject: Re: Suggestion needed for fixing RAID6


> On 04/25/2010 12:00 PM, Janos Haar wrote:
>>
>> ----- Original Message ----- From: "MRK" <mrk@shiftmail.org>
>> To: "Janos Haar" <janos.haar@netcenter.hu>
>> Cc: <linux-raid@vger.kernel.org>
>> Sent: Sunday, April 25, 2010 12:47 AM
>> Subject: Re: Suggestion needed for fixing RAID6
>>
>> Just a little note:
>>
>> The repair-sync action failed similar way too. :-(
>>
>>
>>> On 04/24/2010 09:36 PM, Janos Haar wrote:
>>>>
>>>> Ok, i am doing it.
>>>>
>>>> I think i have found some interesting, what is unexpected:
>>>> After 99.9% (and another 1800minute) the array is dropped the 
>>>> dm-snapshot structure!
>>>>
>>>> ...[CUT]...
>>>>
>>>> raid5:md3: read error not correctable (sector 2923767944 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767952 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767960 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767968 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767976 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767984 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767992 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923768000 on dm-0).
>>>>
>>>> ...[CUT]...
>>>>
>>
> 
> Remember this exact error message: "read error not correctable"
> 
>>
>>>
>>> This is strange because the write should have gone to the cow device. 
>>> Are you sure you did everything correctly with DM? Could you post 
>>> here how you created the dm-0 device?
>>
>> echo 0 $(blockdev --getsize /dev/sde4) \
>>        snapshot /dev/sde4 /dev/loop3 p 8 | \
>>        dmsetup create cow
>>
> 
> Seems correct to me...
> 
>> ]# losetup /dev/loop3
>> /dev/loop3: [0901]:55091517 (/snapshot.bin)
>>
> This line comes BEFORE the other one, right?
> 
>> /snapshot.bin is a sparse file with 2000G seeked size.
>> I have 3.6GB free space in / so the out of space is not an option. :-)
>>
>>
> [...]
>>
>>>
>>> We might ask to the DM people why it's not working maybe. Anyway 
>>> there is one good news, and it's that the read error apparently does 
>>> travel through the DM stack.
>>
>> For me, this looks like md's bug not dm's problem.
>> The "uncorrectable read error" means exactly the drive can't correct 
>> the damaged sector with ECC, and this is an unreadable sector. 
>> (pending in smart table)
>> The auto read reallocation failed not meas the sector is not 
>> re-allocatable by rewriting it!
>> The most of the drives doesn't do read-reallocation only 
>> write-reallocation.
>>
>> These drives wich does read reallocation, does it because the sector 
>> was hard to re-calculate (maybe needed more rotation, more 
>> repositioning, too much time) and moved automatically, BUT those 
>> sectors ARE NOT reported to the pc as read-error (UNC), so must NOT 
>> appear in the log...
>>
> 
> No the error message really comes from MD. Can you read C code? Go into 
> the kernel source and look this file:
> 
> linux_source_dir/drivers/md/raid5.c
> 
> (file raid5.c is also for raid6) search for "read error not correctable"
> 
> What you see there is the reason for failure. You see the line "if 
> (conf->mddev->degraded)" just above? I think your mistake was that you 
> did the DM COW trick only on the last device, or anyway one device only, 
> instead you should have done it on all 3 devices which were failing.
> 
> It did not work for you because at the moment you got the read error on 
> the last disk, two disks were already dropped from the array, the array 
> was doubly degraded, and it's not possible to correct a read error if 
> the array is degraded because you don't have enough parity information 
> to recover the data for that sector.

Oops, you are right!
It was my mistake.
Sorry, i will try it again, to support 2 drives with dm-cow.
I will try it.

Thanks again.

Janos

next prev parent reply	other threads:[~2010-04-26 12:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-22 10:09 Suggestion needed for fixing RAID6 Janos Haar
2010-04-22 15:00 ` Mikael Abrahamsson
2010-04-22 15:12   ` Janos Haar
2010-04-22 15:18     ` Mikael Abrahamsson
2010-04-22 16:25       ` Janos Haar
2010-04-22 16:32       ` Peter Rabbitson
     [not found] ` <4BD0AF2D.90207@stud.tu-ilmenau.de>
2010-04-22 20:48   ` Janos Haar
2010-04-23  6:51 ` Luca Berra
2010-04-23  8:47   ` Janos Haar
2010-04-23 12:34     ` MRK
2010-04-24 19:36       ` Janos Haar
2010-04-24 22:47         ` MRK
2010-04-25 10:00           ` Janos Haar
2010-04-26 10:24             ` MRK
2010-04-26 12:52               ` Janos Haar [this message]
2010-04-26 16:53                 ` MRK
2010-04-26 22:39                   ` Janos Haar
2010-04-26 23:06                     ` Michael Evans
     [not found]                       ` <7cfd01cae598$419e8d20$0400a8c0@dcccs>
2010-04-27  0:04                         ` Michael Evans
2010-04-27 15:50                   ` Janos Haar
2010-04-27 23:02                     ` MRK
2010-04-28  1:37                       ` Neil Brown
2010-04-28  2:02                         ` Mikael Abrahamsson
2010-04-28  2:12                           ` Neil Brown
2010-04-28  2:30                             ` Mikael Abrahamsson
2010-05-03  2:29                               ` Neil Brown
2010-04-28 12:57                         ` MRK
2010-04-28 13:32                           ` Janos Haar
2010-04-28 14:19                             ` MRK
2010-04-28 14:51                               ` Janos Haar
2010-04-29  7:55                               ` Janos Haar
2010-04-29 15:22                                 ` MRK
2010-04-29 21:07                                   ` Janos Haar
2010-04-29 23:00                                     ` MRK
2010-04-30  6:17                                       ` Janos Haar
2010-04-30 23:54                                         ` MRK
     [not found]                                         ` <4BDB6DB6.5020306@sh iftmail.org>
2010-05-01  9:37                                           ` Janos Haar
2010-05-01 17:17                                             ` MRK
2010-05-01 21:44                                               ` Janos Haar
2010-05-02 23:05                                                 ` MRK
2010-05-03  2:17                                                 ` Neil Brown
2010-05-03 10:04                                                   ` MRK
2010-05-03 10:21                                                     ` MRK
2010-05-03 21:04                                                       ` Neil Brown
2010-05-03 21:02                                                     ` Neil Brown
     [not found]                                                   ` <4BDE9FB6.80309@shiftmai! l.org>
2010-05-03 10:20                                                     ` Janos Haar
2010-05-05 15:24                                                     ` Suggestion needed for fixing RAID6 [SOLVED] Janos Haar
2010-05-05 19:27                                                       ` MRK

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='7a3e01cae53f$684122c0$0400a8c0@dcccs' \
    --to=janos.haar@netcenter.hu \
    --cc=linux-raid@vger.kernel.org \
    --cc=mrk@shiftmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).