From: "Janos Haar" <janos.haar@netcenter.hu>
To: MRK <mrk@shiftmail.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: Suggestion needed for fixing RAID6
Date: Mon, 26 Apr 2010 14:52:57 +0200
Message-ID: <7a3e01cae53f$684122c0$0400a8c0@dcccs>
In-Reply-To: 4BD569E2.7010409@shiftmail.org


----- Original Message ----- 
From: "MRK" <mrk@shiftmail.org>
To: "Janos Haar" <janos.haar@netcenter.hu>
Cc: <linux-raid@vger.kernel.org>
Sent: Monday, April 26, 2010 12:24 PM
Subject: Re: Suggestion needed for fixing RAID6


> On 04/25/2010 12:00 PM, Janos Haar wrote:
>>
>> ----- Original Message ----- From: "MRK" <mrk@shiftmail.org>
>> To: "Janos Haar" <janos.haar@netcenter.hu>
>> Cc: <linux-raid@vger.kernel.org>
>> Sent: Sunday, April 25, 2010 12:47 AM
>> Subject: Re: Suggestion needed for fixing RAID6
>>
>> Just a little note:
>>
>> The repair sync action failed in a similar way too. :-(
>>
>>
>>> On 04/24/2010 09:36 PM, Janos Haar wrote:
>>>>
>>>> Ok, I am doing it.
>>>>
>>>> I think I have found something interesting and unexpected:
>>>> After 99.9% (and another 1800 minutes) the array dropped the 
>>>> dm-snapshot structure!
>>>>
>>>> ...[CUT]...
>>>>
>>>> raid5:md3: read error not correctable (sector 2923767944 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767952 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767960 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767968 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767976 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767984 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923767992 on dm-0).
>>>> raid5:md3: read error not correctable (sector 2923768000 on dm-0).
>>>>
>>>> ...[CUT]...
>>>>
>>
> 
> Remember this exact error message: "read error not correctable"
> 
>>
>>>
>>> This is strange because the write should have gone to the cow device. 
>>> Are you sure you did everything correctly with DM? Could you post 
>>> here how you created the dm-0 device?
>>
>> echo 0 $(blockdev --getsize /dev/sde4) \
>>        snapshot /dev/sde4 /dev/loop3 p 8 | \
>>        dmsetup create cow
>>
> 
> Seems correct to me...
> 
>> ]# losetup /dev/loop3
>> /dev/loop3: [0901]:55091517 (/snapshot.bin)
>>
> This line comes BEFORE the other one, right?
> 
>> /snapshot.bin is a sparse file with a 2000G apparent size.
>> I have 3.6GB free space in /, so running out of space is not the cause. :-)
>>
>>
> [...]
>>
>>>
>>> We might ask the DM people why it's not working. Anyway, 
>>> there is one piece of good news: the read error apparently does 
>>> travel through the DM stack.
>>
>> To me, this looks like md's bug, not dm's problem.
>> The "uncorrectable read error" means exactly that the drive can't correct 
>> the damaged sector with ECC, so this is an unreadable sector 
>> (pending in the SMART table).
>> That the automatic read reallocation failed does not mean the sector 
>> cannot be reallocated by rewriting it!
>> Most drives don't do read reallocation, only 
>> write reallocation.
>>
>> The drives which do read reallocation do it because the sector 
>> was hard to recover (maybe it needed more rotations, more 
>> repositioning, too much time) and was moved automatically, BUT those 
>> sectors ARE NOT reported to the PC as read errors (UNC), so they must NOT 
>> appear in the log...
>>
> 
> No, the error message really comes from MD. Can you read C code? Go into 
> the kernel source and look at this file:
> 
> linux_source_dir/drivers/md/raid5.c
> 
> (raid5.c also handles raid6) and search for "read error not correctable".
> 
> What you see there is the reason for the failure. Do you see the line "if 
> (conf->mddev->degraded)" just above? I think your mistake was that you 
> did the DM COW trick on only one device (the last one), when 
> you should have done it on all 3 devices which were failing.
> 
> It did not work for you because at the moment you got the read error on 
> the last disk, two disks were already dropped from the array, the array 
> was doubly degraded, and it's not possible to correct a read error if 
> the array is degraded because you don't have enough parity information 
> to recover the data for that sector.

Oops, you are right!
It was my mistake.
Sorry, I will try it again, this time putting 2 drives behind dm-cow.
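
For anyone following along, here is a minimal sketch of the per-drive setup, assuming the same table format as the dmsetup command quoted above. The helper name snapshot_table is made up for illustration, and any device, loop, or file names other than /dev/sde4, /dev/loop3 and /snapshot.bin would be placeholders. The commands that touch real devices are left as comments because they need root and the actual member names.

```shell
#!/bin/sh
# Sketch only: print the device-mapper table line for one persistent snapshot.
# Arguments: origin size in 512-byte sectors, origin device, COW device.
# "p 8" means a persistent snapshot with 8-sector (4 KiB) chunks,
# the same values used in the command quoted earlier in this mail.
snapshot_table() {
    printf '0 %s snapshot %s %s p 8\n' "$1" "$2" "$3"
}

# How it would be used for one member (as root), repeated per failing drive
# with its own sparse file, loop device and dm name:
#   losetup /dev/loop3 /snapshot.bin
#   snapshot_table "$(blockdev --getsize /dev/sde4)" /dev/sde4 /dev/loop3 \
#       | dmsetup create cow
```

Keeping the table line in one helper means every drive gets the same snapshot parameters, and the array is then assembled from the resulting /dev/mapper devices instead of the raw partitions.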

Thanks again.

Janos


