* emergency call for help: raid5 fallen apart @ 2010-02-24 14:54 Stefan G. Weichinger 2010-02-24 15:05 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 14:54 UTC (permalink / raw) To: linux-raid Sorry for maybe FAQing, I am in emergency mode: customer server, RAID5 + hotspare, 4 drives ... gentoo Linux version 2.6.25-gentoo-r7 mdadm 2.6.4-r1 here - one of the 4 drives showed massive errors in dmesg, /dev/sdc SMART-errors etc. bought new drive and wanted to swap today. # cat /proc/mdstat Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] md1 : active raid1 sdb1[1] sda1[0] 104320 blocks [2/2] [UU] md3 : active raid5 sdb3[1] sda3[0] 19550976 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_] md4 : inactive sdb4[1](S) sdd4[3](S) sdc4[2](S) sda4[0](S) 583641088 blocks - I did: mdadm /dev/md3 --fail /dev/sdc3 went OK mdadm /dev/md4 --remove /dev/sdc3 OK as well, raid md3 rebuilt - With md4 I was too aggressive maybe: mdadm /dev/md4 --fail /dev/sdc4 --remove /dev/sdc4 this rendered md4 unusable, even after a reboot it can't be reassembled. This is bad, to say the least. md4 : inactive sdb4[1](S) sdd4[3](S) sdc4[2](S) sda4[0](S) 583641088 blocks What to try? This is a crucial server and I feel a lot of pressure. Rebuilding that raid would mean a lot of restore-work etc. So I would really appreciate some good advice here. THANKS! Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
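For reference, the usual sequence for swapping out one failing member of an md array is sketched below. The device names are only examples (a hypothetical /dev/sde stands in for the replacement disk), and the new disk has to be partitioned to match before it is added.

    mdadm /dev/md3 --fail /dev/sdc3     # mark the suspect member faulty
    mdadm /dev/md3 --remove /dev/sdc3   # detach it from the array
    mdadm /dev/md3 --add /dev/sde3      # add the replacement; the rebuild starts automatically
    cat /proc/mdstat                    # watch the recovery progress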
* Re: emergency call for help: raid5 fallen apart 2010-02-24 14:54 emergency call for help: raid5 fallen apart Stefan G. Weichinger @ 2010-02-24 15:05 ` Stefan G. Weichinger 2010-02-24 15:22 ` Robin Hill 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 15:05 UTC (permalink / raw) To: linux-raid Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: > What to try? > > This is a crucial server and I feel a lot of pressure. > Rebuilding that raid would mean a lot of restore-work etc. > So I would really appreciate a goo advice here. Followup: --examine shows different statii for the four partitions: server-gentoo ~ # mdadm --examine /dev/sda4 /dev/sda4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:33:37 2010 State : active Active Devices : 2 Working Devices : 3 Failed Devices : 1 Spare Devices : 1 Checksum : 3039381e - correct Events : 0.13 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 0 8 4 0 active sync /dev/sda4 0 0 8 4 0 active sync /dev/sda4 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 faulty removed 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdb4 /dev/sdb4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:37:05 2010 State : clean Active Devices : 1 Working Devices : 2 Failed Devices : 1 Spare Devices : 1 Checksum : 3039393f - correct Events : 0.32 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 1 8 20 1 active sync /dev/sdb4 0 0 0 0 0 removed 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 faulty removed 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdc4 /dev/sdc4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:33:28 2010 State : clean Active Devices : 3 Working Devices : 4 Failed Devices : 0 Spare Devices : 1 Checksum : 30393836 - correct Events : 0.10 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 2 8 36 2 active sync /dev/sdc4 0 0 8 4 0 active sync /dev/sda4 1 1 8 20 1 active sync /dev/sdb4 2 2 8 36 2 active sync /dev/sdc4 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdd4 /dev/sdd4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:37:05 2010 State : clean Active Devices : 1 Working Devices : 2 Failed Devices : 1 Spare Devices : 1 Checksum : 3039395d - correct Events : 0.32 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 3 8 52 3 spare /dev/sdd4 0 0 0 0 0 removed 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 
faulty removed 3 3 8 52 3 spare /dev/sdd4 ---- Does this info help? Thanks, Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:05 ` Stefan G. Weichinger @ 2010-02-24 15:22 ` Robin Hill 2010-02-24 15:32 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Robin Hill @ 2010-02-24 15:22 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 950 bytes --] On Wed Feb 24, 2010 at 04:05:36PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: > > > What to try? > > > > This is a crucial server and I feel a lot of pressure. > > Rebuilding that raid would mean a lot of restore-work etc. > > So I would really appreciate a goo advice here. > > Followup: > > --examine shows different statii for the four partitions: > Hmm, that looks like sda4 dropped out after sdc4 was removed, failing the array. Can you force assemble the array? mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4 If that works, you'll want to re-add the hot spare so it rebuilds. You'll also need to fsck the filesystem afterwards. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
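Pulling Robin's suggestion together with the steps that follow later in the thread, the whole recovery pass would look roughly like the sketch below. It assumes sda4 and sdb4 really are the members with current data, that the inactive, half-assembled array is stopped first, and that sdd4 is the old hot spare; the filesystem check at the end depends on what sits on top of md4 (XFS on LVM, per later messages, so xfs_repair on the logical volumes rather than a plain fsck).

    mdadm --stop /dev/md4                      # clear the inactive, half-assembled state
    mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4   # force-assemble from the two current members
    mdadm /dev/md4 --add /dev/sdd4             # put the hot spare back so the rebuild can start
    cat /proc/mdstat                           # confirm the array is up and recovering
    # then check the filesystem(s) on top before trusting the data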
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:22 ` Robin Hill @ 2010-02-24 15:32 ` Stefan G. Weichinger 2010-02-24 16:38 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 15:32 UTC (permalink / raw) To: linux-raid Am 24.02.2010 16:22, schrieb Robin Hill: > On Wed Feb 24, 2010 at 04:05:36PM +0100, Stefan G. Weichinger wrote: > >> Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: >> >>> What to try? >>> >>> This is a crucial server and I feel a lot of pressure. >>> Rebuilding that raid would mean a lot of restore-work etc. >>> So I would really appreciate a goo advice here. >> >> Followup: >> >> --examine shows different statii for the four partitions: >> > Hmm, that looks like sda4 dropped out after sdc4 was removed, failing > the array. Can you force assemble the array? > mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4 > > If that works, you'll want to re-add the hot spare so it rebuilds. > You'll also need to fsck the filesystem afterwards. I thank you a lot for this piece of help. I always hesitate to TRY things in such a situation as I once back then dropped a RAID by doing the wrong thing. The md4 is UP again on 2 spindles, 3rd re-added right now. Looks promising. THANKS, I owe you something. I report back later with more details ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:32 ` Stefan G. Weichinger @ 2010-02-24 16:38 ` Stefan G. Weichinger 2010-02-24 16:53 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 16:38 UTC (permalink / raw) To: linux-raid Am 24.02.2010 16:32, schrieb Stefan G. Weichinger: > I report back later with more details ... sda4 drops out repeatedly ... Swapped physical sdc already ... adding sdc4 leads to failing md4 again after starting the rebuild. I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data to a plain new xfs-partition on sdc4 ... just to get current data out of the way. oh my ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:38 ` Stefan G. Weichinger @ 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-24 17:09 ` Robin Hill 0 siblings, 2 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 16:53 UTC (permalink / raw) To: linux-raid Am 24.02.2010 17:38, schrieb Stefan G. Weichinger: > I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data > to a plain new xfs-partition on sdc4 ... just to get current data out of > the way. Status now, after another reboot because of a failing md4: why degraded? How to get out of that and re-add sdc4 or sdd4 ? What about that device 2 down there?? server-gentoo ~ # mdadm -D /dev/md4 /dev/md4: Version : 00.90.03 Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Array Size : 291820544 (278.30 GiB 298.82 GB) Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Raid Devices : 3 Total Devices : 2 Preferred Minor : 4 Persistence : Superblock is persistent Update Time : Wed Feb 24 17:41:15 2010 State : clean, degraded Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 64K UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Events : 0.198 Number Major Minor RaidDevice State 0 8 4 0 active sync /dev/sda4 1 8 20 1 active sync /dev/sdb4 2 0 0 2 removed ^ permalink raw reply [flat|nested] 21+ messages in thread
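When a degraded state like this is surprising, the quickest way to see how md ranks the members is to compare the per-device event counters and update times in the superblocks; a small sketch, assuming the 0.90 superblocks and device names used in this thread:

    for d in /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdd4; do
        echo "== $d"
        mdadm --examine "$d" | grep -E 'Update Time|State :|Events'
    done

Members whose event counts lag behind are the ones mdadm will refuse to include without --force.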
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:53 ` Stefan G. Weichinger @ 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-25 8:05 ` Giovanni Tessore 2010-02-24 17:09 ` Robin Hill 1 sibling, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:02 UTC (permalink / raw) To: linux-raid sda fails also: Feb 24 17:57:42 server-gentoo ata1.00: configured for UDMA/133 Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor] Feb 24 17:57:42 server-gentoo Descriptor sense data with sense descriptors (in hex): Feb 24 17:57:42 server-gentoo 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 Feb 24 17:57:42 server-gentoo 01 3c ba 1a Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4 Feb 24 17:57:42 server-gentoo end_request: I/O error, dev sda, sector 20757018 Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1032 on sda4). Feb 24 17:57:42 server-gentoo raid5: Disk failure on sda4, disabling device. Operation continuing on 1 devices Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1040 on sda4). Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1048 on sda4). (sector 1072 on sda4). So I am down to one drive, from 3 ... :-( Does it make sense to repeat: mdadm --assemble xfs_repair mount and rsync stuff aside until it fails again? I once was lucky with such a strategy ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:02 ` Stefan G. Weichinger @ 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 0 siblings, 2 replies; 21+ messages in thread From: Giovanni Tessore @ 2010-02-25 8:05 UTC (permalink / raw) To: linux-raid Stefan G. Weichinger wrote: > Feb 24 17:57:42 server-gentoo end_request: I/O error, dev sda, sector > 20757018 > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1032 on sda4). > Feb 24 17:57:42 server-gentoo raid5: Disk failure on sda4, disabling > device. Operation continuing on 1 devices > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1040 on sda4). > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1048 on sda4). > > > Does it make sense to repeat: > > mdadm --assemble > xfs_repair > mount > > and rsync stuff aside until it fails again? > > I once was lucky with such a strategy ... > I recently had a similar problem with a 6 disk array, when one died and another gave read errors during reconstruction (see older posts from the end of January). I was able to recover most of the data by reassembling the array and copying data from it to other storage, repeating the assembly each time a read error was encountered; so the 'strategy' mostly worked for me (recovered almost everything). It may help to set the md device to readonly mode and to mount the partition readonly. I hope you can recover your data. Regards PS. I see this is the 4th time in a month that people have reported problems on raid5 due to read errors during reconstruction; it looks like the 'corrected read errors' policy is quite a real concern. -- Cordiali saluti. Yours faithfully. Giovanni Tessore ^ permalink raw reply [flat|nested] 21+ messages in thread
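A minimal sketch of that read-only salvage pattern, using the device names from this thread. Note that md4 here is actually an LVM physical volume (per later messages), so the read-only mounts would be of the logical volumes on top of it, and the md-level readonly flag may have to be skipped if LVM needs to update its metadata; mount points are placeholders.

    mdadm --assemble --force /dev/md4 /dev/sda4 /dev/sdb4   # bring the surviving members up
    mdadm --readonly /dev/md4                               # ask md to refuse writes during the salvage
    mount -o ro /dev/md4 /mnt/salvage                       # read-only mount (per logical volume in this setup)
    rsync -a /mnt/salvage/ /mnt/elsewhere/                  # copy out whatever still reads cleanly
    # if a read error kicks a member again: stop the array, re-assemble, continue the copy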
* Re: emergency call for help: raid5 fallen apart 2010-02-25 8:05 ` Giovanni Tessore @ 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 1 sibling, 0 replies; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-25 16:27 UTC (permalink / raw) To: linux-raid Giovanni Tessore schrieb: > [...] > PS. > I see this is the 4th time in a month that poeple reports problem on > raid5 due to the read errors during reconstruction; it looks like the > 'corrected read errors' policy is quite a real concern. > It surely is. There are many more than 4 cases in a month, one could earn a living from it - thank you very much harddisk producers. What we need: setting Error Recovery Control timeouts upon assembly of RAIDs on ATA-disks. Then the normal disks behave like the raid-edition disks from some vendors (i.e. Hitachi even documents on it for the 7K2000 Deskstar drives - well, only between the lines). But this feature is pretty hard to implement - you'll have to guess and test if a disk understands ATA. All the best, Stefan Hübner -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner @ 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread From: John Robinson @ 2010-02-25 16:45 UTC (permalink / raw) To: Giovanni Tessore; +Cc: linux-raid On 25/02/2010 08:05, Giovanni Tessore wrote: [...] > I see this is the 4th time in a month that poeple reports problem on > raid5 due to the read errors during reconstruction; it looks like the > 'corrected read errors' policy is quite a real concern. If you mean md's policy of reconstructing from the other discs and rewriting when there's a read error from one disc of an array, rather than immediately kicking the disc that had a read error, I think you're wrong - I think md is saving lots of users from hitting problems, by keeping their arrays up and running, and giving their discs a chance to remap bad sectors, instead of forcing the user to do full-disc reconstructions more often which will make them more likely to hit read errors during recovery. I do think we urgently need the hot reconstruction/recovery feature, so failing drives can be recovered to fresh drives with two sources of data, i.e. both the failing drive and the remaining drives in the array, giving us two chances of recovering every sector. Cheers, John. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson @ 2010-02-25 17:41 ` Dawning Sky 2010-02-25 18:31 ` John Robinson 0 siblings, 1 reply; 21+ messages in thread From: Dawning Sky @ 2010-02-25 17:41 UTC (permalink / raw) To: linux-raid On Thu, Feb 25, 2010 at 8:45 AM, John Robinson <john.robinson@anonymous.org.uk> wrote: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> > I do think we urgently need the hot reconstruction/recovery feature, so > failing drives can be recovered to fresh drives with two sources of data, > i.e. both the failing drive and the remaining drives in the array, giving us > two chances of recovering every sector. I was one of those 4 cases in the past month. I would have certainly benefited from this when I tried to replace a failing drive on my old raid-5. But I think the redundancy you want can actually be achieved by running a raid-6 in degraded mode (with 1 missing drive). Am I missing something? If this is the case, shouldn't we all be doing this instead of using raid-5? > > Cheers, > > John. DS ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 17:41 ` Dawning Sky @ 2010-02-25 18:31 ` John Robinson 2010-02-26 2:42 ` Michael Evans 0 siblings, 1 reply; 21+ messages in thread From: John Robinson @ 2010-02-25 18:31 UTC (permalink / raw) To: Linux RAID On 25/02/2010 17:41, Dawning Sky wrote: > On Thu, Feb 25, 2010 at 8:45 AM, John Robinson > <john.robinson@anonymous.org.uk> wrote: >> On 25/02/2010 08:05, Giovanni Tessore wrote: >> [...] >> I do think we urgently need the hot reconstruction/recovery feature, so >> failing drives can be recovered to fresh drives with two sources of data, >> i.e. both the failing drive and the remaining drives in the array, giving us >> two chances of recovering every sector. > > I was one of those 4 cases in the part month. I would have certainly > benefited from this when I tried to replace a failing drive on my old > raid-5. But I think actually the redundancy you desired can be > achieved by running a raid-6 at the degraded mode (with 1 missing > drive). > > Do I miss something? If this is the case, shouldn't we all > be doing this instead of using the raid-5? I think you must be missing something, yes. RAID-6 with one drive missing would have 2 chances of recovering each sector, but then so does RAID-5 with no drives missing. In either case, lose a drive and you need every sector on the remaining drives to be good to complete the reconstruction and keep the array up. Cheers, John. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 18:31 ` John Robinson @ 2010-02-26 2:42 ` Michael Evans 0 siblings, 0 replies; 21+ messages in thread From: Michael Evans @ 2010-02-26 2:42 UTC (permalink / raw) To: John Robinson; +Cc: Linux RAID On Thu, Feb 25, 2010 at 10:31 AM, John Robinson <john.robinson@anonymous.org.uk> wrote: > On 25/02/2010 17:41, Dawning Sky wrote: >> >> On Thu, Feb 25, 2010 at 8:45 AM, John Robinson >> <john.robinson@anonymous.org.uk> wrote: >>> >>> On 25/02/2010 08:05, Giovanni Tessore wrote: >>> [...] >>> I do think we urgently need the hot reconstruction/recovery feature, so >>> failing drives can be recovered to fresh drives with two sources of data, >>> i.e. both the failing drive and the remaining drives in the array, giving >>> us >>> two chances of recovering every sector. >> >> I was one of those 4 cases in the part month. I would have certainly >> benefited from this when I tried to replace a failing drive on my old >> raid-5. But I think actually the redundancy you desired can be >> achieved by running a raid-6 at the degraded mode (with 1 missing >> drive). >> >> Do I miss something? If this is the case, shouldn't we all >> be doing this instead of using the raid-5? > > I think you must be missing something, yes. RAID-6 with one drive missing > would have 2 chances of recovering each sector, but then so does RAID-5 with > no drives missing. In either case, lose a drive and you need every sector on > the remaining drives to be good to complete the reconstruction and keep the > array up. > > Cheers, > > John. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > No, what they're saying is that often drives don't /totally/ fail. They have segments that go bad first, and we are often catching them in that state. To use the segments that /are/ successfully returned there is a good chance that multiple 'not full member' drive could provide a complete, or usefully very near complete with known 'dead' areas set to store on fresh devices. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky @ 2010-02-26 20:15 ` Bill Davidsen 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2 siblings, 0 replies; 21+ messages in thread From: Bill Davidsen @ 2010-02-26 20:15 UTC (permalink / raw) To: John Robinson; +Cc: Giovanni Tessore, linux-raid John Robinson wrote: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> I see this is the 4th time in a month that poeple reports problem on >> raid5 due to the read errors during reconstruction; it looks like the >> 'corrected read errors' policy is quite a real concern. > > If you mean md's policy of reconstructing from the other discs and > rewriting when there's a read error from one disc of an array, rather > than immediately kicking the disc that had a read error, I think > you're wrong - I think md is saving lots of users from hitting > problems, by keeping their arrays up and running, and giving their > discs a chance to remap bad sectors, instead of forcing the user to do > full-disc reconstructions more often which will make them more likely > to hit read errors during recovery. > > I do think we urgently need the hot reconstruction/recovery feature, > so failing drives can be recovered to fresh drives with two sources of > data, i.e. both the failing drive and the remaining drives in the > array, giving us two chances of recovering every sector. Ideally, there would be a way to avoid kicking any failing drive, or even trying to rewrite the unreadable sector. Some md utility which would clone a drive using logic similar to this: - start with array assembled but not started - read a sector from the source drive, reconstruct it if the source read fails, report errors and keep going - write any recovered sector to the destination - optionally read it back to be sure it worked, rewrite and note errors; to be useful it must flush to the platter and reread. Yes, it will be slow. Don't try to be smart, try to make a usable copy of a drive! I think in case a sector can't be recovered a fixed pattern should be written to the destination, for ease of identification if nothing else. I think being able to specify MBR or a partition would be useful, that would let critical things be saved faster and with less work. This also opens up possibilities for migration of several kinds. This really should be a command in mdadm! Why? Because it is vital that changes in how mdadm does things are tracked in this tool. Because when you are down to trying this you don't want to be looking for matching versions, etc. -- Bill Davidsen <davidsen@tmr.com> "We can't solve today's problems by using the same thinking we used in creating them." - Einstein ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky 2010-02-26 20:15 ` Bill Davidsen @ 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 2 siblings, 1 reply; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-28 11:50 UTC (permalink / raw) To: linux-raid Hi John, John Robinson schrieb: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> I see this is the 4th time in a month that poeple reports problem on >> raid5 due to the read errors during reconstruction; it looks like the >> 'corrected read errors' policy is quite a real concern. > > If you mean md's policy of reconstructing from the other discs and > rewriting when there's a read error from one disc of an array, rather > than immediately kicking the disc that had a read error, I think you're > wrong - I think md is saving lots of users from hitting problems, by > keeping their arrays up and running, and giving their discs a chance to > remap bad sectors, instead of forcing the user to do full-disc > reconstructions more often which will make them more likely to hit read > errors during recovery. I think you misunderstood me. I recently was told what you wrote in the last paragraph. I know it is good, as that is the most intelligently possible behaviour of md. BUT: if the drive takes let's say 2 min for internal error recovery to succeed of fail (whichever, doesn't matter), then the SG EH layer of the kernel will drop the disk, not md. This forces md to drop the disk, also. The conclusion is: a technology is needed to prevent another kernel level from dropping the disk. This technology exists, it's called SCT-ERC (Smart Control Transport - Error Recovery Control). It's the same as WD's TLER or Samsung's CCTL. But it is non-volatile. After a power on reset the timeout-values are reset to factory defaults. So it needs to be set right before adding a disk to an array. (for more info: check www.t13.org, find the ATA8-ACS documents) > > I do think we urgently need the hot reconstruction/recovery feature, so > failing drives can be recovered to fresh drives with two sources of > data, i.e. both the failing drive and the remaining drives in the array, > giving us two chances of recovering every sector. I do not think this is easily possible. One would have to keep a map about the "in sync" sectors of an array member and the "failed" sectors. My guess is: this would need a partial redesign (again a new superblock type containing information about "failed segments" probably). Please correct me if I'm wrong and that is already included in 1.X (I'm mostly working on 0.90 Superblocks). > > Cheers, > > John. > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Cheers, Stefan. ^ permalink raw reply [flat|nested] 21+ messages in thread
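The kernel-side half of that timeout mismatch is also tunable: the SCSI layer's per-device command timeout is exposed in sysfs (in seconds, usually 30 by default), so on drives with no working ERC it can be raised high enough that the kernel outlasts the drive's internal retries instead of dropping it. A sketch only; whether this is preferable to shortening the drive's own recovery time is debatable, and the value shown is just an example.

    cat /sys/block/sda/device/timeout          # current SCSI command timeout, in seconds
    echo 180 > /sys/block/sda/device/timeout   # give the drive up to 3 minutes before error handling kicks in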
* Re: emergency call for help: raid5 fallen apart 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner @ 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 0 siblings, 0 replies; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-28 12:52 UTC (permalink / raw) To: linux-raid Sorry @all, I had a few typos: Stefan /*St0fF*/ Hübner schrieb: > [...] > BUT: if the drive takes let's say 2 min for internal error recovery to > succeed of fail (whichever, doesn't matter), then the SG EH layer of the -> succeed OR fail > kernel will drop the disk, not md. This forces md to drop the disk, > also. The conclusion is: a technology is needed to prevent another > kernel level from dropping the disk. This technology exists, it's > called SCT-ERC (Smart Control Transport - Error Recovery Control). It's > the same as WD's TLER or Samsung's CCTL. But it is non-volatile. After -> But it is volatile. > a power on reset the timeout-values are reset to factory defaults. So > it needs to be set right before adding a disk to an array. > (for more info: check www.t13.org, find the ATA8-ACS documents) >> I do think we urgently need the hot reconstruction/recovery feature, so >> failing drives can be recovered to fresh drives with two sources of >> data, i.e. both the failing drive and the remaining drives in the array, >> giving us two chances of recovering every sector. > > I do not think this is easily possible. One would have to keep a map > about the "in sync" sectors of an array member and the "failed" sectors. > My guess is: this would need a partial redesign (again a new superblock > type containing information about "failed segments" probably). > Please correct me if I'm wrong and that is already included in 1.X (I'm > mostly working on 0.90 Superblocks). >> Cheers, >> >> John. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > Cheers, > Stefan. > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
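For drives that do implement SCT ERC, the timeouts can be inspected and set from userspace with smartmontools, assuming a smartctl new enough to know the scterc option. The values are in units of 100 ms, and, as corrected above, the setting is volatile, so it has to be reapplied for every array member after each power cycle, e.g. from a local boot script.

    smartctl -l scterc /dev/sda         # show the current read/write ERC timers, if supported
    smartctl -l scterc,70,70 /dev/sda   # limit internal error recovery to 7.0 seconds for reads and writes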
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger @ 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger 2010-02-24 17:35 ` Stefan G. Weichinger 1 sibling, 2 replies; 21+ messages in thread From: Robin Hill @ 2010-02-24 17:09 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 2420 bytes --] On Wed Feb 24, 2010 at 05:53:27PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 17:38, schrieb Stefan G. Weichinger: > > > I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data > > to a plain new xfs-partition on sdc4 ... just to get current data out of > > the way. > > > Status now, after another reboot because of a failing md4: > > why degraded? How to get out of that and re-add sdc4 or sdd4 ? > What about that device 2 down there?? > > > server-gentoo ~ # mdadm -D /dev/md4 > /dev/md4: > Version : 00.90.03 > Creation Time : Tue Aug 5 14:14:16 2008 > Raid Level : raid5 > Array Size : 291820544 (278.30 GiB 298.82 GB) > Used Dev Size : 145910272 (139.15 GiB 149.41 GB) > Raid Devices : 3 > Total Devices : 2 > Preferred Minor : 4 > Persistence : Superblock is persistent > > Update Time : Wed Feb 24 17:41:15 2010 > State : clean, degraded > Active Devices : 2 > Working Devices : 2 > Failed Devices : 0 > Spare Devices : 0 > > Layout : left-symmetric > Chunk Size : 64K > > UUID : d4b0e9c1:067357ce:2569337e:e9af8bed > Events : 0.198 > > Number Major Minor RaidDevice State > 0 8 4 0 active sync /dev/sda4 > 1 8 20 1 active sync /dev/sdb4 > 2 0 0 2 removed > It's degraded because you only have 2 disks in the array, presumably the event count on the other disks doesn't match up. If you've replaced sdc and sdd never got rebuilt onto, then you only have the two disks available for the array anyway. If these are the only disks with up-to-date data, and sda4 is still failing, I can only suggest stopping the array and using dd/dd_rescue to copy sda4 onto a working disk. You should then be able to reassemble the array with sdb4 and the new disk, then add in a hot spare to recover. Alternately, bite the bullet, recreate the array and restore. Either way, it looks like you ought to be running regular checks on the array to try to pick up/fix these background failures. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
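The "regular checks" Robin mentions are md's built-in scrub, driven through sysfs; a sketch, assuming the md sysfs interface available in this kernel (many distributions ship a cron job that triggers this monthly):

    echo check > /sys/block/md4/md/sync_action   # read every sector of every member; unreadable blocks are rewritten from redundancy
    cat /proc/mdstat                             # the scrub shows up like a resync, with a progress bar
    cat /sys/block/md4/md/mismatch_cnt           # non-zero afterwards means parity/data mismatches were seen (fix with 'repair')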
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:09 ` Robin Hill @ 2010-02-24 17:28 ` Stefan G. Weichinger 0 siblings, 0 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:28 UTC (permalink / raw) To: linux-raid Am 24.02.2010 18:09, schrieb Robin Hill: > It's degraded because you only have 2 disks in the array, presumably the > event count on the other disks doesn't match up. If you've replaced sdc > and sdd never got rebuilt onto, then you only have the two disks > available for the array anyway. Yep. > If these are the only disks with up-to-date data, and sda4 is still > failing, I can only suggest stopping the array and using dd/dd_rescue to > copy sda4 onto a working disk. You should then be able to reassemble > the array with sdb4 and the new disk, then add in a hot spare to > recover. OK, that's plan B. For now I try to get data aside. md4 is a PV in an LVM-VG ... the main data-LV seems to trigger the errors, but another LV seems more stable (other sectors or something). This other LV contains rsnapshots of the main data-LV ... so if I am lucky I only lose about 2hrs of work if I get the latest snapshot copied. rsync is down to character "s" already ........ For sure there's a third LV as well, containing VMware-VMs ... oh my. Let's pray this one is OK as well, at least while copying stuff. > Alternately, bite the bullet, recreate the array and restore. hmm > Either way, it looks like you ought to be running regular checks on the > array to try to pick up/fix these background failures. smartd led me to the failing sdc ... no note of sda though ... A bad taste after all. Thanks anyway, Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger @ 2010-02-24 17:35 ` Stefan G. Weichinger 2010-02-24 18:12 ` Robin Hill 1 sibling, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:35 UTC (permalink / raw) To: linux-raid Am 24.02.2010 18:09, schrieb Robin Hill: > If these are the only disks with up-to-date data, and sda4 is still > failing, I can only suggest stopping the array and using dd/dd_rescue to > copy sda4 onto a working disk. You should then be able to reassemble > the array with sdb4 and the new disk, then add in a hot spare to > recover. Currently sdd isn't in use at all. So I could mdadm --stop /dev/md4 ddrescue /dev/sda4 /dev/sdd4 mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 ?? Sorry for my explicit questions, I am rather stressed here ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:35 ` Stefan G. Weichinger @ 2010-02-24 18:12 ` Robin Hill 2010-02-24 19:54 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Robin Hill @ 2010-02-24 18:12 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 1205 bytes --] On Wed Feb 24, 2010 at 06:35:46PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 18:09, schrieb Robin Hill: > > > If these are the only disks with up-to-date data, and sda4 is still > > failing, I can only suggest stopping the array and using dd/dd_rescue to > > copy sda4 onto a working disk. You should then be able to reassemble > > the array with sdb4 and the new disk, then add in a hot spare to > > recover. > > Currently sdd isn't in use at all. > > So I could > > > mdadm --stop /dev/md4 > > ddrescue /dev/sda4 /dev/sdd4 > > mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 > > ?? > Yes, that'd be what I'd recommend - the ddrescue will only need to make a single pass across sda4 (except for failed blocks of course), so will have the lowest risk of exacerbating the disk problems. Of course, the practicality of doing the assemble will depend on the number of unreadable blocks found by ddrescue. Good luck! Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
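One refinement to that plan: GNU ddrescue (as opposed to the older dd_rescue, which takes different options) keeps a logfile of what has been copied, so the pass can be interrupted and resumed and retries can be restricted to the bad areas. A sketch with an example logfile path; since the 0.90 superblock sits near the end of the partition, a full partition copy carries it along as long as sdd4 is at least as large as sda4.

    mdadm --stop /dev/md4
    ddrescue -f -n /dev/sda4 /dev/sdd4 /root/sda4.log    # fast first pass, skipping the failing areas
    ddrescue -f -r3 /dev/sda4 /dev/sdd4 /root/sda4.log   # retry only the remaining bad areas a few times
    mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4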
* Re: emergency call for help: raid5 fallen apart 2010-02-24 18:12 ` Robin Hill @ 2010-02-24 19:54 ` Stefan G. Weichinger 0 siblings, 0 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 19:54 UTC (permalink / raw) To: linux-raid Am 24.02.2010 19:12, schrieb Robin Hill: >> mdadm --stop /dev/md4 >> >> ddrescue /dev/sda4 /dev/sdd4 >> >> mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 >> >> ?? >> > Yes, that'd be what I'd recommend - the ddrescue will only need to make > a single pass across sda4 (except for failed blocks of course), so will > have the lowest risk of exacerbating the disk problems. Of course, the > practicality of doing the assemble will depend on the number of > unreadable blocks found by ddrescue. I decided to somehow roll back. As far as we see we lose 1.5 hrs of work done by some people ... thanks to the rsnapshots ... The LV containing the VM was/is healthy, I was able to copy the vm-directory fine and the VM boots and runs. No work lost here. So far I don't use that flaky md4 for now ... doing the ddrescue would take quite some time and this box has to be UP tomorrow. And additionally I wouldn't know about the result. I'll decide how to proceed tomorrow. For now the data and the function for tomorrow comes first, even without full RAID now. So I restore stuff now ... I am somehow tired and exhausted now and not willing to risk what I have got now. > Good luck! > Robin Thanks a lot ... Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread -- Thread overview: 21+ messages: 2010-02-24 14:54 emergency call for help: raid5 fallen apart Stefan G. Weichinger 2010-02-24 15:05 ` Stefan G. Weichinger 2010-02-24 15:22 ` Robin Hill 2010-02-24 15:32 ` Stefan G. Weichinger 2010-02-24 16:38 ` Stefan G. Weichinger 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky 2010-02-25 18:31 ` John Robinson 2010-02-26 2:42 ` Michael Evans 2010-02-26 20:15 ` Bill Davidsen 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger 2010-02-24 17:35 ` Stefan G. Weichinger 2010-02-24 18:12 ` Robin Hill 2010-02-24 19:54 ` Stefan G. Weichinger