* emergency call for help: raid5 fallen apart @ 2010-02-24 14:54 Stefan G. Weichinger 2010-02-24 15:05 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 14:54 UTC (permalink / raw) To: linux-raid Sorry for maybe FAQing, I am in emergency mode: customer server, RAID5 + hotspare, 4 drives ... gentoo Linux version 2.6.25-gentoo-r7 mdadm 2.6.4-r1 here - one of the 4 drives showed massive errors in dmesg, /dev/sdc SMART-errors etc. bought new drive and wanted to swap today. # cat /proc/mdstat Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] md1 : active raid1 sdb1[1] sda1[0] 104320 blocks [2/2] [UU] md3 : active raid5 sdb3[1] sda3[0] 19550976 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_] md4 : inactive sdb4[1](S) sdd4[3](S) sdc4[2](S) sda4[0](S) 583641088 blocks - I did: mdadm /dev/md3 --fail /dev/sdc3 went OK mdadm /dev/md4 --remove /dev/sdc3 OK as well, raid md3 rebuilt - With md4 I was too aggressive maybe: mdadm /dev/md4 --fail /dev/sdc4 --remove /dev/sdc4 this rendered md4 unusable, even after a reboot it can't be reassembled. This is bad, to say the least. md4 : inactive sdb4[1](S) sdd4[3](S) sdc4[2](S) sda4[0](S) 583641088 blocks What to try? This is a crucial server and I feel a lot of pressure. Rebuilding that raid would mean a lot of restore-work etc. So I would really appreciate some good advice here. THANKS! Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
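For reference, the usual sequence for swapping out one failing member of an md array is sketched below. The device names are only examples (a hypothetical /dev/sde stands in for the replacement disk), and the new disk has to be partitioned to match before it is added.

    mdadm /dev/md3 --fail /dev/sdc3     # mark the suspect member faulty
    mdadm /dev/md3 --remove /dev/sdc3   # detach it from the array
    mdadm /dev/md3 --add /dev/sde3      # add the replacement; the rebuild starts automatically
    cat /proc/mdstat                    # watch the recovery progress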
* Re: emergency call for help: raid5 fallen apart 2010-02-24 14:54 emergency call for help: raid5 fallen apart Stefan G. Weichinger @ 2010-02-24 15:05 ` Stefan G. Weichinger 2010-02-24 15:22 ` Robin Hill 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 15:05 UTC (permalink / raw) To: linux-raid Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: > What to try? > > This is a crucial server and I feel a lot of pressure. > Rebuilding that raid would mean a lot of restore-work etc. > So I would really appreciate a goo advice here. Followup: --examine shows different statii for the four partitions: server-gentoo ~ # mdadm --examine /dev/sda4 /dev/sda4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:33:37 2010 State : active Active Devices : 2 Working Devices : 3 Failed Devices : 1 Spare Devices : 1 Checksum : 3039381e - correct Events : 0.13 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 0 8 4 0 active sync /dev/sda4 0 0 8 4 0 active sync /dev/sda4 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 faulty removed 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdb4 /dev/sdb4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:37:05 2010 State : clean Active Devices : 1 Working Devices : 2 Failed Devices : 1 Spare Devices : 1 Checksum : 3039393f - correct Events : 0.32 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 1 8 20 1 active sync /dev/sdb4 0 0 0 0 0 removed 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 faulty removed 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdc4 /dev/sdc4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:33:28 2010 State : clean Active Devices : 3 Working Devices : 4 Failed Devices : 0 Spare Devices : 1 Checksum : 30393836 - correct Events : 0.10 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 2 8 36 2 active sync /dev/sdc4 0 0 8 4 0 active sync /dev/sda4 1 1 8 20 1 active sync /dev/sdb4 2 2 8 36 2 active sync /dev/sdc4 3 3 8 52 3 spare /dev/sdd4 server-gentoo ~ # mdadm --examine /dev/sdd4 /dev/sdd4: Magic : a92b4efc Version : 00.90.00 UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Array Size : 291820544 (278.30 GiB 298.82 GB) Raid Devices : 3 Total Devices : 4 Preferred Minor : 4 Update Time : Wed Feb 24 15:37:05 2010 State : clean Active Devices : 1 Working Devices : 2 Failed Devices : 1 Spare Devices : 1 Checksum : 3039395d - correct Events : 0.32 Layout : left-symmetric Chunk Size : 64K Number Major Minor RaidDevice State this 3 8 52 3 spare /dev/sdd4 0 0 0 0 0 removed 1 1 8 20 1 active sync /dev/sdb4 2 2 0 0 2 
faulty removed 3 3 8 52 3 spare /dev/sdd4 ---- Does this info help? Thanks, Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:05 ` Stefan G. Weichinger @ 2010-02-24 15:22 ` Robin Hill 2010-02-24 15:32 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Robin Hill @ 2010-02-24 15:22 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 950 bytes --] On Wed Feb 24, 2010 at 04:05:36PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: > > > What to try? > > > > This is a crucial server and I feel a lot of pressure. > > Rebuilding that raid would mean a lot of restore-work etc. > > So I would really appreciate a goo advice here. > > Followup: > > --examine shows different statii for the four partitions: > Hmm, that looks like sda4 dropped out after sdc4 was removed, failing the array. Can you force assemble the array? mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4 If that works, you'll want to re-add the hot spare so it rebuilds. You'll also need to fsck the filesystem afterwards. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
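Pulling Robin's suggestion together with the steps that follow later in the thread, the whole recovery pass would look roughly like the sketch below. It assumes sda4 and sdb4 really are the members with current data, that the inactive, half-assembled array is stopped first, and that sdd4 is the old hot spare; the filesystem check at the end depends on what sits on top of md4 (XFS on LVM, per later messages, so xfs_repair on the logical volumes rather than a plain fsck).

    mdadm --stop /dev/md4                      # clear the inactive, half-assembled state
    mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4   # force-assemble from the two current members
    mdadm /dev/md4 --add /dev/sdd4             # put the hot spare back so the rebuild can start
    cat /proc/mdstat                           # confirm the array is up and recovering
    # then check the filesystem(s) on top before trusting the data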
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:22 ` Robin Hill @ 2010-02-24 15:32 ` Stefan G. Weichinger 2010-02-24 16:38 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 15:32 UTC (permalink / raw) To: linux-raid Am 24.02.2010 16:22, schrieb Robin Hill: > On Wed Feb 24, 2010 at 04:05:36PM +0100, Stefan G. Weichinger wrote: > >> Am 24.02.2010 15:54, schrieb Stefan G. Weichinger: >> >>> What to try? >>> >>> This is a crucial server and I feel a lot of pressure. >>> Rebuilding that raid would mean a lot of restore-work etc. >>> So I would really appreciate a goo advice here. >> >> Followup: >> >> --examine shows different statii for the four partitions: >> > Hmm, that looks like sda4 dropped out after sdc4 was removed, failing > the array. Can you force assemble the array? > mdadm -A /dev/md4 -f /dev/sda4 /dev/sdb4 > > If that works, you'll want to re-add the hot spare so it rebuilds. > You'll also need to fsck the filesystem afterwards. I thank you a lot for this piece of help. I always hesitate to TRY things in such a situation as I once back then dropped a RAID by doing the wrong thing. The md4 is UP again on 2 spindles, 3rd re-added right now. Looks promising. THANKS, I owe you something. I report back later with more details ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 15:32 ` Stefan G. Weichinger @ 2010-02-24 16:38 ` Stefan G. Weichinger 2010-02-24 16:53 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 16:38 UTC (permalink / raw) To: linux-raid Am 24.02.2010 16:32, schrieb Stefan G. Weichinger: > I report back later with more details ... sda4 drops out repeatedly ... Swapped physical sdc already ... adding sdc4 leads to failing md4 again after starting the rebuild. I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data to a plain new xfs-partition on sdc4 ... just to get current data out of the way. oh my ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:38 ` Stefan G. Weichinger @ 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-24 17:09 ` Robin Hill 0 siblings, 2 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 16:53 UTC (permalink / raw) To: linux-raid Am 24.02.2010 17:38, schrieb Stefan G. Weichinger: > I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data > to a plain new xfs-partition on sdc4 ... just to get current data out of > the way. Status now, after another reboot because of a failing md4: why degraded? How to get out of that and re-add sdc4 or sdd4 ? What about that device 2 down there?? server-gentoo ~ # mdadm -D /dev/md4 /dev/md4: Version : 00.90.03 Creation Time : Tue Aug 5 14:14:16 2008 Raid Level : raid5 Array Size : 291820544 (278.30 GiB 298.82 GB) Used Dev Size : 145910272 (139.15 GiB 149.41 GB) Raid Devices : 3 Total Devices : 2 Preferred Minor : 4 Persistence : Superblock is persistent Update Time : Wed Feb 24 17:41:15 2010 State : clean, degraded Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 Layout : left-symmetric Chunk Size : 64K UUID : d4b0e9c1:067357ce:2569337e:e9af8bed Events : 0.198 Number Major Minor RaidDevice State 0 8 4 0 active sync /dev/sda4 1 8 20 1 active sync /dev/sdb4 2 0 0 2 removed ^ permalink raw reply [flat|nested] 21+ messages in thread
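When a degraded state like this is surprising, the quickest way to see how md ranks the members is to compare the per-device event counters and update times in the superblocks; a small sketch, assuming the 0.90 superblocks and device names used in this thread:

    for d in /dev/sda4 /dev/sdb4 /dev/sdc4 /dev/sdd4; do
        echo "== $d"
        mdadm --examine "$d" | grep -E 'Update Time|State :|Events'
    done

Members whose event counts lag behind are the ones mdadm will refuse to include without --force.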
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:53 ` Stefan G. Weichinger @ 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-25 8:05 ` Giovanni Tessore 2010-02-24 17:09 ` Robin Hill 1 sibling, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:02 UTC (permalink / raw) To: linux-raid sda fails also: Feb 24 17:57:42 server-gentoo ata1.00: configured for UDMA/133 Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08 Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor] Feb 24 17:57:42 server-gentoo Descriptor sense data with sense descriptors (in hex): Feb 24 17:57:42 server-gentoo 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 Feb 24 17:57:42 server-gentoo 01 3c ba 1a Feb 24 17:57:42 server-gentoo sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4 Feb 24 17:57:42 server-gentoo end_request: I/O error, dev sda, sector 20757018 Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1032 on sda4). Feb 24 17:57:42 server-gentoo raid5: Disk failure on sda4, disabling device. Operation continuing on 1 devices Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1040 on sda4). Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable (sector 1048 on sda4). (sector 1072 on sda4). So I am down to one drive, from 3 ... :-( Does it make sense to repeat: mdadm --assemble xfs_repair mount and rsync stuff aside until it fails again? I once was lucky with such a strategy ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:02 ` Stefan G. Weichinger @ 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 0 siblings, 2 replies; 21+ messages in thread From: Giovanni Tessore @ 2010-02-25 8:05 UTC (permalink / raw) To: linux-raid Stefan G. Weichinger wrote: > Feb 24 17:57:42 server-gentoo end_request: I/O error, dev sda, sector > 20757018 > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1032 on sda4). > Feb 24 17:57:42 server-gentoo raid5: Disk failure on sda4, disabling > device. Operation continuing on 1 devices > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1040 on sda4). > Feb 24 17:57:42 server-gentoo raid5:md4: read error not correctable > (sector 1048 on sda4). > > > Does it make sense to repeat: > > mdadm --assemble > xfs_repair > mount > > and rsync stuff aside until it fails again? > > I once was lucky with such a strategy ... > I recently had a similar problem with a 6 disk array, when one died and another gave read errors during reconstruction (see older posts from the end of January). I was able to recover most of the data by reassembling the array and copying data from it to other storage, repeating the assembly each time a read error was encountered; so the 'strategy' mostly worked for me (recovered almost everything). It may help to set the md device to readonly mode and to mount the partition readonly. I hope you can recover your data. Regards PS. I see this is the 4th time in a month that people have reported problems on raid5 due to read errors during reconstruction; it looks like the 'corrected read errors' policy is quite a real concern. -- Cordiali saluti. Yours faithfully. Giovanni Tessore ^ permalink raw reply [flat|nested] 21+ messages in thread
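A minimal sketch of that read-only salvage pattern, using the device names from this thread. Note that md4 here is actually an LVM physical volume (per later messages), so the read-only mounts would be of the logical volumes on top of it, and the md-level readonly flag may have to be skipped if LVM needs to update its metadata; mount points are placeholders.

    mdadm --assemble --force /dev/md4 /dev/sda4 /dev/sdb4   # bring the surviving members up
    mdadm --readonly /dev/md4                               # ask md to refuse writes during the salvage
    mount -o ro /dev/md4 /mnt/salvage                       # read-only mount (per logical volume in this setup)
    rsync -a /mnt/salvage/ /mnt/elsewhere/                  # copy out whatever still reads cleanly
    # if a read error kicks a member again: stop the array, re-assemble, continue the copy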
* Re: emergency call for help: raid5 fallen apart 2010-02-25 8:05 ` Giovanni Tessore @ 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 1 sibling, 0 replies; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-25 16:27 UTC (permalink / raw) To: linux-raid Giovanni Tessore schrieb: > [...] > PS. > I see this is the 4th time in a month that poeple reports problem on > raid5 due to the read errors during reconstruction; it looks like the > 'corrected read errors' policy is quite a real concern. > It surely is. There are many more than 4 cases in a month, one could earn a living from it - thank you very much harddisk producers. What we need: setting Error Recovery Control timeouts upon assembly of RAIDs on ATA-disks. Then the normal disks behave like the raid-edition disks from some vendors (i.e. Hitachi even documents on it for the 7K2000 Deskstar drives - well, only between the lines). But this feature is pretty hard to implement - you'll have to guess and test if a disk understands ATA. All the best, Stefan Hübner -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner @ 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread From: John Robinson @ 2010-02-25 16:45 UTC (permalink / raw) To: Giovanni Tessore; +Cc: linux-raid On 25/02/2010 08:05, Giovanni Tessore wrote: [...] > I see this is the 4th time in a month that poeple reports problem on > raid5 due to the read errors during reconstruction; it looks like the > 'corrected read errors' policy is quite a real concern. If you mean md's policy of reconstructing from the other discs and rewriting when there's a read error from one disc of an array, rather than immediately kicking the disc that had a read error, I think you're wrong - I think md is saving lots of users from hitting problems, by keeping their arrays up and running, and giving their discs a chance to remap bad sectors, instead of forcing the user to do full-disc reconstructions more often which will make them more likely to hit read errors during recovery. I do think we urgently need the hot reconstruction/recovery feature, so failing drives can be recovered to fresh drives with two sources of data, i.e. both the failing drive and the remaining drives in the array, giving us two chances of recovering every sector. Cheers, John. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson @ 2010-02-25 17:41 ` Dawning Sky 2010-02-25 18:31 ` John Robinson 0 siblings, 1 reply; 21+ messages in thread From: Dawning Sky @ 2010-02-25 17:41 UTC (permalink / raw) To: linux-raid On Thu, Feb 25, 2010 at 8:45 AM, John Robinson <john.robinson@anonymous.org.uk> wrote: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> > I do think we urgently need the hot reconstruction/recovery feature, so > failing drives can be recovered to fresh drives with two sources of data, > i.e. both the failing drive and the remaining drives in the array, giving us > two chances of recovering every sector. I was one of those 4 cases in the past month. I would have certainly benefited from this when I tried to replace a failing drive on my old raid-5. But I think the redundancy you want can actually be achieved by running a raid-6 in degraded mode (with 1 missing drive). Am I missing something? If this is the case, shouldn't we all be doing this instead of using raid-5? > > Cheers, > > John. DS ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 17:41 ` Dawning Sky @ 2010-02-25 18:31 ` John Robinson 2010-02-26 2:42 ` Michael Evans 0 siblings, 1 reply; 21+ messages in thread From: John Robinson @ 2010-02-25 18:31 UTC (permalink / raw) To: Linux RAID On 25/02/2010 17:41, Dawning Sky wrote: > On Thu, Feb 25, 2010 at 8:45 AM, John Robinson > <john.robinson@anonymous.org.uk> wrote: >> On 25/02/2010 08:05, Giovanni Tessore wrote: >> [...] >> I do think we urgently need the hot reconstruction/recovery feature, so >> failing drives can be recovered to fresh drives with two sources of data, >> i.e. both the failing drive and the remaining drives in the array, giving us >> two chances of recovering every sector. > > I was one of those 4 cases in the part month. I would have certainly > benefited from this when I tried to replace a failing drive on my old > raid-5. But I think actually the redundancy you desired can be > achieved by running a raid-6 at the degraded mode (with 1 missing > drive). > > Do I miss something? If this is the case, shouldn't we all > be doing this instead of using the raid-5? I think you must be missing something, yes. RAID-6 with one drive missing would have 2 chances of recovering each sector, but then so does RAID-5 with no drives missing. In either case, lose a drive and you need every sector on the remaining drives to be good to complete the reconstruction and keep the array up. Cheers, John. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 18:31 ` John Robinson @ 2010-02-26 2:42 ` Michael Evans 0 siblings, 0 replies; 21+ messages in thread From: Michael Evans @ 2010-02-26 2:42 UTC (permalink / raw) To: John Robinson; +Cc: Linux RAID On Thu, Feb 25, 2010 at 10:31 AM, John Robinson <john.robinson@anonymous.org.uk> wrote: > On 25/02/2010 17:41, Dawning Sky wrote: >> >> On Thu, Feb 25, 2010 at 8:45 AM, John Robinson >> <john.robinson@anonymous.org.uk> wrote: >>> >>> On 25/02/2010 08:05, Giovanni Tessore wrote: >>> [...] >>> I do think we urgently need the hot reconstruction/recovery feature, so >>> failing drives can be recovered to fresh drives with two sources of data, >>> i.e. both the failing drive and the remaining drives in the array, giving >>> us >>> two chances of recovering every sector. >> >> I was one of those 4 cases in the part month. I would have certainly >> benefited from this when I tried to replace a failing drive on my old >> raid-5. But I think actually the redundancy you desired can be >> achieved by running a raid-6 at the degraded mode (with 1 missing >> drive). >> >> Do I miss something? If this is the case, shouldn't we all >> be doing this instead of using the raid-5? > > I think you must be missing something, yes. RAID-6 with one drive missing > would have 2 chances of recovering each sector, but then so does RAID-5 with > no drives missing. In either case, lose a drive and you need every sector on > the remaining drives to be good to complete the reconstruction and keep the > array up. > > Cheers, > > John. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > No, what they're saying is that often drives don't /totally/ fail. They have segments that go bad first, and we are often catching them in that state. To use the segments that /are/ successfully returned there is a good chance that multiple 'not full member' drive could provide a complete, or usefully very near complete with known 'dead' areas set to store on fresh devices. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky @ 2010-02-26 20:15 ` Bill Davidsen 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2 siblings, 0 replies; 21+ messages in thread From: Bill Davidsen @ 2010-02-26 20:15 UTC (permalink / raw) To: John Robinson; +Cc: Giovanni Tessore, linux-raid John Robinson wrote: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> I see this is the 4th time in a month that poeple reports problem on >> raid5 due to the read errors during reconstruction; it looks like the >> 'corrected read errors' policy is quite a real concern. > > If you mean md's policy of reconstructing from the other discs and > rewriting when there's a read error from one disc of an array, rather > than immediately kicking the disc that had a read error, I think > you're wrong - I think md is saving lots of users from hitting > problems, by keeping their arrays up and running, and giving their > discs a chance to remap bad sectors, instead of forcing the user to do > full-disc reconstructions more often which will make them more likely > to hit read errors during recovery. > > I do think we urgently need the hot reconstruction/recovery feature, > so failing drives can be recovered to fresh drives with two sources of > data, i.e. both the failing drive and the remaining drives in the > array, giving us two chances of recovering every sector. Ideally, there would be a way to avoid kicking any failing drive, or even trying to rewrite the unreadable sector. Some md utility which would clone a drive using logic similar to this: - start with array assembled but not started - read a sector from the source drive, reconstruct it if the source read fails, report errors and keep going - write any recovered sector to the destination - optionally read it back to be sure it worked, rewrite and note errors; to be useful it must flush to the platter and reread. Yes, it will be slow. Don't try to be smart, try to make a usable copy of a drive! I think in case a sector can't be recovered a fixed pattern should be written to the destination, for ease of identification if nothing else. I think being able to specify MBR or a partition would be useful, that would let critical things be saved faster and with less work. This also opens up possibilities for migration of several kinds. This really should be a command in mdadm! Why? Because it is vital that changes in how mdadm does things are tracked in this tool. Because when you are down to trying this you don't want to be looking for matching versions, etc. -- Bill Davidsen <davidsen@tmr.com> "We can't solve today's problems by using the same thinking we used in creating them." - Einstein ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky 2010-02-26 20:15 ` Bill Davidsen @ 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 2 siblings, 1 reply; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-28 11:50 UTC (permalink / raw) To: linux-raid Hi John, John Robinson schrieb: > On 25/02/2010 08:05, Giovanni Tessore wrote: > [...] >> I see this is the 4th time in a month that poeple reports problem on >> raid5 due to the read errors during reconstruction; it looks like the >> 'corrected read errors' policy is quite a real concern. > > If you mean md's policy of reconstructing from the other discs and > rewriting when there's a read error from one disc of an array, rather > than immediately kicking the disc that had a read error, I think you're > wrong - I think md is saving lots of users from hitting problems, by > keeping their arrays up and running, and giving their discs a chance to > remap bad sectors, instead of forcing the user to do full-disc > reconstructions more often which will make them more likely to hit read > errors during recovery. I think you misunderstood me. I recently was told what you wrote in the last paragraph. I know it is good, as that is the most intelligently possible behaviour of md. BUT: if the drive takes let's say 2 min for internal error recovery to succeed of fail (whichever, doesn't matter), then the SG EH layer of the kernel will drop the disk, not md. This forces md to drop the disk, also. The conclusion is: a technology is needed to prevent another kernel level from dropping the disk. This technology exists, it's called SCT-ERC (Smart Control Transport - Error Recovery Control). It's the same as WD's TLER or Samsung's CCTL. But it is non-volatile. After a power on reset the timeout-values are reset to factory defaults. So it needs to be set right before adding a disk to an array. (for more info: check www.t13.org, find the ATA8-ACS documents) > > I do think we urgently need the hot reconstruction/recovery feature, so > failing drives can be recovered to fresh drives with two sources of > data, i.e. both the failing drive and the remaining drives in the array, > giving us two chances of recovering every sector. I do not think this is easily possible. One would have to keep a map about the "in sync" sectors of an array member and the "failed" sectors. My guess is: this would need a partial redesign (again a new superblock type containing information about "failed segments" probably). Please correct me if I'm wrong and that is already included in 1.X (I'm mostly working on 0.90 Superblocks). > > Cheers, > > John. > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Cheers, Stefan. ^ permalink raw reply [flat|nested] 21+ messages in thread
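The kernel-side half of that timeout mismatch is also tunable: the SCSI layer's per-device command timeout is exposed in sysfs (in seconds, usually 30 by default), so on drives with no working ERC it can be raised high enough that the kernel outlasts the drive's internal retries instead of dropping it. A sketch only; whether this is preferable to shortening the drive's own recovery time is debatable, and the value shown is just an example.

    cat /sys/block/sda/device/timeout          # current SCSI command timeout, in seconds
    echo 180 > /sys/block/sda/device/timeout   # give the drive up to 3 minutes before error handling kicks in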
* Re: emergency call for help: raid5 fallen apart 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner @ 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 0 siblings, 0 replies; 21+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2010-02-28 12:52 UTC (permalink / raw) To: linux-raid Sorry @all, I had a few typos: Stefan /*St0fF*/ Hübner schrieb: > [...] > BUT: if the drive takes let's say 2 min for internal error recovery to > succeed of fail (whichever, doesn't matter), then the SG EH layer of the -> succeed OR fail > kernel will drop the disk, not md. This forces md to drop the disk, > also. The conclusion is: a technology is needed to prevent another > kernel level from dropping the disk. This technology exists, it's > called SCT-ERC (Smart Control Transport - Error Recovery Control). It's > the same as WD's TLER or Samsung's CCTL. But it is non-volatile. After -> But it is volatile. > a power on reset the timeout-values are reset to factory defaults. So > it needs to be set right before adding a disk to an array. > (for more info: check www.t13.org, find the ATA8-ACS documents) >> I do think we urgently need the hot reconstruction/recovery feature, so >> failing drives can be recovered to fresh drives with two sources of >> data, i.e. both the failing drive and the remaining drives in the array, >> giving us two chances of recovering every sector. > > I do not think this is easily possible. One would have to keep a map > about the "in sync" sectors of an array member and the "failed" sectors. > My guess is: this would need a partial redesign (again a new superblock > type containing information about "failed segments" probably). > Please correct me if I'm wrong and that is already included in 1.X (I'm > mostly working on 0.90 Superblocks). >> Cheers, >> >> John. >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > Cheers, > Stefan. > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
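For drives that do implement SCT ERC, the timeouts can be inspected and set from userspace with smartmontools, assuming a smartctl new enough to know the scterc option. The values are in units of 100 ms, and, as corrected above, the setting is volatile, so it has to be reapplied for every array member after each power cycle, e.g. from a local boot script.

    smartctl -l scterc /dev/sda         # show the current read/write ERC timers, if supported
    smartctl -l scterc,70,70 /dev/sda   # limit internal error recovery to 7.0 seconds for reads and writes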
* Re: emergency call for help: raid5 fallen apart 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger @ 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger 2010-02-24 17:35 ` Stefan G. Weichinger 1 sibling, 2 replies; 21+ messages in thread From: Robin Hill @ 2010-02-24 17:09 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 2420 bytes --] On Wed Feb 24, 2010 at 05:53:27PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 17:38, schrieb Stefan G. Weichinger: > > > I now have md4 on sda4 and sdb4 ... xfs_repaired ... and sync the data > > to a plain new xfs-partition on sdc4 ... just to get current data out of > > the way. > > > Status now, after another reboot because of a failing md4: > > why degraded? How to get out of that and re-add sdc4 or sdd4 ? > What about that device 2 down there?? > > > server-gentoo ~ # mdadm -D /dev/md4 > /dev/md4: > Version : 00.90.03 > Creation Time : Tue Aug 5 14:14:16 2008 > Raid Level : raid5 > Array Size : 291820544 (278.30 GiB 298.82 GB) > Used Dev Size : 145910272 (139.15 GiB 149.41 GB) > Raid Devices : 3 > Total Devices : 2 > Preferred Minor : 4 > Persistence : Superblock is persistent > > Update Time : Wed Feb 24 17:41:15 2010 > State : clean, degraded > Active Devices : 2 > Working Devices : 2 > Failed Devices : 0 > Spare Devices : 0 > > Layout : left-symmetric > Chunk Size : 64K > > UUID : d4b0e9c1:067357ce:2569337e:e9af8bed > Events : 0.198 > > Number Major Minor RaidDevice State > 0 8 4 0 active sync /dev/sda4 > 1 8 20 1 active sync /dev/sdb4 > 2 0 0 2 removed > It's degraded because you only have 2 disks in the array, presumably the event count on the other disks doesn't match up. If you've replaced sdc and sdd never got rebuilt onto, then you only have the two disks available for the array anyway. If these are the only disks with up-to-date data, and sda4 is still failing, I can only suggest stopping the array and using dd/dd_rescue to copy sda4 onto a working disk. You should then be able to reassemble the array with sdb4 and the new disk, then add in a hot spare to recover. Alternately, bite the bullet, recreate the array and restore. Either way, it looks like you ought to be running regular checks on the array to try to pick up/fix these background failures. Cheers, Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
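The "regular checks" Robin mentions are md's built-in scrub, driven through sysfs; a sketch, assuming the md sysfs interface available in this kernel (many distributions ship a cron job that triggers this monthly):

    echo check > /sys/block/md4/md/sync_action   # read every sector of every member; unreadable blocks are rewritten from redundancy
    cat /proc/mdstat                             # the scrub shows up like a resync, with a progress bar
    cat /sys/block/md4/md/mismatch_cnt           # non-zero afterwards means parity/data mismatches were seen (fix with 'repair')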
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:09 ` Robin Hill @ 2010-02-24 17:28 ` Stefan G. Weichinger 0 siblings, 0 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:28 UTC (permalink / raw) To: linux-raid Am 24.02.2010 18:09, schrieb Robin Hill: > It's degraded because you only have 2 disks in the array, presumably the > event count on the other disks doesn't match up. If you've replaced sdc > and sdd never got rebuilt onto, then you only have the two disks > available for the array anyway. Yep. > If these are the only disks with up-to-date data, and sda4 is still > failing, I can only suggest stopping the array and using dd/dd_rescue to > copy sda4 onto a working disk. You should then be able to reassemble > the array with sdb4 and the new disk, then add in a hot spare to > recover. OK, that's plan B. For now I try to get data aside. md4 is a PV in an LVM-VG ... the main data-LV seems to trigger the errors, but another LV seems more stable (other sectors or something). This other LV contains rsnapshots of the main data-LV ... so if I am lucky I only lose about 2hrs of work if I get the latest snapshot copied. rsync is down to character "s" already ........ For sure there's a third LV as well, containing VMware-VMs ... oh my. Let's pray this one is OK as well, at least while copying stuff. > Alternately, bite the bullet, recreate the array and restore. hmm > Either way, it looks like you ought to be running regular checks on the > array to try to pick up/fix these background failures. smartd led me to the failing sdc ... no note of sda though ... A bad taste after all. Thanks anyway, Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger @ 2010-02-24 17:35 ` Stefan G. Weichinger 2010-02-24 18:12 ` Robin Hill 1 sibling, 1 reply; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 17:35 UTC (permalink / raw) To: linux-raid Am 24.02.2010 18:09, schrieb Robin Hill: > If these are the only disks with up-to-date data, and sda4 is still > failing, I can only suggest stopping the array and using dd/dd_rescue to > copy sda4 onto a working disk. You should then be able to reassemble > the array with sdb4 and the new disk, then add in a hot spare to > recover. Currently sdd isn't in use at all. So I could mdadm --stop /dev/md4 ddrescue /dev/sda4 /dev/sdd4 mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 ?? Sorry for my explicit questions, I am rather stressed here ... S ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: emergency call for help: raid5 fallen apart 2010-02-24 17:35 ` Stefan G. Weichinger @ 2010-02-24 18:12 ` Robin Hill 2010-02-24 19:54 ` Stefan G. Weichinger 0 siblings, 1 reply; 21+ messages in thread From: Robin Hill @ 2010-02-24 18:12 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 1205 bytes --] On Wed Feb 24, 2010 at 06:35:46PM +0100, Stefan G. Weichinger wrote: > Am 24.02.2010 18:09, schrieb Robin Hill: > > > If these are the only disks with up-to-date data, and sda4 is still > > failing, I can only suggest stopping the array and using dd/dd_rescue to > > copy sda4 onto a working disk. You should then be able to reassemble > > the array with sdb4 and the new disk, then add in a hot spare to > > recover. > > Currently sdd isn't in use at all. > > So I could > > > mdadm --stop /dev/md4 > > ddrescue /dev/sda4 /dev/sdd4 > > mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 > > ?? > Yes, that'd be what I'd recommend - the ddrescue will only need to make a single pass across sda4 (except for failed blocks of course), so will have the lowest risk of exacerbating the disk problems. Of course, the practicality of doing the assemble will depend on the number of unreadable blocks found by ddrescue. Good luck! Robin -- ___ ( ' } | Robin Hill <robin@robinhill.me.uk> | / / ) | Little Jim says .... | // !! | "He fallen in de water !!" | [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
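One refinement to that plan: GNU ddrescue (as opposed to the older dd_rescue, which takes different options) keeps a logfile of what has been copied, so the pass can be interrupted and resumed and retries can be restricted to the bad areas. A sketch with an example logfile path; since the 0.90 superblock sits near the end of the partition, a full partition copy carries it along as long as sdd4 is at least as large as sda4.

    mdadm --stop /dev/md4
    ddrescue -f -n /dev/sda4 /dev/sdd4 /root/sda4.log    # fast first pass, skipping the failing areas
    ddrescue -f -r3 /dev/sda4 /dev/sdd4 /root/sda4.log   # retry only the remaining bad areas a few times
    mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4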
* Re: emergency call for help: raid5 fallen apart 2010-02-24 18:12 ` Robin Hill @ 2010-02-24 19:54 ` Stefan G. Weichinger 0 siblings, 0 replies; 21+ messages in thread From: Stefan G. Weichinger @ 2010-02-24 19:54 UTC (permalink / raw) To: linux-raid Am 24.02.2010 19:12, schrieb Robin Hill: >> mdadm --stop /dev/md4 >> >> ddrescue /dev/sda4 /dev/sdd4 >> >> mdadm --assemble --force /dev/md4 /dev/sdb4 /dev/sdd4 >> >> ?? >> > Yes, that'd be what I'd recommend - the ddrescue will only need to make > a single pass across sda4 (except for failed blocks of course), so will > have the lowest risk of exacerbating the disk problems. Of course, the > practicality of doing the assemble will depend on the number of > unreadable blocks found by ddrescue. I decided to somehow roll back. As far as we see we lose 1.5 hrs of work done by some people ... thanks to the rsnapshots ... The LV containing the VM was/is healthy, I was able to copy the vm-directory fine and the VM boots and runs. No work lost here. So far I don't use that flaky md4 for now ... doing the ddrescue would take quite some time and this box has to be UP tomorrow. And additionally I wouldn't know about the result. I'll decide how to proceed tomorrow. For now the data and the function for tomorrow comes first, even without full RAID now. So I restore stuff now ... I am somehow tired and exhausted now and not willing to risk what I have got now. > Good luck! > Robin Thanks a lot ... Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread -- Thread overview: 21+ messages: 2010-02-24 14:54 emergency call for help: raid5 fallen apart Stefan G. Weichinger 2010-02-24 15:05 ` Stefan G. Weichinger 2010-02-24 15:22 ` Robin Hill 2010-02-24 15:32 ` Stefan G. Weichinger 2010-02-24 16:38 ` Stefan G. Weichinger 2010-02-24 16:53 ` Stefan G. Weichinger 2010-02-24 17:02 ` Stefan G. Weichinger 2010-02-25 8:05 ` Giovanni Tessore 2010-02-25 16:27 ` Stefan /*St0fF*/ Hübner 2010-02-25 16:45 ` John Robinson 2010-02-25 17:41 ` Dawning Sky 2010-02-25 18:31 ` John Robinson 2010-02-26 2:42 ` Michael Evans 2010-02-26 20:15 ` Bill Davidsen 2010-02-28 11:50 ` Stefan /*St0fF*/ Hübner 2010-02-28 12:52 ` Stefan /*St0fF*/ Hübner 2010-02-24 17:09 ` Robin Hill 2010-02-24 17:28 ` Stefan G. Weichinger 2010-02-24 17:35 ` Stefan G. Weichinger 2010-02-24 18:12 ` Robin Hill 2010-02-24 19:54 ` Stefan G. Weichinger