From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Janos Haar"
Subject: Re: Suggestion needed for fixing RAID6 [SOLVED]
Date: Wed, 5 May 2010 17:24:37 +0200
Message-ID: <25a201caec67$15b26580$0400a8c0@dcccs>
References: <626601cae203$dae35030$0400a8c0@dcccs>
 <20100423065143.GA17743@maude.comedia.it>
 <695a01cae2c1$a72907d0$0400a8c0@dcccs>
 <4BD193D0.5080003@shiftmail.org>
 <717901cae3e5$6a5fa730$0400a8c0@dcccs>
 <4BD3751A.5000403@shiftmail.org>
 <756601cae45e$213d6190$0400a8c0@dcccs>
 <4BD569E2.7010409@shiftmail.org>
 <7a3e01cae53f$684122c0$0400a8c0@dcccs>
 <4BD5C51E.9040207@shiftmail.org>
 <80a201cae621$684daa30$0400a8c0@dcccs>
 <4BD76CF6.5020804@shiftmail.org>
 <20100428113732.03486490@notabene.brown>
 <4BD830B0.1080406@shiftmail.org>
 <025e01cae6d7$30bb7870$0400a8c0@dcccs>
 <4BD843D4.7030700@shiftmail.org>
 <062001cae771$545e0910$0400a8c0@dcccs>
 <4BD9A41E.9050009@shiftmail.org>
 <0c1201cae7e0$01f9a930$0400a8c0@dcccs>
 <4BDA0F88.70907@shiftmail.org>
 <0d6401cae82c$da8b5590$0400a8c0@dcccs>
 <4BDB6DB6.5020306@shiftmail.org>
 <12cf01cae911$f0d92940$0400a8c0@dcccs>
 <4BDC6217.9000209@shiftmail.org>
 <154b01cae977$6e09da80$0400a8c0@dcccs>
 <20100503121747.7f2cc1f1@notabene.brown>
 <4BDE9FB6.80309@shiftmail.org>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset="ISO-8859-1"; reply-type=response
Content-Transfer-Encoding: 7bit
Return-path:
Sender: linux-raid-owner@vger.kernel.org
To: MRK
Cc: linux-raid@vger.kernel.org, Neil Brown
List-Id: linux-raid.ids

>
> It is not clear to me what kind of error MD got from DM:
>
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: device-mapper: snapshots:
> Invalidating snapshot: Error reading/writing.
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8: EH complete
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5: Disk failure on dm-1,
> disabling device.
>
> I don't understand from what place the md_error() is called...
> but also in this case it doesn't look like a rewrite error...
>
> I think without DM COW it should probably work in his case.

First, sorry for the delay.

Without DM, the original behavior-fix patch worked very well.

Neil is generally right that the drive should reallocate the bad sectors
on rewrite, but that is the ideal scenario, which unfortunately is far
from the real world...

I had to repeat the "repair" sync method 4 times on the better HDD (which
has only 123 bad sectors) to make it readable again.
The other HDD has >2500 bad sectors, and it looks like there is no chance
of fixing it this way.

>
> Your new patch skips the rewriting and keeps the unreadable sectors,
> right? So that the drive isn't dropped on rewrite...
>
>> The following patch should address this issue for you.
>> It is *not* a general-purpose fix, but a specific fix
> [CUT]

Neil, I think this patch should be made switchable through sysfs or proc
and be inactive by default; it would of course be good for recovering bad
cases like mine.

There are a lot of HDD problems which can produce truly uncorrectable
sectors that can't be made good again even on rewrite...

Thanks a lot to all who helped me solve this...

And MRK, please don't forget to write in my name. :-)

Cheers,
Janos