From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Brown
Subject: Re: data scrubbing
Date: Sat, 30 Jul 2011 00:16:55 +0200
Message-ID:
References: <4E327445.9080404@oldum.net> <4E32B4D3.3030905@oldum.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 29/07/11 23:51, Mathias Burén wrote:
> On 29 July 2011 21:48, Beolach wrote:
>> On Fri, Jul 29, 2011 at 07:25, Nikolay Kichukov wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> Hi,
>>>
>>> This is a good to know!
>>>
>>> Just performed a check on a raid1 and got:
>>>
>>> Jul 29 15:37:36 hanna64 mdadm[2277]: RebuildFinished event detected on md device /dev/md1, component device mismatches
>>> found: 128
>>>
>>> So I presume those mismatches have now been rewritten to both disks successfully. Am I wrong there?
>>>
>>> cat /sys/block/md1/md/mismatch_cnt
>>> 128
>>>
>>>
>>
>> That depends on if you did a "check" or a "repair" - see the SCRUBBING
>> AND MISMATCHES section of the md(4) man page:
>>   "If check was used, then no action is taken to handle the mismatch,
>>    it is simply recorded. If repair was used, then a mismatch will
>>    be repaired in the same way that resync repairs arrays."
>>
>>
>> Good luck,
>> Beolach
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> Sorry to chime in like this. After reading the above, is there a
> reason why anyone shouldn't _always_ use repair instead of check on a
> weekly RAID6 check? You have to run repair anyway after a check if any
> issues are found, right?
>
> Or does the system become vulnerable during a repair?
> (less redundant)
>
> Thanks,
> Mathias

If you do a repair, then when a mismatch is found, one of the disks is
taken as the "bad" one and re-created. For raid1, the first copy is
assumed correct. For raid5/6, the data blocks are assumed correct and
the parities are re-created. As Neil Brown explained on his blog, without
any more information this is as good as md raid can do. However,
it is not necessarily as good as /you/ can do. For example, you might
be able to determine which files use the blocks in the mismatched
stripe, and figure out which block was bad. Or for 3-disk raid1 you
could pick the bad block as the odd one out (assuming the other two
matched). For raid6, it's possible to spot whether it is a single-disk
mismatch and correct that one disk: for each disk in turn, assume it is
missing and re-create it from the other disks using normal raid6
recovery. If the stripe is then consistent, you've fixed the mismatch.

However, such approaches are not necessarily the correct ones. Thus
"repair" just does the simplest and fastest correction of the
mismatch, and "check" does not change the stripe, in case you want to
manually pick a different method.
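
To make the raid6 "assume each disk in turn is missing" idea above a bit
more concrete, here is a rough Python sketch. It is only an illustration
under my own assumptions, not what md does internally: the stripe is a
toy list of equal-sized byte blocks rather than md's on-disk layout, all
function names are made up for this example, and the Q parity uses the
GF(2^8) polynomial 0x11d with generator 2, which is the conventional
RAID6 choice as far as I know. It also assumes at most one disk in the
stripe is wrong; with two or more bad disks the answer cannot be trusted.

```python
# Sketch only: toy RAID6 stripe in pure Python, not md's on-disk format.
from functools import reduce


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8+x^4+x^3+x^2+1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow2(i):
    """Return 2**i in GF(2^8); used as the Q coefficient of data disk i."""
    value = 1
    for _ in range(i):
        value = gf_mul(value, 2)
    return value


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def compute_p(data):
    """P parity: plain XOR of all data blocks."""
    return xor_blocks(data)


def compute_q(data):
    """Q parity: XOR of (2**i * block_i), computed byte by byte."""
    q = bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow2(i)
        for j, byte in enumerate(block):
            q[j] ^= gf_mul(coeff, byte)
    return bytes(q)


def locate_single_bad_disk(data, p, q):
    """Return None if the stripe is consistent, "P" or "Q" if only that
    parity disagrees, the index of the data disk whose re-creation makes
    the stripe consistent, or "unknown" if no single disk explains it."""
    p_ok = compute_p(data) == p
    q_ok = compute_q(data) == q
    if p_ok and q_ok:
        return None
    if q_ok:
        return "P"  # data agrees with Q, so P is the odd one out
    if p_ok:
        return "Q"  # data agrees with P, so Q is the odd one out
    # Both parities disagree: treat each data disk in turn as missing,
    # rebuild it from P and the other data disks, then test against Q.
    for i in range(len(data)):
        others = data[:i] + data[i + 1:]
        rebuilt = xor_blocks(others + [p])
        candidate = data[:i] + [rebuilt] + data[i + 1:]
        if compute_q(candidate) == q:
            return i
    return "unknown"
```

Given the data blocks of one stripe plus its stored P and Q, calling
locate_single_bad_disk(data, p, q) tells you which single block, if any,
is inconsistent with the rest. Whether rewriting that block is actually
the right fix is still a judgement call, which is the reason "check"
leaves the stripe alone.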