From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wols Lists
Subject: Re: using the raid6check report
Date: Sun, 8 Jan 2017 21:06:14 +0000
Message-ID: <5872A9C6.7010408@youngman.org.uk>
References: <14e8ec23-de4a-e90b-4b67-155e5e3cc228@eyal.emu.id.au> <20170108174010.GA3699@lazy.lzy> <20170108204659.GB7057@lazy.lzy>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Return-path:
In-Reply-To: <20170108204659.GB7057@lazy.lzy>
Sender: linux-raid-owner@vger.kernel.org
To: Piergiorgio Sartor, Eyal Lebedinsky
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 08/01/17 20:46, Piergiorgio Sartor wrote:
> "should" as in "it is supposed to do it".
>
> So, as far as I know, "raid6check" with "repair" will
> check the parity and try to find errors.
> If possible, it will find where the error is, then
> re-compute the value and write the corrected data.
>
> Now, this was somehow tested and *should* work.
>
> An other option is just to check for the errors and
> see if one drive is constantly at fault.
> This will not write anything, so it is safer, but
> it will help to see if there are strange things,
> before writing to the disk(s).

Hmmm ...

I've now been thinking about it, and actually I'm not sure it's possible
even with raid6, to correct a corrupt read. The thing is, raid protects
against a failure to read - if a sector fails, the parity will re-create
it. But if a data sector is corrupted, how is raid to know WHICH sector?

If one of the parity sectors is corrupted, it's easy. Calculate parity
from the data, and either P or Q will be wrong, so fix it. But if it's a
*data* sector that's corrupted, both P and Q will be wrong. How easy is
it to work back from that, and work out *which* data sector is wrong? My
fu makes me think you can't, though I could quite easily be wrong :-)

But should that even happen, unless a disk is on its way out, anyway?
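
To make the "work back from P and Q" question concrete, here is a minimal
sketch. It is not raid6check's code. It assumes the usual RAID-6
definitions (P is the XOR of the data, Q is the XOR of g^i * D_i in
GF(2^8) with polynomial 0x11d and g = 2), and it assumes that at most one
block in the stripe is silently corrupt. One byte per "disk" stands in for
a whole sector, and the function names are made up for illustration. Under
those assumptions the ratio of the Q error to the P error gives the index
of the bad data disk:

```python
# Sketch only: GF(2^8) arithmetic with polynomial 0x11d, generator g = 2.

def gf_mul2(x):
    """Multiply a field element by g = 2."""
    x <<= 1
    return (x ^ 0x11D) if x & 0x100 else x

# Build exp/log tables for the generator.
EXP = [0] * 255
LOG = [0] * 256
v = 1
for i in range(255):
    EXP[i] = v
    LOG[v] = i
    v = gf_mul2(v)

def syndromes(data):
    """Return (P, Q) for one byte per data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        if d:
            q ^= EXP[(LOG[d] + i) % 255]  # g^i * d
    return p, q

def locate_and_fix(data, p, q):
    """Assumes at most ONE of the data bytes, P or Q is corrupt.

    Returns (kind, index, repaired_data).
    """
    p_calc, q_calc = syndromes(data)
    dp, dq = p ^ p_calc, q ^ q_calc
    if dp == 0 and dq == 0:
        return ("ok", None, list(data))
    if dq == 0:
        return ("P", None, list(data))   # only P disagrees
    if dp == 0:
        return ("Q", None, list(data))   # only Q disagrees
    # Both disagree: dq = g^z * dp, so z = log(dq) - log(dp).
    z = (LOG[dq] - LOG[dp]) % 255
    if z >= len(data):
        return ("unknown", None, list(data))  # more than one error
    fixed = list(data)
    fixed[z] ^= dp                       # dp is exactly the error pattern
    return ("data", z, fixed)
```

For example, with data = [0x11, 0x22, 0x33, 0x44] and byte 2 flipped by
0x5a, locate_and_fix() reports ("data", 2, ...) and returns the original
bytes. With two or more corrupt blocks in the same stripe the result is
not reliable: it may come back as "unknown" or point at the wrong disk,
which is a good reason to run a check-only pass before any repair.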

I remember years ago, back in the 80s, our minicomputers had
error-correction in the drive. I don't remember the algorithm, but it
wrote a 16-bit word to disk for each 8-bit data byte. The first half was
the original data, and the second half was some parity pattern such that
for any single-bit corruption you knew which half was corrupt, and you
could throw away the corrupt parity, or recreate the correct data from
the parity. Even with a 2-bit error I think it was >90% detection and
recreation.

I can't imagine something like that not being in drive hardware today.

Cheers,
Wol