Re: Read errors on raid5 ignored, array still clean .. then disaster !!

From: Asdo <asdo@shiftmail.org>
To: Giovanni Tessore <giotex@texsoft.it>
Cc: linux-raid@vger.kernel.org, Neil Brown <neilb@suse.de>
Subject: Re: Read errors on raid5 ignored, array still clean .. then disaster !!
Date: Sat, 30 Jan 2010 22:41:13 +0100
Message-ID: <4B64A779.6070809@shiftmail.org>
In-Reply-To: <4B647E0E.6050609@texsoft.it>

Giovanni Tessore wrote:
>
>>>> Is this some kind of bug?        
>>> No
>>>     
>> I'm not sure I agree.
>>   
>
> Hm funny ... I just read now from md's man:
>
> "In  kernels  prior to about 2.6.15, a read error would cause the same 
> effect as a write error.  In later kernels, a read-error will instead 
> cause md to attempt a recovery by overwriting the bad block. .... "
>
> So things have changed since 2.6.15 ... I was not so wrong to expect 
> "the old behaviour" and to be disappointed.
> But something important was missing during this change imho:
> 1) let the old behaviour be the default: add 
> /sys/block/mdXX/max_correctale_read_errors, with default to 0.
> 2) let the new behaviour be the default, but update mdadm and 
> /proc/mdstat to report read error events.
>
> I think the situation is now quite clear.
> Thanks

I have the feeling that the current behaviour is the correct one, at
least for RAID-6.

If you scrub often enough, read errors should be caught while you still
have enough good disks in that stripe.
At that point the rewrite will kick in.
If the disk has enough spare sectors available for relocation, the
sector will be relocated; otherwise the disk gets kicked.
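
For reference, a scrub can be requested through md's sysfs interface by
writing "check" to the array's sync_action file. Below is a minimal
sketch in Python, assuming the array is md0 and sysfs is mounted at
/sys. The helper name request_scrub and its sysfs_root parameter are
made up for illustration.

```python
from pathlib import Path


def request_scrub(md_name: str = "md0", sysfs_root: str = "/sys") -> Path:
    """Ask md to start a read-only scrub ("check") of the given array.

    Needs root on a real system. md_name and sysfs_root are assumptions;
    adjust them to your setup.
    """
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    # Writing "check" makes md read every stripe and verify redundancy;
    # unreadable sectors found on the way are rewritten from the other disks.
    action.write_text("check\n")
    return action
```

Run periodically (from cron, for example), something like this is what
"scrub often enough" would amount to in practice.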

As other people have written, disks are now much bigger than in the
past, and an occasional damaged sector is to be expected. That alone
does not make it necessary to kick the drive.

This is with RAID-6.

RAID-5 is unfortunately inherently unsafe, and here is why:
If one drive gets kicked, MD starts recovering to a spare.
At that point any single read error during the regeneration (which
reads every sector, just like a scrub) will fail the array.
This problem cannot be overcome, even in theory.
Even with the old algorithm, any sector that has failed since the last
scrub will take the array down when one disk is kicked (the array will
go down during recovery).
So you would need to scrub continuously, or you would need
hyper-reliable disks.
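
To put a rough number on that risk, here is a back-of-the-envelope
sketch in Python. It assumes read errors are independent and occur at a
fixed per-bit rate, which is a simplification, and the default rate of
1e-14 is only an illustrative assumption, not a measured value. The
function name is hypothetical.

```python
import math


def rebuild_failure_probability(n_disks: int, disk_bytes: float,
                                ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unreadable sector while rebuilding RAID-5.

    Assumes independent read errors at a fixed per-bit rate, which is a
    simplification; ure_per_bit = 1e-14 is an illustrative figure only.
    """
    # A rebuild must read every surviving disk in full.
    bits_read = (n_disks - 1) * disk_bytes * 8
    # 1 - (1 - p)^bits, computed stably for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))
```

Under those assumptions, a 4-disk array of 1 TB drives
(rebuild_failure_probability(4, 1e12)) comes out at roughly 0.21, i.e.
about one rebuild in five would hit an unreadable sector.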

Yes, kicking a drive as soon as it presents the first unreadable sector 
can be a strategy for trying to select hyper-reliable disks...

OK, after all I might agree that this can be a reasonable strategy for
raid 1, 4, 5...

I'd also agree that with the 1.x superblock it would be desirable to be
able to set the maximum number of corrected read errors before a drive
is kicked, which could be set by default to 0 for raid 1, 4, 5 and to...
I don't know... 20 (50? 100?) for raid-6.

Actually I believe the drives should be kicked for exceeding this
threshold only AFTER the end of the scrub, so that they are used for
parity computation till the end of the scrub. I would suggest checking
this threshold at the end of each scrub, not before, and during normal
array operation only if no scrub/resync is in progress (in that case it
will be checked at the end anyway).
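
The check I have in mind could be sketched like this in Python. This is
only a toy model of the proposal: every name in it (ReadErrorPolicy,
max_corrected and so on) is hypothetical, and none of it is md's actual
code.

```python
from dataclasses import dataclass, field


@dataclass
class ReadErrorPolicy:
    """Toy model of the proposal: defer kicking until a scrub has finished.

    All names here are hypothetical; nothing below is md's actual code.
    """
    max_corrected: int                      # e.g. 0 for raid1/4/5, 20 for raid6
    scrub_running: bool = False
    corrected: dict = field(default_factory=dict)   # disk name -> error count

    def over_threshold(self) -> list:
        return sorted(d for d, n in self.corrected.items()
                      if n > self.max_corrected)

    def on_corrected_error(self, disk: str) -> list:
        """Record one corrected read error; return the disks to kick now."""
        self.corrected[disk] = self.corrected.get(disk, 0) + 1
        # During a scrub, keep the disk: it still contributes to parity.
        return [] if self.scrub_running else self.over_threshold()

    def on_scrub_end(self) -> list:
        """The scrub is over; return every disk that exceeded the threshold."""
        self.scrub_running = False
        return self.over_threshold()
```

With max_corrected=0 and a scrub in progress, on_corrected_error("sdb")
returns an empty list, and the disk is reported for kicking only when
on_scrub_end() is called.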

Thank you
Asdo
