Re: Raid5 drive fail during grow and no backup

linux-raid.vger.kernel.org archive mirror

From: Phil Turmel <philip@turmel.org>
To: "P. Gautschi" <linuxlist@gautschi.net>
Cc: Vince <stuff@hagenhuegel.de>, linux-raid@vger.kernel.org
Subject: Re: Raid5 drive fail during grow and no backup
Date: Fri, 07 Nov 2014 22:36:26 -0500
Message-ID: <545D8FBA.9090701@turmel.org>
In-Reply-To: <545CEDFB.6060806@gautschi.net>

On 11/07/2014 11:06 AM, P. Gautschi wrote:
> > This is a problem you haven't solved yet, I think. The raid array
> > should have fixed this bad sector for you without kicking the drive
> > out. The scenario is common with "green" drives and/or
> > consumer-grade drives in general.
> > ...
> > Then you can set up your array to properly correct bad sectors, and
> > set your system to look for bad sectors on a regular basis.
>
> What is the behavior of mdadm when a disk reports a read error?
> - reconstruct the data, deliver it to the fs and otherwise ignore it?
> - set the disk to fail?
> - reconstruct the data, rewrite the failed data and continue with any
> action?
> - rewrite the failed data and reread it (bypassing the cache on the HD)?

Option 3.  Reconstruct and rewrite.
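
As a rough illustration of what "reconstruct and rewrite" means for a
raid5 stripe, here is a toy Python sketch. It is not MD's actual code:
the function names and the in-memory "stripe" list are hypothetical, and
it assumes single-parity XOR with exactly one unreadable block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_with_repair(stripe, bad_index):
    """Rebuild the unreadable block from the surviving blocks, then
    write the rebuilt data back over the bad slot (the "rewrite" step)."""
    survivors = [blk for i, blk in enumerate(stripe) if i != bad_index]
    rebuilt = xor_blocks(survivors)
    stripe[bad_index] = rebuilt
    return rebuilt


# Three data blocks plus one parity block form a toy stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stripe = data + [xor_blocks(data)]

stripe[1] = None                       # simulate a read error on one member
recovered = read_with_repair(stripe, 1)
```

The rewrite is the important part: it gives the drive the chance to
remap or refresh the sector, which only helps if the write succeeds.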

However, if the drive with the bad sector spends longer on its internal
error recovery than the Linux low-level driver's timeout, bad things^TM
happen.  Specifically, the driver resets the SATA (or SCSI) link and
attempts to reconnect.  During this brief window the drive accepts no
further I/O, so the write-back of the reconstructed data fails.  MD now
sees a *write* error on that device and fails the drive.  This is the
out-of-the-box behavior of consumer-grade drives in raid arrays.
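
To make the timeout mismatch concrete, here is a minimal Python sketch
that reads the driver-side command timeout from sysfs and compares it
with how long a drive may spend on recovery. The helper names are
hypothetical, and drive_recovery_seconds is a figure you must supply
yourself for your drive model; nothing here measures it.

```python
from pathlib import Path


def driver_timeout_seconds(device, sysfs_root="/sys"):
    """Read the kernel's command timeout for a disk such as 'sda'.

    The value lives in /sys/block/<device>/device/timeout, in seconds.
    """
    path = Path(sysfs_root) / "block" / device / "device" / "timeout"
    return int(path.read_text().strip())


def write_back_at_risk(drive_recovery_seconds, timeout_seconds):
    """True if the drive may still be retrying when the driver gives up.

    In that case the link gets reset and the write-back of the
    reconstructed data is likely to fail.
    """
    return drive_recovery_seconds > timeout_seconds
```

For example, write_back_at_risk(120, driver_timeout_seconds("sda"))
asks whether a drive assumed to retry for up to two minutes could
outlast the driver's timeout on sda.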

> Do read operations always read the parity too, in order to detect
> problems early, before a sector on another disk fails?

No.

> Can the behavior be configured in any way? I found no documentation
> regarding this.

The administrator must schedule "check" scrubs of the array to look for 
bad sectors, or wait for them to be found naturally.  Such scrubs will 
also find inconsistent parity and report it.  A "repair" scrub can then 
fix the broken parity.
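
A scrub is started by writing to the array's sync_action file in sysfs,
and the result shows up in mismatch_cnt.  The following is a minimal
Python sketch of that, not a finished tool: the function names are
hypothetical, it needs root to write under /sys, and it does not wait
for the scrub to finish before you read the count.

```python
from pathlib import Path


def _md_dir(array, sysfs_root="/sys"):
    """Return the sysfs md directory for an array such as 'md0'."""
    return Path(sysfs_root) / "block" / array / "md"


def start_scrub(array, action="check", sysfs_root="/sys"):
    """Ask MD to start a 'check' or 'repair' pass over the whole array."""
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    (_md_dir(array, sysfs_root) / "sync_action").write_text(action + "\n")


def mismatch_count(array, sysfs_root="/sys"):
    """Read mismatch_cnt, which MD reports after a check or repair pass."""
    text = (_md_dir(array, sysfs_root) / "mismatch_cnt").read_text()
    return int(text.strip())
```

A scheduled job could call start_scrub("md0") periodically and look at
mismatch_count("md0") once sync_action has gone back to "idle".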

I understand that some distros include a cron job for this purpose. 
I've always rolled my own.

Phil
