Re: Raid5 drive fail during grow and no backup

From: Jason Keltz <jas@cse.yorku.ca>
To: Phil Turmel <philip@turmel.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: Raid5 drive fail during grow and no backup
Date: Sun, 09 Nov 2014 22:20:22 -0500
Message-ID: <54602EF6.9070909@cse.yorku.ca>
In-Reply-To: <545D8FBA.9090701@turmel.org>

On 07/11/2014 10:36 PM, Phil Turmel wrote:
> On 11/07/2014 11:06 AM, P. Gautschi wrote:
>>  > This is a problem you haven't solved yet, I think. The raid array
>> should have fixed this bad sector for you without kicking the drive out.
>> The scenario is common with "green" drives and/or consumer-grade drives
>> in general.
>>  > ...
>>  > Then you can set up your array to properly correct bad sectors, and
>> set your system to look for bad sectors on
>>  > a regular basis.
>>
>> What is the behavior of mdadm when a disk reports a read error?
>> - reconstruct the data, deliver it to the fs and otherwise ignore it?
>> - set the disk to fail?
>> - reconstruct the data, rewrite the failed data and continue with any
>> action?
>> - rewrite the failed data and reread it (bypassing the cache on the HD)?
>
> Option 3.  Reconstruct and rewrite.
>
> However, if the device with the bad sector is trying to recover longer 
> than the linux low level driver's timeout, bad things^TM happen. 
> Specifically, the driver resets the SATA (or SCSI) connection and 
> attempts to reconnect.  During this brief time, it will not accept 
> further I/O, so the write back of the reconstructed data fails.  Then 
> the device has experienced a *write* error, so MD fails the drive.  
> This is the out-of-the-box behavior of consumer-grade drives in raid 
> arrays.

Hi Phil,
Sorry to interject.
Since I'm in the midst of setting up a 22-disk RAID 10 with 2 TB WD
Black (desktop) drives, I want to be sure I understand the scenario
you describe.  Should a drive enter deep error recovery, would I be
correct that the worst that should happen is a hang for users during
the recovery time and, if the driver does reset the SATA connection
(as it likely would), a potential removal of the disk from the
array, but not the destruction of the array?  If I had a spare disk, it
would be used for the rebuild, and I could test the original disk
and re-add it to the array at another time.
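
The timeout condition quoted above (the drive spending longer in error
recovery than the low-level driver is willing to wait) can be sketched
as a small check.  This is only an illustration, not anything from the
md code: the function names are made up for this example, and the
drive's worst-case recovery time is a value you have to supply
yourself, since it is not exposed in sysfs.  The only real interface
used is the per-device command timeout in
/sys/block/<dev>/device/timeout, which is given in seconds.

```python
from pathlib import Path

# Real sysfs location of block devices; overridable so the helper can be
# exercised against a scratch directory instead of live hardware.
DEFAULT_SYSFS_ROOT = "/sys/block"


def read_driver_timeout(device, sysfs_root=DEFAULT_SYSFS_ROOT):
    """Return the kernel's command timeout for `device`, in seconds.

    Reads <sysfs_root>/<device>/device/timeout, e.g.
    /sys/block/sda/device/timeout.
    """
    path = Path(sysfs_root) / device / "device" / "timeout"
    return int(path.read_text().strip())


def recovery_outlasts_driver(driver_timeout_s, drive_recovery_s):
    """Return True if the drive may still be retrying a bad sector when
    the driver gives up and resets the link.

    `drive_recovery_s` is an ASSUMPTION supplied by the caller (the
    drive's worst-case internal error-recovery time); this function does
    not measure it.
    """
    return drive_recovery_s > driver_timeout_s
```

For example, recovery_outlasts_driver(read_driver_timeout("sda"), 120)
asks whether a drive assumed to retry for up to 120 seconds would
outlast the driver's timeout on sda; the 120 is a placeholder, not a
measured figure for any particular drive.  On drives that support SCT
ERC, "smartctl -l scterc /dev/sdX" reports the configured recovery
limit, which is the number to compare against.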

Any feedback would be helpful.

Thanks!

Jason.
