From: Bill Davidsen <davidsen@tmr.com>
To: John Robinson <john.robinson@anonymous.org.uk>
Cc: Giovanni Tessore <giotex@texsoft.it>, linux-raid@vger.kernel.org
Subject: Re: emergency call for help: raid5 fallen apart
Date: Fri, 26 Feb 2010 15:15:30 -0500
Message-ID: <4B882BE2.7060005@tmr.com>
In-Reply-To: <4B86A943.3040804@anonymous.org.uk>

John Robinson wrote:
> On 25/02/2010 08:05, Giovanni Tessore wrote:
> [...]
>> I see this is the 4th time in a month that people have reported
>> problems on raid5 due to read errors during reconstruction; it looks
>> like the 'corrected read errors' policy is quite a real concern.
>
> If you mean md's policy of reconstructing from the other discs and 
> rewriting when there's a read error from one disc of an array, rather 
> than immediately kicking the disc that had a read error, I think 
> you're wrong - I think md is saving lots of users from hitting 
> problems, by keeping their arrays up and running, and giving their 
> discs a chance to remap bad sectors, instead of forcing the user to do 
> full-disc reconstructions more often which will make them more likely 
> to hit read errors during recovery.
>
> I do think we urgently need the hot reconstruction/recovery feature, 
> so failing drives can be recovered to fresh drives with two sources of 
> data, i.e. both the failing drive and the remaining drives in the 
> array, giving us two chances of recovering every sector.

Ideally, there would be a way to avoid kicking any failing drive, or 
even trying to rewrite the unreadable sector: some md utility which 
would clone a drive using logic similar to this:
 - start with array assembled but not started
 - read a sector from the source drive
   reconstruct it if the source read fails
   report errors and keep going
 - write any recovered sector to the destination
 - optionally read it back to be sure it worked; rewrite and note errors.
   To be useful it must flush to the platter and reread. Yes, it will
   be slow.
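
The steps above can be sketched roughly as follows. This is only an
illustration in Python, not an existing mdadm feature: the names
clone_member, Reader and xor_blocks are hypothetical, a "read error" is
modelled as a reader returning None, and reconstruction assumes
RAID5-style parity, where a member's block is the XOR of the blocks at
the same offset on all other members. The optional read-back step is
left out because a meaningful one needs a flush to the platter, which an
in-memory sketch cannot show.

```python
from functools import reduce
from typing import Callable, List, Optional, Sequence

SECTOR = 512  # assumed sector size in bytes

# A reader returns the sector's bytes, or None to model a read error.
Reader = Callable[[int], Optional[bytes]]


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together (RAID5-style parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def clone_member(source: Reader, peers: Sequence[Reader],
                 dest: bytearray, n_sectors: int) -> List[int]:
    """Copy n_sectors from source into dest.

    Returns the list of sectors that could be neither read nor rebuilt.
    """
    lost: List[int] = []
    for lba in range(n_sectors):
        data = source(lba)
        if data is None and peers:
            # Source read failed: try to rebuild from every other member.
            others = [peer(lba) for peer in peers]
            if all(block is not None for block in others):
                data = xor_blocks(others)
        if data is None:
            # Report the error and keep going; never abort the copy.
            lost.append(lba)
            continue
        dest[lba * SECTOR:(lba + 1) * SECTOR] = data
    return lost
```

The source drive is never written to and no member is ever kicked; the
only output is the destination copy plus the list of lost sectors.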
 
Don't try to be smart, try to make a usable copy of a drive!

I think in case a sector can't be recovered a fixed pattern should be 
written to the destination, for ease of identification if nothing else.
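
As a rough illustration of the fixed-pattern idea, here is a small
Python sketch. The marker text, the 512-byte sector size and the names
fill_pattern and find_marked are all assumptions made for the example;
nothing here is an existing mdadm convention.

```python
from typing import List

SECTOR = 512                       # assumed sector size in bytes
TAG = b"MD-CLONE-UNRECOVERED\n"    # hypothetical marker text


def fill_pattern(size: int = SECTOR, tag: bytes = TAG) -> bytes:
    """Build one sector's worth of the repeating marker."""
    reps = size // len(tag) + 1
    return (tag * reps)[:size]


def find_marked(image: bytes, size: int = SECTOR) -> List[int]:
    """Return the sector numbers in image that hold exactly the marker."""
    pattern = fill_pattern(size)
    return [
        offset // size
        for offset in range(0, len(image) - size + 1, size)
        if image[offset:offset + size] == pattern
    ]
```

A readable ASCII marker also shows up in a hex dump or a grep of the
copy, so the damaged areas can be located later even without the error
log from the cloning run.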

I think being able to specify the MBR or a single partition would be 
useful; that would let critical things be saved faster and with less 
work. This would also open up possibilities for migration of several kinds.

This really should be a command in mdadm! Why? Because it is vital that 
changes in how mdadm does things are tracked in this tool, and because 
when you are down to trying this you don't want to be looking for 
matching versions, etc.

-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein

