linux-raid.vger.kernel.org archive mirror
From: Tudor Holton <tudor@smartguide.com.au>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Trouble adding disk to degraded array
Date: Thu, 10 Jan 2013 09:33:20 +1100
Message-ID: <50EDF030.2060502@smartguide.com.au>
In-Reply-To: <CAJ=LqmbKR1BczeFyGeQooDsPMf8PqriiD3z7i1_LjupGij5ewQ@mail.gmail.com>

Having been through this process recently, I agree that the advice
will most likely lead the user to suspect this as a potential
cause. Is there some way we could alert the user to this situation
more easily?  Maybe we could mark the disk with a (URE) tag in mdstat
(my preference) and/or report the error as "md: URE error occurred
during read on disk X, aborting synchronization, returning discs
[Y,Z...] to spare"? Trailing logs during synchronization can take
several hours on large arrays (and busy servers) and wastes a lot of
time, particularly if you don't know what you're looking for.
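
To make the mdstat suggestion concrete, here is a minimal sketch of how
a script might pick out per-device flags from an mdstat status line.
The sample line is illustrative and not taken from this thread, and the
helper name member_flags is hypothetical. It relies only on the existing
convention of suffixes such as (F) for faulty and (S) for spare; a (URE)
tag does not exist today and is only the proposal made above.

```python
import re

# Illustrative mdstat-style line (an assumption, not output from this thread).
SAMPLE = "md0 : active raid5 sdd1[3](S) sdc1[2](F) sdb1[1] sda1[0]"

# A member looks like name[slot], optionally followed by flags such as (F) or (S).
MEMBER = re.compile(r"(\w+)\[(\d+)\]((?:\([A-Z]+\))*)")


def member_flags(line):
    """Map each member device on one mdstat line to its list of flags."""
    _, _, members = line.partition(" : ")
    return {
        name: re.findall(r"\(([A-Z]+)\)", flags)
        for name, _slot, flags in MEMBER.findall(members)
    }


if __name__ == "__main__":
    for dev, dev_flags in member_flags(SAMPLE).items():
        if dev_flags:
            print(dev, ",".join(dev_flags))
```

Because the pattern accepts any upper-case tag in parentheses, a
hypothetical (URE) suffix would be picked up the same way as (F) or (S).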

Since it first affected me, I have found this kind of question asked
quite regularly on a multitude of tech forums, and a lot of the
responses I came across were incorrect, or misleading at best. A lot
more were along the lines of "That happened to me, and after trying to
fix it for days I just wiped the array and started again.  Then it
happened to the array again later.  mdadm is so unstable!"
Unfortunately we can't avoid people blaming the software, but we can at
least help them to diagnose the problem more quickly, easing their pain
and helping our reputation.  :-)

Incidentally, is the state "active faulty" an allowed state? Because 
that could be a good way to report it, also.

On 10/01/13 08:18, Nicholas Ipsen(Sephiroth_VII) wrote:
> --snip---
>
> On 9 January 2013 18:55, Phil Turmel <philip@turmel.org> wrote:
>> On 01/09/2013 12:21 PM, Nicholas Ipsen(Sephiroth_VII) wrote:
>>> I recently had mdadm mark a disk in my RAID5-array as faulty. As it
>>> was within warranty, I returned it to the manufacturer, and have now
>>> installed a new drive. However, when I try to add it, recovery fails
>>> about halfway through,  with the newly added drive being marked as a
>>> spare, and one of my other drives marked as faulty!
>>>
>>> I seem to have full access to my data when assembling the array
>>> without the new disk using --force, and e2fsck reports no problems
>>> with the filesystem.
>>>
>>> What is happening here?
>> You haven't offered a great deal of information here, so I'll speculate:
>>   an unused sector on one of your original drives has become unreadable (per
>> most drive specs, occurs naturally about every 12TB read).  Since
>> rebuilding an array involves computing parity for every stripe, the
>> unused sector is read and triggers the unrecoverable read error (URE).
>> Since the rebuild is incomplete, mdadm has no way to generate this
>> sector from another source, and doesn't know it isn't used, so the drive
>> is kicked out of the array.  You now have a double-degraded raid5, which
>> cannot continue operating.
>>
--snip--
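
As a rough check on the "about every 12TB read" figure quoted above,
here is a small sketch of the arithmetic. It assumes a rate of one URE
per 1e14 bits read, which is a commonly quoted consumer-drive figure and
not something stated in the thread, and it treats bit errors as
independent. The function name ure_probability is hypothetical.

```python
import math

BITS_PER_URE = 1e14  # assumed spec: one unrecoverable read error per 1e14 bits


def ure_probability(bytes_read, bits_per_ure=BITS_PER_URE):
    """Probability of at least one URE while reading bytes_read bytes.

    Assumes independent bit errors: P = 1 - (1 - 1/bits_per_ure) ** bits.
    log1p/expm1 keep the result accurate for such a tiny per-bit rate.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_ure))


if __name__ == "__main__":
    # 1e14 bits is 12.5e12 bytes, i.e. the "about 12TB" in the quoted text.
    for terabytes in (3, 6, 12.5):
        p = ure_probability(terabytes * 1e12)
        print(f"{terabytes} TB read: {p:.1%} chance of at least one URE")
```

Under these assumptions, reading 12.5 TB hits at least one URE roughly
63% of the time (1 - 1/e), so the figure is an average interval between
errors, not a guaranteed safe limit.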
