From: John Robinson <john.robinson@anonymous.org.uk>
To: "Paweł Brodacki" <pawel.brodacki@googlemail.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Help understanding the root cause of a member dropping out of a RAID 1 set.
Date: Fri, 14 Aug 2009 18:07:44 +0100
Message-ID: <4A8599E0.5000604@anonymous.org.uk>
In-Reply-To: <b95c1fdd0908140609jac48ba9u595f839d3d167293@mail.gmail.com>
On 14/08/2009 14:09, Paweł Brodacki wrote:
> 2009/8/13 John Robinson <john.robinson@anonymous.org.uk>:
>
>> Can or could md be made or configured to try re-adding a device if this
>> sort of thing happens? After all, a stray cosmic ray or whatever perhaps
>> shouldn't make one lose redundancy if the drive's actually OK?
>
> I think that from the coding point of view md probably could. The more
> important thing is if it should. The only hard fact is that there was
> an error while accessing the device. md has no way of telling if it
> was just a freak accident, or the drive is unreliable from now on.
Ah well, perhaps we need to give md a way of knowing the difference
between a transient error (that has been recovered from) and a more
serious error.
> Therefore it does the one safe thing and says "I won't trust you
> anymore.". If a human being knows better, the said being is free to
> re-add the drive.
>
> Personally I'd hate having a suspicious drive being auto-added in hope
> it will rebuild and function properly.
I wouldn't want it to be the default behaviour, but I'd like the option
to configure things that way. I'd want the number of auto-re-adds
configurable too.
> Because such an option could seem tempting but could and would cause
> loss of reliability I'd expect bad publicity if it was actually added.
But it could improve reliability too. Suppose the cable on drive A is
hit by cosmic rays and the drive is taken out of the array, even though
the drive itself is still fine. If drive B then fails before the operator
has re-added drive A, the array goes down when it didn't need to.
What is the operator's most likely response to seeing the SATA bus
reset? She's going to re-add the drive, assuming it was a transient
error. If we could make this happen automatically, we could close a
window during which the array is more vulnerable. I wouldn't suggest we
do it silently; the re-add would be logged and notified, just as the
drive being taken out of the array is.
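
A minimal sketch of the kind of opt-in policy described above, assuming
md could report whether an error was recovered from. Every name here
(ReAddPolicy, max_auto_readds, error_recovered) is hypothetical; nothing
like it exists in md or mdadm, and the default of 0 keeps the current
behaviour of never re-adding automatically.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("readd-sketch")


@dataclass
class ReAddPolicy:
    """Hypothetical policy object; not part of md or mdadm."""

    # 0 means "never auto re-add", i.e. the behaviour md has today.
    max_auto_readds: int = 0
    # Device name -> number of automatic re-adds already performed.
    attempts: dict = field(default_factory=dict)

    def should_readd(self, device: str, error_recovered: bool) -> bool:
        """Decide whether a device dropped from the array may be re-added."""
        used = self.attempts.get(device, 0)

        # Assumption: the lower layers can tell us the error was transient.
        if not error_recovered:
            log.warning("%s: error not recovered, leaving it out", device)
            return False

        if used >= self.max_auto_readds:
            log.warning("%s: auto re-add limit (%d) reached, operator needed",
                        device, self.max_auto_readds)
            return False

        # Never silent: every automatic re-add is logged like a removal.
        self.attempts[device] = used + 1
        log.warning("%s: transient error, auto re-add %d of %d",
                    device, used + 1, self.max_auto_readds)
        return True
```

With ReAddPolicy(max_auto_readds=2), a device that keeps hitting
recovered errors would be re-added twice and then left out for the
operator to look at; a device whose error was not recovered is never
re-added.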
Cheers,
John.