From: John Robinson <john.robinson@anonymous.org.uk>
To: "Paweł Brodacki" <pawel.brodacki@googlemail.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Help understanding the root cause of a member dropping out of a RAID 1 set.
Date: Fri, 14 Aug 2009 18:07:44 +0100
Message-ID: <4A8599E0.5000604@anonymous.org.uk>
In-Reply-To: <b95c1fdd0908140609jac48ba9u595f839d3d167293@mail.gmail.com>

On 14/08/2009 14:09, Paweł Brodacki wrote:
> 2009/8/13 John Robinson <john.robinson@anonymous.org.uk>:
> 
>> Can or could md be made or configured to try re-adding a device if this
>> sort of thing happens? After all, a stray cosmic ray or whatever perhaps
>> shouldn't make one lose redundancy if the drive's actually OK?
> 
> I think that from the coding point of view md probably could. The more
> important thing is if it should. The only hard fact is that there was
> an error while accessing the device. md has no way of telling if it
> was just a freak accident, or the drive is unreliable from now on.

Ah well, perhaps we need to give md a way of knowing the difference 
between a transient error (that has been recovered from) and a more 
serious error.

> Therefore it does the one safe thing and says "I won't trust you
> anymore.". If a human being knows better, the said being is free to
> re-add the drive.
> 
> Personally I'd hate having a suspicious drive being auto-added in hope
> it will rebuild and function properly.

I wouldn't want it to be the default behaviour, but I'd like the option 
to configure things that way. I'd want the number of auto-re-adds 
configurable too.

> Because such an option could seem tempting but could and would cause
> loss of reliability I'd expect bad publicity if it was actually added.

But it could improve reliability too. Suppose the cable on drive A is 
hit by cosmic rays and the drive is taken out of the array, although 
the drive itself is still fine. If drive B then fails before the 
operator has re-added drive A, the array goes down when it didn't need to.

What is the operator's most likely response to seeing the SATA bus 
reset? She's going to re-add the drive assuming it was a transient 
error. If we could make this happen automatically, we could close a 
window when the array's more vulnerable. I wouldn't suggest we do it 
silently; it would be logged and notified, just as the drive being 
taken out of the array would be.
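
To make the idea concrete, here is a rough sketch of the kind of policy I
have in mind. It is purely illustrative: ReAddPolicy, max_auto_readds and
on_device_failed are names made up for this example, not anything md or
mdadm provides, and how a failure would be classified as transient is
assumed to be decided elsewhere.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("readd-policy")


@dataclass
class ReAddPolicy:
    """Hypothetical policy object; not an existing md or mdadm feature."""

    # 0 keeps the current behaviour: a dropped device is never re-added
    # automatically.
    max_auto_readds: int = 0
    # Number of automatic re-adds already performed, per device name.
    counts: dict = field(default_factory=dict)

    def on_device_failed(self, device: str, transient: bool) -> bool:
        """Return True if the device should be re-added automatically."""
        used = self.counts.get(device, 0)
        if not transient or used >= self.max_auto_readds:
            # Same outcome as today: log it and leave the decision to
            # the operator.
            log.warning("%s dropped from array; manual re-add required",
                        device)
            return False
        self.counts[device] = used + 1
        # Never silent: the automatic re-add is logged like the failure.
        log.warning("%s dropped after transient error; auto re-add %d of %d",
                    device, used + 1, self.max_auto_readds)
        return True
```

With max_auto_readds left at 0 nothing changes from today. With it set
to, say, 2, a device that keeps dropping out is re-added twice and then
left for the operator, who would re-add it by hand as now (for example
with mdadm /dev/md0 --re-add /dev/sdb1, device names assumed).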

Cheers,

John.

