From: Dan Christensen <jdc@uwo.ca>
To: linux-raid@vger.kernel.org
Subject: Re: devices get kicked from RAID about once a month
Date: Fri, 04 Jun 2010 11:56:55 -0400
Message-ID: <87zkzalrjs.fsf@uwo.ca>
In-Reply-To: 20100604135031.GA5092@cthulhu.home.robinhill.me.uk
Robin Hill <robin@robinhill.me.uk> writes:
> On Fri Jun 04, 2010 at 09:30:09AM -0400, Dan Christensen wrote:
>
>> what would the raid layer do when it got a read error?
>>
> It reconstructs the data and attempts a write. A write failure will
> then fail the drive.
[...]
> It does exactly the same on the read timeout. The problem is that when
> it sends the write, the drive is still busy attempting the read, so
> ignores the write request (until it's free). This then times out as
> well, so the array assumes the drive has failed.
>
>> These questions are motivated from the following logic. Since it is
>> generally recognized that quicker read errors (e.g. TLER) are good
>> for drives in a raid array, *increasing* the SATA timeouts seems like it
>> is going in the wrong direction. Wouldn't it be better to have short
>> timeouts, but have the raid layer treat a timeout less seriously?
>>
> As has been stated, the RAID layer doesn't have any timeouts. It's the
> SCSI/ATA layer which is timing out the read/write and reporting a
> failure to the RAID layer. If the timeout at this level is increased
> sufficiently then either the read will eventually succeed, or it'll
> still fail but the write will then succeed (as the drive is no longer
> busy) (or the write will fail and the disk is really failed).

Ok, I now understand the idea here. Even if the SATA timeout were
reduced, there's nothing the raid layer can do until the drive is
ready to respond again. So it makes sense to work around this by
increasing the SATA timeouts.

Thanks for the clarification!

Dan
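
For the archives, here is a minimal sketch of how the SCSI/ATA command
timeout discussed above could be inspected and raised from a script. It
assumes the per-device timeout is exposed in seconds at
/sys/block/<device>/device/timeout, as it is on typical Linux systems.
The helper names and the sysfs_root parameter are made up for
illustration and do not come from this thread.

```python
from pathlib import Path

# Assumption: the SCSI layer exposes the command timeout (in seconds) here.
DEFAULT_SYSFS_ROOT = Path("/sys/block")


def timeout_path(device: str, sysfs_root: Path = DEFAULT_SYSFS_ROOT) -> Path:
    """Return the sysfs file holding the command timeout for e.g. 'sda'."""
    return Path(sysfs_root) / device / "device" / "timeout"


def get_timeout(device: str, sysfs_root: Path = DEFAULT_SYSFS_ROOT) -> int:
    """Read the current command timeout in seconds."""
    return int(timeout_path(device, sysfs_root).read_text().strip())


def set_timeout(device: str, seconds: int,
                sysfs_root: Path = DEFAULT_SYSFS_ROOT) -> int:
    """Write a new command timeout and return the previous value.

    Writing to the real sysfs file requires root privileges.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    path = timeout_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text(f"{seconds}\n")
    return previous
```

Run as root, something like set_timeout("sda", 120) would raise the
timeout for one array member. The 120 is only an illustrative value; the
right number depends on how long the drive keeps retrying a bad sector.
A value written this way does not survive a reboot, so it would have to
be reapplied at boot for every member of the array.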