Re: software raid and ERC

linux-raid.vger.kernel.org archive mirror

* Re: software raid and ERC
       [not found] <CAAevFRRuGc6x4hJax-kM8ncW9=873aRnjN-WWkoheYD7r6jimA@mail.gmail.com>
@ 2012-04-17 14:05 ` .
  2012-04-17 17:47   ` Emmanuel Noobadmin
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: . @ 2012-04-17 14:05 UTC
  To: linux-raid

I'm trying to decide what disks to use for a software raid array that
will host mirrors of open source stuff.  So this array will run 24/7
and resiliency to disk failures is needed, but service levels can be
similar to a home file server.  ie, it's ok if one of the disks goes
into a deep recovery cycle for a few minutes once a month and I can't
host the stuff - people will just retry the download.  Backups aren't
really important either, as I can just mirror all the content again
(even if it takes weeks to do so).

Due to budget reasons, "enterprise" disks are out. What I've read
strongly recommends the ERC / TLER / CCTL feature for raid
applications - even including software raid.  But is ERC really
required in my scenario?

The Wikipedia article at
http://en.wikipedia.org/wiki/Error_recovery_control#Software_Raid on
ERC seems to suggest that mdadm will not error out the drive no matter
how long the recovery takes.  Instead, the SCSI disk layer is the
limiting factor: a lengthy recovery cycle could lead to a SCSI
command timeout, the drive ignoring the reset command, and the
disk being marked offline.  If this is indeed the case, I am tempted
to just set the SCSI timeout value to 5 minutes (or whatever the
maximum period a deep recovery can take is).  Are there other similar
timeouts or gotchas in other layers?  E.g., in LVM, FS code, etc.?

In another post
(http://marc.info/?l=linux-raid&m=130964222812107&w=2), Drew said:
> TLER just shortens the firmware's error recovery from something like
> 60 seconds down to 4 seconds. It's mainly useful in hardware RAID but
> I can see it being useful with mdraid in the enterprise where you
> can't afford to wait for the drive to do it's own recovery attempts.

In my use case, I really don't mind if the server freezes for a while.

Please advise if there are other considerations for ERC, use of
consumer-grade disks, or "enterprise" disks.  Thanks!

P.S. I would buy ERC if I could, but the right hard disk models do not
seem to be available locally.  My preference is for the Hitachi 7k3000
series, but it seems to be out of stock with possibly months of delay.
 So what's left are the consumer 2TB models with ERC feature -
possibly just the Hitachi 2TB or Samsung Spinpoint F4 2TB.  I'd
appreciate any other suggestions too.


* Re: software raid and ERC
  2012-04-17 14:05 ` software raid and ERC .
@ 2012-04-17 17:47   ` Emmanuel Noobadmin
  2012-04-18  2:08   ` Phil Turmel
  2012-04-18  3:12   ` .
  2 siblings, 0 replies; 5+ messages in thread
From: Emmanuel Noobadmin @ 2012-04-17 17:47 UTC
  To: .; +Cc: linux-raid

On 4/17/12, . <desire@gmail.com> wrote:

> P.S. I would buy ERC if I could, but the right hard disk models do not
> seem to be available locally.  My preference is for the Hitachi 7k3000
> series, but it seems to be out of stock with possibly months of delay.
>  So what's left are the consumer 2TB models with ERC feature -
> possibly just the Hitachi 2TB or Samsung Spinpoint F4 2TB.  I'd
> appreciate any other suggestions too.

My information is only anecdotal, from a dozen or so machines. So far I
haven't had any md RAID 1 issues running on consumer-grade HDDs.
I use them mainly for budget reasons, since only one of my customers has
been willing to pay the premium for RE drives. I'd strongly recommend
that you mix drive models (or at least batches), though, so that a
batch- or model-specific issue, such as the Seagate 1TB firmware
problem, can't kill all the drives at almost the same time.


* Re: software raid and ERC
  2012-04-17 14:05 ` software raid and ERC .
  2012-04-17 17:47   ` Emmanuel Noobadmin
@ 2012-04-18  2:08   ` Phil Turmel
  2012-04-18  3:12   ` .
  2 siblings, 0 replies; 5+ messages in thread
From: Phil Turmel @ 2012-04-18  2:08 UTC
  To: .; +Cc: linux-raid

On 04/17/2012 10:05 AM, . wrote:
> I'm trying to decide what disks to use for a software raid array that
> will host mirrors of open source stuff.  So this array will run 24/7
> and resiliency to disk failures is needed, but service levels can be
> similar to a home file server.  ie, it's ok if one of the disks goes
> into a deep recovery cycle for a few minutes once a month and I can't
> host the stuff - people will just retry the download.  Backups aren't
> really important either, as I can just mirror all the content again
> (even if it takes weeks to do so).
> 
> Due to budget reasons, "enterprise" disks are out. What I've read
> strongly recommends the ERC / TLER / CCTL feature for raid
> applications - even including software raid.  But is ERC really
> required in my scenario?
> 
> The Wikipedia article at
> http://en.wikipedia.org/wiki/Error_recovery_control#Software_Raid on
> ERC seems to suggest that mdadm will not error out the drive no matter
> how long the recovery takes.  Instead, the SCSI disk layer is the
> limiting factor: a lengthy recovery cycle could lead to a SCSI
> command timeout, the drive ignoring the reset command, and the
> disk being marked offline.  If this is indeed the case, I am tempted
> to just set the SCSI timeout value to 5 minutes (or whatever the
> maximum period a deep recovery can take is).  Are there other similar
> timeouts or gotchas in other layers?  E.g., in LVM, FS code, etc.?

I've been burned by this very phenomenon on a set of Seagate drives
that I thought had SCTERC, but didn't (their predecessors did).

I wasn't aware that the driver timeouts were configurable.  Pointers?

> In another post
> (http://marc.info/?l=linux-raid&m=130964222812107&w=2), Drew said:
>> TLER just shortens the firmware's error recovery from something like
>> 60 seconds down to 4 seconds. It's mainly useful in hardware RAID but
>> I can see it being useful with mdraid in the enterprise where you
>> can't afford to wait for the drive to do it's own recovery attempts.
> 
> In my use case, I really don't mind if the server freezes for a while.
> 
> Please advise if there are other considerations for ERC, use of
> consumer-grade disks, or "enterprise" disks.  Thanks!

Be aware that SCTERC must be set again after every drive power cycle;
it's not a persistent setting on desktop drives.
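
Because the setting is lost on a power cycle, it has to be reapplied at
every boot.  A minimal sketch of doing that from Python, assuming
smartmontools is installed and using its "smartctl -l scterc,READ,WRITE"
option (times are in tenths of a second).  The function names and the
7.0-second default are illustrative assumptions, not something from this
thread:

```python
import subprocess
from typing import List


def scterc_command(device: str, read_ds: int = 70, write_ds: int = 70) -> List[str]:
    """Build the smartctl call that sets the SCT ERC read/write limits.

    Limits are in deciseconds, so 70 means 7.0 seconds (an assumed,
    illustrative value).
    """
    if read_ds < 0 or write_ds < 0:
        raise ValueError("ERC limits must not be negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


def set_scterc(device: str, read_ds: int = 70, write_ds: int = 70) -> str:
    """Run smartctl (needs root) and return its output for inspection."""
    result = subprocess.run(
        scterc_command(device, read_ds, write_ds),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling set_scterc("/dev/sda") from a boot script would be one way to
use it.  The returned text is worth reading rather than assuming
success, since a drive may not support SCT ERC at all.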

> P.S. I would buy ERC if I could, but the right hard disk models do not
> seem to be available locally.  My preference is for the Hitachi 7k3000
> series, but it seems to be out of stock with possibly months of delay.
>  So what's left are the consumer 2TB models with ERC feature -
> possibly just the Hitachi 2TB or Samsung Spinpoint F4 2TB.  I'd
> appreciate any other suggestions too.

The 7k3000 family is my preference at the moment.  Fortunately I don't
have an immediate need.


* Re: software raid and ERC
  2012-04-17 14:05 ` software raid and ERC .
  2012-04-17 17:47   ` Emmanuel Noobadmin
  2012-04-18  2:08   ` Phil Turmel
@ 2012-04-18  3:12   ` .
  2012-04-18  3:52     ` NeilBrown
  2 siblings, 1 reply; 5+ messages in thread
From: . @ 2012-04-18  3:12 UTC
  To: linux-raid

Thanks to the couple of folks who have replied on/off-list, but the
replies didn't precisely answer what causes drives performing deep
recovery to be kicked out (and I haven't found suitable answers in the
list archives either):

The ERC wiki [1] and Red Hat Storage Administration Guide [2] clearly
describe this behavior of the SCSI layer: a drive performing deep
recovery would miss a SCSI command timeout, which would cause the SCSI
layer to attempt to abort the command and reset the device/bus/host.
If these error handlers fail, the drive is set offline (which I
presume is what kicks the drive out).

The SCSI command timeout can be tuned at
/sys/block/.../device/timeout, and defaults to 30 seconds.  Perhaps
raising this timeout to a large value would also prevent deep recovery
cycles from causing the _SCSI layer_ to set the drive offline.  True
or False?
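
As a sketch of what tuning that file could look like, the Python below
reads and writes the per-disk timeout under the sysfs path named above.
The helper names and the sysfs_root parameter (there so the code can be
tried against a scratch directory) are hypothetical; writing the real
file needs root, and values written to sysfs do not survive a reboot:

```python
from pathlib import Path


def timeout_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file holding the SCSI command timeout for a disk such as "sda"."""
    return Path(sysfs_root) / "block" / disk / "device" / "timeout"


def read_timeout(disk: str, sysfs_root: str = "/sys") -> int:
    """Read the current SCSI command timeout, in seconds."""
    return int(timeout_path(disk, sysfs_root).read_text().strip())


def write_timeout(disk: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Set the SCSI command timeout, in seconds."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(disk, sysfs_root).write_text(f"{seconds}\n")
```

For example, write_timeout("sda", 300) would raise the timeout on sda to
the 5 minutes considered earlier in this thread.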

Apart from the behaviour of the SCSI layer, does the linux software
raid layer have any concept of timeouts that would cause a drive to be
kicked when performing a deep recovery cycle?  A storagereview forum
thread [3] claims that the linux software raid layer does not have a
concept of timeouts and does not care about ERC.  In a web article [4]
the major NAS manufacturers that use software raid seem to agree with
this stance.

On the other hand, how I interpret a previous post from Stefan [5] is
that the linux raid layer does have its own timeout mechanism that
will kick a non-responding drive.

> Without ERC-timeout, the drive tries to correct the error on
> its own (not reacting on any requests), mdraid assumes an error after a
> while and tries to rewrite the "missing" sector (assembled from the
> other disks).  But the drive will still not react to the write request
> as it is still doing its internal recovery procedure.  Now mdraid
> assumes the disk to be bad and kicks it.

Since I can't read code, I'm hoping that this list, where software raid
development takes place, would be able to clear up the following:

a.  Do delays caused by deep recovery cycles actually have any direct
impact on the linux software raid layer, or does it simply issue a
command to the underlying storage/scsi subsystem and block until there
is a response?

b.  If there is no direct impact on the software raid layer, and the
impact is indirectly caused by the drive being set offline when a SCSI
command times out and the error handling routine fails...  would
increasing the SCSI command timeout help to mitigate deep recovery
delays?

Your comments and insights are much appreciated.

[1] http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery#Software_Raid

[2] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Storage_Administration_Guide/index.html#task_controlling-scsi-command-timer-onlining-devices

[3] http://forums.storagereview.com/index.php/topic/29208-how-to-use-desktop-drives-in-raid-without-tlererccctl/page__view__findpost__p__266337

[4] http://www.smallnetbuilder.com/nas/nas-features/31202-should-you-use-tler-drives-in-your-raid-nas

[5] http://marc.info/?l=linux-raid&m=128640221813394&w=2

[snipped the previous lengthy email]


* Re: software raid and ERC
  2012-04-18  3:12   ` .
@ 2012-04-18  3:52     ` NeilBrown
  0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2012-04-18  3:52 UTC
  To: .; +Cc: linux-raid

On Wed, 18 Apr 2012 11:12:57 +0800 "." <desire@gmail.com> wrote:


> Apart from the behaviour of the SCSI layer, does the linux software
> raid layer have any concept of timeouts that would cause a drive to be
> kicked when performing a deep recovery cycle?  A storagereview forum
> thread [3] claims that the linux software raid layer does not have a
> concept of timeouts and does not care about ERC.  In a web article [4]
> the major NAS manufacturers that use software raid seem to agree with
> this stance.

Linux software RAID does not have a concept of timeouts.

> 
> On the other hand, how I interpret a previous post from Stefan [5] is
> that the linux raid layer does have its own timeout mechanism that
> will kick a non-responding drive.

That aspect of that post is inaccurate.

> 
> > Without ERC-timeout, the drive tries to correct the error on
> > its own (not reacting on any requests), mdraid assumes an error after a
> > while and tries to rewrite the "missing" sector (assembled from the
> > other disks).  But the drive will still not react to the write request
> > as it is still doing its internal recovery procedure.  Now mdraid
> > assumes the disk to be bad and kicks it.
> 
> Since I can't read code, I'm hoping that this list, where software raid
> development takes place, would be able to clear up the following:
> 
> a.  Do delays caused by deep recovery cycles actually have any direct
> impact on the linux software raid layer, or does it simply issue a
> command to the underlying storage/scsi subsystem and block until there
> is a response?

md/raid in linux simply issues a command and waits for it to complete, either
with success or failure.

NeilBrown
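
The answers above imply that md keeps no timer of its own, so only the
drive or the SCSI layer can end the wait.  A toy model of that
comparison, not kernel code: the function name is hypothetical, and the
30-second default is the SCSI timeout quoted earlier in the thread.

```python
def first_to_give_up(recovery_s: float, scsi_timeout_s: float = 30.0) -> str:
    """Say which layer ends the wait for a command stuck in drive error recovery.

    md is deliberately absent from the comparison: it just waits for
    the command to complete with success or failure.
    """
    if recovery_s <= scsi_timeout_s:
        # The drive answers (with data or an error) before the SCSI timer fires.
        return "drive"
    # The SCSI command timer fires first and error handling (abort, reset) begins.
    return "scsi"
```

With a 7-second ERC limit the drive answers first; with a 120-second
deep recovery and the default timeout the SCSI layer acts first; raising
the timeout to 300 seconds hands the decision back to the drive.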


