From: Bill Davidsen <davidsen@tmr.com>
To: Nix <nix@esperi.org.uk>
Cc: Mark Hahn <hahn@physics.mcmaster.ca>, linux-raid@vger.kernel.org
Subject: Re: disks becoming slow but not explicitly failing anyone?
Date: Thu, 04 May 2006 20:45:23 -0400
Message-ID: <445AA023.6070406@tmr.com>
In-Reply-To: <874q0irdah.fsf@hades.wkstn.nix>
Nix wrote:
>On 23 Apr 2006, Mark Hahn stipulated:
>
>
>>>I've seen a lot of cheap disks say (generally deep in the data sheet
>>>that's only available online after much searching and that nobody ever
>>>reads) that they are only reliable if used for a maximum of twelve hours
>>>a day, or 90 hours a week, or something of that nature. Even server
>>>
>>>
>>I haven't, and I read lots of specs. they _will_ sometimes say that
>>non-enterprise drives are "intended" or "designed" for an 8x5 desktop-like
>>usage pattern.
>>
>>
>
>That's the phrasing, yes: foolish me assumed that meant `if you leave it
>on for much longer than that, things will go wrong'.
>
>
>
>> to the normal way of thinking about reliability, this would
>>simply mean a factor of 4.2x lower reliability - say from 1M to 250K hours
>>MTBF. that's still many times lower rate of failure than power supplies or
>>fans.
>>
>>
>
>Ah, right, it's not a drastic change.
>
>
>
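For reference, the 4.2x figure quoted above is just the ratio of hours in a
week (168) to an 8x5 week (40). A minimal sketch of that arithmetic follows,
assuming the simple linear scaling used in the quoted estimate; the function
names are made up for illustration, and real wear is unlikely to be this
linear.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours of wall-clock time per week


def duty_cycle_factor(design_hours_per_week: float) -> float:
    """Ratio of 24x7 operation to the duty cycle the drive was designed for."""
    return HOURS_PER_WEEK / design_hours_per_week


def derated_mtbf(rated_mtbf_hours: float, design_hours_per_week: float) -> float:
    """Rated MTBF scaled down linearly for 24x7 use (an assumption, not a vendor formula)."""
    return rated_mtbf_hours / duty_cycle_factor(design_hours_per_week)


if __name__ == "__main__":
    factor = duty_cycle_factor(8 * 5)
    print(f"duty-cycle factor: {factor:.1f}x")
    print(f"derated MTBF: {derated_mtbf(1_000_000, 8 * 5):,.0f} hours")
```

Under that assumption a 1M-hour rating comes out a little under the 250K
hours quoted, which is the same ballpark.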
>>>It still stuns me that anyone would ever voluntarily buy drives that
>>>can't be left switched on (which is perhaps why the manufacturers hide
>>>
>>>
>>I've definitely never seen any spec that stated that the drive had to be
>>switched off. the issue is really just "what is the designed duty-cycle?"
>>
>>
>
>I see. So it's just `we didn't try to push the MTBF up as far as we would
>on other sorts of disks'.
>
>
>
>>I run a number of servers which are used as compute clusters. load is
>>definitely 24x7, since my users always keep the queues full. but the servers
>>are not maxed out 24x7, and do work quite nicely with desktop drives
>>for years at a time. it's certainly also significant that these are in a
>>decent machineroom environment.
>>
>>
>
>Yeah; i.e., cooled. I don't have a cleanroom in my house so the RAID
>array I run there is necessarily uncooled, and the alleged aircon in the
>room housing work's array is permanently on the verge of total collapse
>(I think it lowers the temperature, but not by much).
>
>
>
>>it's unfortunate that disk vendors aren't more forthcoming with their drive
>>stats. for instance, it's obvious that "wear" in MTBF terms would depend
>>nonlinearly on the duty cycle. it's important for a customer to know where
>>that curve bends, and to try to stay in the low-wear zone. similarly, disk
>>
>>
>
>Agreed! I tend to assume that non-laptop disks hate being turned on and
>hate temperature changes, so just keep them running 24x7. This seems to be OK,
>with the only disks this has ever killed being Hitachi server-class disks in
>a very expensive Sun server which was itself meant for 24x7 operation; the
>cheaper disks in my home systems were quite happy. (Go figure...)
>
>
>
>>specs often just give a max operating temperature (often 60C!), which is
>>almost disingenuous, since temperature has a superlinear effect on reliability.
>>
>>
>
>I'll say. I'm somewhat twitchy about the uncooled 37C disks in one of my
>machines: but one of the other disks ran at well above 60C for *years*
>without incident: it was an old one with no onboard temperature sensing,
>and it was perhaps five years after startup that I opened that machine
>for the first time in years and noticed that the disk housing nearly
>burned me when I touched it. The guy who installed it said that yes, it
>had always run that hot, and was that important? *gah*
>
>I got a cooler for that disk in short order.
>
>
>
>>a system designer needs to evaluate the expected duty cycle when choosing
>>disks, as well as many other factors which are probably more important.
>>for instance, an earlier thread concerned a vast amount of read traffic
>>to disks resulting from atime updates.
>>
>>
>
>Oddly, I see a steady pulse of write traffic, ~100Kb/s, to one dm device
>(translating into read+write on the underlying disks) even when the
>system is quiescent, all daemons killed, and all fsen mounted with
>noatime. One of these days I must fish out blktrace and see what's
>causing it (but that machine is hard to quiesce like that: it's in heavy
>use).
>
>
>
>>simply using more disks also decreases the load per disk, though this is
>>clearly only a win if it's the difference in staying out of the disks'
>>"duty-cycle danger zone" (since more disks divide system MTBF).
>>
>>
>
>Well, yes, but if you have enough more you can make some of them spares
>and push up the MTBF again (and the cooling requirements, and the power
>consumption: I wish there was a way to spin down spares until they were
>needed, but non-laptop controllers don't often seem to provide a way to
>spin anything down at all that I know of).
>
>
>
hdparm will let you set the spindown time. I have all mine set that way
for power and heat reasons, since they tend to be in burst use. That dropped
the CR temp by enough to notice, but I still need some more local cooling
for that room.
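
The spindown time is set with hdparm's -S option, for example
"hdparm -S 120 /dev/sdX" (the device name is a placeholder). The value is
an encoded timeout rather than a number of seconds. The sketch below decodes
it according to the table in the hdparm(8) man page; the function name is
made up for illustration, and the man page warns that some older drives
interpret these values very differently.

```python
from typing import Optional


def spindown_seconds(s_value: int) -> Optional[int]:
    """Decode an `hdparm -S` value into seconds, per the hdparm(8) man page.

    Returns None when there is no fixed timeout to report: 0 disables the
    timeout, 253 is vendor-defined, and 254 is reserved.
    """
    if not 0 <= s_value <= 255:
        raise ValueError("hdparm -S takes a value from 0 to 255")
    if s_value == 0:
        return None                          # standby timeout disabled
    if s_value <= 240:
        return s_value * 5                   # 1..240: multiples of 5 seconds
    if s_value <= 251:
        return (s_value - 240) * 30 * 60     # 241..251: 1 to 11 units of 30 minutes
    if s_value == 252:
        return 21 * 60                       # 21 minutes
    if s_value == 255:
        return 21 * 60 + 15                  # 21 minutes plus 15 seconds
    return None                              # 253 vendor-defined, 254 reserved


if __name__ == "__main__":
    for value in (60, 120, 241, 252):
        print(value, spindown_seconds(value))
```

So -S 120 asks for a 10 minute idle timeout, and -S 241 for 30 minutes.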
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979