linux-raid.vger.kernel.org archive mirror
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Bill Davidsen <davidsen@tmr.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	Redeeman <redeeman@metanurb.dk>, David Lethe <david@santools.com>,
	Jon Nelson <jnelson-linux-raid@jamponi.net>,
	Justin Piszcz <jpiszcz@lucidpixels.com>,
	linux-raid@vger.kernel.org
Subject: Re: upgrade advice
Date: Mon, 12 Jan 2009 23:37:34 -0500
Message-ID: <yq163kju8hd.fsf@sermon.lab.mkp.net>
In-Reply-To: <496C03C0.8050407@tmr.com> (Bill Davidsen's message of "Mon, 12 Jan 2009 22:00:16 -0500")

>>>>> "Bill" == Bill Davidsen <davidsen@tmr.com> writes:

>> Most of the errors you see on drives are a result of media errors
>> that are big enough that the drive ECC can't correct them.  Errors
>> are often caused by head misses due to bad tracking, vibration from
>> other drives in the enclosure, the user kicking the cabinet at an
>> inopportune moment, etc.  I.e. external interference.  Other errors
>> are due to real imperfections of the media itself.

Bill> I would be surprised if a consumer grade drive doing more retries
Bill> over several seconds rather than several rotations wasn't better
Bill> able to correct for most of the transient problems you mention.

Not all the problems I mentioned are of a transient nature.  Several
common corruption scenarios are caused by transient external factors
*at write time*.  No amount of retrying is going to fix something that
was badly written to begin with.  It doesn't even have to be the sector
in question.  It could be adjacent tracks that got clobbered.


Bill> Other than possibly having more ECC bits there isn't much
Bill> difference,

I mentioned better tracking/multiple sync marks as another crucial
difference.  That's a pretty huge deal in my book.

Nearline drive firmware also devotes resources to predicting impending
failure.  These drives can throttle the I/O pipeline if there's an
increased risk of write error due to excessive seeking, overheating,
etc.  That means that under load, performance can be choppy.

That is unacceptable behavior in the consumer market, which is focused
on interactivity and benchmarketing, whereas making sure you write
things correctly is an absolute must in the enterprise space.  And the
non-deterministic performance characteristics are not such a big deal
when the drives are sitting behind an array head with non-volatile
cache.


Bill> as several people here have noted you don't want the drive to hang
Bill> for several seconds trying this and that in a server
Bill> environment. And given that there are a very small number of
Bill> things to be done on error, like reread, seek away and back,
Bill> recalibrate, etc, 

Again, you are talking about behavior when a transient read error is
detected.  My focus is the due diligence done by the firmware during
write operations.

It is correct that one of the defining characteristics of nearline
vs. consumer drives is the retry behavior.  But that's not the point I
was trying to make.

What I was trying to convey was that:

1. Contrary to popular belief, there is no inherent mechanical difference
   between consumer and nearline drives.  Same heads, arms, motors, etc.
   The premium you pay is not for "mechanical ruggedness", even though
   that's what most people assume when they are charged more(*).

2. The difference is largely in how the firmware encodes data on the
   physical platters in the drive, and in the internal housekeeping
   overhead.  That difference between consumer and nearline is getting
   bigger with each generation of drives.

That said, I'm also sure you can appreciate that media defect tolerances
are likely to be different between nearline and consumer kit despite
coming off the same assembly line.


(*) Seagate recently put out some SAS nearline drives that have a
different logic board than their SATA cousins.  So there's actually a
real hardware difference in that series.  The fatter PCB with dual
processors enables even better integrity protection (on par with "real"
enterprise drives) albeit at lower duty cycles.

-- 
Martin K. Petersen	Oracle Linux Engineering
