From: NeilBrown <neilb@suse.de>
To: Gionatan Danti <g.danti@assyoma.it>
Cc: Mikael Abrahamsson <swmike@swm.pp.se>, linux-raid@vger.kernel.org
Subject: Re: On URE and RAID rebuild - again!
Date: Tue, 5 Aug 2014 09:29:51 +1000
Message-ID: <20140805092951.7d8f8e6d@notabene.brown>
In-Reply-To: <35916d10dab6084e6f28da2e0975fce7@assyoma.it>
On Tue, 05 Aug 2014 00:44:04 +0200 Gionatan Danti <g.danti@assyoma.it> wrote:
> Il 2014-08-04 20:40 Mikael Abrahamsson ha scritto:
> >
> > Why do you think that's wrong? 10^-14 is what the vendor guarantees. I
> > have had drives with worse performance (after a couple of months I had
> > several UNC sectors without reading much).
> >
> > Your claim about the article being wrong is the same as saying that
> > the risk reported of getting into a car accident is wrong because
> > you've driven that amount of kilometers but haven't been in an
> > accident yet.
> >
> > This is statistics, marketing and warranty, not guaranteed behavior.
>
> Yes, I understand this. However, the linked article (and many others)
> state:
> "If you have a 2TB drive, you write 2TB to it, and then you fully read
> that, just over 6 times, then you will run into one read error,
> theoretically speaking."
This statement is wrong, and doesn't even make any sense. It displays a deep
misunderstanding of probability (the same deep misunderstanding that leads
people to buy lottery tickets).
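To make the point concrete, here is a minimal sketch of what the quoted rate would imply if (and this is an assumption, not something the vendors state) every bit read failed independently with probability 10^-14. The function name and the list of pass counts are made up for illustration. Under that model, reading a 2TB drive 6.25 times (10^14 bits) gives roughly a 63% chance of at least one URE, not a certainty, and no amount of reading ever makes an error guaranteed.

```python
import math

URE_RATE = 1e-14  # quoted spec: one unrecoverable error per 10^14 bits read


def p_at_least_one_ure(bytes_read, rate=URE_RATE):
    """Probability of at least one URE, assuming independent per-bit errors."""
    bits = bytes_read * 8
    # log1p/expm1 avoid the rounding error of evaluating (1 - rate) ** bits directly
    return -math.expm1(bits * math.log1p(-rate))


two_tb = 2e12  # a "2TB" drive, in bytes
for passes in (1, 6.25, 12.5, 60):
    p = p_at_least_one_ure(two_tb * passes)
    print(f"{passes:>5} full reads: P(at least one URE) = {p:.4f}")
```

The probability approaches 1 as the amount read grows, but it is a probability at every step; a drive that has been read many times without error is unremarkable under this model and says little about whether the model fits real drives.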
>
> I read my 500 GB drive over _60_ times, reading 3x more total data than
> stated above.
>
> I started the entire discussion to know how UREs are calculated, trying
> to understand if they are expressed as probability ("1 probability over
> 10^14 that we can not read a sector") or a statistical record ("we found
> that 1 on 10^14 is not readable").
Probabilities are often calculated by examining a statistical record - the
two concepts are not separate.
There is probably some theoretical analysis, some statistical analysis, some
marketing and maybe even some actuarial analysis that goes into the quoted
figure. I remember when CPU speed was measured in "MIPS".
This stood for
Meaningless Indicators of Performance for Salesmen
URE rate numbers are probably equally trustworthy.
>
> If defined as a probability, I am very lucky: if my math is OK, I should
> have only 0.5% to read about 40 TB of data (my math is:
> (1-(1/10^14))^(3*(10^14))). If, on the other hand, UREs are defined as
> statistical evidence (as MTBF), environment and test conditions (eg:
> duty cycle, read/write distribution, etc) are absolutely critical to
> understand what this parameter really mean for us.
The probability number doesn't tell you much at all about your drive.
Your drive probably works much better than the quoted rate, but could be much
worse.
The quoted number might say something useful about a collection of 10,000
drives, but if you can afford those, you can probably afford a competent
statistician to explain the details too.
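As a rough illustration, the sketch below evaluates the formula quoted above and then applies the same number to a fleet. It assumes independent per-bit errors at the quoted rate and that every drive in the fleet reads the same amount; both are simplifications, and the function names are hypothetical. Evaluated carefully, (1-(1/10^14))^(3*(10^14)) comes out close to e^-3, about 5% rather than 0.5%. For one drive that is just a single draw, but across 10,000 drives the count of affected drives is tightly concentrated.

```python
import math


def p_clean_read(bits_read, rate=1e-14):
    """Chance of reading bits_read bits with no URE, assuming independent per-bit errors."""
    # log1p keeps precision that (1 - rate) ** bits_read would lose
    return math.exp(bits_read * math.log1p(-rate))


def fleet_summary(n_drives, bits_per_drive, rate=1e-14):
    """Mean and standard deviation of the number of drives hitting at least one URE."""
    p_fail = 1.0 - p_clean_read(bits_per_drive, rate)
    mean = n_drives * p_fail
    std = math.sqrt(n_drives * p_fail * (1.0 - p_fail))  # binomial spread
    return mean, std


p = p_clean_read(3e14)
mean, std = fleet_summary(10_000, 3e14)
print(f"clean read of 3e14 bits on one drive: {p:.4f}")
print(f"fleet of 10,000 drives: {mean:.0f} +/- {std:.0f} with at least one URE")
```

The fleet figure is something you could actually check against observation; the single-drive figure is not, which is why one drive reading 40 TB cleanly neither confirms nor refutes the quoted rate.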
>
> I'm under impression (and maybe I'm wrong, as usual :)) that UREs mainly
> depends on incomplete writes and/or unsable sectors. If this is the
> case, maybe the published URE values are related to the entire HDD
> warranty. In other words, they should be read as "in normal conditions,
> with typical loads, our HDD will exhibit about 1/10^14 unrecoverable
> errors during the entire disk lifespan".
I'm not an electro-magnetic engineer, but I would guess that UREs are caused
by some combination of:
- irregularities in the physical media
- imperfections in positioning of the write head
- fluctuations in temperature and pressure which could
affect precise performance of resistors and capacitors etc.
and probably various quantum effects that I know nothing about.
Maybe most UREs come from a speck of dust that was in the wrong place at the
wrong time.
I think a better summary would be:
in normal conditions and typical loads, a collection of 10^14 drives will
exhibit errors somewhere in the collection on a regular basis.
>
> It is reasonable? Or I am horribly wrong?
> Regards.
>
NeilBrown