Re: On URE and RAID rebuild - again!

From: Gionatan Danti <g.danti@assyoma.it>
To: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>
Cc: Mikael Abrahamsson <swmike@swm.pp.se>,
	linux-raid@vger.kernel.org, g.danti@assyoma.it
Subject: Re: On URE and RAID rebuild - again!
Date: Tue, 05 Aug 2014 21:42:17 +0200
Message-ID: <228fa3bd137e034e9ec974094f37b368@assyoma.it>
In-Reply-To: <20140805190159.GA2897@lazy.lzy>

On 2014-08-05 21:01, Piergiorgio Sartor wrote:
> 
> This means they, who wrote the article, did not
> really *test* what they wrote.
> Which already tells us a lot about the quality
> of the article itself.

True. The problem is that the web is full of similar articles, and what
they said sounded waaaaay too "suspicious" to me.

> What's the difference between "probability" and
> "statistical record"?
> Is not one calculated with the other?

Premise: I am not a statistics expert, so maybe I used the wrong terms
and/or my entire reasoning is flawed.

I am trying to imagine _how_ the various vendors arrive at the claimed
number and _how much_ confidence we can have in the URE rate. _If_ for
some reason (e.g. magnetic interference during write and/or rest) a
fixed "wrong read" probability exists _and_ _if_ it is correct to
consider each sector read as a totally independent event, HDD
manufacturers may have a quite precise formula from which the URE rate
is obtained.

If, on the other hand, they "simply" observe how a big drive population
behaves over time, maybe we can expect bigger variations between drives.

I'm just speculating here; what really worried me was the "you can't
read your 2 TB drive 6 times" argument :)

> I'm too lazy to try to understand what 3*10^14 is.
> What is it?

I have read about 40 TB of data, or 320 Tb. 10^14 bits is 12.5 TB, or
100 Tb if you prefer. So 3*10^14 is simply the (rounded) number of bits
that I read. The URE rate is expressed as 1 event per 10^14 bits, so I
thought it made sense to use the same scale here.
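
As a quick check of the unit conversion above, here is a minimal Python
sketch. It assumes decimal units as used on drive labels (1 TB = 10^12
bytes); the helper names are made up for illustration.

```python
# Assumption: decimal terabytes (1 TB = 10**12 bytes), 8 bits per byte.
BITS_PER_TB = 8 * 10**12


def tb_to_bits(terabytes):
    """Convert decimal terabytes to bits."""
    return terabytes * BITS_PER_TB


def bits_to_tb(bits):
    """Convert bits to decimal terabytes."""
    return bits / BITS_PER_TB


print(tb_to_bits(40))       # 320000000000000, i.e. 3.2 * 10^14 bits
print(bits_to_tb(10**14))   # 12.5
```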

> I'm under the impression you did not grasp the
> concept of probability in such a context.
> Given that it is not clear how the manufacturers
> compute their numbers, both cases you describe
> are the same.
> All the possible conditions are included in the
> probability computation.

I can see your point...

> You can state: under worst case scenario, *each*
> bit has a probability of 10E-14 of being wrong.
> What does this mean?

... and _this_ is what really interests me. Manufacturers publish URE
rates as "max" values, so it should be reasonable to assume that they
describe a worst-case scenario. If this is the case, we can be quite
sure that our URE rate will be lower than the published specs (assuming
that the drives are deployed with care).

On the other hand, in some articles and even on this mailing list I
read that published URE rates really are a "max of various means" and
do not represent a true worst-case scenario.

> As already written by others, it is not clear what
> that number (10E-14) means.
> A common understanding could be, as stated above,
> each bit has a *probability* of 10E-14 of being wrong.
> 
> Practically, it does *not* mean that reading 10E14 bits
> will deliver one bit wrong systematically.

But if the spec is representative of a normal usage scenario, reading
40 TB of data with a URE rate of 10^-14 has a very high probability of
returning a bad read (>95%) ...
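
The >95% figure can be reproduced with a small Python sketch. It rests
on exactly the assumptions questioned in this thread: every bit read is
an independent event with a fixed error probability of 10^-14. The
function name and its default value are illustrative only, not anything
a vendor publishes.

```python
import math


def p_at_least_one_ure(bits_read, ure_per_bit=1e-14):
    """P(at least one URE), ASSUMING independent bits with a fixed error rate."""
    # 1 - (1 - p)^n, evaluated via log1p/expm1 so the tiny p is not lost
    # to floating-point rounding.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


bits_40tb = 40 * 8 * 10**12      # 40 TB in bits, decimal units
bits_12tb = 6 * 2 * 8 * 10**12   # a 2 TB drive read 6 times

print(f"40 TB: {p_at_least_one_ure(bits_40tb):.3f}")
print(f"12 TB: {p_at_least_one_ure(bits_12tb):.3f}")
```

Under this model the 40 TB case comes out at roughly 0.96, while
reading a 2 TB drive 6 times gives roughly 0.62: likely, but far from a
certain failure.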

> Furthermore, as already again stated, very likely
> an "average" HDD has much lower URE probability.

This is reassuring :)

> 
> Is this pure curiosity from your side or are
> you trying to achieve something?
> 
> There is a report, from CERN I think, providing
> real world statistics about HDD problems.
> 
> http://storagemojo.com/2007/09/19/cerns-data-corruption-research/
> 
> bye,

Yes, I saw this article and read it with great interest. After all, it
seems that the greater part of data corruption is due to
firmware/kernel/driver bugs, and that the URE rate plays a minor role
here.

Thank you very much, guys. I'm sorry to bore you with all these
questions, but I'm just trying to learn something!
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
