Re: Markov models for Ceph


From: Loic Dachary <loic@dachary.org>
To: Koleos Fuscus <koleosfuscus@gmail.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Markov models for Ceph
Date: Mon, 07 Jul 2014 19:16:32 +0200
Message-ID: <53BAD5F0.20804@dachary.org>
In-Reply-To: <CACUN8h1j=j-=37TRcnauqGuxMff3Ebs4x5jewY3KefOgDprp8g@mail.gmail.com>

Hi koleosfuscus,

From http://www.kaymgee.com/Kevin_Greenan/Software_files/hfrs.tar downloaded from http://www.kaymgee.com/Kevin_Greenan/Software.html

In hfrs/models/weaver_8_8_3.disk.ber.model

[num states]
4
0 1 a failure
1 0 b repair
1 2 c failure
2 1 d repair
2 3 e failure
[assign]
a=N*lam_d
b=mu
c=(N-1)*lam_d
d=2*mu
e=(N-2)*lam_d
N=8
lam_d=(1/461386.)
mu=(1/12.)
[END]

is semi-human-parsable, but hfrs/models/weaver_8_8_3.disk.ber.model is less so:

[num states]
5
0 1 a failure
0 4 b failure
1 2 c failure
1 4 d failure
1 0 e repair
2 3 f failure
2 4 g failure
2 1 h repair
3 4 i failure
3 2 j repair
[assign]
a=(N-0)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-1))))
b=(N-0)*lam_d*(0.000000)+(N-0)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-1))))
c=(N-1)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-2))))
d=(N-1)*lam_d*(0.000000)+(N-1)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-2))))
e=1*mu
f=(N-2)*lam_d*(1-0.000000)*(1-(0.114286*(1-(1-p)**(N-3))))
g=(N-2)*lam_d*(0.000000)+(N-2)*lam_d*(1-0.000000)*((0.114286*(1-(1-p)**(N-3))))
h=2*mu
i=(N-3)*lam_d
j=3*mu
N=8
lam_d=(1/461386.)
mu=(1/12.)
p=0.0237
[END]

[Disk sector conditional fault tolerance]
[[0.0, 0.0, 0.0, 0.0, 0.0043956043956043956, 0.02197802197802198, 0.075924075924075921], [0.0, 0.0, 0.0, 0.01098901098901099, 0.057942057942057944, 0.19780219780219779, 1.0], [0.0, 0.0, 0.034632034632034632, 0.16623376623376623, 0.49494949494949497, 1.0, 1.0], [0.0, 0.11428571428571428, 0.44126984126984126, 0.98333333333333328, 1.0, 1.0, 1.0]]
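
To make the second listing a bit less opaque, here is a minimal sketch that evaluates the f and g expressions as written. It assumes (this is my reading, not something the file states) that the 0.114286 constant is entry [3][1] of the fault tolerance matrix rounded to six decimals, and that p is a per-disk sector error probability. The only thing it really demonstrates is that f and g split the plain (N-2)*lam_d failure rate between state 3 and state 4.

```python
# Values copied from the [assign] section of the 5-state listing above.
N = 8
lam_d = 1 / 461386.0
p = 0.0237

# Entry [3][1] of the "Disk sector conditional fault tolerance" matrix.
# Assumption: this is where the 0.114286 in f and g comes from.
cond = 0.11428571428571428

# 1 - (1-p)**(N-3): chance that at least one of N-3 independent events of
# probability p occurs (reading p as a sector-error probability is an assumption).
p_sector = 1 - (1 - p) ** (N - 3)

# f and g exactly as they appear in the listing.
f = (N - 2) * lam_d * (1 - 0.000000) * (1 - (0.114286 * p_sector))
g = (N - 2) * lam_d * (0.000000) + (N - 2) * lam_d * (1 - 0.000000) * ((0.114286 * p_sector))

print("f     =", f)
print("g     =", g)
print("f + g =", f + g, "vs (N-2)*lam_d =", (N - 2) * lam_d)
print("matrix entry rounded to 6 decimals:", round(cond, 6))
```

So the extra transitions to state 4 only redistribute the outgoing failure rate; the total rate of leaving state 2 by failure is unchanged.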


Kevin writes that "The HFRS uses an extremely efficient mathematical technique, called importance sampling, which enables the observation of extremely low-probability events.  I have implemented (and derived in my thesis) efficient simulation algorithms under both exponential and Weibull failure/repairs.  The combination of these techniques, in addition to a custom Markov model solver, makes the HFRS an extremely useful tool for evaluating storage system reliability." To follow it, you need to understand both https://en.wikipedia.org/wiki/Markov_model and https://en.wikipedia.org/wiki/Importance_sampling, as well as the semantics of the input file, which are documented in the README.
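
As an illustration of what solving such a model means, here is a minimal sketch that computes the mean time to data loss for the 4-state listing above. It is not the HFRS solver: the function name is hypothetical, and it assumes that the rates form a continuous-time Markov chain and that the highest-numbered state (3) is the absorbing data-loss state. Exact fractions are used so that the very different magnitudes of lam_d and mu do not cause rounding trouble.

```python
from fractions import Fraction

# Rates copied from the 4-state listing above; Fractions keep the solve exact.
N = 8
lam_d = Fraction(1, 461386)
mu = Fraction(1, 12)
rates = {
    (0, 1): N * lam_d,        # a, failure
    (1, 0): mu,               # b, repair
    (1, 2): (N - 1) * lam_d,  # c, failure
    (2, 1): 2 * mu,           # d, repair
    (2, 3): (N - 2) * lam_d,  # e, failure
}
N_STATES = 4
ABSORBING = 3  # assumption: the highest-numbered state means data loss


def mean_time_to_absorption(rates, n_states, absorbing, start=0):
    """Expected time to reach `absorbing` from `start`.

    Solves sum_j q[i][j] * t[j] = -1 over the transient states, where q is
    the generator matrix and t[absorbing] = 0.
    """
    transient = [s for s in range(n_states) if s != absorbing]
    idx = {s: k for k, s in enumerate(transient)}
    n = len(transient)
    # Augmented matrix: generator restricted to transient states, then -1.
    a = [[Fraction(0)] * n + [Fraction(-1)] for _ in range(n)]
    for (src, dst), rate in rates.items():
        if src == absorbing:
            continue
        a[idx[src]][idx[src]] -= rate      # diagonal: minus total outflow
        if dst != absorbing:
            a[idx[src]][idx[dst]] += rate  # off-diagonal: rate src -> dst
    # Gauss-Jordan elimination; arithmetic is exact, so any nonzero pivot works.
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        for row in range(n):
            if row != col and a[row][col] != 0:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    k = idx[start]
    return a[k][n] / a[k][k]


mttdl = mean_time_to_absorption(rates, N_STATES, ABSORBING)
print("MTTDL: %.6e" % float(mttdl))
```

Under these assumptions the result is about 4.0611e+12, which agrees with the "Analytic MTTDL" you quote below for 2DFT.disk.model. That suggests (it is only an inference from the numbers) that the 4-state listing corresponds to that model and that the time unit is hours.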

Nice find koleosfuscus :-)

Cheers

On 07/07/2014 17:19, Koleos Fuscus wrote:
> Hello Loic,
> 
> You asked previously:
> In other words, is there a place where one could set things like "disk
> fail % of the time" and "network is X Gb/s" and "repairing a disk
> failure requires reading B bytes from M disks" ? As far
> as I understand, such factors cannot be expressed with a single
> formula and this is why a Markov model is useful.
> 
> I think we need to run simulations to get a more precise estimate
> of the reliability of an erasure coded system. Markov models are not
> as flexible as you may think. Besides, solving the equations is not
> trivial when the number of components that may fail is large. Maybe
> standard simulation is enough. As observed by Greenan in his thesis,
> standard simulations have problems with rare events, which may not be
> observed during simulation time. I don't know if we should care about
> rare events for comparing methods.
> 
> Greenan released the software used for his thesis. It is completely
> developed in Python.
> http://www.kaymgee.com/Kevin_Greenan/Software.html
> 
> I found Greenan's tool while trying to validate the results of ceph-tool,
> and the numbers are completely different:
> 
> For instance:
> 
> Parameters for ceph tool:
> Disk type consumer, FIT1=2167, FIT2=2167
> Size: 2000GiB
> RAID-6
> Replace 0h
> Rebuild 6000MiB/s
> Volumes:8
> NRE model: ignore
> Period: 10 years
> 
> (I used these numbers to compare with the 2DFT.disk.model model of Greenan's tool)
> 
> Parameters for Greenan's HFRS tool:
> python mm_solve.py -m 2DFT.disk.model -M
> 
> Results
> 
> CEPH:
> 
>     storage               durability    PL(site)  PL(copies)     PL(NRE)     PL(rep)    loss/PiB
>     ----------            ----------  ----------  ----------  ----------  ----------  ----------
>     RAID-6: 6+2             11-nines   0.000e+00   1.318e-12   0.000e+00   0.000e+00   9.887e+02
> 
> HFRS:
> 
> Analytic MTTDL:  4.06111903031e+12
> *********************
> Analytic prob. of failure: 2.15660e-08
> *********************
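
A quick sanity check on how these two HFRS numbers relate, plus the rebuild time implied by the ceph tool parameters. This is a sketch under assumptions: I take the MTTDL to be in hours, the horizon to be 10 years of 365 days, and the time to data loss to be roughly exponential. None of that is stated in the output.

```python
import math

mttdl_hours = 4.06111903031e12   # "Analytic MTTDL" reported above
mission_hours = 10 * 365 * 24    # assumption: 10-year period, hour units

# For an exponentially distributed time to data loss with this mean,
# P(loss within t) = 1 - exp(-t / MTTDL); expm1 keeps precision for tiny values.
p_loss = -math.expm1(-mission_hours / mttdl_hours)

# Rebuild time implied by the ceph tool parameters: 2000 GiB at 6000 MiB/s.
rebuild_seconds = 2000 * 1024 / 6000

print("P(loss in 10 years) ~ %.4e" % p_loss)
print("rebuild time        ~ %.1f s" % rebuild_seconds)
```

This gives about 2.157e-08, close to the 2.15660e-08 reported, so the HFRS figure looks like a 10-year probability and is comparable in meaning to PL(copies). The rebuild time comes out at roughly 341 seconds. If the HFRS model repairs at mu=(1/12.) like the listings above (an assumption, and 12 hours if the unit is hours), the repair times of the two runs differ by about two orders of magnitude, which would be the first parameter to align before comparing results.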
> 
> Could you check whether the parameters for ceph are correct and equivalent
> to the HFRS model? Do you think it makes sense to include Greenan's tool?
> Greenan has a number of models, including non-MDS codes. I am not sure
> yet how we can describe the LRC code in this platform, but it might be
> possible.
> 
> koleosfuscus
> 
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema

-- 
Loïc Dachary, Artisan Logiciel Libre
