From: caleb miles <caleb.miles@inktank.com>
To: ceph-devel@vger.kernel.org
Subject: chooseleaf_descend_once
Date: Mon, 26 Nov 2012 21:28:43 -0500
Message-ID: <50B4255B.10509@inktank.com>
Hello all,
Here's what I've done to try and validate the new
chooseleaf_descend_once tunable first described in commit
f1a53c5e80a48557e63db9c52b83f39391bc69b8 in the wip-crush branch of
ceph.git.
First I set the new tunable to its legacy value, disabled:
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 0
The map contains one thousand OSD devices spread across one hundred
hosts, with the following data rule
rule data {
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
I then simulate the creation of one million placement groups using
crushtool
$ crushtool -i hundred.map --test --min-x 0 --max-x 999999 --num-rep 3
--output-csv --weight 120 0.0 --weight 121 0.0 --weight 122 0.0 --weight
123 0.0 --weight 124 0.0 --weight 125 0.0 --weight 126 0.0 --weight 150
0.0 --weight 151 0.0 --weight 152 0.0 --weight 153 0.0 --weight 154 0.0
--weight 155 0.0 --weight 156 0.0 --weight 180 0.0 --weight 181 0.0
--weight 182 0.0 --weight 183 0.0 --weight 184 0.0 --weight 185 0.0
--weight 186 0.0
with the majority of devices in three hosts marked out. Then in (I)Python
import scipy.stats as s
import matplotlib.mlab as m
data = m.csv2rec("data-device_utilization.csv")
s.chisquare(data['number_of_objects_stored'],
data['number_of_objects_expected'])
which will output
(122939.76474477499, 0.0)
so the chi squared value is 122939.765 and the p value rounds to 0.0;
the observed placement distribution differs significantly from a
uniform distribution. Repeating with the new tunable set to
tunable chooseleaf_descend_once 1
I obtain the following result
(998.97643161876761, 0.32151775131589833)
so the chi squared value is 998.976 and the p value is 0.32, and the
observed placement distribution is statistically indistinguishable from
the uniform distribution at the five and ten percent significance
levels, and at stricter levels as well of course. The p value is the
probability, assuming the null hypothesis holds, of obtaining a chi
squared value at least as extreme as the statistic observed. Basically,
from my rudimentary understanding of probability theory, if you obtain
a p value p < P then you reject the null hypothesis, in our case that
the observed placement distribution is drawn from the uniform
distribution, at the P significance level.
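To make the rule above concrete, here is a minimal sketch in plain
Python, with no scipy required. The helper names chi_squared and
reject_null are hypothetical, and the four-device counts are made up
for illustration; only the two p values come from the runs described
in this message. It assumes every expected count is nonzero.

```python
def chi_squared(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all devices.

    Assumes every expected count is nonzero.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(p_value, alpha):
    """Reject the null hypothesis when p < alpha (the level P in the text)."""
    return p_value < alpha


# Hypothetical counts for four devices, not taken from the crushtool output.
observed = [90, 110, 95, 105]
expected = [100, 100, 100, 100]
stat = chi_squared(observed, expected)  # (100 + 100 + 25 + 25) / 100

# p values reported in this message for the two tunable settings.
p_legacy = 0.0
p_descend_once = 0.32151775131589833

for alpha in (0.05, 0.10):
    print(alpha, reject_null(p_legacy, alpha),
          reject_null(p_descend_once, alpha))
```

With chooseleaf_descend_once 0 the null hypothesis is rejected at both
levels; with chooseleaf_descend_once 1 it is rejected at neither.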
Caleb