From: "Jim Schutt" <jaschut@sandia.gov>
To: Caleb Miles <caleb.miles@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: chooseleaf_descend_once
Date: Wed, 28 Nov 2012 10:13:01 -0700
Message-ID: <50B6461D.7080004@sandia.gov>
In-Reply-To: <CA+zLgM0WR06Kn-pkSn7PKaZF=pHEcH5Mdzaaa=6iftuvA_kajw@mail.gmail.com>
On 11/28/2012 09:11 AM, Caleb Miles wrote:
> Hey Jim,
>
> Running the third test with tunable chooseleaf_descend_once 0 with no
> devices marked out yields the following result
>
> (999.82733333333397, 0.48667056652539997)
>
> so the chi squared value is 999.83 with a corresponding p value of 0.487,
> so the placement distribution is consistent with the uniform distribution,
> as desired.
Great, thanks for doing that extra test.
Plus, I see that Sage has merged it. Cool.
Thanks -- Jim
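
The p value quoted above for the third test can be sanity-checked without SciPy. What follows is a minimal sketch, not the method used in the thread: it swaps in the Wilson-Hilferty normal approximation to the chi squared distribution, which is reasonable for large degrees of freedom. The function name is hypothetical, and 999 degrees of freedom is an assumption (1000 devices, none excluded from the CSV).

```python
import math


def chi2_sf_wilson_hilferty(stat, dof):
    """Approximate P(X >= stat) for X ~ chi squared with `dof` degrees of freedom.

    Wilson-Hilferty: (X / dof) ** (1/3) is roughly normal with
    mean 1 - 2 / (9 * dof) and variance 2 / (9 * dof).
    """
    var = 2.0 / (9.0 * dof)
    z = ((stat / dof) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Statistic reported for the third test; dof = 999 is an assumption.
p = chi2_sf_wilson_hilferty(999.82733333333397, 999)
print(round(p, 3))  # prints 0.487
```

A p value this large gives no reason to reject the hypothesis of uniform placement at the usual 5% or 10% levels.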
>
> Caleb
>
>
> On Tue, Nov 27, 2012 at 1:28 PM, Jim Schutt<jaschut@sandia.gov> wrote:
>
>> Hi Caleb,
>>
>>
>> On 11/26/2012 07:28 PM, caleb miles wrote:
>>
>>> Hello all,
>>>
>>> Here's what I've done to try and validate the new chooseleaf_descend_once
>>> tunable first described in commit f1a53c5e80a48557e63db9c52b83f39391bc69b8
>>> in the wip-crush branch of ceph.git.
>>>
>>> First I set the new tunable to its legacy value, disabled,
>>>
>>> tunable choose_local_tries 0
>>> tunable choose_local_fallback_tries 0
>>> tunable choose_total_tries 50
>>> tunable chooseleaf_descend_once 0
>>>
>>> The map contains one thousand osd devices contained in one hundred hosts
>>> with the following data rule
>>>
>>> rule data {
>>> ruleset 0
>>> type replicated
>>> min_size 1
>>> max_size 10
>>> step take default
>>> step chooseleaf firstn 0 type host
>>> step emit
>>> }
>>>
>>> I then simulate the creation of one million placement groups using the
>>> crushtool
>>>
>>> $ crushtool -i hundred.map --test --min-x 0 --max-x 999999 --num-rep 3
>>> --output-csv --weight 120 0.0 --weight 121 0.0 --weight 122 0.0 --weight
>>> 123 0.0 --weight 124 0.0 --weight 125 0.0 --weight 125 0.0 --weight 150 0.0
>>> --weight 151 0.0 --weight 152 0.0 --weight 153 0.0 --weight 154 0.0
>>> --weight 155 0.0 --weight 156 0.0 --weight 180 0.0 --weight 181 0.0
>>> --weight 182 0.0 --weight 183 0.0 --weight 184 0.0 --weight 185 0.0
>>> --weight 186 0.0
>>>
>>> with the majority of devices in three hosts marked out. Then in (I)Python
>>>
>>> import scipy.stats as s
>>> import matplotlib.mlab as m
>>>
>>> data = m.csv2rec("data-device_utilization.csv")
>>> s.chisquare(data['number_of_objects_stored'], data['number_of_objects_expected'])
>>>
>>> which will output
>>>
>>> (122939.76474477499, 0.0)
>>>
>>> so that the chi squared value is 122939.765 and the p value rounds to
>>> 0.0, so the observed placement distribution differs significantly from
>>> a uniform distribution. Repeating with the new tunable set to
>>>
>>> tunable chooseleaf_descend_once 1
>>>
>>> I obtain the following result
>>>
>>> (998.97643161876761, 0.32151775131589833)
>>>
>>> so that the chi squared value is 998.976 and the p value is 0.32, so the
>>> observed placement distribution is statistically indistinguishable from
>>> the uniform distribution at the five and ten percent significance levels,
>>> and at stricter levels as well. The p value is the probability, under the
>>> null hypothesis, of obtaining a chi squared value at least as extreme as
>>> the statistic observed. Basically, from my rudimentary understanding of
>>> probability theory, if you obtain a p value p < P then you reject the
>>> null hypothesis, in our case that the observed placement distribution is
>>> drawn from the uniform distribution, at the P significance level.
>>>
>>>
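
The statistic itself is simple to reproduce by hand. The sketch below reads the two columns named in the quoted snippet using only the standard library (`matplotlib.mlab.csv2rec` has since been removed from Matplotlib). Several things here are assumptions rather than facts from the thread: the function name, the `device_id` column, the four-row sample data, and the choice to skip rows whose expected count is zero (as might be the case for devices weighted to 0.0).

```python
import csv
import io


def chi_square_from_csv(handle,
                        observed_col="number_of_objects_stored",
                        expected_col="number_of_objects_expected"):
    """Return (statistic, degrees_of_freedom) for a goodness-of-fit test."""
    stat = 0.0
    bins = 0
    for row in csv.DictReader(handle):
        observed = float(row[observed_col])
        expected = float(row[expected_col])
        if expected == 0.0:
            # Assumption: devices with no expected objects are not test bins.
            continue
        stat += (observed - expected) ** 2 / expected
        bins += 1
    return stat, bins - 1


# Hypothetical four-device sample, not data from the thread.
sample = io.StringIO(
    "device_id,number_of_objects_stored,number_of_objects_expected\n"
    "0,1010,1000\n"
    "1,990,1000\n"
    "2,1005,1000\n"
    "3,995,1000\n"
)
stat, dof = chi_square_from_csv(sample)
print(stat, dof)
```

For a real run, the `io.StringIO` object would be replaced by an open file handle on the CSV that crushtool writes, and the pair `(stat, dof)` would then be turned into a p value with a chi squared survival function.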
>> Cool. Thanks for doing these tests.
>>
>> Is there any point to doing a third test, with
>>
>> tunable chooseleaf_descend_once 0
>>
>> and no devices marked out, but in all other respects
>> the same as the above two tests?
>>
>> I would expect the results for that case and the last
>> case you tested to be essentially identical in the degree
>> of uniformity, but is it worth verifying?
>>
>> -- Jim
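
On the command line, the runs discussed here differ only in which devices are given `--weight <id> 0.0`; the tunable itself lives in the compiled map. A small sketch of building that command programmatically, so the "no devices marked out" case is just an empty list. The helper name is hypothetical, and only flags that appear in the quoted crushtool invocation are used.

```python
import shlex


def crushtool_test_args(map_path, out_devices, num_rep=3, min_x=0, max_x=999999):
    """Build a crushtool --test argument list, marking `out_devices` out."""
    args = ["crushtool", "-i", map_path, "--test",
            "--min-x", str(min_x), "--max-x", str(max_x),
            "--num-rep", str(num_rep), "--output-csv"]
    for dev in out_devices:
        # A weight of 0.0 marks the device out for the simulation.
        args += ["--weight", str(dev), "0.0"]
    return args


# Device ids from the quoted command; it lists 125 twice, so 20 distinct ids.
out_devices = list(range(120, 126)) + list(range(150, 157)) + list(range(180, 187))

print(shlex.join(crushtool_test_args("hundred.map", out_devices)))
print(shlex.join(crushtool_test_args("hundred.map", [])))  # nothing marked out
```

The resulting list can be passed to `subprocess.run` as-is; `shlex.join` (Python 3.8+) is used here only to print a copy-pasteable command.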
>>
>>> Caleb