* chooseleaf_descend_once
From: caleb miles @ 2012-11-27  2:28 UTC (permalink / raw)
  To: ceph-devel

Hello all,

Here's what I've done to try to validate the new
chooseleaf_descend_once tunable first described in commit
f1a53c5e80a48557e63db9c52b83f39391bc69b8 in the wip-crush branch of
ceph.git.

First I set the new tunable to its legacy value, disabled:

tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 0

The map contains one thousand OSD devices spread across one hundred
hosts, with the following data rule:

rule data {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}

I then simulate the creation of one million placement groups using
crushtool:

$ crushtool -i hundred.map --test --min-x 0 --max-x 999999 --num-rep 3 \
    --output-csv \
    --weight 120 0.0 --weight 121 0.0 --weight 122 0.0 --weight 123 0.0 \
    --weight 124 0.0 --weight 125 0.0 --weight 125 0.0 \
    --weight 150 0.0 --weight 151 0.0 --weight 152 0.0 --weight 153 0.0 \
    --weight 154 0.0 --weight 155 0.0 --weight 156 0.0 \
    --weight 180 0.0 --weight 181 0.0 --weight 182 0.0 --weight 183 0.0 \
    --weight 184 0.0 --weight 185 0.0 --weight 186 0.0

with the majority of devices in three hosts marked out. Then in (I)Python

import scipy.stats as s
import matplotlib.mlab as m

data = m.csv2rec("data-device_utilization.csv")
s.chisquare(data['number_of_objects_stored'], 
data['number_of_objects_expected'])

which will output

(122939.76474477499, 0.0)

so the chi-squared value is 122939.765 and the p value rounds to 0.0,
meaning the observed placement distribution differs significantly from
a uniform distribution. Repeating with the new tunable set to

tunable chooseleaf_descend_once 1

I obtain the following result

(998.97643161876761, 0.32151775131589833)

so the chi-squared value is 998.976 and the p value is 0.32. The
observed placement distribution is therefore statistically
indistinguishable from the uniform distribution at the five and ten
percent significance levels, and at any significance level below 0.32.
The p value is the probability, under the null hypothesis, of obtaining
a chi-squared value at least as extreme as the statistic observed.
Basically, from my rudimentary understanding of probability theory, if
you obtain a p value p < P then you reject the null hypothesis, in our
case that the observed placement distribution is drawn from the uniform
distribution, at the P significance level.
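
The statistic and the decision rule described above can be sketched in
plain Python, without SciPy. This is only an illustration: the function
names and the four-device counts are made up for the example, and
`alpha` plays the role of P in the text. The two p values are the ones
reported above.

```python
def chi_square_statistic(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def reject_null(p_value, alpha):
    """Reject the null hypothesis (uniform placement) when p < alpha."""
    return p_value < alpha


# Hypothetical counts for four equally weighted devices (not from the run).
observed = [260, 240, 255, 245]
expected = [250.0, 250.0, 250.0, 250.0]
print("statistic:", chi_square_statistic(observed, expected))

# p values reported above for the two runs with devices marked out.
reported = [
    ("chooseleaf_descend_once 0", 0.0),
    ("chooseleaf_descend_once 1", 0.32151775131589833),
]
for label, p in reported:
    for alpha in (0.05, 0.10):
        verdict = "reject" if reject_null(p, alpha) else "do not reject"
        print(label, "alpha =", alpha, "->", verdict)
```

With these inputs the legacy setting is rejected at both levels, while
the new setting is not rejected at either.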

Caleb


* Re: chooseleaf_descend_once
From: Jim Schutt @ 2012-11-27 18:28 UTC (permalink / raw)
  To: caleb miles; +Cc: ceph-devel

Hi Caleb,

Cool.  Thanks for doing these tests.

Is there any point to doing a third test, with

tunable chooseleaf_descend_once 0

and no devices marked out, but in all other respects
the same as the above two tests?

I would expect the results for that case and the last
case you tested to be essentially identical in the degree
of uniformity, but is it worth verifying?

-- Jim


* Re: chooseleaf_descend_once
From: Caleb Miles @ 2012-11-28 16:16 UTC (permalink / raw)
  To: Jim Schutt; +Cc: ceph-devel

Hey Jim,

Running the third test with tunable chooseleaf_descend_once 0 and no
devices marked out yields the following result

(999.82733333333397, 0.48667056652539997)

so the chi-squared value is 999.827 with a corresponding p value of
0.487, and the placement distribution seems to be drawn from the
uniform distribution as desired.
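
As a rough cross-check, the p value can be recovered from the statistic
alone. The sketch below uses the Wilson-Hilferty normal approximation to
the chi-squared distribution, not SciPy's exact computation, and the
function name is made up for the example. The 999 degrees of freedom
follow from one thousand devices with none marked out. The 979 used for
the earlier run is an assumption: it supposes the 20 distinct
zero-weight devices were left out of the comparison, which the thread
does not state.

```python
import math


def chi2_sf_approx(stat, df):
    """Approximate P(X >= stat) for X ~ chi-squared(df) (Wilson-Hilferty)."""
    if df <= 0:
        raise ValueError("df must be positive")
    if stat <= 0:
        return 1.0
    var = 2.0 / (9.0 * df)
    # The cube root of stat/df is approximately normal(1 - var, var).
    z = ((stat / df) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Third test: 1000 devices, none marked out, so 999 degrees of freedom.
p_third = chi2_sf_approx(999.82733333333397, 999)
print("third test:", round(p_third, 3))

# Second test, ASSUMING 980 devices took part (979 degrees of freedom).
p_second = chi2_sf_approx(998.97643161876761, 979)
print("second test (assumed df):", round(p_second, 3))
```

Both approximations land close to the values SciPy reported, which
suggests the reported p values are consistent with those degrees of
freedom.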

Caleb


* Re: chooseleaf_descend_once
From: Jim Schutt @ 2012-11-28 17:13 UTC (permalink / raw)
  To: Caleb Miles; +Cc: ceph-devel

On 11/28/2012 09:11 AM, Caleb Miles wrote:
> Hey Jim,
>
> Running the third test with tunable chooseleaf_descend_once 0 and no
> devices marked out yields the following result
>
> (999.82733333333397, 0.48667056652539997)
>
> so the chi-squared value is 999.827 with a corresponding p value of
> 0.487, and the placement distribution seems to be drawn from the
> uniform distribution as desired.

Great, thanks for doing that extra test.

Plus, I see that Sage has merged it.   Cool.

Thanks -- Jim


