From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jim Schutt"
Subject: Re: chooseleaf_descend_once
Date: Wed, 28 Nov 2012 10:13:01 -0700
Message-ID: <50B6461D.7080004@sandia.gov>
References: <50B4255B.10509@inktank.com> <50B50662.2040002@sandia.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from sentry-two.sandia.gov ([132.175.109.14]:52115 "EHLO sentry-two.sandia.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754497Ab2K1RNX (ORCPT ); Wed, 28 Nov 2012 12:13:23 -0500
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Caleb Miles
Cc: ceph-devel@vger.kernel.org

On 11/28/2012 09:11 AM, Caleb Miles wrote:
> Hey Jim,
>
> Running the third test with tunable chooseleaf_descend_once 0 with no
> devices marked out yields the following result
>
> (999.82733333333397, 0.48667056652539997)
>
> so the chi squared value is about 999.8 with a corresponding p value of
> 0.487, so that the placement distribution seems to be drawn from the
> uniform distribution as desired.

Great, thanks for doing that extra test.

Plus, I see that Sage has merged it. Cool.

Thanks -- Jim

>
> Caleb
>
>
> On Tue, Nov 27, 2012 at 1:28 PM, Jim Schutt wrote:
>
>> Hi Caleb,
>>
>>
>> On 11/26/2012 07:28 PM, caleb miles wrote:
>>
>>> Hello all,
>>>
>>> Here's what I've done to try and validate the new chooseleaf_descend_once
>>> tunable first described in commit f1a53c5e80a48557e63db9c52b83f39391bc69b8
>>> in the wip-crush branch of ceph.git.
>>>
>>> First I set the new tunable to its legacy value, disabled,
>>>
>>> tunable choose_local_tries 0
>>> tunable choose_local_fallback_tries 0
>>> tunable choose_total_tries 50
>>> tunable chooseleaf_descend_once 0
>>>
>>> The map contains one thousand osd devices contained in one hundred hosts
>>> with the following data rule
>>>
>>> rule data {
>>>     ruleset 0
>>>     type replicated
>>>     min_size 1
>>>     max_size 10
>>>     step take default
>>>     step chooseleaf firstn 0 type host
>>>     step emit
>>> }
>>>
>>> I then simulate the creation of one million placement groups using the
>>> crushtool
>>>
>>> $ crushtool -i hundred.map --test --min-x 0 --max-x 999999 --num-rep 3
>>> --output-csv --weight 120 0.0 --weight 121 0.0 --weight 122 0.0 --weight
>>> 123 0.0 --weight 124 0.0 --weight 125 0.0 --weight 125 0.0 --weight 150 0.0
>>> --weight 151 0.0 --weight 152 0.0 --weight 153 0.0 --weight 154 0.0
>>> --weight 155 0.0 --weight 156 0.0 --weight 180 0.0 --weight 181 0.0
>>> --weight 182 0.0 --weight 183 0.0 --weight 184 0.0 --weight 185 0.0
>>> --weight 186 0.0
>>>
>>> with the majority of devices in three hosts marked out. Then in (I)Python
>>>
>>> import scipy.stats as s
>>> import matplotlib.mlab as m
>>>
>>> data = m.csv2rec("data-device_utilization.csv")
>>> s.chisquare(data['number_of_objects_stored'], data['number_of_objects_expected'])
>>>
>>> which will output
>>>
>>> (122939.76474477499, 0.0)
>>>
>>> so that the chi squared value is 122939.765 and the p value rounds to
>>> 0.0, and the observed placement distribution differs significantly from
>>> a uniform distribution.
>>> Repeating with the new tunable set to
>>>
>>> tunable chooseleaf_descend_once 1
>>>
>>> I obtain the following result
>>>
>>> (998.97643161876761, 0.32151775131589833)
>>>
>>> so that the chi squared value is 998.976 and the p value is 0.32, and the
>>> observed placement distribution is not significantly different from the
>>> uniform distribution at the five and ten percent significance levels, or
>>> at any level below 0.32 of course. The p value is the probability, under
>>> the null hypothesis, of obtaining a chi squared value at least as extreme
>>> as the statistic observed. Basically, from my rudimentary understanding
>>> of probability theory, if you obtain a p value p < P then you reject the
>>> null hypothesis, in our case that the observed placement distribution is
>>> drawn from the uniform distribution, at the P significance level.
>>>
>>>
>> Cool. Thanks for doing these tests.
>>
>> Is there any point to doing a third test, with
>>
>> tunable chooseleaf_descend_once 0
>>
>> and no devices marked out, but in all other respects
>> the same as the above two tests?
>>
>> I would expect the results for that case and the last
>> case you tested to be essentially identical in the degree
>> of uniformity, but is it worth verifying?
>>
>> -- Jim
>>
>> Caleb
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>>
>>
>
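
For anyone who wants to repeat the chi squared check quoted above without scipy or matplotlib (matplotlib.mlab.csv2rec is no longer available in recent matplotlib releases), here is a minimal sketch using only the Python standard library. It is not the method used in the thread. The column names are the ones quoted above. Several things are assumptions: that the CSV has a header row, that devices with an expected count of zero (marked out) should be skipped, and that the p value is estimated with the Wilson-Hilferty normal approximation rather than the exact computation scipy performs. The function names are made up for this illustration.

```python
import csv
import io
import math


def chi_square_statistic(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def approx_p_value(statistic, dof):
    """Approximate upper-tail p value for a chi squared statistic.

    Uses the Wilson-Hilferty cube-root normal approximation, which is
    an approximation (accurate for large dof), not an exact result.
    """
    spread = 2.0 / (9.0 * dof)
    z = ((statistic / dof) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def utilization_chi_square(csv_file):
    """Read device utilization rows and return (statistic, approx p value).

    Assumes a header row with the columns 'number_of_objects_stored' and
    'number_of_objects_expected'. Rows whose expected count is zero are
    skipped (assumption: those are the devices marked out).
    """
    observed, expected = [], []
    for row in csv.DictReader(csv_file):
        e = float(row["number_of_objects_expected"])
        if e == 0.0:
            continue
        observed.append(float(row["number_of_objects_stored"]))
        expected.append(e)
    statistic = chi_square_statistic(observed, expected)
    dof = len(observed) - 1  # categories minus one, as in a goodness-of-fit test
    return statistic, approx_p_value(statistic, dof)


if __name__ == "__main__":
    # Tiny synthetic example, not real crushtool output.
    sample = io.StringIO(
        "device,number_of_objects_stored,number_of_objects_expected\n"
        "0,90,100\n"
        "1,110,100\n"
        "2,0,0\n"
    )
    print(utilization_chi_square(sample))
```

To run it on real output, open the CSV written by crushtool and pass the file object to utilization_chi_square. As a rough cross-check, approx_p_value(999.827, 999) comes out at about 0.487, which agrees with the third test quoted above if that test had 999 degrees of freedom (one thousand devices, none marked out).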