From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiaopong Tran
Subject: Re: Very unbalanced storage
Date: Sat, 01 Sep 2012 10:33:41 +0800
Message-ID: <50417405.4060002@gmail.com>
References: <50409BDC.5010006@gmail.com> <5040D179.5000008@aktzero.com> <5040E532.6040306@aktzero.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-pb0-f46.google.com ([209.85.160.46]:55522 "EHLO
	mail-pb0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752319Ab2IACd2 (ORCPT );
	Fri, 31 Aug 2012 22:33:28 -0400
Received: by pbbrr13 with SMTP id rr13so5709304pbb.19
	for ; Fri, 31 Aug 2012 19:33:27 -0700 (PDT)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Gregory Farnum
Cc: Andrew Thompson , "ceph-devel@vger.kernel.org"

On 09/01/2012 12:39 AM, Gregory Farnum wrote:
> On Fri, Aug 31, 2012 at 9:24 AM, Andrew Thompson wrote:
>> On 8/31/2012 12:10 PM, Sage Weil wrote:
>>>
>>> On Fri, 31 Aug 2012, Andrew Thompson wrote:
>>>>
>>>> Have you been reweight-ing osds? I went round and round with my cluster a
>>>> few days ago reloading different crush maps only to find that it
>>>> re-injecting a crush map didn't seem to overwrite reweights. Take a look at
>>>> `ceph osd tree` to see if the reweight column matches the weight column.
>>>
>>> Note that the ideal situation is for reweight to be 1, regardless of what
>>> the crush weight is. If you find the utilizations are skewed, I would
>>> look for other causes before resorting to reweight-by-utilization; it is
>>> meant to adjust the normal statistical variation you expect from a
>>> (pseudo)random placement, but if the variance is high there is likely
>>> another cause.
>>
>>
>> So if someone(me, guilty) had been messing with reweight, will setting them
>> all to 1 return it to a normal un-reweight-ed state?
>
> Yep!
> If you have OSDs with different sizes you'll want to adjust the CRUSH
> weights, not the reweight values:
> http://ceph.com/docs/master/ops/manage/crush/#adjusting-the-crush-weight

Thanks for the reply. Yes, that is what I did. We have 1TB and 2TB
disks, so I used 1TB as the baseline with a weight of 1.0. I want each
2TB disk to store twice as much data, so that all disks stay at roughly
the same relative utilization.

Originally, every osd had a weight of 1.0, and I ran:

ceph osd crush reweight osd.30 2.0

and did the same for all the other 2TB disks. That is probably what
caused the skew afterward. The crush map attached to my last message
was fetched from the cluster, and ceph osd tree does show the weight
of the 2TB disks as 2, with reweight at 1.

Now I'm getting confused by the meaning of crush weight :)

Best,

Xiaopong
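
For reference, the weighting scheme described above (1TB as the baseline
at weight 1.0, 2TB disks at 2.0) can be sketched as a small script. This
is only an illustration under stated assumptions: the helper names and
the osd.29 entry are hypothetical, and the expected-share figure assumes
that CRUSH places data in proportion to crush weight when every reweight
is 1. It only builds the command strings; it does not talk to a cluster.

```python
from typing import Dict, List


def crush_weight(size_tb: float, baseline_tb: float = 1.0) -> float:
    """Weight relative to the baseline disk size (1TB -> 1.0, 2TB -> 2.0)."""
    return round(size_tb / baseline_tb, 2)


def reweight_commands(disks: Dict[str, float]) -> List[str]:
    """Build one 'ceph osd crush reweight' command string per OSD."""
    return [
        f"ceph osd crush reweight {osd} {crush_weight(size_tb)}"
        for osd, size_tb in sorted(disks.items())
    ]


def expected_share(disks: Dict[str, float]) -> Dict[str, float]:
    """Fraction of data each OSD would hold if placement follows crush weight."""
    total = sum(crush_weight(size_tb) for size_tb in disks.values())
    return {osd: crush_weight(size_tb) / total for osd, size_tb in disks.items()}


# osd.30 is the 2TB disk from the message; osd.29 is a made-up 1TB disk.
disks = {"osd.29": 1.0, "osd.30": 2.0}

commands = reweight_commands(disks)
shares = expected_share(disks)

for command in commands:
    print(command)
for osd, share in sorted(shares.items()):
    print(f"{osd}: {share:.0%} of data")
```

With these weights the 2TB disk would be expected to hold about twice
the data of the 1TB disk, which is the same relative utilization on
both.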