From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mark Nelson
Subject: Re: [ceph-users] CephFS test-case
Date: Fri, 06 Sep 2013 18:29:42 -0500
Message-ID: <522A6566.4020700@inktank.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-ie0-f182.google.com ([209.85.223.182]:58496 "EHLO
	mail-ie0-f182.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1750708Ab3IFX3o (ORCPT );
	Fri, 6 Sep 2013 19:29:44 -0400
Received: by mail-ie0-f182.google.com with SMTP id aq17so8468193iec.27
	for ; Fri, 06 Sep 2013 16:29:43 -0700 (PDT)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil
Cc: Nigel Williams, ceph-devel@vger.kernel.org

On 09/06/2013 06:22 PM, Sage Weil wrote:
> [re-adding ceph-devel]
>
> On Sat, 7 Sep 2013, Nigel Williams wrote:
>
>> On Sat, Sep 7, 2013 at 1:27 AM, Sage Weil wrote:
>>> It sounds like the problem is cluster B's pools have too few PGs, making
>>> the data distribution get all out of whack.
>>
>> Agree, it was too few PGs, I have now re-adjusted and it is busy
>> backfilling and evening out the data-distribution across the OSDs.
>>
>> My overall point is that the out-of-the-box defaults don't provide a
>> stable test-deployment (whereas older versions like 0.61 did), and so
>> minimally perhaps ceph-deploy needs to have a stab at choosing a
>> workable value of PGs? or alternatively the health warning could
>> include a note about PGs being too low.
>
> I agree; this is a general problem that we need to come up with a better
> solution to.
>
> One idea:
>
> - make ceph health warn when the pg distribution looks "bad"
>   - too few pgs relative to the # of osds
>   - too many objects in a pool relative to the # of pgs and the
>     above
>
> (We'll need to be a little creative to make thresholds that make sense.)
>
> If we have an interactive ceph-deploy new, we can also estimate how big
> the cluster will get and make a more sensible starting count. I like that
> less, though, as it is potentially confusing and has more room for user
> error.

At one point Sam and I were discussing some kind of message that wouldn't
be a health warning, but something similar to what you are discussing
here. The idea is that this would be for when Ceph thinks something is
configured sub-optimally, but the issue doesn't necessarily affect the
health of the cluster (at least insofar as everything is functioning as
defined). We were concerned that people might not want more things
causing health warnings.

>
> sage
>
>
>>
>>> ceph osd dump | grep ^pool
>>> say, and how many OSDs do you have?
>>
>> I assume you mean PGs, it was the default (192?) and changing it to
>> 400 seems to have helped. There are 12 OSDs (4 per server, 3 servers).
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
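
The figures quoted in the thread (12 OSDs, a pool going from roughly 192 PGs to 400) lend themselves to a quick worked check of the two conditions Sage lists: too few PGs relative to the number of OSDs, and too many objects relative to the number of PGs. What follows is only a rough sketch of how such an advisory might be computed, not anything Ceph ships. The function names are hypothetical, the replica count is not stated in the thread and is passed in as an assumption, and every threshold (about 100 PG replicas per OSD as a target, 30 as a floor, 100,000 objects per PG as a ceiling) is a placeholder that would need the "creative" tuning mentioned above.

```python
# Illustrative sketch only. Nothing here is a Ceph API; every name and
# threshold below is an assumption made for this example.

def pg_replicas_per_osd(pg_num, replicas, num_osds):
    """Average number of PG replicas each OSD holds for a single pool."""
    return pg_num * replicas / num_osds


def next_power_of_two(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power


def suggest_pg_num(num_osds, replicas, target_per_osd=100):
    """Rule-of-thumb starting point: OSDs * target / replicas, rounded up
    to a power of two. The target of 100 is an assumed default."""
    return next_power_of_two(num_osds * target_per_osd / replicas)


def pg_advisories(pg_num, replicas, num_osds, num_objects=0,
                  min_per_osd=30, max_objects_per_pg=100_000):
    """Return advisory strings for one pool; an empty list means nothing
    looked off. Both thresholds are placeholder values."""
    notes = []

    # Condition 1: too few PGs relative to the number of OSDs.
    per_osd = pg_replicas_per_osd(pg_num, replicas, num_osds)
    if per_osd < min_per_osd:
        notes.append(
            "too few PGs relative to OSDs: %.1f per OSD (< %d); "
            "consider pg_num=%d"
            % (per_osd, min_per_osd, suggest_pg_num(num_osds, replicas)))

    # Condition 2: too many objects in the pool relative to its PG count.
    objects_per_pg = num_objects / pg_num
    if objects_per_pg > max_objects_per_pg:
        notes.append(
            "too many objects relative to PGs: %.0f per PG (> %d)"
            % (objects_per_pg, max_objects_per_pg))

    return notes


if __name__ == "__main__":
    # 12 OSDs as in the thread; replicas=2 is an assumption.
    for pg_num in (192, 400):
        print(pg_num,
              round(pg_replicas_per_osd(pg_num, 2, 12), 1),
              pg_advisories(pg_num, 2, 12))
```

Returning a list of advisory strings rather than raising or changing a health state mirrors the point made in the reply: a message of this kind could be surfaced separately from health warnings, so a sub-optimal but functioning cluster would not be reported as unhealthy.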