Re: Unexpected pg placement in degraded mode with custom crush rule

From: Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
To: Sage Weil <sage@inktank.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Unexpected pg placement in degraded mode with custom crush rule
Date: Fri, 05 Jul 2013 16:53:03 +1200
Message-ID: <51D6512F.1020501@catalyst.net.nz>
In-Reply-To: <51D64C59.3010004@catalyst.net.nz>

Retesting with 0.61.4:

Immediately after stopping 2 OSDs in rack1:

2013-07-05 16:23:02.852386 mon.0 [INF] pgmap v450: 1160 pgs: 1160 active+degraded; 2000 MB data, 12991 MB used, 6135 MB / 20150 MB avail; 100/200 degraded (50.000%)

... time passes:

2013-07-05 16:51:03.248198 mon.0 [INF] pgmap v465: 1160 pgs: 1160 active+degraded; 2000 MB data, 12993 MB used, 6133 MB / 20150 MB avail; 100/200 degraded (50.000%)
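
The two pgmap lines above can also be compared mechanically. Below is a minimal sketch that pulls the "degraded/total" counts out of such log text and recomputes the percentage. The function name degraded_pct is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper: read pgmap log text on stdin, find every
# "<degraded>/<total> degraded" fragment, and print the counts with
# the recomputed percentage.
degraded_pct() {
    grep -o '[0-9][0-9]*/[0-9][0-9]* degraded' |
    awk '{
        split($1, c, "/")            # c[1] = degraded, c[2] = total
        if (c[2] > 0)
            printf "%d/%d %.3f%%\n", c[1], c[2], 100 * c[1] / c[2]
    }'
}
```

Feeding both log lines through it and getting the same ratio twice matches what is seen here: nothing was re-replicated between 16:23 and 16:51.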

So it looks like Cuttlefish is behaving as expected. Is this due to tweaks 
in the 'choose' algorithm in the later code?

Cheers

Mark

On 05/07/13 16:32, Mark Kirkwood wrote:
> Hi Sage,
>
> I don't believe so. I'm loading the objects directly from another host 
> (which is running 0.64 built from src) with:
>
> $ rados -m 192.168.122.21 -p obj put smallnode$n.dat smallnode.dat   # n = 0..99
>
> and the OSDs are all running 0.56.6, so I don't think there is any 
> kernel rbd or librbd involved.
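
For anyone reproducing the load step, the quoted rados command with n running from 0 to 99 can be written as a loop. This is only a sketch: the function name put_objects and the RADOS override variable are assumptions added here so the loop can be dry-run without a cluster. The rados arguments themselves are taken from the command quoted above.

```shell
#!/bin/sh
# Hypothetical helper: upload the same source file <count> times as
# smallnode0.dat .. smallnode<count-1>.dat.
# RADOS can be overridden (e.g. RADOS=echo) for a dry run.
put_objects() {
    mon=$1; pool=$2; src=$3; count=$4
    n=0
    while [ "$n" -lt "$count" ]; do
        "${RADOS:-rados}" -m "$mon" -p "$pool" put "smallnode$n.dat" "$src" || return 1
        n=$((n + 1))
    done
}

# Usage, matching the test described in this thread:
#   put_objects 192.168.122.21 obj smallnode.dat 100
```

Stopping at the first failed put (the `|| return 1`) is a design choice made here, so that a partial load is noticed rather than silently skewing the degraded counts.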
>
>
> I did try:
>
> $ ceph osd crush tunables optimal
>
> In one run - no difference.
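
One way to check whether the tunables change actually landed is to decompile the CRUSH map and look for its "tunable" lines. The sketch below only does the filtering step. The function name list_tunables is made up for illustration, and it assumes the decompiled map writes each tunable as "tunable <name> <value>". It is also an assumption, not something confirmed in this thread, that tunables left at their legacy values may not be printed at all, so an empty result does not prove anything on its own.

```shell
#!/bin/sh
# Hypothetical helper: print name=value for each "tunable" line of a
# decompiled CRUSH map read on stdin.
list_tunables() {
    awk '$1 == "tunable" { print $2 "=" $3 }'
}

# Usage (needs ceph and crushtool plus a reachable monitor; the /tmp
# paths are placeholders):
#   ceph osd getcrushmap -o /tmp/crush.bin
#   crushtool -d /tmp/crush.bin -o /tmp/crush.txt
#   list_tunables < /tmp/crush.txt
```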
>
> I have updated to 0.61.4 and am running the test again, will update 
> with the results!
>
> Cheers
>
> Mark
>
> On 05/07/13 16:01, Sage Weil wrote:
>> Hi Mark,
>>
>> If you're not using a kernel cephfs or rbd client older than ~3.9, or
>> ceph-fuse/librbd/librados older than bobtail, then you should
>>
>>   ceph osd crush tunables optimal
>>
>> and I suspect that this will suddenly work perfectly.  The defaults are
>> still using semi-broken legacy values because client support is pretty
>> new.  Trees like yours, with sparsely populated leaves, tend to be most
>> affected.
>>
>> (I bet you're seeing the rack separation rule violated because the
>> previous copy of the PG was already there and ceph won't throw out old
>> copies before creating new ones.)
>>
>>
