From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Priebe
Subject: Re: ceph stays degraded after crushmap rearrangement
Date: Sat, 05 Jan 2013 19:15:02 +0100
Message-ID: <50E86DA6.4040900@profihost.ag>
References: <50E85799.4060607@profihost.ag>
 <50E85CC9.9080503@profihost.ag>
 <50E85EB1.8060803@profihost.ag>
 <50E86008.30000@profihost.ag>
 <50E86708.6070909@profihost.ag>
 <50E86B57.4090708@profihost.ag>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail.profihost.ag ([85.158.179.208]:41669 "EHLO
 mail.profihost.ag" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755817Ab3AESO5 (ORCPT );
 Sat, 5 Jan 2013 13:14:57 -0500
In-Reply-To: <50E86B57.4090708@profihost.ag>
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil
Cc: "ceph-devel@vger.kernel.org"

Hi,

OK, I will go back to tag v0.56. With the testing branch, the status now
looks like this again:

   health HEALTH_WARN 5 pgs backfill; 9 pgs backfilling; 3433 pgs peering; 5111 pgs stale; 20 pgs stuck inactive; 34 pgs stuck unclean; recovery 129/79284 degraded (0.163%)
   monmap e1: 3 mons at {a=10.255.0.100:6789/0,b=10.255.0.101:6789/0,c=10.255.0.102:6789/0}, election epoch 2018, quorum 0,1,2 a,b,c
   osdmap e9518: 24 osds: 24 up, 24 in
   pgmap v2214634: 7632 pgs: 25 stale+active, 4160 stale+active+clean, 5 stale+active+remapped+wait_backfill, 2521 peering, 911 stale+peering, 9 stale+active+remapped+backfilling, 1 stale+remapped+peering; 152 GB data, 318 GB used, 5017 GB / 5336 GB avail; 129/79284 degraded (0.163%)
   mdsmap e1: 0/0/1 up

With v0.56 it was recovering successfully instead of going stale...

Stefan

On 05.01.2013 19:05, Stefan Priebe wrote:
> Hi,
>
> On 05.01.2013 18:56, Sage Weil wrote:
>>> But my rbd images are gone ?!
>>>
>>> [1202: ~]# rbd -p kvmpool1 ls
>>> [1202: ~]#
>>
>> Oh.. I think this is related to the librados/librbd compatibility issue I
>> mentioned yesterday.  Please make sure the clients (librados, librbd) are
>> also running the latest testing branch.
>
> Ah OK, thanks, that's it. Ceph has now also recovered completely with
> the old crushmap.
>
> OK, now back to my original problem.
>
> I wanted to change from this:
> -----------------------------------------
> ...
>
> rack D2-switchA {
>     id -100        # do not change unnecessarily
>     # weight 12.000
>     alg straw
>     hash 0    # rjenkins1
>     item server1263 weight 4.000
>     item server1264 weight 4.000
>     item server1265 weight 4.000
> }
> rack D2-switchB {
>     id -101        # do not change unnecessarily
>     # weight 12.000
>     alg straw
>     hash 0    # rjenkins1
>     item server1266 weight 4.000
>     item server1267 weight 4.000
>     item server1268 weight 4.000
> }
> root root {
>     id -10000        # do not change unnecessarily
>     # weight 24.000
>     alg straw
>     hash 0    # rjenkins1
>     item D2-switchA weight 12.000
>     item D2-switchB weight 12.000
> }
>
> ...
> -----------------------------------------
>
> to this one:
>
> -----------------------------------------
> ...
>
> rack D2 {
>     id -100        # do not change unnecessarily
>     # weight 24.000
>     alg straw
>     hash 0    # rjenkins1
>     item cloud1-1263 weight 4.000
>     item cloud1-1264 weight 4.000
>     item cloud1-1265 weight 4.000
>     item cloud1-1266 weight 4.000
>     item cloud1-1267 weight 4.000
>     item cloud1-1268 weight 4.000
> }
> root root {
>     id -10000        # do not change unnecessarily
>     # weight 24.000
>     alg straw
>     hash 0    # rjenkins1
>     item D2 weight 24.000
> }
>
> ...
> -----------------------------------------
>
> This was where all the problems started. Is this wrong, or not possible?
>
> Greets,
> Stefan
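
As a side note on the proposed map quoted above: before injecting an edited
crushmap, it may help to check that the weight each parent bucket declares
for a child matches the sum of that child's own items. The following is a
minimal sketch only, assuming the decompiled text format shown in this
thread. The helper names parse_buckets and weight_mismatches are
hypothetical, not part of any Ceph tool, and a clean result says nothing
about whether the cluster will peer or recover after the change.

```python
import re

# Bucket text copied from the proposed map quoted in this message.
PROPOSED = """
rack D2 {
    id -100        # do not change unnecessarily
    # weight 24.000
    alg straw
    hash 0    # rjenkins1
    item cloud1-1263 weight 4.000
    item cloud1-1264 weight 4.000
    item cloud1-1265 weight 4.000
    item cloud1-1266 weight 4.000
    item cloud1-1267 weight 4.000
    item cloud1-1268 weight 4.000
}
root root {
    id -10000        # do not change unnecessarily
    # weight 24.000
    alg straw
    hash 0    # rjenkins1
    item D2 weight 24.000
}
"""

# "<type> <name> { ... }" blocks and "item <name> weight <w>" lines.
BUCKET_RE = re.compile(r"(\w+)\s+([\w.-]+)\s*\{(.*?)\}", re.S)
ITEM_RE = re.compile(r"item\s+([\w.-]+)\s+weight\s+([\d.]+)")


def parse_buckets(text):
    """Return {bucket_name: (bucket_type, {item_name: weight})}."""
    text = re.sub(r"#.*", "", text)  # drop comments, including "# weight" hints
    buckets = {}
    for btype, name, body in BUCKET_RE.findall(text):
        items = {item: float(weight) for item, weight in ITEM_RE.findall(body)}
        buckets[name] = (btype, items)
    return buckets


def weight_mismatches(buckets):
    """List (parent, child, declared, actual) tuples where the weight a
    parent declares for a child bucket differs from the sum of the
    child's own item weights."""
    problems = []
    for parent, (_, items) in buckets.items():
        for child, declared in items.items():
            if child in buckets:  # only items that are themselves buckets
                actual = sum(buckets[child][1].values())
                if abs(actual - declared) > 1e-9:
                    problems.append((parent, child, declared, actual))
    return problems
```

Calling weight_mismatches(parse_buckets(PROPOSED)) returns a list of
inconsistent parent/child pairs. Host buckets elided by "..." in the
quoted map are not defined in the sample text, so they are simply not
checked.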