ceph stays degraded after crushmap rearrangement


* ceph stays degraded after crushmap rearrangement
@ 2013-01-05 16:40 Stefan Priebe
  2013-01-05 17:03 ` Stefan Priebe
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 16:40 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hi list,

I've rearranged my crushmap. Ceph was about 18% degraded and was 
recovering / rearranging fine.

But now it has stalled and the degraded count is rising??

2013-01-05 17:35:40.906587 mon.0 [INF] pgmap v2211269: 7632 pgs: 7632 
active+remapped; 152 GB data, 312 GB used, 5023 GB / 5336 GB avail; 
22/79086 degraded (0.028%)

...

2013-01-05 17:37:50.142106 mon.0 [INF] pgmap v2211386: 7632 pgs: 7632 
active+remapped; 152 GB data, 312 GB used, 5023 GB / 5336 GB avail; 
24/79090 degraded (0.030%)

..

2013-01-05 17:40:35.292054 mon.0 [INF] pgmap v2211526: 7632 pgs: 7632 
active+remapped; 152 GB data, 313 GB used, 5023 GB / 5336 GB avail; 
32/79106 degraded (0.040%)
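
A rough way to check the trend in the three pgmap lines above is to pull out the degraded counters and compare them. This is only a sketch: the helper name `parse_degraded` and the regular expression are assumptions made for illustration and are not part of any ceph tool, and each wrapped log line has been joined back onto a single line.

```python
import re

# The three pgmap lines from above, each joined back onto one line.
PGMAP_LINES = [
    "2013-01-05 17:35:40.906587 mon.0 [INF] pgmap v2211269: 7632 pgs: 7632 "
    "active+remapped; 152 GB data, 312 GB used, 5023 GB / 5336 GB avail; "
    "22/79086 degraded (0.028%)",
    "2013-01-05 17:37:50.142106 mon.0 [INF] pgmap v2211386: 7632 pgs: 7632 "
    "active+remapped; 152 GB data, 312 GB used, 5023 GB / 5336 GB avail; "
    "24/79090 degraded (0.030%)",
    "2013-01-05 17:40:35.292054 mon.0 [INF] pgmap v2211526: 7632 pgs: 7632 "
    "active+remapped; 152 GB data, 313 GB used, 5023 GB / 5336 GB avail; "
    "32/79106 degraded (0.040%)",
]

# Hypothetical pattern: pgmap version, then the "<degraded>/<total> degraded" counter.
DEGRADED_RE = re.compile(r"pgmap v(\d+):.*?(\d+)/(\d+) degraded")


def parse_degraded(line):
    """Return (pgmap_version, degraded, total), or None if the line has no counter."""
    match = DEGRADED_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


samples = [parse_degraded(line) for line in PGMAP_LINES]
for version, degraded, total in samples:
    print(f"v{version}: {degraded}/{total} = {100.0 * degraded / total:.3f}%")

# True when every sample has a strictly higher degraded count than the one before.
rising = all(a[1] < b[1] for a, b in zip(samples, samples[1:]))
print("degraded count rising:", rising)
```

The degraded count goes 22, 24, 32 while every pg stays active+remapped, so the cluster is slowly getting worse rather than converging.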

I'm on the current testing branch.

Greets,
Stefan


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 16:40 ceph stays degraded after crushmap rearrangement Stefan Priebe
@ 2013-01-05 17:03 ` Stefan Priebe
  2013-01-05 17:06   ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 17:03 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hello,

Now I cannot even access an rbd image anymore.

Hanging status is now:
2013-01-05 18:01:21.736298 mon.0 [INF] pgmap v2212193: 7632 pgs: 1 
stale, 10 peering, 14 stale+peering, 1 stale+remapped, 1807 
stale+active+remapped, 1 stale+active+degraded, 2587 remapped+peering, 
1767 stale+remapped+peering, 1341 stale+active+degraded+remapped, 103 
stale+active+replay+degraded+remapped; 152 GB data, 313 GB used, 5022 GB 
/ 5336 GB avail; 7647/79122 degraded (9.665%)


Stefan

* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:03 ` Stefan Priebe
@ 2013-01-05 17:06   ` Sage Weil
  2013-01-05 17:11     ` Stefan Priebe
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2013-01-05 17:06 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

On Sat, 5 Jan 2013, Stefan Priebe wrote:
> Hello,
> 
> now i cannot even access an rbd image anymore.
> 
> Hanging status is now:
> 2013-01-05 18:01:21.736298 mon.0 [INF] pgmap v2212193: 7632 pgs: 1 stale, 10
> peering, 14 stale+peering, 1 stale+remapped, 1807 stale+active+remapped, 1
> stale+active+degraded, 2587 remapped+peering, 1767 stale+remapped+peering,
> 1341 stale+active+degraded+remapped, 103
> stale+active+replay+degraded+remapped; 152 GB data, 313 GB used, 5022 GB /
> 5336 GB avail; 7647/79122 degraded (9.665%)

It looks like some of the ceph-osds stopped.

Are all daemons running the testing branch code?  What does 'ceph -s' say?  
Or 'ceph pg <pgid> query' on a random active+remapped pgid?

sage



* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:06   ` Sage Weil
@ 2013-01-05 17:11     ` Stefan Priebe
  2013-01-05 17:16       ` Stefan Priebe
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 17:11 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,

I just stopped EVERYTHING and have now started ALL OSDs again. It seems 
to recover now. But here is the output.

On 05.01.2013 18:06, Sage Weil wrote:
> It looks like some of the ceph-osds stopped.

Yes, they just run at 100% CPU but do nothing.

> Are all daemons running the testing branch code?
Yes.

> What does 'ceph -s' say?
    health HEALTH_WARN 1247 pgs degraded; 4105 pgs peering; 4414 pgs 
stale; 3876 pgs stuck inactive; 4394 pgs stuck stale; 7632 pgs stuck 
unclean; recovery 6503/79122 degraded (8.219%)
    monmap e1: 3 mons at 
{a=10.255.0.100:6789/0,b=10.255.0.101:6789/0,c=10.255.0.102:6789/0}, 
election epoch 1990, quorum 0,1,2 a,b,c
    osdmap e8292: 24 osds: 24 up, 24 in
     pgmap v2212272: 7632 pgs: 1 stale, 119 peering, 467 
active+remapped, 6 active+degraded, 24 stale+peering, 1 stale+remapped, 
1748 stale+active+remapped, 63 active+replay+remapped, 1 
stale+active+degraded, 2563 remapped+peering, 1399 
stale+remapped+peering, 1154 stale+active+degraded+remapped, 86 
stale+active+replay+degraded+remapped; 152 GB data, 313 GB used, 5022 GB 
/ 5336 GB avail; 6503/79122 degraded (8.219%)
    mdsmap e1: 0/0/1 up

> Or 'ceph pg <pgid> query' on a random active+remapped pgid?
# ceph pg 3.b53 query

{ "state": "active+remapped",
   "up": [
         53],
   "acting": [
         53,
         32],
   "info": { "pgid": "3.b53",
       "last_update": "7137'9942",
       "last_complete": "7137'9942",
       "log_tail": "6452'8941",
       "last_backfill": "MAX",
       "purged_snaps": "[1~69,6b~724]",
       "history": { "epoch_created": 10,
           "last_epoch_started": 8291,
           "last_epoch_clean": 8291,
           "last_epoch_split": 0,
           "same_up_since": 8284,
           "same_interval_since": 8284,
           "same_primary_since": 8284,
           "last_scrub": "7137'9942",
           "last_scrub_stamp": "2013-01-05 15:28:03.766723",
           "last_deep_scrub": "6644'9328",
           "last_deep_scrub_stamp": "2012-12-30 15:27:19.596947"},
       "stats": { "version": "7137'9942",
           "reported": "8284'13320",
           "state": "active+remapped",
           "last_fresh": "2013-01-05 18:10:06.987730",
           "last_change": "2013-01-05 18:09:03.891013",
           "last_active": "2013-01-05 18:10:06.987730",
           "last_clean": "2013-01-05 17:00:45.793351",
           "last_unstale": "2013-01-05 18:10:06.987730",
           "mapping_epoch": 8283,
           "log_start": "6452'8941",
           "ondisk_log_start": "6452'8941",
           "created": 10,
           "last_epoch_clean": 10,
           "parent": "0.0",
           "parent_split_bits": 0,
           "last_scrub": "7137'9942",
           "last_scrub_stamp": "2013-01-05 15:28:03.766723",
           "last_deep_scrub": "6644'9328",
           "last_deep_scrub_stamp": "2012-12-30 15:27:19.596947",
           "log_size": 155155,
           "ondisk_log_size": 155155,
           "stats_invalid": "0",
           "stat_sum": { "num_bytes": 54525952,
               "num_objects": 13,
               "num_object_clones": 0,
               "num_object_copies": 0,
               "num_objects_missing_on_primary": 0,
               "num_objects_degraded": 0,
               "num_objects_unfound": 0,
               "num_read": 0,
               "num_read_kb": 0,
               "num_write": 9933,
               "num_write_kb": 1130756},
           "stat_cat_sum": {},
           "up": [
                 53],
           "acting": [
                 53,
                 32]},
       "empty": 0,
       "dne": 0,
       "incomplete": 0,
       "last_epoch_started": 8291},
   "recovery_state": [
         { "name": "Started\/Primary\/Active",
           "enter_time": "2013-01-05 18:09:03.890171",
           "might_have_unfound": [],
           "recovery_progress": { "backfill_target": -1,
               "waiting_on_backfill": 0,
               "backfill_pos": "0\/\/0\/\/-1",
               "backfill_info": { "begin": "0\/\/0\/\/-1",
                   "end": "0\/\/0\/\/-1",
                   "objects": []},
               "peer_backfill_info": { "begin": "0\/\/0\/\/-1",
                   "end": "0\/\/0\/\/-1",
                   "objects": []},
               "backfills_in_flight": [],
               "pull_from_peer": [],
               "pushing": []},
           "scrub": { "scrubber.epoch_start": "0",
               "scrubber.active": 0,
               "scrubber.block_writes": 0,
               "scrubber.finalizing": 0,
               "scrubber.waiting_on": 0,
               "scrubber.waiting_on_whom": []}},
         { "name": "Started",
           "enter_time": "2013-01-05 18:08:41.848771"}]}

Stefan
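
The pg query output above can be reduced to the one comparison that explains the "remapped" flag: the "up" set versus the "acting" set. The following is a minimal sketch that works on a hand-trimmed copy of that output; the `summarize` helper is hypothetical and only reads fields that appear in the output.

```python
import json

# Hand-trimmed copy of the 'ceph pg 3.b53 query' output (only the fields used here).
# A raw string keeps the "\/" escapes exactly as ceph printed them.
PG_QUERY = r'''
{ "state": "active+remapped",
  "up": [53],
  "acting": [53, 32],
  "info": { "pgid": "3.b53",
            "stats": { "stat_sum": { "num_objects": 13,
                                     "num_objects_degraded": 0 } } },
  "recovery_state": [ { "name": "Started\/Primary\/Active" } ] }
'''


def summarize(pg):
    """Compare the up set with the acting set of one pg."""
    up, acting = pg["up"], pg["acting"]
    return {
        "pgid": pg["info"]["pgid"],
        "state": pg["state"],
        "up": up,
        "acting": acting,
        # OSDs still serving the pg that are not in the current up set.
        "acting_only": [osd for osd in acting if osd not in up],
        "up_matches_acting": up == acting,
    }


summary = summarize(json.loads(PG_QUERY))
print(summary)
```

For pg 3.b53 the up set holds only osd 53, while the acting set still holds 53 and 32, so the pg stays remapped even though no objects are counted as degraded.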


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:11     ` Stefan Priebe
@ 2013-01-05 17:16       ` Stefan Priebe
  2013-01-05 17:40         ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 17:16 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,
On 05.01.2013 18:11, Stefan Priebe wrote:
> Hi,
>
> i just stopped EVERYTHING and have now started ALL osds again. It seems
> to recover now. But here is the output.
Just an illusion. Still hangs.

# ceph -s
    health HEALTH_WARN 934 pgs degraded; 23 pgs down; 1887 pgs peering; 
1330 pgs stale; 670 pgs stuck inactive; 882 pgs stuck stale; 7632 pgs 
stuck unclean; recovery 4811/79122 degraded (6.080%)
    monmap e1: 3 mons at 
{a=10.255.0.100:6789/0,b=10.255.0.101:6789/0,c=10.255.0.102:6789/0}, 
election epoch 1996, quorum 0,1,2 a,b,c
    osdmap e8393: 24 osds: 24 up, 24 in
     pgmap v2212487: 7632 pgs: 475 peering, 4013 active+remapped, 18 
down+peering, 490 active+degraded, 798 stale+active+remapped, 1 
active+replay+degraded, 1305 remapped+peering, 84 
stale+remapped+peering, 5 stale+down+remapped+peering, 364 
stale+active+degraded+remapped, 79 
stale+active+replay+degraded+remapped; 152 GB data, 314 GB used, 5021 GB 
/ 5336 GB avail; 4811/79122 degraded (6.080%)
    mdsmap e1: 0/0/1 up
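
The per-state counts in the pgmap line above can be cross-checked against the health summary by tallying each flag separately. This is a sketch only; `tally_flags` is a hypothetical helper, and the state list is the one from the pgmap line with the wrapping removed.

```python
from collections import Counter

# State list from pgmap v2212487 above, joined back onto one logical line.
PGMAP_STATES = (
    "475 peering, 4013 active+remapped, 18 down+peering, 490 active+degraded, "
    "798 stale+active+remapped, 1 active+replay+degraded, 1305 remapped+peering, "
    "84 stale+remapped+peering, 5 stale+down+remapped+peering, "
    "364 stale+active+degraded+remapped, 79 stale+active+replay+degraded+remapped"
)


def tally_flags(states):
    """Return (total pg count, Counter of pgs per individual state flag)."""
    total = 0
    per_flag = Counter()
    for entry in states.split(","):
        count_text, combined = entry.split()
        count = int(count_text)
        total += count
        # A pg in "stale+active+remapped" counts once for each of its flags.
        for flag in combined.split("+"):
            per_flag[flag] += count
    return total, per_flag


total, per_flag = tally_flags(PGMAP_STATES)
print("total pgs:", total)
for flag in ("degraded", "down", "peering", "stale"):
    print(flag, per_flag[flag])
```

The tallies come out to 7632 pgs in total, with 934 degraded, 23 down, 1887 peering and 1330 stale, which agrees with the HEALTH_WARN line.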

Greets,
Stefan


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:16       ` Stefan Priebe
@ 2013-01-05 17:40         ` Sage Weil
  2013-01-05 17:46           ` Stefan Priebe
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2013-01-05 17:40 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

On Sat, 5 Jan 2013, Stefan Priebe wrote:
> Hi,
> On 05.01.2013 18:11, Stefan Priebe wrote:
> > Hi,
> > 
> > i just stopped EVERYTHING and have now started ALL osds again. It seems
> > to recover now. But here is the output.
> Just an illusion. Still hangs.

Can you turn up logging, or attach with gdb, so we can see what they are 
doing with all that CPU?

s



* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:40         ` Sage Weil
@ 2013-01-05 17:46           ` Stefan Priebe
  2013-01-05 17:56             ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 17:46 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,
On 05.01.2013 18:40, Sage Weil wrote:
> On Sat, 5 Jan 2013, Stefan Priebe wrote:
>> Hi,
>> On 05.01.2013 18:11, Stefan Priebe wrote:
>>> Hi,
>>>
>>> i just stopped EVERYTHING and have now started ALL osds again. It seems
>>> to recover now. But here is the output.
>> Just an illusion. Still hangs.
>
> Can you turn up logging, or attach with gdb, so we can see what they are
> doing with all that CPU?

Right now I've imported the OLD crushmap and I have no stale PGs or 
hanging OSDs anymore.

But my rbd images are gone?!

[1202: ~]# rbd -p kvmpool1 ls
[1202: ~]#

Greets
Stefan


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:46           ` Stefan Priebe
@ 2013-01-05 17:56             ` Sage Weil
  2013-01-05 18:05               ` Stefan Priebe
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2013-01-05 17:56 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

On Sat, 5 Jan 2013, Stefan Priebe wrote:
> Hi,
> On 05.01.2013 18:40, Sage Weil wrote:
> > On Sat, 5 Jan 2013, Stefan Priebe wrote:
> > > Hi,
> > > On 05.01.2013 18:11, Stefan Priebe wrote:
> > > > Hi,
> > > > 
> > > > i just stopped EVERYTHING and have now started ALL osds again. It seems
> > > > to recover now. But here is the output.
> > > Just an illusion. Still hangs.
> > 
> > Can you turn up logging, or attach with gdb, so we can see what they are
> > doing with all that CPU?
> 
> Right now i've imported the OLD crushmap and i've no stale PGs nor hanging
> OSDs anymore.
> 
> But my rbd images are gone ?!
> 
> [1202: ~]# rbd -p kvmpool1 ls
> [1202: ~]#

Oh.. I think this is related to the librados/librbd compatibility issue I 
mentioned yesterday.  Please make sure the clients (librados, librbd) are 
also running the latest testing branch.

sage


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 17:56             ` Sage Weil
@ 2013-01-05 18:05               ` Stefan Priebe
  2013-01-05 18:15                 ` Stefan Priebe
  2013-01-05 18:46                 ` Sage Weil
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 18:05 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,

On 05.01.2013 18:56, Sage Weil wrote:
>> But my rbd images are gone ?!
>>
>> [1202: ~]# rbd -p kvmpool1 ls
>> [1202: ~]#
>
> Oh.. I think this is related to the librados/librbd compatibility issue I
> mentioned yesterday.  Please make sure the clients (librados, librbd) are
> also running the latest testing branch.

Ah OK, thanks, that was it. Ceph has now also recovered completely with 
the old crushmap.

OK, now back to my original problem.

I wanted to change from this:
-----------------------------------------
...

rack D2-switchA {
         id -100         # do not change unnecessarily
         # weight 12.000
         alg straw
         hash 0  # rjenkins1
         item server1263 weight 4.000
         item server1264 weight 4.000
         item server1265 weight 4.000
}
rack D2-switchB {
         id -101         # do not change unnecessarily
         # weight 12.000
         alg straw
         hash 0  # rjenkins1
         item server1266 weight 4.000
         item server1267 weight 4.000
         item server1268 weight 4.000
}
root root {
         id -10000               # do not change unnecessarily
         # weight 24.000
         alg straw
         hash 0  # rjenkins1
         item D2-switchA weight 12.000
         item D2-switchB weight 12.000
}

...
-----------------------------------------

to this one:

-----------------------------------------
...

rack D2 {
         id -100         # do not change unnecessarily
         # weight 24.000
         alg straw
         hash 0  # rjenkins1
         item cloud1-1263 weight 4.000
         item cloud1-1264 weight 4.000
         item cloud1-1265 weight 4.000
         item cloud1-1266 weight 4.000
         item cloud1-1267 weight 4.000
         item cloud1-1268 weight 4.000
}
root root {
         id -10000               # do not change unnecessarily
         # weight 24.000
         alg straw
         hash 0  # rjenkins1
         item D2 weight 24.000
}

...
-----------------------------------------

This is where all the problems started. Is this wrong, or not possible?

Greets,
Stefan


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 18:05               ` Stefan Priebe
@ 2013-01-05 18:15                 ` Stefan Priebe
  2013-01-05 18:46                 ` Sage Weil
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Priebe @ 2013-01-05 18:15 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel@vger.kernel.org

Hi,

OK, I will go back to tag v0.56.

With the testing branch it now looks like this again:

    health HEALTH_WARN 5 pgs backfill; 9 pgs backfilling; 3433 pgs 
peering; 5111 pgs stale; 20 pgs stuck inactive; 34 pgs stuck unclean; 
recovery 129/79284 degraded (0.163%)
    monmap e1: 3 mons at 
{a=10.255.0.100:6789/0,b=10.255.0.101:6789/0,c=10.255.0.102:6789/0}, 
election epoch 2018, quorum 0,1,2 a,b,c
    osdmap e9518: 24 osds: 24 up, 24 in
     pgmap v2214634: 7632 pgs: 25 stale+active, 4160 stale+active+clean, 
5 stale+active+remapped+wait_backfill, 2521 peering, 911 stale+peering, 
9 stale+active+remapped+backfilling, 1 stale+remapped+peering; 152 GB 
data, 318 GB used, 5017 GB / 5336 GB avail; 129/79284 degraded (0.163%)
    mdsmap e1: 0/0/1 up

With v0.56 it was recovering successfully instead of going stale...

Stefan


* Re: ceph stays degraded after crushmap rearrangement
  2013-01-05 18:05               ` Stefan Priebe
  2013-01-05 18:15                 ` Stefan Priebe
@ 2013-01-05 18:46                 ` Sage Weil
  1 sibling, 0 replies; 11+ messages in thread
From: Sage Weil @ 2013-01-05 18:46 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: ceph-devel@vger.kernel.org

Followed up on IRC.  The rbd thing was the library version mismatch (see 
my email last night), and the crush issue looks like it was related to 
the rule and having only 1 rack, but we'll see when Stefan tries again.
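
To illustrate the single-rack point: the rule itself is not shown in this thread, so the following is a toy model under two assumptions, namely that the rule places each replica in a different rack (the kind of thing "step chooseleaf firstn 0 type rack" expresses) and that the pool keeps 2 replicas. The `place` helper and its SHA-256 ranking are hypothetical stand-ins and not CRUSH's actual algorithm; only the bucket and host names come from the two crushmap excerpts.

```python
import hashlib

# Rack -> hosts, taken from the two crushmap excerpts in this thread.
OLD_MAP = {
    "D2-switchA": ["server1263", "server1264", "server1265"],
    "D2-switchB": ["server1266", "server1267", "server1268"],
}
NEW_MAP = {
    "D2": ["cloud1-1263", "cloud1-1264", "cloud1-1265",
           "cloud1-1266", "cloud1-1267", "cloud1-1268"],
}


def _rank(pg, name):
    """Deterministic pseudo-random rank; a stand-in for CRUSH hashing, not the real thing."""
    return hashlib.sha256(f"{pg}:{name}".encode()).hexdigest()


def place(pg, racks, replicas):
    """Toy 'one replica per rack' placement.

    Picks up to `replicas` distinct racks, then one host inside each chosen rack.
    It can never return more hosts than there are racks.
    """
    chosen_racks = sorted(racks, key=lambda rack: _rank(pg, rack))[:replicas]
    return [min(racks[rack], key=lambda host: _rank(pg, host)) for rack in chosen_racks]


for label, racks in (("old map", OLD_MAP), ("new map", NEW_MAP)):
    hosts = place("3.b53", racks, replicas=2)
    print(f"{label}: {len(hosts)} of 2 replicas placed -> {hosts}")
```

With two racks the model finds two hosts; with the single rack D2 it can only ever find one. That would be consistent with the pg query earlier in the thread, which shows one OSD in "up" but two in "acting".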

sage


