From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vladimir Bashkirtsev
Subject: Stuck OSD phantom
Date: Mon, 04 Jun 2012 11:29:08 +0930
Message-ID: <4FCC166C.30202@bashkirtsev.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail.logics.net.au ([150.101.56.178]:39904 "EHLO mail.logics.net.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755598Ab2FDB7V (ORCPT ); Sun, 3 Jun 2012 21:59:21 -0400
Received: from x.logics.net.au (gw.logics.net.au [150.101.235.251] (may be forged)) (authenticated bits=0) by mail.logics.net.au (8.14.5/8.14.1) with ESMTP id q541x8VB026642 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Mon, 4 Jun 2012 11:29:14 +0930
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: ceph-devel@vger.kernel.org

Dear devs,

While playing around with ceph on six OSDs I decided to retire two OSDs simultaneously (I use three-way replication, so ceph should withstand such damage) to see how ceph would cope. I tried it in different ways to get ceph off the rails, and it looks like I have managed it. :)

First, I killed the OSDs by pulling them out and then running ceph osd lost for each of them. That worked as expected. Ceph kept a record of the former OSDs even though it did not try to use them, which looks correct. Then I recreated the OSDs, and they simply came back online and filled up with data again. Again, that is what is expected.

Finally, I tried a planned removal of the OSDs:

    ceph osd crush remove 3
    ceph osd rm osd.3

Ceph complained that the osd was still up. I shut the OSD down and tried again: success. I did the same with the second OSD. Everything still looked fine.

Then, by accident (and that is perhaps a good test), I rebooted the box running osd.3, which still had the ceph osd in rc. So osd.3 started without knowing that it had been evicted from the cluster.
The cluster took it back and osd.3 rejoined (it did not get any load, however, because it had been removed from crush). I removed it from rc, shut it down, and ran ceph osd crush remove 3 (just to be certain) and ceph osd rm osd.3; both succeeded. But now osd.3 is still counted towards the total cluster capacity: osd dump shows it as nonexistent, while pg dump shows it as still a member of the cluster:

[root@x ceph]# ceph osd dump
dumped osdmap epoch 14892
epoch 14892
fsid 7719f573-4c48-4852-a27f-51c7a3fe1c1e
created 2012-03-31 04:47:12.130128
modifed 2012-06-04 11:16:57.687645
flags

pool 0 'data' rep size 3 crush_ruleset 0 object_hash rjenkins pg_num 192 pgp_num 192 last_change 13812 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 3 crush_ruleset 1 object_hash rjenkins pg_num 192 pgp_num 192 last_change 13815 owner 0
pool 2 'rbd' rep size 3 crush_ruleset 2 object_hash rjenkins pg_num 192 pgp_num 192 last_change 13817 owner 0

max_osd 6
osd.0 up in weight 1 up_from 14407 up_thru 14890 down_at 14400 last_clean_interval [14383,14399) 172.16.64.200:6801/25023 172.16.64.200:6802/25023 172.16.64.200:6803/25023 exists,up
osd.1 up in weight 1 up_from 14420 up_thru 14890 down_at 14413 last_clean_interval [14388,14412) lost_at 11147 172.16.64.201:6800/5719 172.16.64.201:6801/5719 172.16.64.201:6802/5719 exists,up 2c7ca892-e83c-4158-a3ae-7c4f96f040b0
osd.4 up in weight 1 up_from 14432 up_thru 14890 down_at 14425 last_clean_interval [14393,14424) lost_at 13373 172.16.64.204:6800/17419 172.16.64.204:6802/17419 172.16.64.204:6803/17419 exists,up 19703275-74c3-403b-8647-85cc4f7ad870
osd.5 up in weight 1 up_from 14448 up_thru 14890 down_at 14438 last_clean_interval [14366,14437) 172.16.64.205:6800/7021 172.16.64.205:6801/7021 172.16.64.205:6802/7021 exists,up 699a39ca-3806-4c4f-9cdc-76cbed61b2ab

[root@x ceph]# ceph pg dump
dumped all in format plain
version 2223459
last_osdmap_epoch 14892
last_pg_scan 12769
full_ratio 0.95
nearfull_ratio 0.85
<-snip->
pool 0  21404   0  0  0  40635329394   23681638  23681638
pool 1  114     0  0  0  234237438     4481899   4481899
pool 2  113241  0  2  0  473699447111  27387426  27387426
 sum    134759  0  2  0  514569013943  55550963  55550963
osdstat  kbused      kbavail    kb          hb in      hb out
0        399440780   316714764  744751104   [1,4,5]    []
1        400369588   125798956  546603008   [0,4,5]    []
3        130380      90124804   94470144    [0,1,4,5]  []
4        387384720   132412912  540409856   [0,1,5]    []
5        344705816   233764680  600997888   [0,1,4]    []
 sum     1532031284  898816116  2527232000

Any idea how to get rid of it completely?

Regards,
Vladimir
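
The mismatch between the two dumps can be checked with a few lines of Python. This is only a minimal sketch: the figures are copied by hand from the osdstat table and the osd dump in this message, and the names OSDSTAT, REPORTED_SUM, OSDMAP_IDS, column_sums and stale_osds are made up for illustration, not part of any ceph tool. It confirms that the reported sum line includes osd.3 and shows what the totals would be without it.

```python
# Rows copied from the osdstat table in this message:
# (osd id, kbused, kbavail, kb)
OSDSTAT = [
    (0, 399440780, 316714764, 744751104),
    (1, 400369588, 125798956, 546603008),
    (3, 130380, 90124804, 94470144),
    (4, 387384720, 132412912, 540409856),
    (5, 344705816, 233764680, 600997888),
]

# The "sum" line printed by ceph pg dump (kbused, kbavail, kb).
REPORTED_SUM = (1532031284, 898816116, 2527232000)

# OSD ids listed by ceph osd dump in this message (hand-copied assumption).
OSDMAP_IDS = {0, 1, 4, 5}


def column_sums(rows):
    """Sum the kbused, kbavail and kb columns over the given rows."""
    return tuple(sum(row[i] for row in rows) for i in (1, 2, 3))


def stale_osds(rows, osdmap_ids):
    """Return ids that have an osdstat row but are absent from the osdmap."""
    return sorted(row[0] for row in rows if row[0] not in osdmap_ids)


totals = column_sums(OSDSTAT)
stale = stale_osds(OSDSTAT, OSDMAP_IDS)
live_totals = column_sums([row for row in OSDSTAT if row[0] in OSDMAP_IDS])

print("sum over all osdstat rows:", totals)
print("matches reported sum:", totals == REPORTED_SUM)
print("osdstat rows with no osdmap entry:", stale)
print("sum over osdmap members only:", live_totals)
```

The gap between the two kb totals is exactly the 94470144 kB of the osd.3 row, which is the capacity still being attributed to the removed OSD.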