* Re: Can't start osd- one osd alway be down.
From: Ta Ba Tuan @ 2014-10-25 14:57 UTC (permalink / raw)
To: ceph-users-idqoXFIVOFJgJs9I8MT0rw,
ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
# ceph pg 6.9d8 query
...
"peer_info": [
{ "peer": "49",
"pgid": "6.9d8",
"last_update": "102889'7801917",
"last_complete": "102889'7801917",
"log_tail": "102377'7792649",
"last_user_version": 7801879,
"last_backfill": "MAX",
"purged_snaps": "[1~7,9~44b,455~1f8,64f~63,6b3~3a,6ee~12f,81f~10,830~8,839~69b,ed7~7,edf~4,ee4~6f5,15da~f9,16d4~1f,16f5~7,16fd~4,1705~5e,1764~7,1771~78,17eb~12,1800~2,1803~d,1812~3,181a~1,181c~a,1827~3b,1863~1,1865~1,1867~1,186b~e,187a~3,1881~1,1884~7,188c~1,188f~3,1894~5,189f~2,18ab~1,18c6~1,1922~13,193d~1,1940~1,194a~1,1968~5,1975~1,1979~4,197e~4,1984~1,1987~11,199c~1,19a0~1,19a3~9,19ad~3,19b2~1,19b6~27,19de~8]",
"history": { "epoch_created": 164,
"last_epoch_started": 102888,
"last_epoch_clean": 102888,
"last_epoch_split": 0
"parent_split_bits": 0,
"last_scrub": "91654'7460936",
"last_scrub_stamp": "2014-10-10 10:36:25.433016",
"last_deep_scrub": "81667'5815892",
"last_deep_scrub_stamp": "2014-08-29 09:44:14.012219",
"last_clean_scrub_stamp": "2014-10-10 10:36:25.433016",
"log_size": 9229,
"ondisk_log_size": 9229,
"stats_invalid": "1",
"stat_sum": { "num_bytes": 17870536192,
"num_objects": 4327,
"num_object_clones": 29,
"num_object_copies": 12981,
"num_objects_missing_on_primary": 4,
"num_objects_degraded": 4,
"num_objects_unfound": 0,
"num_objects_dirty": 1092,
"num_whiteouts": 0,
"num_read": 4820626,
"num_read_kb": 59073045,
"num_write": 12748709,
"num_write_kb": 181630845,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 135847,
"num_bytes_recovered": 562255538176,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0},
On 10/25/2014 07:40 PM, Ta Ba Tuan wrote:
> My Ceph cluster hung, and the log reported: "osd.21
> 172.30.5.2:6870/8047 879 : [ERR] 6.9d8 has 4 objects unfound and
> apparently lost".
>
> After I restarted all the Ceph data nodes, I can't start osd.21. There
> are many log entries about pg 6.9d8, such as:
>
> -440> 2014-10-25 19:28:17.468161 7fec5731d700 5 -- op tracker --
> seq: 3083, time: 2014-10-25 19:28:17.468161, event: reached_pg, op:
> MOSDPGPush(6.9d8 102856
> [PushOp(e8de59d8/rbd_data.4d091f7304c844.000000000000e871/head//6,
> version: 102853'7800592, data_included: [0~4194304], data_size:
> 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2,
> recovery_info:
> ObjectRecoveryInfo(e8de59d8/rbd_data.4d091f7304c844.000000000000e871/head//6@102853'7800592,
> copy_subset: [0~4194304], clone_subset: {}), after_progress:
> ObjectRecoveryProgress(!first, data_recovered_to:4194304,
> data_complete:true, omap_recovered_to:, omap_complete:true),
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0,
> data_complete:false, omap_recovered_to:, omap_complete:false))])
>
> I think some objects are in error. What should I do, please?
> Thanks!
> --
> Tuan
> HaNoi-VietNam
>
>
> On 10/25/2014 03:01 PM, Ta Ba Tuan wrote:
>> I'm sending some related log entries:
>> (osd.21 cannot be started)
>>
>> -8705> 2014-10-25 14:41:04.345727 7f12bac2f700 5 osd.21 pg_epoch:
>> 102843 pg[6.5e1( v 102843'11832159
>> (102377'11822991,102843'11832159] lb
>> c4951de1/rbd_data.3955c5cdbb2ea.00000000000405f0/head//6
>> local-les=101780 n=4719 ec=164 les/c 102841/102838
>> 102840/102840/102477) [40,0,21]/[40,0,60] r=-1 lpr=102840
>> pi=31832-102839/230 luod=0'0 crt=102843'11832157 lcod 102843'11832158
>> active+remapped] exit Started/ReplicaActive/RepNotRecovering
>> 0.000170 1 0.000296
>>
>> -1637> 2014-10-25 14:41:14.326580 7f12bac2f700 5 osd.21 pg_epoch:
>> 102843 pg[2.23b( v 102839'91984 (91680'88526,102839'91984]
>> local-les=102841 n=85 ec=25000 les/c 102841/102838
>> 102840/102840/102656) [90,21,120] r=1 lpr=102840 pi=100114-102839/50
>> luod=0'0 crt=102839'91984 active] enter
>> Started/ReplicaActive/RepNotRecovering
>>
>> -437> 2014-10-25 14:41:15.042174 7f12ba42e700 5 osd.21 pg_epoch:
>> 102843 pg[27.239( v 102808'38419 (81621'35409,102808'38419]
>> local-les=102841 n=23 ec=25085 les/c 102841/102838
>> 102840/102840/102656) [90,21,120] r=1 lpr=102840 pi=100252-102839/53
>> luod=0'0 crt=102808'38419 active] enter
>> Started/ReplicaActive/RepNotRecovering
>>
>> Thanks!
>>
>>
>> On 10/25/2014 11:26 AM, Ta Ba Tuan wrote:
>>> Hi Craig, thanks for replying.
>>> When I started that OSD, the log from "ceph -w" warned that pgs
>>> 7.9d8, 23.596, 23.9c6 and 23.63c can't recover, as shown in the
>>> log pasted here.
>>>
>>> Those pgs are in the "active+degraded" state.
>>> # ceph pg map 7.9d8
>>> osdmap e102808 pg 7.9d8 (7.9d8) -> up [93,49] acting [93,49]
>>> (When I start osd.21, pg 7.9d8 and the three remaining pgs change
>>> to the "active+recovering" state.) osd.21 still goes down after
>>> logging the following:
>>>
>>>
>>> 2014-10-25 10:57:48.415920 osd.21 [WRN] slow request 30.835731
>>> seconds old, received at 2014-10-25 10:57:17.580013:
>>> MOSDPGPush(7.9d8 102803
>>> [PushOp(e13589d8/rbd_data.4b843b2ae8944a.0000000000000c00/head//6,
>>> version: 102798'7794851, data_included: [0~4194304], data_size:
>>> 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2,
>>> recovery_info:
>>> ObjectRecoveryInfo(e13589d8/rbd_data.4b843b2ae8944a.0000000000000c00/head//6@102798'7794851,
>>> copy_subset: [0~4194304], clone_subset: {}), after_progress:
>>> ObjectRecoveryProgress(!first, data_recovered_to:4194304,
>>> data_complete:true, omap_recovered_to:, omap_complete:true),
>>> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0,
>>> data_complete:false, omap_recovered_to:, omap_complete:false))]) v2
>>> currently no flag points reached
>>>
>>> 2014-10-25 10:57:48.415927 osd.21 [WRN] slow request 30.275588
>>> seconds old, received at 2014-10-25 10:57:18.140156:
>>> MOSDPGPush(23.596 102803
>>> [PushOp(4ca76d96/rbd_data.5dd32f2ae8944a.0000000000000385/head//24,
>>> version: 102798'295732, data_included: [0~4194304], data_size:
>>> 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2,
>>> recovery_info:
>>> ObjectRecoveryInfo(4ca76d96/rbd_data.5dd32f2ae8944a.0000000000000385/head//24@102798'295732,
>>> copy_subset: [0~4194304], clone_subset: {}), after_progress:
>>> ObjectRecoveryProgress(!first, data_recovered_to:4194304,
>>> data_complete:true, omap_recovered_to:, omap_complete:true),
>>> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0,
>>> data_complete:false, omap_recovered_to:, omap_complete:false))]) v2
>>> currently no flag points reached
>>>
>>> 2014-10-25 10:57:48.415910 osd.21 [WRN] slow request 30.860696
>>> seconds old, received at 2014-10-25 10:57:17.555048:
>>> MOSDPGPush(23.9c6 102803
>>> [PushOp(efdde9c6/rbd_data.5b64062ae8944a.0000000000000b15/head//24,
>>> version: 102798'66056, data_included: [0~4194304], data_size:
>>> 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2,
>>> recovery_info:
>>> ObjectRecoveryInfo(efdde9c6/rbd_data.5b64062ae8944a.0000000000000b15/head//24@102798'66056,
>>> copy_subset: [0~4194304], clone_subset: {}), after_progress:
>>> ObjectRecoveryProgress(!first, data_recovered_to:4194304,
>>> data_complete:true, omap_recovered_to:, omap_complete:true),
>>> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0,
>>> data_complete:false, omap_recovered_to:, omap_complete:false))]) v2
>>> currently no flag points reached
>>>
>>> 2014-10-25 10:57:58.418847 osd.21 [WRN] 26 slow requests, 1 included
>>> below; oldest blocked for > 54.967456 secs
>>> 2014-10-25 10:57:58.418859 osd.21 [WRN] slow request 30.967294
>>> seconds old, received at 2014-10-25 10:57:27.451488:
>>> MOSDPGPush(23.63c 102803
>>> [PushOp(40e4b63c/rbd_data.57ed612ae8944a.0000000000000c00/head//24,
>>> version: 102748'145637, data_included: [0~4194304], data_size:
>>> 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2,
>>> recovery_info:
>>> ObjectRecoveryInfo(40e4b63c/rbd_data.57ed612ae8944a.0000000000000c00/head//24@102748'145637,
>>> copy_subset: [0~4194304], clone_subset: {}), after_progress:
>>> ObjectRecoveryProgress(!first, data_recovered_to:4194304,
>>> data_complete:true, omap_recovered_to:, omap_complete:true),
>>> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0,
>>> data_complete:false, omap_recovered_to:, omap_complete:false))]) v2
>>> currently no flag points reached
>>>
>>> Thanks!
>>> --
>>> Tuan
>>> HaNoi-VietNam
>>>
>>> On 10/25/2014 05:07 AM, Craig Lewis wrote:
>>>> It looks like you're running into http://tracker.ceph.com/issues/5699
>>>>
>>>> You're running 0.80.7, which has a fix for that bug. From my
>>>> reading of the code, I believe the fix only prevents the issue from
>>>> occurring. It doesn't work around or repair bad snapshots created
>>>> on older versions of Ceph.
>>>>
>>>> Were any of the snapshots you're removing created on older
>>>> versions of Ceph? If they were all created on Firefly, then you
>>>> should open a new tracker issue, and try to get some help on IRC or
>>>> the developers mailing list.
>>>>
>>>> On Thu, Oct 23, 2014 at 10:21 PM, Ta Ba Tuan
>>>> <tuantb-QlevPasa8l681eZEIcUDRw@public.gmane.org> wrote:
>>>>
>>>> Dear everyone,
>>>>
>>>> I can't start osd.21 (log file attached).
>>>> Some pgs can't be repaired. I'm using 3 replicas for my data pool.
>>>> I suspect some objects in those pgs are damaged.
>>>>
>>>> I tried deleting some data related to those objects, but osd.21
>>>> still won't start.
>>>> I also removed osd.21, but then other OSDs went down (e.g. osd.86
>>>> is down and won't start).
>>>>
>>>> Please guide me in debugging this. Thanks!
>>>>
>>>> --
>>>> Tuan
>>>> Ha Noi - VietNam
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com