From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Spray
Subject: Re: MDS crashes (80.8)
Date: Thu, 26 Feb 2015 18:01:16 +0000
Message-ID: <54EF5F6C.8030305@redhat.com>
References: <54EF527B.8060403@redhat.com> <54EF5CF0.3030503@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:55257 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753404AbbBZSBU (ORCPT ); Thu, 26 Feb 2015 13:01:20 -0500
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Wyllys Ingersoll
Cc: ceph-devel@vger.kernel.org

On 26/02/2015 17:58, Wyllys Ingersoll wrote:
> Yeah, I noticed that too, so I recreated both of those pools and it
> still wont start. It crashes in a different place now, but still wont
> start, even after running 'newfs'. Attached is the debug log output
> when I start ceph-mds
>
> ...
> common/Thread.cc: In function 'int Thread::join(void**)' thread
> 7f0127612700 time 2015-02-26 07:55:09.734802
> common/Thread.cc: 141: FAILED assert(status == 0)
> ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
> 1: (Thread::detach()+0) [0x8c30d0]
> 2: (MonClient::shutdown()+0x50f) [0x881d5f]
> 3: (MDS::suicide()+0xe5) [0x576285]
> 4: (MDLog::handle_journaler_write_error(int)+0x5f) [0x7aab2f]
> 5: (Context::complete(int)+0x9) [0x56d9a9]
> 6: (Journaler::handle_write_error(int)+0x5e) [0x7b949e]
> 7: (Journaler::_finish_write_head(int, Journaler::Header&,
> Context*)+0x306) [0x7b9946]
> 8: (Context::complete(int)+0x9) [0x56d9a9]
> 9: (Objecter::check_op_pool_dne(Objecter::Op*)+0x214) [0x7ce6a4]
> 10: (Objecter::C_Op_Map_Latest::finish(int)+0x124) [0x7cea04]
> 11: (Context::complete(int)+0x9) [0x56d9a9]
> 12: (Finisher::finisher_thread_entry()+0x1b8) [0x9aced8]
> 13: (()+0x8182) [0x7f012db95182]
> 14: (clone()+0x6d) [0x7f012c50befd]
> NOTE: a copy of the executable, or `objdump -rdS ` is
> needed to interpret this.
> 2015-02-26 07:55:09.736256 7f0127612700 -1 common/Thread.cc: In
> function 'int Thread::join(void**)' thread 7f0127612700 time
> 2015-02-26 07:55:09.734802
> common/Thread.cc: 141: FAILED assert(status == 0)

That is still looking like a missing pool (the check_op_pool_dne frame
in the trace). Can you post "ceph osd dump" and "ceph mds dump"?

John
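
For anyone cross-checking the two dumps by hand, a rough sketch of the comparison follows. It is not part of Ceph. It assumes that "ceph osd dump" lists pools on lines starting with "pool <id> " and that "ceph mds dump" names the pools the MDS expects on lines starting with "data_pools" and "metadata_pool"; those line formats, and the function names, are assumptions made for illustration and may differ between versions.

```python
import re


def osd_pool_ids(osd_dump_text):
    """Pool IDs present in the OSD map.

    Assumes each pool appears on a line like: pool 3 'name' replicated ...
    """
    return {
        int(m.group(1))
        for m in re.finditer(r"^pool (\d+) ", osd_dump_text, re.MULTILINE)
    }


def mds_pool_ids(mds_dump_text):
    """Pool IDs the MDS map refers to.

    Assumes lines like 'data_pools 0,3' and 'metadata_pool 1'
    (whitespace or tab separated).
    """
    ids = set()
    for line in mds_dump_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("data_pools", "metadata_pool"):
            # Accept comma- or space-separated ID lists.
            ids.update(int(tok) for tok in re.findall(r"\d+", " ".join(fields[1:])))
    return ids


def missing_pools(osd_dump_text, mds_dump_text):
    """Pool IDs referenced by the MDS map but absent from the OSD map."""
    return sorted(mds_pool_ids(mds_dump_text) - osd_pool_ids(osd_dump_text))
```

Save the output of each command to a file and pass the two texts to missing_pools(); a non-empty result would be consistent with the MDS map still pointing at a pool ID that no longer exists, for instance after pools were deleted and recreated under new IDs.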