From: Joao Eduardo Luis <joao.luis@inktank.com>
To: James Harper <james.harper@bendigoit.com.au>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: mon crash
Date: Wed, 19 Jun 2013 16:30:21 +0100
Message-ID: <51C1CE8D.408@inktank.com>
In-Reply-To: <6035A0D088A63A46850C3988ED045A4B5C2183F2@BITCOM1.int.sbss.com.au>

On 06/19/2013 10:53 AM, James Harper wrote:
> Every time I start up one of my mons it crashes. Two others are running but there seems to be long delays (=several seconds) when doing mon status (maybe this is the behaviour when one mon is down?)
>
> The tail of /var/log/ceph/ceph-mon.4.log follows this email.
>
> Version is 0.61.3-1~bpo70+1 from http://ceph.com/debian-cuttlefish wheezy main
>
> This was happening in a previous version, and then even before that but I thought I'd fixed it by wiping the errant mon and recreating it.
>
> Anything else I can supply that might help?
>
> Thanks
>
> James
>
> 0> 2013-06-19 19:45:44.018695 7f472d995700 -1 mon/Monitor.cc: In function 'void Monitor::sync_timeout(entity_inst_t&)' thread 7f472d995700 time 2013-06-19 19:45:44.017928
> mon/Monitor.cc: 1101: FAILED assert(sync_state == SYNC_STATE_CHUNKS)
>
> ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532)
> 1: /usr/bin/ceph-mon() [0x4c8eca]
> 2: (Context::complete(int)+0xa) [0x4d70fa]
> 3: (SafeTimer::timer_thread()+0x1af) [0x64ad4f]
> 4: (SafeTimerThread::entry()+0xd) [0x64c3dd]
> 5: (()+0x6b50) [0x7f47c0c3ab50]
> 6: (clone()+0x6d) [0x7f47bf39ba7d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Issues in sync_timeout() have been seen before. I track them down for
some time, find nothing of worth (the logs usually don't help that
much), and eventually have to move on.
http://tracker.ceph.com/issues/4845
and
http://tracker.ceph.com/issues/5171
contain two iterations of what appears to be the same bug. My guess is
that there's a lingering Context not being cancelled somewhere, which
would let the timeout fire after sync_state has already left
SYNC_STATE_CHUNKS. Or it might be something else altogether.
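
To make that guess concrete, here is a rough toy model of the failure mode, nothing more. It is not Ceph code and does not claim to match mon/Monitor.cc: ToyTimer, ToyMonitor, start_chunks(), reset_sync() and the cancel_on_reset flag are all made-up names for illustration, and only the check against a "chunks" state mirrors the assert in the backtrace. The sketch arms a timeout while in the chunks state, resets the state, and then lets the timer fire.

```python
# Toy model only: hypothetical names, not Ceph's implementation.
from enum import Enum


class SyncState(Enum):
    NONE = 0
    CHUNKS = 1  # stands in for SYNC_STATE_CHUNKS from the backtrace


class ToyTimer:
    """Holds callbacks until fire_all() runs them (stand-in for a timer thread)."""

    def __init__(self):
        self._pending = {}
        self._next_id = 0

    def add_event(self, callback):
        event_id = self._next_id
        self._next_id += 1
        self._pending[event_id] = callback
        return event_id

    def cancel_event(self, event_id):
        # Returns True if the event was still pending and has been removed.
        return self._pending.pop(event_id, None) is not None

    def fire_all(self):
        pending, self._pending = self._pending, {}
        for callback in pending.values():
            callback()


class ToyMonitor:
    def __init__(self, timer, cancel_on_reset):
        self.timer = timer
        self.cancel_on_reset = cancel_on_reset  # hypothetical switch for the demo
        self.sync_state = SyncState.NONE
        self.timeout_event = None

    def start_chunks(self):
        # Enter the chunks state and arm a timeout for it.
        self.sync_state = SyncState.CHUNKS
        self.timeout_event = self.timer.add_event(self.sync_timeout)

    def reset_sync(self):
        # Leave the chunks state; the suspected bug is skipping the cancel.
        if self.cancel_on_reset and self.timeout_event is not None:
            self.timer.cancel_event(self.timeout_event)
        self.timeout_event = None
        self.sync_state = SyncState.NONE

    def sync_timeout(self):
        # Same shape of check as the failing assert in the report.
        if self.sync_state != SyncState.CHUNKS:
            raise AssertionError("stale timeout fired outside the chunks state")
        self.sync_state = SyncState.NONE
        self.timeout_event = None


def run(cancel_on_reset):
    timer = ToyTimer()
    mon = ToyMonitor(timer, cancel_on_reset)
    mon.start_chunks()
    mon.reset_sync()
    try:
        timer.fire_all()
    except AssertionError:
        return "assert"
    return "ok"
```

In this model, run(True) cancels the pending callback and nothing fires, while run(False) leaves the callback queued and trips the check. Whether the real monitor has such a path is exactly what a full log would help confirm.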
James, do you happen to have a full log you can share with us?

  -Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com