From: Xiaopong Tran <xiaopong.tran@gmail.com>
To: Gregory Farnum <greg@inktank.com>
Cc: Sage Weil <sage@inktank.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: mon crash on debian wheezy
Date: Wed, 29 Aug 2012 09:56:35 +0800
Message-ID: <503D76D3.9050004@gmail.com>
In-Reply-To: <CAPYLRzgoRf11m6NampLc=p2uNA2VZ_4qwRK-2TOwZCutiYg1Hg@mail.gmail.com>
On 08/29/2012 12:21 AM, Gregory Farnum wrote:
> On Tue, Aug 28, 2012 at 7:50 AM, Xiaopong Tran <xiaopong.tran@gmail.com> wrote:
>> On 08/25/2012 12:28 AM, Sage Weil wrote:
>>>
>>> On Fri, 24 Aug 2012, Xiaopong Tran wrote:
>>>>
>>>> Hello,
>>>>
>>>> I've been running 0.48argonaut in production for over a month
>>>> without any issue, and today I suddenly lost one mon. Taking a look
>>>> into the syslog file, I see the following trace log. I just couldn't
>>>> see what's wrong from the trace log. However, this event created
>>>> a gigantic core file. Here's the size of the core file:
>>>>
>>>> -rw------- 1 root root 16085647360 Aug 24 14:53 core
>>>>
>>>> This happened while we were migrating data from our old storage
>>>> to ceph. We are running about 20 processes migrating data
>>>> into ceph, while there are about 30 more application processes
>>>> reading from and writing new data to it.
>>>>
>>>> The following is from syslog:
>>>
>>>
>>> We've seen these backtraces before too, but haven't figured out what
>>> causes them. (See, for example, http://tracker.newdream.net/issues/2026.)
>>>
>>> Was there anything in the mon's log file? In most cases, a crash results
>>> in a stack trace of ceph-mon in the mon log file.
>>>
>>> Glad to hear everything recovered nicely afterwards. :)
>>>
>>> Thanks!
>>> sage
>>>
>>
>> Ah well, I got two crashes in less than 3 days. I browsed through the
>> mon log files and the ceph log files, and there is nothing suspicious,
>> no trace dump or anything.
>>
>> One thing I don't get: after the mon has crashed and is no longer
>> running, who is creating that empty mon log? The same question goes
>> for the osd. I had two osds down today, and I also see empty osd log files.
>>
>> And how does the crash end up generating such a huge core file?
>>
>> If there's any information I can provide, I'd be happy to do so.
>
> Can you extract the backtrace from the core dump?
>
Will try to do that, it's a big one though :)
Thanks
Xiaopong
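
For readers following the thread, one common way to pull a backtrace out of a core file is to run gdb in batch mode against the crashed binary and the core. The sketch below is only an illustration, not something posted in this thread: the helper names (build_gdb_command, extract_backtrace) are made up for the example, and it assumes gdb is installed and that debug symbols matching the ceph-mon build are available, otherwise the frames will be mostly unresolved addresses.

```python
import shutil
import subprocess


def build_gdb_command(binary, core):
    """Return a gdb argv that prints a backtrace of every thread, then exits."""
    return [
        "gdb",
        "--batch",                     # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # backtrace for every thread in the core
        binary,                        # the executable that crashed
        core,                          # the core file it left behind
    ]


def extract_backtrace(binary, core, out_path):
    """Run gdb on the core file and save whatever it prints to out_path."""
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb is not installed or not on PATH")
    result = subprocess.run(
        build_gdb_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero yet still print useful frames
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode
```

A call such as extract_backtrace("/usr/bin/ceph-mon", "core", "ceph-mon-backtrace.txt") would write a text file small enough to attach to a reply; both paths are assumptions and need to match the actual binary and core location. Loading a core of roughly 16 GB can take a while.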
Thread overview: 5+ messages
2012-08-24 8:12 mon crash on debian wheezy Xiaopong Tran
2012-08-24 16:28 ` Sage Weil
2012-08-28 14:50 ` Xiaopong Tran
2012-08-28 16:21 ` Gregory Farnum
2012-08-29 1:56 ` Xiaopong Tran [this message]