From: "Székelyi Szabolcs" <szekelyi@niif.hu>
To: ceph-devel@vger.kernel.org
Subject: Re: OSD doesn't start
Date: Thu, 05 Jul 2012 16:12:42 +0200
Message-ID: <95834053.QbLuzMQ4OG@mranderson>
In-Reply-To: <F1FB8F95B3FA4FF19D53AE9F060D88F5@inktank.com>
On 2012. July 4. 09:34:04 Gregory Farnum wrote:
> Hrm, it looks like the OSD data directory got a little busted somehow. How
> did you perform your upgrade? (That is, how did you kill your daemons, in
> what order, and when did you bring them back up.)
Since it would be long and hard to describe in text, I've collected the
relevant log entries, sorted by time, at http://pastebin.com/Ev3M4DQ9 . The
short story is that after seeing that the OSDs wouldn't start, I tried to bring
down the whole cluster and start it up from scratch. That didn't change
anything, so I rebooted the two machines (running all three daemons) to see
whether that would help. It didn't, and I gave up.
My ceph config is available at http://pastebin.com/KKNjmiWM .
Since this is my test cluster, I'm not very concerned about the data on it.
But the other one, with the same config, is dying, I think. ceph-fuse is eating
around 75% CPU on the sole monitor ("cc") node, and the monitor about 15%. On the
other two nodes, the OSD eats around 50%, the MDS 15%, and the monitor another
10%. No Ceph filesystem activity is going on at the moment. Blktrace reports
about 1 kB/s of disk traffic on the partition hosting the OSD data dir. The data
seems to be accessible at the moment, but I'm afraid that my production
cluster will end up in a similar situation after the upgrade, so I don't dare
touch it.
Do you have any suggestions for what I should check?
Thanks,
--
cc
> On Wednesday, July 4, 2012 at 8:31 AM, Székelyi Szabolcs wrote:
> > Hi,
> >
> > after upgrading to 0.48 "Argonaut", my OSDs won't start up again. This
> > problem might not be related to the upgrade, since the cluster had
> > strange behavior before, too: ceph-fuse was spinning the CPU around 70%,
> > so did the OSDs. This happened to both of my clusters. Thought that
> > upgrading might solve the problem, but it just got worse.
> >
> > I've copied the log of the OSD run to http://pastebin.com/XYRtfFMU . I've
> > rebooted all the nodes, but they still don't work.
> >
> > What should I do to resurrect my OSDs?
> >
> > Thanks,
> > --
> > cc
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html