From: Yann Dupont <Yann.Dupont@univ-nantes.fr>
To: Samuel Just <sam.just@inktank.com>
Cc: Gregory Farnum <greg@inktank.com>,
ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: domino-style OSD crash
Date: Tue, 10 Jul 2012 11:46:29 +0200
Message-ID: <4FFBF9F5.9050000@univ-nantes.fr>
In-Reply-To: <CA+4uBUYGaUdYr97EqEGCLk9ByhUeXuqeeDyp2bzVRN7GeokePg@mail.gmail.com>
On 09/07/2012 19:14, Samuel Just wrote:
> Can you restart the node that failed to complete the upgrade with
Well, it's a little bit complicated; I now run those nodes with XFS,
and I have long-running jobs on them right now, so I can't stop the ceph
cluster at the moment.
As I've kept the original broken btrfs volumes, I tried this morning
to run the old osd in parallel, using the $cluster variable. I only
had partial success.
I tried using a different port for the mons, but ceph wants to use the old
mon map. I can edit it (epoch 1), but it seems to use 'latest' instead;
the format isn't compatible with monmaptool, and I don't know how to
"inject" the modified map into a non-running cluster.
Anyway, the osd seems to start fine, and I can reproduce the bug:
> debug filestore = 20
> debug osd = 20
>
I've put it in [global]; is that sufficient?
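For clarity, the placement described above corresponds to a fragment like the one below. This is only a sketch: the two debug lines and the [global] section name come from this thread, and the rest of the file is left out.

```ini
; ceph.conf fragment (sketch; all other settings omitted)
[global]
    ; debug levels requested earlier in this thread
    debug filestore = 20
    debug osd = 20
```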
>
> and post the log after an hour or so of running? The upgrade process
> might legitimately take a while.
> -Sam
Only 15 minutes of running so far, but ceph-osd is consuming lots of CPU,
and strace shows lots of pread calls.
Here is the log:
[..]
2012-07-10 11:33:29.560052 7f3e615ac780 0 filestore(/CEPH-PROD/data/osd.1) mount syncfs(2) syscall not support by glibc
2012-07-10 11:33:29.560062 7f3e615ac780 0 filestore(/CEPH-PROD/data/osd.1) mount no syncfs(2), but the btrfs SYNC ioctl will suffice
2012-07-10 11:33:29.560172 7f3e615ac780 -1 filestore(/CEPH-PROD/data/osd.1) FileStore::mount : stale version stamp detected: 2. Proceeding, do_update is set, performing disk format upgrade.
2012-07-10 11:33:29.560233 7f3e615ac780 0 filestore(/CEPH-PROD/data/osd.1) mount found snaps <3744666,3746725>
2012-07-10 11:33:29.560263 7f3e615ac780 10 filestore(/CEPH-PROD/data/osd.1) current/ seq was 3746725
2012-07-10 11:33:29.560267 7f3e615ac780 10 filestore(/CEPH-PROD/data/osd.1) most recent snap from <3744666,3746725> is 3746725
2012-07-10 11:33:29.560280 7f3e615ac780 10 filestore(/CEPH-PROD/data/osd.1) mount rolling back to consistent snap 3746725
2012-07-10 11:33:29.839281 7f3e615ac780 5 filestore(/CEPH-PROD/data/osd.1) mount op_seq is 3746725
... and nothing more.
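The sequence numbers in the excerpt above can be cross-checked with a short script. This is a minimal sketch, not part of Ceph: the function name `summarize_mount` and the regular expressions are assumptions written against the log lines exactly as they appear in this message, and they may not match other Ceph versions.

```python
import re

# Four entries copied from the log in this message, one entry per line.
LOG = """\
2012-07-10 11:33:29.560233 7f3e615ac780 0 filestore(/CEPH-PROD/data/osd.1) mount found snaps <3744666,3746725>
2012-07-10 11:33:29.560263 7f3e615ac780 10 filestore(/CEPH-PROD/data/osd.1) current/ seq was 3746725
2012-07-10 11:33:29.560280 7f3e615ac780 10 filestore(/CEPH-PROD/data/osd.1) mount rolling back to consistent snap 3746725
2012-07-10 11:33:29.839281 7f3e615ac780 5 filestore(/CEPH-PROD/data/osd.1) mount op_seq is 3746725
"""


def summarize_mount(log_text):
    """Extract the snapshot list and sequence numbers from filestore mount lines.

    Hypothetical helper: the patterns use \\s+ so they also match when a mail
    client has wrapped a log entry across lines.
    """
    snaps = re.search(r"mount\s+found\s+snaps\s+<([\d,]+)>", log_text)
    rollback = re.search(r"rolling\s+back\s+to\s+consistent\s+snap\s+(\d+)", log_text)
    op_seq = re.search(r"mount\s+op_seq\s+is\s+(\d+)", log_text)
    return {
        "snaps": [int(s) for s in snaps.group(1).split(",")] if snaps else [],
        "rollback": int(rollback.group(1)) if rollback else None,
        "op_seq": int(op_seq.group(1)) if op_seq else None,
    }


summary = summarize_mount(LOG)
# True when the OSD rolled back to the newest snapshot and resumed from it.
consistent = bool(summary["snaps"]) and (
    summary["rollback"] == max(summary["snaps"]) == summary["op_seq"]
)
print(summary, consistent)
```

For this excerpt the rollback target, the newest snapshot and op_seq all agree, so the mount itself looks consistent and the stall happens after that point.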
I'll let it run for 3 hours. If I get another message, I'll let
you know.
Cheers,
--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr