From: "martin" <martin@air-maxx.net>
To: ceph-devel@vger.kernel.org
Subject: cosd locks up with 100% CPU during mkcephfs
Date: Sat, 9 Oct 2010 08:26:24 +0200
Message-ID: <006601cb677a$e640d270$b2c27750$@net>
Dear mailing list members,
mkcephfs does not run to completion. It stops when cosd locks up at 100% CPU
on the first node, named CEPH1.
Output from the script:
fs created label (null) on /dev/sdb1
nodesize 4096 leafsize 4096 sectorsize 4096 size 19.99GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
monmap.4203 100% 477 0.5KB/s 00:00
--- ssh ceph1 "cd /home/ceph/ceph/ceph-0.21.3/src ; ulimit -c unlimited ;
/usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap /tmp/monmap.4203 -i 0
--mkfs --osd-data /data/osd0"
** WARNING: Ceph is still under heavy development, and is only suitable for **
** testing and review. Do not trust it with important data.                 **
-> then the script never returns
Output of top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4691 root 20 0 15164 2228 1864 R 94.8 0.9 25:54.65 cosd
root@CEPH1:/var/log/ceph# kill 4691
bash: line 1: 4691 Terminated /usr/local/bin/cosd -c
/etc/ceph/ceph.conf --monmap /tmp/monmap.4203 -i 0 --mkfs --osd-data
/data/osd0
failed: 'ssh ceph1 /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap
/tmp/monmap.4203 -i 0 --mkfs --osd-data /data/osd0'
-> I waited 1 hour, with no success.
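For anyone trying to reproduce this, a minimal sketch of two helpers that may make the hang easier to handle: one bounds the runtime of a command with coreutils timeout(1) so the script cannot block forever, the other asks gdb for a backtrace of every thread in the spinning process. The function names run_with_timeout and dump_stacks are made up for this illustration; it assumes timeout and gdb are installed on the node and that you may attach to the process (typically as root).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Ceph or mkcephfs.

# run_with_timeout SECONDS CMD [ARGS...]
# Runs CMD under coreutils timeout(1). The exit status is 124 if the time
# limit was reached, otherwise it is the exit status of CMD itself.
run_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# dump_stacks PID
# Prints a backtrace of every thread in PID using gdb in batch mode.
# Requires gdb and permission to attach to the process.
dump_stacks() {
    gdb -p "$1" -batch -ex "thread apply all bt" 2>&1
}
```

With the commands from this report, that would look like "run_with_timeout 300 /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap /tmp/monmap.4203 -i 0 --mkfs --osd-data /data/osd0" and, while cosd is spinning, "dump_stacks 4691". timeout sends SIGTERM by default, which is the same signal the plain kill above used. The 300 second limit is an arbitrary choice.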
A tail of osd.0.log:
10.10.09_01:27:52.636060 b77856d0 journal header: block_size 4096 alignment
4096 max_size 0
10.10.09_01:27:52.636074 b77856d0 journal header: start 4096
10.10.09_01:27:52.636086 b77856d0 journal write_pos 0
10.10.09_01:27:52.646211 b77856d0 journal create done
10.10.09_01:27:52.646282 b77856d0 filestore(/data/osd0) mkjournal created
journal on /data/osd0/journal
10.10.09_01:27:52.646451 b77856d0 filestore(/data/osd0) mkfs done in
/data/osd0
10.10.09_01:27:52.646467 b77856d0 filestore(/data/osd0) basedir /data/osd0
journal /data/osd0/journal
10.10.09_01:27:52.646738 b77856d0 filestore(/data/osd0) mount detected btrfs
10.10.09_01:27:52.646766 b77856d0 filestore(/data/osd0) _do_clone_range 0~1
10.10.09_01:27:52.646784 b77856d0 filestore(/data/osd0) mount btrfs
CLONE_RANGE ioctl is supported
10.10.09_01:27:52.656929 b77856d0 filestore(/data/osd0) mount btrfs
SNAP_CREATE is supported
10.10.09_01:27:52.663403 b77856d0 filestore(/data/osd0) mount btrfs
SNAP_DESTROY is supported
10.10.09_01:27:52.663539 b77856d0 filestore(/data/osd0) mount fsid is
206080828
10.10.09_01:27:52.663655 b77856d0 filestore(/data/osd0) mount found snaps <>
10.10.09_01:27:52.663938 b77856d0 filestore(/data/osd0) mount op_seq is 0
10.10.09_01:27:52.663956 b77856d0 filestore(/data/osd0) open_journal at
/data/osd0/journal
10.10.09_01:27:52.663985 b77856d0 journal journal_replay fs op_seq 0
10.10.09_01:27:52.664008 b77856d0 journal open /data/osd0/journal next_seq 1
10.10.09_01:27:52.664038 b77856d0 journal _open journal is not a block
device, NOT checking disk write cache on /data/osd0/journal
10.10.09_01:27:52.664052 b77856d0 journal _open /data/osd0/journal fd 8:
8192 bytes, block size 4096 bytes, directio = 1
10.10.09_01:27:52.664067 b77856d0 journal read_header
10.10.09_01:27:52.665300 b77856d0 journal header: block_size 4096 alignment
4096 max_size 0
10.10.09_01:27:52.665352 b77856d0 journal header: start 4096
10.10.09_01:27:52.665365 b77856d0 journal write_pos 4096
10.10.09_01:27:52.665389 b77856d0 journal open header.fsid = 206080828
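To tell whether cosd keeps logging while it spins, it can help to pull out the timestamp of the last real log entry and compare it with the current time. A small sketch, assuming every entry starts with a stamp in the form seen above (which looks like YY.MM.DD_HH:MM:SS.microseconds) and that wrapped continuation lines carry no stamp. The function name last_log_stamp is hypothetical.

```shell
#!/bin/sh
# Sketch only: the stamp layout is inferred from the log lines in this mail.

# last_log_stamp FILE
# Prints the first field of the last line that begins with a
# NN.NN.NN_<time> stamp; lines without such a stamp are skipped.
last_log_stamp() {
    awk '/^[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]_[0-9:.]+ / { stamp = $1 }
         END { if (stamp != "") print stamp }' "$1"
}
```

For example, "last_log_stamp /var/log/ceph/osd.0.log" followed by "date" a few minutes apart shows whether the daemon is still writing log entries. The log path is an assumption based on the working directory shown in the prompt above.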
_______________________________________________________________________________
I just downloaded http://ceph.newdream.net/download/ceph-0.21.3.tar.gz on an
Ubuntu 10.10 Server (32-bit).
root@CEPH1:/etc/ceph# uname -a
Linux CEPH1 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC 2010
i686 GNU/Linux
I ran configure and make, followed by an install. Then I cloned the machine
3 more times, making 4 nodes. /dev/sdb1 is formatted with btrfs.
The nodes are running in VirtualBox.
root@CEPH1:/etc/ceph# cat /etc/hosts
127.0.0.1 localhost
x.x.239.140 CEPH1
x.x.239.141 CEPH2
x.x.239.142 CEPH3
x.x.239.143 CEPH4
SSH keys are distributed to all nodes, so the scripts run through.
My ceph.conf looks like this:
root@CEPH1:/etc/ceph# cat ceph.conf
; global
[global]
; enable secure authentication
auth supported = cephx
; monitors
[mon]
mon data = /data/mon$id
debug ms = 1
debug mon = 20
debug paxos = 20
debug auth = 20
[mon0]
host = CEPH1
mon addr = x.x.239.140:6789
[mon1]
host = CEPH2
mon addr = x.x.239.141:6789
[mon2]
host = CEPH3
mon addr = x.x.239.142:6789
; mds
; You need at least one. Define two to get a standby.
[mds]
; where the mds keeps its secret encryption keys
keyring = /data/keyring.$name
; mds logging to debug issues.
debug ms = 1
debug mds = 20
[mds.ceph1]
host = ceph1
[mds.ceph3]
host = ceph3
; osd
[osd]
osd data = /data/osd$id
osd journal = /data/osd$id/journal
debug ms = 1
debug osd = 20
debug filestore = 20
debug journal = 20
[osd0]
host = ceph1
btrfs devs = /dev/sdb1
[osd1]
host = ceph2
btrfs devs = /dev/sdb1
[osd2]
host = ceph3
btrfs devs = /dev/sdb1
[osd3]
host = ceph4
btrfs devs = /dev/sdb1
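One detail in the configuration above that may be worth ruling out: the host names are written as CEPH1, CEPH2, CEPH3 in the mon sections but as ceph1 ... ceph4 in the mds and osd sections. Whether mkcephfs or cosd cares about the difference is not something this report establishes. The sketch below only lists the host entries per section and flags names that appear with more than one capitalisation. The function names list_hosts and mixed_case_hosts are hypothetical.

```shell
#!/bin/sh
# Sketch only: a plain ini-style scan, not a full ceph.conf parser.

# list_hosts FILE
# Prints "section host" for every "host = NAME" line in FILE.
list_hosts() {
    awk '
        /^[ \t]*\[.*\][ \t]*$/ {        # section header such as [osd0]
            sub(/^[ \t]*\[/, "")
            sub(/\][ \t]*$/, "")
            section = $0
            next
        }
        /^[ \t]*host[ \t]*=/ {          # host = NAME
            sub(/^[^=]*=[ \t]*/, "")
            sub(/[ \t]+$/, "")
            print section, $0
        }
    ' "$1"
}

# mixed_case_hosts FILE
# Prints (lowercased) each host name that occurs with more than one
# capitalisation, one per line, sorted.
mixed_case_hosts() {
    list_hosts "$1" | awk '
        {
            key = tolower($2)
            if (!((key, $2) in seen)) { seen[key, $2] = 1; n[key]++ }
        }
        END { for (k in n) if (n[k] > 1) print k }
    ' | sort
}
```

Running "mixed_case_hosts /etc/ceph/ceph.conf" prints nothing when every host name is spelled consistently.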