From: "martin" <martin@air-maxx.net>
To: ceph-devel@vger.kernel.org
Subject: cosd locks up with 100% CPU during mkcephfs
Date: Sat, 9 Oct 2010 08:26:24 +0200
Message-ID: <006601cb677a$e640d270$b2c27750$@net>

Dear mailing list members,

mkcephfs does not run to completion for me. It stops when cosd locks up at
100% CPU on the first node, CEPH1.
Output from the script:
fs created label (null) on /dev/sdb1
        nodesize 4096 leafsize 4096 sectorsize 4096 size 19.99GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
monmap.4203                                   100%  477     0.5KB/s   00:00
--- ssh ceph1  "cd /home/ceph/ceph/ceph-0.21.3/src ; ulimit -c unlimited ;
/usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap /tmp/monmap.4203 -i 0
--mkfs --osd-data /data/osd0"
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
-> After this the script never returns.
Output of top:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4691 root      20   0 15164 2228 1864 R 94.8  0.9  25:54.65 cosd
root@CEPH1:/var/log/ceph# kill 4691
bash: line 1:  4691 Terminated              /usr/local/bin/cosd -c
/etc/ceph/ceph.conf --monmap /tmp/monmap.4203 -i 0 --mkfs --osd-data
/data/osd0
failed: 'ssh ceph1 /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap
/tmp/monmap.4203 -i 0 --mkfs --osd-data /data/osd0'

-> I waited one hour, without success.
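
A hang like this is easier to spot if the mkfs step runs under a time limit
instead of being watched by hand. The following is a minimal Python sketch,
not part of mkcephfs: run_with_timeout is a hypothetical helper, and the
sleeping child process only stands in for the cosd --mkfs call.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode); finished is False on timeout."""
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before re-raising.
        return False, None
    return True, proc.returncode


if __name__ == "__main__":
    # Stand-in for the hanging command: a child that sleeps past the limit.
    hung_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
    finished, code = run_with_timeout(hung_cmd, timeout_s=1)
    print("finished" if finished else "timed out", code)
```

If the command is wrapped in ssh, the timeout only ends the local ssh client;
the remote process may still have to be killed separately, as was done by
hand here.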

Tail of osd.0.log:

10.10.09_01:27:52.636060 b77856d0 journal header: block_size 4096 alignment
4096 max_size 0
10.10.09_01:27:52.636074 b77856d0 journal header: start 4096
10.10.09_01:27:52.636086 b77856d0 journal  write_pos 0
10.10.09_01:27:52.646211 b77856d0 journal create done
10.10.09_01:27:52.646282 b77856d0 filestore(/data/osd0) mkjournal created
journal on /data/osd0/journal
10.10.09_01:27:52.646451 b77856d0 filestore(/data/osd0) mkfs done in
/data/osd0
10.10.09_01:27:52.646467 b77856d0 filestore(/data/osd0) basedir /data/osd0
journal /data/osd0/journal
10.10.09_01:27:52.646738 b77856d0 filestore(/data/osd0) mount detected btrfs
10.10.09_01:27:52.646766 b77856d0 filestore(/data/osd0) _do_clone_range 0~1
10.10.09_01:27:52.646784 b77856d0 filestore(/data/osd0) mount btrfs
CLONE_RANGE ioctl is supported
10.10.09_01:27:52.656929 b77856d0 filestore(/data/osd0) mount btrfs
SNAP_CREATE is supported
10.10.09_01:27:52.663403 b77856d0 filestore(/data/osd0) mount btrfs
SNAP_DESTROY is supported
10.10.09_01:27:52.663539 b77856d0 filestore(/data/osd0) mount fsid is
206080828
10.10.09_01:27:52.663655 b77856d0 filestore(/data/osd0) mount found snaps <>
10.10.09_01:27:52.663938 b77856d0 filestore(/data/osd0) mount op_seq is 0
10.10.09_01:27:52.663956 b77856d0 filestore(/data/osd0) open_journal at
/data/osd0/journal
10.10.09_01:27:52.663985 b77856d0 journal journal_replay fs op_seq 0
10.10.09_01:27:52.664008 b77856d0 journal open /data/osd0/journal next_seq 1
10.10.09_01:27:52.664038 b77856d0 journal _open journal is not a block
device, NOT checking disk write cache on /data/osd0/journal
10.10.09_01:27:52.664052 b77856d0 journal _open /data/osd0/journal fd 8:
8192 bytes, block size 4096 bytes, directio = 1
10.10.09_01:27:52.664067 b77856d0 journal read_header
10.10.09_01:27:52.665300 b77856d0 journal header: block_size 4096 alignment
4096 max_size 0
10.10.09_01:27:52.665352 b77856d0 journal header: start 4096
10.10.09_01:27:52.665365 b77856d0 journal  write_pos 4096
10.10.09_01:27:52.665389 b77856d0 journal open header.fsid = 206080828
_______________________________________________________________________________
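
Because the mail client wrapped the log at about 76 columns, some records
above span two lines. The sketch below (Python; the helper names unwrap and
parse are made up for illustration) rejoins wrapped records and splits each
one into timestamp, thread id and message. It assumes the first field is
yy.mm.dd_HH:MM:SS.microseconds, which is inferred from the excerpt and not
taken from Ceph documentation.

```python
import re
from datetime import datetime

# Assumed record layout: "yy.mm.dd_HH:MM:SS.ffffff <thread> <message>"
STAMP = re.compile(r"^(\d\d\.\d\d\.\d\d_\d\d:\d\d:\d\d\.\d{6}) (\S+) (.*)$")


def unwrap(lines):
    """Join wrapped continuation lines back onto the record they belong to."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if STAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line.strip()
    return records


def parse(record):
    """Split one record into (timestamp, thread id, message)."""
    m = STAMP.match(record)
    if m is None:
        raise ValueError(f"not a log record: {record!r}")
    stamp = datetime.strptime(m.group(1), "%y.%m.%d_%H:%M:%S.%f")
    return stamp, m.group(2), m.group(3)


if __name__ == "__main__":
    sample = [
        "10.10.09_01:27:52.665300 b77856d0 journal header: block_size 4096 alignment",
        "4096 max_size 0",
        "10.10.09_01:27:52.665389 b77856d0 journal open header.fsid = 206080828",
    ]
    for record in unwrap(sample):
        print(parse(record))
```

Run on the full log, the last parsed record gives the time and message of the
final entry written before the process stopped logging.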


I downloaded http://ceph.newdream.net/download/ceph-0.21.3.tar.gz onto an
Ubuntu 10.10 Server (32-bit) machine.
root@CEPH1:/etc/ceph# uname -a
Linux CEPH1 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC 2010 i686 GNU/Linux

I ran configure and make, followed by an install. Then I cloned the machine
3 more times, for 4 nodes in total. /dev/sdb1 is formatted with btrfs.

The nodes are running in VirtualBox.
root@CEPH1:/etc/ceph# cat /etc/hosts
127.0.0.1       localhost
x.x.239.140  CEPH1
x.x.239.141  CEPH2
x.x.239.142  CEPH3
x.x.239.143  CEPH4

I distributed SSH keys so that the scripts run through.

My ceph.conf looks like this:
root@CEPH1:/etc/ceph# cat ceph.conf
; global
[global]
        ; enable secure authentication
        auth supported = cephx

; monitors
[mon]
        mon data = /data/mon$id
        debug ms = 1
        debug mon = 20
        debug paxos = 20
        debug auth = 20

[mon0]
        host = CEPH1
        mon addr = x.x.239.140:6789

[mon1]
        host = CEPH2
        mon addr = x.x.239.141:6789

[mon2]
        host = CEPH3
        mon addr = x.x.239.142:6789

; mds
;  You need at least one.  Define two to get a standby.
[mds]
        ; where the mds keeps its secret encryption keys
        keyring = /data/keyring.$name
        ; mds logging to debug issues.
        debug ms = 1
        debug mds = 20

[mds.ceph1]
        host = ceph1
[mds.ceph3]
        host = ceph3

; osd
[osd]
        osd data = /data/osd$id
        osd journal = /data/osd$id/journal
        debug ms = 1
        debug osd = 20
        debug filestore = 20
        debug journal = 20

[osd0]
        host = ceph1
        btrfs devs = /dev/sdb1
[osd1]
        host = ceph2
        btrfs devs = /dev/sdb1
[osd2]
        host = ceph3
        btrfs devs = /dev/sdb1
[osd3]
        host = ceph4
        btrfs devs = /dev/sdb1
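
To double-check which host and data directory each OSD section resolves to,
the file can be read with a generic INI parser. This is a rough Python sketch
under stated assumptions: configparser is not Ceph's own parser and treats
indentation differently, so the sample is a trimmed, aligned copy of the
config above. Expanding $id by plain string substitution is also an
assumption; it matches the --osd-data /data/osd0 in the mkcephfs output but
is not Ceph's actual expansion code. The names load and osd_layout are
hypothetical.

```python
import configparser

# Trimmed copy of the [osd] part of the config, with aligned indentation.
SAMPLE = """\
[osd]
        osd data = /data/osd$id
        osd journal = /data/osd$id/journal

[osd0]
        host = ceph1
        btrfs devs = /dev/sdb1
[osd1]
        host = ceph2
        btrfs devs = /dev/sdb1
"""


def load(text):
    # interpolation=None keeps "$id" and any "%" characters literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def osd_layout(parser):
    """Map each [osdN] section to (host, data dir), expanding $id by hand."""
    template = parser.get("osd", "osd data")
    layout = {}
    for name in parser.sections():
        if name.startswith("osd") and name[3:].isdigit():
            osd_id = name[3:]
            host = parser.get(name, "host")
            layout[name] = (host, template.replace("$id", osd_id))
    return layout


if __name__ == "__main__":
    for name, (host, data_dir) in osd_layout(load(SAMPLE)).items():
        print(name, host, data_dir)
```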

