From: Martin Wilderoth <martin.wilderoth@linserv.se>
To: Gregory Farnum <gregory.farnum@dreamhost.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: osd stops
Date: Wed, 13 Apr 2011 14:12:51 +0200 (CEST)
Message-ID: <174374365.14569.1302696771139.JavaMail.root@mail.linserv.se>
In-Reply-To: <E54DF381DDD4475587E24CDF6D4C25D0@gmail.com>
This is my config,
;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.
; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it). If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).
; global
[global]
; enable secure authentication
auth supported = cephx
keyring = /etc/ceph/keyring.bin
; allow ourselves to open a lot of files
max open files = 131072
pid file = /var/run/ceph/$name.pid
debug ms = 1
; monitors
; You need at least one. You need at least three if you want to
; tolerate any node failures. Always create an odd number.
[mon]
mon data = /data/mon$id
; logging, for debugging monitor crashes, in order of
; their likelihood of being helpful :)
;debug ms = 1
;debug mon = 20
;debug paxos = 20
;debug auth = 20
[mon0]
host = ceph1
mon addr = 10.0.6.10:6789
[mon1]
host = ceph2
mon addr = 10.0.6.11:6789
[mon2]
host = ceph3
mon addr = 10.0.6.12:6789
; mds
; You need at least one. Define two to get a standby.
[mds]
; where the mds keeps its secret encryption keys
keyring = /etc/ceph/keyring.$name
; mds logging to debug issues.
;debug ms = 1
;debug mds = 20
[mds0]
host = ceph1
[mds1]
host = ceph2
[mds2]
host = ceph3
; osd
; You need at least one. Two if you want data to be replicated.
; Define as many as you like.
[osd]
sudo = true
; This is where the btrfs volume will be mounted.
osd data = /data/osd$id
; where the osd keeps its secret encryption keys
keyring = /etc/ceph/keyring.$name
; Ideally, make this a separate disk or partition. A few
; hundred MB should be enough; more if you have fast or many
; disks. You can use a file under the osd data dir if need be
; (e.g. /data/osd$id/journal), but it will be slower than a
; separate disk or partition.
; This is an example of a file-based journal.
;osd journal = /data/osd$id/journal
;osd journal size = 1000 ; journal size, in megabytes
; osd logging to debug osd issues, in order of likelihood of being
; helpful
; debug ms = 1
; debug osd = 25
; debug monc = 20
; debug journal = 20
; debug filestore = 10
; osd use stale snap = true
[osd0]
host = ceph1
; if 'btrfs devs' is not specified, you're responsible for
; setting up the 'osd data' dir. if it is not btrfs, things
; will behave up until you try to recover from a crash (which
; is usually fine for basic testing).
btrfs devs = /dev/sdc
osd journal = /dev/sda1
[osd1]
host = ceph1
btrfs devs = /dev/sdd
osd journal = /dev/sda2
[osd2]
host = ceph2
btrfs devs = /dev/sdc
osd journal = /dev/sda1
[osd3]
host = ceph2
btrfs devs = /dev/sdd
osd journal = /dev/sda2
[osd4]
host = ceph3
btrfs devs = /dev/sdc
osd journal = /dev/sda1
[osd5]
host = ceph3
btrfs devs = /dev/sdd
osd journal = /dev/sda2
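For readers who want to check a layout like this programmatically, here is a minimal sketch using Python's standard `configparser`. It assumes the ini-style ceph.conf format shown above; the helper name `osds_by_host` and the trimmed `SAMPLE` text are illustrative only and are not part of Ceph.

```python
import configparser
import re
from collections import defaultdict

# Trimmed excerpt of the config above (illustrative, not the full file).
SAMPLE = """
[global]
auth supported = cephx
[osd]
osd data = /data/osd$id
[osd0]
host = ceph1
btrfs devs = /dev/sdc
osd journal = /dev/sda1
[osd1]
host = ceph1
btrfs devs = /dev/sdd
osd journal = /dev/sda2
[osd2]
host = ceph2
btrfs devs = /dev/sdc
osd journal = /dev/sda1
"""


def osds_by_host(text):
    """Group [osdN] sections by their 'host' value (hypothetical helper)."""
    # interpolation=None keeps values such as '$id' and '%' untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    layout = defaultdict(list)
    for section in parser.sections():
        # Only numbered sections describe a concrete daemon; [osd] holds defaults.
        if re.fullmatch(r"osd\d+", section):
            entry = parser[section]
            layout[entry["host"]].append(
                (section, entry["btrfs devs"], entry["osd journal"])
            )
    return dict(layout)


layout = osds_by_host(SAMPLE)
for host, osds in layout.items():
    print(host, osds)
```

In this config each host carries two OSDs whose journals share one device (/dev/sda1 and /dev/sda2), which a grouping like this makes easy to see.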
Here are the disk statistics, taken after osd2 and osd4 crashed:
Filesystem      Total       Used  Available  Use%  Mounted on
/dev/sdc    143373312  124954676   18418636   88%  /data/osd0
/dev/sdd    143373312  137639524    5733788   97%  /data/osd1
/dev/sdc    143373312  120350584   23022728   84%  /data/osd2
/dev/sdd    143373312  141986188    1387124  100%  /data/osd3
/dev/sdc    143373312  112025716   31347596   79%  /data/osd4
/dev/sdd    143373312  115163124   28210188   81%  /data/osd5
I will send some statistics for ext3 as well.
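To put a number on how uneven the utilization is, the figures above can be summarized with a short script. This is only a sketch: the totals and used values are copied from the listing, the variable and function names are made up for illustration, and the percentages are computed without the upward rounding that df applies.

```python
from statistics import mean, pstdev

# (osd, total, used) copied from the listing above; units are as df reported them.
DISKS = [
    ("osd0", 143373312, 124954676),
    ("osd1", 143373312, 137639524),
    ("osd2", 143373312, 120350584),
    ("osd3", 143373312, 141986188),
    ("osd4", 143373312, 112025716),
    ("osd5", 143373312, 115163124),
]


def utilization(disks):
    """Return the percentage of space used per OSD."""
    return {name: 100.0 * used / total for name, total, used in disks}


util = utilization(DISKS)
avg = mean(util.values())
spread = max(util.values()) - min(util.values())
fullest = max(util, key=util.get)
emptiest = min(util, key=util.get)

print(f"mean {avg:.1f}%  stdev {pstdev(util.values()):.1f}  spread {spread:.1f} points")
print(f"fullest: {fullest} ({util[fullest]:.1f}%), emptiest: {emptiest} ({util[emptiest]:.1f}%)")
```

With these numbers the fullest disk (osd3) is at roughly 99% and the emptiest (osd4) at roughly 78%, a gap of about 21 percentage points across six equally sized disks.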
----- Original message -----
From: "Gregory Farnum" <gregory.farnum@dreamhost.com>
To: "Martin Wilderoth" <martin.wilderoth@linserv.se>
Cc: ceph-devel@vger.kernel.org
Sent: Tuesday, 12 Apr 2011 14:24:14
Subject: Re: osd stops
On Tuesday, April 12, 2011 at 11:05 AM, Martin Wilderoth wrote:
Thanks for the answer, now I know the reason. Some of my OSDs were about 90% full, and dmesg also shows btrfs errors on the hosts. I will rerun the test with another filesystem, ext3 :-) Or is some other filesystem better? It's a BackupPC filesystem with a lot of hard links and data that I would like to test running in Ceph.
ext3 or really any other FS will handle it better, although Ceph itself is also not super-resilient to such situations. Eventually we will have automatic rebalancing of data but it's not in there right now.
Could you maybe send along your config file and the local filesystem statistics on each of your OSDs? CRUSH is pseudo-random, so it's not going to have perfectly even utilization, but if the variance is too high we'll want to look into it sooner rather than later.
-Greg
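As a rough illustration of why pseudo-random placement does not give perfectly even utilization, here is a toy sketch. It is not CRUSH: it simply hashes hypothetical object names onto six equally weighted OSDs with SHA-1 and counts where they land. The function names and the `obj{i}` naming scheme are assumptions made for the example.

```python
import hashlib
from collections import Counter


def place(obj_name, num_osds):
    """Map an object name to an OSD index by hashing (a stand-in for CRUSH)."""
    digest = hashlib.sha1(obj_name.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_osds


def distribution(num_objects, num_osds):
    """Count how many of the hypothetical objects land on each OSD."""
    counts = Counter(place(f"obj{i}", num_osds) for i in range(num_objects))
    return [counts.get(osd, 0) for osd in range(num_osds)]


for n in (60, 6000, 60000):
    counts = distribution(n, 6)
    ideal = n / 6
    worst = max(abs(c - ideal) / ideal for c in counts)
    print(n, counts, f"worst deviation from even: {worst:.1%}")
```

The relative deviation typically shrinks as the number of objects grows, but it does not reach zero. On a real cluster, objects of different sizes add further unevenness, which this sketch ignores.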
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html