* v0.38 released
From: Sage Weil @ 2011-11-11 5:14 UTC
To: ceph-devel

It's a week delayed, but v0.38 is ready. The highlights:

 * osd: some peering refactoring
 * osd: 'replay' period is per-pool (now only affects fs data pool)
 * osd: clean up old osdmaps
 * osd: allow admin to revert lost objects to prior versions (or delete)
 * mkcephfs: generate reasonable crush map based on 'host' and 'rack'
   fields in [osd.NN] sections of ceph.conf
 * radosgw: bucket index improvements
 * radosgw: improved swift support
 * rbd: misc command line tool fixes
 * debian: misc packaging fixes (including dependency breakage on upgrades)
 * ceph: query daemon perfcounters via command line tool

The big upcoming items for v0.39 are RBD layering (image cloning), further
improvements to radosgw's Swift support, and some monitor failure recovery
and bootstrapping improvements. We're also continuing work on the
automation bits that the Chef cookbooks and Juju charms will use, and a
Crowbar barclamp was also just posted on github. Several patches are
still working their way into libvirt and qemu to improve support for RBD
authentication.

You can get v0.38 from the usual places:

 * Git at git://github.com/NewDreamNetwork/ceph.git
 * Tarball at http://ceph.newdream.net/download/ceph-0.38.tar.gz
 * For Debian/Ubuntu packages see
   http://ceph.newdream.net/docs/latest/ops/install/mkcephfs/#installing-the-packages
 * For RPMs see
   https://build.opensuse.org/project/show?project=home%3Ahmacht%3Astorage

sage
* Re: v0.38 released
From: Andre Noll @ 2011-11-15 16:42 UTC
To: Sage Weil; +Cc: ceph-devel

On Thu, Nov 10, 21:14, Sage Weil wrote:
> * osd: some peering refactoring
> * osd: 'replay' period is per-pool (now only affects fs data pool)
> * osd: clean up old osdmaps
> * osd: allow admin to revert lost objects to prior versions (or delete)
> * mkcephfs: generate reasonable crush map based on 'host' and 'rack'
> fields in [osd.NN] sections of ceph.conf
> * radosgw: bucket index improvements
> * radosgw: improved swift support
> * rbd: misc command line tool fixes
> * debian: misc packaging fixes (including dependency breakage on upgrades)
> * ceph: query daemon perfcounters via command line tool
>
> The big upcoming items for v0.39 are RBD layering (image cloning), further
> improvements to radosgw's Swift support, and some monitor failure recovery
> and bootstrapping improvements. We're also continuing work on the
> automation bits that the Chef cookbooks and Juju charms will use, and a
> Crowbar barclamp was also just posted on github. Several patches are
> still working their way into libvirt and qemu to improve support for RBD
> authentication.

Any plans to address the ENOSPC issue? I gave v0.38 a try and the
file system behaves like the older (<= 0.36) versions I've tried
before when it fills up: The ceph mounts hang on all clients.

But there is progress: Sync is now interruptible (it used to block
in D state so that it could not be killed even with SIGKILL), and
umount works even if the file system is full. However, subsequent
mount attempts then fail with "mount error 5 = Input/output error".

Our test setup consists of one mds, one monitor and 8 osds. mds and
monitor are on the same node, and this node is not an osd. All
nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
to 3.1.1 if this is expected to make a difference.

Here's some output of "ceph -w". Funnily enough it reports 770G of free
disk space although the writing process terminated with ENOSPC.

2011-11-15 12:12:45.388535 pg v38805: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:45.589228 mds e4: 1/1/1 up {0=0=up:active}
2011-11-15 12:12:45.589326 osd e11: 8 osds: 8 up, 8 in full
2011-11-15 12:12:45.589908 log 2011-11-15 12:12:19.599894 osd.326 192.168.3.26:6800/1673 168 : [INF] 0.593 scrub ok
2011-11-15 12:12:45.590000 mon e1: 1 mons at {0=192.168.3.34:6789/0}
2011-11-15 12:12:49.554163 pg v38806: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:54.526661 pg v38807: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
2011-11-15 12:12:56.309292 pg v38808: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail

Thanks
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: v0.38 released
From: Gregory Farnum @ 2011-11-15 19:53 UTC
To: Andre Noll; +Cc: Sage Weil, ceph-devel

On Tue, Nov 15, 2011 at 8:42 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Thu, Nov 10, 21:14, Sage Weil wrote:
>> * osd: some peering refactoring
>> * osd: 'replay' period is per-pool (now only affects fs data pool)
>> * osd: clean up old osdmaps
>> * osd: allow admin to revert lost objects to prior versions (or delete)
>> * mkcephfs: generate reasonable crush map based on 'host' and 'rack'
>> fields in [osd.NN] sections of ceph.conf
>> * radosgw: bucket index improvements
>> * radosgw: improved swift support
>> * rbd: misc command line tool fixes
>> * debian: misc packaging fixes (including dependency breakage on upgrades)
>> * ceph: query daemon perfcounters via command line tool
>>
>> The big upcoming items for v0.39 are RBD layering (image cloning), further
>> improvements to radosgw's Swift support, and some monitor failure recovery
>> and bootstrapping improvements. We're also continuing work on the
>> automation bits that the Chef cookbooks and Juju charms will use, and a
>> Crowbar barclamp was also just posted on github. Several patches are
>> still working their way into libvirt and qemu to improve support for RBD
>> authentication.
>
> Any plans to address the ENOSPC issue? I gave v0.38 a try and the
> file system behaves like the older (<= 0.36) versions I've tried
> before when it fills up: The ceph mounts hang on all clients.

This is something we hope to address in the future, but we haven't
come up with a good solution yet. (I haven't seen a good solution in
other distributed systems either...)

> But there is progress: Sync is now interruptible (it used to block
> in D state so that it could not be killed even with SIGKILL), and
> umount works even if the file system is full. However, subsequent
> mount attempts then fail with "mount error 5 = Input/output error".
Yay!

> Our test setup consists of one mds, one monitor and 8 osds. mds and
> monitor are on the same node, and this node is not an osd. All
> nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
> to 3.1.1 if this is expected to make a difference.
>
> Here's some output of "ceph -w". Funnily enough it reports 770G of free
> disk space although the writing process terminated with ENOSPC.
Right now RADOS (the object store under the Ceph FS) is pretty
conservative about reporting ENOSPC. Since btrfs is also pretty
unhappy when its disk fills up, an OSD marks itself as "full" once
it's reached 95% of its capacity, and once a single OSD goes full then
RADOS marks itself that way so you don't overfill a disk and have
really bad things happen. (Hung mounts suck but are a lot better than
mysterious data loss.)

Looking at your ceph -s I'm surprised by a few things, though...
1) Why do you have so many PGs? 8k/OSD is rather a lot.
2) I wouldn't expect your OSDs to have become so unbalanced that one
of them hits 95% full when the cluster's only at 84% capacity.

What is this cluster used for? Are you running anything besides the
Ceph FS on it? (radosgw, maybe?)
-Greg

> 2011-11-15 12:12:45.388535 pg v38805: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:45.589228 mds e4: 1/1/1 up {0=0=up:active}
> 2011-11-15 12:12:45.589326 osd e11: 8 osds: 8 up, 8 in full
> 2011-11-15 12:12:45.589908 log 2011-11-15 12:12:19.599894 osd.326 192.168.3.26:6800/1673 168 : [INF] 0.593 scrub ok
> 2011-11-15 12:12:45.590000 mon e1: 1 mons at {0=192.168.3.34:6789/0}
> 2011-11-15 12:12:49.554163 pg v38806: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:54.526661 pg v38807: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
> 2011-11-15 12:12:56.309292 pg v38808: 65940 pgs: 1956 creating, 63984 active+clean; 1856 GB data, 3730 GB used, 770 GB / 4600 GB avail
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
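
The rule Greg describes (an OSD flags itself full at 95% of its capacity, and one full OSD is enough for the whole cluster to be treated as full) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the message, not Ceph's actual code: the function names and the sample numbers are made up, and the 0.95 threshold is taken from the text.

```python
FULL_RATIO = 0.95  # threshold quoted in the message above


def osd_is_full(used_gb, capacity_gb, full_ratio=FULL_RATIO):
    """True once a single OSD has reached the full ratio."""
    return used_gb >= full_ratio * capacity_gb


def cluster_is_full(osds, full_ratio=FULL_RATIO):
    """The cluster counts as full as soon as ANY one OSD is full."""
    return any(osd_is_full(used, cap, full_ratio) for used, cap in osds.values())


def cluster_utilization(osds):
    """Overall used/capacity ratio across all OSDs."""
    used = sum(u for u, _ in osds.values())
    capacity = sum(c for _, c in osds.values())
    return used / capacity


# Hypothetical two-OSD cluster: (used GB, capacity GB) per OSD id.
SAMPLE = {0: (480, 500), 1: (300, 800)}

if __name__ == "__main__":
    print("utilization:", cluster_utilization(SAMPLE))
    print("cluster full:", cluster_is_full(SAMPLE))
```

With these made-up numbers the cluster as a whole is only 60% used, yet it is reported full because the small OSD has crossed the threshold. That is the same shape of situation as "free space reported, but writes fail with ENOSPC".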
* Re: v0.38 released
From: Andre Noll @ 2011-11-16 9:56 UTC
To: Gregory Farnum; +Cc: Sage Weil, ceph-devel

On Tue, Nov 15, 11:53, Gregory Farnum wrote:
> > Any plans to address the ENOSPC issue? I gave v0.38 a try and the
> > file system behaves like the older (<= 0.36) versions I've tried
> > before when it fills up: The ceph mounts hang on all clients.
>
> This is something we hope to address in the future, but we haven't
> come up with a good solution yet. (I haven't seen a good solution in
> other distributed systems either...)

Glad to hear the problem is known and will be addressed. We'd love to
use ceph as a global tmp file system on our cluster, so users *will*
fill it up.

> > But there is progress: Sync is now interruptible (it used to block
> > in D state so that it could not be killed even with SIGKILL), and
> > umount works even if the file system is full. However, subsequent
> > mount attempts then fail with "mount error 5 = Input/output error".
> Yay!
>
> > Our test setup consists of one mds, one monitor and 8 osds. mds and
> > monitor are on the same node, and this node is not an osd. All
> > nodes are running Linux-3.0.9 ATM, but I would be willing to upgrade
> > to 3.1.1 if this is expected to make a difference.
> >
> > Here's some output of "ceph -w". Funnily enough it reports 770G of free
> > disk space although the writing process terminated with ENOSPC.
> Right now RADOS (the object store under the Ceph FS) is pretty
> conservative about reporting ENOSPC. Since btrfs is also pretty
> unhappy when its disk fills up, an OSD marks itself as "full" once
> it's reached 95% of its capacity, and once a single OSD goes full then
> RADOS marks itself that way so you don't overfill a disk and have
> really bad things happen. (Hung mounts suck but are a lot better than
> mysterious data loss.)

Six of the eight btrfs file systems underlying ceph are 500G in size,
the other two are 800G. Used disk space varies between 459G and 476G.
The peak 476G is on a 500G fs, so this one is 98% full.

The data was written by a single client using stress, which simply
created 5G files in an endless loop. All these files are in the top
level directory.

> Looking at your ceph -s I'm surprised by a few things, though...
> 1) Why do you have so many PGs? 8k/OSD is rather a lot

I can't answer this question, but please have a look at the ceph config
file below. Maybe you can spot something odd in it.

> 2) I wouldn't expect your OSDs to have become so unbalanced that one
> of them hits 95% full when the cluster's only at 84% capacity.

This seems to be due to the fact that roughly the same amount of data
was written to each file system despite the different file system
sizes. Hence only 60% disk space is used on the two 800G file systems.

> What is this cluster used for? Are you running anything besides the
> Ceph FS on it? (radosgw, maybe?)

Besides the ceph daemons only sshd and sge_execd (for executing cluster
jobs) are running there. Job submission was disabled on these nodes
during the tests, so all systems were completely idle.

Thanks for your help
Andre
---
[global]
	; enable secure authentication
	;auth supported = cephx
	;osd journal size = 100 ; measured in MB

[client] ; userspace client
	debug ms = 1
	debug client = 10

; You need at least one monitor. You need at least three if you want to
; tolerate any node failures. Always create an odd number.
[mon]
	mon data = /var/ceph/mon$id
	; some minimal logging (just message traffic) to aid debugging
	; debug ms = 1
	; debug auth = 20 ;authentication code

[mon.0]
	host = node334
	mon addr = 192.168.3.34:6789

; You need at least one mds. Define two to get a standby.
[mds]
	; where the mds keeps its secret encryption keys
	keyring = /var/ceph/keyring.$name
	; debug mds = 20
[mds.0]
	host = node334

; osd
; You need at least one. Two if you want data to be replicated.
; Define as many as you like.
[osd]
	; This is where the btrfs volume will be mounted.
	osd data = /var/ceph/osd$id
	keyring = /etc/ceph/keyring.$name
	; Ideally, make this a separate disk or partition. A few GB
	; is usually enough; more if you have fast disks. You can use
	; a file under the osd data dir if need be
	; (e.g. /data/osd$id/journal), but it will be slower than a
	; separate disk or partition.
	osd journal = /var/ceph/osd$id/journal
	; If the OSD journal is a file, you need to specify the size. This is specified in MB.
	osd journal size = 512

[osd.325]
	host = node325
	btrfs devs = /dev/ceph/data
[osd.326]
	host = node326
	btrfs devs = /dev/ceph/data
[osd.327]
	host = node327
	btrfs devs = /dev/ceph/data
[osd.328]
	host = node328
	btrfs devs = /dev/ceph/data
[osd.329]
	host = node329
	btrfs devs = /dev/ceph/data
[osd.330]
	host = node330
	btrfs devs = /dev/ceph/data
[osd.331]
	host = node331
	btrfs devs = /dev/ceph/data
[osd.333]
	host = node333
	btrfs devs = /dev/ceph/data
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: v0.38 released
From: Tommi Virtanen @ 2011-11-16 18:04 UTC
To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Wed, Nov 16, 2011 at 01:56, Andre Noll <maan@systemlinux.org> wrote:
>> 2) I wouldn't expect your OSDs to have become so unbalanced that one
>> of them hits 95% full when the cluster's only at 84% capacity.
>
> This seems to be due to the fact that roughly the same amount of data
> was written to each file system despite the different file system
> sizes. Hence only 60% disk space is used on the two 800G file systems.

That would be it. You probably want to set the weights of your OSDs
according to their storage capacity, otherwise the smaller ones will
get filled first.

http://ceph.newdream.net/wiki/Monitor_commands#reweight
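
As a rough illustration of Tommi's suggestion, the sketch below derives one weight per OSD from its storage capacity and prints the matching `ceph osd reweight N W` commands (the command form listed on the wiki page linked above). Several things here are assumptions made for the example: the thread says six file systems are 500G and two are 800G, but not which OSD ids are the large ones; scaling so that the largest OSD gets weight 1.0 is a convention chosen here; and the helper names are hypothetical.

```python
# Capacities in GB. Six 500G and two 800G file systems are reported in the
# thread; WHICH osd ids are the large ones is a guess for illustration only.
CAPACITY_GB = {
    325: 500, 326: 500, 327: 500, 328: 500,
    329: 500, 330: 500, 331: 800, 333: 800,
}


def capacity_weights(capacity_gb):
    """Give the largest OSD weight 1.0 and every other OSD a weight
    proportional to its capacity (assumed convention, see the note above)."""
    largest = max(capacity_gb.values())
    return {osd: round(size / largest, 3) for osd, size in capacity_gb.items()}


def reweight_commands(weights):
    """Render one 'ceph osd reweight N W' line per OSD, ordered by id."""
    return [f"ceph osd reweight {osd} {w}" for osd, w in sorted(weights.items())]


if __name__ == "__main__":
    # Only prints the commands; nothing is executed against a cluster.
    for line in reweight_commands(capacity_weights(CAPACITY_GB)):
        print(line)
```

The script only prints commands so that they can be reviewed before anything is applied. Capacity is the single input here; any other consideration would have to be folded in by hand.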
* Re: v0.38 released
From: Andre Noll @ 2011-11-17 10:35 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Wed, Nov 16, 10:04, Tommi Virtanen wrote:
> On Wed, Nov 16, 2011 at 01:56, Andre Noll <maan@systemlinux.org> wrote:
> >> 2) I wouldn't expect your OSDs to have become so unbalanced that one
> >> of them hits 95% full when the cluster's only at 84% capacity.
> >
> > This seems to be due to the fact that roughly the same amount of data
> > was written to each file system despite the different file system
> > sizes. Hence only 60% disk space is used on the two 800G file systems.
>
> That would be it. You probably want to set the weights of your OSDs
> according to their storage capacity, otherwise the smaller ones will
> get filled first.
>
> http://ceph.newdream.net/wiki/Monitor_commands#reweight

I was under the impression that equal weights on all osds means to
fill up all file systems by the same percentage, i.e. that file system
sizes are already taken care of.

But apparently this is not the case. So one has to set the weights
manually according to the available disk space.

Thanks for enlightening me.
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: v0.38 released
From: Tommi Virtanen @ 2011-11-17 18:01 UTC
To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Thu, Nov 17, 2011 at 02:35, Andre Noll <maan@systemlinux.org> wrote:
> I was under the impression that equal weights on all osds means to
> fill up all file systems by the same percentage, i.e. that file system
> sizes are already taken care of.
>
> But apparently this is not the case. So one has to set the weights
> manually according to the available disk space.

The weight is actually a combination of all the factors that would go
in: storage size, disk IO speed, network link bandwidth, heat in that
part of the data center, future expansion plans, ..

We could automate more of it, but it really is a fundamentally holistic
number, and setting it based on just one aspect of reality will lead to
someone else being unhappy. So it goes something like this:

Step 1: improve documentation
Step 2: have a monitoring system be able to feed back information to
use as osd weights, with admin customizability
* Re: v0.38 released
From: Andre Noll @ 2011-11-18 15:01 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Thu, Nov 17, 10:01, Tommi Virtanen wrote:
> On Thu, Nov 17, 2011 at 02:35, Andre Noll <maan@systemlinux.org> wrote:
> > I was under the impression that equal weights on all osds means to
> > fill up all file systems by the same percentage, i.e. that file system
> > sizes are already taken care of.
> >
> > But apparently this is not the case. So one has to set the weights
> > manually according to the available disk space.
>
> The weight is actually a combination of all the factors that would go
> in: storage size, disk IO speed, network link bandwidth, heat in that
> part of the data center, future expansion plans, ..

True. But as we all know, perfect is the enemy of good ;)

> We could automate more of it, but it really is a fundamentally holistic
> number, and setting it based on just one aspect of reality will lead to
> someone else being unhappy. So it goes something like this:
>
> Step 1: improve documentation

For starters, it would be nice to include the ceph osd subcommands
in the man pages. To my knowledge they are only documented on the
(old) wiki

	http://ceph.newdream.net/wiki/Monitor_commands

at the moment. Would a patch that adds the subcommands and descriptions
to the man pages be accepted?

If so, I'd be willing to do this work. However, the files in man/
of the ceph git repo seem to be generated by docutils, so I suspect
they are not meant to be edited directly. What's the preferred way
to patch the man pages?

> Step 2: have a monitoring system be able to feed back information to
> use as osd weights, with admin customizability

How could such a monitoring system be implemented? In particular, if
abstract criteria like "future extension plans" have to be considered.

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: v0.38 released
From: Tommi Virtanen @ 2011-11-18 18:47 UTC
To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Fri, Nov 18, 2011 at 07:01, Andre Noll <maan@systemlinux.org> wrote:
> For starters, it would be nice to include the ceph osd subcommands
> in the man pages. To my knowledge they are only documented on the
> (old) wiki
>
> http://ceph.newdream.net/wiki/Monitor_commands
>
> at the moment. Would a patch that adds the subcommands and descriptions
> to the man pages be accepted?

I'm not sure if the man pages are the best place for that; there are a
lot of subcommands, and man forces it into a big list of things. I'd
personally go for putting a reference under
http://ceph.newdream.net/docs/latest/ops/ and using the structure for
separating osd/mon/mds etc into slightly more manageable chunks.

> If so, I'd be willing to do this work. However, the files in man/
> of the ceph git repo seem to be generated by docutils, so I suspect
> they are not meant to be edited directly. What's the preferred way
> to patch the man pages?

The content comes from doc/man/ and is built with ./admin/build-doc

That puts the whole html into build-doc/output/html/ and the *roff in
build-doc/output/man/ and from there it is migrated to man/ "by need"
(there's too much noise in the changes to keep doing that all the
time, and there are too many toolchain dependencies to generate docs on
every build).

>> Step 2: have a monitoring system be able to feed back information to
>> use as osd weights, with admin customizability
> How could such a monitoring system be implemented? In particular if
> abstract criteria like "future extension plans" have to be considered.

Going back to my initial list: storage size, disk IO speed, network
link bandwidth, heat in that part of the data center, future expansion
plans, ..

That divides into 3 groups:
- things that are more about the capability of the hardware (= change
  very seldom)
- things that are monitored outside of ceph
- plans

Hence, it seems to me that a sysadmin would do something like look at
the node data gathered by something like Ohai/Chef, combine that with
collectd/munin-style monitoring of the data center, optionally do
something like "increase weights of rack 7 by 40%", and then spit out
a mapping of osd id -> weight.

Our chef cookbooks will probably provide a skeleton for that in the
future, but that's not a short term need; most installations will
probably set the weights once when the hardware is new, and I'd expect
practically all clusters <6 months old to have fairly homogeneous
hardware, and thus identical weights.
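
The workflow sketched in the message above (start from hardware facts, apply an admin decision such as "increase weights of rack 7 by 40%", emit a mapping of osd id -> weight) could look roughly like this in Python. Everything in the example is hypothetical: the node inventory, the rack numbers, the choice of capacity as the only hardware input, and the normalization to a maximum of 1.0 are assumptions for illustration, not part of Ceph or of the Chef cookbooks mentioned.

```python
def osd_weights(nodes, rack_factor=None):
    """Combine a hardware-derived base value (capacity) with per-rack
    adjustments chosen by an admin, then scale so the largest is 1.0."""
    rack_factor = rack_factor or {}
    raw = {
        osd: info["capacity_gb"] * rack_factor.get(info["rack"], 1.0)
        for osd, info in nodes.items()
    }
    top = max(raw.values())
    return {osd: round(value / top, 3) for osd, value in raw.items()}


# Hypothetical inventory, e.g. as it might be gathered by a node-data tool.
NODES = {
    0: {"capacity_gb": 500, "rack": 7},
    1: {"capacity_gb": 800, "rack": 3},
    2: {"capacity_gb": 500, "rack": 3},
}

# The human decision from the example: boost rack 7 by 40%.
RACK_FACTOR = {7: 1.4}

if __name__ == "__main__":
    for osd, weight in sorted(osd_weights(NODES, RACK_FACTOR).items()):
        print(osd, weight)
```

The mechanical part is small; the inputs that feed `RACK_FACTOR` are the part that still needs a person.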
* Re: v0.38 released
From: Andre Noll @ 2011-11-21 17:32 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Fri, Nov 18, 10:47, Tommi Virtanen wrote:
> On Fri, Nov 18, 2011 at 07:01, Andre Noll <maan@systemlinux.org> wrote:
> > For starters, it would be nice to include the ceph osd subcommands
> > in the man pages. To my knowledge they are only documented on the
> > (old) wiki
> >
> > http://ceph.newdream.net/wiki/Monitor_commands
> >
> > at the moment. Would a patch that adds the subcommands and descriptions
> > to the man pages be accepted?
>
> I'm not sure if the man pages are the best place for that; there are a
> lot of subcommands, and man forces it into a big list of things. I'd
> personally go for putting a reference under
> http://ceph.newdream.net/docs/latest/ops/ and using the structure for
> separating osd/mon/mds etc into slightly more manageable chunks.

I believe that code and documentation should be located as close as
possible, and I'd also prefer to edit and access the documentation
locally via command line tools rather than through a browser. But
I don't have a strong opinion on this, so let's go for the web
documentation.

Should I prepare something and post a request for inclusion to the
web pages on this mailing list, or do you want me to edit the web
documentation directly?

> > If so, I'd be willing to do this work. However, the files in man/
> > of the ceph git repo seem to be generated by docutils, so I suspect
> > they are not meant to be edited directly. What's the preferred way
> > to patch the man pages?
>
> The content comes from doc/man/ and is built with ./admin/build-doc
>
> That puts the whole html into build-doc/output/html/ and the *roff in
> build-doc/output/man/ and from there it is migrated to man/ "by need"
> (there's too much noise in the changes to keep doing that all the
> time, and there are too many toolchain dependencies to generate docs on
> every build).

I see, thanks for explaining. The ./admin/build-doc command worked
for me out of the box on an Ubuntu lucid system btw.

> >> Step 2: have a monitoring system be able to feed back information to
> >> use as osd weights, with admin customizability
> > How could such a monitoring system be implemented? In particular if
> > abstract criteria like "future extension plans" have to be considered.
>
> Going back to my initial list: storage size, disk IO speed, network
> link bandwidth, heat in that
> part of the data center, future expansion plans, ..
>
> That divides into 3 groups:
> - things that are more about the capability of the hardware (= change
> very seldom)
> - things that are monitored outside of ceph
> - plans
>
> Hence, it seems to me that a sysadmin would do something like look at
> the node data gathered by something like Ohai/Chef, combine that with
> collectd/munin-style monitoring of the data center, optionally do
> something like "increase weights of rack 7 by 40%", and then spit out
> a mapping of osd id -> weight.

OK, got the idea. However, in this example the difficult thing is
the decision "increase weights of rack 7 by 40%", which is made by
a human. Recomputing the osd weights accordingly should be fairly
simple.

> Our chef cookbooks will probably provide a skeleton for that in the
> future, but that's not a short term need; most installations will
> probably set the weights once when the hardware is new, and I'd expect
> practically all clusters <6 months old to have fairly homogeneous
> hardware, and thus identical weights.

Are you implying that ceph is only suitable for new clusters with
homogeneous hardware? I'm asking because our cluster is far from
homogeneous. There are 8 year old 2-core nodes with small SCSI disks
as well as 64-core boxes with much larger SATA disks.

Thanks
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: v0.38 released
From: Tommi Virtanen @ 2011-11-21 17:36 UTC
To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Mon, Nov 21, 2011 at 09:32, Andre Noll <maan@systemlinux.org> wrote:
> I believe that code and documentation should be located as close as
> possible, and I'd also prefer to edit and access the documentation
> locally via command line tools rather than through a browser. But
> I don't have a strong opinion on this, so let's go for the web
> documentation.

I agree, and that's a big part of the reasons for choosing the
toolchain I did! All the docs from http://ceph.newdream.net/docs are
in the doc/ directory of the source tree.

> Should I prepare something and post a request for inclusion to the
> web pages on this mailing list, or do you want me to edit the web
> documentation directly?

Submit it like you would submit a code change.

> Are you implying that ceph is only suitable for new clusters with
> homogeneous hardware? I'm asking because our cluster is far from
> homogeneous. There are 8 year old 2-core nodes with small SCSI disks
> as well as 64-core boxes with much larger SATA disks.

8 year old? Wow.

We do intend to fully support clusters with nodes of different
capacity and speed (as I strongly believe most clusters will go
through such a phase in their life). It's just not the default
configuration, and won't be needed by most setups in the beginning.
* Re: v0.38 released
From: Andre Noll @ 2011-11-21 18:06 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

On Mon, Nov 21, 09:36, Tommi Virtanen wrote:
> I agree, and that's a big part of the reasons for choosing the
> toolchain I did! All the docs from http://ceph.newdream.net/docs are
> in the doc/ directory of the source tree.
>
> > Should I prepare something and post a request for inclusion to the
> > web pages on this mailing list, or do you want me to edit the web
> > documentation directly?
>
> Submit it like you would submit a code change.

OK, will do so. I'll start with the list of OSD subcommands from the
old wiki and try to improve on this. As soon as I have something to
present I'll send an RFC-style patch series to the list. This will
likely contain questions on certain subcommands, and I'll include the
relevant parts of any replies in subsequent versions of the patch
series.

Thanks
Andre
--
Max Planck Institute for Developmental Biology
Spemannstrasse 35, 72076 Tübingen, Germany
Phone: (+049) 7071 601 829
* [PATCH/RFC 0/6]: Introduction
From: Andre Noll @ 2011-11-28 14:04 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel

Here is what I have so far. This patch set imports the documentation
of the OSD subcommands from the wiki to the previously empty file
doc/ops/monitor.rst of the git repo.

The first patch is just the result of a cut & paste operation of the
corresponding wiki page while the other patches try to improve on this.
The aim is to have complete and up-to-date documentation for all
osd subcommands.

I don't believe the series is ready for inclusion yet as some
subcommands (cluster_snap, lost, in, out, ...) still lack useful
descriptions. It would be nice to add one sentence to each such command
that explains its purpose and the circumstances under which one might
want to use this particular command.

So please review and comment. I will fold in your suggestions and
follow up with a re-rolled series provided there is substantial
feedback.

Thanks
Andre
* [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll ` (5 subsequent siblings) 6 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll This adds the content of the wiki page at http://ceph.newdream.net/wiki/Monitor_commands to doc/ops/monitor.rst in order to make it available at the new official location for the ceph documentation. This first patch is just the result of a cut-and-paste operation. There are no changes in content, but the text was converted to rst format. Signed-Off-By: Andre Noll <maan@systemlinux.org> --- doc/ops/monitor.rst | 178 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 177 insertions(+), 1 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index 98c75c3..626685e 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -4,4 +4,180 @@ Monitoring Ceph ================= -.. todo:: write me +Monitor commands +---------------- + +Monitor commands are issued using the ceph utility (in versions before +Dec08 it was called cmonctl):: + + $ ceph [-m monhost] command + +where the command is usually of the form:: + + $ ceph subsystem command + +System commands +--------------- + +:: + + $ ceph stop + +Cleanly shuts down the cluster. :: + + $ ceph -s + +Shows an overview of the current status of the cluster. :: + + $ ceph -w + +Shows a running summary of the status of the cluster, and major events. + +AUTH subsystem +-------------- +:: + + $ ceph auth add <osd> <--in-file|-i> <path-to-osd-keyring> + +Add auth keyring for an osd. :: + + $ ceph auth list + +Show auth key OSD subsystem. + +OSD subsystem +------------- +:: + + $ ceph osd stat + +Query osd subsystem status. 
:: + + $ ceph osd getmap -o file + +Write a copy of the most recent osd map to a file. See osdmaptool. :: + + $ ceph osd getcrushmap -o file + +Write a copy of the crush map from the most recent osd map to +file. This is functionally equivalent to :: + + $ ceph osd getmap -o /tmp/osdmap + $ osdmaptool /tmp/osdmap --export-crush file + +:: + + $ ceph osd getmaxosd + +Query the current max_osd parameter in the osd map. :: + + $ ceph osd setmap -i file + +Import the given osd map. Note that this can be a bit dangerous, +since the osd map includes dynamic state about which OSDs are current +on or offline; only do this if you've just modified a (very) recent +copy of the map. :: + + $ ceph osd setcrushmap -i file + +Import the given crush map. :: + + $ ceph osd setmaxosd + +Set the max_osd parameter in the osd map. This is necessary when +expanding the storage cluster. :: + + $ ceph osd down N + +Mark osdN down. :: + + $ ceph osd out N + +Mark osdN out of the distribution (i.e. allocated no data). :: + + $ ceph osd in N + +Mark osdN in the distribution (i.e. allocated data). :: + + $ ceph class list + +List classes that are loaded in the ceph cluster. :: + + $ ceph osd pause + $ ceph osd unpause + +TODO :: + + $ ceph osd reweight N W + +Sets the weight of osdN to W. :: + + $ ceph osd reweight-by-utilization [threshold] + +Reweights all the OSDs by reducing the weight of OSDs which are +heavily overused. By default it will adjust the weights downward on +OSDs which have 120% of the average utilization, but if you include +threshold it will use that percentage instead. :: + + $ ceph osd blacklist add ADDRESS[:source_port] [TIME] + $ ceph osd blacklist rm ADDRESS[:source_port] + +Adds/removes the address to/from the blacklist. When adding an address, +you can specify how long it should be blacklisted in seconds; otherwise +it will default to 1 hour. A blacklisted address is prevented from +connecting to any osd. 
Blacklisting is most often used to prevent a +laggy mds making bad changes to data on the osds. + +These commands are mostly only useful for failure testing, as +blacklists are normally maintained automatically and shouldn't need +manual intervention. :: + + $ ceph osd pool mksnap POOL SNAPNAME + $ ceph osd pool rmsnap POOL SNAPNAME + +Creates/deletes a snapshot of a pool. :: + + $ ceph osd pool create POOL + $ ceph osd pool delete POOL + +Creates/deletes a storage pool. :: + + $ ceph osd pool set POOL FIELD VALUE + +Changes a pool setting. Valid fields are: + + * ``size``: Sets the number of copies of data in the pool. + * ``pg_num``: TODO + * ``pgp_num``: TODO + +:: + + $ ceph osd scrub N + +Sends a scrub command to osdN. To send the command to all osds, use ``*``. +TODO: what does this actually do :: + + $ ceph osd repair N + +Sends a repair command to osdN. To send the command to all osds, use ``*``. +TODO: what does this actually do + +MDS subsystem +------------- + +Change configuration parameters on a running mds. :: + + $ ceph mds tell <mds-id> injectargs '--<switch> <value> [--<switch> <value>]' + +Example:: + + $ ceph mds tell 0 injectargs '--debug_ms 1 --debug_mds 10' + +Enables debug messages. :: + + $ ceph mds stat + +Displays the status of all metadata servers. + +dump, getmap, stop, set_max_mds, setmap: TODO + -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
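The threshold rule that this patch documents for "ceph osd reweight-by-utilization" can be illustrated with a short Python sketch. This is only a model of the selection step under stated assumptions: the function name overloaded_osds and the utilization figures are hypothetical, the comparison is assumed to be "at least threshold percent of the average", and the way Ceph actually lowers the weights afterwards is not described in the patch and is not modelled here.

```python
def overloaded_osds(utilization, threshold=120):
    """Return the ids of OSDs considered heavily overused.

    utilization maps an OSD id to its utilization in percent.
    threshold is a percentage of the average utilization; 120 is the
    default mentioned in the documentation text.
    Assumption: an OSD qualifies when it is at or above the limit.
    """
    if not utilization:
        return []
    average = sum(utilization.values()) / len(utilization)
    limit = average * threshold / 100.0
    return sorted(osd for osd, used in utilization.items() if used >= limit)


# Hypothetical cluster: the average utilization is 60 percent.
usage = {0: 50, 1: 52, 2: 90, 3: 48}
print(overloaded_osds(usage))                # default threshold of 120
print(overloaded_osds(usage, threshold=85))  # explicit threshold
```

With the default, only OSDs at or above 72 percent (120% of the 60 percent average) are selected; passing a different threshold moves that limit, which matches the "it will use that percentage instead" wording.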
* [PATCH/RFC 2/6] doc: Add documentation of missing osd commands. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll ` (4 subsequent siblings) 6 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll The set of OSD commands that was added by the previous commit is incomplete. This patch adds documentation for the following OSD commands, which were previously missing: dump, tree, crush, cluster_snap, lost, create, rm. Signed-Off-By: Andre Noll <maan@systemlinux.org> --- doc/ops/monitor.rst | 44 +++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 41 insertions(+), 3 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index 626685e..07d9c4f 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -64,9 +64,48 @@ file. This is functionally equivalent to :: $ ceph osd getmap -o /tmp/osdmap $ osdmaptool /tmp/osdmap --export-crush file - :: + $ ceph osd dump [--format format>] + +Dump the osd map. Valid formats for -f are "plain" and "json". If no +--format option is given, the osd map is dumped as plain text. :: + + $ ceph osd tree [--format format] + +Dump the osd map as a tree with one line per osd containing weight +and state. :: + + $ ceph osd crush add <id> <name> <weight> [<loc1> [<loc2> ...]] + +Add a new item with the given id/name/weight at the specified +location. :: + + $ ceph osd crush remove <id> + +Remove an existing item from the crush map. :: + + $ ceph osd crush reweight <name> <weight> + +Set the weight of the item given by ``<name>`` to ``<weight>``. :: + + $ ceph osd cluster_snap <name> + +Create a cluster snapshot. :: + + $ ceph osd lost [--yes-i-really-mean-it] + +Mark an OSD as lost.
This may result in permanent data loss. Use with caution. :: + + $ ceph osd create [<id>] + +Create a new OSD. If no ID is given, a new ID is automatically selected +if possible. :: + + $ ceph osd rm [<id>...] + +Remove the given OSD(s). :: + $ ceph osd getmaxosd Query the current max_osd parameter in the osd map. :: @@ -179,5 +218,4 @@ Enables debug messages. :: Displays the status of all metadata servers. -dump, getmap, stop, set_max_mds, setmap: TODO - +set_max_mds: TODO -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH/RFC 3/6] doc: Document pause and unpause osd commands. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll ` (3 subsequent siblings) 6 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll These two commands were undocumented so far. This patch adds a short description. Signed-Off-By: Andre Noll <maan@systemlinux.org> --- doc/ops/monitor.rst | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index 07d9c4f..e7314ae 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -145,7 +145,9 @@ List classes that are loaded in the ceph cluster. :: $ ceph osd pause $ ceph osd unpause -TODO :: +Set or clear the pause flags in the OSD map. If set, no IO requests +will be sent to any OSD. Clearing the flags via unpause results in +resending pending requests. :: $ ceph osd reweight N W -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
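The pause/unpause behaviour described in this patch (no IO requests are sent while the flags are set, and pending requests are resent once they are cleared) can be pictured with a toy Python model. This is a sketch under assumptions, not how Ceph implements it: the class PausableClient and its attributes are hypothetical, and the real mechanism is a pair of flags in the OSD map rather than a local boolean.

```python
class PausableClient:
    """Toy client that holds back requests while a pause flag is set."""

    def __init__(self):
        self.paused = False
        self.pending = []  # requests held back while paused
        self.sent = []     # requests that went out, in order

    def submit(self, request):
        # While paused, nothing is sent; the request is queued instead.
        if self.paused:
            self.pending.append(request)
        else:
            self.sent.append(request)

    def pause(self):
        self.paused = True

    def unpause(self):
        # Clearing the flag resends everything that was held back.
        self.paused = False
        self.sent.extend(self.pending)
        self.pending.clear()


client = PausableClient()
client.submit("write A")
client.pause()
client.submit("write B")   # held back
client.unpause()           # "write B" is sent now
print(client.sent)
```

The point of the model is only the ordering guarantee implied by the text: requests issued during the pause are not dropped, they are delayed until the flags are cleared.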
* [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll ` (2 preceding siblings ...) 2011-11-28 14:04 ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-11-28 14:04 ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll ` (2 subsequent siblings) 6 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll This list was lacking a few fields: crash_replay_interval, pg_num, pgp_num and crush_ruleset. Include these fields and add short descriptions. Signed-Off-By: Andre Noll <maan@systemlinux.org> --- doc/ops/monitor.rst | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index e7314ae..4de3c19 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -64,6 +64,7 @@ file. This is functionally equivalent to :: $ ceph osd getmap -o /tmp/osdmap $ osdmaptool /tmp/osdmap --export-crush file + :: $ ceph osd dump [--format format>] @@ -188,8 +189,11 @@ Creates/deletes a storage pool. :: Changes a pool setting. Valid fields are: * ``size``: Sets the number of copies of data in the pool. - * ``pg_num``: TODO - * ``pgp_num``: TODO + * ``crash_replay_interval``: The number of seconds to allow + clients to replay acknowledged but uncommited requests. + * ``pg_num``: The placement group number. + * ``pgp_num``: Effective number when calculating pg placement. + * ``crush_ruleset``: rule number for mapping placement. :: -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
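The field list in this patch lends itself to a small wrapper that refuses unknown fields before the command is issued. The Python sketch below only builds the argument list for the documented form "ceph osd pool set POOL FIELD VALUE"; the helper name pool_set_command and the pool name "data" are hypothetical, and the set of valid fields is taken from this patch as posted, so it may not match any particular Ceph release.

```python
# Fields listed as valid for "ceph osd pool set" in this patch.
POOL_SET_FIELDS = (
    "size",
    "crash_replay_interval",
    "pg_num",
    "pgp_num",
    "crush_ruleset",
)


def pool_set_command(pool, field, value):
    """Build the argv for: ceph osd pool set POOL FIELD VALUE."""
    if field not in POOL_SET_FIELDS:
        raise ValueError("unknown pool field: %r" % (field,))
    return ["ceph", "osd", "pool", "set", pool, field, str(value)]


# Example: ask for three copies of the data in a pool called "data".
print(" ".join(pool_set_command("data", "size", 3)))
```

Returning a list rather than a string keeps the result suitable for subprocess-style invocation without shell quoting concerns; actually running it is left out because it needs a live cluster.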
* [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll ` (3 preceding siblings ...) 2011-11-28 14:04 ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-11-28 18:37 ` Gregory Farnum 2011-11-28 14:04 ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll 2011-12-05 21:09 ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen 6 siblings, 1 reply; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll "osd pool set" was already documented, but the corresponding "get" command was not. This patch adds the list of valid fields for this command, together with short descriptions. Signed-Off-By: Andre Noll <maan@systemlinux.org> --- doc/ops/monitor.rst | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index 4de3c19..076c8e1 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -197,6 +197,17 @@ Changes a pool setting. Valid fields are: :: + $ ceph osd pool get POOL FIELD + +Get the value of a pool setting. Valid fields are: + + * ``pg_num``: See above. + * ``pgp_num``: See above. + * ``lpg_num``: The localized pg number. + * ``lpgp_num``: The number of localized pgs. + +:: + $ ceph osd scrub N Sends a scrub command to osdN. To send the command to all osds, use ``*``. -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get. 2011-11-28 14:04 ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll @ 2011-11-28 18:37 ` Gregory Farnum 2011-11-28 19:27 ` Andre Noll 0 siblings, 1 reply; 23+ messages in thread From: Gregory Farnum @ 2011-11-28 18:37 UTC (permalink / raw) To: Andre Noll; +Cc: ceph-devel On Mon, Nov 28, 2011 at 6:04 AM, Andre Noll <maan@systemlinux.org> wrote: > "osd pool set" was already documented, but the corresponding "get" > command was not. This patch adds the list of valid fields for this > command, together with short descriptions. > > Signed-Off-By: Andre Noll <maan@systemlinux.org> > --- > doc/ops/monitor.rst | 11 +++++++++++ > 1 files changed, 11 insertions(+), 0 deletions(-) > > diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst > index 4de3c19..076c8e1 100644 > --- a/doc/ops/monitor.rst > +++ b/doc/ops/monitor.rst > @@ -197,6 +197,17 @@ Changes a pool setting. Valid fields are: > > :: > > + $ ceph osd pool get POOL FIELD > + > +Get the value of a pool setting. Valid fields are: > + > + * ``pg_num``: See above. > + * ``pgp_num``: See above. > + * ``lpg_num``: The localized pg number. > + * ``lpgp_num``: The number of localized pgs. The lpg_num and lpgp_num are analogous to the pg_num and the pgp_num — the lpg_num is the number of local PGs, and the lpgp_num is the number used for placing them. This matters less for the local PGs than the regular PGs but it can still control where the replicas are placed. -Greg > $ ceph osd scrub N > > Sends a scrub command to osdN. To send the command to all osds, use ``*``. > -- > 1.7.8.rc1.14.g248db > > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get. 2011-11-28 18:37 ` Gregory Farnum @ 2011-11-28 19:27 ` Andre Noll 0 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 19:27 UTC (permalink / raw) To: Gregory Farnum; +Cc: ceph-devel [-- Attachment #1: Type: text/plain, Size: 698 bytes --] On Mon, Nov 28, 10:37, Gregory Farnum wrote: > > + * ``pg_num``: See above. > > + * ``pgp_num``: See above. > > + * ``lpg_num``: The localized pg number. > > + * ``lpgp_num``: The number of localized pgs. > The lpg_num and lpgp_num are analogous to the pg_num and the pgp_num — > the lpg_num is the number of local PGs, and the lpgp_num is the number > used for placing them. This matters less for the local PGs than the > regular PGs but it can still control where the replicas are placed. Thanks for the clarification. I will update the patch accordingly. Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH/RFC 6/6] doc: Clarify documentation of reweight command. 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll ` (4 preceding siblings ...) 2011-11-28 14:04 ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll @ 2011-11-28 14:04 ` Andre Noll 2011-12-05 21:09 ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen 6 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-11-28 14:04 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel, Andre Noll This has led to some discussions on the mailing list, so let's try to be clear about the meaning of an OSD weight. --- doc/ops/monitor.rst | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/doc/ops/monitor.rst b/doc/ops/monitor.rst index 076c8e1..5fcb0aa 100644 --- a/doc/ops/monitor.rst +++ b/doc/ops/monitor.rst @@ -152,7 +152,9 @@ resending pending requests. :: $ ceph osd reweight N W -Sets the weight of osdN to W. :: +Set the weight of osdN to W. Two OSDs with the same weight will receive +roughly the same number of I/O requests and store approximately the +same amount of data. :: $ ceph osd reweight-by-utilization [threshold] -- 1.7.8.rc1.14.g248db ^ permalink raw reply related [flat|nested] 23+ messages in thread
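The statement that two OSDs with the same weight receive roughly the same load can be made concrete with a proportional-share calculation. The Python sketch below is an illustration under an explicit assumption: it treats the expected share of each OSD as its weight divided by the sum of all weights. The patch itself only claims the equal-weight case, actual placement is pseudo-random and only approximates such shares, and the function name expected_shares and the example weights are hypothetical.

```python
def expected_shares(weights):
    """Map each OSD id to its expected fraction of data and requests.

    Assumption: the share is proportional to the weight. The
    documentation text only states that equal weights give roughly
    equal load.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one OSD needs a positive weight")
    return {osd: weight / total for osd, weight in weights.items()}


# Hypothetical cluster: osd0 and osd1 have equal weight, osd2 has half.
shares = expected_shares({0: 1.0, 1: 1.0, 2: 0.5})
for osd in sorted(shares):
    print("osd%d: %.0f%%" % (osd, shares[osd] * 100))
```

Under this model osd0 and osd1 come out with identical shares, which is the property the reworded description is trying to convey.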
* Re: [PATCH/RFC 0/6]: Introduction 2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll ` (5 preceding siblings ...) 2011-11-28 14:04 ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll @ 2011-12-05 21:09 ` Tommi Virtanen 2011-12-06 17:01 ` Andre Noll 6 siblings, 1 reply; 23+ messages in thread From: Tommi Virtanen @ 2011-12-05 21:09 UTC (permalink / raw) To: Andre Noll; +Cc: Gregory Farnum, Sage Weil, ceph-devel On Mon, Nov 28, 2011 at 06:04, Andre Noll <maan@systemlinux.org> wrote: > Here is what I have so far. This patch set imports the documentation > of the OSD subcommands from the wiki to the preciously emtpy file > doc/ops/monitor.rst of the git repo. The first patch is just the > result of a cut & paste operation of the corresponding wiki page > while the other patches try to improve on this. The aim is to have > a complete and up to date documentation for all osd subcommands. > > I don't believe the series is ready for inclusion yet as some > subcommands (cluster_snap, lost, in, out, ...) still lack useful > descriptions. It would be nice to add one sentence to each such command > that explains its purpose and the circumstances under which one might > want to use this particular command. > > So please review and comment. I will fold in your suggestions and > follow up with a re-rolled series provided there is substantial > feedback. Good work! I want to roll this into the docs asap, even if it is still partial. For that to happen, we need to do two things: 1. get you to add Signed-off-by lines as per SubmittingPatches 2. figure out where this documentation belongs. ops/monitor is meant for "how do I reassure myself my service works the right way". Think nagios, collectd, munin, etc. Apologies for not having much content there yet. The ops/ hierarchy as a whole is meant to be "user/goal oriented". That is, I don't want to put in a section "ceph monitor commands".
Instead, we need to ask the question "what is the admin trying to do", and that's the guiding principle for ops/. A reference-style document that exhaustively lists all possible actions should go into some other top-level section. Right now, the closest parallels we have are config/ ("Configuration reference") and api/, both meant to be comprehensive references. Let's do this: make your patch put things in doc/control.rst, with the title "Control commands", and have doc/index.rst toctree have, below config, an entry for control. I can take it from there if we want to reorganize the document more. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH/RFC 0/6]: Introduction 2011-12-05 21:09 ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen @ 2011-12-06 17:01 ` Andre Noll 0 siblings, 0 replies; 23+ messages in thread From: Andre Noll @ 2011-12-06 17:01 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Gregory Farnum, Sage Weil, ceph-devel [-- Attachment #1: Type: text/plain, Size: 746 bytes --] On Mon, Dec 05, 13:09, Tommi Virtanen wrote: > > So please review and comment. I will fold in your suggestions and > > follow up with a re-rolled series provided there is substantial > > feedback. > > Good work! I want to roll this in the docs asap, even if it is still partial. [...] > Let's do this: make your patch put things in doc/control.rst, with the > title "Control commands", and have doc/index.rst toctree have, below > config, an entry for control. I can take it from there if we want to > reorganize the document more. OK. I will update the patch series according to your comments and send an updated version soon. Thanks Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
Thread overview: 23+ messages (end of thread, ~2011-12-06 16:59 UTC)
2011-11-11 5:14 v0.38 released Sage Weil
2011-11-15 16:42 ` Andre Noll
2011-11-15 19:53 ` Gregory Farnum
2011-11-16 9:56 ` Andre Noll
2011-11-16 18:04 ` Tommi Virtanen
2011-11-17 10:35 ` Andre Noll
2011-11-17 18:01 ` Tommi Virtanen
2011-11-18 15:01 ` Andre Noll
2011-11-18 18:47 ` Tommi Virtanen
2011-11-21 17:32 ` Andre Noll
2011-11-21 17:36 ` Tommi Virtanen
2011-11-21 18:06 ` Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 0/6]: Introduction Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 1/6] doc: Import the list of ceph subcommands from wiki Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 2/6] doc: Add documentation of missing osd commands Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 3/6] doc: Document pause and unpause " Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 4/6] doc: Update the list of fields for the pool set command Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 5/6] doc: Add missing documentation for osd pool get Andre Noll
2011-11-28 18:37 ` Gregory Farnum
2011-11-28 19:27 ` Andre Noll
2011-11-28 14:04 ` [PATCH/RFC 6/6] doc: Clarify documentation of reweight command Andre Noll
2011-12-05 21:09 ` [PATCH/RFC 0/6]: Introduction Tommi Virtanen
2011-12-06 17:01 ` Andre Noll