* ceph does not work
From: Дениска-редиска @ 2012-02-23 9:15 UTC
To: ceph-devel

Hello,

I have tried to set up Ceph 0.41 in a simple configuration:
3 nodes, each running mon, mds & osd, with replication level 3 for the data & metadata pools.
Each node mounts Ceph locally via ceph-fuse.
The cluster seems to run well until one of the nodes goes down for a simple reboot.
Then all mount points become inaccessible, data transfer hangs, and the cluster stops working.

What is the purpose of the Ceph software if such a simple case does not work?
* Re: ceph does not work
From: Tommi Virtanen @ 2012-02-23 19:00 UTC
To: Дениска-редиска
Cc: ceph-devel

On Thu, Feb 23, 2012 at 01:15, Дениска-редиска <slim@inbox.lv> wrote:
> Hello,
>
> I have tried to set up Ceph 0.41 in a simple configuration:
> 3 nodes, each running mon, mds & osd, with replication level 3 for the data & metadata pools.
> Each node mounts Ceph locally via ceph-fuse.
> The cluster seems to run well until one of the nodes goes down for a simple reboot.
> Then all mount points become inaccessible, data transfer hangs, and the cluster stops working.
>
> What is the purpose of the Ceph software if such a simple case does not work?

You have a replication factor of 3, and 3 OSDs. If one of them is
down, the replication factor of 3 cannot be satisfied anymore. You
need either more nodes, or a smaller replication factor.

Ceph is not an eventually consistent system; building a POSIX
filesystem on top of one is pretty much impossible. With Ceph, all
replicas are always kept up to date.

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: ceph does not work
From: Gregory Farnum @ 2012-02-23 19:07 UTC
To: Tommi Virtanen
Cc: Дениска-редиска, ceph-devel

On Thu, Feb 23, 2012 at 11:00 AM, Tommi Virtanen <tommi.virtanen@dreamhost.com> wrote:
> On Thu, Feb 23, 2012 at 01:15, Дениска-редиска <slim@inbox.lv> wrote:
>> Hello,
>>
>> I have tried to set up Ceph 0.41 in a simple configuration:
>> 3 nodes, each running mon, mds & osd, with replication level 3 for the data & metadata pools.
>> Each node mounts Ceph locally via ceph-fuse.
>> The cluster seems to run well until one of the nodes goes down for a simple reboot.
>> Then all mount points become inaccessible, data transfer hangs, and the cluster stops working.
>>
>> What is the purpose of the Ceph software if such a simple case does not work?
>
> You have a replication factor of 3, and 3 OSDs. If one of them is
> down, the replication factor of 3 cannot be satisfied anymore. You
> need either more nodes, or a smaller replication factor.
>
> Ceph is not an eventually consistent system; building a POSIX
> filesystem on top of one is pretty much impossible. With Ceph, all
> replicas are always kept up to date.

Actually the OSDs will happily (well, not happily; they will complain.
But they will run) run in degraded mode. However, if you have 3 active
MDSes and you kill one of them without a standby available, you will
lose access to part of your tree. That's probably what happened
here...
-Greg
* Re: ceph does not work
From: Tommi Virtanen @ 2012-02-23 19:11 UTC
To: Gregory Farnum
Cc: Дениска-редиска, ceph-devel

On Thu, Feb 23, 2012 at 11:07, Gregory Farnum <gregory.farnum@dreamhost.com> wrote:
>>> 3 nodes, each running mon, mds & osd, with replication level 3 for the data & metadata pools.
...
> Actually the OSDs will happily (well, not happily; they will complain.
> But they will run) run in degraded mode. However, if you have 3 active
> MDSes and you kill one of them without a standby available, you will
> lose access to part of your tree. That's probably what happened
> here...

So let's try that angle. Slim, can you share the output of "ceph -s" with us?
* Re: ceph does not work
From: Дениска-редиска @ 2012-02-24 11:33 UTC
To: Tommi Virtanen, ceph-devel
Cc: Gregory Farnum

Running cluster of 3 nodes:

lv-test-2 ~ # ceph -s
2012-02-24 13:10:35.481248 pg v726: 594 pgs: 594 active+clean; 120 MB data, 683 MB used, 35448 MB / 37967 MB avail
2012-02-24 13:10:35.484463 mds e177: 3/3/3 up {0=shark1=up:active,1=lv-test-1=up:active,2=lv-test-2=up:active}
2012-02-24 13:10:35.484529 osd e64: 3 osds: 3 up, 3 in
2012-02-24 13:10:35.484630 log 2012-02-24 13:09:50.009333 osd.1 10.0.1.246:6801/3929 29 : [INF] 2.5d scrub ok
2012-02-24 13:10:35.484907 mon e1: 3 mons at {lv-test-1=10.0.1.246:6789/0,lv-test-2=10.0.1.247:6789/0,shark1=10.0.1.81:6789/0}

Mounting by fuse:

lv-test1 ~ # mount
ceph-fuse on /uploads type fuse.ceph-fuse (rw,nosuid,nodev,allow_other,default_permissions)

Simulating a write:

lv-test-1 ~ # cp -r /usr/src/linux-3.2.2-hardened-r1/ /uploads/

Killing one node:

lv-test-2 ~ # killall ceph-mon ceph-mds ceph-osd
Feb 24 13:11:17 lv-test-2 mon.lv-test-2[3474]: *** Caught signal (Terminated) **
Feb 24 13:11:17 lv-test-2 in thread 3195ce76760. Shutting down.
Feb 24 13:11:17 lv-test-2 mds.lv-test-2[3553]: *** Caught signal (Terminated) **
Feb 24 13:11:17 lv-test-2 in thread 2ee100bb760. Shutting down.
Feb 24 13:11:17 lv-test-2 osd.2[3654]: *** Caught signal (Terminated) **
Feb 24 13:11:17 lv-test-2 in thread 28f75487760. Shutting down.
Feb 24 13:11:35 lv-test-2 client.admin[3885]: 3a6a385b700 monclient: hunting for new mon
Feb 24 13:11:35 lv-test-2 client.admin[3885]: 3a6a385b700 client.5017 ms_handle_reset on 10.0.1.246:6789/0

Feb 24 13:11:17 lv-test-1 mon.lv-test-1[3751]: 2d62330a700 -- 10.0.1.246:6789/0 >> 10.0.1.247:6789/0 pipe(0x522b9ba080 sd=9 pgs=37 cs=1 l=0).fault with nothing to send, going to standby
Feb 24 13:11:17 lv-test-1 mds.lv-test-1[3830]: 2e3b9bd0700 -- 10.0.1.246:6800/3829 >> 10.0.1.247:6800/3552 pipe(0x5faea44c0 sd=13 pgs=13 cs=1 l=0).fault with nothing to send, going to standby
Feb 24 13:11:17 lv-test-1 client.admin[3151]: 2a9fbe55700 -- 10.0.1.246:0/3151 >> 10.0.1.247:6800/3552 pipe(0x2a9ec00f560 sd=0 pgs=9 cs=3 l=0).fault with nothing to send, going to standby
Feb 24 13:11:17 lv-test-1 osd.1[3930]: 2a51b9f1700 osd.1 64 OSD::ms_handle_reset()
Feb 24 13:11:17 lv-test-1 osd.1[3930]: 2a51b9f1700 osd.1 64 OSD::ms_handle_reset() s=0x22b4580700
Feb 24 13:11:17 lv-test-1 client.admin[3151]: 2a9fd95b700 client.4617 ms_handle_reset on 10.0.1.247:6801/3653
Feb 24 13:11:17 lv-test-1 osd.1[3930]: 2a510df7700 -- 10.0.1.246:6803/3929 >> 10.0.1.247:0/3654 pipe(0x2a50c005000 sd=24 pgs=4 cs=1 l=0).fault with nothing to send, going to standby
Feb 24 13:11:18 lv-test-1 osd.1[3930]: 2a5184e6700 -- 10.0.1.246:6802/3929 >> 10.0.1.247:6802/3653 pipe(0x2a5145e9c50 sd=19 pgs=3 cs=1 l=0).fault with nothing to send, going to standby
Feb 24 13:11:18 lv-test-1 mds.lv-test-1[3830]: 2e3bcadc700 mds.1.5 ms_handle_reset on 10.0.1.247:6801/3653
Feb 24 13:11:18 lv-test-1 osd.1[3930]: 2a5183e5700 -- 10.0.1.246:0/3930 >> 10.0.1.247:6803/3653 pipe(0x2a5145eaeb0 sd=20 pgs=16 cs=1 l=0).fault with nothing to send, going to standby
Feb 24 13:11:34 lv-test-1 osd.1[3930]: 2a5229ff700 osd.1 64 heartbeat_check: no heartbeat from osd.0 since 2012-02-24 13:11:14.122485 (cutoff 2012-02-24 13:11:14.366355)
Feb 24 13:11:34 lv-test-1 osd.1[3930]: 2a5117fa700 osd.1 64 heartbeat_check: no heartbeat from osd.0 since 2012-02-24 13:11:14.122485 (cutoff 2012-02-24 13:11:14.826382)
Feb 24 13:11:35 lv-test-1 osd.1[3930]: 2a5229ff700 osd.1 64 heartbeat_check: no heartbeat from osd.0 since 2012-02-24 13:11:14.122485 (cutoff 2012-02-24 13:11:15.369660)
Feb 24 13:11:35 lv-test-1 osd.1[3930]: 2a51b9f1700 monclient: hunting for new mon
Feb 24 13:11:35 lv-test-1 osd.1[3930]: 2a51b9f1700 osd.1 64 OSD::ms_handle_reset()
Feb 24 13:11:36 lv-test-1 osd.1[3930]: 2a5117fa700 osd.1 64 heartbeat_check: no heartbeat from osd.0 since 2012-02-24 13:11:14.122485 (cutoff 2012-02-24 13:11:16.129635)
Feb 24 13:11:36 lv-test-1 osd.1[3930]: 2a5229ff700 osd.1 64 heartbeat_check: no heartbeat from osd.0 since 2012-02-24 13:11:14.122485 (cutoff 2012-02-24 13:11:16.372900)

The copy hangs (it cannot be killed with kill -9), and /uploads is not accessible.

lv-test-1 ~ # time ceph -s
^C*** Caught signal (Interrupt) **
 in thread 2da010af760. Shutting down.

real 3m16.481s
user 0m0.037s
sys 0m0.013s

lv-test-2 ~ # time ceph -s
^C*** Caught signal (Interrupt) **
 in thread 314b193c760. Shutting down.

real 0m35.401s
user 0m0.017s
sys 0m0.007s

So the cluster has hung and is not responding anymore.

Let's bring the killed node back up:

lv-test-2 ~ # /etc/init.d/ceph restart
lv-test-2 ~ # ceph -s
2012-02-24 13:20:01.996366 pg v734: 594 pgs: 594 active+clean; 120 MB data, 683 MB used, 35448 MB / 37967 MB avail
2012-02-24 13:20:01.999207 mds e177: 3/3/3 up {0=shark1=up:active,1=lv-test-1=up:active,2=lv-test-2=up:active}
2012-02-24 13:20:01.999268 osd e64: 3 osds: 3 up, 3 in
2012-02-24 13:20:01.999368 log 2012-02-24 13:11:02.267947 osd.1 10.0.1.246:6801/3929 41 : [INF] 2.89 scrub ok
2012-02-24 13:20:01.999612 mon e1: 3 mons at {lv-test-1=10.0.1.246:6789/0,lv-test-2=10.0.1.247:6789/0,shark1=10.0.1.81:6789/0}

lv-test-2 ~ # ceph -s
2012-02-24 13:20:44.984214 pg v742: 594 pgs: 594 active+clean; 144 MB data, 714 MB used, 35417 MB / 37967 MB avail
2012-02-24 13:20:44.986505 mds e182: 3/3/3 up {0=lv-test-2=up:resolve,1=lv-test-1=up:active,2=lv-test-2=up:active(laggy or crashed)}
2012-02-24 13:20:44.986697 osd e68: 3 osds: 1 up, 3 in
2012-02-24 13:20:44.986918 log 2012-02-24 13:20:42.606730 mon.1 10.0.1.246:6789/0 27 : [INF] mds.1 10.0.1.246:6800/3829 up:active
2012-02-24 13:20:44.987118 mon e1: 3 mons at {lv-test-1=10.0.1.246:6789/0,lv-test-2=10.0.1.247:6789/0,shark1=10.0.1.81:6789/0}

Feb 24 13:23:28 lv-test-2 mds.lv-test-2[4608]: 2a93085f700 -- 10.0.1.247:6800/4607 >> 10.0.1.247:6800/3552 pipe(0x19d6f37b40 sd=12 pgs=0 cs=0 l=0).connect claims to be 10.0.1.247:6800/4607 not 10.0.1.247:6800/3552 - wrong node!

Feb 24 13:24:13 lv-test-1 client.admin[3151]: 2a9fc158700 -- 10.0.1.246:0/3151 >> 10.0.1.247:6800/3552 pipe(0x2a9ec00f560 sd=0 pgs=9 cs=4 l=0).connect claims to be 10.0.1.247:6800/4607 not 10.0.1.247:6800/3552 - wrong node!
Feb 24 13:24:14 lv-test-1 mds.lv-test-1[3830]: 2e3b981b700 -- 10.0.1.246:6800/3829 >> 10.0.1.247:6800/3552 pipe(0x5faea44c0 sd=8 pgs=13 cs=2 l=0).connect claims to be 10.0.1.247:6800/4607 not 10.0.1.247:6800/3552 - wrong node!

lv-test-1 ~ # ceph -s
2012-02-24 13:24:36.558927 pg v762: 594 pgs: 594 active+clean; 144 MB data, 741 MB used, 35390 MB / 37967 MB avail
2012-02-24 13:24:36.560927 mds e195: 3/3/3 up {0=lv-test-2=up:resolve,1=lv-test-1=up:active,2=lv-test-2=up:active(laggy or crashed)}
2012-02-24 13:24:36.560988 osd e70: 3 osds: 2 up, 3 in
2012-02-24 13:24:36.561092 log 2012-02-24 13:24:29.691540 osd.2 10.0.1.247:6801/4706 17 : [INF] 0.77 scrub ok
2012-02-24 13:24:36.561201 mon e1: 3 mons at {lv-test-1=10.0.1.246:6789/0,lv-test-2=10.0.1.247:6789/0,shark1=10.0.1.81:6789/0}

The mount point is still inaccessible. That's all; that sucks.

Is there a proven scenario for building a cluster of 3 nodes with replication that tolerates the shutdown of two nodes without locking up reads and writes?

Quoting "Tommi Virtanen" <tommi.virtanen@dreamhost.com>:
> On Thu, Feb 23, 2012 at 11:07, Gregory Farnum
> <gregory.farnum@dreamhost.com> wrote:
>>>> 3 nodes, each running mon, mds & osd with replication level 3 for data & metadata pools.
> ...
>> Actually the OSDs will happily (well, not happily; they will complain.
>> But they will run) run in degraded mode. However, if you have 3 active
>> MDSes and you kill one of them without a standby available, you will
>> lose access to part of your tree. That's probably what happened
>> here...
>
> So let's try that angle. Slim, can you share the output of "ceph -s" with us?
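On the question of surviving two node failures out of three: Ceph monitors agree on cluster state by majority, so the monitor count alone bounds how many nodes can be lost. A minimal sketch of that arithmetic follows. It is an illustration only, not Ceph code, and the function names `quorum_size` and `has_quorum` are hypothetical.

```python
# Illustrative majority-quorum arithmetic; not taken from Ceph.
# quorum_size and has_quorum are hypothetical helper names.


def quorum_size(total_monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return total_monitors // 2 + 1


def has_quorum(total_monitors: int, monitors_up: int) -> bool:
    """True if the surviving monitors can still form a majority."""
    return monitors_up >= quorum_size(total_monitors)


if __name__ == "__main__":
    # The cluster in this thread runs 3 monitors.
    for up in (3, 2, 1):
        print(f"3 monitors, {up} up -> quorum: {has_quorum(3, up)}")
```

Under this rule a 3-monitor cluster keeps quorum with one monitor down but not with two, so losing two of three nodes would be expected to block the cluster regardless of the replication level.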
* Re: ceph does not work
From: Gregory Farnum @ 2012-02-24 16:47 UTC
To: Дениска-редиска
Cc: Tommi Virtanen, ceph-devel@vger.kernel.org

On Feb 24, 2012, at 3:33 AM, "Дениска-редиска" <slim@inbox.lv> wrote:
> [...]
> lv-test-1 ~ # ceph -s
> 2012-02-24 13:24:36.558927 pg v762: 594 pgs: 594 active+clean; 144 MB data, 741 MB used, 35390 MB / 37967 MB avail
> 2012-02-24 13:24:36.560927 mds e195: 3/3/3 up {0=lv-test-2=up:resolve,1=lv-test-1=up:active,2=lv-test-2=up:active(laggy or crashed)}
> 2012-02-24 13:24:36.560988 osd e70: 3 osds: 2 up, 3 in
> 2012-02-24 13:24:36.561092 log 2012-02-24 13:24:29.691540 osd.2 10.0.1.247:6801/4706 17 : [INF] 0.77 scrub ok
> 2012-02-24 13:24:36.561201 mon e1: 3 mons at {lv-test-1=10.0.1.246:6789/0,lv-test-2=10.0.1.247:6789/0,shark1=10.0.1.81:6789/0}

Okay, so you can see here that one MDS is active, one is in the
"resolve" state, and another one is apparently crashed. If you have
logs or core dumps that you can send us we'd appreciate it, but in the
meantime: the Ceph distributed filesystem is not yet production-ready,
and a system with multiple active MDSes is significantly less stable
and well-tested. If you try using one active MDS and leave the rest in
standby you will almost certainly see better results (and you're
unlikely to be bottlenecked by it). :)

Also, one of your OSDs is down, and if that crashed it's a much bigger
concern to us right now... can you check the log and see what it says?
-Greg

> [...]
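The reading of the `mds` status line given above (one rank active, one in resolve, one laggy or crashed) can be reproduced mechanically. The sketch below parses only the line format that appears in this thread; whether other Ceph versions print the same format is an assumption, and `parse_mds_ranks` and `unhealthy_ranks` are hypothetical helper names, not part of any Ceph tool.

```python
import re
from typing import Dict, Tuple

# Matches one "rank=name=state" entry inside the braces of the mds line,
# e.g. "2=lv-test-2=up:active(laggy or crashed)".
MDS_ENTRY = re.compile(r"(\d+)=([^=,{}]+)=([^,{}]+)")


def parse_mds_ranks(line: str) -> Dict[int, Tuple[str, str]]:
    """Map each MDS rank to (daemon name, state) from a `ceph -s` mds line."""
    braces = re.search(r"\{(.*)\}", line)
    if braces is None:
        return {}
    return {
        int(rank): (name, state.strip())
        for rank, name, state in MDS_ENTRY.findall(braces.group(1))
    }


def unhealthy_ranks(line: str) -> Dict[int, Tuple[str, str]]:
    """Ranks whose state is anything other than a plain 'up:active'."""
    return {
        rank: info
        for rank, info in parse_mds_ranks(line).items()
        if info[1] != "up:active"
    }


if __name__ == "__main__":
    status = ("2012-02-24 13:24:36.560927 mds e195: 3/3/3 up "
              "{0=lv-test-2=up:resolve,1=lv-test-1=up:active,"
              "2=lv-test-2=up:active(laggy or crashed)}")
    for rank, (name, state) in sorted(unhealthy_ranks(status).items()):
        print(f"rank {rank} on {name}: {state}")
```

Run against the last status line in the thread, this flags ranks 0 and 2 and leaves rank 1 alone, which matches the diagnosis in the reply.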
* Re: ceph does not work
From: Sage Weil @ 2012-02-23 19:09 UTC
To: Tommi Virtanen
Cc: Дениска-редиска, ceph-devel

On Thu, 23 Feb 2012, Tommi Virtanen wrote:
> On Thu, Feb 23, 2012 at 01:15, Дениска-редиска <slim@inbox.lv> wrote:
> > Hello,
> >
> > I have tried to set up Ceph 0.41 in a simple configuration:
> > 3 nodes, each running mon, mds & osd, with replication level 3 for the data & metadata pools.
> > Each node mounts Ceph locally via ceph-fuse.
> > The cluster seems to run well until one of the nodes goes down for a simple reboot.
> > Then all mount points become inaccessible, data transfer hangs, and the cluster stops working.
> >
> > What is the purpose of the Ceph software if such a simple case does not work?
>
> You have a replication factor of 3, and 3 OSDs. If one of them is
> down, the replication factor of 3 cannot be satisfied anymore. You
> need either more nodes, or a smaller replication factor.
>
> Ceph is not an eventually consistent system; building a POSIX
> filesystem on top of one is pretty much impossible. With Ceph, all
> replicas are always kept up to date.

Just to clarify: what should have happened is that after a few seconds
(20 by default?) the stopped ceph-osd is marked down and life continues
with 2 replicas. 'ceph -s' or 'ceph health' will report some PGs in the
'degraded' state.

sage
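The grace period mentioned above lines up with the `heartbeat_check` entries elsewhere in the thread, where each cutoff sits about 20 seconds before the log timestamp. The comparison those entries imply can be sketched as below. This is an illustration and not the OSD's actual code: the 20-second value is an assumption taken from the "20 by default?" remark, the "now" in the example is inferred as cutoff plus 20 seconds, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a 20-second grace period, taken from the "20 by default?"
# remark and the cutoffs in the logs; it is not read from Ceph itself.
HEARTBEAT_GRACE = timedelta(seconds=20)


def heartbeat_cutoff(now: datetime,
                     grace: timedelta = HEARTBEAT_GRACE) -> datetime:
    """Oldest heartbeat timestamp that still counts as fresh at `now`."""
    return now - grace


def peer_is_stale(last_heartbeat: datetime, now: datetime,
                  grace: timedelta = HEARTBEAT_GRACE) -> bool:
    """True if the peer's last heartbeat is older than the cutoff."""
    return last_heartbeat < heartbeat_cutoff(now, grace)


if __name__ == "__main__":
    # Last heartbeat as reported in the thread's heartbeat_check lines.
    last_seen = datetime(2012, 2, 24, 13, 11, 14, 122485)
    # Inferred check time: first logged cutoff plus the assumed grace.
    check_time = datetime(2012, 2, 24, 13, 11, 34, 366355)
    print("cutoff:", heartbeat_cutoff(check_time))
    print("stale: ", peer_is_stale(last_seen, check_time))
```

Once a peer is judged stale like this, the expected next step per the message above is that it gets marked down and the PGs continue in a degraded state.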