From mboxrd@z Thu Jan 1 00:00:00 1970
From: Raghunandhan
Subject: Re: [ceph-commit] Ceph Zfs
Date: Sat, 27 Oct 2012 12:20:46 +0630
Message-ID: <90df19a3c42ca43a1bc6afff2697e026@iihtcloudsolutions.com>
References: <441865dbf9e127f0b85193c512878a76@iihtcloudsolutions.com> <2a1583303809db2f73424d268d79e0ea@iihtcloudsolutions.com> <508AE6A9.3020104@inktank.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from oproxy11-pub.bluehost.com ([173.254.64.10]:40889 "HELO oproxy11-pub.bluehost.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1751538Ab2J0Fut (ORCPT ); Sat, 27 Oct 2012 01:50:49 -0400
In-Reply-To: <508AE6A9.3020104@inktank.com>
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Dan Mick
Cc: Sage Weil , ceph-devel@vger.kernel.org

Ceph status when used with a zfs filesystem; the osd dies.

# ceph -s
   health HEALTH_WARN 407 pgs degraded; 169 pgs down; 169 pgs peering; 15 pgs recovering; 323 pgs stuck unclean; recovery 38/42 degraded (90.476%); 19/21 unfound (90.476%); 1/2 in osds are down
   monmap e1: 2 mons at {a=11.0.0.2:6789/0,b=11.0.0.3:6789/0}, election epoch 4, quorum 0,1 a,b
   osdmap e7: 2 osds: 1 up, 2 in
   pgmap v10: 576 pgs: 15 active+recovering+degraded, 169 down+peering, 392 active+degraded; 8059 bytes data, 1004 MB used, 85683 MB / 86687 MB avail; 38/42 degraded (90.476%); 19/21 unfound (90.476%)
   mdsmap e5: 1/1/1 up {0=a=up:active}, 1 up:standby

Log file generated when the 2 osds were up and one of them later went down:

2012-10-27 11:14:07.152741 mon.0 11.0.0.2:6789/0 27 : [INF] osdmap e5: 2 osds: 2 up, 2 in
2012-10-27 11:14:07.192719 mon.0 11.0.0.2:6789/0 28 : [INF] pgmap v6: 576 pgs: 576 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
2012-10-27 11:14:12.007671 mon.0 11.0.0.2:6789/0 29 : [INF] pgmap v7: 576 pgs: 272 creating, 43 active, 253 active+clean, 8 active+recovering; 1243 bytes data, 1003 MB used, 85684 MB / 86687 MB avail; 9/18 degraded (50.000%)
2012-10-27 11:14:32.014302 mon.0 11.0.0.2:6789/0 30 : [DBG] osd.0 11.0.0.2:6801/24250 reported failed by osd.1 11.0.0.3:6801/8443
2012-10-27 11:14:37.033547 mon.0 11.0.0.2:6789/0 31 : [DBG] osd.0 11.0.0.2:6801/24250 reported failed by osd.1 11.0.0.3:6801/8443
2012-10-27 11:14:42.060678 mon.0 11.0.0.2:6789/0 32 : [DBG] osd.0 11.0.0.2:6801/24250 reported failed by osd.1 11.0.0.3:6801/8443
2012-10-27 11:14:42.060827 mon.0 11.0.0.2:6789/0 33 : [INF] osd.0 11.0.0.2:6801/24250 failed (3 reports from 1 peers after 30.046376 >= grace 20.000000)
2012-10-27 11:14:42.157536 mon.0 11.0.0.2:6789/0 34 : [INF] osdmap e6: 2 osds: 1 up, 2 in
2012-10-27 11:19:46.751562 mon.0 11.0.0.2:6789/0 40 : [INF] osd.0 out (down for 304.604259)
2012-10-27 11:19:46.785574 mon.0 11.0.0.2:6789/0 41 : [INF] osdmap e8: 2 osds: 1 up, 1 in
2012-10-27 11:19:46.811588 mon.0 11.0.0.2:6789/0 42 : [INF] pgmap v12: 576 pgs: 15 active+recovering+degraded, 169 down+peering, 392 active+degraded; 8059 bytes data, 1004 MB used, 85683 MB / 86687 MB avail; 38/42 degraded (90.476%); 19/21 unfound (90.476%)
2012-10-27 11:19:49.591172 mon.0 11.0.0.2:6789/0 43 : [INF] pgmap v13: 576 pgs: 15 active+recovering+degraded, 169 down+peering, 392 active+degraded; 8059 bytes data, 1004 MB used, 85683 MB / 86687 MB avail; 38/42 degraded (90.476%); 19/21 unfound (90.476%)
2012-10-27 11:20:04.671337 mon.0 11.0.0.2:6789/0 44 : [INF] pgmap v14: 576 pgs: 15 active+recovering+degraded, 169 down+peering, 392 active+degraded; 8059 bytes data, 1004 MB used, 85683 MB / 86687 MB avail; 38/42 degraded (90.476%); 19/21 unfound (90.476%)

---
Regards,
Raghunandhan.G

On 27-10-2012 02:08, Dan Mick wrote:
> On 10/25/2012 09:46 PM, Raghunandhan wrote:
>> Hi Sage,
>>
>> Thanks for replying back, Once a zpool is created if i mount it on
>> /var/lib/ceph/osd/ceph-0 the cephfs doesnt recognize it as a superblock
>> and hence it fails,
>
> I assume you mean "once a zfs is created"? One can't mount zpools,
> can one?
>
>> Im trying to build this on our cloud storage since
>> btrfs has not been stable nor they have come up with online dedup i have
>> no other choice for now to work with zfs ceph which makes sense.
>>
>> So what i exactly did was created a zpool store
>> 1 Then used the same store and made a block device from it using zfs create
>> 2 Once the zfs create was successful i was able to format with ext4 using xattr
>> 3 On top of it was the ceph
>>
>> Following this process doesnt make sense because of multiple layer on
>> the storage and the ceph consumes a lot of RAM and cpu cycles which ends
>> up in kernel hung task. It would be great if there is a way i could
>> directly use the zfs pool with ceph and make it work.
>
> Have you actually tried making a zfs filesystem in the zpool, and
> using that as backing store for the osd?
>
>>
>> ---
>> Regards,
>> Raghunandhan.G
>> IIHT Cloud Solutions Pvt. Ltd.
>> #15, 4th Floor, 'A' Wing, Sri Lakshmi Complex,
>> St. Marks Road, Bangalore - 560 001, India
>>
>> On 25-10-2012 22:06, Sage Weil wrote:
>>> [moved to ceph-devel]
>>>
>>> On Thu, 25 Oct 2012, Raghunandhan wrote:
>>>> Hi All,
>>>>
>>>> I have been working around ceph quite a long and trying to stitch zfs with
>>>> ceph. I was able to do it to certain extent as follows:
>>>> 1. zpool creation
>>>> 2. set dedup
>>>> 3. create a mountable volume of zfs (zfs create)
>>>> 4. format the volume with ext4 and enabling xattr
>>>> 5. mkcephfs on the volume
>>>>
>>>> This actually works and dedup is perfect.
>>>> But i need to avoid multiple layers
>>>> on the storage since the performance is very slow and the kernel timeout
>>>> occurs often for a 8GB RAM. I want to test the performance between btrfs and
>>>> zfs. I want to avoid the above multiple layering on storage and make the ceph
>>>> cluster aware of zfs. Let me know if anyone has workaround this.
>>>
>>> I'm not familiar enough with zfs to know what 'mountable volume' means..
>>> is that a block device/lun that you're putting ext4 on? Probably the best
>>> results will come from creating a zfs *file system* (using the ZPL or
>>> whatever it is) and running ceph-osd on top of that.
>>>
>>> There is at least one open bug from someone having problems there, but
>>> we'd very much like to sort out the problem.
>>>
>>> sage
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
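
A side note on reading the monitor log at the top of this message: the line that matters is the "[INF] osd.0 ... failed (3 reports from 1 peers after 30.046376 >= grace 20.000000)" entry, which is where the monitor marks the OSD down after the peer's reports outlast the grace period. Below is a minimal sketch for pulling such entries out of a longer log. The regular expression is inferred only from the lines pasted in this thread and may not match other Ceph versions; FAILED_RE, parse_failures and SAMPLE are hypothetical names made up for this example and are not part of any Ceph tool.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines in this thread (an assumption, not a
# documented Ceph log format):
#   <date> <time> mon.N <addr> <seq> : [INF] osd.N <addr> failed
#   (R reports from P peers after S >= grace G)
FAILED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ \S+ \d+ : \[INF\] "
    r"(?P<osd>osd\.\d+) \S+ failed "
    r"\((?P<reports>\d+) reports from (?P<peers>\d+) peers "
    r"after (?P<elapsed>[\d.]+) >= grace (?P<grace>[\d.]+)\)"
)


def parse_failures(lines):
    """Return one dict per 'osd.N ... failed (...)' line; skip everything else."""
    events = []
    for line in lines:
        m = FAILED_RE.match(line.strip())
        if m is None:
            continue
        events.append({
            "time": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"),
            "osd": m["osd"],
            "reports": int(m["reports"]),
            "peers": int(m["peers"]),
            "elapsed": float(m["elapsed"]),
            "grace": float(m["grace"]),
        })
    return events


# Two lines copied from the log above: a [DBG] report (ignored by the parser)
# and the [INF] line where the monitor marks osd.0 as failed.
SAMPLE = [
    "2012-10-27 11:14:32.014302 mon.0 11.0.0.2:6789/0 30 : [DBG] osd.0 "
    "11.0.0.2:6801/24250 reported failed by osd.1 11.0.0.3:6801/8443",
    "2012-10-27 11:14:42.060827 mon.0 11.0.0.2:6789/0 33 : [INF] osd.0 "
    "11.0.0.2:6801/24250 failed (3 reports from 1 peers after 30.046376 "
    ">= grace 20.000000)",
]

if __name__ == "__main__":
    for ev in parse_failures(SAMPLE):
        over = ev["elapsed"] - ev["grace"]
        print(f'{ev["osd"]} marked failed at {ev["time"]}: '
              f'{ev["reports"]} reports from {ev["peers"]} peer(s), '
              f'{over:.3f}s past grace')
```

To use it on a real log, pass an open file object instead of SAMPLE, for example parse_failures(open(path)) with path pointing at wherever the monitor log lives on your system. The individual "[DBG] ... reported failed by" lines are deliberately skipped, since only the "[INF] ... failed" line records the monitor's decision.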