From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wido den Hollander
Subject: Re: First work on RBD storage pool support in libvirt
Date: Fri, 02 Mar 2012 16:30:46 +0100
Message-ID: <4F50E7A6.10100@widodh.nl>
References: <4F04A8C1.1090205@widodh.nl> <4F04EFA1.5050704@dreamhost.com> <4F05B8F0.10003@widodh.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from smtp01.mail.pcextreme.nl ([109.72.87.137]:51550 "EHLO smtp01.mail.pcextreme.nl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030227Ab2CBPav (ORCPT ); Fri, 2 Mar 2012 10:30:51 -0500
In-Reply-To: <4F05B8F0.10003@widodh.nl>
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Josh Durgin
Cc: ceph-devel@vger.kernel.org

Hi,

On 01/05/2012 03:51 PM, Wido den Hollander wrote:
> On 01/05/2012 01:32 AM, Josh Durgin wrote:
>> On 01/04/2012 11:30 AM, Wido den Hollander wrote:
>>> Hi,
>>>
>>> The last few days I've been working on a storage backend driver for
>>> libvirt which supports RBD.
>>>
>>> This has been in the tracker for a while:
>>> http://tracker.newdream.net/issues/1422
>>>
>>> My current work can be found at: http://www.widodh.nl/git/libvirt.git in
>>> the 'rbd' branch.
>>
>> Awesome! Glad to see this being worked on.
>>
>>> I realize it is far from done, a lot of work has to be done, but I'd
>>> like to discuss some things first before making some decisions I might
>>> later regret.
>>>
>>> My idea was to discuss it here first and after a few iterations get it
>>> reviewed by the libvirt guys.
>>>
>>> Let me start with the XML:
>>>
>>>
>>> cephclusterdev
>>>
>>> myrbdpool
>>> >> prefer_ipv6='true'/>
>>> >> secret='a313871d-864a-423c-9765-5374707565e1'/>
>>>
>>>
>>>
>>
>> I think it will be easier to manage if the format for network volumes
>> and network disks are as similar as possible.
>> In particular, allowing
>> multiple hosts, and making the auth element match the network disk
>> format (even using the same xml schema). With this in mind, the format
>> would be more like:
>>
>>
>> cephclusterdev
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Or the secret could be identified by name:
>>
>>
>> cephclusterdev
>>
>>
>>
>>
>>
>>
>>
>>
>>

I've been doing some work on this, but I'm limited to what libvirt
offers right now and I don't want to break anything.

The current XML:

ceph rbd

This works just fine. libvirt relies on the type attribute of the auth
node to determine the auth type:

    authType = virXPathString("string(./auth/@type)", ctxt);
    if (authType == NULL) {
        source->authType = VIR_STORAGE_POOL_AUTH_NONE;
    } else {
        if (STREQ(authType, "chap")) {
            source->authType = VIR_STORAGE_POOL_AUTH_CHAP;
        } else if (STREQ(authType, "ceph")) {
            source->authType = VIR_STORAGE_POOL_AUTH_CEPHX;
        } else {
            virStorageReportError(VIR_ERR_XML_ERROR,
                                  _("unknown auth type '%s'"),
                                  (const char *)authType);
            goto cleanup;
        }
    }

I've tested this code over and over and it keeps working.

There is still some work to do:

root@stack01:~# virsh vol-dumpxml rbd/alpha
bigmofo-data rbd/bigmofo-data 4398046511104 4398046511104 rbd:rbd/alpha 00 0 0
root@stack01:~#

The 'source' node here should be filled with the right 'host' nodes, but
the code for that doesn't exist yet in libvirt. It will require some
extra work in libvirt.

Then there is still the question of how to pass options down to
librados. For debugging, for example, a user might want to set 'log file'
and 'debug rados' to trace all the RADOS requests that are being made.

It would be helpful if somebody could start reviewing the code. In a
couple of weeks we can propose it to the libvirt guys, but the code
should be reviewed before that.

Code can be found at: http://www.widodh.nl/git/libvirt.git
The branch rbd is where you should be looking.
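
For anyone who wants to poke at the auth type dispatch outside of
libvirt, here is a minimal standalone sketch of the same mapping. The
enum and the function name are made up for illustration and are not
libvirt API; only the "chap" and "ceph" strings and the NULL case come
from the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for libvirt's VIR_STORAGE_POOL_AUTH_* values. */
enum pool_auth_type {
    POOL_AUTH_ERROR = -1,  /* unrecognised type string */
    POOL_AUTH_NONE = 0,    /* no type attribute on the auth node */
    POOL_AUTH_CHAP,
    POOL_AUTH_CEPHX
};

/* Map the value of the auth node's "type" attribute to an enum value.
 * A NULL pointer means the attribute was absent. */
enum pool_auth_type pool_auth_type_from_string(const char *type)
{
    if (type == NULL)
        return POOL_AUTH_NONE;
    if (strcmp(type, "chap") == 0)
        return POOL_AUTH_CHAP;
    if (strcmp(type, "ceph") == 0)
        return POOL_AUTH_CEPHX;
    return POOL_AUTH_ERROR;  /* caller reports the XML error */
}
```

In this sketch the caller would treat POOL_AUTH_ERROR the way the
libvirt code treats the final else branch: report the unknown type and
bail out.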

Thanks,

Wido

>
> I'm currently using the already existing structure, for example a iSCSI
> pool:
>
>
> virtimages
> e9392370-2917-565e-692b-d057f46512d6
>
>
>
>
>
>
> /dev/disk/by-path
>
> 0700
> 0
> 0
>
>
>
>
> This was the easiest way to get things up and running, but I do agree
> that matching the disk declaration would be preferable.
>
>>
>>> A few things here:
>>>
>>> * I'm leaning on the secretDriver from libvirt for storing the actual
>>> cephx key. Should I also store the id in there or keep that in the pool
>>> declaration?
>>
>> I'd say keep it in the pool declaration for consistency.
>>
>>>
>>> * prefer_ipv6? I'm a IPv6 guy, I try to get as much over IPv6 as I can.
>>> Since Ceph doesn't support dual-stack you have to explicitly enable
>>> IPv6. I did not want to let librados read a ceph.conf from outside
>>> libvirt I added this variable. Not the fanciest way I think, but it
>>> could serve other future storage drivers in libvirt
>>
>> This actually isn't necessary for RBD - the ms_bind_ipv6 option only
>> affects servers (who call bind(2)).
>
> Ah, ok. I'll remove that!
>
>>
>>> * How should we pass other configuration options? I want to stay away
>>> from the ceph.conf as far as possible. Imho a user should be able to
>>> define a XML and get it all up and running. You will also run into
>>> apparmor/SELinux on systems, so libvirt won't have permission to read
>>> files everywhere you want it to. I also thinks the libvirt guys want to
>>> keep everything as generic as possible.
>>
>> I agree, libvirt should be able to configure everything with no external
>> files.
>>
>>> In the future we might see more
>>> storage backends which have almost the same properties as RBD. How do we
>>> pass extra config options? the volume
>>
>> The libvirt way seems to be adding more well-defined elements or
>> attributes to the xml schema when the new backend is added. Personally
>> I'd be happy with a generic