Date: Mon, 23 Jul 2012 14:20:31 +0530
From: Bharata B Rao
Message-ID: <20120723085031.GN1046@in.ibm.com>
References: <20120721082917.GC1046@in.ibm.com>
Subject: Re: [Qemu-devel] [RFC PATCH 0/2] GlusterFS support in QEMU - v2
Reply-To: bharata@linux.vnet.ibm.com
To: Stefan Hajnoczi
Cc: Amar Tumballi, Anand Avati, qemu-devel@nongnu.org, Vijay Bellur

On Sun, Jul 22, 2012 at 03:42:28PM +0100, Stefan Hajnoczi wrote:
> On Sat, Jul 21, 2012 at 9:29 AM, Bharata B Rao wrote:
> > -drive file=gluster:server@port:volname:image
> >
> > - Here 'gluster' is the protocol.
> > - 'server@port' specifies the server where the volume file specification for
> >   the given volume resides.
> >   'port' is the port number on which gluster
> >   management daemon (glusterd) is listening. This is optional and if not
> >   specified, QEMU will send 0 which will make libgfapi to use the default
> >   port.
>
> 'server@port' is weird notation. Normally it is 'server:port' (e.g.
> URLs). Can you change it?

I don't like it, but settled for it since port was optional and ':' was
already being used as the separator here.

> What about the other transports supported by libgfapi: UNIX domain
> sockets and RDMA? My reading of glfs.h is that there are 3 connection
> options:
> 1. 'transport': 'socket' (default), 'unix', 'rdma'
> 2. 'host': server hostname for 'socket', path to UNIX domain socket
> for 'unix', or something else for 'rdma'
> 3. 'port': TCP port when 'socket' is used. Ignored otherwise.
>
> Unfortunately QEMU block drivers cannot take custom options yet. That
> would make it possible to cleanly map these connection options and
> save you from inventing syntax which doesn't expose all options.
>
> In the meantime it would be nice if the syntax exposed all options.

So without the capability to pass custom options to block drivers, am I
forced to keep extending file= with more and more options?

file=gluster:transport:server:port:volname:image ?

It looks ugly, and it is not easy to make any particular option optional.
If needed, I can support this from the GlusterFS backend.

>
> > Note that we are no longer using volfiles directly and use volume names
> > instead. For this to work, gluster management daemon (glusterd) needs to
> > be running on the QEMU node. This limits the QEMU user to access the volumes by
> > the default volfiles that are generated by gluster CLI. This should be
> > fine as long as gluster CLI provides the capability to generate or regenerate
> > volume files for a given volume with the xlator set that QEMU user is
> > interested in. GlusterFS developers tell me that this can be provided with
> > some enhancements to Gluster CLI/glusterd.
> > Note that the custom volume files
> > is typically needed when GlusterFS server is co-located with QEMU in
> > which case it would be beneficial to get rid of client-server overhead and
> > RPC communication overhead.
>
> My knowledge of GlusterFS is limited. Here is what I am thinking:
>
> 1. The user cannot specify a local configuration file, you require
> that there is a glusterd running which provides configuration
> information.

Yes. The user only specifies a volume name, and glusterd is used to fetch the
right volume file for that volume name.

> 2. It is currently not possible to bypass RPC because the glusterd
> managed configuration file doesn't support that.

It is possible. Gluster already supports custom extensions to volume names,
and the required volfile can be selected by specifying such a custom volname
extension.

For example, if I have a volume named test, the volfile used for it by default
is test-fuse.vol. Currently I can put my own custom volfile into the standard
location and get glusterd to pick it up: I can specify test.rpcbypass as the
volname and glusterd will pick test.rpcbypass.vol. What is currently not
supported is the ability to create test.rpcbypass.vol from the gluster CLI. I
believe that the gluster developers are OK with enhancing the gluster CLI to
support generating/regenerating volfiles for a given volume with a custom
translator set.

> I'm not sure if these statements are true?
>
> Would you support local volfiles in the future again? Why force users
> to run glusterd?

I will let the gluster folks on CC answer this and tell us the benefits of
always depending on glusterd.

I guess running glusterd would be beneficial when supporting migration. QEMU
working from a local volume (with volname=test.rpcbypass) can be easily
restarted on a different node by just changing volname to test. glusterd will
take care of fetching the right volfile automatically for us.
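
To make the naming rule and the restart case above concrete, here is a rough
Python sketch. It is only an illustration built from the two examples in this
mail: GlusterSpec, parse_gluster_spec() and volfile_for() are made-up names,
not QEMU or glusterd code, and the "a dot means custom extension" rule is my
assumption about how the lookup could be modelled, not how glusterd does it.

```python
from typing import NamedTuple


class GlusterSpec(NamedTuple):
    """Fields of a 'gluster:server[@port]:volname:image' drive spec."""
    server: str
    port: int      # 0 stands for "not specified, use the default port"
    volname: str
    image: str


def parse_gluster_spec(spec: str) -> GlusterSpec:
    """Split a file= value into its parts (hypothetical helper)."""
    parts = spec.split(":", 3)
    if len(parts) != 4 or parts[0] != "gluster":
        raise ValueError(f"not a gluster spec: {spec!r}")
    _, server_port, volname, image = parts
    # The port is optional and separated from the server by '@'.
    server, sep, port = server_port.partition("@")
    return GlusterSpec(server, int(port) if sep else 0, volname, image)


def volfile_for(volname: str) -> str:
    """Volfile name for a volname, following the examples in this mail.

    Assumption: a volname with a custom extension (contains a dot) maps to
    '<volname>.vol'; a plain volname maps to the default '<volname>-fuse.vol'.
    """
    if "." in volname:
        return f"{volname}.vol"
    return f"{volname}-fuse.vol"


# QEMU co-located with the server, using the custom RPC-bypass volfile.
local = parse_gluster_spec("gluster:localhost:test.rpcbypass:/F17")
print(volfile_for(local.volname))    # test.rpcbypass.vol

# Restarted on a different node: only the volname changes.
remote = local._replace(volname="test")
print(volfile_for(remote.volname))   # test-fuse.vol
```

The only point of the sketch is that the volname is the single knob that
changes between the co-located and the remote case; everything else in the
spec stays the same.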
>
> > - As mentioned above, the VM image on gluster volume can be specified like
> > this:
> > -drive file=gluster:localhost:testvol:/F17,format=gluster
> >
> > Note that format=gluster is not needed ideally and its a work around I have
> > until libgfapi provides a working connection cleanup routine (glfs_fini()).
> > When the format isn't specified, QEMU figures out the format by doing
> > find_image_format that results in one open and close before opening the
> > image file long term for standard read and write. Gluster connection
> > initialization is done from open and connection termination is done from
> > close. But since glfs_fini() isn't working yet, I am bypassing
> > find_image_format by specifying format=gluster directly which results in
> > just one open and hence I am not limited by glfs_fini().
>
> Has libgfapi been released yet?

It's part of gluster mainline now.

> Does it have versioning which will
> allow the QEMU GlusterFS block driver to build against different
> versions? I'm just wondering how the pieces will fit together once
> distros start shipping them.

I request the gluster folks on CC to comment on versioning and shipping
information.

Regards,
Bharata.