Date: Mon, 11 Jun 2012 19:48:07 +0530
From: Bharata B Rao
Message-ID: <20120611141806.GA2737@in.ibm.com>
Subject: [Qemu-devel] [RFC PATCH 0/3] GlusterFS support in QEMU
Reply-To: bharata@linux.vnet.ibm.com
To: qemu-devel@nongnu.org
Cc: Amar Tumballi, Vijay Bellur

Hi,

This set of patches enables QEMU to boot VM images from gluster
volumes. This is achieved by adding gluster as a new block backend
driver in QEMU. It is already possible to boot from VM images on
gluster volumes using a FUSE mount, but this patchset provides the
ability to boot VM images from gluster volumes by bypassing the FUSE
layer in gluster.
In case the image is present on the local system, it is possible to
even bypass the client and server translators and hence the RPC
overhead. QEMU with gluster backend support takes the volume file on
the command line and links against the libglusterfs library to perform
IO on the image residing on a gluster volume.

block/gluster-helpers.c has the bare minimum gluster code that is
necessary for QEMU to boot and work with an image on a gluster volume.
I have implemented routines like gluster_create, gluster_open,
gluster_aio_readv etc. which will eventually become unnecessary once we
have equivalent routines working in libglusterfsclient. While I have
this implementation here, we are also actively working on resurrecting
libglusterfsclient and using QEMU with it. In addition to the posix
routines, block/gluster-helpers.c has some elaborate lookup code which
will also become redundant with libglusterfsclient.

The patches are experimental in nature and I have only verified that I
can boot an image from a gluster volume using these patches in the
fuse-bypass and rpc-bypass modes. I haven't tested with a full-blown
volume file (as generated by the gluster CLI), but have always used
hand-crafted volume files with just the posix translator in them.

How to use this patchset
========================

1. Compiling GlusterFS

- Get GlusterFS source from git://git.gluster.com/glusterfs.git
- Compile and install
  # ./autogen.sh; ./configure; make; make install
- Copy a few required header files and libraries
  # mkdir /usr/local/include/glusterfs/
  # cp glusterfs/libglusterfs/src/*.h /usr/local/include/glusterfs/
  # cp glusterfs/config.h /usr/local/include/glusterfs/
  # cp glusterfs/contrib/uuid/uuid.h /usr/local/include/glusterfs/

2. Compiling QEMU

- Get QEMU sources
- Apply the patches from this patchset.
- Configure
  # ./configure --disable-werror --target-list=x86_64-softmmu --enable-glusterfs --enable-uuid
- make; make install

Note: I had to resort to --disable-werror mainly to tide over the
warnings in block/gluster-helpers.c. I didn't spend too much effort
cleaning this up since this code will go away once we have a working
libglusterfsclient.

3. Starting the GlusterFS server

# glusterfsd -f s-qemu.vol

# cat s-qemu.vol
volume vm
  type storage/posix
  option directory /vm
end-volume

volume server
  type protocol/server
  subvolumes vm
  option transport-type tcp
  option auth.addr.vm.allow *
end-volume

Here /vm is the directory exported by the server. Ensure that this
directory exists before the GlusterFS server is started.

4. Creating the VM image

# qemu-img create -f gluster gluster:c-qemu.vol:/F16 5G

# cat c-qemu.vol
volume client
  type protocol/client
  option transport-type tcp
  option remote-host bharata
  option remote-subvolume vm
end-volume

5. Install a distro (say Fedora 16) on the VM image

# qemu-system-x86_64 --enable-kvm -smp 4 -m 1024 -drive file=gluster:c-qemu.vol:/F16,format=gluster -cdrom Fedora-16-x86_64-DVD.iso

After this, follow the normal F16 installation. From next time onwards,
the following QEMU command can be used to start the VM directly.

6. Start the VM (Fuse-bypass)

# qemu-system-x86_64 --enable-kvm --nographic -smp 4 -m 1024 -drive file=gluster:c-qemu.vol:/F16,format=gluster

6a. Booting the VM in RPC-bypass mode

# cat c-qemu-rpcbypass.vol
volume vm
  type storage/posix
  option directory /vm
end-volume

# qemu-system-x86_64 --enable-kvm --nographic -smp 4 -m 1024 -drive file=gluster:c-qemu-rpcbypass.vol:/F16,format=gluster

Note that in this case, it is not necessary to run a gluster server.

Tests
=====

I have done some initial tests using fio.
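As an aside on the drive specifications used in the steps above: they
take the form gluster:<volume file>:<image path>. The following Python
sketch shows how such a string can be split; it is illustrative only
(parse_gluster_spec is a hypothetical helper, not the parsing code from
the patchset):

```python
def parse_gluster_spec(spec):
    """Split a gluster:<volfile>:<image path> drive spec.

    Illustrative only -- not the actual parsing code in the patchset.
    The volume file name is everything up to the first ':' after the
    'gluster:' prefix; the rest is the image path on the volume.
    """
    prefix = "gluster:"
    if not spec.startswith(prefix):
        raise ValueError("not a gluster drive spec: %s" % spec)
    volfile, sep, image = spec[len(prefix):].partition(":")
    if not sep or not volfile or not image:
        raise ValueError("malformed gluster drive spec: %s" % spec)
    return volfile, image

print(parse_gluster_spec("gluster:c-qemu.vol:/F16"))
# -> ('c-qemu.vol', '/F16')
```

The same split works when the volume file is given with an absolute
path, e.g. gluster:/c-qemu-rpcbypass.vol:/dir1/F16.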
Here are the details:

Environment
-----------
Dual core x86_64 laptop
QEMU (f8687bab919672ccd)
GlusterFS (c40b73fc453caf12)
Guest: Fedora 16 (kernel 3.1.0-7.fc16.x86_64)
Host: Fedora 16 (kernel 3.4)
fio-HEAD-47ea504

fio jobfile
-----------
# cat aio-read-direct-seq
; Read 4 files with aio at different depths
[global]
ioengine=libaio
direct=1
rw=read
bs=128k
size=512m
directory=/data1
[file1]
iodepth=4
[file2]
iodepth=32
[file3]
iodepth=8
[file4]
iodepth=16

Base
----
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=/vm/dir1/F16,cache=none

Fuse mount
----------
Server: glusterfsd -f s-qemu.vol
Client: glusterfs -f c-qemu.vol /mnt
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=/mnt/dir1/F16,cache=none

Fuse bypass
-----------
Server: glusterfsd -f s-qemu.vol
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=gluster:/c-qemu.vol:/dir1/F16,format=gluster,cache=none

RPC bypass
----------
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=gluster:/c-qemu-rpcbypass.vol:/dir1/F16,format=gluster,cache=none

Numbers (aggrb, minb and maxb in kB/s; mint and maxt in msec)
-------
              aggrb   minb   maxb    mint    maxt
Base          72916  18229  18945   27673   28761
Fuse mount     8211   2052   3094  169433  255396
Fuse bypass   66591  16647  17806   29444   31493
RPC bypass    70940  17735  18782   27914   29562

Note that these are just indicative numbers and I haven't really tuned
QEMU, GlusterFS or fio to achieve the best results. However, this test
shows that the Fuse mount case is far from ideal and that Fuse bypass
and RPC bypass help.

Regards,
Bharata.
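P.S. A quick sanity check on the table above: each fio file has
size=512m, so the slowest file's bandwidth (minb, kB/s) multiplied by
the longest runtime (maxt, ms) should recover roughly 512 MiB. A small
Python check using the figures from the table (illustrative only):

```python
# Each fio job file is size=512m, i.e. 512 * 1024 kB.
SIZE_KB = 512 * 1024  # 524288 kB per file

# name: (minb in kB/s, maxt in ms) -- slowest file per configuration,
# taken from the results table above.
results = {
    "Base":        (18229, 28761),
    "Fuse mount":  (2052, 255396),
    "Fuse bypass": (16647, 31493),
    "RPC bypass":  (17735, 29562),
}

for name, (minb, maxt_ms) in results.items():
    recovered = minb * maxt_ms / 1000.0  # kB transferred by slowest file
    error = abs(recovered - SIZE_KB) / SIZE_KB
    print("%-12s %9.0f kB (%.3f%% off)" % (name, recovered, error * 100))
    assert error < 0.01  # within 1% of 512 MiB
```

All four rows recover the 512 MiB file size to within a fraction of a
percent, which suggests the table was transcribed consistently.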