From: Anthony Liguori
Subject: Re: [PATCH] Blktap: Userspace file-based image support. (RFC)
Date: Mon, 19 Jun 2006 13:55:10 -0500
Message-ID: <4496F30E.6020006@us.ibm.com>
To: Andrew Warfield
Cc: Xen Developers, Julian Chesterfield

Hi Andy,

> Performance is quite good, and we intend to focus on this a bit more
> over the next few weeks, releasing updated patches as they are
> available. Bonnie results this morning are as follows (64-bit results
> compare against linux blkback+loopback file, Julian can follow up with
> loopback results for 32-bit later if anyone's interested):
>
>               -------Sequential Output-------- ---Sequential Input-- --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
> 64-bit:
> xen0     4096 40115 93.4 41067 12.7 22757  1.2 32532 56.7 53724  0.4 121.4  0.0
> img-sp   4096 20291 86.0 38091 18.1 19939  8.2 30854 69.0 47779  4.2  95.3  0.4
> loop-sp  4096 33421 77.6 33663 13.1 18546  5.1 28606 59.2 46659  6.0  85.2  0.1
>
> 32-Bit:
> xen0     1024 33857 94.0 45804  9.0 23269  0.0 25825 52.0 55628    0 185.0  0.0
> img-sp   1448 32743 92.0 40703  8.0 23281  0.0 31139 75.0 56585    0 208.1  0.0

What is img-sp? Is this blktap + a physical device or is this blktap
with something like qcow? The numbers are a tad worse than I'd expect
them to be if it was a physical device. Theoretically, linux-aio is
inserting requests directly into the backend.
I expect there to be a certain amount of CPU overhead from context
switching but since it's still zero-copy, I wouldn't expect less CPU
usage and less throughput. Any idea why this is or am I just totally
misunderstanding how things should behave :-)

> Working in conjunction with the kernel blktap driver, all disk I/O
> requests from VMs are passed to the userspace daemon (using a shared
> memory interface) through a character device. Each active disk is
> mapped to an individual device node, allowing per-disk processes to
> implement individual block devices where desired. The userspace
> drivers are implemented using asynchronous (Linux libaio),
> O_DIRECT-based calls to preserve the unbuffered, batched and
> asynchronous request dispatch achieved with the existing blockback
> code. We provide a simple, asynchronous virtual disk interface that
> makes it quite easy to add new disk implementations.

I very much like the idea of a userspace block device backend. Have you
considered what it would take to completely replace blkback with a
userspace backend?

I'm also curious why you chose a character device to interact with the
ring queue instead of just attaching to the ring queue directly in
userspace.

I think the whole discussion of COW support is orthogonal to a userspace
backend FWIW so I'll save that part of the discussion for another
thread :-)

Regards,

Anthony Liguori

> As of June 2006 the current supported disk formats are:
>
> - Raw Images (both on partitions and in image files)
> - File-backed Qcow disks (sparse qcow overlay on a raw image/partition).
> - Standalone sparse Qcow disks (sparse disks, not backed by a parent
>   image).
> - Fast shareable RAM disk between VMs (requires some form of
>   cluster-based filesystem support e.g. OCFS2 in the guest kernel)
> - Some VMDK images - your mileage may vary
>
> Raw and QCow images have asynchronous backends and so should perform
> fairly well.
> VMDK is based directly on the qemu vmdk driver, which is
> synchronous (a.k.a. slow).
>
> The qcow backends support existing qcow disks. There are also a set
> of tools to generate and convert qcow images. With these tools (and
> driver support), we maintain the qcow file format but adjust
> parameters for higher performance with Xen -- using a larger segment
> size (4096 instead of 512) and more coarsely allocating metadata
> regions. We are continuing to improve this work and expect qcow
> performance to improve a great deal over the next few weeks.
>
> Build and Installation Instructions
> ===================================
>
> You will need libaio >= 0.3.104 on your target system to build the
> tools (if you are installing RPMs, this means libaio and
> libaio-devel).
>
> Make sure to configure the blktap backend driver in your dom0 kernel.
> It will cooperate fine with the existing backend driver, so you can
> experiment with tap disks without breaking existing VM configs.
>
> To build the tools separately, "make && make install" in
> tools/blktap_user.
>
>
> Using the Tools
> ===============
>
> Prepare the image for booting. For qcow files use the qcow utilities
> installed earlier. e.g. qcow-create generates a blank standalone image
> or a file-backed CoW image. img2qcow takes an existing image or
> partition and creates a sparse, standalone qcow-based file.
>
> Start the userspace disk agent either on system boot (e.g. via an init
> script) or manually => 'blktapctrl'
>
> Customise the VM config file to use the 'tap' handler, followed by the
> driver type. e.g. for a raw image such as a file or partition:
>
> disk = ['tap:aio:,sda1,w']
>
> e.g. for a qcow image:
>
> disk = ['tap:qcow:,sda1,w']
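
For readers following the 'tap' disk lines quoted above, here is a
minimal sketch of how such a string breaks down into its fields. It
assumes the layout is 'tap:<driver>:<path>,<device>,<mode>', which is
inferred from the two examples rather than taken from the blktap
sources. The names TapDisk and parse_tap_disk and the image path are
hypothetical and used only for illustration, and reading 'w' as
read-write is likewise an assumption.

```python
from typing import NamedTuple


class TapDisk(NamedTuple):
    driver: str   # driver type after the 'tap' handler, e.g. "aio" or "qcow"
    path: str     # image file or partition backing the disk
    device: str   # device name given in the config, e.g. "sda1"
    mode: str     # access mode, e.g. "w" (assumed to mean read-write)


def parse_tap_disk(spec: str) -> TapDisk:
    """Split an assumed 'tap:<driver>:<path>,<device>,<mode>' string."""
    target, device, mode = spec.split(",")
    # Limit the split so that any ':' inside the path is left untouched.
    handler, driver, path = target.split(":", 2)
    if handler != "tap":
        raise ValueError(f"not a tap disk: {spec!r}")
    return TapDisk(driver, path, device, mode)


# Hypothetical image path, chosen only for illustration.
disk = parse_tap_disk("tap:qcow:/path/to/guest.qcow,sda1,w")
print(disk.driver, disk.path, disk.device, disk.mode)
```

The path field is empty in the examples as quoted, so the sketch
accepts an empty path rather than guessing what belongs there.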