From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=59649 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1OszpZ-0005er-DK
    for qemu-devel@nongnu.org; Tue, 07 Sep 2010 11:11:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
    (envelope-from ) id 1OszpI-0000q6-NR
    for qemu-devel@nongnu.org; Tue, 07 Sep 2010 11:11:17 -0400
Received: from e7.ny.us.ibm.com ([32.97.182.137]:40938)
    by eggs.gnu.org with esmtp (Exim 4.69)
    (envelope-from ) id 1OszpI-0000q0-Jp
    for qemu-devel@nongnu.org; Tue, 07 Sep 2010 11:11:12 -0400
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236])
    by e7.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id o87EumMt004114
    for ; Tue, 7 Sep 2010 10:56:48 -0400
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170])
    by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o87FBBKC086358
    for ; Tue, 7 Sep 2010 11:11:11 -0400
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1])
    by d03av04.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id o87FBAgx025527
    for ; Tue, 7 Sep 2010 09:11:10 -0600
Message-ID: <4C86560D.9030308@linux.vnet.ibm.com>
Date: Tue, 07 Sep 2010 10:11:09 -0500
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration
References: <4C864118.7070206@linux.vnet.ibm.com> <4C864D65.6090004@redhat.com>
    <4C86510E.9010303@linux.vnet.ibm.com> <4C8653E9.2070905@redhat.com>
In-Reply-To: <4C8653E9.2070905@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Kevin Wolf
Cc: "libvir-list@redhat.com", qemu-devel, Stefan Hajnoczi

On 09/07/2010 10:02 AM, Kevin Wolf wrote:
> On 07.09.2010 16:49, Anthony Liguori wrote:
>
>>> Shouldn't it be a runtime option?
>>> You can use the very same image with
>>> copy-on-read or copy-on-write and it will behave the same (except for
>>> performance), so it's not an inherent feature of the image file.
>>>
>>>
>> The way it's implemented in QED is that it's a compatible feature. This
>> means that implementations are allowed to ignore it if they want to.
>> It's really a suggestion.
>>
> Well, the point is that I see no reason why an image should contain this
> suggestion. There's really nothing about an image that could reasonably
> indicate "use this better with copy-on-read than with copy-on-write".
>
> It's a decision you make when using the image.
>

Copy-on-read is, in many cases, a property of the backing file because
it suggests that the backing file is either very slow or potentially
volatile.

IOW, let's say I'm an image distributor and I want to provide my images
in a QED format that actually streams the image from an HTTP server. I
could provide a QED file without the copy-on-read bit set, but I'd really
like to convey this information as part of the image.

You can argue that I should also provide a config file with the
copy-on-read flag set, but you could make the same argument about backing
files.

>> So yes, you could have a run time switch that overrides the feature bit
>> on disk and either forces copy-on-read on or off.
>>
>> Do we have a way to pass block drivers run time options?
>>
> We'll get them with -blockdev. Today we're using colons for format
> specific and separate -drive options for generic things.
>

That's right. I think I'd rather wait for -blockdev.

>> You need to understand the cluster boundaries in order to optimize the
>> metadata updates. Sure, you can expose interfaces to the block layer to
>> give all of this info but that's solving the same problem for doing
>> block level copy-on-write.
>>
>> The other challenge is that for copy-on-read to be efficient, you
>> really need a format that can distinguish between unallocated sectors
>> and zero sectors and do zero detection during the copy-on-read
>> operation. Otherwise, if you have a 10G virtual disk with a backing
>> file that's 1GB in size, copy-on-read will result in the leaf being 10G
>> instead of ~1GB.
>>
> That's a good point. But it's not a reason to make the interface
> specific to QED just because other formats would probably not implement
> it as efficiently.
>

You really can't do as good a job in the block layer because you have
very little info about the characteristics of the disk image.

Regards,

Anthony Liguori

> Kevin
>
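
To make the zero-detection point above concrete, here is a minimal sketch of copy-on-read over a sparse backing file. It is a toy model and not QED's actual implementation: the `ToyImage` class, the `ZERO` marker, `stream_all`, and the 64 KiB cluster size are all hypothetical names and values chosen for illustration, and the sizes are scaled down (10 MiB virtual disk, 1 MiB of real data) to mirror the 10G/1GB example.

```python
CLUSTER_SIZE = 64 * 1024   # assumed cluster size, for illustration only
ZERO = object()            # marker: cluster reads as zeros, no data stored


class ToyImage:
    """Toy leaf image: maps cluster index -> bytes, ZERO, or absent (unallocated)."""

    def __init__(self, backing, num_clusters, zero_detect):
        self.backing = backing            # dict: cluster index -> bytes (sparse)
        self.num_clusters = num_clusters
        self.zero_detect = zero_detect
        self.clusters = {}

    def read(self, idx):
        entry = self.clusters.get(idx)
        if entry is ZERO:
            return bytes(CLUSTER_SIZE)
        if entry is not None:
            return entry
        # Unallocated in the leaf: fetch from the backing file.
        # Holes in the backing file read back as zeros.
        data = self.backing.get(idx, bytes(CLUSTER_SIZE))
        # Copy-on-read: remember the result in the leaf.
        if self.zero_detect and not any(data):
            self.clusters[idx] = ZERO     # record "zero" without storing data
        else:
            self.clusters[idx] = data     # store a full cluster
        return data

    def allocated_bytes(self):
        return sum(len(c) for c in self.clusters.values() if c is not ZERO)


def stream_all(image):
    """Read every cluster once, as a streaming job would."""
    for idx in range(image.num_clusters):
        image.read(idx)


# 160 clusters of virtual disk (10 MiB), only the first 16 hold data (1 MiB).
backing = {i: b"\x01" * CLUSTER_SIZE for i in range(16)}

naive = ToyImage(backing, 160, zero_detect=False)
smart = ToyImage(backing, 160, zero_detect=True)
stream_all(naive)
stream_all(smart)

print(naive.allocated_bytes() // CLUSTER_SIZE, "clusters stored without zero detection")
print(smart.allocated_bytes() // CLUSTER_SIZE, "clusters stored with zero detection")
```

Without a way to record "this cluster is zero" separately from "this cluster is unallocated", the leaf in this model has to store every cluster it has read, so it grows to the full virtual size.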