From: Stefan Hajnoczi
To: Kevin Wolf
Cc: Anthony Liguori, Marcelo Tosatti, Stefan Hajnoczi, qemu-devel@nongnu.org
Date: Wed, 12 Oct 2011 12:59:52 +0100
Subject: Re: [Qemu-devel] [PATCH 0/3] block: zero write detection
In-Reply-To: <4E957408.3010204@redhat.com>
References: <1318002589-11315-1-git-send-email-stefanha@linux.vnet.ibm.com>
 <4E9448B4.4060105@redhat.com>
 <20111012103937.GB24150@stefanha-thinkpad.localdomain>
 <4E957408.3010204@redhat.com>

On Wed, Oct 12, 2011 at 12:03 PM, Kevin Wolf wrote:
> On 12.10.2011 12:39, Stefan Hajnoczi wrote:
>> On Tue, Oct 11, 2011 at 03:46:28PM +0200, Kevin Wolf wrote:
>>> On 07.10.2011 17:49, Stefan Hajnoczi wrote:
>>>> Image streaming copies data from the backing file into the image file.  It is
>>>> important to represent zero regions from the backing file efficiently during
>>>> streaming, otherwise the image file grows to the full virtual disk size and
>>>> loses sparseness.
>>>>
>>>> There are two ways to implement zero write detection, and they are subtly
>>>> different:
>>>>
>>>> 1. Allow image formats to provide efficient representations for zero regions.
>>>>    QED does this with "zero clusters" and it has been discussed for qcow2v3.
>>>>
>>>> 2. During streaming, check for zeroes and skip writing to the image file when
>>>>    zeroes are detected.
>>>>
>>>> However, there are some disadvantages to #2 because it leaves unallocated holes
>>>> in the image file.  If image streaming is aborted before it completes then it
>>>> will be necessary to reread all unallocated clusters from the backing file upon
>>>> resuming image streaming.  Potentially worse is that a backing file over a
>>>> slow remote connection will have the zero regions fetched again and again if
>>>> the guest accesses them.  #1 avoids these problems because the image file
>>>> contains information on which regions are zeroes and do not need to be
>>>> refetched.
>>>>
>>>> This patch series implements #1 with the existing QED zero cluster feature.  In
>>>> the future we can add qcow2v3 zero clusters too.  We can also implement #2
>>>> directly in the image streaming code as a fallback when the BlockDriver does
>>>> not support zero detection #1 itself.  That way we get the best possible zero
>>>> write detection, depending on the image format.
>>>>
>>>> Here is a qemu-iotest to verify that zero write detection is working:
>>>> http://repo.or.cz/w/qemu-iotests/stefanha.git/commitdiff/226949695eef51bdcdea3e6ce3d7e5a863427f37
>>>>
>>>> Stefan Hajnoczi (3):
>>>>   block: add zero write detection interface
>>>>   qed: add zero write detection support
>>>>   qemu-io: add zero write detection option
>>>>
>>>>  block.c     |   16 +++++++++++
>>>>  block.h     |    2 +
>>>>  block/qed.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++++------
>>>>  block_int.h |   13 +++++++++
>>>>  qemu-io.c   |   35 ++++++++++++++++++++-----
>>>>  5 files changed, 132 insertions(+), 15 deletions(-)
>>>
>>> It's good to have an option to detect zero writes and turn them into
>>> zero clusters, but it's something that introduces some overhead and
>>> probably won't be suitable as a default.
>>
>> Yes, this series simply has a bdrv_set_zero_detection() API to toggle it
>> at runtime.  By default it is off to save CPU cycles.
>>
>>> I think what we really want to have for image streaming is an API that
>>> explicitly writes zeros and doesn't have to look at the whole buffer (or
>>> actually doesn't even get a buffer).
>>
>> I didn't take this approach to avoid having block drivers handle the
>> zero buffers that need to be allocated when the region does not cover
>> entire clusters.  It can be done for sure but I'm not sure how to do it
>> nicely yet.
>
> If I understand your QED code right, in such cases it ignores that there
> are some zeros that could be turned into a zero cluster. Considering
> this and that you always fill a buffer just to be able to check it
> (which is known to take considerable time from qemu-img convert
> experience) - how could any solution that works consistently, but
> requires an allocation in the block driver be less nice?

The fallback is easy when you already have a buffer - just do the write :).

My point is that this patch is the simplest approach.  Other approaches
can optimize better and the question is whether they are worth doing.

Stefan
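
For readers following the thread, the buffer-based approach with a plain-write fallback can be illustrated roughly as below. This is only a sketch under stated assumptions, not the code from the patch series: `buffer_is_all_zero`, `classify_write`, `enum write_action` and the `detect_zeroes` flag are hypothetical names invented for illustration, and the rule that only cluster-aligned, whole-cluster requests are turned into zero clusters is a simplification of the behaviour described in the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of a write request; not a QEMU type. */
enum write_action {
    WRITE_ZERO_CLUSTER, /* record zero clusters in metadata, write no data */
    WRITE_DATA          /* fall back to an ordinary data write */
};

/* Return true if all len bytes of buf are zero (simple byte-wise scan). */
static bool buffer_is_all_zero(const uint8_t *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Decide how to handle a write of len bytes at offset.
 *
 * Assumptions for this sketch: detection is off unless detect_zeroes is
 * set, and only requests that are cluster-aligned and cover whole
 * clusters may become zero clusters.  Everything else simply uses the
 * buffer the caller already has and performs a normal write.
 */
static enum write_action classify_write(bool detect_zeroes, uint64_t offset,
                                        const uint8_t *buf, size_t len,
                                        size_t cluster_size)
{
    assert(cluster_size > 0);

    if (!detect_zeroes) {
        return WRITE_DATA; /* default: skip the scan to save CPU cycles */
    }
    if (len == 0 || offset % cluster_size != 0 || len % cluster_size != 0) {
        return WRITE_DATA; /* partial cluster: just do the write */
    }
    return buffer_is_all_zero(buf, len) ? WRITE_ZERO_CLUSTER : WRITE_DATA;
}
```

The scan touches every byte of an all-zero buffer, which is the overhead raised in the thread; an explicit write-zeroes interface would avoid both the scan and the buffer, at the cost of handling partial clusters inside the block driver.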