Date: Thu, 27 Sep 2012 19:11:58 +0200
From: Benoît Canet
Message-ID: <20120927171157.GA4407@irqsave.net>
References: <1348756198-10845-1-git-send-email-benoit@irqsave.net> <50647793.7000106@redhat.com>
In-Reply-To: <50647793.7000106@redhat.com>
Subject: Re: [Qemu-devel] [partial RFC 0/2] QCow2 deduplication write mechanism
To: Avi Kivity
Cc: kwolf@redhat.com, pbonzini@redhat.com, qemu-devel@nongnu.org, stefanha@linux.vnet.ibm.com

> If I understood correctly, this does cluster-level dedup within a qcow2
> image.

Yes.

> What is the motivation here?  Reduce space usage if a guest copies files
> internally?

The first use case is people using LXC, OpenVZ or chroot inside VMs they rent
from their "cloud" provider.
The second use case is applications generating duplicate blocks. A well-known
CAD software does this, and on a filer doing deduplication on 4KB blocks the
dedup ratio is around five.

> Why use cluster granularity?  If the guest uses smaller granularity, it
> will misalign the data wrt cluster boundaries, and deduplication will fail.

Using cluster granularity allows the qcow2 refcount and L1/L2 tables to be
used to track duplicated blocks.
The patch embryo does read the missing data if the given data is misaligned.
To avoid hurting performance, using a smaller cluster size with the adequate
changes in qcow2 may help.

Regards

Benoît
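
To make the cluster-granularity idea above concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the qcow2 patch and not the qcow2 on-disk format: the class `DedupImage`, its fields (`l2`, `refcount`, `by_hash`), the use of SHA-256 and the 4 KB default cluster size are all assumptions made for illustration. It shows a logical-to-physical mapping plus reference counts tracking shared clusters, and a misaligned write reading back the missing bytes of a cluster before hashing it.

```python
import hashlib

CLUSTER_SIZE = 4096  # assumption for this sketch only


class DedupImage:
    """Toy model of cluster-level dedup (hypothetical, not qcow2 code)."""

    def __init__(self, cluster_size=CLUSTER_SIZE):
        self.cluster_size = cluster_size
        self.l2 = {}        # logical cluster index -> physical cluster index
        self.physical = {}  # physical cluster index -> cluster contents
        self.refcount = {}  # physical cluster index -> number of references
        self.by_hash = {}   # content hash -> physical cluster index
        self.next_physical = 0

    def read_cluster(self, logical):
        """Return the contents of a logical cluster (zeros if unallocated)."""
        phys = self.l2.get(logical)
        if phys is None:
            return bytes(self.cluster_size)
        return self.physical[phys]

    def _release(self, phys):
        """Drop one reference; free the cluster when nobody uses it."""
        self.refcount[phys] -= 1
        if self.refcount[phys] == 0:
            digest = hashlib.sha256(self.physical[phys]).digest()
            del self.by_hash[digest]
            del self.physical[phys]
            del self.refcount[phys]

    def _write_cluster(self, logical, data):
        """Map a full cluster, reusing an identical physical cluster if any."""
        digest = hashlib.sha256(data).digest()
        phys = self.by_hash.get(digest)
        if phys is None:
            # No duplicate known: allocate a new physical cluster.
            phys = self.next_physical
            self.next_physical += 1
            self.physical[phys] = data
            self.refcount[phys] = 0
            self.by_hash[digest] = phys
        old = self.l2.get(logical)
        if old == phys:
            return  # same content rewritten in place, nothing to update
        self.refcount[phys] += 1
        self.l2[logical] = phys
        if old is not None:
            self._release(old)

    def write(self, offset, data):
        """Write at any byte offset, deduplicating whole clusters."""
        pos = 0
        while pos < len(data):
            logical, start = divmod(offset + pos, self.cluster_size)
            n = min(self.cluster_size - start, len(data) - pos)
            if n == self.cluster_size:
                cluster = bytes(data[pos:pos + n])
            else:
                # Misaligned or partial write: read the missing bytes first
                # so that a complete cluster can be hashed.
                buf = bytearray(self.read_cluster(logical))
                buf[start:start + n] = data[pos:pos + n]
                cluster = bytes(buf)
            self._write_cluster(logical, cluster)
            pos += n
```

In this model, writing the same cluster-sized data at two aligned offsets leaves a single physical cluster with a reference count of two. A write that straddles a cluster boundary touches two clusters and forces a read of each before it can be hashed, which is the performance cost mentioned above and the reason a smaller cluster size may help.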