Date: Thu, 10 Jan 2013 16:18:12 +0100
From: Benoît Canet
Message-ID: <20130110151812.GA3457@irqsave.net>
References: <20130109152443.GB3494@irqsave.net> <20130109163928.GC3494@irqsave.net>
Subject: Re: [Qemu-devel] QCOW2 deduplication design
To: Stefan Hajnoczi
Cc: Benoît Canet, Kevin Wolf, qemu-devel, Stefan Hajnoczi, Paolo Bonzini

> Now I understand. This case covers overwriting existing data with new
> contents. That is common :).
>
> But are you seeing a cluster with refcount > 1 being overwritten
> often? If so, it's worth looking into why that happens. It may be a
> common pattern for certain file systems or applications to write
> initial data 'A' first and then change it later. This actually
> suggests against online dedup, or at least for something like qcow2
> delayed write where we don't "commit" yet because the guest will
> probably still modify or append to the data.

I apologize for the incorrect information I gave earlier.

The deduplication metrics accounting code was confusing the delete
cluster operation with the more common operation of removing a hash
from the tree.
After fixing the metrics code, common file manipulations in the guest
generate only a few delete cluster operations.

The cases where a lot of clusters are deleted are overwriting the image
with zeroes and reformatting a partition with ext3.

Regards

Benoît