From: Benoît Canet
Date: Wed, 2 Jan 2013 19:40:53 +0100
Subject: Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication
To: Troy Benjegerdes
Cc: Benoît Canet, kwolf@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com, pbonzini@redhat.com
Message-ID: <20130102184052.GB30225@irqsave.net>
In-Reply-To: <20130102182637.GR19472@us.grid.coop>
References: <1357143393-29832-1-git-send-email-benoit@irqsave.net> <20130102171057.GP19472@us.grid.coop> <20130102173324.GB29742@irqsave.net> <20130102182637.GR19472@us.grid.coop>

On Wednesday 02 Jan 2013 at 12:26:37 (-0600), Troy Benjegerdes wrote:
> The probability may be 'low' but it is not zero. Just because it's
> hard to calculate the hash doesn't mean you can't do it. If your
> input data is not random the probability of a hash collision is
> going to get skewed.
>
> Read about how Bitcoin uses hashes.
>
> I need a budget of around $10,000 or so for some FPGAs and/or GPU cards,
> and I can make a regression test that will create deduplication hash
> collisions on purpose.

It's not a problem: as Eric pointed out while reviewing the previous patchset,
there is a small area left filled with zeroes in the deduplication block.
A bit could be set in it when a collision is detected, and an offset could
point to a cluster used to resolve collisions.

>
>
> On Wed, Jan 02, 2013 at 06:33:24PM +0100, Benoît Canet wrote:
> > > How does this code handle hash collisions, and do you have some regression
> > > tests that purposefully create a dedup hash collision, and verify that the
> > > 'right thing' happens?
> >
> > The two hash functions that can be used are cryptographic and not broken yet.
> > So nobody knows how to generate a collision.
> >
> > You can do the math to calculate the probability of collision using a 256 bit
> > hash while processing 1EiB of data; the result is so low you can consider it
> > won't happen.
> > The sha256 ZFS deduplication works the same way regarding collisions.
> >
> > I currently use qemu-io-test for testing purposes and iozone with the -w flag in
> > the guest.
> > I would like to find a good deduplication stress test to run in a guest.
> >
> > Regards
> >
> > Benoît
> >
> > > It's great that this almost works, but it seems rather dangerous to put
> > > something like this into the mainline code without some regression tests.
> > >
> > > (I'm also suspecting the regression test will be a great way to find
> > > flakey hardware)
> > >
> > > --------------------------------------------------------------------------
> > > Troy Benjegerdes 'da hozer' hozer@hozed.org
> > >
> > > Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> > > software & hardware (http://q3u.be) stuff and not get a real job.
> > > Charles Shultz had the best answer:
> > >
> > > "Why do musicians compose symphonies and poets write poems? They do it
> > > because life wouldn't have any meaning for them if they didn't. That's why
> > > I draw cartoons.
> > > It's my life." -- Charles Shultz
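
The collision estimate mentioned in the quoted text above (256 bit hash, 1EiB of data) can be sketched as a birthday-bound calculation. This is only a back-of-the-envelope sketch: it assumes an ideal hash with uniformly distributed outputs, and the 4 KiB block size and the function names are assumptions made for illustration, not values taken from the patchset.

```python
from fractions import Fraction
import math

EIB = 2 ** 60  # bytes in one exbibyte


def collision_upper_bound(total_bytes: int, block_bytes: int, hash_bits: int) -> Fraction:
    """Union (birthday) bound on the probability that any two distinct
    blocks share a hash, assuming an ideal, uniformly distributed hash."""
    n = total_bytes // block_bytes          # number of hashed blocks
    pairs = n * (n - 1) // 2                # number of distinct block pairs
    return Fraction(pairs, 2 ** hash_bits)  # each pair collides with probability 2^-hash_bits


def log2_fraction(p: Fraction) -> float:
    """log2 of an exact fraction, computed from its integer parts."""
    return math.log2(p.numerator) - math.log2(p.denominator)


# Assumption for illustration only: 4 KiB deduplication blocks.
bound = collision_upper_bound(EIB, 4096, 256)
print(f"P(collision) <= 2^{log2_fraction(bound):.1f}")
```

With these assumed parameters there are 2^48 blocks, so the bound comes out at about 2^-161. The bound says nothing about a hash function that has been broken, which is the case a collision flag in the deduplication block would be meant to cover.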