From: "Benoît Canet" <benoit.canet@irqsave.net>
To: ronnie sahlberg <ronniesahlberg@gmail.com>
Cc: "Benoît Canet" <benoit.canet@irqsave.net>,
kwolf@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com,
pbonzini@redhat.com
Subject: Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication
Date: Wed, 2 Jan 2013 19:55:26 +0100
Message-ID: <20130102185526.GC30225@irqsave.net>
In-Reply-To: <CAN05THTRuUAv7337kH-z+etbS7Esw6=u97d9Acr1dju9-9ZO5Q@mail.gmail.com>
On Wednesday 02 Jan 2013 at 10:47:48 (-0800), ronnie sahlberg wrote:
> Do you really need to resolve the conflicts?
> It might be easier and sufficient to just flag those hashes where a
> conflict has been detected as: "don't dedup this hash anymore,
> collisions have been seen."
True, that's more elegant.
The user would still need to specify the verify option at creation time,
and a read would be required before each verification, but it would not
make the qcow2 format uglier.
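
A minimal sketch of the "flag the hash and stop deduplicating it" idea, in
Python for brevity. Everything here is an assumption made for illustration:
the `DedupStore` class, its in-memory tables, and the injectable `hash_fn`
are hypothetical and are not part of the patchset or of the qcow2 format.
The point is only that a collision needs one flag per hash, plus a read of
the existing cluster when verification is enabled.

```python
import hashlib


class DedupStore:
    """Toy in-memory cluster store (hypothetical, for illustration only)."""

    def __init__(self, verify=True, hash_fn=None):
        self.verify = verify
        # Default to SHA-256; a weak hash_fn can be injected to force collisions.
        self.hash_fn = hash_fn or (lambda data: hashlib.sha256(data).digest())
        self.clusters = []     # stored cluster contents, indexed by position
        self.by_hash = {}      # digest -> index of the first cluster with it
        self.no_dedup = set()  # digests for which a collision has been seen

    def write(self, data):
        """Store one cluster and return the index that now holds it."""
        digest = self.hash_fn(data)
        index = self.by_hash.get(digest)
        if index is not None and digest not in self.no_dedup:
            # Verification costs a read of the existing cluster.
            if not self.verify or self.clusters[index] == data:
                return index  # deduplicated: reuse the existing cluster
            # Same hash, different content: never dedup this hash again.
            self.no_dedup.add(digest)
        self.clusters.append(data)
        new_index = len(self.clusters) - 1
        if index is None:
            self.by_hash[digest] = new_index
        return new_index


# Usage: a deliberately weak hash (first byte only) provokes a collision.
store = DedupStore(hash_fn=lambda data: data[:1])
first = store.write(b"aaaa")   # stored as a new cluster
again = store.write(b"aaaa")   # same content, deduplicated
other = store.write(b"abbb")   # collides with b"aaaa", hash gets flagged
later = store.write(b"aaaa")   # flagged hash, so stored again without dedup
```

The trade-off in this sketch is that a flagged hash loses deduplication
entirely, but no extra on-disk structure is needed to resolve the collision.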
>
>
> On Wed, Jan 2, 2013 at 10:40 AM, Benoît Canet <benoit.canet@irqsave.net> wrote:
> > On Wednesday 02 Jan 2013 at 12:26:37 (-0600), Troy Benjegerdes wrote:
> >> The probability may be 'low' but it is not zero. Just because it's
> >> hard to calculate the hash doesn't mean you can't do it. If your
> >> input data is not random the probability of a hash collision is
> >> going to get skewed.
> >>
> >> Read about how Bitcoin uses hashes.
> >>
> >> I need a budget of around $10,000 or so for some FPGAs and/or GPU cards,
> >> and I can make a regression test that will create deduplication hash
> >> collisions on purpose.
> >
> > It's not a problem: as Eric pointed out while reviewing the previous patchset,
> > there is a small space left filled with zeroes in the deduplication block.
> > A bit could be set in it when a collision is detected, and an offset could point
> > to a cluster used to resolve collisions.
> >
> >>
> >>
> >> On Wed, Jan 02, 2013 at 06:33:24PM +0100, Benoît Canet wrote:
> >> > > How does this code handle hash collisions, and do you have some regression
> >> > > tests that purposefully create a dedup hash collision, and verify that the
> >> > > 'right thing' happens?
> >> >
> >> > The two hash functions that can be used are cryptographic and not broken yet,
> >> > so nobody knows how to generate a collision.
> >> >
> >> > You can do the math to calculate the probability of a collision using a 256-bit
> >> > hash while processing 1 EiB of data; the result is so low that you can consider
> >> > it won't happen.
> >> > The sha256 ZFS deduplication works the same way regarding collisions.
> >> >
> >> > I currently use qemu-io-test for testing purposes, and iozone with the -w flag
> >> > in the guest.
> >> > I would like to find a good deduplication stress test to run in a guest.
> >> >
> >> > Regards
> >> >
> >> > Benoît
> >> >
> >> > > It's great that this almost works, but it seems rather dangerous to put
> >> > > something like this into the mainline code without some regression tests.
> >> > >
> >> > > (I'm also suspecting the regression test will be a great way to find
> >> > > flakey hardware)
> >> > >
> >> > > --------------------------------------------------------------------------
> >> > > Troy Benjegerdes 'da hozer' hozer@hozed.org
> >> > >
> >> > > Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> >> > > software & hardware (http://q3u.be) stuff and not get a real job.
> >> > > Charles Schulz had the best answer:
> >> > >
> >> > > "Why do musicians compose symphonies and poets write poems? They do it
> >> > > because life wouldn't have any meaning for them if they didn't. That's why
> >> > > I draw cartoons. It's my life." -- Charles Schulz
> >>
> >>
> >
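
The "do the math" argument quoted above can be made concrete with the
birthday bound. This is a sketch under stated assumptions: hash outputs are
treated as uniformly random (so it covers accidental collisions only, not an
attacker searching for one), and the 64 KiB cluster size is taken from the
title of patch 28/30. The function name `log2_collision_bound` is made up
for this example.

```python
import math


def log2_collision_bound(total_bytes, cluster_bytes, hash_bits):
    """Return log2 of the birthday upper bound on the collision probability.

    With n hashed clusters and a uniformly random b-bit hash, the chance
    that any two distinct clusters share a hash is at most
    n * (n - 1) / 2 / 2**b.
    """
    n = total_bytes // cluster_bytes
    pairs = n * (n - 1) // 2  # number of distinct cluster pairs
    return math.log2(pairs) - hash_bits


# 1 EiB of data, 64 KiB clusters (assumed), 256-bit hash.
exponent = log2_collision_bound(2**60, 64 * 1024, 256)
print(f"collision probability <= 2^{exponent:.0f}")
```

With these numbers there are 2^44 clusters, roughly 2^87 pairs, and the
bound comes out at about 2^-169. The result scales with the square of the
cluster count, so a smaller cluster size raises it, but only by a few powers
of two.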
Thread overview: 53+ messages
2013-01-02 16:16 [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 01/30] qcow2: Add deduplication to the qcow2 specification Benoît Canet
2013-01-03 18:18 ` Eric Blake
2013-01-04 14:49 ` Benoît Canet
2013-01-16 14:50 ` Benoît Canet
2013-01-16 15:58 ` Eric Blake
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 02/30] qcow2: Add deduplication structures and fields Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 03/30] qcow2: Add qcow2_dedup_read_missing_and_concatenate Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 04/30] qcow2: Make update_refcount public Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 05/30] qcow2: Create a way to link to l2 tables when deduplicating Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 06/30] qcow2: Add qcow2_dedup and related functions Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 07/30] qcow2: Add qcow2_dedup_store_new_hashes Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 08/30] qcow2: Implement qcow2_compute_cluster_hash Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 09/30] qcow2: Extract qcow2_dedup_grow_table Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 10/30] qcow2: Add qcow2_dedup_grow_table and use it Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 11/30] qcow2: create function to load deduplication hashes at startup Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 12/30] qcow2: Load and save deduplication table header extension Benoît Canet
2013-01-05 0:02 ` Eric Blake
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 13/30] qcow2: Extract qcow2_do_table_init Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 14/30] qcow2-cache: Allow to choose table size at creation Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 15/30] qcow2: Add qcow2_dedup_init and qcow2_dedup_close Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 16/30] qcow2: Extract qcow2_add_feature and qcow2_remove_feature Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 17/30] block: Add qemu-img dedup create option Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 18/30] qcow2: Behave correctly when refcount reach 0 or 2^16 Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 19/30] qcow2: Integrate deduplication in qcow2_co_writev loop Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 20/30] qcow2: Serialize write requests when deduplication is activated Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 21/30] qcow2: Add verification of dedup table Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 22/30] qcow2: Adapt checking of QCOW_OFLAG_COPIED for dedup Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 23/30] qcow2: Add check_dedup_l2 in order to check l2 of dedup table Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 24/30] qcow2: Do not overwrite existing entries with QCOW_OFLAG_COPIED Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 25/30] qcow2: Integrate SKEIN hash algorithm in deduplication Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 26/30] qcow2: Add lazy refcounts to deduplication to prevent qcow2_cache_set_dependency loops Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 27/30] qcow2: Use large L2 table for deduplication Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 28/30] qcow: Set dedup cluster block size to 64KB Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 29/30] qcow2: init and cleanup deduplication Benoît Canet
2013-01-02 16:16 ` [Qemu-devel] [RFC V4 30/30] qemu-iotests: Filter dedup=on/off so existing tests don't break Benoît Canet
2013-01-02 16:42 ` Eric Blake
2013-01-02 16:50 ` Benoît Canet
2013-01-02 17:10 ` [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication Troy Benjegerdes
2013-01-02 17:33 ` Benoît Canet
2013-01-02 18:01 ` Eric Blake
2013-01-02 18:16 ` Benoît Canet
2013-01-02 18:26 ` Troy Benjegerdes
2013-01-02 18:40 ` Benoît Canet
2013-01-02 18:47 ` ronnie sahlberg
2013-01-02 18:55 ` Benoît Canet [this message]
2013-01-02 19:18 ` Troy Benjegerdes
2013-01-03 2:16 ` ronnie sahlberg
2013-01-03 12:39 ` Stefan Hajnoczi
2013-01-03 19:51 ` Troy Benjegerdes
2013-01-04 7:09 ` Dietmar Maurer
2013-01-04 9:49 ` Stefan Hajnoczi
2013-01-03 17:18 ` Benoît Canet