From: Boyd Waters <waters.boyd@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Content based storage
Date: Fri, 19 Mar 2010 22:46:27 -0400
Message-ID: <2b0225fb1003191946k1cf92c63q18e40d41274ce3e8@mail.gmail.com>
In-Reply-To: <201003172043.17314.hka@qbs.com.pl>

2010/3/17 Hubert Kario <hka@qbs.com.pl>:
>
> Read further, Sun did provide a way to enable the compare step by using
> "verify" instead of "on":
> zfs set dedup=verify <pool>

I have tested ZFS deduplication on the same data set that I'm using to
test btrfs. I used a 5-element raidz with dedup=on, which uses SHA256
for both ZFS checksumming and duplicate detection, on Build 133 of
OpenSolaris for x86_64.

Subjectively, I felt that the array writes were slower than without
dedup. For a while, the option for "dedup=fletcher4,verify" was in the
system, which permitted the (faster, more prone to collisions)
fletcher4 hash for ZFS checksum, and full comparison in the
(relatively rare) case of collision. Darren Moffat worked to unify the
ZFS SHA256 code with the OpenSolaris crypto-api implementation, which
improved performance [1]. But I was not able to test that
implementation.
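
The trade-off between a fast, collision-prone checksum and a full
comparison can be sketched in a few lines. This is only a toy model of
the idea, not the ZFS code: the DedupStore class and its parameters are
hypothetical, and zlib.adler32 stands in for a fast weak checksum (it is
not fletcher4).

```python
import zlib


class DedupStore:
    """Toy block store: look up by checksum, optionally verify byte-for-byte."""

    def __init__(self, checksum=zlib.adler32, verify=True):
        self.checksum = checksum  # any callable mapping bytes -> hashable key
        self.verify = verify      # compare full contents on a checksum match
        self.table = {}           # checksum -> list of distinct stored blocks

    def write(self, block):
        """Return True if the block was stored, False if it was deduplicated."""
        candidates = self.table.setdefault(self.checksum(block), [])
        for stored in candidates:
            # Without verify, a checksum match alone counts as a duplicate.
            if not self.verify or stored == block:
                return False
        candidates.append(block)
        return True

    def physical_blocks(self):
        return sum(len(blocks) for blocks in self.table.values())


if __name__ == "__main__":
    # A deliberately terrible checksum (block length) to force a collision.
    for verify in (True, False):
        store = DedupStore(checksum=len, verify=verify)
        store.write(b"aaaa")
        store.write(b"bbbb")  # same checksum, different contents
        print(verify, store.physical_blocks())
```

With verify enabled, the colliding block is kept as a separate block;
with verify disabled it is silently treated as a duplicate, which is why
a weak checksum is only safe together with the compare step.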

My dataset reported a dedup factor of 1.28 for about 4TB. If that
factor is logical size over physical size, then about 22% of the data
(1 - 1/1.28) was stored as references to existing blocks. This seemed
plausible, as the dataset includes multiple backups of a 400GB data
set, as well as numerous VMware virtual machines.
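
For reference, here is the arithmetic behind a dedup factor, as a small
sketch. It assumes the factor is logical size divided by physical size,
and that the "about 4TB" figure is the logical size; neither assumption
is confirmed above, and the function names are made up for illustration.

```python
def space_saved_fraction(dedup_ratio):
    """Fraction of logical data that did not need its own physical copy."""
    if dedup_ratio < 1.0:
        raise ValueError("dedup ratio is expected to be >= 1.0")
    return 1.0 - 1.0 / dedup_ratio


def physical_size(logical_size, dedup_ratio):
    """Physical size implied by a logical size and a dedup ratio."""
    return logical_size / dedup_ratio


if __name__ == "__main__":
    ratio = 1.28      # dedup factor reported in this message
    logical_tb = 4.0  # assumed to be the logical size
    print(f"saved fraction: {space_saved_fraction(ratio):.1%}")
    print(f"physical size:  {physical_size(logical_tb, ratio):.3f} TB")
```

Under those assumptions a factor of 1.28 corresponds to roughly 22% of
the logical data being deduplicated, or a bit under 0.9TB out of 4TB.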

Despite the performance hit, I'd be pleased to see work on this
continue. Darren Moffat's performance improvements were encouraging,
and the data set integrity was rock-solid. I had a disk failure during
this test, which almost certainly had far more impact on performance
than the deduplication: failed writes to the disk were blocking I/O,
and it got pretty bad before I was able to replace the disk. I never
lost any data, and array management was dead simple.

So anyway, FWIW, the ZFS dedup implementation is a good one, and has
headroom for improvement.

Finally, ZFS also lets you set a threshold on how many references a
deduplicated block may accumulate: once the reference count rises above
that threshold, ZFS automatically stores an additional (ditto) copy of
the block. (dedupditto property) [2]


[1] http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via
[2] http://opensolaris.org/jive/thread.jspa?messageID=426661
