From: Thomas Glanzmann <thomas@glanzmann.de>
To: Chris Mason <chris.mason@oracle.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Data Deduplication with the help of an online filesystem check
Date: Tue, 28 Apr 2009 17:59:00 +0200
Message-ID: <20090428155900.GA1722@cip.informatik.uni-erlangen.de>
In-Reply-To: <1240839448.26451.13.camel@think.oraclecorp.com>

Hello,
I have a few more questions about this:

        - Is there a checksum for every block in btrfs?

        - Is it possible to retrieve these checksums from userland?

        - Is it possible to use a blocksize of 4 or 8 kbyte with btrfs?

To get a bit more specific: if it is relatively easy to identify and
deduplicate blocks, and if btrfs supports relatively small block sizes
like 4 / 8 kbyte, it is the perfect candidate for VMs. To give you some
data: I took 300 Gbyte of VMs running different operating systems (note
that this is the disk space actually in use by the VMs, not the
provisioned space) and used a perl script to identify how much data could
be deduplicated at a given block size:

300 Gbyte of used storage from several production VMs running the
following operating systems:
\begin{itemize}
        \item Red Hat Linux 32 and 64 Bit (Release 3, 4 and 5)
        \item SuSE Linux 32 and 64 Bit (SLES 9 and 10)
        \item Windows 2003 Std. Edition 32 Bit
        \item Windows 2003 Enterprise Edition 64 Bit
\end{itemize}
\begin{tabular}{r|r}
Block size & Deduplicated data \\
\hline
128k      &  29.9 G \\
 64k      &  41.3 G \\
 32k      &  59.2 G \\
 16k      &  82   G \\
  8k      & 112   G \\
\end{tabular}

Bottom line: with an 8 kbyte block size, more than 33% of the data
(112 of 300 Gbyte, about 37%) can be deduplicated on a production set
of VMs.
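
For illustration, a measurement of this kind can be sketched in a few
lines of Python. This is not the perl script mentioned above, only a
minimal sketch under stated assumptions: the function name
`dedup_savings` is hypothetical, blocks are cut at fixed offsets, and
SHA-256 is an arbitrary choice of hash for recognising identical blocks.

```python
import hashlib
import io
from typing import BinaryIO, Iterable, Tuple


def dedup_savings(streams: Iterable[BinaryIO], block_size: int) -> Tuple[int, int]:
    """Return (total_bytes, duplicate_bytes) for fixed-size blocks.

    Hypothetical helper: a block counts as a duplicate when a block with
    the same SHA-256 digest has already been seen in any of the streams.
    """
    seen = set()      # digests of blocks encountered so far
    total = 0         # bytes read across all streams
    duplicate = 0     # bytes that belong to repeated blocks
    for stream in streams:
        while True:
            block = stream.read(block_size)
            if not block:
                break
            total += len(block)
            digest = hashlib.sha256(block).digest()
            if digest in seen:
                duplicate += len(block)
            else:
                seen.add(digest)
    return total, duplicate


if __name__ == "__main__":
    # Toy "disk image": the same 4 kbyte block of A's appears three times.
    image = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    for size in (4096, 8192):
        total, dup = dedup_savings([io.BytesIO(image)], size)
        print(size, total, dup)
```

On the toy image, the 4 kbyte run finds two repeated blocks while the
8 kbyte run finds none, which mirrors the trend in the table: smaller
blocks expose more duplicates. Dividing the duplicate bytes by the total
gives the ratio, e.g. 112 / 300 is about 37% for the 8k row.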

        Thomas

