From: jim owens <jowens@hp.com>
To: Oliver Mattos <oliver.mattos08@imperial.ac.uk>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Auto-sparseifying
Date: Thu, 11 Dec 2008 08:54:47 -0500
Message-ID: <49411BA7.8030708@hp.com>
In-Reply-To: <1228989948.17969.24.camel@mattos-laptop>

... and also Data De-duplication...

A reality check before people go off the deep end here
on these two space-saving methods.  It is interesting to
know the "duplicate 512 byte blocks" and "null sequences"
from a statistical point of view.  But it is not practical
to sparse/de-dup at such a small granularity in the FS.

The trade-off everyone is missing is that each sparse/dup region
is an *extent* that must be tracked in the FS, and to do
a read you must send a new *I/O for each disk extent*.

So we blow the metadata structures into unwieldy sizes
and we beat the crap out of the disk.  Even with an SSD
we add tremendous traffic in the I/O pipeline.
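
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  It assumes the worst case, where every block of the chosen
granularity ends up as its own extent, and it assumes a made-up cost of
64 bytes of metadata per extent record.  That 64 is purely illustrative
and is not the size of any real btrfs structure; the function names are
hypothetical too.

```python
# Illustrative assumption only: bytes of metadata needed to track one extent.
EXTENT_RECORD_BYTES = 64


def worst_case_extents(file_size: int, granularity: int) -> int:
    """Upper bound on extents if every `granularity`-byte block is its own extent."""
    return -(-file_size // granularity)  # ceiling division


def metadata_bytes(file_size: int, granularity: int,
                   record_bytes: int = EXTENT_RECORD_BYTES) -> int:
    """Metadata needed to track the worst-case extent count."""
    return worst_case_extents(file_size, granularity) * record_bytes


if __name__ == "__main__":
    one_gib = 1 << 30
    for granularity in (512, 4096, 1 << 20):
        extents = worst_case_extents(one_gib, granularity)
        meta_kib = metadata_bytes(one_gib, granularity) // 1024
        # One read I/O per disk extent, so `extents` is also the worst-case I/O count.
        print(f"{granularity:>8} B blocks: {extents:>8} extents/I/Os, "
              f"{meta_kib} KiB metadata")
```

Under those assumptions a 1 GiB file tracked at 512 bytes can need up to
2097152 extents and 128 MiB of metadata, versus 1024 extents and 64 KiB
at 1 MiB granularity.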

Sparse/de-dup on VM page sizes may work OK for small files
but is still not efficient for large files.
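
The effect of granularity on fragmentation can also be seen on actual
data.  The sketch below is only an illustration, not how any filesystem
does it: it walks a byte buffer in fixed-size blocks, treats an all-zero
block as a hole, and counts how many separate data runs (each one a
would-be disk extent, so one read I/O) and holes result.  The helper
name count_runs is hypothetical.

```python
from typing import Tuple


def count_runs(data: bytes, granularity: int) -> Tuple[int, int]:
    """Return (data_runs, holes) when all-zero blocks of `granularity` bytes become holes.

    Adjacent blocks of the same kind merge into a single run.
    """
    data_runs = 0
    holes = 0
    prev_is_hole = None  # kind of the previous block; None before the first block
    for offset in range(0, len(data), granularity):
        block = data[offset:offset + granularity]
        is_hole = not any(block)  # True when every byte in the block is zero
        if is_hole != prev_is_hole:
            if is_hole:
                holes += 1
            else:
                data_runs += 1
            prev_is_hole = is_hole
    return data_runs, holes


if __name__ == "__main__":
    # 4 KiB of data alternating 512 non-zero bytes with 512 zero bytes.
    sample = (b"\x01" * 512 + b"\x00" * 512) * 4
    for granularity in (512, 4096):
        print(granularity, count_runs(sample, granularity))
```

For this sample, 512-byte granularity yields 4 data runs and 4 holes,
while 4096-byte granularity leaves it as a single data run with no
holes: less space saved, but one extent to track and one I/O to read.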

jim

