Linux Btrfs filesystem development
From: jim owens <jowens@hp.com>
To: Morey Roof <moreyroof@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: New feature Idea
Date: Wed, 13 Aug 2008 14:54:22 -0400
Message-ID: <48A32DDE.8070203@hp.com>
In-Reply-To: <48A320A0.80609@gmail.com>

Morey Roof wrote:
> I have been thinking about a new feature to start work on that I am 
> interested in and I was hoping people could give me some feedback and 
> ideas of how to tackle it.  Anyways, I want to create a data 
> deduplication system that can work in two different modes.  One mode is 
> that when the system is idle or not beyond a set load point a background 
> process would scan the volume for duplicate blocks.  The other mode 
> would be used for systems that are nearline or backup systems that don't 
> really care about the performance and it would do the deduplication 
> during block allocation.
> 
> One of the ways I was thinking of to find the duplicate blocks would be 
> to use the checksums as a quick compare.  If the checksums match then do 
> a complete compare before adjusting the nodes on the files.  However, I 
> believe that I will need to create a tree based on the checksum values.
> 
> So any other ideas and thoughts about this?

Don't do it!!!

OK, I know Chris has described some block sharing.  But I hate it.

If I copy "resume" to "resume.save", it is because I want 2 copies
for safety.  I don't want the fs to reduce it to 1 copy.  And
reducing the duplicates is exactly the opposite of Chris's paranoid
make-multiple-copies-by-default approach.

Now feel free to tell me I'm an idiot (other people do) :)

jim
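
For reference, the checksum-then-compare scan Morey describes in the
quoted text can be sketched roughly as follows.  This is only an
illustration under stated assumptions, not btrfs code: the function
name, the 4096-byte block size, and the use of zlib.crc32 as the
checksum are all hypothetical stand-ins, and a plain dict stands in for
the checksum tree he mentions.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def find_duplicate_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map each duplicate block index to the first block with the same content.

    Hypothetical helper: the checksum is only a quick filter, and a full
    byte-for-byte compare confirms a match before a block counts as a duplicate.
    """
    # checksum -> indexes of distinct blocks already seen with that checksum
    # (a dict standing in for the on-disk checksum tree)
    by_checksum = defaultdict(list)
    duplicates = {}

    for offset in range(0, len(data), block_size):
        index = offset // block_size
        block = data[offset:offset + block_size]
        checksum = zlib.crc32(block)  # stand-in for the filesystem's checksum

        for candidate in by_checksum[checksum]:
            start = candidate * block_size
            # Checksums can collide, so do the complete compare.
            if data[start:start + block_size] == block:
                duplicates[index] = candidate
                break
        else:
            # No identical block seen yet: remember this one as canonical.
            by_checksum[checksum].append(index)

    return duplicates
```

A background scanner would feed something like this from idle-time
reads, while the allocation-time mode would do the same lookup before
writing a new block.  Either way the sketch only reports candidates;
whether sharing them is desirable is the question raised above.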
