public inbox for linux-ext4@vger.kernel.org
From: Thomas Glanzmann <thomas@glanzmann.de>
To: David Newall <davidn@davidnewall.com>, tytso@thunk.org
Cc: LKML <linux-kernel@vger.kernel.org>, linux-ext4@vger.kernel.org
Subject: Re: zero out blocks of freed user data for operation a virtual machine environment
Date: Mon, 25 May 2009 07:26:53 +0200
Message-ID: <20090525052653.GA10812@cip.informatik.uni-erlangen.de>
In-Reply-To: <4A1A1094.3020903@davidnewall.com>

Hello David,

                        [ RESEND: CC forgotten ]

> Are you proposing to de-duplicate a live filesystem?

I do, but on the storage appliance / NFS server and not inside the VM.
But inside the VM, a filesystem could make the deduplication effort much
easier if it reported unused blocks to the outside world by overwriting
them with zeros. I have two scenarios in mind at the moment:

        - btrfs already has checksums. I'm currently evaluating whether
          crc32 is good enough to find candidates for deduplication
          or whether a stronger checksum is required. After that, one patch
          needs to be adapted and an ioctl needs to be implemented in btrfs
          which then double-checks that the blocks really are
          duplicates of each other and deduplicates them.

        - btrfs will at some point be able to generate a list of
          blocks that have changed between two transactions. This list
          can be used to create an offsite backup.
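
As a rough userspace illustration of the two scenarios above (this is not
btrfs code): the sketch below uses crc32 only to find deduplication
candidates and then confirms each candidate with a byte-for-byte
comparison, since crc32 can collide. It also lists the blocks that differ
between two snapshots by comparing them directly, whereas a filesystem
would derive that list from its transaction metadata. The 4096-byte block
size, the function names, and the in-memory block lists are all
assumptions made for illustration.

```python
import zlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(image, block_size=BLOCK_SIZE):
    """Split a bytes object into fixed-size blocks (last one may be short)."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def find_duplicates(blocks):
    """Map each duplicate block index to the index of its first identical block."""
    candidates = defaultdict(list)  # crc32 -> indices of unique blocks with that checksum
    duplicate_of = {}
    for idx, block in enumerate(blocks):
        crc = zlib.crc32(block)
        for earlier in candidates[crc]:
            # A matching crc32 only marks a candidate; verify the bytes.
            if blocks[earlier] == block:
                duplicate_of[idx] = earlier
                break
        else:
            # No verified match: remember this block as a new unique one.
            candidates[crc].append(idx)
    return duplicate_of


def changed_blocks(old_blocks, new_blocks):
    """Return indices of blocks that differ between two equally long snapshots."""
    return [i for i, (old, new) in enumerate(zip(old_blocks, new_blocks))
            if old != new]
```

Blocks that the guest filesystem has zeroed out all compare equal, so in
this sketch they collapse onto a single block, which is why zeroing freed
blocks would help a deduplicating backend.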

See also: http://thread.gmane.org/gmane.comp.file-systems.btrfs/2922

                Thomas

PS: It seems that NetApp already has the above in a product. They
can dedup blocks on WAFL, and they also have a feature
that allows keeping an offsite duplicate of the filesystem.
