From: Theodore Tso <tytso@mit.edu>
To: Thomas Glanzmann <thomas@glanzmann.de>,
tytso@thunk.org, LKML <linux-kernel@vger.kernel.org>,
linux-ext4@vger.kernel.org
Subject: Re: zero out blocks of freed user data for operation a virtual machine environment
Date: Mon, 25 May 2009 08:06:36 -0400
Message-ID: <20090525120636.GB25908@mit.edu>
In-Reply-To: <20090524170045.GC24753@cip.informatik.uni-erlangen.de>
On Sun, May 24, 2009 at 07:00:45PM +0200, Thomas Glanzmann wrote:
> Hello Ted,
> I would like to know if there is already a mount option or feature in
> ext3/ext4 that automatically overwrites freed blocks with zeros? If this
> is not the case I would like to know if you would consider a patch for
> upstream? I'm asking this because I currently do some research work on
> data deduplication in virtual machine environments and corresponding
> backups. It would be a huge space saver if there is such a feature
> because today's and tomorrow's backup tools for virtual machine
> environments work on the block layer (VMware Consolidated Backup, VMware
> Data Recovery, and NetApp Snapshots). This is not only true for backup
> tools but also for running virtual machines. The case that this feature
> addresses is the following: A huge file is downloaded and later deleted.
> The backup and data deduplication that is operating on the block level
> can't identify the block as unused. This results in backing up the
> amount of the data that was previously allocated by the file and as such
> introduces a performance overhead. If you're interested in real-life
> data, I'm able to provide it.
If you are planning to use this on production systems, forcing the
filesystem to zero out blocks to determine whether or not they are in
use is a terrible idea. The performance hit it would impose would
probably not be tolerated by most users.
It would be much better to design a system interface that gives a
userspace program the list of blocks that are in use within a given
block range. That way the userspace program could easily determine
whether a particular block is in use.
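As a rough illustration of what the userspace side of such an interface
might look like, here is a minimal sketch. It assumes the kernel (or a
helper library) hands back an allocation bitmap for a block range, with
one bit per block, least significant bit first within each byte. The
function name blocks_in_use and its parameters are hypothetical; no such
interface is defined in this thread.

```python
from typing import List


def blocks_in_use(bitmap: bytes, first_block: int, start: int, end: int) -> List[int]:
    """Return the block numbers in [start, end) that are marked in use.

    Assumptions (hypothetical, for illustration only):
      - bitmap holds one bit per block, starting at block first_block
      - bit i lives in byte i // 8 at bit position i % 8 (LSB first)
      - a set bit means the block is allocated
    """
    used = []
    for block in range(max(start, first_block), end):
        byte_index, bit_index = divmod(block - first_block, 8)
        if byte_index >= len(bitmap):
            break  # the bitmap does not cover the rest of the range
        if bitmap[byte_index] & (1 << bit_index):
            used.append(block)
    return used


# Example bitmap covering 16 blocks; blocks 0, 2 and 15 are allocated.
example_bitmap = bytes([0b00000101, 0b10000000])
in_use = blocks_in_use(example_bitmap, first_block=0, start=0, end=16)
```

A block-level backup or deduplication tool could then skip every block
in the range that is not in the returned list, without the filesystem
having to zero anything.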
- Ted