linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@pobox.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Erik Mouw <erik@harddisk-recovery.com>,
	Josh Litherland <josh@temp123.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Transparent compression in the FS
Date: Thu, 16 Oct 2003 13:10:20 -0400	[thread overview]
Message-ID: <3F8ED0FC.9070706@pobox.com> (raw)
In-Reply-To: <20031016162926.GF1663@velociraptor.random>

Andrea Arcangeli wrote:
> Hi Jeff,
> 
> On Wed, Oct 15, 2003 at 11:13:27AM -0400, Jeff Garzik wrote:
> 
>>Josh and others should take a look at Plan9's venti file storage method 
>>-- archival storage is a series of unordered blocks, all of which are 
>>indexed by the sha1 hash of their contents.  This magically coalesces 
>>all duplicate blocks by its very nature, including the loooooong runs of 
>>zeroes that you'll find in many filesystems.  I bet savings on "all 
>>bytes in this block are zero" are worth a bunch right there.
> 
> 
> I had a few ideas on the above.
> 
> if the zero blocks are the problem, there's a tool called zum that nukes
> them and replaces them with holes. I use it sometime, example:
> 
> andrea@velociraptor:~> dd if=/dev/zero of=zero bs=1M count=100
> 100+0 records in
> 100+0 records out
> andrea@velociraptor:~> ls -ls zero
> 102504 -rw-r--r--    1 andrea   andrea   104857600 2003-10-16 18:24 zero
> andrea@velociraptor:~> ~/bin/i686/zum zero
> zero [820032K]  [1 link]
> andrea@velociraptor:~> ls -ls zero
>    0 -rw-r--r--    1 andrea   andrea   104857600 2003-10-16 18:24 zero
> andrea@velociraptor:~> 

Neat.


> the hash to the data is interesting, but 1) you lose the zerocopy
> behaviour for the I/O, it's like doing a checksum for all the data going to
> disk that you normally would never do (except for the tiny files in reiserfs
> with tail packing enabled, but that's not bulk I/O), 2) I wonder how much data
> is really duplicate besides the "zero" holes trivially fixable in userspace
> (modulo bzImage or similar where I'm unsure if the fs code in the bootloader
> can handle holes ;).

FWIW archival storage doesn't really care...  Since all data written to 
disk is hashed with SHA1 (sha1 hash == block's unique id), you gain (a) 
duplicate block coalescing and (b) _real_ data integrity guaranteed, but 
OTOH, you lose performance and possibly lose zero-copy.

I _really_ like the checksum aspect of Plan9's archival storage (venti).

As Andre H and Larry McVoy love to point out, data isn't _really_ secure 
until it's been checksummed, and that checksum data is verified on 
reads.  LM has an anecdote (doesn't he always?  <g>) about how BitKeeper 
-- which checksums its data inside the app -- has found data-corrupting 
kernel bugs, in days long past.

	Jeff



  parent reply	other threads:[~2003-10-16 17:10 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-10-14 20:30 Transparent compression in the FS Josh Litherland
2003-10-15 13:33 ` Erik Mouw
2003-10-15 13:45   ` Josh Litherland
2003-10-15 13:50   ` Nikita Danilov
2003-10-15 14:27     ` Erik Mouw
2003-10-15 14:33       ` Nikita Danilov
2003-10-15 15:54         ` Richard B. Johnson
2003-10-15 16:21           ` Nikita Danilov
2003-10-15 17:19             ` Richard B. Johnson
2003-10-15 17:37               ` Andreas Dilger
2003-10-15 17:48               ` Dave Jones
2003-10-15 18:19                 ` Richard B. Johnson
2003-10-15 18:06               ` Hans Reiser
2003-10-17 12:51                 ` Edward Shushkin
2003-10-15 16:04         ` Erik Mouw
2003-10-15 17:24           ` Josh Litherland
2003-10-15 18:53             ` Erik Bourget
2003-10-15 19:03           ` Geert Uytterhoeven
2003-10-15 19:14             ` Valdis.Kletnieks
2003-10-15 19:24               ` Geert Uytterhoeven
2003-10-15 18:54         ` root
2003-10-16  2:11           ` Chris Meadors
2003-10-16  3:01             ` Shawn
2003-10-15 14:47       ` Erik Bourget
2003-10-15 15:05         ` Nikita Danilov
2003-10-15 15:06           ` Erik Bourget
2003-10-15 21:36       ` Tomas Szepe
2003-10-16  8:04         ` Ville Herva
2003-10-17  1:32       ` Eric W. Biederman
2003-10-15 15:13   ` Jeff Garzik
2003-10-15 21:00     ` Christopher Li
2003-10-16 16:29     ` Andrea Arcangeli
2003-10-16 16:41       ` P
2003-10-16 17:20         ` Jeff Garzik
2003-10-16 23:12         ` jw schultz
2003-10-17  8:03           ` John Bradford
2003-10-17 14:53             ` Eli Carter
2003-10-17 15:27               ` John Bradford
2003-10-17 16:22                 ` Eli Carter
2003-10-17 17:15                   ` John Bradford
2003-10-16 17:10       ` Jeff Garzik [this message]
2003-10-16 17:41         ` Andrea Arcangeli
2003-10-16 17:29       ` Larry McVoy
2003-10-16 17:49         ` Val Henson
2003-10-16 21:02           ` Jeff Garzik
2003-10-16 21:18             ` Chris Meadors
2003-10-16 21:25               ` Jeff Garzik
2003-10-16 21:33             ` Davide Libenzi
2003-10-17  3:47             ` Mark Mielke
2003-10-17 14:31             ` Jörn Engel
2003-10-16 23:04           ` jw schultz
2003-10-16 23:30             ` Jeff Garzik
2003-10-16 23:58               ` jw schultz
2003-10-16 23:53                 ` David Lang
2003-10-17  1:19                 ` Jeff Garzik
2003-10-17  0:45             ` Christopher Li
2003-10-17  1:16               ` Jeff Garzik
2003-10-17  1:32             ` jlnance
2003-10-17  1:47               ` Eric Sandall
2003-10-17  8:11                 ` John Bradford
2003-10-17 17:53                   ` Eric Sandall
2003-10-17 13:07                 ` jlnance
2003-10-17 14:16                   ` Jeff Garzik
2003-10-17 15:06                     ` Valdis.Kletnieks
2003-10-17  1:49               ` Davide Libenzi
2003-10-17  1:59               ` Larry McVoy
2003-10-17  2:19               ` jw schultz
2003-10-17  9:44             ` Pavel Machek
2003-10-17 12:33               ` jlnance
2003-10-17 18:23               ` jw schultz
2003-10-27  2:08                 ` Mike Fedyk
2003-10-27  2:15                   ` jw schultz
2003-10-27  2:22             ` Mike Fedyk
2003-10-27  2:45               ` jw schultz
2003-10-16 18:28         ` John Bradford
2003-10-16 18:31           ` Robert Love
2003-10-16 20:18             ` Jeff Garzik
2003-10-16 18:43           ` Muli Ben-Yehuda
2003-10-16 18:56           ` Richard B. Johnson
2003-10-16 19:00             ` Robert Love
2003-10-16 19:27               ` John Bradford
2003-10-16 19:03             ` John Bradford
2003-10-16 19:20               ` Richard B. Johnson
2003-10-17 13:16         ` Ingo Oeser
2003-10-16 23:20       ` jw schultz
2003-10-17 14:47         ` Eli Carter
2003-10-16  8:27   ` tconnors+linuxkernel1066292516
2003-10-17 10:55   ` Ingo Oeser
2003-10-15 16:25 ` David Woodhouse
2003-10-15 16:56   ` Andreas Dilger
2003-10-15 17:44     ` David Woodhouse
     [not found] <GTJr.60q.17@gated-at.bofh.it>
     [not found] ` <GU2N.6v7.17@gated-at.bofh.it>
     [not found]   ` <GVBC.Ep.23@gated-at.bofh.it>
     [not found]     ` <Hjkq.3Al.1@gated-at.bofh.it>
     [not found]       ` <Hkgx.4Vu.7@gated-at.bofh.it>
     [not found]         ` <HkA0.5lh.9@gated-at.bofh.it>
     [not found]           ` <HnxT.3BB.27@gated-at.bofh.it>
2003-10-17  8:15             ` Ihar 'Philips' Filipau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3F8ED0FC.9070706@pobox.com \
    --to=jgarzik@pobox.com \
    --cc=andrea@suse.de \
    --cc=erik@harddisk-recovery.com \
    --cc=josh@temp123.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).