All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@pobox.com>
To: jw schultz <jw@pegasys.ws>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Transparent compression in the FS
Date: Thu, 16 Oct 2003 19:30:58 -0400	[thread overview]
Message-ID: <3F8F2A32.90101@pobox.com> (raw)
In-Reply-To: <20031016230448.GA29279@pegasys.ws>

jw schultz wrote:
> Because each hash algorithm has different pathologies
> multiple hashes are generally better than one but their
> effective aggregate bit count is less than the sum of the
> separate bit counts.

I was coming to this conclusion too...  still, it's safer simply to 
handle collisions.


> The idea of this sort of block level hashing to allow
> sharing of identical blocks seems attractive but i wouldn't
> trust any design that did not accept as given that there
> would be false positives.  This means that a write would
> have to not only hash the block but then if there is a
> collision do a compare of the raw data.  Then you have to

yep


> add the overhead of having lists of blocks that match a hash
> value and reference counts for each block itself.  Further,
> every block write would then need to include, at minimum,
> relinking facilities on the hash lists and hash entry
> allocations plus the inode block list being updated. Finally

Consider a simple key-value map, where "$hash $n" is the key and the 
value is the block of data.  Only three operations need occur:
* if hash exists (highly unlikely!), read and compare w/ raw data
* write new block to disk
* add single datum to key-value index

Inode block lists would need to be updated regardless of any collision; 
that would be a standard and required part of any VFS interaction. And 
the internal workings of the key-value index (think Berkeley DB) are 
static, regardless of any collision.

	Jeff




  reply	other threads:[~2003-10-16 23:31 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-10-14 20:30 Transparent compression in the FS Josh Litherland
2003-10-15 13:33 ` Erik Mouw
2003-10-15 13:45   ` Josh Litherland
2003-10-15 13:50   ` Nikita Danilov
2003-10-15 14:27     ` Erik Mouw
2003-10-15 14:33       ` Nikita Danilov
2003-10-15 15:54         ` Richard B. Johnson
2003-10-15 16:21           ` Nikita Danilov
2003-10-15 17:19             ` Richard B. Johnson
2003-10-15 17:37               ` Andreas Dilger
2003-10-15 17:48               ` Dave Jones
2003-10-15 18:19                 ` Richard B. Johnson
2003-10-15 18:06               ` Hans Reiser
2003-10-17 12:51                 ` Edward Shushkin
2003-10-15 16:04         ` Erik Mouw
2003-10-15 17:24           ` Josh Litherland
2003-10-15 18:53             ` Erik Bourget
2003-10-15 19:03           ` Geert Uytterhoeven
2003-10-15 19:14             ` Valdis.Kletnieks
2003-10-15 19:24               ` Geert Uytterhoeven
2003-10-15 18:54         ` root
2003-10-16  2:11           ` Chris Meadors
2003-10-16  3:01             ` Shawn
2003-10-15 14:47       ` Erik Bourget
2003-10-15 15:05         ` Nikita Danilov
2003-10-15 15:06           ` Erik Bourget
2003-10-15 21:36       ` Tomas Szepe
2003-10-16  8:04         ` Ville Herva
2003-10-17  1:32       ` Eric W. Biederman
2003-10-15 15:13   ` Jeff Garzik
2003-10-15 21:00     ` Christopher Li
2003-10-16 16:29     ` Andrea Arcangeli
2003-10-16 16:41       ` P
2003-10-16 17:20         ` Jeff Garzik
2003-10-16 23:12         ` jw schultz
2003-10-17  8:03           ` John Bradford
2003-10-17 14:53             ` Eli Carter
2003-10-17 15:27               ` John Bradford
2003-10-17 16:22                 ` Eli Carter
2003-10-17 17:15                   ` John Bradford
2003-10-16 17:10       ` Jeff Garzik
2003-10-16 17:41         ` Andrea Arcangeli
2003-10-16 17:29       ` Larry McVoy
2003-10-16 17:49         ` Val Henson
2003-10-16 21:02           ` Jeff Garzik
2003-10-16 21:18             ` Chris Meadors
2003-10-16 21:25               ` Jeff Garzik
2003-10-16 21:33             ` Davide Libenzi
2003-10-17  3:47             ` Mark Mielke
2003-10-17 14:31             ` Jörn Engel
2003-10-16 23:04           ` jw schultz
2003-10-16 23:30             ` Jeff Garzik [this message]
2003-10-16 23:58               ` jw schultz
2003-10-16 23:53                 ` David Lang
2003-10-17  1:19                 ` Jeff Garzik
2003-10-17  0:45             ` Christopher Li
2003-10-17  1:16               ` Jeff Garzik
2003-10-17  1:32             ` jlnance
2003-10-17  1:47               ` Eric Sandall
2003-10-17  8:11                 ` John Bradford
2003-10-17 17:53                   ` Eric Sandall
2003-10-17 13:07                 ` jlnance
2003-10-17 14:16                   ` Jeff Garzik
2003-10-17 15:06                     ` Valdis.Kletnieks
2003-10-17  1:49               ` Davide Libenzi
2003-10-17  1:59               ` Larry McVoy
2003-10-17  2:19               ` jw schultz
2003-10-17  9:44             ` Pavel Machek
2003-10-17 12:33               ` jlnance
2003-10-17 18:23               ` jw schultz
2003-10-27  2:08                 ` Mike Fedyk
2003-10-27  2:15                   ` jw schultz
2003-10-27  2:22             ` Mike Fedyk
2003-10-27  2:45               ` jw schultz
2003-10-16 18:28         ` John Bradford
2003-10-16 18:31           ` Robert Love
2003-10-16 20:18             ` Jeff Garzik
2003-10-16 18:43           ` Muli Ben-Yehuda
2003-10-16 18:56           ` Richard B. Johnson
2003-10-16 19:00             ` Robert Love
2003-10-16 19:27               ` John Bradford
2003-10-16 19:03             ` John Bradford
2003-10-16 19:20               ` Richard B. Johnson
2003-10-17 13:16         ` Ingo Oeser
2003-10-16 23:20       ` jw schultz
2003-10-17 14:47         ` Eli Carter
2003-10-16  8:27   ` tconnors+linuxkernel1066292516
2003-10-17 10:55   ` Ingo Oeser
2003-10-15 16:25 ` David Woodhouse
2003-10-15 16:56   ` Andreas Dilger
2003-10-15 17:44     ` David Woodhouse
     [not found] <GTJr.60q.17@gated-at.bofh.it>
     [not found] ` <GU2N.6v7.17@gated-at.bofh.it>
     [not found]   ` <GVBC.Ep.23@gated-at.bofh.it>
     [not found]     ` <Hjkq.3Al.1@gated-at.bofh.it>
     [not found]       ` <Hkgx.4Vu.7@gated-at.bofh.it>
     [not found]         ` <HkA0.5lh.9@gated-at.bofh.it>
     [not found]           ` <HnxT.3BB.27@gated-at.bofh.it>
2003-10-17  8:15             ` Ihar 'Philips' Filipau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3F8F2A32.90101@pobox.com \
    --to=jgarzik@pobox.com \
    --cc=jw@pegasys.ws \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.