* Plugin for corruption resistance?
@ 2005-02-11 18:58 Gregory Maxwell
  2005-02-11 20:39 ` Jake Maciejewski
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Gregory Maxwell @ 2005-02-11 18:58 UTC (permalink / raw)
  To: reiserfs-list

Anyone ever given a thought to adding support to reiserfs to store a
cryptographic checksum along with a file?


The idea is that files get a hidden attribute that contains their SHA1 hash.
If the file is modified, the hash is marked as 'unclean'. A trusted
cleaner comes by eventually and hashes the file, OR the file is hashed
right away if someone tried to read the attribute while the file is
unclean.

Fsck could be optionally told to go check the hash on every file.
Files could also be tested via a background process that randomly
tests some files every night.

Why would this be useful?

1. Lots of applications today (such as P2P sharing systems) need the
hashes of files, and it's inefficient to keep recomputing them.  The file
system always knows when a file changes, so it can be set up to always
return the correct hash.

2. Random disk corruption can go undetected (even if the drive's ECC is
sufficient to prevent corruption, there could be memory, bus, or kernel
issues that corrupt data; a hash will help detect them).

3. Although there are encrypted block devices available in Linux, none
of them can provide authentication. So it's possible for an attacker
(with access to your disk) to replace hunks of files with random (and
potentially chosen, depending on the chaining mode) crud without
detection.

4. It could greatly speed up casual verification of files for changes
(if you don't trust the kernel to report the true hash, then you
couldn't trust it to return the real file to some userspace file
verifier either). It could also be used to help locate duplicates
in a very efficient manner.
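
A minimal sketch of the scheme described above, in Python and entirely
in memory. The class and method names (HashedFile, clean, get_hash,
verify) are hypothetical and are not reiserfs or reiser4 interfaces;
the sketch only illustrates the "unclean" flag and the lazy rehash on
attribute read.

```python
import hashlib


class HashedFile:
    """In-memory stand-in for a file with a hidden, lazily kept SHA-1 attribute."""

    def __init__(self, data: bytes = b""):
        self._data = bytearray(data)
        self._hash = None           # cached SHA-1 hex digest, if any
        self._clean = False         # False means the cached hash is stale
        self.hash_computations = 0  # counts full rehashes, for illustration

    def write(self, offset: int, chunk: bytes) -> None:
        # A modification only marks the hash unclean; it does not rehash.
        end = offset + len(chunk)
        if end > len(self._data):
            self._data.extend(b"\0" * (end - len(self._data)))
        self._data[offset:end] = chunk
        self._clean = False

    def clean(self) -> None:
        # What the background "trusted cleaner" would do for an unclean file.
        if not self._clean:
            self._hash = hashlib.sha1(self._data).hexdigest()
            self._clean = True
            self.hash_computations += 1

    def get_hash(self) -> str:
        # Reading the attribute forces a rehash only if the file is unclean.
        self.clean()
        return self._hash

    def verify(self) -> bool:
        # What fsck or a nightly checker would do: recompute and compare.
        # An unclean file has no trustworthy hash, so it cannot be verified.
        return self._clean and hashlib.sha1(self._data).hexdigest() == self._hash
```

Repeated reads of the attribute cost nothing until the next write, and
a verify() that returns False on a clean file indicates that the data
or the stored hash changed without going through write().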

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
@ 2005-02-11 20:39 ` Jake Maciejewski
  2005-02-11 20:53 ` Tom Vier
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Jake Maciejewski @ 2005-02-11 20:39 UTC (permalink / raw)
  To: reiserfs-list

I think this is a great idea. Solaris ZFS is supposed to have a similar
feature, but reiser4 metas would allow application-level access.

The purpose of checksumming in ZFS is more like Gregory's 2nd point,
except Solaris takes it one step further. If you have RAID, ZFS will fix
the corruption automatically. Even if we didn't have automatic
correction, which would probably be impossible without an integrated volume
manager like ZFS, it would still be nice to know if your hard drive is
flipping bits behind your back.

On Fri, 2005-02-11 at 13:58 -0500, Gregory Maxwell wrote:
> Anyone ever given a though to adding support to reiserfs to store a
> cryptographic checksum along with a file?
> 
> 
> The idea is that files get a hidden attribute that contains their SHA1 hash.
> If the file is modified, the hash is marked as 'unclean'. A trusted
> cleaner comes by eventually and hashes the file, OR the file is hashed
> right away if someone tried to read the attribute while the file is
> unclean.
> 
> Fsck could be optionally told to go check the hash on every file.
> Files could also be tested via a background process that randomly
> tests some files every night.
> 
> Why would this be useful?
> 
> 1. Lots of applications today (such a P2P sharing systems) need the
> hashes of files.. it's inefficient to keep recomputing them.  The file
> system always knows when a file changes, so it can be setup to always
> return the correct hash.
> 
> 2. Random disk corruption can go undetected (even if the drives ECC is
> sufficient to prevent corruption there could be memory, bus, or kernel
> issues the corrupt data, a hash will help it be detected).
> 
> 3. Although there are encrypted block devices available in Linux, none
> of them can provide authentication.. So it's possible for an attacker
> (with access to your disk) to replace hunks of files with random (and
> potentially chosen depending on the chaining mode) crud without
> detection.
> 
> 4. It could greatly speed up casual verification of files for changes
> (if you don't trust the kernel to report the true hash, then you
> couldn't trust it to return the real file to some userspace file
> verifier either).... it could also be used to help locate duplicates
> in a very efficient manner..
-- 
Jake Maciejewski <maciejej@msoe.edu>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
  2005-02-11 20:39 ` Jake Maciejewski
@ 2005-02-11 20:53 ` Tom Vier
  2005-02-12  5:19   ` David Masover
  2005-02-13  3:48 ` Esben Stien
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Tom Vier @ 2005-02-11 20:53 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: reiserfs-list

On Fri, Feb 11, 2005 at 01:58:59PM -0500, Gregory Maxwell wrote:
> 1. Lots of applications today (such a P2P sharing systems) need the
> hashes of files.. it's inefficient to keep recomputing them.  The file
> system always knows when a file changes, so it can be setup to always
> return the correct hash.

That should be done in userland, imho. Especially since different apps use
lots of different hashes.

I was thinking about this kind of stuff (ECC plugin for r4) not too long
ago. Hashing the whole file is too slow; if you update a single block, the
whole file has to be read in to recalculate. Adding, say, one sector of crc
for each block would be a lot more feasible.
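
A rough sketch of the per-block idea, assuming a 4096-byte block and one
32-bit CRC per block (zlib.crc32) rather than a full sector of CRC. The
function names are hypothetical. The point is that rewriting one block
only requires recomputing that block's checksum, not rereading the file.

```python
import zlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_crcs(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Compute one CRC-32 per block of the data."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def update_block(data: bytearray, crcs: List[int], index: int,
                 new_block: bytes, block_size: int = BLOCK_SIZE) -> None:
    """Overwrite one block and refresh only that block's CRC."""
    assert len(new_block) <= block_size
    start = index * block_size
    data[start:start + len(new_block)] = new_block
    crcs[index] = zlib.crc32(data[start:start + block_size])


def find_bad_blocks(data: bytes, crcs: List[int],
                    block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indices of blocks whose stored CRC no longer matches."""
    current = block_crcs(data, block_size)
    return [i for i, (now, stored) in enumerate(zip(current, crcs))
            if now != stored]
```

A mismatch also localizes the damage to a single block, which a
whole-file hash cannot do.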

I think the best way to do this, though, would be to write a virtual block
driver that works like loopback (i.e., uses a backing file/dev), and shortens
the overall size by one sector * number of blocks. Actually, you could
probably copy the raid5 md code and rewrite it to only use one device. I'd
try that first.

-- 
Tom Vier <tmv@comcast.net>
DSA Key ID 0x15741ECE

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-11 20:53 ` Tom Vier
@ 2005-02-12  5:19   ` David Masover
  0 siblings, 0 replies; 18+ messages in thread
From: David Masover @ 2005-02-12  5:19 UTC (permalink / raw)
  To: Tom Vier; +Cc: Gregory Maxwell, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

First, let me second the original idea.  So long as the hash isn't
updated until that attribute is read, it should be fine.

Tom Vier wrote:
| On Fri, Feb 11, 2005 at 01:58:59PM -0500, Gregory Maxwell wrote:
|
|>1. Lots of applications today (such a P2P sharing systems) need the
|>hashes of files.. it's inefficient to keep recomputing them.  The file
|>system always knows when a file changes, so it can be setup to always
|>return the correct hash.
|
|
| That should be done in userland, imho. Especially since different apps use
| lots of different hashes.

For no particularly good reason, imho.  md4/md5 are reasonably fast,
sha1 is reasonably secure.  What else do you need?

| I was thinking about this kind of stuff (ECC plugin for r4) not too long
| ago. Hashing the whole file is too slow; if you update a single block, the
| whole file has to be read in to recalculate. Adding, say, one sector of crc
| for each block would be a lot more feasible.

If it was actually to work like ECC.  But this doesn't sound like it
would be checked at every read, but rather when fsck is run, or when
some app needs it.  If it is supposed to be checked at every read, your
below suggestion is better.

| I think the best way to do this though, would be to write a virtual blk
| driver that works like loop back (ie, uses a backing file/dev), and shortens
| the overall size by one sector * number of blocks. Actually, you could
| probably copy the raid5 md code and rewrite it to only use one device. I'd
| try that first.

Been done.  Think it's part of dev-mapper or evms or some disk magic
like that.  But it's also useful for individual files, for application
reasons.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQg2RxXgHNmZLgCUhAQKxyhAAkun806f/kI3RmnKX8gV0KZ+ubdMoyNWK
vUF4Ln79jMLAxxe2fxrkBZux7qQhNUpaO69+jIAfYFqPGj/L1RS03lAqhz7bZCDp
2GOiBdoOhB7fBJuv1XKbHBrDJdROE8QTJWLuFMyAvxUC7u+uZZ2yU8EVHlKWTLoH
fA40Vr5t7p77ll/zALG1qpEd9GhSDXAbQ0cbqMvy9cYzo+Wreo9xifH4bT9u8SGk
NgqGTf3iMKhetfFWqxmgg9F34SMVF9IuyRud2mHvqY7NQW1B3k7MFjOax7fgFTRF
xxwUzt2lE77tmEUM87r16sCkK+YSJTNNaTancV4yYhzQ+Oz43NwkUW0nwy0jOOVz
C3sydKjsYoOMiBAVind+arILSrmLwMXpgZ7/6/NV5A7XiUZWy2TeZGYLXjEZbNOV
V5Tg1KsMnJxPS2n/y7FG9HQXx/iFapWG8RWkz3O9Pzg/Zywsi4LbcgI+72iHImLj
5+b5YXxQsv9F415pXEaCSNGmMg7FZ/wURXPXwEruPJrs1aJ1SipoZzUCUXN9OpvJ
efW+IQmbx1tUhQvBfiYmeGj/vscPfkXbwnXlwZOpU7tkkVw8F+t/OJT4jL6Z5wOj
el8FDYz3swRR1W+nUTJK+NOBkkR3RPjtdOqUahXPF7jqK3Wc1EsZ4MGhlBaLwp+l
UqUzpsVgmdc=
=jkeg
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
  2005-02-11 20:39 ` Jake Maciejewski
  2005-02-11 20:53 ` Tom Vier
@ 2005-02-13  3:48 ` Esben Stien
  2005-02-14  2:01 ` Reiser 4 Apple Michael James
  2005-02-14 17:45 ` Plugin for corruption resistance? Hans Reiser
  4 siblings, 0 replies; 18+ messages in thread
From: Esben Stien @ 2005-02-13  3:48 UTC (permalink / raw)
  To: reiserfs-list

Gregory Maxwell <gmaxwell@gmail.com> writes:

> Anyone ever given a though to adding support to reiserfs to store a
> cryptographic checksum along with a file?

Yes, I thought about putting it there as an extended attribute

-- 
Esben Stien is b0ef@esben-stien.name
http://www.esben-stien.name
irc://irc.esben-stien.name/%23contact
[sip|iax]:b0ef@esben-stien.name

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Reiser 4 Apple
  2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
                   ` (2 preceding siblings ...)
  2005-02-13  3:48 ` Esben Stien
@ 2005-02-14  2:01 ` Michael James
  2005-02-14 18:49   ` Hans Reiser
  2005-02-14 17:45 ` Plugin for corruption resistance? Hans Reiser
  4 siblings, 1 reply; 18+ messages in thread
From: Michael James @ 2005-02-14  2:01 UTC (permalink / raw)
  To: reiserfs-list

Dear Hans,

Have you ever thought of porting reiser4 to BSD?

Apple have:
	Bags of money
	A current filesystem that totally sucks
	An OS that cries out for plugins to satisfy its quirks

Could be a good marriage,
michaelj

-- 
Michael James                         michael.james@csiro.au
System Administrator                    voice:  02 6246 5040
CSIRO Bioinformatics Facility             fax:  02 6246 5166

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Reiser 4 Apple
@ 2005-02-14 16:32 Burnes, James
  2005-02-14 22:07 ` Simon "Sturmflut" Raffeiner
  0 siblings, 1 reply; 18+ messages in thread
From: Burnes, James @ 2005-02-14 16:32 UTC (permalink / raw)
  To: Michael James, reiserfs-list

Sounds like a great idea, but didn't they hire the filesystem architect
from BeOS a few years ago to re-design the native MacOS X filesystem?

I believe the idea was to clone the database-centric BeFS functions.  I
believe this functionality is on the plate sometime in the future with
Reiser4 and also (some decade) with the next version of WinFS.

Jim Burnes


> -----Original Message-----
> From: Michael James [mailto:Michael.James@csiro.au]
> Sent: Sunday, February 13, 2005 7:01 PM
> To: reiserfs-list@namesys.com
> Subject: Reiser 4 Apple
> 
> Dear Hans,
> 
> Have you ever thought of porting reiser4 to BSD?
> 
> Apple have:
> 	Bags of money
> 	A current filesystem that totally sucks
> 	An OS that cries out for plugins to satisfy its quirks
> 
> Could be a good marriage,
> michaelj
> 
> --
> Michael James                         michael.james@csiro.au
> System Administrator                    voice:  02 6246 5040
> CSIRO Bioinformatics Facility             fax:  02 6246 5166

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
                   ` (3 preceding siblings ...)
  2005-02-14  2:01 ` Reiser 4 Apple Michael James
@ 2005-02-14 17:45 ` Hans Reiser
  2005-02-15 20:42   ` Adam
  4 siblings, 1 reply; 18+ messages in thread
From: Hans Reiser @ 2005-02-14 17:45 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: reiserfs-list

It's on the legitimate wish list; if someone wants to code it, let me know.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Reiser 4 Apple
  2005-02-14  2:01 ` Reiser 4 Apple Michael James
@ 2005-02-14 18:49   ` Hans Reiser
  0 siblings, 0 replies; 18+ messages in thread
From: Hans Reiser @ 2005-02-14 18:49 UTC (permalink / raw)
  To: Michael James; +Cc: reiserfs-list

Michael James wrote:

>Dear Hans,
>
>Have you ever thought of porting reiser4 to BSD?
>
>Apple have:
>	Bags of money
>	A current filesystem that totally sucks
>	An OS that cries out for plugins to satisfy its quirks
>
>Could be a good marriage,
>michaelj
>
>  
>
Please convince them to pay for it and I will....

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Reiser 4 Apple
  2005-02-14 16:32 Reiser 4 Apple Burnes, James
@ 2005-02-14 22:07 ` Simon "Sturmflut" Raffeiner
  0 siblings, 0 replies; 18+ messages in thread
From: Simon "Sturmflut" Raffeiner @ 2005-02-14 22:07 UTC (permalink / raw)
  To: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 1974 bytes --]

AFAIK WinFS will not enhance NTFS with some real Database-centric stuff but 
simply save an index into a standard Microsoft SQL database somewhere in a 
hidden place. At least the Windows Longhorn Preview Releases had a MS-SQL 
Server running in the background. Nobody knows what Apple exactly wants to do 
with Spotlight, it merely sounds like a background indexing engine like 
WinFS. The BeOS approach was completely different in my eyes, and it was damn 
fast at the same time.

Regards,
Simon


On Monday 14 February 2005 17:32, Burnes, James wrote:
> Sounds like a great idea, but didn't they hire the filesystem architect
> from BeOS a few years ago to re-design the native MacOS X filesystem?
>
> I believe the idea was to clone the database-centric BeFS functions.  I
> believe this functionality is on the plate sometime in the future with
> Reiser4 and also (some decade) with the next version of WinFS.
>
> Jim Burnes
>
> > -----Original Message-----
> > From: Michael James [mailto:Michael.James@csiro.au]
> > Sent: Sunday, February 13, 2005 7:01 PM
> > To: reiserfs-list@namesys.com
> > Subject: Reiser 4 Apple
> >
> > Dear Hans,
> >
> > Have you ever thought of porting reiser4 to BSD?
> >
> > Apple have:
> > 	Bags of money
> > 	A current filesystem that totally sucks
> > 	An OS that cries out for plugins to satisfy its quirks
> >
> > Could be a good marriage,
> > michaelj
> >
> > --
> > Michael James                         michael.james@csiro.au
> > System Administrator                    voice:  02 6246 5040
> > CSIRO Bioinformatics Facility             fax:  02 6246 5166

-- 
With kind regards

Simon "Sturmflut" Raffeiner

-----------------------------------
World history is the sum of what could have been avoided.
                                                   -- Oscar Wilde

GPG Public Key: 0xB2204FA0 @ hkp://subkeys.pgp.net
http://www.lieberbiber.de/pubkey.gpg

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-14 17:45 ` Plugin for corruption resistance? Hans Reiser
@ 2005-02-15 20:42   ` Adam
  2005-02-17  4:10     ` David Masover
  0 siblings, 1 reply; 18+ messages in thread
From: Adam @ 2005-02-15 20:42 UTC (permalink / raw)
  To: reiserfs-list

Hans Reiser <reiser <at> namesys.com> writes:

> 
> Its on the legitimate wish list, if someone wants to code it, let me know.
> 
>

Hans, does this mean that you think that this type of functionality should be
implemented as a Reiser4 plugin and therefore in kernelspace?  Why wouldn't this
be better implemented in userspace via a daemon that is notified of file
modification via dnotify/inotify?


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-15 20:42   ` Adam
@ 2005-02-17  4:10     ` David Masover
  2005-02-17 10:53       ` Christian Iversen
  0 siblings, 1 reply; 18+ messages in thread
From: David Masover @ 2005-02-17  4:10 UTC (permalink / raw)
  To: Adam; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Adam wrote:
| Hans Reiser <reiser <at> namesys.com> writes:
|
|
|>Its on the legitimate wish list, if someone wants to code it, let me know.
|>
|>
|
|
| Hans, does this mean that you think that this type of functionality should be
| implemented as a Reiser4 plugin and therefore in kernelspace?  Why wouldn't this
| be better implemented in userspace via a daemon that is notified of file
| modification via dnotify/inotify?

Because dnotify/inotify don't scale.  I don't think they lock on event,
either, whereas a plugin could guarantee that the hash was up-to-date
(no race conditions).

We've been over this before.  There's a reason reiser4 and its plugins
are in kernel space, and not in something like Fuse.


===============================
Disclaimer:  I don't work here.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQhQZKHgHNmZLgCUhAQIIDA/+JtV8NjA9GJwbdr7m7yiSunjnEaecP3AP
nK4qfOwMgwZAMyoeeWfq8b66I0AZwSY9u/pAAGoZcqsdxPzaA/1BNoxIVckT1adJ
uszfFiVNo9NHJNdZdn28C7IQbdM8utYLoQ8QiJr4mjmfPsQevxmpqwNqLcuShHwb
+sy0Ckdkq1IjQzntZC60ZIo3J/g5agn1KuRJ1u7mhrG+jA8kEpeTka3j1I8fDbUK
2ODnJE3nV2QIJ32U271OGPBwgC5Zvca0cui4WsYsad0aD3/8KPZibp9rA/RZc8Ud
xD2XtILL8V4skr7Q0G81UzHoj3ISFj9HQgiwaQt7YPie8YeC68AIwOk8ISWSlTIQ
ifyY/1d/JLTpD2qPxemh6yc6Dje71apYeic0YKoOBfd2Ck1LBgwJWwaBPoYQUsYN
f+f41iuaYJRnXYqfI7A5sniXt8pwI/2RQQc31pGyMA6UZXVIgfJnzDZ+uZphGpFf
kiJyRGS7RhgX0DNFVJ9V3jlYHoqIAe4zsDjwdTk3zLO4dnDFVX0M8LeiuXd8bcnM
UA6cODryfvR3ZA3t4GKTm8ir7RYr5mQIhSwN6s0KTJjVmRMBOPWAnCsVgv9t7rjF
cVF1fS9V1VUZFEoatlI9W1Ju1qautIe390z4lPCiBqF0SUaFfkFUa+QPir2TYbTD
9usfiEWzPpM=
=6prc
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-17  4:10     ` David Masover
@ 2005-02-17 10:53       ` Christian Iversen
  2005-02-18  3:43         ` David Masover
  0 siblings, 1 reply; 18+ messages in thread
From: Christian Iversen @ 2005-02-17 10:53 UTC (permalink / raw)
  To: reiserfs-list

On Thursday 17 February 2005 05:10, David Masover wrote:
> Adam wrote:
> | Hans Reiser <reiser <at> namesys.com> writes:
> |>Its on the legitimate wish list, if someone wants to code it, let me
> |> know.
> |
> | Hans, does this mean that you think that this type of functionality should be
> | implemented as a Reiser4 plugin and therefore in kernelspace?  Why wouldn't this
> | be better implemented in userspace via a daemon that is notified of file
> | modification via dnotify/inotify?
>
> Because dnotify/inotify don't scale.  I don't think they lock on event,
> either, whereas a plugin could guarentee that the hash was up-to-date
> (no race conditions).
>
> We've been over this before.  There's a reason reiser4 and its plugins
> are in kernel space, and not in something like Fuse.

And, surely, updating a hash value when 1 byte changes in a gigabyte file, 
would be much faster from kernel space where you can actually see the 
changeset?

-- 
Regards,
Christian Iversen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-17 10:53       ` Christian Iversen
@ 2005-02-18  3:43         ` David Masover
  2005-02-18  4:28           ` Valdis.Kletnieks
  0 siblings, 1 reply; 18+ messages in thread
From: David Masover @ 2005-02-18  3:43 UTC (permalink / raw)
  To: Christian Iversen; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Christian Iversen wrote:
| On Thursday 17 February 2005 05:10, David Masover wrote:
|
|>Adam wrote:
|>| Hans Reiser <reiser <at> namesys.com> writes:
|>|>Its on the legitimate wish list, if someone wants to code it, let me
|>|> know.
|>|
|>| Hans, does this mean that you think that this type of functionality should be
|>| implemented as a Reiser4 plugin and therefore in kernelspace?  Why wouldn't this
|>| be better implemented in userspace via a daemon that is notified of file
|>| modification via dnotify/inotify?
|>
|>Because dnotify/inotify don't scale.  I don't think they lock on event,
|>either, whereas a plugin could guarentee that the hash was up-to-date
|>(no race conditions).
|>
|>We've been over this before.  There's a reason reiser4 and its plugins
|>are in kernel space, and not in something like Fuse.
|
|
| And, surely, updating a hash value when 1 byte changes in a gigabyte file,
| would be much faster from kernel space where you can actually see the
| changeset?

Wouldn't it be sane to just export the changeset to userland?

This way is easier, though.  But I was thinking about accessing the
file.  I don't know of any hashes that can be easily updated from part
of the file, unless you're hashing only pieces of the file in the first
place, but it'd be nice to not bother hashing at all until the hash is
needed, especially if we are hashing the whole file.

Plus, there's the race condition thing.  Definitely a lot of reasons to
put more stuff in the kernel.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQhVkS3gHNmZLgCUhAQIMNA/+IgaEx9p3bceATrrYDEweAB+N4K98dFyM
BgDAtxFS2vzaw6lsF9vtEiHEuhvp5raCAhxcoEO1KgRh21Yc6cR+Yu5FaRc7BV4n
WleJtNk521XFwUsmQXs5nYYHzfNlfJbQax9RBqX4IllbXbHX6YUHDAof/Zy8M4MJ
Wytp10igur9QVcqXSeEYoRbYHXS3MRyT3cIl6Y1VXAdZRYu/ItlLf0ItRPkRyfB7
1yVK4kOaR4c6U95gaHL0S08tLddtiep+9XIAJ+JXdhnP8yfEH43ItoM/KxrGSc/K
PWcgIDicYek6kWWNb8H5dTYIknaW8fYwStuoBdfaLt/9aGO00O4sNmg97skW8H2q
+87d22MTiCFtbHyYnD5cV6EzKe+IdUqcTaISOMbctltQmBsPcQjeAlU+BmaLbzEF
sN91egbv/iuirroO1/OzCCQrihE6u8/9tK6LO2Y+LGO2N+ZpB10ZHaGa8uHqMFy1
w19r9XcTIJHo+mjuWhM4hnRrna8cwsCepf++tyKE26ZD3iPPaUzB8h+U81R+59y0
TXNLKLCfJfuOBs4IK2MgmcSkvxEpzBos3vJJdlvA3s4aMz6ZAP8vp5Wgj/BvMrxj
QhFdy/IPCdDLHc5EO3rOZo3b8e/GZcfxgOJ54nuMMzk0xFrlIbtBtYCBU41o9jiq
yP/8bcQ4cqY=
=UEM3
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-18  3:43         ` David Masover
@ 2005-02-18  4:28           ` Valdis.Kletnieks
  2005-02-18 13:36             ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Valdis.Kletnieks @ 2005-02-18  4:28 UTC (permalink / raw)
  To: David Masover; +Cc: Christian Iversen, reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 1540 bytes --]

On Thu, 17 Feb 2005 21:43:08 CST, David Masover said:

> This way is easier, though.  But I was thinking about accessing the
> file.  I don't know of any hashes that can be easily updated from part
> of the file, unless you're hashing only pieces of the file in the first
> place, but it'd be nice to not bother hashing at all until the hash is
> needed, especially if we are hashing the whole file.

There are plenty of CRC functions that are quite easily set up for an
incremental update (see RFCs 1141 and 1624 on how to do it for the 16-bit
ones' complement checksum used for Internet IP packets).  You'd of course
not want to use that 16-bit checksum, but the same basic principle applies
to CRC functions.
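
To make the incremental update concrete, here is a small sketch of the
RFC 1624 method for the 16-bit ones' complement Internet checksum. The
function names and the sample words used to exercise it are
illustrative assumptions; the update rule itself is equation 3 of RFC
1624, HC' = ~(~HC + ~m + m'), where m is the old 16-bit word and m' the
new one.

```python
def ones_complement_sum(words):
    """Add 16-bit words with end-around carry (ones' complement addition)."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(words):
    """Full checksum: complement of the ones' complement sum of all words."""
    return ~ones_complement_sum(words) & 0xFFFF


def incremental_update(old_checksum, old_word, new_word):
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m'), without rereading other words."""
    return ~ones_complement_sum([
        ~old_checksum & 0xFFFF,
        ~old_word & 0xFFFF,
        new_word,
    ]) & 0xFFFF
```

Changing one word costs three additions regardless of how much data the
checksum covers, which is exactly the property a cryptographic hash is
designed not to have.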

The problem is that most CRC functions aren't much good at detecting
multi-bit errors, and when you're talking about hundreds of gigabytes of
disk on a modern RAID, the CRC functions are hardly bulletproof.

On the flip side, hash functions like MD5 or the SHA family are fairly bulletproof,
but are essentially impossible to develop an incremental update for (if there
existed a fast incremental update for the hash function, that would imply a
very low preimage resistance, rendering it useless as a cryptographic hash).

Also, there's another issue - unlike standard ECC codes that can actually *fix*
the problem (for at least small number of bit errors), it's unclear what you should
do if you find a mismatch between the hash of a block and the block contents, as
you don't know whether it's the actual data or the hash that's corrupted....


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-18  4:28           ` Valdis.Kletnieks
@ 2005-02-18 13:36             ` Gregory Maxwell
  2005-02-18 22:09               ` Valdis.Kletnieks
  0 siblings, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2005-02-18 13:36 UTC (permalink / raw)
  To: Valdis.Kletnieks@vt.edu; +Cc: reiserfs-list

On Thu, 17 Feb 2005 23:28:09 -0500, Valdis.Kletnieks@vt.edu
<Valdis.Kletnieks@vt.edu> wrote:
> On the flip side, hash functions like MD5 or the SHA family are fairly bulletproof,
> but are essentially impossible to develop an incremental update for (if there
> existed a fast incremental update for the hash function, that would imply a
> very low preimage resistance, rendering it useless as a cryptographic hash).

Tree hashes.
Divide the file into blocks of N bytes. Compute size/N block hashes.
Group those hashes into pairs and compute half as many N' hashes (one
per pair); this is fast because hashes are small. Group the N' hashes
into pairs, compute half as many N'' hashes, and so on, until a single
hash remains.

A number of useful tradeoffs are possible: By enlarging N you improve
the strength along various cryptographic dimensions.  By changing the
fanout, and deciding how many N you store, which N you store, which
N' you store, etc., you decide how easy it is to update the hash and you
decide what the smallest increment you can test is... you trade off
storage (and a little computation) for this ease.
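
A minimal sketch of such a tree hash, assuming SHA-1, a fanout of two,
and that every level of the tree is kept in memory. The TreeHash class
and its methods are hypothetical names. An odd hash left over at the
end of a level is simply rehashed on its own, and there is no
leaf/interior domain separation, so this shows the update cost rather
than a hardened construction.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


class TreeHash:
    """Binary hash tree over fixed-size blocks, with all levels stored."""

    def __init__(self, data: bytes, block_size: int = 4096):
        self.block_size = block_size
        leaves = [_h(data[i:i + block_size])
                  for i in range(0, len(data), block_size)] or [_h(b"")]
        self.levels = [leaves]  # levels[0] = block hashes, levels[-1] = [root]
        while len(self.levels[-1]) > 1:
            below = self.levels[-1]
            self.levels.append([self._parent(below, i)
                                for i in range((len(below) + 1) // 2)])

    @staticmethod
    def _parent(level, i):
        # Hash of the pair (2i, 2i+1); a lone trailing hash is hashed alone.
        return _h(b"".join(level[2 * i:2 * i + 2]))

    @property
    def root(self) -> bytes:
        return self.levels[-1][0]

    def update_block(self, index: int, new_block: bytes) -> None:
        # Rehash one block, then one small hash per level up to the root.
        self.levels[0][index] = _h(new_block)
        for depth in range(1, len(self.levels)):
            index //= 2
            self.levels[depth][index] = self._parent(self.levels[depth - 1], index)
```

Updating one block touches one data block plus one pair of stored
hashes per level, at the cost of storing roughly twice as many hashes
as there are blocks.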

> Also, there's another issue - unlike standard ECC codes that can actually *fix*
> the problem (for at least small number of bit errors), it's unclear what you should
> do if you find a mismatch between the hash of a block and the block contents, as
> you don't know whether it's the actual data or the hash that's corrupted....

In my initial suggestion I offered that hashes could be verified by a
userspace daemon, or by fsck (since it's an expensive operation)...
Such policy could be controlled in the daemon.
In most cases I'd like it to make the file inaccessible until I go and
fix it by hand.

It would also be useful to have the checker daemon watch the logs (or
receive notifications through some kernel interface)... and any block
level errors (or smartd errors) backprojected up (through raid and lvm
remappings) to the file system level ... After identifying the
potentially corrupted file, it could then test the file.  If the file
has been corrupted, the configured action is taken.

If this policy is in userspace, the level of action sophistication could
be very high: for example, if I was on a distribution with package
management, and the file was outside of /home, and the package flags
didn't indicate it was a config file.. then go fetch the package, and
replace the file and send me an email so I don't forget how wonderful
my OS is. :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-18 13:36             ` Gregory Maxwell
@ 2005-02-18 22:09               ` Valdis.Kletnieks
  2005-02-19  3:28                 ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Valdis.Kletnieks @ 2005-02-18 22:09 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 1445 bytes --]

On Fri, 18 Feb 2005 08:36:51 EST, Gregory Maxwell said:

> Tree hashes.
> Divide the file into blocks of N bytes. Compute size/N hashes. 
> Group hashes into pairs. Compute N/2 N' hashes, this is fast because
> hashes are small. Group N' hashes into pairs compute N'/2 N'' hashes
> etc.. Reduce to a single hash.

You get massively I/O bound real fast this way.  You may want to re-evaluate
whether this *really* buys you anything, especially if you're not using some
sort of guarantee that you know what's actually b0rked...

> In my initial suggestion I offered that hashes could be verified by a
> userspace daemon, or by fsck (since it's an expensive operation)...
> Such policy could be controlled in the daemon.
> In most cases I'd like it to make the file inaccessible until I go and
> fix it by hand.

You're still missing the point that in general, you don't have a way to tell whether
the block the file lived in went bad, or the block the hash lived in went bad.

Sure, if the file *happens* to be ascii text, you can use Wetware 1.5 to scan
the file and tell which one went bad.  However, you'll need Wetware 2.0 to
do the same for your multi-gigabyte Oracle database... :)

(And yes, I *have* seen cases where Tripwire went completely and totally bananas
and claimed zillions of files were corrupted, when the *real* problem was that
the Tripwire database itself had gotten stomped on - so it's *not* a purely
theoretical issue....)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Plugin for corruption resistance?
  2005-02-18 22:09               ` Valdis.Kletnieks
@ 2005-02-19  3:28                 ` Gregory Maxwell
  0 siblings, 0 replies; 18+ messages in thread
From: Gregory Maxwell @ 2005-02-19  3:28 UTC (permalink / raw)
  To: Valdis.Kletnieks@vt.edu; +Cc: reiserfs-list

On Fri, 18 Feb 2005 17:09:00 -0500, Valdis.Kletnieks@vt.edu
<Valdis.Kletnieks@vt.edu> wrote:
> On Fri, 18 Feb 2005 08:36:51 EST, Gregory Maxwell said:
> 
> > Tree hashes.
> > Divide the file into blocks of N bytes. Compute size/N hashes.
> > Group hashes into pairs. Compute N/2 N' hashes, this is fast because
> > hashes are small. Group N' hashes into pairs compute N'/2 N'' hashes
> > etc.. Reduce to a single hash.
> 
> You get massively I/O bound real fast this way.  You may want to re-evaluate
> whether this *really* buys you anything, especially if you're not using some
> sort of guarantee that you know what's actually b0rked...

I brought up tree hashes because someone pointed out there was no way
to incrementally update a normal hash. Tree hashes can easily be
incrementally updated if you keep all the sub parts.

I don't think that would suddenly make it useful for frequently updated files.
 
> > In my initial suggestion I offered that hashes could be verified by a
> > userspace daemon, or by fsck (since it's an expensive operation)...
> > Such policy could be controlled in the daemon.
> > In most cases I'd like it to make the file inaccessible until I go and
> > fix it by hand.
> 
> You're still missing the point that in general, you don't have a way to tell whether
> the block the file lived in went bad, or the block the hash lived in went bad.

I'm not missing the point.  Compare the number of disk blocks a file
takes vs the hash. Compare the ease of atomically updating the file
data vs atomically updating the hash.
If they don't match, it is far more likely that the file has been
silently corrupted than that the hash has been. In either case, something
seriously wrong has happened (i.e. that *any* data has been corrupted
without triggering alarms elsewhere).

Wetware will be required to figure out what is going on.
Perhaps correct a serious problem before it eats the whole file system...

Automagic correction of stuff that is automagically correctable is
useful in that it might prevent something worse from happening... For
example, if the corrupted file was /sbin/init.. regardless of the
cause of the problem I'd be glad if the system took some action while
the wetware was in an uninterruptible sleep. ;)
 
> Sure, if the file *happens* to be ascii text, you can use Wetware 1.5 to scan
> the file and tell which one went bad.  However, you'll need Wetware 2.0 to
> do the same for your multi-gigabyte Oracle database... :)

Such a proposed system would likely not be all that useful on a live
database; the overhead of computing hashes would likely be too
great.  Rather, it would be useful if the database system used its
knowledge of how data was stored to do this efficiently.

If the database system were written with reiserfs in mind and, rather
than using a couple of big opaque files, stored its data in tens of
thousands of files... then perhaps such a hashing scheme might
actually work out okay.

> (And yes, I *have* seen cases where Tripwire went completely and totally bananas
> and claimed zillions of files were corrupted, when the *real* problem was that
> the Tripwire database itself had gotten stomped on - so it's *not* a purely
> theoretical issue....

The proposal is to store the hash in the file metadata. If that
is getting stomped on, it's a *good* thing if the system goes totally
bananas. In a great many situations I'd rather lose a file completely
than have some random bytes in it silently corrupted. (and of course,
attaching hashes doesn't mean you lose the file... it means it gets
brought to your attention)

As things stand today, there are hundreds of ways a system could end
up with files getting silently corrupted.  Many of them would be
fairly difficult to detect until it's far too late (to recover cleanly
or even detect the root cause).  Right now most distros have a package
management system that can detect changes in some system files, which
is useful against a small subset of these problems, but not most since
it will only detect problems in files that almost never change.

The proposed system of attaching hashes in metadata would protect all
files that are not constantly updated (so that rules out databases and
single-file mailboxes), but could protect most everything else.
And the things that can't be protected as-is could be, given changes
to their operation that would be worth making for reiserfs for other
reasons. (There is no performance reason in reiserfs to make a mail
box a single file, for example.)

Furthermore, attached hashes could greatly speed up applications using
hashes in a way that no userspace solution can:  Userspace solutions
can't maintain a cache of the files' hashes because they have no way to
be *sure* that the file wasn't monkeyed with while they weren't
watching... so caches are useless for p2p apps or for security
checking (and useless for verifying that the system isn't silently
corrupting data, except for completely static files).  If the
integrity of the hash is ensured by the file system then your trust of
the hash should be equal to your trust of the kernel, which is the
same level of trust you have in read(), thus you should be able to use
the stored hash in any place where you'd read the file and compute the
hash itself.
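
The bookkeeping behind that guarantee can be sketched as follows. This is an in-memory stand-in for the behaviour proposed at the start of the thread (mark the hash 'unclean' on every write, rehash on demand when the attribute is read), not a reiserfs interface; the `HashedFile` class and its method names are hypothetical.

```python
import hashlib


class HashedFile:
    """In-memory stand-in for a file whose hash lives in its metadata."""

    def __init__(self, data: bytes = b""):
        self._data = bytearray(data)
        self._hash = None            # the stored hash attribute
        self._clean = False          # False means the stored hash is 'unclean'
        self.hash_computations = 0   # counts how often the data was rehashed

    def write(self, offset: int, chunk: bytes) -> None:
        end = offset + len(chunk)
        if end > len(self._data):
            # Grow the file with zero bytes, like writing past EOF.
            self._data.extend(b"\0" * (end - len(self._data)))
        self._data[offset:end] = chunk
        # Every write path goes through here, so the flag cannot be missed.
        self._clean = False

    def read(self) -> bytes:
        return bytes(self._data)

    def get_hash(self) -> str:
        # Rehash only if something changed since the last computation.
        if not self._clean:
            self._hash = hashlib.sha1(self._data).hexdigest()
            self._clean = True
            self.hash_computations += 1
        return self._hash
```

The point is that the invalidation sits on the only write path. A userspace cache keyed on something like mtime and size has no such choke point, which is why it can't be trusted the same way.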

I agree that there are applications for additional realtime block
level protection which can't be provided by hashes-as-metadata.  These
would be better addressed via device-mapper... We don't see them
because it's hard to keep them from becoming useless due to an
overlap with the disk's underlying protection. (Because all
modern disks have ECC, we tend to lose entire physical blocks at a
time. Since we can't access the underlying correction data in a useful
way, we can't use it in correction... we might be duplicating it
entirely. Worse, since a block level ECC or CRC scheme would change
the size of a disk block, we'd end up with all blocks taking multiple
disk blocks... Even ignoring the potential performance and atomicity
issues, this would greatly increase the impact of block level
corruption: you'd always lose two blocks!)
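
The "always two blocks" claim is easy to check with a little arithmetic. The sketch below assumes 512-byte physical sectors and a 4-byte checksum appended inline to each 512-byte logical block, packed back to back; those sizes and the `sectors_touched` helper are illustrative assumptions, not a description of any existing device-mapper target.

```python
def sectors_touched(index: int, payload: int = 512, checksum: int = 4,
                    sector: int = 512) -> range:
    """Physical sectors occupied by logical block `index` when each block
    is stored as payload + checksum, packed contiguously on disk."""
    record = payload + checksum
    start = index * record          # first byte of the record
    end = start + record - 1        # last byte of the record
    return range(start // sector, end // sector + 1)


# Every logical block straddles a sector boundary under these sizes.
print(all(len(sectors_touched(i)) == 2 for i in range(10000)))
```

Since each record is longer than a sector, no logical block ever fits in one physical sector, so a single lost sector always damages two logical blocks.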

RAID and disk ECC address low level corruption.  *Some* applications
do testing to catch higher level corruption, but the vast majority
don't, simply because it's not the application's primary duty to make
sure its host isn't broken.

end of thread, other threads:[~2005-02-19  3:28 UTC | newest]

Thread overview: 18+ messages
2005-02-11 18:58 Plugin for corruption resistance? Gregory Maxwell
2005-02-11 20:39 ` Jake Maciejewski
2005-02-11 20:53 ` Tom Vier
2005-02-12  5:19   ` David Masover
2005-02-13  3:48 ` Esben Stien
2005-02-14  2:01 ` Reiser 4 Apple Michael James
2005-02-14 18:49   ` Hans Reiser
2005-02-14 17:45 ` Plugin for corruption resistance? Hans Reiser
2005-02-15 20:42   ` Adam
2005-02-17  4:10     ` David Masover
2005-02-17 10:53       ` Christian Iversen
2005-02-18  3:43         ` David Masover
2005-02-18  4:28           ` Valdis.Kletnieks
2005-02-18 13:36             ` Gregory Maxwell
2005-02-18 22:09               ` Valdis.Kletnieks
2005-02-19  3:28                 ` Gregory Maxwell
  -- strict thread matches above, loose matches on Subject: below --
2005-02-14 16:32 Reiser 4 Apple Burnes, James
2005-02-14 22:07 ` Simon "Sturmflut" Raffeiner
