From: Bron Gondwana <brong@fastmail.fm>
To: Kyle Moffett <mrmacman_g4@mac.com>
Cc: Bryan Henderson <hbryan@us.ibm.com>,
	Jack Stone <jack@hawkeye.stone.uk.eu.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	alan <alan@clueserver.org>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-fsdevel@vger.kernel.org,
	LKML Kernel <linux-kernel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	git@vger.kernel.org
Subject: Re: Versioning file system
Date: Tue, 19 Jun 2007 17:58:57 +1000
Message-ID: <20070619075857.GA2944@brong.net>
In-Reply-To: <6E9A6F9E-8948-40F2-9129-1F1491D49D83@mac.com>

On Mon, Jun 18, 2007 at 11:10:42PM -0400, Kyle Moffett wrote:
> On Jun 18, 2007, at 13:56:05, Bryan Henderson wrote:
>>> The question remains is where to implement versioning: directly in 
>>> individual filesystems or in the vfs code so all filesystems can use it?
>>
>> Or not in the kernel at all.  I've been doing versioning of the types I 
>> described for years with user space code and I don't remember feeling that 
>> I compromised in order not to involve the kernel.
>
> What I think would be particularly interesting in this domain is something 
> similar in concept to GIT, except in a file-system:

I've written a couple of user-space things very much like this - one
being a purely database-backed system (blobs in the database, yeah I
know) for managing medical data, where signatures and auditability were
the most important part of the system.  Performance really wasn't a
consideration.

The other one is my current job, FastMail - we have a virtual filesystem
which uses files stored by sha1 on ordinary filesystems for data
storage and a database for metadata (filename to sha1 mappings, mtime,
mimetype, directory structure, etc).
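
A minimal sketch of that kind of layout, not FastMail's actual code: blobs
go into ordinary files named by their sha1, and a small database maps
filenames to digests.  The class name BlobStore, the table schema, the
two-character directory fan-out and the in-memory SQLite database are all
assumptions made for illustration.

```python
import hashlib
import os
import sqlite3
import tempfile


class BlobStore:
    """Hypothetical store: blobs in files named by SHA-1, names in a database."""

    def __init__(self, root):
        self.root = root
        # In-memory database for the sketch; a real system would persist it.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE files (name TEXT PRIMARY KEY, sha1 TEXT, mtime REAL)"
        )

    def _path(self, digest):
        # Fan out on the first two hex digits to keep directories small.
        return os.path.join(self.root, digest[:2], digest[2:])

    def put(self, name, data, mtime=0.0):
        digest = hashlib.sha1(data).hexdigest()
        path = self._path(digest)
        if not os.path.exists(path):
            # Identical content is written only once, whatever its name.
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        self.db.execute(
            "INSERT OR REPLACE INTO files VALUES (?, ?, ?)", (name, digest, mtime)
        )
        return digest

    def get(self, name):
        row = self.db.execute(
            "SELECT sha1 FROM files WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        with open(self._path(row[0]), "rb") as f:
            return f.read()
```

Two names with the same content share one blob on disk, which is where
the deduplication and auditability come from.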

Multiple machine distribution is handled by a daemon on each machine
which can be asked to make sure the file gets sent out to every machine
that matches the prefix and will only return success once it's written
to at least one other machine.  Database replication is a different
beast.


It can work, but there's one big pain at the file level: no mmap.

If you don't want to support mmap it can work reasonably happily, though
you may want to keep your sha1 (or other digest) state as well as the
final digest so you can cheaply calculate the digest for a small append
without walking the entire file.  You may also want to keep state
checkpoints every so often along a big file so that truncates don't cost
too much to recalculate.
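
The append and truncate trick above can be sketched roughly as follows,
using the copy() method of Python's hashlib objects to snapshot the
running sha1 state.  The class name CheckpointedDigest, the interval
parameter and the read_range callback are hypothetical, and the state is
only kept in memory here; persisting it across restarts would need a
hash implementation that can serialise its internal state.

```python
import hashlib


class CheckpointedDigest:
    """Running SHA-1 with saved states so appends and truncates stay cheap."""

    def __init__(self, interval=1 << 20):
        self.interval = interval
        self.length = 0
        self.state = hashlib.sha1()
        # Each entry is (offset, copy of the hash state after `offset` bytes).
        self.checkpoints = [(0, self.state.copy())]

    def append(self, data):
        view = memoryview(data)
        while len(view):
            # Feed bytes only up to the next checkpoint boundary.
            room = self.interval - (self.length % self.interval)
            chunk = view[:room]
            self.state.update(chunk)
            self.length += len(chunk)
            view = view[len(chunk):]
            if self.length % self.interval == 0:
                self.checkpoints.append((self.length, self.state.copy()))

    def truncate(self, new_length, read_range):
        """read_range(start, end) must return the file's bytes [start, end)."""
        if new_length > self.length:
            raise ValueError("truncate cannot extend the file")
        # Discard checkpoints that lie beyond the new end of file.
        while self.checkpoints[-1][0] > new_length:
            self.checkpoints.pop()
        offset, saved = self.checkpoints[-1]
        # Resume from the nearest earlier checkpoint and rehash only the tail.
        self.state = saved.copy()
        self.state.update(read_range(offset, new_length))
        self.length = new_length

    def hexdigest(self):
        # hexdigest() does not finalise the object, so appends can continue.
        return self.state.hexdigest()
```

An append costs only the new bytes, and a truncate rereads at most one
interval of the file rather than the whole thing.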

Luckily in a userspace VFS that's only accessed via FTP and DAV we can
support a limited set of operations (basically create, append, read,
delete).  You don't get that luxury for a general purpose filesystem, and
that's the problem.  There will always be particular usage patterns
(especially something that mmaps or seeks and touches all over the place
like a loopback mounted filesystem or a database file) that just don't
work for file-level sha1s.


It does have some lovely properties though.  I'd enjoy working in an
environment that didn't look much like POSIX but had the strong
guarantees and auditability that addressing by sha1 buys you.

Bron.



