From: Hans Reiser <reiser@namesys.com>
To: mingz@ele.uri.edu
Cc: David Masover <ninja@slaphack.com>,
Peter van Hardenberg <pvh@uvic.ca>,
reiserfs-list@namesys.com
Subject: Re: Versioning Plugin
Date: Sat, 12 Nov 2005 20:23:23 -0800
Message-ID: <4376BFBB.5020201@namesys.com>
In-Reply-To: <1131851150.21584.54.camel@localhost.localdomain>
Ming Zhang wrote:
>On Sat, 2005-11-12 at 20:56 -0600, David Masover wrote:
>
>
>>Ming Zhang wrote:
>>
>>
>>>On Sat, 2005-11-12 at 15:46 -0600, David Masover wrote:
>>>
>>>
>>>
>>>>Ming Zhang wrote:
>>>>
>>>>
>>>>
>>>>>On Fri, 2005-11-11 at 16:56 -0800, Peter van Hardenberg wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>On November 11, 2005 05:59 am, John Gilmore wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Does anybody remember GoBack? It was a versioning
>>>>>>>system for Windows 95/98 that was incredibly flexible and useful. Tracked
>>>>>>>all changes to the whole disk. Old versions of a file? No problem. Grab an
>>>>>>>old version of a directory for reference temporarily? Easy. Got a virus?
>>>>>>>Revert the whole HD, and then grab the newer copies of your documents and
>>>>>>>saved games as needed.
>>>>>>>
>>>>>>>
>>>>>>My thoughts on this:
>>>>>>
>>>>>>The versioning would be an audit plugin. When the file is modified, tag the
>>>>>>current version, copy it into a sub-directory (oh, I don't know, say
>>>>>>file/.revisions/<number/date>), and disable write access to it. You might not
>>>>>>even need extended filesystem attributes for this, but they would be handy
>>>>>>for tagging particular versions.
>>>>>>
>>>>>>
>>>>>If a file is opened, modified twice, then closed, you will only generate
>>>>>one version, right? So "when the file is modified" is inaccurate.
>>>>>
>>>>>
>>>>How about "When the transaction was completed?" Why does it matter?
>>>>
>>>>
>>>Then how do you define a transaction? I mean, we first need to choose a
>>>good event or period to define what counts as a meaningful version.
>>>
>>>
>>>
>>>
>>>
>>>>>>Copy-on-write would make this action extremely cheap, only adding a couple of
>>>>>>extra writes to make it work.
>>>>>>
>>>>>>
>>>>>Adding one line at the beginning of a 100MB text file will make this expensive.
>>>>>
>>>>>
>>>>Who has to work with 100 meg text files? And why has this person not
>>>>broken them down into 100 kilobyte text files? Storage efficiency isn't
>>>>really an issue there...
>>>>
>>>>
>>>Yes, a 100MB text file is an extreme example, but a common case is
>>>deleting one frame in a streaming media file.
>>>
>>>
>>What do you mean by "streaming"? (To me, "streaming media" usually
>>means "over the Internet", which makes no sense here.)
>>
>>
>
>What I mean is that frames are independent of each other, so when you
>delete one frame, the other frames' data is unchanged, like changing
>ABCDEFG to ACDEFG.
>
>
>
>
>>>Basically, COW is not good
>>>for a data shift situation. You have >99% of the data unchanged; only
>>>its offset in the file has changed. This leads to all blocks changing,
>>>so COW will need to copy a lot.
>>>
>>>
>>When do you have a data shift situation where this is significant enough
>>to impact COW, but not significant enough to affect normal performance?
>>
>>As far as I know, *nix has no way to append to the beginning of a file,
>>so if you're editing a large video file, say several gigs of DVD, you
>>have to write out several gigs worth of data all over again because you
>>want it shifted.
>>
>>
>
>Yes, that is my understanding too. Thanks for your analysis; I now agree
>that COW should be OK for this case, considering the overhead.
>
>But another issue with COW is that when you have lots of versions, any
>write to the original data will lead to a lot of new writes to the COW
>storage.
>
>Is there any place I can find documentation on how to write a plugin for
>Reiser? It sounds interesting. :P
>
>ming
>
>
>
>>The filesystem may eventually provide more intelligent ways of messing
>>with a file, and the COW system should be able to handle when a program
>>appends to or chops off the beginning of a file.
>>
>>Until then, we can rely somewhat on programs optimizing for speed --
>>rather than rewrite several gigs, it could split the file into smaller
>>files (thus, only the file which was changed is copied), or make it a
>>sort of mini-FS in that it fragments the logical structure of the file
>>so that it writes as little as possible -- for instance, inserting a
>>clip in the middle might write to the end of a "project" file, instead
>>of shifting half of that file over first. You'd keep versions of the
>>project file, not the stream (properly defragmented) you'd export when
>>you're done.
>>
>>For cases where developers didn't have to deal with the speed issues, we
>>don't have to worry about it. In the case of audio editing, if it's
>>actually messing with the sound itself, no COW in the world will catch
>>that. If it's a mixing/sequencing program, that's usually stored as a
>>"project", accompanied by lots of little WAV files, which don't change,
>>and a tiny "project" file describing how they go together, which does
>>change.
>>
>>And for text files and office documents, the sizes just aren't usually
>>enough for us to care. My biggest OpenOffice.org document probably
>>isn't a hundred kilobytes, and my disk space is measured in gigabytes.
>>It'd take over ten thousand revisions to fill a gig with copies of one
>>of those files. Sure, we could make an Oasis plugin for OO.o to use, so
>>all the contents of the document are stored as individual files, turned
>>into a zipfile on demand to match the current standard -- but that's not
>>worth it in the short term, and only really helps with presentations in
>>the long term.
>>
>>Actually, while I think it'd be nice to be able to do more advanced
>>splicing in a file (append or delete from the beginning or middle), I
>>think it's more important to come up with a sane way for a program to
>>access a file as a lot of little pieces, and to have a standard way of
>>serializing them for transport (email or otherwise). Kind of like XML,
>>only it could be more efficient than the old model, instead of less.
>>
>>Like XML in that XML allows programmers to dump internal structures to a
>>human-readable file without writing parsers and serializers. Move the
>>serializing logic out to the FS, let it handle the performance, version
>>control, and export issues.
>>
>>
>
>
>
>
>
well, frames should be handled by inheritance, because there are times
you want to see them as separate objects....
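
The data shift concern raised in the quoted discussion can be illustrated with a rough sketch. It assumes a purely hypothetical COW scheme that works on fixed-size blocks and treats a block as changed whenever its content differs from the block at the same index in the previous version. The 4096-byte block size, the 1024-byte frame size, and all function names are illustrative assumptions, not anything taken from reiser4.

```python
import hashlib

BLOCK_SIZE = 4096   # assumed block size, for illustration only
FRAME_SIZE = 1024   # assumed frame size, for illustration only


def make_frame(index):
    """Build a deterministic frame whose content depends on its index."""
    digest = hashlib.sha256(str(index).encode()).digest()  # 32 bytes
    return digest * (FRAME_SIZE // len(digest))


def make_frames(count):
    """Return a list of `count` distinct frames."""
    return [make_frame(i) for i in range(count)]


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of `data` (the last block may be short)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_copy(old, new, block_size=BLOCK_SIZE):
    """Count blocks of `new` that differ from the same-index block of `old`.

    This stands in for the number of blocks a naive block-level COW
    would have to write out for the new version.
    """
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    changed = 0
    for i, h in enumerate(new_hashes):
        if i >= len(old_hashes) or old_hashes[i] != h:
            changed += 1
    return changed
```

With 64 such frames (16 blocks), deleting the second frame shifts everything after it, and blocks_to_copy reports all 16 blocks as changed. Overwriting that same frame in place with a different frame of equal size changes only 1 block. That is the ABCDEFG to ACDEFG case: almost all of the data is the same, but a scheme that only compares blocks by position cannot see it.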