* RT[0/3]: Some related random thoughts
@ 2005-04-28 12:59 Kris Shannon
2005-04-28 13:34 ` RT[2/3]: Rename/Code-movement Tracking Kris Shannon
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Kris Shannon @ 2005-04-28 12:59 UTC (permalink / raw)
To: GIT Mailing List
I've had a number of thoughts about the "supposed" missing SCM features of git.
1) Alternate Encodings (including on-disk delta compression)
If the default object filename doesn't exist, we can try other
alternative encodings, e.g.
00/a29c403e751c2a2a61eb24fa2249c8956d1c80.xdelta, which can specify
the object content as a delta or some other ingenious idea...
2) Rename/Code-movement Tracking (file and/or function)
Additional object type tag(s) "rename" which references a changeset
and lists the movement metadata
3) SHA1 backwards reference cache
Allows quickly finding all commits which reference a given tree root,
all/the "rename" for a given commit, all xdeltas which use this blob.
There are quite a few important issues with all 3 of these ideas, so I
thought I would elaborate on each in a separate email... (coming soon :)
--
Kris Shannon <kris.shannon.kernel@gmail.com>
* RT[2/3]: Rename/Code-movement Tracking
2005-04-28 12:59 RT[0/3]: Some related random thoughts Kris Shannon
@ 2005-04-28 13:34 ` Kris Shannon
2005-04-28 13:47 ` RT[1/3]: Alternate Encodings (esp. Delta Compression) Kris Shannon
2005-04-28 13:54 ` RT[3/3]: Reverse lookup of SHA1 references Kris Shannon
2 siblings, 0 replies; 5+ messages in thread
From: Kris Shannon @ 2005-04-28 13:34 UTC (permalink / raw)
To: GIT Mailing List
IMO adding a new object type tag (or tags) for tracking would fit the git model.
Call it "rename" for example (better yet think of a better name)
Let the contents be something along the lines of:
commit CHANGESET-SHA1
PARENT-TREE-SHA1 /old/path\0/new/path\0
PARENT-TREE-SHA1 /old/path/2\0/new/path/2\0
...
The exact details will depend on the renaming model, and I don't care too
much at the moment what that turns out to be.
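To make the layout above concrete, here is a minimal Python sketch of
writing and reading such an object body. The function names
encode_rename and decode_rename are hypothetical, and the newline after
each entry is my assumption; only the "commit" header line and the
NUL-terminated old/new paths come from the layout above.

```python
def encode_rename(commit_sha1, moves):
    """Build the body of a hypothetical "rename" object.

    moves is a list of (parent_tree_sha1, old_path, new_path) tuples.
    Each path is NUL-terminated, as in the layout sketched above.
    """
    parts = [b"commit " + commit_sha1.encode("ascii") + b"\n"]
    for tree_sha1, old_path, new_path in moves:
        parts.append(
            tree_sha1.encode("ascii") + b" "
            + old_path.encode("utf-8") + b"\0"
            + new_path.encode("utf-8") + b"\0\n"  # trailing newline is an assumption
        )
    return b"".join(parts)


def decode_rename(data):
    """Parse a body produced by encode_rename back into (commit, moves)."""
    header, _, body = data.partition(b"\n")
    if not header.startswith(b"commit "):
        raise ValueError("missing commit header")
    commit_sha1 = header[len(b"commit "):].decode("ascii")
    moves = []
    while body:
        tree_sha1, _, rest = body.partition(b" ")
        old_path, _, rest = rest.partition(b"\0")
        new_path, _, rest = rest.partition(b"\0")
        if rest.startswith(b"\n"):
            rest = rest[1:]  # drop the entry separator only
        moves.append(
            (tree_sha1.decode("ascii"), old_path.decode("utf-8"), new_path.decode("utf-8"))
        )
        body = rest
    return commit_sha1, moves
```

Because the paths are NUL-terminated rather than newline-terminated, a
path containing a space or a newline still round-trips in this sketch.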
This new tag allows the rename data to be added on to commits from people
like Linus who don't care about renames.
It doesn't reduce security because any can happen to the rename object
and that will cover the whole commit (and those who sign the commit are
obviously not validating any rename information)
The rename objects can be used to assist in automatic merging, but from my
experience I would agree with Linus that if the right way to merge is not
really obvious then you probably need user input anyway.
--
Kris Shannon <kris.shannon.kernel@gmail.com>
* RT[1/3]: Alternate Encodings (esp. Delta Compression)
2005-04-28 12:59 RT[0/3]: Some related random thoughts Kris Shannon
2005-04-28 13:34 ` RT[2/3]: Rename/Code-movement Tracking Kris Shannon
@ 2005-04-28 13:47 ` Kris Shannon
2005-04-28 13:54 ` RT[3/3]: Reverse lookup of SHA1 references Kris Shannon
2 siblings, 0 replies; 5+ messages in thread
From: Kris Shannon @ 2005-04-28 13:47 UTC (permalink / raw)
To: GIT Mailing List
If a format is defined for representing delta compression then
it would be prudent to make sure that it could be used for
encoding both forward and backward deltas (not necessarily
in the same delta :) These deltas could then be kept in the
objects directory (i.e. 00/a29c403e751c2a2a61eb24fa2249c8956d1c80.xdelta).
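To illustrate what "one format for both directions" could mean, here is
a small Python sketch of a copy/insert delta. This is NOT the xdelta
format; the op tuples and the names make_delta and apply_delta are
made up for illustration, and the matching is done with Python's
standard difflib. The point is only that the same encoding works
whether the base is the older or the newer version.

```python
from difflib import SequenceMatcher


def make_delta(base, target):
    """Describe target as a list of ops against base.

    ("copy", offset, length) copies bytes out of base;
    ("insert", data) carries literal bytes not found in base.
    Ranges of base that target drops are simply never copied.
    """
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))
    return ops


def apply_delta(base, ops):
    """Rebuild the target from base plus the ops from make_delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


old = b"int main(void)\n{\n    return 0;\n}\n"
new = b"int main(void)\n{\n    puts(\"hi\");\n    return 0;\n}\n"

forward = make_delta(old, new)    # store old in full, new as a delta
backward = make_delta(new, old)   # store new in full, old as a delta
```

Which direction to store is then a policy decision (space for old
versions versus bandwidth for new ones), not a format decision.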
Doing delta compression of old versions is something that should
be done manually (the Subversion people have empirical data
to back that up, I think, but I can't seem to find a link ATM).
A command for wiping old versions from a repository to save space
could be altered to replace the files with their xdelta equivalents,
giving reduced space savings while still keeping the full history.
Using delta compression of the new versions (against the old) for
efficient bandwidth consumption is another possible area. If these
deltas are produced on the fly, they could be cached in the objects
directory.
These two different use cases are IMO a good argument for
using this as a convention even if it doesn't become a part of git's
core (i.e. changing read_sha1_file to transparently expand xdeltas).
If you add .xdelta it would follow that other encodings might be useful,
and they could be added to the objects directory in the same way.
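As a rough sketch of the naming convention (in Python rather than git's
C, and not a description of what read_sha1_file does), the lookup could
try the plain object file first and then each known alternate suffix.
The name find_object_file and the ALTERNATE_SUFFIXES tuple are
hypothetical; only the ".xdelta" suffix comes from the text above.

```python
import os

# Hypothetical list of alternate encodings, tried in order.
ALTERNATE_SUFFIXES = (".xdelta",)


def find_object_file(objects_dir, sha1_hex, suffixes=ALTERNATE_SUFFIXES):
    """Return (path, suffix) for the first existing encoding of an object.

    suffix is None for the default encoding. Returns (None, None) if the
    object is not present in any known encoding.
    """
    base = os.path.join(objects_dir, sha1_hex[:2], sha1_hex[2:])
    if os.path.exists(base):
        return base, None
    for suffix in suffixes:
        candidate = base + suffix
        if os.path.exists(candidate):
            return candidate, suffix
    return None, None
```

A caller would then pick a decoder based on the returned suffix, so a
new encoding only needs a new suffix and a matching decoder.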
--
Kris Shannon <kris.shannon.kernel@gmail.com>
* RT[3/3]: Reverse lookup of SHA1 references
2005-04-28 12:59 RT[0/3]: Some related random thoughts Kris Shannon
2005-04-28 13:34 ` RT[2/3]: Rename/Code-movement Tracking Kris Shannon
2005-04-28 13:47 ` RT[1/3]: Alternate Encodings (esp. Delta Compression) Kris Shannon
@ 2005-04-28 13:54 ` Kris Shannon
[not found] ` <42717714.50601@dwheeler.com>
2 siblings, 1 reply; 5+ messages in thread
From: Kris Shannon @ 2005-04-28 13:54 UTC (permalink / raw)
To: GIT Mailing List
There are a number of places where you want to find all the objects
which reference this particular object. AIUI this is not currently an easy
task. Some thought should be put into how to make these reverse
lookups fast.
The other two random thoughts would benefit greatly from a relationship
cache.
Umm.... I was going to write some more, but I've gotta go :(
More thoughts later...
--
Kris Shannon <kris.shannon.kernel@gmail.com>
* Re: RT[3/3]: Reverse lookup of SHA1 references
[not found] ` <42717714.50601@dwheeler.com>
@ 2005-04-29 0:11 ` Kris Shannon
0 siblings, 0 replies; 5+ messages in thread
From: Kris Shannon @ 2005-04-29 0:11 UTC (permalink / raw)
To: dwheeler; +Cc: GIT Mailing List
On 4/29/05, David A. Wheeler <dwheeler@dwheeler.com> wrote:
> Kris Shannon wrote:
> > There are a number of places where you want to find all the objects
> > which reference this particular object. AIUI this is not currently an easy
> > task. Some thought should be put into how to make these reverse
> > lookups fast.
> I have. Please look at my old postings. It turns out to be easy;
> just create a directory parallel to .git/objects, say:
> .git/reverse
> 00/
> 89123408312904819048390/
> 189412890892308290
> 238923849038329089
>
> Anyway you get the idea.
Seems reasonable.
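For what it's worth, here is how I read that layout, as a minimal
Python sketch: one directory per referenced object, containing one
empty file per object that refers to it. The function names
add_reference and referrers are made up, and treating the entries as
empty marker files is my assumption, not something spelled out in the
quoted proposal.

```python
import os


def _target_dir(reverse_dir, target_sha1):
    # Same fan-out as the objects directory: first two hex digits, then the rest.
    return os.path.join(reverse_dir, target_sha1[:2], target_sha1[2:])


def add_reference(reverse_dir, target_sha1, referrer_sha1):
    """Record that the object referrer_sha1 references target_sha1."""
    directory = _target_dir(reverse_dir, target_sha1)
    os.makedirs(directory, exist_ok=True)
    # An empty file named after the referrer; only its name matters.
    open(os.path.join(directory, referrer_sha1), "a").close()


def referrers(reverse_dir, target_sha1):
    """List every object recorded as referencing target_sha1."""
    try:
        return sorted(os.listdir(_target_dir(reverse_dir, target_sha1)))
    except FileNotFoundError:
        return []
```

With this, "all commits which reference a given tree root" becomes a
single directory listing instead of a scan over every object.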
> Linus does NOT want renames noted; see the old emails on that.
> It's not clear this is such a good idea, but he's adamant.
> He thinks this can be handled by merge.
> If not, it can be added later.
I realize that this is not a priority for Linus; I was suggesting that
it be added as a separate object type. I think that much of the
extra metadata could be usefully added as other object types.
These can be optional and/or added after the fact.
> --- David A. Wheeler
>
--
Kris Shannon <kris.shannon.kernel@gmail.com>