Git development

Git development
 help / color / mirror / Atom feed

* Re: git fsck not identifying corrupted packs
From: Wesley J. Landaker @ 2009-10-19 19:27 UTC (permalink / raw)
  To: git
In-Reply-To: <7v7hur1a0h.fsf@alter.siamese.dyndns.org>

(Not CCing everyone, since this is mostly curiosa in the "using git as it 
was never intended" section):

On Monday 19 October 2009 13:03:42 Junio C Hamano wrote:
> Once a packfile is created and we always use it read-only, there didn't
> seem to be much point in suspecting that the underlying filesystems or
> disks may corrupt them in such a way that is not caught by the SHA-1
> checksum over the entire packfile and per object checksum.  That trust in
> the filesystems might have been a good tradeoff between fsck performance
> and reliability on platforms git was initially developed on and for, but
> it might not be true anymore as we run on more platforms these days.

Filesystems are mostly reliable, but only until your crazy users do strange 
and terrible things. I have a real, non-toy environment where I use this 
stack as a [horrible] workaround for some issues beyond my control:

git -> ext4 -> lvm -> dmcrypt -> loop -> sshfs -> cygwin sshd -> SMB share

Amazingly, this works pretty reliably with many gigabytes of data in a git 
repository, even with the occasional crash because of flakiness with the 
"sshfs -> cygwin sshd" piece of the puzzle. But a good "git fsck" sure 
doesn't hurt in this environment! =)

^ permalink raw reply

* Re: How to revert one of multiple merges
From: Michael J Gruber @ 2009-10-19 19:36 UTC (permalink / raw)
  To: Bill Lear; +Cc: git
In-Reply-To: <19164.44099.875117.96168@lisa.zopyra.com>

Bill Lear venit, vidit, dixit 19.10.2009 20:13:
> On Monday, October 19, 2009 at 17:19:35 (+0200) Michael J Gruber writes:
>> Bill Lear venit, vidit, dixit 18.10.2009 04:31:
>>> Branch A, B, C each have 20 commits, 0-19.
>>>
>>> Branch v1.0.0 created, then merge of A, B, C performed.
>>>
>>> After testing, we realize that the branch B is not ready for
>>> production release and we'd like to remove it from branch
>>> v1.0.0.
>>>
>>> If I do
>>>
>>> % git merge A B C
>>>
>>> I get a single commit:
>>>
>>> % git log -p
>>>
>>> commit 1644a0b98c01869aa83e59aa41374c22098c47b6
>>> [...]
>>> Date:   Fri Oct 16 09:52:32 2009 -0500
>>>
>>>     Merge branches 'A', 'B' and 'C' into v1.0.0
>>>
>>> [20 x 3 commits]
>>>
>>> If I do
>>>
>>> % git merge A
>>> % git merge B
>>> % git merge C
>>>
>>> Then:
>>>
>>> % git log -p
>>>
>>> commit 8946edd381384d0882221c87b5b3b7bf47127d70
>>> [...]
>>> Date:   Sat Oct 17 21:28:36 2009 -0500
>>>
>>>     Merge branch 'B' into v1.0.0
>>>
>>> commit 076ed422443e3684e564f7cae2b92e4538088ae6
>>> [...]
>>> Date:   Sat Oct 17 21:28:35 2009 -0500
>>>
>>>     Merge branch 'A' into v1.0.0
>>>
>>> but no "Merge branch 'C' into v1.0.0".
>>
>> Do you get any commits after the merge of B? If yes, then v1.0.0 got
>> fast-forwarded (you can avoid that using --no-ff). If no, C was
>> contained in v1.0.0 already.
> 
> BTW this is all with git 1.6.5.
> 
> My test script that set all of this up makes no commit to any branch
> after the merge of any branch is done.  C was not in v1.0.0 already.
> Here is the script I used to set this up:
> 
> % cat scripto
> rm -rf branch_test
> mkdir branch_test
> cd branch_test
> git init
> 
> echo foo > foo
> git add foo
> git commit -a -m "foo"
> 
> git checkout -b A
> for ((i=0; i < 20; ++i)); do
>     echo "bar $i" > bar
>     git add bar
>     git commit -a -m "bar $i"
> done
> 
> git checkout master
> git checkout -b B
> for ((i=0; i < 20; ++i)); do
>     echo "baz $i" > baz
>     git add baz
>     git commit -a -m "baz $i"
> done
> 
> git checkout master
> git checkout -b C
> for ((i=0; i < 20; ++i)); do
>     echo "buz $i" > buz
>     git add buz
>     git commit -a -m "buz $i"
> done
> 
> git checkout master
> git checkout -b v1.0.0
> 
> After that, I did the merges this way:
> 
> % git merge A
> % git merge B
> % git merge C
> 
> and then the git log shows no merge of C, as above.  Hmm, actually, when I
> just ran this, I get no output showing branch A was merged.  I just
> did this again and here is the merge output:
> 
> % git branch -a
>   A
>   B
>   C
>   master
> * v1.0.0
> % git merge A
> Updating af6c884..c7e5f2c
> Fast forward

A was based off master, and v1.0.0 equals that base commit, so it's a f-f.

>  bar |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 bar
> % git merge B
> Merge made by recursive.
>  baz |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 baz
> % git merge C
> Merge made by recursive.
>  buz |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 buz
> 
> Then, git log -p shows no branch A merge:
> 
> % git log -p | grep -i merge
> Merge: f462b95 2c2b064
>     Merge branch 'C' into v1.0.0
> Merge: c7e5f2c adde6ff
>     Merge branch 'B' into v1.0.0
> 
>> In both cases, it's not clear how C could have been "ready" when B was not.
> 
> A, B, and C, are entirely independent of one another.  I'm trying to
> replicate an instance in which a feature is developed and submitted for
> inclusion in a release, accepted for inclusion, but then later found
> to be defective.
> 
>>> And so, I'm faced with git rebase -i posing some unanswerable questions
>>> to our release manager.  She cannot easily remove B from the merge after
>>> doint either merge A B C, or merge A, merge B, merge C.
>>
>> The way you described the situation there are no commits after the
>> merges. So, why not reset to before the merge and do a "git merge A C"?
> 
> Presumably, I would need to tag the v1.0.0 branch after creating it,
> which I was hoping not to have to do.  I wanted the equivalent of
> "git unmerge B" after doing three separate merges as above, or an octopus
> merge.  I'm just trying to make life simpler for our release manager,
> who is not equipped with git fu.

I think the octopus case is difficult. So let's treat the other one ;)

If you have separate merges you can revert them separately. You don't
need to tag them, you only need to be able to find the merge commit, say
using

git log --grep='Merge branch'

Then you can revert that merge using

git revert -m 1 sha1ofthatmerge

Now comes the difficult part: If, later on, you want to merge B *and
include also the parts of B from before that previous merge* you need to
revert that revert, then merge. If, on the other hand, you only want to
merge the "new parts" of B then simply merge B.

The reason is that reverting a merge undoes its changes, but still
leaves the merge in the DAG, so that the commits of the merged branch
are still considered part of the history, and as such won't be merged in
again.

howto/revert-a-faulty-merge.txt explains this, but the above is a short
summary.

Cheers,
Michael

^ permalink raw reply

* [RFC PATCH] git-gui: Allow staging multiple lines at once
From: Jeff Epler @ 2009-10-19 19:54 UTC (permalink / raw)
  To: git

When applying less than a full hunk, it's still often desirable to apply
a number of consecutive lines.

This change makes it possible to sweep out a range of lines in the diff view
with the left mouse button, then right click and "Stage Lines For Commit".

The selected lines may span multiple hunks.

Signed-off-by: Jeff Epler <jepler@unpythonic.net>
---
The diff looks bigger than it is because it changed the indentation level
of about 80 lines, and that made it necessary to reflow a lengthy comment
block as well.

This introduces new user interface strings.  I felt this was probably a
better decision than using the inaccurate "Stage Line For Commit" when
a block of text was swept out.

I wonder a bit about the message [mc "Apply/Reverse Line"] -- as far as
I can tell, it is never shown to a user, so why is it translated?

 git-gui/git-gui.sh   |   15 +++-
 git-gui/lib/diff.tcl |  222 +++++++++++++++++++++++++++----------------------
 2 files changed, 134 insertions(+), 103 deletions(-)

diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index 14b92ba..a80ed0d 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -3165,7 +3165,7 @@ set ui_diff_applyhunk [$ctxm index last]
 lappend diff_actions [list $ctxm entryconf $ui_diff_applyhunk -state]
 $ctxm add command \
 	-label [mc "Apply/Reverse Line"] \
-	-command {apply_line $cursorX $cursorY; do_rescan}
+	-command {apply_range_or_line $cursorX $cursorY; do_rescan}
 set ui_diff_applyline [$ctxm index last]
 lappend diff_actions [list $ctxm entryconf $ui_diff_applyline -state]
 $ctxm add separator
@@ -3205,12 +3205,21 @@ proc popup_diff_menu {ctxm ctxmmg x y X Y} {
 	if {[string first {U} $state] >= 0} {
 		tk_popup $ctxmmg $X $Y
 	} else {
+		set has_range [expr {[$::ui_diff tag nextrange sel 0.0] != {}}]
 		if {$::ui_index eq $::current_diff_side} {
 			set l [mc "Unstage Hunk From Commit"]
-			set t [mc "Unstage Line From Commit"]
+			if {$has_range} {
+				set t [mc "Unstage Lines From Commit"]
+			} else {
+				set t [mc "Unstage Line From Commit"]
+			}
 		} else {
 			set l [mc "Stage Hunk For Commit"]
-			set t [mc "Stage Line For Commit"]
+			if {$has_range} {
+				set t [mc "Stage Lines For Commit"]
+			} else {
+				set t [mc "Stage Line For Commit"]
+			}
 		}
 		if {$::is_3way_diff
 			|| $current_diff_path eq {}
diff --git a/git-gui/lib/diff.tcl b/git-gui/lib/diff.tcl
index 925b3f5..30ac659 100644
--- a/git-gui/lib/diff.tcl
+++ b/git-gui/lib/diff.tcl
@@ -505,10 +505,23 @@ proc apply_hunk {x y} {
 	}
 }
 
-proc apply_line {x y} {
+proc apply_range_or_line {x y} {
 	global current_diff_path current_diff_header current_diff_side
 	global ui_diff ui_index file_states
 
+	set selected [$ui_diff tag nextrange sel 0.0]
+
+	if {$selected == {}} {
+		set first [$ui_diff index "@$x,$y"]
+		set last $first
+	} else {
+		set first [lindex $selected 0]
+		set last [lindex $selected 1]
+	}
+
+	set first_l [$ui_diff index "$first linestart"]
+	set last_l [$ui_diff index "$last lineend"]
+
 	if {$current_diff_path eq {} || $current_diff_header eq {}} return
 	if {![lock_index apply_hunk]} return
 
@@ -531,119 +544,128 @@ proc apply_line {x y} {
 		}
 	}
 
-	set the_l [$ui_diff index @$x,$y]
+	set wholepatch {}
 
-	# operate only on change lines
-	set c1 [$ui_diff get "$the_l linestart"]
-	if {$c1 ne {+} && $c1 ne {-}} {
-		unlock_index
-		return
-	}
-	set sign $c1
-
-	set i_l [$ui_diff search -backwards -regexp ^@@ $the_l 0.0]
-	if {$i_l eq {}} {
-		unlock_index
-		return
-	}
-	# $i_l is now at the beginning of a line
+	while {$first_l < $last_l} {
+		set i_l [$ui_diff search -backwards -regexp ^@@ $first_l 0.0]
+		if {$i_l eq {}} {
+			# If there's not a @@ above, then the selected range
+			# must have come before the first_l @@
+			set i_l [$ui_diff search -regexp ^@@ $first_l $last_l]
+		}
+		if {$i_l eq {}} {
+			unlock_index
+			return
+		}
+		# $i_l is now at the beginning of a line
 
-	# pick start line number from hunk header
-	set hh [$ui_diff get $i_l "$i_l + 1 lines"]
-	set hh [lindex [split $hh ,] 0]
-	set hln [lindex [split $hh -] 1]
+		# pick start line number from hunk header
+		set hh [$ui_diff get $i_l "$i_l + 1 lines"]
+		set hh [lindex [split $hh ,] 0]
+		set hln [lindex [split $hh -] 1]
 
-	# There is a special situation to take care of. Consider this hunk:
-	#
-	#    @@ -10,4 +10,4 @@
-	#     context before
-	#    -old 1
-	#    -old 2
-	#    +new 1
-	#    +new 2
-	#     context after
-	#
-	# We used to keep the context lines in the order they appear in the
-	# hunk. But then it is not possible to correctly stage only
-	# "-old 1" and "+new 1" - it would result in this staged text:
-	#
-	#    context before
-	#    old 2
-	#    new 1
-	#    context after
-	#
-	# (By symmetry it is not possible to *un*stage "old 2" and "new 2".)
-	#
-	# We resolve the problem by introducing an asymmetry, namely, when
-	# a "+" line is *staged*, it is moved in front of the context lines
-	# that are generated from the "-" lines that are immediately before
-	# the "+" block. That is, we construct this patch:
-	#
-	#    @@ -10,4 +10,5 @@
-	#     context before
-	#    +new 1
-	#     old 1
-	#     old 2
-	#     context after
-	#
-	# But we do *not* treat "-" lines that are *un*staged in a special
-	# way.
-	#
-	# With this asymmetry it is possible to stage the change
-	# "old 1" -> "new 1" directly, and to stage the change
-	# "old 2" -> "new 2" by first staging the entire hunk and
-	# then unstaging the change "old 1" -> "new 1".
-
-	# This is non-empty if and only if we are _staging_ changes;
-	# then it accumulates the consecutive "-" lines (after converting
-	# them to context lines) in order to be moved after the "+" change
-	# line.
-	set pre_context {}
-
-	set n 0
-	set i_l [$ui_diff index "$i_l + 1 lines"]
-	set patch {}
-	while {[$ui_diff compare $i_l < "end - 1 chars"] &&
-	       [$ui_diff get $i_l "$i_l + 2 chars"] ne {@@}} {
-		set next_l [$ui_diff index "$i_l + 1 lines"]
-		set c1 [$ui_diff get $i_l]
-		if {[$ui_diff compare $i_l <= $the_l] &&
-		    [$ui_diff compare $the_l < $next_l]} {
-			# the line to stage/unstage
-			set ln [$ui_diff get $i_l $next_l]
-			if {$c1 eq {-}} {
-				set n [expr $n+1]
+		# There is a special situation to take care of. Consider this
+		# hunk:
+		#
+		#    @@ -10,4 +10,4 @@
+		#     context before
+		#    -old 1
+		#    -old 2
+		#    +new 1
+		#    +new 2
+		#     context after
+		#
+		# We used to keep the context lines in the order they appear in
+		# the hunk. But then it is not possible to correctly stage only
+		# "-old 1" and "+new 1" - it would result in this staged text:
+		#
+		#    context before
+		#    old 2
+		#    new 1
+		#    context after
+		#
+		# (By symmetry it is not possible to *un*stage "old 2" and "new
+		# 2".)
+		#
+		# We resolve the problem by introducing an asymmetry, namely,
+		# when a "+" line is *staged*, it is moved in front of the
+		# context lines that are generated from the "-" lines that are
+		# immediately before the "+" block. That is, we construct this
+		# patch:
+		#
+		#    @@ -10,4 +10,5 @@
+		#     context before
+		#    +new 1
+		#     old 1
+		#     old 2
+		#     context after
+		#
+		# But we do *not* treat "-" lines that are *un*staged in a
+		# special way.
+		#
+		# With this asymmetry it is possible to stage the change "old
+		# 1" -> "new 1" directly, and to stage the change "old 2" ->
+		# "new 2" by first staging the entire hunk and then unstaging
+		# the change "old 1" -> "new 1".
+
+		# This is non-empty if and only if we are _staging_ changes;
+		# then it accumulates the consecutive "-" lines (after
+		# converting them to context lines) in order to be moved after
+		# the "+" change line.
+		set pre_context {}
+
+		set n 0
+		set m 0
+		set i_l [$ui_diff index "$i_l + 1 lines"]
+		set patch {}
+		while {[$ui_diff compare $i_l < "end - 1 chars"] &&
+		       [$ui_diff get $i_l "$i_l + 2 chars"] ne {@@}} {
+			set next_l [$ui_diff index "$i_l + 1 lines"]
+			set c1 [$ui_diff get $i_l]
+			if {[$ui_diff compare $first_l <= $i_l] &&
+			    [$ui_diff compare $i_l < $last_l] &&
+			    ($c1 eq {-} || $c1 eq {+})} {
+				# the line to stage/unstage
+				set ln [$ui_diff get $i_l $next_l]
+				if {$c1 eq {-}} {
+					set n [expr $n+1]
+					set patch "$patch$pre_context$ln"
+				} else {
+					set m [expr $m+1]
+					set patch "$patch$ln$pre_context"
+				}
+				set pre_context {}
+			} elseif {$c1 ne {-} && $c1 ne {+}} {
+				# context line
+				set ln [$ui_diff get $i_l $next_l]
 				set patch "$patch$pre_context$ln"
+				set n [expr $n+1]
+				set m [expr $m+1]
+				set pre_context {}
+			} elseif {$c1 eq $to_context} {
+				# turn change line into context line
+				set ln [$ui_diff get "$i_l + 1 chars" $next_l]
+				if {$c1 eq {-}} {
+					set pre_context "$pre_context $ln"
+				} else {
+					set patch "$patch $ln"
+				}
+				set n [expr $n+1]
+				set m [expr $m+1]
 			} else {
-				set patch "$patch$ln$pre_context"
-			}
-			set pre_context {}
-		} elseif {$c1 ne {-} && $c1 ne {+}} {
-			# context line
-			set ln [$ui_diff get $i_l $next_l]
-			set patch "$patch$pre_context$ln"
-			set n [expr $n+1]
-			set pre_context {}
-		} elseif {$c1 eq $to_context} {
-			# turn change line into context line
-			set ln [$ui_diff get "$i_l + 1 chars" $next_l]
-			if {$c1 eq {-}} {
-				set pre_context "$pre_context $ln"
-			} else {
-				set patch "$patch $ln"
 			}
-			set n [expr $n+1]
+			set i_l $next_l
 		}
-		set i_l $next_l
+		set wholepatch "$wholepatch@@ -$hln,$n +$hln,$m @@\n$patch"
+		set first_l [$ui_diff index "$next_l + 1 lines"]
 	}
-	set patch "@@ -$hln,$n +$hln,[eval expr $n $sign 1] @@\n$patch"
 
 	if {[catch {
 		set enc [get_path_encoding $current_diff_path]
 		set p [eval git_write $apply_cmd]
 		fconfigure $p -translation binary -encoding $enc
 		puts -nonewline $p $current_diff_header
-		puts -nonewline $p $patch
+		puts -nonewline $p $wholepatch
 		close $p} err]} {
 		error_popup [append $failed_msg "\n\n$err"]
 	}
-- 
1.6.5.rc1.49.ge970

^ permalink raw reply related

* Re: What's cooking in git.git (Oct 2009, #03; Mon, 19)
From: Erik Faye-Lund @ 2009-10-19 20:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhbtv6c76.fsf@alter.siamese.dyndns.org>

On Mon, Oct 19, 2009 at 10:05 AM, Junio C Hamano <gitster@pobox.com> wrote:
> * ef/msys-imap (2009-10-13) 8 commits.
>  - MSVC: Enable OpenSSL, and translate -lcrypto
>  - mingw: enable OpenSSL
>  - mingw: wrap SSL_set_(w|r)fd to call _get_osfhandle
>  - imap-send: build imap-send on Windows
>  - imap-send: fix compilation-error on Windows
>  - imap-send: use run-command API for tunneling
>  - imap-send: use separate read and write fds
>  - imap-send: remove useless uid code
>
> Is this good to go yet?

As far as I know, yes. I sent out a new iteration a couple of hours
ago (after you sent this mail). This version addresses all issues that
has been raised AFAIK. But let's give people some more time to shoot
holes in it, OK?

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply

* Re: [RFC PATCH] git-gui: Allow staging multiple lines at once
From: Geert Bosch @ 2009-10-19 20:08 UTC (permalink / raw)
  To: Jeff Epler; +Cc: git
In-Reply-To: <20091019195456.GA11121@unpythonic.net>


On Oct 19, 2009, at 15:54, Jeff Epler wrote:

> When applying less than a full hunk, it's still often desirable to  
> apply
> a number of consecutive lines.
>
> This change makes it possible to sweep out a range of lines in the  
> diff view
> with the left mouse button, then right click and "Stage Lines For  
> Commit".
>
> The selected lines may span multiple hunks.

Great! I've wished for this feature...

Thanks,
    -Geert

^ permalink raw reply

* Re: [PATCH 1/6 (v4)] man page and technical discussion for rev-cache
From: Nick Edelen @ 2009-10-19 20:26 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qgmitdk399@sirnot.private>

Before any code is introduced the full documentation is put forth.  This 
provides a man page for the porcelain, and a technical doc in technical/.  The 
latter describes the API, and discusses rev-cache's design, file format and 
mechanics.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
sorry about the long interval; I haven't got internet in my flat yet and uni
tends to take you over.  anyway, clean resend of patches (attempt 2).  this set
fixes a small bug in graft handling (ie. test t6001), and tweaks test for
compatability.

 Documentation/git-rev-cache.txt       |  190 ++++++++++
 Documentation/technical/rev-cache.txt |  634 +++++++++++++++++++++++++++++++++
 2 files changed, 824 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-rev-cache.txt b/Documentation/git-rev-cache.txt
new file mode 100644
index 0000000..5a713ad
--- /dev/null
+++ b/Documentation/git-rev-cache.txt
@@ -0,0 +1,190 @@
+git-rev-cache(1)
+================
+
+NAME
+----
+git-rev-cache - Add, walk and maintain revision cache slices
+
+SYNOPSIS
+--------
+'git-rev-cache' COMMAND [options] [<commit>...]
+
+DESCRIPTION
+-----------
+The revision cache ('rev-cache') provides a mechanism for significantly
+speeding up revision traversals.  It does this by creating an efficient
+database (cache) of commits, their related objects and topological relations.
+Independant of packs and the object store, this database is composed of
+rev-cache "slices" -- each a different file storing a given segment of commit
+history.  To map commits to their respective slices, a single index file is
+kept for the rev-cache.
+
+'git-rev-cache' provides a front-end for the rev-cache mechanism, intended for
+updating and maintaining rev-cache slices in the current repository.  New cache
+slice files can be 'add'ed, to keep the cache up-to-date; individual slices can
+be traversed; smaller slices can be 'fuse'd into a larger slice; and the
+rev-cache index can be regenerated.
+
+COMMANDS
+--------
+
+add
+~~~
+Add revisions to the cache by creating a new cache slice.  Reads a revision
+list from the command line, formatted as: `START START ... \--not END END ...`
+
+Options
+^^^^^^^
+
+\--all::
+	Include all refs in the new cache slice, like the \--all option in
+	'rev-list'.
+
+\--fresh/\--incremental::
+	Exclude everything already in the revision cache, analogous to
+	\--incremental in 'pack-objects'.
+
+\--stdin::
+	Read newline-seperated revisions from the standard input.  Use \--not
+	to exclude commits, as on the command line.
+
+\--legs::
+	Ensure newly-generated cache slice has no partial ends.  This means that
+	no commit has partially cached parents, in that all its parents are
+	cached or none of them are.  99.9% of users can ignore this command.
++
+\--legs will cause 'rev-cache' to expand potential slice end-points (creating
+"legs") until this condition is met, simplifying the cache slice structure.
+'rev-cache' itself does not care if a slice has legs or not, but the condition
+may reduce the required complexity of other applications that might use the
+revision cache.
+
+\--no-objects::
+	Non-commit objects are normally included along with the commit with
+	which they were introduced.  This is obviously very benificial, but can
+	take longer in cache slice generation.  Using this option will disable
+	non-commit object caching.
++
+\--no-objects is mainly intended for debugging or development purposes, but may
+find use in special situations (e.g. common traversal of only commits).
+
+Output
+^^^^^^
+
+On `stderr` 'add' outputs general information about the generated slice,
+including the number of objects and paths, and the start/end commits (prefix S
+indicates start, E an end).  Through `stdout` it emits only the SHA-1 of the
+slice.
+
+walk
+~~~~
+Analogous to a slice-oriented 'rev-list', 'walk' will traverse a region in a
+particular cache slice.  Interesting and uninteresting (delimited, as with
+'rev-list', with \--not) are specified on the command line, and output is the
+same as vanilla 'rev-list'.
+
+Options
+^^^^^^^
+
+\--objects::
+	Like 'rev-list', 'walk' will normally only list commits.  Use this
+	option to list non-commit objects as well, if they are present in the
+	cache slice.
+
+Output
+^^^^^^
+
+'walk' will simply dump the contents of the output commit list, work list, and
+pending object array.  The headers are outputed on `stderr`, the object hashes
+and names on `stdout`.
+
+fuse
+~~~~
+Merge several cache slices into a single large slice, like 'repack' for
+'rev-cache'.  On each invocation of 'add' a new file ("slice") is added to the
+revision cache directory, and after several additions the directory may become
+populated with many, relatively small slices.  Numerous smaller slices will
+yield poorer performance than a one or two large ones, because of the overhead
+of loading new slices into memory.
+
+Running 'fuse' every once in a while will solve this problem by coalescing all
+the cache slices into one larger slice.  For very large projects, using
+\--ignore-size is advisable to prevent overly large cache slices.  This can be
+set to run on garbage collection; see 'Automation' for more info.
+
+Note that 'fuse' uses the internal revision walker, so the options used in
+fusion override those of the cache slices upon which it operates.  For example,
+if some slices were generated with \--no-objects, yet 'fuse' was performed with
+non-commit objects, the resulting slice would still contain objects but would
+take longer to generate.
+
+Options
+^^^^^^^
+
+\--all::
+	Normally fuse will only include everything that's already in the
+	revision cache.  \--all tells it to start walking from the branch
+	heads, effectively a `add --all --fresh; fuse`
+	(pseudo-revcache-command).
+
+\--no-objects::
+	As in 'add', this option disables inclusion of non-commit objects.  If
+	some cache slices do contain such objects, the information will be lost.
+
+\--ignore-size[=N]::
+	Do not merge cache slices of size >=N (be aware that slices must be
+	mapped to memory).  N can have a suffix of "k" or "m", denoting N as
+	kilobytes and megabytes, respectively.  If N is not provided 'fuse'
+	will default to a size specified in `revcache.ignoresize`, or ~25MB if
+	the config var is not set.
+
+Output
+^^^^^^
+
+This command prints the SHA-1 of the new slice on `stdout`, and information
+about its work on `stderr` -- specifically which files it's removing.
+
+Automation
+^^^^^^^^^^
+
+Set the git configuration variable `gc.revcache` to run 'fuse' on garbage
+collection.  The arguments passed are `fuse \--all \--ignore-size`; i.e. 'gc'
+will keep everything cached into size-regulated slices.
+
+index
+~~~~~
+Regenerate the revision cache index.  If the rev-cache index file associating
+objects with cache slices gets corrupted, lost, or otherwise becomes unusable,
+'index' will quickly regenerate the file.  It's most likely that this won't be
+needed in every day use, as it is targeted towards debugging and development.
+
+alt
+~~~
+Create a cache slice pointer to another slice, identified by its full path:
+`fuse path/to/other/slice`
+
+This command is useful if you have several repositories sharing a common
+history.  Although space requirements for rev-cache are slim anyway, you can in
+this situation reduce it further by using slice pointers, pointing to relavant
+slices in other repositories.  Note that only one level of redirection is
+allowed, and the slice pointer will break if the original slice is removed.
+'fuse' will not touch slice pointers.
+
+NOTES
+-----
+In certain circumstances there may be some inconsistencies with object names
+between cached and non-cached walks.  Specifically, if two objects in a commit
+tree have the same content (= same SHA-1); or if objects of the same SHA-1 are
+introduced independantly in parallel branches.
+
+In the first case rev-cache will use the name of the youngest file, while
+vanilla rev-list will return the name of the entry first encountered in walking
+the tree.  The latter case is a result of rev-cache's internal topological
+ordering: the difference is the same between sorted and unsorted revision walks.
+
+See 'Discussion' for the underlying reasons for the discrepencies.
+
+DISCUSSION
+----------
+For an explanation of the API and its inner workings, see
+link:technical/rev-cache.txt[technical info on rev-cache].
diff --git a/Documentation/technical/rev-cache.txt b/Documentation/technical/rev-cache.txt
new file mode 100644
index 0000000..91fce8b
--- /dev/null
+++ b/Documentation/technical/rev-cache.txt
@@ -0,0 +1,634 @@
+rev-cache
+=========
+
+The revision cache API ('rev-cache') provides a method for efficiently storing
+and accessing commit branch sections.  Such branch slices are defined by a
+series of start/top (interesting) and end/bottom (uninteresting) commits.  Each
+slice contains information on commits in topological order.  Recorded with each
+commit is:
+
+* All intra-slice topological relations, encoded into path "channels" (see
+  'Mechanics' for full explanation).
+* Object meta-data: type, SHA-1, size, date (for commits).
+* Objects introduced by that commit, not present in the its cached parents.
+
+In addition to the API, basic structures are exported for the possibility of
+direct access.
+
+The API
+-------
+You can find the function prototypes in `revision.h`.
+
+Data Structures
+~~~~~~~~~~~~~~~
+The `rev_cache_info` struct holds all the options and flags for the API.
+
+----
+struct rev_cache_info {
+	/* generation flags */
+	unsigned objects : 1,
+		legs : 1,
+		make_index : 1,
+		fuse_me : 1;
+
+	/* index inclusion */
+	unsigned overwrite_all : 1;
+
+	/* traversal flags */
+	unsigned add_to_pending : 1;
+
+	/* fuse options */
+	unsigned int ignore_size;
+
+	/* reserved */
+	struct rev_cache_slice_map *maps,
+		*last_map;
+};
+----
+
+The fields:
+
+`objects`::
+	Add non-commit objects to slice.
+
+`legs`::
+	Ensure end/bottom commits have no children.
+
+`make_index`::
+	Integrate newly-made slice into index.
+
+`fuse_me`::
+	This is specified if a fuse is occuring, and slices are to be reused.
+	This option requires `maps` and `last_maps` to be initialized.
+
+`overwrite_all`::
+	When a cache slice is added to the index, sometimes overlap occures
+	between it and other slices.  Normally, original index entries are kept
+	unless the new entry represents a start commit (older entries are more
+	likely to lead to greater in-slice traversals).  This options overrides
+	that, and updates all entries of the new slice.
+
+`add_to_pending`::
+	Append unique non-commit objects to the `pending` object list in the
+	passed `rev_info` instance.
+
+`add_names`::
+	Include non-commit object names in the pending object entries if
+	`add_to_pending` is set.
+
+`ignore_size`::
+	If non-zero, ignore slices with size greater or equal to this during
+fusion.
+
+`maps`/`last_map`::
+	An array of slice mappings, indexed by their id in the slice index
+	header, to be re-used with `fuse_me`.  `last_map` points to the last
+	mapping used, and should be initialized to 0.
+
+Functions
+~~~~~~~~~
+
+init_rev_cache
+^^^^^^^^^^^^^^
+----
+void init_rev_cache_info(
+	struct rev_cache_info *rci OUT
+)
+----
+
+Initialize `rci` to default options.
+
+make_cache_slice
+^^^^^^^^^^^^^^^^
+----
+int make_cache_slice(
+	struct rev_cache_info *rci IN,
+	struct rev_info *revs IN,
+	struct commit_list **starts IN/OUT,
+	struct commit_list **ends IN/OUT,
+	unsigned char *cache_sha1 OUT
+)
+----
+
+Create a cache slice based on either `revs` (if non-NULL) *or* the `starts` and
+`ends` lists.  The actual list of start and end commits of the slice may be
+different from the parameters, based on what defines the branch segment, and
+this actual list is passed back through `starts` and `ends`.
+
+The cache slice is identified via a SHA-1 generated from the actual start/end
+commit lists.  `cache_sha1`, if non-NULL, can recieve the cache slice name.
+`rci` is used to specify generation options, but can be NULL if you want
+`make_cache_slice` to fall back on defaults.  Returns 0 on success, non-zero on
+failure.
+
+make_cache_index
+^^^^^^^^^^^^^^^^
+----
+int make_cache_index(
+	struct rev_cache_info *rci IN,
+	unsigned char *cache_sha1 IN,
+	int fd IN,
+	unsigned int size IN
+)
+----
+
+Add a slice to the rev-cache index.  `cache_sha1` is the identity hash of the
+cache slice; `fd` is a file descriptor of the cache slice opened with
+read/write privileges (the slice is not actually modified); `size` is the size
+of the cache slice.  Although there are currently no options for index
+updating, `rci` is a placeholder in case of future options.  Note that this
+function is normally called by `make_cache_slice`.  Returns 0 on success,
+non-zero on failure.
+
+open_cache_slice
+^^^^^^^^^^^^^^^^
+----
+int open_cache_slice(
+	unsigned char *sha1 IN,
+	int flags IN
+)
+----
+
+Returns a file descriptor to a cache slice described by `sha1` hash, using
+`flags` as the access mode.  This will follow cache slice pointers to one level
+of indirection.
+
+get_cache_slice
+^^^^^^^^^^^^^^^
+----
+unsigned char *get_cache_slice(
+	struct commit *commit IN
+)
+----
+
+Given a commit object `get_cache_slice` will search the revision cache index
+and return, if found, the cache slice SHA-1.
+
+traverse_cache_slice
+^^^^^^^^^^^^^^^^^^^^
+----
+int traverse_cache_slice(
+	struct rev_info *revs IN/OUT,
+	unsigned char *cache_sha1 IN,
+	struct commit *commit IN,
+	unsigned long *date_so_far IN/OUT,
+	int *slop_so_far IN/OUT,
+	struct commit_list ***queue OUT,
+	struct commit_list **work IN/OUT
+)
+----
+
+Traverse a specified cache slice.  An explanation of the each field:
+
+`revs`::
+	The revision walk instance.  `traverse_cache_slice` uses this for
+	general options (e.g. which objects are included) and slice traversal
+	options (in the `rev_cache_info` field).  If the `add_to_pending`
+	option is specified, non-commit objects are appended to the `pending`
+	object list field.
+
+`cache_sha1`::
+	SHA-1 identifying the cache slice to use.  This can be taken directly
+	from `get_cache_slice`.
+
+`commit`::
+	The current commit object in the revision walk, i.e. the commit which
+	inspired this slice traversal.  Although theoretically redundant in
+	view of the `work` list, this simplifies interaction with normal
+	revision walks, which pop commits from `work` before analyzing them.
+
+`date_so_far`::
+	The date of the oldest encountered interesting commit.  Passing NULL
+	will let `traverse_cache_slice` use defaults.
+
+`slop_so_far`::
+	The `slop` value, a la revision.c.  This is a counter used to determine
+	when to stop traversing, based on how many extra uninteresting commits
+	should be encountered.  NULL will enable defaults, as above.
+
+`queue`::
+	Refers to a pointer to the head of a FIFO commit list, recieving the
+	commits we've seen and added.
+
+`work`::
+	A date-ordered list of commits that have yet to be processed (i.e. seen
+	but not added).  Commits from here present in the slice are removed
+	(and, obviously, used as starting places for traversal), and any end
+	commits encountered are inserted.
+
+starts_from_slices
+^^^^^^^^^^^^^^^^^^
+----
+void starts_from_slices(
+	struct rev_info *revs OUT,
+	unsigned int flags IN,
+	unsigned char *which IN,
+	int n IN
+)
+----
+
+Will mark start-commits in certain rev-cache slices with `flag`, and added them
+to the pending list of `revs`.  If `n` is zero, `starts_from_slices` will use
+all slices.  Otherwise `which` will specify an *unseperated* list of cache
+SHA-1s to use (20 bytes each), and `n` will contain the number of slices (i.e.
+20 * `n` = size of `which`).
+
+fuse_cache_slices
+^^^^^^^^^^^^^^^^^
+----
+int fuse_cache_slices(
+	struct rev_cache_info *rci IN,
+	struct rev_info *revs IN
+)
+----
+
+Generate a slice based on `revs`, replacing all encountered slices with one
+(larger) slice.  The `ignore_size` field in `rci`, if non-zero, will dictate
+which cache slice sizes to ignore in both traversal and replacement.
+
+regenerate_cache_index
+^^^^^^^^^^^^^^^^^^^^^^
+----
+int regenerate_cache_index(
+	struct rev_cache_info *rci IN
+)
+----
+
+Remake the revision cache index, including all the slices.  Currently no
+options in `rci` exist for index (re)generation, but some may develop in the
+future.
+
+to/from_disked_rc_object/index_entry
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+----
+struct rc_object/index_entry *from_disked_rc_object/index_entry(
+	struct rc_object/index_entry_ondisk *src IN,
+	struct rc_object/index_entry *dst OUT
+)
+
+struct rc_object/index_entry_ondisk *to_disked_rc_object/index_entry(
+	struct rc_object/index_entry *src IN,
+	struct rc_object/index_entry_ondisk *dst OUT
+)
+----
+
+Functions to convert between the internal and storage (`_ondisk`) versions of
+object and index entry structures.  These are necessary for direct access to
+the cache slices.  If NULL is provided for `dst` a statically allocated
+structure is used, and a pointer to the struct is returned.  Otherwise the
+functions return `dst`.
+
+Example Usage
+-------------
+
+A few examples to demonstrate usage:
+
+.Creating a slice
+----
+/* pretend you're a porcelain for rev-cache reading from the command line */
+struct rev_info revs;
+struct rev_cache_info rci;
+
+init_revisions(&revs, 0);
+init_rci(&rci);
+
+flags = 0;
+for (i = 1; i < argc; i++) {
+        if (!strcmp(argv[i], "--not"))
+                flags ^= UNINTERESTING;
+        else if(!strcmp(argv[i], "--fresh"))
+                starts_from_slices(&revs, UNINTERESTING, 0, 0);
+        else
+                handle_revision_arg(argv[i], &revs, flags, 1);
+}
+
+/* we want to explicitly set certain options */
+rci.objects = 0;
+
+if (!make_cache_slice(&rci, &revs, 0, 0, cache_sha1))
+        printf("made slice!  it's called %s\n", sha1_to_hex(cache_sha1));
+----
+
+.Traversing a slice
+----
+/* let's say you're walking the tree with a 'work' list of current heads and a
+ * FILO output list 'out' */
+out = 0;
+outp = &out;
+
+while (work) {
+        struct commit *commit = pop_commit(&work);
+        struct object *object = &commit->object;
+        unsigned char *cache_sha1;
+
+        if (cache_sha1 = get_cache_slice(object->sha1)) {
+                /* note that this will instatiate any topo-relations
+                 * as it goes */
+                if (traverse_cache_slice(&revs, cache_sha1,
+                        commit, 0, 0, /* use defaults */
+                        &outp, &work) < 0)
+                        die("I'm overreacting to a non-fatal cache error");
+        } else {
+                struct commit_list *parents = commit->parents;
+
+                while (parents) {
+                        struct commit *p = parents->item;
+                        struct object *po = &p->object;
+
+                        parents = parents->next;
+                        if (po->flags & UNINTERESTING)
+                                continue;
+
+                        if (object->flags & UNINTERESTING)
+                                po->flags |= UNINTERESTING;
+                        else if (po->flags & SEEN)
+                                continue;
+
+                        if (!po->parsed)
+                                parse_commit(p);
+                        insert_by_date(p, &work);
+                }
+
+                if (object->flags & (SEEN | UNINTERESTING) == 0)
+                        outp = &commit_list_insert(commit, outp)->next;
+                object->flags |= SEEN;
+        }
+}
+----
+
+Some Internals
+--------------
+For more advanced usage, the slice and index file(s) may be accessed directly.
+Relavant structures are availabe in `rev-cache.h`.
+
+File Formats
+~~~~~~~~~~~~
+
+Cache Slices
+^^^^^^^^^^^^
+A slice has a basic fixed-size header, followed by a certain number of object
+entries, then a NULL-seperated list of object names.  Commits are sorted in
+topo-order, and each commit entry is followed by the objects added in that
+commit.
+
+----
+         -- +--------------------------------+
+header      | object number, etc...          |
+         -- +--------------------------------+
+commit      | commit info                    |
+entry       | path data                      |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | ...                            |
+         -- +--------------------------------+
+commit      | commit info                    |
+entry       | path data                      |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | ...                            |
+         -- +--------------------------------+
+...         ...
+         -- +--------------------------------+
+name list   | \0some_file_name\0             |
+(note       +--------------------------------+
+preceeding  | another_file\0                 |
+null)       ...                              |
+            +--------------------------------+
+----
+
+Here is the header:
+
+----
+struct rc_cache_slice_header {
+	char signature[8]; /* REVCACHE */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	uint16_t path_nr;
+	uint32_t size;
+
+	unsigned char sha1[20];
+
+	uint32_t names_size;
+};
+----
+
+Explanations:
+
+`signature`::
+	The identifying signature of cache slice file.  Always "REVCACHE".
+`version`::
+	The version number, currently 1.
+`ofs_objects`::
+	The byte offset at which the commit/object listing starts.  Always
+	present at the 10th byte, regardless of file version.
+`object_nr`::
+	The total number of objects (commit + non-commit objects) present in
+	the slice.
+`path_nr`::
+	The total number of paths/channels used in encoding the topological
+	data.  Note that paths are reused (see 'Mechanics'), so there will
+	never be more than a few hundred paths (if that) used.
+`size`::
+	The size of the slice *excluding* the name list.  In other words, the
+	size of the portion mapped to memory.
+`sha1`::
+	The cache slice SHA-1.
+`names_size`::
+	The size of the name list.  `size` + `names_size` = size of slice
+
+Revision Cache Index
+^^^^^^^^^^^^^^^^^^^^
+The index is a single file that associates SHA-1s with cache slices and file
+positions.  It is somewhat similar to pack-file indexes, containing a fanout
+table and a list of index entries sorted by hash.
+
+----
+         -- +--------------------------------+
+header      | object #, cache #, etc.        |
+         -- +--------------------------------+
+sha1s of    | SHA-1                          |
+slices      | ...                            |
+         -- +--------------------------------+
+fanout      | fanout[0x00]                   |
+table       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+            | fanout[0xff]                   |
+         -- +--------------------------------+
+index       | SHA-1 of object                |
+entries     | index of cache slice SHA-1     |
+            | position in cache slice        |
+            +--------------------------------+
+            |                                |
+            ...
+            +--------------------------------+
+----
+
+The header:
+
+----
+struct rc_index_header {
+	char signature[8]; /* REVINDEX */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	unsigned char cache_nr;
+
+	uint32_t max_date;
+};
+----
+
+Explanations:
+
+`signature`::
+	Always "REVINDEX".
+`version`::
+	Version number, currently 1.
+`ofs_objects`::
+	Offset at which the entry objects begin.  This is more obviously useful
+	in the index because the list of slice SHA-1s is variably-sized.
+`object_nr`::
+	Number of index entry objects present.
+`cache_nr`::
+	Number of cache slices to which the index maps, and hence the number of
+slice SHA-1s listed.
+`max_date`::
+	The oldest commit represented in the index.  This is used to help speed
+up lookup times by knowing what range of commits we definitely don't have
+cached.  Normal usage of 'rev-cache' would leave no "holes" in its coverage of
+commit history -- once a commit is cached, everything reachable from it should
+be cached as well.  Most of the time refs are added to rev-cache simultaneous
+as well.  This means that in most situations almost everything <= `max_date`
+will be cached.
+
+Mechanics
+~~~~~~~~~
+
+The most important part of rev-cache is its method of encoding topological
+relations.  To ensure fluid traversal and reconstruction, commits are related
+through high-level "streams"/"channels" rather than individual
+interconnections.  Intuitively, rev-cache stores history the way gitk shows it:
+commits strung up on lines, which interconnect at merges and branches.
+
+Each commit is associated to a given channel/path via a 'path id', and
+variable-length fields govern which paths (if any) are closed or opened at that
+object.  This means that topo-data can be preserved in only a few bytes extra
+per object entry.  Other information stored per entry is the sha-1 hash, type,
+date, size, name, and status in cache slice.  Here is format of an object
+entry, both on-disk and in-memory:
+
+----
+struct object_entry {
+        unsigned type : 3;
+        unsigned is_end : 1;
+        unsigned is_start : 1;
+        unsigned uninteresting : 1;
+        unsigned include : 1;
+        unsigned flags : 1;
+        unsigned char sha1[20];
+
+        unsigned char merge_nr;
+        unsigned char split_nr;
+        unsigned size_size : 3;
+        unsigned name_size : 3;
+
+        uint32_t date;
+        uint16_t path;
+
+        /* merge paths */
+        /* split paths */
+        /* size */
+        /* name index */
+};
+----
+
+An explanation of each field:
+
+`type`::
+	Object type
+`is_end`::
+	The commit has some parents outside the cache slice (all if slice has
+	legs)
+`is_start`::
+	The commit has no children in cache slice
+`uninteresting`::
+	Run-time flag, used in traversal
+`include`::
+	Run-time flag, used in traversal (initialization)
+`flags`::
+	Currently unused, extra bit
+`sha1`::
+	Object SHA-1 hash
+
+`merge_nr`::
+	The number of paths the current channel diverges into; the current path
+	ends upon any merge.
+`split_nr`::
+	The number of paths this commit ends; used on both merging and
+	branching.
+`size_size`::
+	Number of bytes the object size takes up.
+`name_size`::
+	Number of bytes the name index takes up.
+
+`date`::
+	The date of the commit.
+`path`::
+	The path ID of the channel with which this commit is associated.
+
+merge paths::
+	The path IDs (16-bit) that are to be created.  Overflow is not a
+	problem as path IDs are reused, leaving even complicated projects to
+	consume no more than a few hundred IDs.
+split paths::
+	The path IDs (16-bit) that are to be ended.
+size::
+	The size split into the minimum number of bytes.  That is, 1-8 bytes
+	representing the size, least-significant byte first.
+name index::
+	An offset for the null-seperated, object name list at the end of the
+	cache slice.  Also split into the minimum number of bytes.
+
+Each path ID refers to an index in a 'path array', which stores the current
+status (eg. active, interestingness) of each channel.
+
+Due to topo-relations and boundary tracking, all of a commit's parents must be
+encountered before the path is reallocated.  This is achieved by using a
+counter system per merge: starting at the parent number, the counter is
+decremented as each parent is encountered (dictated by 'split paths'); at 0 the
+path is cleared.
+
+Boundary tracking is necessary because non-commits are stored relative to the
+commit in which they were introduced.  If a series of commits is not included
+in the output, the last interesting commit must be parsed manually to ensure
+all objects are accounted for.
+
+To prevent list-objects from recursing into trees that we've already taken care
+of, the flag `FACE_VALUE` is introduced.  An object with this flag is not
+explored (= "taken at face value"), significantly reducing I/O and processing
+time.
+
+Notes
+~~~~~
+
+Due to rev-cache's internal storage format, walking may lead to some
+discrepencies between cached and uncached repositories.  Although noticeable to
+users directly calling rev-list, these are unused or corner cases and
+internally a non-issue.
+
+First note that rev-cache records commits in topological order.  Large portions
+of commit history will already be sorted topologically in the revision walk,
+yielding a different output for unsorted calls to rev-list.  A more obscure
+consquence occurs when two objects of the same SHA-1, but different name, are
+introduced seperately in parallel branches: different names might be shown for
+that object depending on which object entry was encountered first.
+
+A similar disparity arises when two objects of same SHA-1/different name are
+present in the same tree structure.  rev-cache, walking objects as they were
+introduced, lists the youngest file's name; rev-list, walking the full trees
+each commit, shows the first file encountered.
-- 
tg: (e7578ba..) t/revcache/docs (depends on: t/revcache/integration)

^ permalink raw reply related

* Re: [PATCH 2/6 (v4)] basic revision cache system, no integration or features
From: Nick Edelen @ 2009-10-19 20:28 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qlsjtdk399@sirnot.private>

Second in the revision cache series, this particular patch provides:
 - minimal API: caching only commit topo data
 - minimal porcelain: add and walk cache slices
 - appropriate tests

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
removed useless test.  sorry about large size: this is less of a patch and more
an addition of two files.

 Makefile                  |    2 +
 builtin-rev-cache.c       |  207 ++++++++
 builtin.h                 |    1 +
 commit.c                  |    2 +
 git.c                     |    1 +
 rev-cache.c               | 1171 +++++++++++++++++++++++++++++++++++++++++++++
 rev-cache.h               |  107 ++++
 revision.c                |    2 +-
 revision.h                |   26 +-
 t/t6017-rev-cache-list.sh |  106 ++++
 10 files changed, 1623 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 42b7d60..93e20ab 100644
--- a/Makefile
+++ b/Makefile
@@ -538,6 +538,7 @@ LIB_OBJS += refs.o
 LIB_OBJS += remote.o
 LIB_OBJS += replace_object.o
 LIB_OBJS += rerere.o
+LIB_OBJS += rev-cache.o
 LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += server-info.o
@@ -630,6 +631,7 @@ BUILTIN_OBJS += builtin-remote.o
 BUILTIN_OBJS += builtin-replace.o
 BUILTIN_OBJS += builtin-rerere.o
 BUILTIN_OBJS += builtin-reset.o
+BUILTIN_OBJS += builtin-rev-cache.o
 BUILTIN_OBJS += builtin-rev-list.o
 BUILTIN_OBJS += builtin-rev-parse.o
 BUILTIN_OBJS += builtin-revert.o
diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
new file mode 100644
index 0000000..6eb7065
--- /dev/null
+++ b/builtin-rev-cache.c
@@ -0,0 +1,207 @@
+#include "cache.h"
+#include "object.h"
+#include "commit.h"
+#include "diff.h"
+#include "revision.h"
+#include "rev-cache.h"
+
+/* porcelain for rev-cache.c */
+static int handle_add(int argc, const char *argv[]) /* args beyond this command */
+{
+	struct rev_info revs;
+	struct rev_cache_info rci;
+	char dostdin = 0;
+	unsigned int flags = 0;
+	int i, retval;
+	unsigned char cache_sha1[20];
+	struct commit_list *starts = 0, *ends = 0;
+	struct commit *commit;
+
+	init_revisions(&revs, 0);
+	init_rev_cache_info(&rci);
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--stdin"))
+			dostdin = 1;
+		else if (!strcmp(argv[i], "--fresh") || !strcmp(argv[i], "--incremental"))
+			starts_from_slices(&revs, UNINTERESTING);
+		else if (!strcmp(argv[i], "--not"))
+			flags ^= UNINTERESTING;
+		else if (!strcmp(argv[i], "--legs"))
+			rci.legs = 1;
+		else if (!strcmp(argv[i], "--no-objects"))
+			rci.objects = 0;
+		else if (!strcmp(argv[i], "--all")) {
+			const char *args[2];
+			int argn = 0;
+
+			args[argn++] = "rev-list";
+			args[argn++] = "--all";
+			setup_revisions(argn, args, &revs, 0);
+		} else
+			handle_revision_arg(argv[i], &revs, flags, 1);
+	}
+
+	if (dostdin) {
+		char line[1000];
+
+		flags = 0;
+		while (fgets(line, sizeof(line), stdin)) {
+			int len = strlen(line);
+			while (len && (line[len - 1] == '\n' || line[len - 1] == '\r'))
+				line[--len] = 0;
+
+			if (!len)
+				break;
+
+			if (!strcmp(line, "--not"))
+				flags ^= UNINTERESTING;
+			else
+				handle_revision_arg(line, &revs, flags, 1);
+		}
+	}
+
+	retval = make_cache_slice(&rci, &revs, &starts, &ends, cache_sha1);
+	if (retval < 0)
+		return retval;
+
+	printf("%s\n", sha1_to_hex(cache_sha1));
+
+	fprintf(stderr, "endpoints:\n");
+	while ((commit = pop_commit(&starts)))
+		fprintf(stderr, "S %s\n", sha1_to_hex(commit->object.sha1));
+	while ((commit = pop_commit(&ends)))
+		fprintf(stderr, "E %s\n", sha1_to_hex(commit->object.sha1));
+
+	return 0;
+}
+
+static int handle_walk(int argc, const char *argv[])
+{
+	struct commit *commit;
+	struct rev_info revs;
+	struct commit_list *queue, *work, **qp;
+	unsigned char *sha1p, *sha1pt;
+	unsigned long date = 0;
+	unsigned int flags = 0;
+	int retval, slop = 5, i;
+
+	init_revisions(&revs, 0);
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--not"))
+			flags ^= UNINTERESTING;
+		else if (!strcmp(argv[i], "--objects"))
+			revs.tree_objects = revs.blob_objects = 1;
+		else
+			handle_revision_arg(argv[i], &revs, flags, 1);
+	}
+
+	work = 0;
+	sha1p = 0;
+	for (i = 0; i < revs.pending.nr; i++) {
+		commit = lookup_commit(revs.pending.objects[i].item->sha1);
+
+		sha1pt = get_cache_slice(commit);
+		if (!sha1pt)
+			die("%s: not in a cache slice", sha1_to_hex(commit->object.sha1));
+
+		if (!i)
+			sha1p = sha1pt;
+		else if (sha1p != sha1pt)
+			die("walking porcelain is /per/ cache slice; commits cannot be spread out amoung several");
+
+		insert_by_date(commit, &work);
+	}
+
+	if (!sha1p)
+		die("nothing to traverse!");
+
+	queue = 0;
+	qp = &queue;
+	commit = pop_commit(&work);
+	retval = traverse_cache_slice(&revs, sha1p, commit, &date, &slop, &qp, &work);
+	if (retval < 0)
+		return retval;
+
+	fprintf(stderr, "queue:\n");
+	while ((commit = pop_commit(&queue)) != 0) {
+		printf("%s\n", sha1_to_hex(commit->object.sha1));
+	}
+
+	fprintf(stderr, "work:\n");
+	while ((commit = pop_commit(&work)) != 0) {
+		printf("%s\n", sha1_to_hex(commit->object.sha1));
+	}
+
+	fprintf(stderr, "pending:\n");
+	for (i = 0; i < revs.pending.nr; i++) {
+		struct object *obj = revs.pending.objects[i].item;
+
+		/* unfortunately, despite our careful generation, object duplication *is* a possibility...
+		 * (eg. same object introduced into two different branches) */
+		if (obj->flags & SEEN)
+			continue;
+
+		printf("%s\n", sha1_to_hex(revs.pending.objects[i].item->sha1));
+		obj->flags |= SEEN;
+	}
+
+	return 0;
+}
+
+static int handle_help(void)
+{
+	char *usage = "\
+usage:\n\
+git-rev-cache COMMAND [options] [<commit-id>...]\n\
+commands:\n\
+  add    - add revisions to the cache.  reads commit ids from stdin, \n\
+           formatted as: START START ... --not END END ...\n\
+           options:\n\
+            --all                  use all branch heads as starts\n\
+            --fresh/--incremental  exclude everything already in a cache slice\n\
+            --stdin                also read commit ids from stdin (same form\n\
+                                   as cmd)\n\
+            --legs                 ensure branch is entirely self-contained\n\
+            --no-objects           don't add non-commit objects to slice\n\
+  walk   - walk a cache slice based on set of commits; formatted as add\n\
+           options:\n\
+           --objects               include non-commit objects in traversals\n\
+  fuse   - coalesce cache slices into a single cache.\n\
+           options:\n\
+            --all                  include all objects in repository\n\
+            --no-objects           don't add non-commit objects to slice\n\
+            --ignore-size[=N]      ignore slices of size >= N; defaults to ~5MB\n\
+  index  - regnerate the cache index.";
+
+	puts(usage);
+
+	return 0;
+}
+
+int cmd_rev_cache(int argc, const char *argv[], const char *prefix)
+{
+	const char *arg;
+	int r;
+
+	git_config(git_default_config, NULL);
+
+	if (argc > 1)
+		arg = argv[1];
+	else
+		arg = "";
+
+	argc -= 2;
+	argv += 2;
+	if (!strcmp(arg, "add"))
+		r = handle_add(argc, argv);
+	else if (!strcmp(arg, "walk"))
+		r = handle_walk(argc, argv);
+	else
+		return handle_help();
+
+	fprintf(stderr, "final return value: %d\n", r);
+
+	return 0;
+}
diff --git a/builtin.h b/builtin.h
index a2174dc..2f89feb 100644
--- a/builtin.h
+++ b/builtin.h
@@ -86,6 +86,7 @@ extern int cmd_remote(int argc, const char **argv, const char *prefix);
 extern int cmd_config(int argc, const char **argv, const char *prefix);
 extern int cmd_rerere(int argc, const char **argv, const char *prefix);
 extern int cmd_reset(int argc, const char **argv, const char *prefix);
+extern int cmd_rev_cache(int argc, const char **argv, const char *prefix);
 extern int cmd_rev_list(int argc, const char **argv, const char *prefix);
 extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
diff --git a/commit.c b/commit.c
index fedbd5e..61d83c6 100644
--- a/commit.c
+++ b/commit.c
@@ -252,6 +252,8 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
 	item->tree = lookup_tree(parent);
 	bufptr += 46; /* "tree " + "hex sha1" + "\n" */
 	pptr = &item->parents;
+	while (pop_commit(pptr))
+		; /* clear anything from cache */
 
 	graft = lookup_commit_graft(item->object.sha1);
 	while (bufptr + 48 < tail && !memcmp(bufptr, "parent ", 7)) {
diff --git a/git.c b/git.c
index bd2c5fe..c016f50 100644
--- a/git.c
+++ b/git.c
@@ -345,6 +345,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "repo-config", cmd_config },
 		{ "rerere", cmd_rerere, RUN_SETUP },
 		{ "reset", cmd_reset, RUN_SETUP },
+		{ "rev-cache", cmd_rev_cache, RUN_SETUP },
 		{ "rev-list", cmd_rev_list, RUN_SETUP },
 		{ "rev-parse", cmd_rev_parse },
 		{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
diff --git a/rev-cache.c b/rev-cache.c
new file mode 100644
index 0000000..8951cdf
--- /dev/null
+++ b/rev-cache.c
@@ -0,0 +1,1171 @@
+#include "cache.h"
+#include "object.h"
+#include "commit.h"
+#include "tree.h"
+#include "tree-walk.h"
+#include "blob.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "rev-cache.h"
+#include "run-command.h"
+
+/* list resembles pack index format */
+static uint32_t fanout[0xff + 2];
+
+static unsigned char *idx_map;
+static int idx_size;
+static struct rc_index_header idx_head;
+static unsigned char *idx_caches;
+static char no_idx;
+
+static struct strbuf *acc_buffer;
+
+#define SLOP			5
+
+/* initialization */
+
+struct rc_index_entry *from_disked_rc_index_entry(struct rc_index_entry_ondisk *src, struct rc_index_entry *dst)
+{
+	static struct rc_index_entry entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->sha1 = src->sha1;
+	dst->is_start = !!(src->flags & 0x80);
+	dst->cache_index = src->flags & 0x7f;
+	dst->pos = ntohl(src->pos);
+
+	return dst;
+}
+
+struct rc_index_entry_ondisk *to_disked_rc_index_entry(struct rc_index_entry *src, struct rc_index_entry_ondisk *dst)
+{
+	static struct rc_index_entry_ondisk entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	if (dst->sha1 != src->sha1)
+		hashcpy(dst->sha1, src->sha1);
+	dst->flags = (unsigned char)src->is_start << 7 | (unsigned char)src->cache_index;
+	dst->pos = htonl(src->pos);
+
+	return dst;
+}
+
+struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondisk *src, struct rc_object_entry *dst)
+{
+	static struct rc_object_entry entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->type = src->flags >> 5;
+	dst->is_end = !!(src->flags & 0x10);
+	dst->is_start = !!(src->flags & 0x08);
+	dst->uninteresting = !!(src->flags & 0x04);
+	dst->include = !!(src->flags & 0x02);
+	dst->flag = !!(src->flags & 0x01);
+
+	dst->sha1 = src->sha1;
+	dst->merge_nr = src->merge_nr;
+	dst->split_nr = src->split_nr;
+
+	dst->size_size = src->sizes >> 5;
+	dst->padding = src->sizes & 0x1f;
+
+	dst->date = ntohl(src->date);
+	dst->path = ntohs(src->path);
+
+	return dst;
+}
+
+struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry *src, struct rc_object_entry_ondisk *dst)
+{
+	static struct rc_object_entry_ondisk entry[4];
+	static int cur;
+
+	if (!dst)
+		dst = &entry[cur++ & 0x3];
+
+	dst->flags  = (unsigned char)src->type << 5;
+	dst->flags |= (unsigned char)src->is_end << 4;
+	dst->flags |= (unsigned char)src->is_start << 3;
+	dst->flags |= (unsigned char)src->uninteresting << 2;
+	dst->flags |= (unsigned char)src->include << 1;
+	dst->flags |= (unsigned char)src->flag;
+
+	if (dst->sha1 != src->sha1)
+		hashcpy(dst->sha1, src->sha1);
+	dst->merge_nr = src->merge_nr;
+	dst->split_nr = src->split_nr;
+
+	dst->sizes  = (unsigned char)src->size_size << 5;
+	dst->sizes |= (unsigned char)src->padding;
+
+	dst->date = htonl(src->date);
+	dst->path = htons(src->path);
+
+	return dst;
+}
+
+static int get_index_head(unsigned char *map, int len, struct rc_index_header *head, uint32_t *fanout, unsigned char **caches)
+{
+	struct rc_index_header whead;
+	int i, index = sizeof(struct rc_index_header);
+
+	memcpy(&whead, map, sizeof(struct rc_index_header));
+	if (memcmp(whead.signature, "REVINDEX", 8) || whead.version != SUPPORTED_REVINDEX_VERSION)
+		return -1;
+
+	memcpy(head->signature, "REVINDEX", 8);
+	head->version = whead.version;
+	head->ofs_objects = ntohl(whead.ofs_objects);
+	head->object_nr = ntohl(whead.object_nr);
+	head->cache_nr = whead.cache_nr;
+	head->max_date = ntohl(whead.max_date);
+
+	if (len < index + head->cache_nr * 20 + 0x100 * sizeof(uint32_t))
+		return -2;
+
+	*caches = xmalloc(head->cache_nr * 20);
+	memcpy(*caches, map + index, head->cache_nr * 20);
+	index += head->cache_nr * 20;
+
+	memcpy(fanout, map + index, 0x100 * sizeof(uint32_t));
+	for (i = 0; i <= 0xff; i++)
+		fanout[i] = ntohl(fanout[i]);
+	fanout[0x100] = len;
+
+	return 0;
+}
+
+/* added in init_index */
+static void cleanup_cache_slices(void)
+{
+	if (idx_map) {
+		free(idx_caches);
+		munmap(idx_map, idx_size);
+		idx_map = 0;
+	}
+
+}
+
+static int init_index(void)
+{
+	int fd;
+	struct stat fi;
+
+	fd = open(git_path("rev-cache/index"), O_RDONLY);
+	if (fd == -1 || fstat(fd, &fi))
+		goto end;
+	if (fi.st_size < sizeof(struct rc_index_header))
+		goto end;
+
+	idx_size = fi.st_size;
+	idx_map = xmmap(0, idx_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+	if (idx_map == MAP_FAILED)
+		goto end;
+	if (get_index_head(idx_map, fi.st_size, &idx_head, fanout, &idx_caches))
+		goto end;
+
+	atexit(cleanup_cache_slices);
+
+	return 0;
+
+end:
+	idx_map = 0;
+	no_idx = 1;
+	return -1;
+}
+
+/* this assumes index is already loaded */
+static struct rc_index_entry_ondisk *search_index_1(unsigned char *sha1)
+{
+	int start, end, starti, endi, i, len, r;
+	struct rc_index_entry_ondisk *ie;
+
+	if (!idx_map)
+		return 0;
+
+	/* binary search */
+	start = fanout[(int)sha1[0]];
+	end = fanout[(int)sha1[0] + 1];
+	len = (end - start) / sizeof(struct rc_index_entry_ondisk);
+	if (!len || len * sizeof(struct rc_index_entry_ondisk) != end - start)
+		return 0;
+
+	starti = 0;
+	endi = len - 1;
+	for (;;) {
+		i = (endi + starti) / 2;
+		ie = (struct rc_index_entry_ondisk *)(idx_map + start + i * sizeof(struct rc_index_entry_ondisk));
+		r = hashcmp(sha1, ie->sha1);
+
+		if (r) {
+			if (starti + 1 == endi) {
+				starti++;
+				continue;
+			} else if (starti == endi)
+				break;
+
+			if (r > 0)
+				starti = i;
+			else /* r < 0 */
+				endi = i;
+		} else
+			return ie;
+	}
+
+	return 0;
+}
+
+static struct rc_index_entry *search_index(unsigned char *sha1)
+{
+	struct rc_index_entry_ondisk *ied = search_index_1(sha1);
+
+	if (ied)
+		return from_disked_rc_index_entry(ied, 0);
+
+	return 0;
+}
+
+unsigned char *get_cache_slice(struct commit *commit)
+{
+	struct rc_index_entry *ie;
+
+	if (!idx_map) {
+		if (no_idx)
+			return 0;
+		init_index();
+	}
+
+	if (commit->date > idx_head.max_date)
+		return 0;
+
+	ie = search_index(commit->object.sha1);
+	if (ie && ie->cache_index < idx_head.cache_nr)
+		return idx_caches + ie->cache_index * 20;
+
+	return 0;
+}
+
+
+/* traversal */
+
+static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work)
+{
+	struct rc_index_entry *iep;
+	struct rc_object_entry *oep;
+	struct commit_list *prev, *wp, **wpp;
+	int retval;
+
+	iep = search_index(commit->object.sha1), 0;
+	oep = RC_OBTAIN_OBJECT_ENTRY(map + iep->pos);
+
+	/* the .uniniteresting bit isn't strictly necessary, as we check the object during traversal as well,
+	 * but we might as well initialize it while we're at it */
+	oep->include = 1;
+	oep->uninteresting = !!(commit->object.flags & UNINTERESTING);
+	to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
+	retval = iep->pos;
+
+	/* include any others in the work array */
+	prev = 0;
+	wpp = work;
+	wp = *work;
+	while (wp) {
+		struct object *obj = &wp->item->object;
+		struct commit *co;
+
+		/* is this in our cache slice? */
+		iep = search_index(obj->sha1);
+		if (!iep || hashcmp(idx_caches + iep->cache_index * 20, head->sha1)) {
+			prev = wp;
+			wp = wp->next;
+			wpp = &wp;
+			continue;
+		}
+
+		if (iep->pos < retval)
+			retval = iep->pos;
+
+		oep = RC_OBTAIN_OBJECT_ENTRY(map + iep->pos);
+
+		/* mark this for later */
+		oep->include = 1;
+		oep->uninteresting = !!(obj->flags & UNINTERESTING);
+		to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
+
+		/* remove from work list */
+		co = pop_commit(wpp);
+		wp = *wpp;
+		if (prev)
+			prev->next = wp;
+	}
+
+	return retval;
+}
+
+#define IPATH				0x40
+#define UPATH				0x80
+
+#define GET_COUNT(x)		((x) & 0x3f)
+#define SET_COUNT(x, s)		((x) = ((x) & ~0x3f) | ((s) & 0x3f))
+
+static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *map,
+	struct rev_info *revs, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work)
+{
+	struct commit_list *insert_cache = 0;
+	struct commit **last_objects, *co;
+	int i, total_path_nr = head->path_nr, retval = -1;
+	char consume_children = 0;
+	unsigned char *paths;
+
+	i = setup_traversal(head, map, commit, work);
+	if (i < 0)
+		return -1;
+
+	paths = xcalloc(total_path_nr, sizeof(uint16_t));
+	last_objects = xcalloc(total_path_nr, sizeof(struct commit *));
+
+	/* i already set */
+	while (i < head->size) {
+		struct rc_object_entry *entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+		int path = entry->path;
+		struct object *obj;
+		int index = i;
+
+		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
+
+		/* add extra objects if necessary */
+		if (entry->type != OBJ_COMMIT)
+			continue;
+		else
+			consume_children = 0;
+
+		if (path >= total_path_nr)
+			goto end;
+
+		/* in one of our branches?
+		 * uninteresting trumps interesting */
+		if (entry->include)
+			paths[path] |= entry->uninteresting ? UPATH : IPATH;
+		else if (!paths[path])
+			continue;
+
+		/* date stuff */
+		if (revs->max_age != -1 && entry->date < revs->max_age)
+			paths[path] |= UPATH;
+
+		/* lookup object */
+		co = lookup_commit(entry->sha1);
+		obj = &co->object;
+
+		if (obj->flags & UNINTERESTING)
+			paths[path] |= UPATH;
+
+		if ((paths[path] & IPATH) && (paths[path] & UPATH)) {
+			paths[path] = UPATH;
+
+			/* mark edge */
+			if (last_objects[path]) {
+				parse_commit(last_objects[path]);
+
+				last_objects[path]->object.flags &= ~FACE_VALUE;
+				last_objects[path] = 0;
+			}
+		}
+
+		/* now we gotta re-assess the whole interesting thing... */
+		entry->uninteresting = !!(paths[path] & UPATH);
+
+		/* first close paths */
+		if (entry->split_nr) {
+			int j, off = index + sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE(entry->merge_nr);
+
+			for (j = 0; j < entry->split_nr; j++) {
+				unsigned short p = ntohs(*(uint16_t *)(map + off + RC_PATH_SIZE(j)));
+
+				if (p >= total_path_nr)
+					goto end;
+
+				/* boundary commit? */
+				if ((paths[p] & IPATH) && entry->uninteresting) {
+					if (last_objects[p]) {
+						parse_commit(last_objects[p]);
+
+						last_objects[p]->object.flags &= ~FACE_VALUE;
+						last_objects[p] = 0;
+					}
+				} else if (last_objects[p] && !last_objects[p]->object.parsed)
+					commit_list_insert(co, &last_objects[p]->parents);
+
+				/* can't close a merge path until all are parents have been encountered */
+				if (GET_COUNT(paths[p])) {
+					SET_COUNT(paths[p], GET_COUNT(paths[p]) - 1);
+
+					if (GET_COUNT(paths[p]))
+						continue;
+				}
+
+				paths[p] = 0;
+				last_objects[p] = 0;
+			}
+		}
+
+		/* make topo relations */
+		if (last_objects[path] && !last_objects[path]->object.parsed)
+			commit_list_insert(co, &last_objects[path]->parents);
+
+		/* initialize commit */
+		if (!entry->is_end) {
+			co->date = entry->date;
+			obj->flags |= ADDED | FACE_VALUE;
+		} else
+			parse_commit(co);
+
+		obj->flags |= SEEN;
+
+		if (entry->uninteresting)
+			obj->flags |= UNINTERESTING;
+
+		/* we need to know what the edges are */
+		last_objects[path] = co;
+
+		/* add to list */
+		if (!(obj->flags & UNINTERESTING) || revs->show_all) {
+			if (entry->is_end)
+				insert_by_date_cached(co, work, insert_cache, &insert_cache);
+			else
+				*queue = &commit_list_insert(co, *queue)->next;
+
+			/* add children to list as well */
+			if (obj->flags & UNINTERESTING)
+				consume_children = 0;
+			else
+				consume_children = 1;
+		}
+
+		/* open parents */
+		if (entry->merge_nr) {
+			int j, off = index + sizeof(struct rc_object_entry_ondisk);
+			char flag = entry->uninteresting ? UPATH : IPATH;
+
+			for (j = 0; j < entry->merge_nr; j++) {
+				unsigned short p = ntohs(*(uint16_t *)(map + off + RC_PATH_SIZE(j)));
+
+				if (p >= total_path_nr)
+					goto end;
+
+				if (paths[p] & flag)
+					continue;
+
+				paths[p] |= flag;
+			}
+
+			/* make sure we don't use this path before all our parents have had their say */
+			SET_COUNT(paths[path], entry->merge_nr);
+		}
+
+	}
+
+	retval = 0;
+
+end:
+	free(paths);
+	free(last_objects);
+
+	return retval;
+}
+
+static int get_cache_slice_header(unsigned char *cache_sha1, unsigned char *map, int len, struct rc_slice_header *head)
+{
+	int t;
+
+	memcpy(head, map, sizeof(struct rc_slice_header));
+	head->ofs_objects = ntohl(head->ofs_objects);
+	head->object_nr = ntohl(head->object_nr);
+	head->size = ntohl(head->size);
+	head->path_nr = ntohs(head->path_nr);
+
+	if (memcmp(head->signature, "REVCACHE", 8))
+		return -1;
+	if (head->version != SUPPORTED_REVCACHE_VERSION)
+		return -2;
+	if (hashcmp(head->sha1, cache_sha1))
+		return -3;
+	t = sizeof(struct rc_slice_header);
+	if (t != head->ofs_objects || t >= len)
+		return -4;
+
+	head->size = len;
+
+	return 0;
+}
+
+int traverse_cache_slice(struct rev_info *revs,
+	unsigned char *cache_sha1, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work)
+{
+	int fd = -1, retval = -3;
+	struct stat fi;
+	struct rc_slice_header head;
+	struct rev_cache_info *rci;
+	unsigned char *map = MAP_FAILED;
+
+	/* the index should've been loaded already to find cache_sha1, but it's good
+	 * to be absolutely sure... */
+	if (!idx_map)
+		init_index();
+	if (!idx_map)
+		return -1;
+
+	/* load options */
+	rci = &revs->rev_cache_info;
+
+	memset(&head, 0, sizeof(struct rc_slice_header));
+
+	fd = open(git_path("rev-cache/%s", sha1_to_hex(cache_sha1)), O_RDONLY);
+	if (fd == -1)
+		goto end;
+	if (fstat(fd, &fi) || fi.st_size < sizeof(struct rc_slice_header))
+		goto end;
+
+	map = xmmap(0, fi.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED)
+		goto end;
+	if (get_cache_slice_header(cache_sha1, map, fi.st_size, &head))
+		goto end;
+
+	retval = traverse_cache_slice_1(&head, map, revs, commit, date_so_far, slop_so_far, queue, work);
+
+end:
+	if (map != MAP_FAILED)
+		munmap(map, fi.st_size);
+	if (fd != -1)
+		close(fd);
+
+	return retval;
+}
+
+
+
+/* generation */
+
+struct path_track {
+	struct commit *commit;
+	int path; /* for keeping track of children */
+
+	struct path_track *next, *prev;
+};
+
+static unsigned char *paths;
+static int path_nr = 1, path_sz;
+
+static struct path_track *path_track;
+static struct path_track *path_track_alloc;
+
+#define PATH_IN_USE			0x80 /* biggest bit we can get as a char */
+
+static int get_new_path(void)
+{
+	int i;
+
+	for (i = 1; i < path_nr; i++)
+		if (!paths[i])
+			break;
+
+	if (i == path_nr) {
+		if (path_nr >= path_sz) {
+			path_sz += 50;
+			paths = xrealloc(paths, path_sz);
+			memset(paths + path_sz - 50, 0, 50);
+		}
+		path_nr++;
+	}
+
+	paths[i] = PATH_IN_USE;
+	return i;
+}
+
+static void remove_path_track(struct path_track **ppt, char total_free)
+{
+	struct path_track *t = *ppt;
+
+	if (t->next)
+		t->next->prev = t->prev;
+	if (t->prev)
+		t->prev->next = t->next;
+
+	t = t->next;
+
+	if (total_free)
+		free(*ppt);
+	else {
+		(*ppt)->next = path_track_alloc;
+		path_track_alloc = *ppt;
+	}
+
+	*ppt = t;
+}
+
+static struct path_track *make_path_track(struct path_track **head, struct commit *commit)
+{
+	struct path_track *pt;
+
+	if (path_track_alloc) {
+		pt = path_track_alloc;
+		path_track_alloc = pt->next;
+	} else
+		pt = xmalloc(sizeof(struct path_track));
+
+	memset(pt, 0, sizeof(struct path_track));
+	pt->commit = commit;
+
+	pt->next = *head;
+	if (*head)
+		(*head)->prev = pt;
+	*head = pt;
+
+	return pt;
+}
+
+static void add_path_to_track(struct commit *commit, int path)
+{
+	make_path_track(&path_track, commit);
+	path_track->path = path;
+}
+
+static void handle_paths(struct commit *commit, struct rc_object_entry *object, struct strbuf *merge_str, struct strbuf *split_str)
+{
+	int child_nr, parent_nr, open_parent_nr, this_path;
+	struct commit_list *list;
+	struct commit *first_parent;
+	struct path_track **ppt, *pt;
+
+	/* we can only re-use a closed path once all it's children have been encountered,
+	 * as we need to keep track of commit boundaries */
+	ppt = &path_track;
+	pt = *ppt;
+	child_nr = 0;
+	while (pt) {
+		if (pt->commit == commit) {
+			uint16_t write_path;
+
+			if (paths[pt->path] != PATH_IN_USE)
+				paths[pt->path]--;
+
+			/* make sure we can handle this */
+			child_nr++;
+			if (child_nr > 0x7f)
+				die("%s: too many branches!  rev-cache can only handle %d parents/children per commit",
+					sha1_to_hex(object->sha1), 0x7f);
+
+			/* add to split list */
+			object->split_nr++;
+			write_path = htons((uint16_t)pt->path);
+			strbuf_add(split_str, &write_path, sizeof(uint16_t));
+
+			remove_path_track(ppt, 0);
+			pt = *ppt;
+		} else {
+			pt = pt->next;
+			ppt = &pt;
+		}
+	}
+
+	/* initialize our self! */
+	if (!commit->indegree) {
+		commit->indegree = get_new_path();
+		object->is_start = 1;
+	}
+
+	this_path = commit->indegree;
+	paths[this_path] = PATH_IN_USE;
+	object->path = this_path;
+
+	/* count interesting parents */
+	parent_nr = open_parent_nr = 0;
+	first_parent = 0;
+	for (list = commit->parents; list; list = list->next) {
+		if (list->item->object.flags & UNINTERESTING) {
+			object->is_end = 1;
+			continue;
+		}
+
+		parent_nr++;
+		if (!list->item->indegree)
+			open_parent_nr++;
+		if (!first_parent)
+			first_parent = list->item;
+	}
+
+	if (!parent_nr)
+		return;
+
+	if (parent_nr == 1 && open_parent_nr == 1) {
+		first_parent->indegree = this_path;
+		return;
+	}
+
+	/* bail out on obscene parent/child #s */
+	if (parent_nr > 0x7f)
+		die("%s: too many parents in merge!  rev-cache can only handle %d parents/children per commit",
+			sha1_to_hex(object->sha1), 0x7f);
+
+	/* make merge list */
+	object->merge_nr = parent_nr;
+	paths[this_path] = parent_nr;
+
+	for (list = commit->parents; list; list = list->next) {
+		struct commit *p = list->item;
+		uint16_t write_path;
+
+		if (p->object.flags & UNINTERESTING)
+			continue;
+
+		/* unfortunately due to boundary tracking we can't re-use merge paths
+		 * (unable to guarantee last parent path = this -> last won't always be able to
+		 * set this as a boundary object */
+		if (!p->indegree)
+			p->indegree = get_new_path();
+
+		write_path = htons((uint16_t)p->indegree);
+		strbuf_add(merge_str, &write_path, sizeof(uint16_t));
+
+		/* make sure path is properly ended */
+		add_path_to_track(p, this_path);
+	}
+
+}
+
+
+static void add_object_entry(const unsigned char *sha1, int type, struct rc_object_entry *nothisone,
+	struct strbuf *merge_str, struct strbuf *split_str)
+{
+	struct rc_object_entry object;
+
+	if (!nothisone) {
+		memset(&object, 0, sizeof(object));
+		object.sha1 = (unsigned char *)sha1;
+		object.type = type;
+
+		if (merge_str)
+			object.merge_nr = merge_str->len / sizeof(uint16_t);
+		if (split_str)
+			object.split_nr = split_str->len / sizeof(uint16_t);
+
+		nothisone = &object;
+	}
+
+	strbuf_add(acc_buffer, to_disked_rc_object_entry(nothisone, 0), sizeof(struct rc_object_entry_ondisk));
+
+	if (merge_str && merge_str->len)
+		strbuf_add(acc_buffer, merge_str->buf, merge_str->len);
+	if (split_str && split_str->len)
+		strbuf_add(acc_buffer, split_str->buf, split_str->len);
+
+}
+
+static void init_revcache_directory(void)
+{
+	struct stat fi;
+
+	if (stat(git_path("rev-cache"), &fi) || !S_ISDIR(fi.st_mode))
+		if (mkdir(git_path("rev-cache"), 0777))
+			die("can't make rev-cache directory");
+
+}
+
+void init_rev_cache_info(struct rev_cache_info *rci)
+{
+	rci->objects = 1;
+	rci->legs = 0;
+	rci->make_index = 1;
+
+	rci->add_to_pending = 1;
+
+	rci->ignore_size = 0;
+}
+
+void maybe_fill_with_defaults(struct rev_cache_info *rci)
+{
+	static struct rev_cache_info def_rci;
+
+	if (rci)
+		return;
+
+	init_rev_cache_info(&def_rci);
+	rci = &def_rci;
+}
+
+int make_cache_slice(struct rev_cache_info *rci,
+	struct rev_info *revs, struct commit_list **starts, struct commit_list **ends,
+	unsigned char *cache_sha1)
+{
+	struct rev_info therevs;
+	struct strbuf buffer, startlist, endlist;
+	struct rc_slice_header head;
+	struct commit *commit;
+	unsigned char sha1[20];
+	struct strbuf merge_paths, split_paths;
+	int object_nr, total_sz, fd;
+	char file[PATH_MAX], *newfile;
+	struct rev_cache_info *trci;
+	git_SHA_CTX ctx;
+
+	maybe_fill_with_defaults(rci);
+
+	init_revcache_directory();
+	strcpy(file, git_path("rev-cache/XXXXXX"));
+	fd = xmkstemp(file);
+
+	strbuf_init(&buffer, 0);
+	strbuf_init(&startlist, 0);
+	strbuf_init(&endlist, 0);
+	strbuf_init(&merge_paths, 0);
+	strbuf_init(&split_paths, 0);
+	acc_buffer = &buffer;
+
+	if (!revs) {
+		revs = &therevs;
+		init_revisions(revs, 0);
+
+		/* we're gonna assume no one else has already traversed this... */
+		while ((commit = pop_commit(starts)))
+			add_pending_object(revs, &commit->object, 0);
+
+		while ((commit = pop_commit(ends))) {
+			commit->object.flags |= UNINTERESTING;
+			add_pending_object(revs, &commit->object, 0);
+		}
+	}
+
+	/* write head placeholder */
+	memset(&head, 0, sizeof(head));
+	head.ofs_objects = htonl(sizeof(head));
+	xwrite(fd, &head, sizeof(head));
+
+	/* init revisions! */
+	revs->tree_objects = 1;
+	revs->blob_objects = 1;
+	revs->topo_order = 1;
+	revs->lifo = 1;
+
+	/* re-use info from other caches if possible */
+	trci = &revs->rev_cache_info;
+	init_rev_cache_info(trci);
+	trci->add_to_pending = 0;
+
+	setup_revisions(0, 0, revs, 0);
+	if (prepare_revision_walk(revs))
+		die("died preparing revision walk");
+
+	object_nr = total_sz = 0;
+	while ((commit = get_revision(revs)) != 0) {
+		struct rc_object_entry object;
+
+		strbuf_setlen(&merge_paths, 0);
+		strbuf_setlen(&split_paths, 0);
+
+		memset(&object, 0, sizeof(object));
+		object.type = OBJ_COMMIT;
+		object.date = commit->date;
+		object.sha1 = commit->object.sha1;
+
+		handle_paths(commit, &object, &merge_paths, &split_paths);
+
+		if (object.is_end) {
+			strbuf_add(&endlist, object.sha1, 20);
+			if (ends)
+				commit_list_insert(commit, ends);
+		}
+		/* the two *aren't* mutually exclusive */
+		if (object.is_start) {
+			strbuf_add(&startlist, object.sha1, 20);
+			if (starts)
+				commit_list_insert(commit, starts);
+		}
+
+		commit->indegree = 0;
+
+		add_object_entry(0, 0, &object, &merge_paths, &split_paths);
+		object_nr++;
+
+		/* print every ~1MB or so */
+		if (buffer.len > 1000000) {
+			write_in_full(fd, buffer.buf, buffer.len);
+			total_sz += buffer.len;
+
+			strbuf_setlen(&buffer, 0);
+		}
+	}
+
+	if (buffer.len) {
+		write_in_full(fd, buffer.buf, buffer.len);
+		total_sz += buffer.len;
+	}
+
+	/* go ahead a free some stuff... */
+	strbuf_release(&buffer);
+	strbuf_release(&merge_paths);
+	strbuf_release(&split_paths);
+	if (path_sz)
+		free(paths);
+	while (path_track_alloc)
+		remove_path_track(&path_track_alloc, 1);
+
+	/* the meaning of the hash name is more or less irrelevant, it's the uniqueness that matters */
+	strbuf_add(&endlist, startlist.buf, startlist.len);
+	git_SHA1_Init(&ctx);
+	git_SHA1_Update(&ctx, endlist.buf, endlist.len);
+	git_SHA1_Final(sha1, &ctx);
+
+	/* now actually initialize header */
+	strcpy(head.signature, "REVCACHE");
+	head.version = SUPPORTED_REVCACHE_VERSION;
+
+	head.object_nr = htonl(object_nr);
+	head.size = htonl(ntohl(head.ofs_objects) + total_sz);
+	head.path_nr = htons(path_nr);
+	hashcpy(head.sha1, sha1);
+
+	/* some info! */
+	fprintf(stderr, "objects: %d\n", object_nr);
+	fprintf(stderr, "paths: %d\n", path_nr);
+
+	lseek(fd, 0, SEEK_SET);
+	xwrite(fd, &head, sizeof(head));
+
+	if (rci->make_index && make_cache_index(rci, sha1, fd, ntohl(head.size)) < 0)
+		die("can't update index");
+
+	close(fd);
+
+	newfile = git_path("rev-cache/%s", sha1_to_hex(sha1));
+	if (rename(file, newfile))
+		die("can't move temp file");
+
+	/* let our caller know what we've just made */
+	if (cache_sha1)
+		hashcpy(cache_sha1, sha1);
+
+	strbuf_release(&endlist);
+	strbuf_release(&startlist);
+
+	return 0;
+}
+
+
+static int index_sort_hash(const void *a, const void *b)
+{
+	return hashcmp(((struct rc_index_entry_ondisk *)a)->sha1, ((struct rc_index_entry_ondisk *)b)->sha1);
+}
+
+static int write_cache_index(struct strbuf *body)
+{
+	struct rc_index_header whead;
+	struct lock_file *lk;
+	int fd, i;
+
+	/* clear index map if loaded */
+	if (idx_map) {
+		munmap(idx_map, idx_size);
+		idx_map = 0;
+	}
+
+	lk = xcalloc(sizeof(struct lock_file), 1);
+	fd = hold_lock_file_for_update(lk, git_path("rev-cache/index"), 0);
+	if (fd < 0) {
+		free(lk);
+		return -1;
+	}
+
+	/* endianness yay! */
+	memset(&whead, 0, sizeof(whead));
+	memcpy(whead.signature, "REVINDEX", 8);
+	whead.version = idx_head.version;
+	whead.ofs_objects = htonl(idx_head.ofs_objects);
+	whead.object_nr = htonl(idx_head.object_nr);
+	whead.cache_nr = idx_head.cache_nr;
+	whead.max_date = htonl(idx_head.max_date);
+
+	write(fd, &whead, sizeof(struct rc_index_header));
+	write_in_full(fd, idx_caches, idx_head.cache_nr * 20);
+
+	for (i = 0; i <= 0xff; i++)
+		fanout[i] = htonl(fanout[i]);
+	write_in_full(fd, fanout, 0x100 * sizeof(uint32_t));
+
+	write_in_full(fd, body->buf, body->len);
+
+	if (commit_lock_file(lk) < 0)
+		return -2;
+
+	/* lk freed by lockfile.c */
+
+	return 0;
+}
+
+int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
+	int fd, unsigned int size)
+{
+	struct strbuf buffer;
+	int i, cache_index, cur;
+	unsigned char *map;
+	unsigned long max_date;
+
+	if (!idx_map)
+		init_index();
+
+	lseek(fd, 0, SEEK_SET);
+	map = xmmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED)
+		return -1;
+
+	strbuf_init(&buffer, 0);
+	if (idx_map) {
+		strbuf_add(&buffer, idx_map + fanout[0], fanout[0x100] - fanout[0]);
+	} else {
+		/* not an update */
+		memset(&idx_head, 0, sizeof(struct rc_index_header));
+		idx_caches = 0;
+
+		strcpy(idx_head.signature, "REVINDEX");
+		idx_head.version = SUPPORTED_REVINDEX_VERSION;
+		idx_head.ofs_objects = sizeof(struct rc_index_header) + 0x100 * sizeof(uint32_t);
+	}
+
+	/* are we remaking a slice? */
+	for (i = 0; i < idx_head.cache_nr; i++)
+		if (!hashcmp(idx_caches + i * 20, cache_sha1))
+			break;
+
+	if (i == idx_head.cache_nr) {
+		cache_index = idx_head.cache_nr++;
+		idx_head.ofs_objects += 20;
+
+		idx_caches = xrealloc(idx_caches, idx_head.cache_nr * 20);
+		hashcpy(idx_caches + cache_index * 20, cache_sha1);
+	} else
+		cache_index = i;
+
+	i = sizeof(struct rc_slice_header); /* offset */
+	max_date = idx_head.max_date;
+	while (i < size) {
+		struct rc_index_entry index_entry, *entry;
+		struct rc_index_entry_ondisk *disked_entry;
+		struct rc_object_entry *object_entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+		unsigned long date;
+		int off, pos = i;
+
+		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(object_entry);
+
+		if (object_entry->type != OBJ_COMMIT)
+			continue;
+
+		/* don't include ends; otherwise we'll find ourselves in loops */
+		if (object_entry->is_end)
+			continue;
+
+		/* handle index duplication
+		 * -> keep old copy unless new one is a start -- based on expected usage, older ones will be more
+		 * likely to lead to greater slice traversals than new ones */
+		date = object_entry->date;
+		if (date > idx_head.max_date) {
+			disked_entry = 0;
+			if (date > max_date)
+				max_date = date;
+		} else
+			disked_entry = search_index_1(object_entry->sha1);
+
+		if (disked_entry && !object_entry->is_start)
+			continue;
+		else if (disked_entry) {
+			/* mmm, pointer arithmetic... tasty */  /* (entry - idx_map = offset, so cast is valid) */
+			off = (unsigned int)((unsigned char *)disked_entry - idx_map) - fanout[0];
+			disked_entry = (struct rc_index_entry_ondisk *)(buffer.buf + off);
+			entry = from_disked_rc_index_entry(disked_entry, 0);
+		} else
+			entry = &index_entry;
+
+		memset(entry, 0, sizeof(index_entry));
+		entry->sha1 = object_entry->sha1;
+		entry->is_start = object_entry->is_start;
+		entry->cache_index = cache_index;
+		entry->pos = pos;
+
+		if (entry == &index_entry) {
+			strbuf_add(&buffer, to_disked_rc_index_entry(entry, 0), sizeof(struct rc_index_entry_ondisk));
+			idx_head.object_nr++;
+		} else
+			to_disked_rc_index_entry(entry, disked_entry);
+
+	}
+
+	idx_head.max_date = max_date;
+	qsort(buffer.buf, buffer.len / sizeof(struct rc_index_entry_ondisk), sizeof(struct rc_index_entry_ondisk), index_sort_hash);
+
+	/* generate fanout */
+	cur = 0x00;
+	for (i = 0; i < buffer.len; i += sizeof(struct rc_index_entry_ondisk)) {
+		struct rc_index_entry_ondisk *entry = (struct rc_index_entry_ondisk *)(buffer.buf + i);
+
+		while (cur <= entry->sha1[0])
+			fanout[cur++] = i + idx_head.ofs_objects;
+	}
+
+	while (cur <= 0xff)
+		fanout[cur++] = idx_head.ofs_objects + buffer.len;
+
+	/* BOOM! */
+	if (write_cache_index(&buffer))
+		return -1;
+
+	munmap(map, size);
+	strbuf_release(&buffer);
+
+	/* idx_map is unloaded without cleanup_cache_slices(), so regardless of previous index existence
+	 * we can still free this up */
+	free(idx_caches);
+
+	return 0;
+}
+
+
+/* add start-commits from each cache slice (uninterestingness will be propogated) */
+void starts_from_slices(struct rev_info *revs, unsigned int flags)
+{
+	struct commit *commit;
+	int i;
+
+	if (!idx_map)
+		init_index();
+	if (!idx_map)
+		return;
+
+	for (i = idx_head.ofs_objects; i < idx_size; i += sizeof(struct rc_index_entry_ondisk)) {
+		struct rc_index_entry *entry = RC_OBTAIN_INDEX_ENTRY(idx_map + i);
+
+		if (!entry->is_start)
+			continue;
+
+		commit = lookup_commit(entry->sha1);
+		if (!commit)
+			continue;
+
+		commit->object.flags |= flags;
+		add_pending_object(revs, &commit->object, 0);
+	}
+
+}
diff --git a/rev-cache.h b/rev-cache.h
new file mode 100644
index 0000000..a76dc53
--- /dev/null
+++ b/rev-cache.h
@@ -0,0 +1,107 @@
+#ifndef REV_CACHE_H
+#define REV_CACHE_H
+
+#define SUPPORTED_REVCACHE_VERSION 		1
+#define SUPPORTED_REVINDEX_VERSION		1
+
+#define RC_PATH_SIZE(x)	(sizeof(uint16_t) * (x))
+
+#define RC_OBTAIN_OBJECT_ENTRY(p)			from_disked_rc_object_entry((struct rc_object_entry_ondisk *)(p), 0)
+#define RC_OBTAIN_INDEX_ENTRY(p)			from_disked_rc_index_entry((struct rc_index_entry_ondisk *)(p), 0)
+
+#define RC_ACTUAL_OBJECT_ENTRY_SIZE(e)		(sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE((e)->merge_nr + (e)->split_nr) + (e)->size_size)
+
+/* single index maps objects to cache files */
+struct rc_index_header {
+	char signature[8]; /* REVINDEX */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	unsigned char cache_nr;
+
+	uint32_t max_date;
+};
+
+struct rc_index_entry_ondisk {
+	unsigned char sha1[20];
+	unsigned char flags;
+	uint32_t pos;
+};
+
+struct rc_index_entry {
+	unsigned char *sha1;
+	unsigned is_start:1;
+	unsigned cache_index:7;
+	uint32_t pos;
+};
+
+
+/* structure for actual cache file */
+struct rc_slice_header {
+	char signature[8]; /* REVCACHE */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	uint16_t path_nr;
+	uint32_t size;
+
+	unsigned char sha1[20];
+};
+
+struct rc_object_entry_ondisk {
+	unsigned char flags;
+	unsigned char sha1[20];
+
+	unsigned char merge_nr;
+	unsigned char split_nr;
+	unsigned char sizes;
+
+	uint32_t date;
+	uint16_t path;
+};
+
+struct rc_object_entry {
+	unsigned type:3;
+	unsigned is_end:1;
+	unsigned is_start:1;
+	unsigned uninteresting:1;
+	unsigned include:1;
+	unsigned flag:1; /* unused */
+	unsigned char *sha1; /* 20 byte */
+
+	unsigned char merge_nr; /* : 7 */
+	unsigned char split_nr; /* : 7 */
+	unsigned size_size:3;
+	unsigned padding:5;
+
+	uint32_t date;
+	uint16_t path;
+
+	/* merge paths */
+	/* split paths */
+	/* size */
+};
+
+struct rc_index_entry *from_disked_rc_index_entry(struct rc_index_entry_ondisk *src, struct rc_index_entry *dst);
+struct rc_index_entry_ondisk *to_disked_rc_index_entry(struct rc_index_entry *src, struct rc_index_entry_ondisk *dst);
+struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondisk *src, struct rc_object_entry *dst);
+struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry *src, struct rc_object_entry_ondisk *dst);
+
+extern unsigned char *get_cache_slice(struct commit *commit);
+extern int traverse_cache_slice(struct rev_info *revs,
+	unsigned char *cache_sha1, struct commit *commit,
+	unsigned long *date_so_far, int *slop_so_far,
+	struct commit_list ***queue, struct commit_list **work);
+
+extern void init_rev_cache_info(struct rev_cache_info *rci);
+extern int make_cache_slice(struct rev_cache_info *rci,
+	struct rev_info *revs, struct commit_list **starts, struct commit_list **ends,
+	unsigned char *cache_sha1);
+extern int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
+	int fd, unsigned int size);
+
+extern void starts_from_slices(struct rev_info *revs, unsigned int flags);
+
+#endif
diff --git a/revision.c b/revision.c
index 9fc4e8d..de9e2e3 100644
--- a/revision.c
+++ b/revision.c
@@ -432,7 +432,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit)
 	commit->object.flags |= TREESAME;
 }
 
-static void insert_by_date_cached(struct commit *p, struct commit_list **head,
+void insert_by_date_cached(struct commit *p, struct commit_list **head,
 		    struct commit_list *cached_base, struct commit_list **cache)
 {
 	struct commit_list *new_entry;
diff --git a/revision.h b/revision.h
index b6421a6..1b7b93f 100644
--- a/revision.h
+++ b/revision.h
@@ -13,7 +13,8 @@
 #define CHILD_SHOWN	(1u<<6)
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
 #define SYMMETRIC_LEFT	(1u<<8)
-#define ALL_REV_FLAGS	((1u<<9)-1)
+#define FACE_VALUE	(1u<<9)
+#define ALL_REV_FLAGS	((1u<<10)-1)
 
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -21,6 +22,19 @@
 struct rev_info;
 struct log_info;
 
+struct rev_cache_info {
+	/* generation flags */
+	unsigned objects : 1,
+		legs : 1,
+		make_index : 1;
+
+	/* traversal flags */
+	unsigned add_to_pending : 1;
+
+	/* fuse options */
+	unsigned int ignore_size;
+};
+
 struct rev_info {
 	/* Starting list */
 	struct commit_list *commits;
@@ -76,6 +90,10 @@ struct rev_info {
 			dense_combined_merges:1,
 			always_show_header:1;
 
+	/* rev-cache flags */
+	unsigned int for_pack:1,
+		dont_cache_me:1;
+
 	/* Format info */
 	unsigned int	shown_one:1,
 			show_merge:1,
@@ -120,6 +138,9 @@ struct rev_info {
 	struct reflog_walk_info *reflog_info;
 	struct decoration children;
 	struct decoration merge_simplification;
+
+	/* caching info, used ONLY by traverse_cache_slice */
+	struct rev_cache_info rev_cache_info;
 };
 
 #define REV_TREE_SAME		0
@@ -172,4 +193,7 @@ enum commit_action {
 extern enum commit_action get_commit_action(struct rev_info *revs, struct commit *commit);
 extern enum commit_action simplify_commit(struct rev_info *revs, struct commit *commit);
 
+extern void insert_by_date_cached(struct commit *p, struct commit_list **head,
+		    struct commit_list *cached_base, struct commit_list **cache);
+
 #endif
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
new file mode 100755
index 0000000..f59f568
--- /dev/null
+++ b/t/t6017-rev-cache-list.sh
@@ -0,0 +1,106 @@
+#!/bin/sh
+
+test_description='git rev-cache tests'
+. ./test-lib.sh
+
+test_cmp_sorted() {
+	grep -io "[a-f0-9]*" $1 | sort >.tmpfile1 &&
+	grep -io "[a-f0-9]*" $2 | sort >.tmpfile2 &&
+	test_cmp .tmpfile1 .tmpfile2
+}
+
+# we want a totally wacked out branch structure...
+# we need branching and merging of sizes up through 3, tree
+# addition/deletion, and enough branching to exercise path
+# reuse
+test_expect_success 'init repo' '
+	echo bla >file &&
+	git add . &&
+	git commit -m "bla" &&
+
+	git branch b1 &&
+	git checkout b1 &&
+	echo blu >file2 &&
+	mkdir d1 &&
+	echo bang >d1/filed1 &&
+	git add . &&
+	git commit -m "blu" &&
+
+	git checkout master &&
+	git branch b2 &&
+	git checkout b2 &&
+	echo kaplaa >>file &&
+	git commit -a -m "kaplaa" &&
+
+	git checkout master &&
+	mkdir smoke &&
+	echo omg >smoke/bong &&
+	git add . &&
+	git commit -m "omg" &&
+
+	git branch b4 &&
+	git checkout b4 &&
+	echo shazam >file8 &&
+	git add . &&
+	git commit -m "shazam" &&
+	git merge -m "merge b2" b2 &&
+
+	echo bam >smoke/pipe &&
+	git add .
+	git commit -m "bam" &&
+
+	git checkout master &&
+	echo pow >file7 &&
+	git add . &&
+	git commit -m "pow" &&
+	git merge -m "merge b4" b4 &&
+
+	git checkout b1 &&
+	echo stuff >d1/filed1 &&
+	git commit -a -m "stuff" &&
+
+	git branch b11 &&
+	git checkout b11 &&
+	echo wazzup >file3 &&
+	git add file3 &&
+	git commit -m "wazzup" &&
+
+	git checkout b1 &&
+	mkdir d1/d2 &&
+	echo lol >d1/d2/filed2 &&
+	git add . &&
+	git commit -m "lol" &&
+
+	git checkout master &&
+	git merge -m "triple merge" b1 b11 &&
+	git rm -r d1 &&
+	git commit -a -m "oh noes"
+'
+
+git-rev-list HEAD --not HEAD~3 >proper_commit_list_limited
+git-rev-list HEAD >proper_commit_list
+git-rev-list HEAD --objects >proper_object_list
+
+test_expect_success 'make cache slice' '
+	git-rev-cache add HEAD 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+test_expect_success 'remake cache slice' '
+	git-rev-cache add HEAD 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+#check core mechanics and rev-list hook for commits
+test_expect_success 'test rev-caches walker directly (limited)' '
+	git-rev-cache walk HEAD --not HEAD~3 >list &&
+	test_cmp_sorted list proper_commit_list_limited
+'
+
+test_expect_success 'test rev-caches walker directly (unlimited)' '
+	git-rev-cache walk HEAD >list &&
+	test_cmp_sorted list proper_commit_list
+'
+
+test_done
+
-- 
tg: (e79999b..) t/revcache/basic (depends on: master)

^ permalink raw reply related

* Re: [PATCH 3/6 (v4)] support for non-commit object caching in rev-cache
From: Nick Edelen @ 2009-10-19 20:28 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qpixtdk399@sirnot.private>

Summarized, this third patch contains:
 - support for non-commit object caching
 - expansion of porcelain to accomodate non-commit objects
 - appropriate tests

Objects are stored relative to the commit in which they were introduced -- 
commits are 'diffed' against their parents.  This will eliminate the need for 
tree recursion in cached commits (significantly reducing I/O), and potentially 
be useful to external applications.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
 rev-cache.c               |  202 ++++++++++++++++++++++++++++++++++++++++++++-
 t/t6017-rev-cache-list.sh |    6 ++
 2 files changed, 206 insertions(+), 2 deletions(-)

diff --git a/rev-cache.c b/rev-cache.c
index 8951cdf..ef6b58a 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -259,6 +259,32 @@ unsigned char *get_cache_slice(struct commit *commit)
 
 /* traversal */
 
+static void handle_noncommit(struct rev_info *revs, unsigned char *ptr, struct rc_object_entry *entry)
+{
+	struct object *obj = 0;
+
+	switch (entry->type) {
+	case OBJ_TREE:
+		if (revs->tree_objects)
+			obj = (struct object *)lookup_tree(entry->sha1);
+		break;
+	case OBJ_BLOB:
+		if (revs->blob_objects)
+			obj = (struct object *)lookup_blob(entry->sha1);
+		break;
+	case OBJ_TAG:
+		if (revs->tag_objects)
+			obj = (struct object *)lookup_tag(entry->sha1);
+		break;
+	}
+
+	if (!obj)
+		return;
+
+	obj->flags |= FACE_VALUE;
+	add_pending_object(revs, obj, "");
+}
+
 static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work)
 {
 	struct rc_index_entry *iep;
@@ -347,9 +373,12 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
 
 		/* add extra objects if necessary */
-		if (entry->type != OBJ_COMMIT)
+		if (entry->type != OBJ_COMMIT) {
+			if (consume_children)
+				handle_noncommit(revs, map + index, entry);
+
 			continue;
-		else
+		} else
 			consume_children = 0;
 
 		if (path >= total_path_nr)
@@ -777,6 +806,171 @@ static void add_object_entry(const unsigned char *sha1, int type, struct rc_obje
 
 }
 
+/* returns non-zero to continue parsing, 0 to skip */
+typedef int (*dump_tree_fn)(const unsigned char *, const char *, unsigned int); /* sha1, path, mode */
+
+/* we need to walk the trees by hash, so unfortunately we can't use traverse_trees in tree-walk.c */
+static int dump_tree(struct tree *tree, dump_tree_fn fn)
+{
+	struct tree_desc desc;
+	struct name_entry entry;
+	struct tree *subtree;
+	int r;
+
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(&desc, tree->buffer, tree->size);
+	while (tree_entry(&desc, &entry)) {
+		switch (fn(entry.sha1, entry.path, entry.mode)) {
+		case 0:
+			goto continue_loop;
+		default:
+			break;
+		}
+
+		if (S_ISDIR(entry.mode)) {
+			subtree = lookup_tree(entry.sha1);
+			if (!subtree)
+				return -2;
+
+			if ((r = dump_tree(subtree, fn)) < 0)
+				return r;
+		}
+
+continue_loop:
+		continue;
+	}
+
+	return 0;
+}
+
+static int dump_tree_callback(const unsigned char *sha1, const char *path, unsigned int mode)
+{
+	unsigned char data[21];
+
+	hashcpy(data, sha1);
+	data[20] = !!S_ISDIR(mode);
+
+	strbuf_add(acc_buffer, data, 21);
+
+	return 1;
+}
+
+static void tree_addremove(struct diff_options *options,
+	int whatnow, unsigned mode,
+	const unsigned char *sha1,
+	const char *concatpath)
+{
+	unsigned char data[21];
+
+	if (whatnow != '+')
+		return;
+
+	hashcpy(data, sha1);
+	data[20] = !!S_ISDIR(mode);
+
+	strbuf_add(acc_buffer, data, 21);
+}
+
+static void tree_change(struct diff_options *options,
+	unsigned old_mode, unsigned new_mode,
+	const unsigned char *old_sha1,
+	const unsigned char *new_sha1,
+	const char *concatpath)
+{
+	unsigned char data[21];
+
+	if (!hashcmp(old_sha1, new_sha1))
+		return;
+
+	hashcpy(data, new_sha1);
+	data[20] = !!S_ISDIR(new_mode);
+
+	strbuf_add(acc_buffer, data, 21);
+}
+
+static int sort_type_hash(const void *a, const void *b)
+{
+	const unsigned char *sa = (const unsigned char *)a,
+		*sb = (const unsigned char *)b;
+
+	if (sa[20] == sb[20])
+		return hashcmp(sa, sb);
+
+	return sa[20] > sb[20] ? -1 : 1;
+}
+
+static int add_unique_objects(struct commit *commit)
+{
+	struct commit_list *list;
+	struct strbuf os, ost, *orig_buf;
+	struct diff_options opts;
+	int i, j, next;
+	char is_first = 1;
+
+	strbuf_init(&os, 0);
+	strbuf_init(&ost, 0);
+	orig_buf = acc_buffer;
+
+	diff_setup(&opts);
+	DIFF_OPT_SET(&opts, RECURSIVE);
+	DIFF_OPT_SET(&opts, TREE_IN_RECURSIVE);
+	opts.change = tree_change;
+	opts.add_remove = tree_addremove;
+
+	/* this is only called for non-ends (ie. all parents interesting) */
+	for (list = commit->parents; list; list = list->next) {
+		if (is_first)
+			acc_buffer = &os;
+		else
+			acc_buffer = &ost;
+
+		strbuf_setlen(acc_buffer, 0);
+		diff_tree_sha1(list->item->tree->object.sha1, commit->tree->object.sha1, "", &opts);
+		qsort(acc_buffer->buf, acc_buffer->len / 21, 21, (int (*)(const void *, const void *))hashcmp);
+
+		/* take intersection */
+		if (!is_first) {
+			for (next = i = j = 0; i < os.len; i += 21) {
+				while (j < ost.len && hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)) < 0)
+					j += 21;
+
+				if (j >= ost.len || hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)))
+					continue;
+
+				if (next != i)
+					memcpy(os.buf + next, os.buf + i, 21);
+				next += 21;
+			}
+
+			if (next != i)
+				strbuf_setlen(&os, next);
+		} else
+			is_first = 0;
+	}
+
+	if (is_first) {
+		acc_buffer = &os;
+		dump_tree(commit->tree, dump_tree_callback);
+	}
+
+	if (os.len)
+		qsort(os.buf, os.len / 21, 21, sort_type_hash);
+
+	acc_buffer = orig_buf;
+	for (i = 0; i < os.len; i += 21)
+		add_object_entry((unsigned char *)(os.buf + i), os.buf[i + 20] ? OBJ_TREE : OBJ_BLOB, 0, 0, 0);
+
+	/* last but not least, the main tree */
+	add_object_entry(commit->tree->object.sha1, OBJ_TREE, 0, 0, 0);
+
+	strbuf_release(&ost);
+	strbuf_release(&os);
+
+	return i / 21 + 1;
+}
+
 static void init_revcache_directory(void)
 {
 	struct stat fi;
@@ -902,6 +1096,10 @@ int make_cache_slice(struct rev_cache_info *rci,
 		add_object_entry(0, 0, &object, &merge_paths, &split_paths);
 		object_nr++;
 
+		/* add all unique children for this commit */
+		if (rci->objects && !object.is_end)
+			object_nr += add_unique_objects(commit);
+
 		/* print every ~1MB or so */
 		if (buffer.len > 1000000) {
 			write_in_full(fd, buffer.buf, buffer.len);
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
index f59f568..dc0fc07 100755
--- a/t/t6017-rev-cache-list.sh
+++ b/t/t6017-rev-cache-list.sh
@@ -102,5 +102,11 @@ test_expect_success 'test rev-caches walker directly (unlimited)' '
 	test_cmp_sorted list proper_commit_list
 '
 
+#do the same for objects
+test_expect_success 'test rev-caches walker with objects' '
+	git-rev-cache walk --objects HEAD >list &&
+	test_cmp_sorted list proper_object_list
+'
+
 test_done
 
-- 
tg: (3bf0747..) t/revcache/objects (depends on: t/revcache/basic)

^ permalink raw reply related

* Re: [PATCH 4/6 (v4)] administrative functions for rev-cache, start of integration into git
From: Nick Edelen @ 2009-10-19 20:29 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qryitdk399@sirnot.private>

This patch, fourth, contains miscellaneous (maintenance) features:
 - support for cache slice fusion, index regeneration and object size caching
 - non-commit object generation refactored
 - porcelain updated to support feature additions

The beginnings of integration into git are present in this patch, mainly 
centered on caching object size; the object generation is refactored to more 
elegantly exploit this.  Fusion allows smaller (incremental) slices to be 
coagulated into a larger slice, reducing overhead, while index regeneration 
enables repair or cleaning of the cache index.

Note that tests for these features are included in the following patch, as they 
take advantage of the rev-cache's integration into the revision walker.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
 builtin-gc.c        |    9 +
 builtin-rev-cache.c |   77 ++++++-
 rev-cache.c         |  717 +++++++++++++++++++++++++++++++++++++++++++++------
 rev-cache.h         |    9 +-
 revision.h          |   16 +-
 5 files changed, 749 insertions(+), 79 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index 7d3e9cc..c92511a 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -22,6 +22,7 @@ static const char * const builtin_gc_usage[] = {
 	NULL
 };
 
+static char do_rev_cache = 0;
 static int pack_refs = 1;
 static int aggressive_window = 250;
 static int gc_auto_threshold = 6700;
@@ -34,9 +35,14 @@ static const char *argv_reflog[] = {"reflog", "expire", "--all", NULL};
 static const char *argv_repack[MAX_ADD] = {"repack", "-d", "-l", NULL};
 static const char *argv_prune[] = {"prune", "--expire", NULL, NULL};
 static const char *argv_rerere[] = {"rerere", "gc", NULL};
+static const char *argv_rev_cache[] = {"rev-cache", "fuse", "--all", "--ignore-size", NULL};
 
 static int gc_config(const char *var, const char *value, void *cb)
 {
+	if (!strcmp(var, "gc.revcache")) {
+		do_rev_cache = 1;
+		return 0;
+	}
 	if (!strcmp(var, "gc.packrefs")) {
 		if (value && !strcmp(value, "notbare"))
 			pack_refs = -1;
@@ -244,6 +250,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (run_command_v_opt(argv_rerere, RUN_GIT_CMD))
 		return error(FAILED_RUN, argv_rerere[0]);
 
+	if (do_rev_cache && run_command_v_opt(argv_rev_cache, RUN_GIT_CMD))
+		return error(FAILED_RUN, argv_rev_cache[0]);
+
 	if (auto_gc && too_many_loose_objects())
 		warning("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them.");
diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
index 6eb7065..b894c54 100644
--- a/builtin-rev-cache.c
+++ b/builtin-rev-cache.c
@@ -5,6 +5,8 @@
 #include "revision.h"
 #include "rev-cache.h"
 
+unsigned long default_ignore_size = 50 * 1024 * 1024; /* 50mb */
+
 /* porcelain for rev-cache.c */
 static int handle_add(int argc, const char *argv[]) /* args beyond this command */
 {
@@ -24,7 +26,7 @@ static int handle_add(int argc, const char *argv[]) /* args beyond this command
 		if (!strcmp(argv[i], "--stdin"))
 			dostdin = 1;
 		else if (!strcmp(argv[i], "--fresh") || !strcmp(argv[i], "--incremental"))
-			starts_from_slices(&revs, UNINTERESTING);
+			starts_from_slices(&revs, UNINTERESTING, 0, 0);
 		else if (!strcmp(argv[i], "--not"))
 			flags ^= UNINTERESTING;
 		else if (!strcmp(argv[i], "--legs"))
@@ -150,6 +152,57 @@ static int handle_walk(int argc, const char *argv[])
 	return 0;
 }
 
+static int handle_fuse(int argc, const char *argv[])
+{
+	struct rev_info revs;
+	struct rev_cache_info rci;
+	const char *args[5];
+	int i, argn = 0;
+	char add_all = 0;
+
+	init_revisions(&revs, 0);
+	init_rev_cache_info(&rci);
+	args[argn++] = "rev-list";
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--all")) {
+			args[argn++] = "--all";
+			setup_revisions(argn, args, &revs, 0);
+			add_all = 1;
+		} else if (!strcmp(argv[i], "--no-objects"))
+			rci.objects = 0;
+		else if (!strncmp(argv[i], "--ignore-size", 13)) {
+			unsigned long sz;
+
+			if (argv[i][13] == '=')
+				git_parse_ulong(argv[i] + 14, &sz);
+			else
+				sz = default_ignore_size;
+
+			rci.ignore_size = sz;
+		} else
+			continue;
+	}
+
+	if (!add_all)
+		starts_from_slices(&revs, 0, 0, 0);
+
+	return fuse_cache_slices(&rci, &revs);
+}
+
+static int handle_index(int argc, const char *argv[])
+{
+	return regenerate_cache_index(0);
+}
+
+static int handle_alt(int argc, const char *argv[])
+{
+	if (argc < 1)
+		return -1;
+
+	return make_cache_slice_pointer(0, argv[0]);
+}
+
 static int handle_help(void)
 {
 	char *usage = "\
@@ -180,12 +233,28 @@ commands:\n\
 	return 0;
 }
 
+static int rev_cache_config(const char *k, const char *v, void *cb)
+{
+	/* this could potentially be related to pack.windowmemory, but we want a max around 50mb,
+	 * and .windowmemory is often >700mb, with *large* variations */
+	if (!strcmp(k, "revcache.ignoresize")) {
+		int t;
+
+		t = git_config_ulong(k, v);
+		if (t)
+			default_ignore_size = t;
+	}
+
+	return 0;
+}
+
 int cmd_rev_cache(int argc, const char *argv[], const char *prefix)
 {
 	const char *arg;
 	int r;
 
 	git_config(git_default_config, NULL);
+	git_config(rev_cache_config, NULL);
 
 	if (argc > 1)
 		arg = argv[1];
@@ -196,8 +265,14 @@ int cmd_rev_cache(int argc, const char *argv[], const char *prefix)
 	argv += 2;
 	if (!strcmp(arg, "add"))
 		r = handle_add(argc, argv);
+	else if (!strcmp(arg, "fuse"))
+		r = handle_fuse(argc, argv);
 	else if (!strcmp(arg, "walk"))
 		r = handle_walk(argc, argv);
+	else if (!strcmp(arg, "index"))
+		r = handle_index(argc, argv);
+	else if (!strcmp(arg, "alt"))
+		r = handle_alt(argc, argv);
 	else
 		return handle_help();
 
diff --git a/rev-cache.c b/rev-cache.c
index ef6b58a..6e19fbb 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -9,6 +9,13 @@
 #include "revision.h"
 #include "rev-cache.h"
 #include "run-command.h"
+#include "string-list.h"
+
+struct cache_slice_pointer {
+	char signature[8]; /* REVCOPTR */
+	char version;
+	char path[PATH_MAX + 1];
+};
 
 /* list resembles pack index format */
 static uint32_t fanout[0xff + 2];
@@ -259,27 +266,45 @@ unsigned char *get_cache_slice(struct commit *commit)
 
 /* traversal */
 
-static void handle_noncommit(struct rev_info *revs, unsigned char *ptr, struct rc_object_entry *entry)
+static unsigned long decode_size(unsigned char *str, int len);
+
+static void handle_noncommit(struct rev_info *revs, struct commit *commit, unsigned char *ptr, struct rc_object_entry *entry)
 {
-	struct object *obj = 0;
+	struct blob *blob;
+	struct tree *tree;
+	struct object *obj;
+	unsigned long size;
 
+	size = decode_size(ptr + RC_ENTRY_SIZE_OFFSET(entry), entry->size_size);
 	switch (entry->type) {
 	case OBJ_TREE:
-		if (revs->tree_objects)
-			obj = (struct object *)lookup_tree(entry->sha1);
+		if (!revs->tree_objects)
+			return;
+
+		tree = lookup_tree(entry->sha1);
+		if (!tree)
+			return;
+
+		tree->size = size;
+		commit->tree = tree;
+		obj = (struct object *)tree;
 		break;
+
 	case OBJ_BLOB:
-		if (revs->blob_objects)
-			obj = (struct object *)lookup_blob(entry->sha1);
-		break;
-	case OBJ_TAG:
-		if (revs->tag_objects)
-			obj = (struct object *)lookup_tag(entry->sha1);
+		if (!revs->blob_objects)
+			return;
+
+		blob = lookup_blob(entry->sha1);
+		if (!blob)
+			return;
+
+		obj = (struct object *)blob;
 		break;
-	}
 
-	if (!obj)
+	default:
+		/* tag objects aren't really supposed to be here */
 		return;
+	}
 
 	obj->flags |= FACE_VALUE;
 	add_pending_object(revs, obj, "");
@@ -375,7 +400,7 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 		/* add extra objects if necessary */
 		if (entry->type != OBJ_COMMIT) {
 			if (consume_children)
-				handle_noncommit(revs, map + index, entry);
+				handle_noncommit(revs, co, map + index, entry);
 
 			continue;
 		} else
@@ -409,6 +434,8 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 			if (last_objects[path]) {
 				parse_commit(last_objects[path]);
 
+				/* we needn't worry about the unique field; that will be valid as
+				 * long as we're not a end entry */
 				last_objects[path]->object.flags &= ~FACE_VALUE;
 				last_objects[path] = 0;
 			}
@@ -541,6 +568,48 @@ static int get_cache_slice_header(unsigned char *cache_sha1, unsigned char *map,
 	return 0;
 }
 
+int open_cache_slice(unsigned char *sha1, int flags)
+{
+	int fd;
+	char signature[8];
+
+	fd = open(git_path("rev-cache/%s", sha1_to_hex(sha1)), flags);
+	if (fd <= 0)
+		goto end;
+
+	if (read(fd, signature, 8) != 8)
+		goto end;
+
+	/* a normal revision slice */
+	if (!memcmp(signature, "REVCACHE", 8)) {
+		lseek(fd, 0, SEEK_SET);
+		return fd;
+	}
+
+	/* slice pointer */
+	if (!memcmp(signature, "REVCOPTR", 8)) {
+		struct cache_slice_pointer ptr;
+
+		lseek(fd, 0, SEEK_SET);
+		if (read_in_full(fd, &ptr, sizeof(ptr)) != sizeof(ptr))
+			goto end;
+
+		if (ptr.version != SUPPORTED_REVCOPTR_VERSION)
+			goto end;
+
+		close(fd);
+		fd = open(ptr.path, flags);
+
+		return fd;
+	}
+
+end:
+	if (fd > 0)
+		close(fd);
+
+	return -1;
+}
+
 int traverse_cache_slice(struct rev_info *revs,
 	unsigned char *cache_sha1, struct commit *commit,
 	unsigned long *date_so_far, int *slop_so_far,
@@ -564,7 +633,7 @@ int traverse_cache_slice(struct rev_info *revs,
 
 	memset(&head, 0, sizeof(struct rc_slice_header));
 
-	fd = open(git_path("rev-cache/%s", sha1_to_hex(cache_sha1)), O_RDONLY);
+	fd = open_cache_slice(cache_sha1, O_RDONLY);
 	if (fd == -1)
 		goto end;
 	if (fstat(fd, &fi) || fi.st_size < sizeof(struct rc_slice_header))
@@ -591,6 +660,68 @@ end:
 
 /* generation */
 
+static int is_endpoint(struct commit *commit)
+{
+	struct commit_list *list = commit->parents;
+
+	while (list) {
+		if (!(list->item->object.flags & UNINTERESTING))
+			return 0;
+
+		list = list->next;
+	}
+
+	return 1;
+}
+
+/* ensures branch is self-contained: parents are either all interesting or all uninteresting */
+static void make_legs(struct rev_info *revs)
+{
+	struct commit_list *list, **plist;
+	int total = 0;
+
+	/* attach plist to end of commits list */
+	list = revs->commits;
+	while (list && list->next)
+		list = list->next;
+
+	if (list)
+		plist = &list->next;
+	else
+		return;
+
+	/* duplicates don't matter, as get_revision() ignores them */
+	for (list = revs->commits; list; list = list->next) {
+		struct commit *item = list->item;
+		struct commit_list *parents = item->parents;
+
+		if (item->object.flags & UNINTERESTING)
+			continue;
+		if (is_endpoint(item))
+			continue;
+
+		while (parents) {
+			struct commit *p = parents->item;
+			parents = parents->next;
+
+			if (!(p->object.flags & UNINTERESTING))
+				continue;
+
+			p->object.flags &= ~UNINTERESTING;
+			parse_commit(p);
+			plist = &commit_list_insert(p, plist)->next;
+
+			if (!(p->object.flags & SEEN))
+				total++;
+		}
+	}
+
+	if (total)
+		sort_in_topological_order(&revs->commits, 1);
+
+}
+
+
 struct path_track {
 	struct commit *commit;
 	int path; /* for keeping track of children */
@@ -779,31 +910,76 @@ static void handle_paths(struct commit *commit, struct rc_object_entry *object,
 }
 
 
-static void add_object_entry(const unsigned char *sha1, int type, struct rc_object_entry *nothisone,
+static int encode_size(unsigned long size, unsigned char *out)
+{
+	int len = 0;
+
+	while (size) {
+		*out++ = (unsigned char)(size & 0xff);
+		size >>= 8;
+		len++;
+	}
+
+	return len;
+}
+
+static unsigned long decode_size(unsigned char *str, int len)
+{
+	unsigned long size = 0;
+	int shift = 0;
+
+	while (len--) {
+		size |= (unsigned long)*str << shift;
+		shift += 8;
+		str++;
+	}
+
+	return size;
+}
+
+static void add_object_entry(const unsigned char *sha1, struct rc_object_entry *entryp,
 	struct strbuf *merge_str, struct strbuf *split_str)
 {
-	struct rc_object_entry object;
+	struct rc_object_entry entry;
+	unsigned char size_str[7];
+	unsigned long size;
+	enum object_type type;
+	void *data;
 
-	if (!nothisone) {
-		memset(&object, 0, sizeof(object));
-		object.sha1 = (unsigned char *)sha1;
-		object.type = type;
+	if (entryp)
+		sha1 = entryp->sha1;
+
+	/* retrieve size data */
+	data = read_sha1_file(sha1, &type, &size);
+
+	if (data)
+		free(data);
+
+	/* initialize! */
+	if (!entryp) {
+		memset(&entry, 0, sizeof(entry));
+		entry.sha1 = (unsigned char *)sha1;
+		entry.type = type;
 
 		if (merge_str)
-			object.merge_nr = merge_str->len / sizeof(uint16_t);
+			entry.merge_nr = merge_str->len / sizeof(uint16_t);
 		if (split_str)
-			object.split_nr = split_str->len / sizeof(uint16_t);
+			entry.split_nr = split_str->len / sizeof(uint16_t);
 
-		nothisone = &object;
+		entryp = &entry;
 	}
 
-	strbuf_add(acc_buffer, to_disked_rc_object_entry(nothisone, 0), sizeof(struct rc_object_entry_ondisk));
+	entryp->size_size = encode_size(size, size_str);
 
-	if (merge_str && merge_str->len)
+	/* write the muvabitch */
+	strbuf_add(acc_buffer, to_disked_rc_object_entry(entryp, 0), sizeof(struct rc_object_entry_ondisk));
+
+	if (merge_str)
 		strbuf_add(acc_buffer, merge_str->buf, merge_str->len);
-	if (split_str && split_str->len)
+	if (split_str)
 		strbuf_add(acc_buffer, split_str->buf, split_str->len);
 
+	strbuf_add(acc_buffer, size_str, entryp->size_size);
 }
 
 /* returns non-zero to continue parsing, 0 to skip */
@@ -847,12 +1023,7 @@ continue_loop:
 
 static int dump_tree_callback(const unsigned char *sha1, const char *path, unsigned int mode)
 {
-	unsigned char data[21];
-
-	hashcpy(data, sha1);
-	data[20] = !!S_ISDIR(mode);
-
-	strbuf_add(acc_buffer, data, 21);
+	strbuf_add(acc_buffer, sha1, 20);
 
 	return 1;
 }
@@ -862,15 +1033,10 @@ static void tree_addremove(struct diff_options *options,
 	const unsigned char *sha1,
 	const char *concatpath)
 {
-	unsigned char data[21];
-
 	if (whatnow != '+')
 		return;
 
-	hashcpy(data, sha1);
-	data[20] = !!S_ISDIR(mode);
-
-	strbuf_add(acc_buffer, data, 21);
+	strbuf_add(acc_buffer, sha1, 20);
 }
 
 static void tree_change(struct diff_options *options,
@@ -879,26 +1045,10 @@ static void tree_change(struct diff_options *options,
 	const unsigned char *new_sha1,
 	const char *concatpath)
 {
-	unsigned char data[21];
-
 	if (!hashcmp(old_sha1, new_sha1))
 		return;
 
-	hashcpy(data, new_sha1);
-	data[20] = !!S_ISDIR(new_mode);
-
-	strbuf_add(acc_buffer, data, 21);
-}
-
-static int sort_type_hash(const void *a, const void *b)
-{
-	const unsigned char *sa = (const unsigned char *)a,
-		*sb = (const unsigned char *)b;
-
-	if (sa[20] == sb[20])
-		return hashcmp(sa, sb);
-
-	return sa[20] > sb[20] ? -1 : 1;
+	strbuf_add(acc_buffer, new_sha1, 20);
 }
 
 static int add_unique_objects(struct commit *commit)
@@ -909,6 +1059,7 @@ static int add_unique_objects(struct commit *commit)
 	int i, j, next;
 	char is_first = 1;
 
+	/* ...no, calculate unique objects */
 	strbuf_init(&os, 0);
 	strbuf_init(&ost, 0);
 	orig_buf = acc_buffer;
@@ -928,20 +1079,20 @@ static int add_unique_objects(struct commit *commit)
 
 		strbuf_setlen(acc_buffer, 0);
 		diff_tree_sha1(list->item->tree->object.sha1, commit->tree->object.sha1, "", &opts);
-		qsort(acc_buffer->buf, acc_buffer->len / 21, 21, (int (*)(const void *, const void *))hashcmp);
+		qsort(acc_buffer->buf, acc_buffer->len / 20, 20, (int (*)(const void *, const void *))hashcmp);
 
 		/* take intersection */
 		if (!is_first) {
-			for (next = i = j = 0; i < os.len; i += 21) {
+			for (next = i = j = 0; i < os.len; i += 20) {
 				while (j < ost.len && hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)) < 0)
-					j += 21;
+					j += 20;
 
 				if (j >= ost.len || hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)))
 					continue;
 
 				if (next != i)
-					memcpy(os.buf + next, os.buf + i, 21);
-				next += 21;
+					memcpy(os.buf + next, os.buf + i, 20);
+				next += 20;
 			}
 
 			if (next != i)
@@ -950,25 +1101,102 @@ static int add_unique_objects(struct commit *commit)
 			is_first = 0;
 	}
 
+	/* no parents (!) */
 	if (is_first) {
 		acc_buffer = &os;
 		dump_tree(commit->tree, dump_tree_callback);
 	}
 
-	if (os.len)
-		qsort(os.buf, os.len / 21, 21, sort_type_hash);
-
+	/* the ordering of non-commit objects dosn't really matter, so we're not gonna bother */
 	acc_buffer = orig_buf;
-	for (i = 0; i < os.len; i += 21)
-		add_object_entry((unsigned char *)(os.buf + i), os.buf[i + 20] ? OBJ_TREE : OBJ_BLOB, 0, 0, 0);
+	for (i = 0; i < os.len; i += 20)
+		add_object_entry((unsigned char *)(os.buf + i), 0, 0, 0);
 
 	/* last but not least, the main tree */
-	add_object_entry(commit->tree->object.sha1, OBJ_TREE, 0, 0, 0);
+	add_object_entry(commit->tree->object.sha1, 0, 0, 0);
+
+	return i / 20 + 1;
+}
+
+static int add_objects_verbatim_1(struct rev_cache_slice_map *mapping, int *index)
+{
+	unsigned char *map = mapping->map;
+	int i = *index, object_nr = 0;
+	struct rc_object_entry *entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+
+	i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
+	while (i < mapping->size) {
+		int pos = i;
 
-	strbuf_release(&ost);
-	strbuf_release(&os);
+		entry = RC_OBTAIN_OBJECT_ENTRY(map + i;
+		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
+
+		if (entry->type == OBJ_COMMIT) {
+			*index = pos;
+			return object_nr;
+		}
 
-	return i / 21 + 1;
+		strbuf_add(acc_buffer, map + pos, i - pos);
+		object_nr++;
+	}
+
+	*index = 0;
+	return object_nr;
+}
+
+static int add_objects_verbatim(struct rev_cache_info *rci, struct commit *commit)
+{
+	struct rev_cache_slice_map *map;
+	char found = 0;
+	struct rc_index_entry *ie;
+	struct rc_object_entry *entry;
+	int object_nr, i;
+
+	if (!rci->maps)
+		return -1;
+
+	/* check if we can continue where we left off */
+	map = rci->last_map;
+	if (!map)
+		goto search_me;
+
+	i = map->last_index;
+	entry = RC_OBTAIN_OBJECT_ENTRY(map->map + i);
+	if (hashcmp(entry->sha1, commit->object.sha1))
+		goto search_me;
+
+	found = 1;
+
+search_me:
+	if (!found) {
+		ie = search_index(commit->object.sha1);
+		if (!ie || ie->cache_index >= idx_head.cache_nr)
+			return -2;
+
+		map = rci->maps + ie->cache_index;
+		if (!map->size)
+			return -3;
+
+		i = ie->pos;
+		entry = RC_OBTAIN_OBJECT_ENTRY(map->map + i);
+		if (entry->type != OBJ_COMMIT || hashcmp(entry->sha1, commit->object.sha1))
+			return -4;
+	}
+
+	/* can't handle end commits */
+	if (entry->is_end)
+		return -5;
+
+	object_nr = add_objects_verbatim_1(map, &i);
+
+	/* remember this */
+	if (i) {
+		rci->last_map = map;
+		map->last_index = i;
+	} else
+		rci->last_map = 0;
+
+	return object_nr;
 }
 
 static void init_revcache_directory(void)
@@ -983,9 +1211,14 @@ static void init_revcache_directory(void)
 
 void init_rev_cache_info(struct rev_cache_info *rci)
 {
+	memset(rci, 0, sizeof(struct rev_cache_info));
+
 	rci->objects = 1;
 	rci->legs = 0;
 	rci->make_index = 1;
+	rci->fuse_me = 0;
+
+	rci->overwrite_all = 0;
 
 	rci->add_to_pending = 1;
 
@@ -1065,9 +1298,13 @@ int make_cache_slice(struct rev_cache_info *rci,
 	if (prepare_revision_walk(revs))
 		die("died preparing revision walk");
 
+	if (rci->legs)
+		make_legs(revs);
+
 	object_nr = total_sz = 0;
 	while ((commit = get_revision(revs)) != 0) {
 		struct rc_object_entry object;
+		int t;
 
 		strbuf_setlen(&merge_paths, 0);
 		strbuf_setlen(&split_paths, 0);
@@ -1093,12 +1330,17 @@ int make_cache_slice(struct rev_cache_info *rci,
 
 		commit->indegree = 0;
 
-		add_object_entry(0, 0, &object, &merge_paths, &split_paths);
+		add_object_entry(0, &object, &merge_paths, &split_paths);
 		object_nr++;
 
-		/* add all unique children for this commit */
-		if (rci->objects && !object.is_end)
-			object_nr += add_unique_objects(commit);
+		if (rci->objects && !object.is_end) {
+			if (rci->fuse_me && (t = add_objects_verbatim(rci, commit)) >= 0)
+				/* yay!  we did it! */
+				object_nr += t;
+			else
+				/* add all unique children for this commit */
+				object_nr += add_unique_objects(commit);
+		}
 
 		/* print every ~1MB or so */
 		if (buffer.len > 1000000) {
@@ -1223,6 +1465,8 @@ int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
 	unsigned char *map;
 	unsigned long max_date;
 
+	maybe_fill_with_defaults(rci);
+
 	if (!idx_map)
 		init_index();
 
@@ -1287,7 +1531,7 @@ int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
 		} else
 			disked_entry = search_index_1(object_entry->sha1);
 
-		if (disked_entry && !object_entry->is_start)
+		if (disked_entry && !object_entry->is_start && !rci->overwrite_all)
 			continue;
 		else if (disked_entry) {
 			/* mmm, pointer arithmetic... tasty */  /* (entry - idx_map = offset, so cast is valid) */
@@ -1341,8 +1585,7 @@ int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
 }
 
 
-/* add start-commits from each cache slice (uninterestingness will be propogated) */
-void starts_from_slices(struct rev_info *revs, unsigned int flags)
+void starts_from_slices(struct rev_info *revs, unsigned int flags, unsigned char *which, int n)
 {
 	struct commit *commit;
 	int i;
@@ -1358,6 +1601,18 @@ void starts_from_slices(struct rev_info *revs, unsigned int flags)
 		if (!entry->is_start)
 			continue;
 
+		/* only include entries in 'which' slices */
+		if (n) {
+			int j;
+
+			for (j = 0; j < n; j++)
+				if (!hashcmp(idx_caches + entry->cache_index * 20, which + j * 20))
+					break;
+
+			if (j == n)
+				continue;
+		}
+
 		commit = lookup_commit(entry->sha1);
 		if (!commit)
 			continue;
@@ -1367,3 +1622,313 @@ void starts_from_slices(struct rev_info *revs, unsigned int flags)
 	}
 
 }
+
+
+struct slice_fd_time {
+	unsigned char sha1[20];
+	int fd;
+	struct stat fi;
+};
+
+int slice_time_sort(const void *a, const void *b)
+{
+	unsigned long at, bt;
+
+	at = ((struct slice_fd_time *)a)->fi.st_ctime;
+	bt = ((struct slice_fd_time *)b)->fi.st_ctime;
+
+	if (at == bt)
+		return 0;
+
+	return at > bt ? 1 : -1;
+}
+
+int regenerate_cache_index(struct rev_cache_info *rci)
+{
+	DIR *dirh;
+	int i;
+	struct slice_fd_time info;
+	struct strbuf slices;
+
+	/* first remove old index if it exists */
+	unlink_or_warn(git_path("rev-cache/index"));
+
+	strbuf_init(&slices, 0);
+
+	dirh = opendir(git_path("rev-cache"));
+	if (dirh) {
+		struct dirent *de;
+		struct stat fi;
+		int fd;
+		unsigned char sha1[20];
+
+		while ((de = readdir(dirh))) {
+			if (de->d_name[0] == '.')
+				continue;
+
+			if (get_sha1_hex(de->d_name, sha1))
+				continue;
+
+			/* open with RDWR because of mmap call in make_cache_index() */
+			fd = open_cache_slice(sha1, O_RDONLY);
+			if (fd < 0 || fstat(fd, &fi)) {
+				warning("bad cache found [%s]; fuse recommended", de->d_name);
+				if (fd > 0)
+					close(fd);
+				continue;
+			}
+
+			hashcpy(info.sha1, sha1);
+			info.fd = fd;
+			memcpy(&info.fi, &fi, sizeof(struct stat));
+
+			strbuf_add(&slices, &info, sizeof(info));
+		}
+
+		closedir(dirh);
+	}
+
+	/* we want oldest first -> upon overlap, older slices are more likely to have a larger section,
+	 * as of the overlapped commit */
+	qsort(slices.buf, slices.len / sizeof(info), sizeof(info), slice_time_sort);
+
+	for (i = 0; i < slices.len; i += sizeof(info)) {
+		struct slice_fd_time *infop = (struct slice_fd_time *)(slices.buf + i);
+		struct stat *fip = &infop->fi;
+		int fd = infop->fd;
+
+		if (make_cache_index(rci, infop->sha1, fd, fip->st_size) < 0)
+			die("error writing cache");
+
+		close(fd);
+	}
+
+	strbuf_release(&slices);
+
+	return 0;
+}
+
+static int add_slices_for_fuse(struct rev_cache_info *rci, struct string_list *files, struct strbuf *ignore)
+{
+	unsigned char sha1[20];
+	char base[PATH_MAX];
+	int baselen, i, slice_nr = 0;
+	struct stat fi;
+	DIR *dirh;
+	struct dirent *de;
+
+	strncpy(base, git_path("rev-cache"), sizeof(base));
+	baselen = strlen(base);
+
+	dirh = opendir(base);
+	if (!dirh)
+		return 0;
+
+	while ((de = readdir(dirh))) {
+		if (de->d_name[0] == '.')
+			continue;
+
+		base[baselen] = '/';
+		strncpy(base + baselen + 1, de->d_name, sizeof(base) - baselen - 1);
+
+		if (get_sha1_hex(de->d_name, sha1)) {
+			/* whatever it is, we don't need it... */
+			string_list_insert(base, files);
+			continue;
+		}
+
+		/* _theoretically_ it is possible a slice < ignore_size to map objects not covered by, yet reachable from,
+		 * a slice >= ignore_size, meaning that we could potentially delete an 'unfused' slice; but if that
+		 * ever *did* happen their cache structure'd be so fucked up they might as well refuse the entire thing.
+		 * and at any rate the worst it'd do is make rev-list revert to standard walking in that (small) bit.
+		 */
+		if (rci->ignore_size) {
+			if (stat(base, &fi))
+				warning("can't query file %s\n", base);
+			else if (fi.st_size >= rci->ignore_size) {
+				strbuf_add(ignore, sha1, 20);
+				continue;
+			}
+		} else {
+			/* check if a pointer */
+			struct cache_slice_pointer ptr;
+			int fd = open(base, O_RDONLY);
+
+			if (fd < 0)
+				goto dont_save;
+			if (sizeof(ptr) != read_in_full(fd, &ptr, sizeof(ptr)))
+				goto dont_save;
+
+			close(fd);
+			if (!strcmp(ptr.signature, "REVCOPTR")) {
+				strbuf_add(ignore, sha1, 20);
+				continue;
+			}
+		}
+
+dont_save:
+		for (i = idx_head.cache_nr - 1; i >= 0; i--) {
+			if (!hashcmp(idx_caches + i * 20, sha1))
+				break;
+		}
+
+		if (i >= 0)
+			rci->maps[i].size = 1;
+
+		string_list_insert(base, files);
+		slice_nr++;
+	}
+
+	closedir(dirh);
+
+	return slice_nr;
+}
+
+/* the most work-intensive attributes in the cache are the unique objects and size, both
+ * of which can be re-used.  although path structures will be isomorphic, path generation is
+ * not particularly expensive, and at any rate we need to re-sort the commits */
+int fuse_cache_slices(struct rev_cache_info *rci, struct rev_info *revs)
+{
+	unsigned char cache_sha1[20];
+	struct string_list files = {0, 0, 0, 1}; /* dup */
+	struct strbuf ignore;
+	int i;
+
+	maybe_fill_with_defaults(rci);
+
+	if (!idx_map)
+		init_index();
+	if (!idx_map)
+		return -1;
+
+	strbuf_init(&ignore, 0);
+	rci->maps = xcalloc(idx_head.cache_nr, sizeof(struct rev_cache_slice_map));
+	if (add_slices_for_fuse(rci, &files, &ignore) <= 1) {
+		printf("nothing to fuse\n");
+		return 1;
+	}
+
+	if (ignore.len) {
+		starts_from_slices(revs, UNINTERESTING, (unsigned char *)ignore.buf, ignore.len / 20);
+		strbuf_release(&ignore);
+	}
+
+	/* initialize mappings */
+	for (i = idx_head.cache_nr - 1; i >= 0; i--) {
+		struct rev_cache_slice_map *map = rci->maps + i;
+		struct stat fi;
+		int fd;
+
+		if (!map->size)
+			continue;
+		map->size = 0;
+
+		/* pointers are never fused, so we can use open directly */
+		fd = open(git_path("rev-cache/%s", sha1_to_hex(idx_caches + i * 20)), O_RDONLY);
+		if (fd <= 0 || fstat(fd, &fi))
+			continue;
+		if (fi.st_size < sizeof(struct rc_slice_header))
+			continue;
+
+		map->map = xmmap(0, fi.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+		if (map->map == MAP_FAILED)
+			continue;
+
+		close(fd);
+		map->size = fi.st_size;
+	}
+
+	rci->make_index = 0;
+	rci->fuse_me = 1;
+	if (make_cache_slice(rci, revs, 0, 0, cache_sha1) < 0)
+		die("can't make cache slice");
+
+	printf("%s\n", sha1_to_hex(cache_sha1));
+
+	/* clean up time! */
+	for (i = idx_head.cache_nr - 1; i >= 0; i--) {
+		struct rev_cache_slice_map *map = rci->maps + i;
+
+		if (!map->size)
+			continue;
+
+		munmap(map->map, map->size);
+	}
+	free(rci->maps);
+	cleanup_cache_slices();
+
+	for (i = 0; i < files.nr; i++) {
+		char *name = files.items[i].string;
+
+		fprintf(stderr, "removing %s\n", name);
+		unlink_or_warn(name);
+	}
+
+	string_list_clear(&files, 0);
+
+	return regenerate_cache_index(rci);
+}
+
+static int verify_cache_slice(const char *slice_path, unsigned char *sha1)
+{
+	struct rc_slice_header head;
+	int fd, len, retval = -1;
+	unsigned char *map = MAP_FAILED;
+	struct stat fi;
+
+	len = strlen(slice_path);
+	if (len < 40)
+		return -2;
+	if (get_sha1_hex(slice_path + len - 40, sha1))
+		return -3;
+
+	fd = open(slice_path, O_RDONLY);
+	if (fd == -1)
+		goto end;
+	if (fstat(fd, &fi) || fi.st_size < sizeof(head))
+		goto end;
+
+	map = xmmap(0, sizeof(head), PROT_READ, MAP_PRIVATE, fd, 0);
+	if (map == MAP_FAILED)
+		goto end;
+	if (get_cache_slice_header(sha1, map, fi.st_size, &head))
+		goto end;
+
+	retval = 0;
+
+end:
+	if (map != MAP_FAILED)
+		munmap(map, sizeof(head));
+	if (fd > 0)
+		close(fd);
+
+	return retval;
+}
+
+int make_cache_slice_pointer(struct rev_cache_info *rci, const char *slice_path)
+{
+	struct cache_slice_pointer ptr;
+	int fd;
+	unsigned char sha1[20];
+
+	maybe_fill_with_defaults(rci);
+	rci->overwrite_all = 1;
+
+	if (verify_cache_slice(slice_path, sha1) < 0)
+		return -1;
+
+	strcpy(ptr.signature, "REVCOPTR");
+	ptr.version = SUPPORTED_REVCOPTR_VERSION;
+	strcpy(ptr.path, make_nonrelative_path(slice_path));
+
+	fd = open(git_path("rev-cache/%s", sha1_to_hex(sha1)), O_RDWR | O_CREAT | O_TRUNC, 0666);
+	if (fd < 0)
+		return -2;
+
+	write_in_full(fd, &ptr, sizeof(ptr));
+	make_cache_index(rci, sha1, fd, sizeof(ptr));
+
+	close(fd);
+
+	return 0;
+}
diff --git a/rev-cache.h b/rev-cache.h
index a76dc53..a1af337 100644
--- a/rev-cache.h
+++ b/rev-cache.h
@@ -3,6 +3,7 @@
 
 #define SUPPORTED_REVCACHE_VERSION 		1
 #define SUPPORTED_REVINDEX_VERSION		1
+#define SUPPORTED_REVCOPTR_VERSION		1
 
 #define RC_PATH_SIZE(x)	(sizeof(uint16_t) * (x))
 
@@ -10,6 +11,7 @@
 #define RC_OBTAIN_INDEX_ENTRY(p)			from_disked_rc_index_entry((struct rc_index_entry_ondisk *)(p), 0)
 
 #define RC_ACTUAL_OBJECT_ENTRY_SIZE(e)		(sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE((e)->merge_nr + (e)->split_nr) + (e)->size_size)
+#define RC_ENTRY_SIZE_OFFSET(e)				(RC_ACTUAL_OBJECT_ENTRY_SIZE(e) - (e)->size_size)
 
 /* single index maps objects to cache files */
 struct rc_index_header {
@@ -90,6 +92,7 @@ struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondis
 struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry *src, struct rc_object_entry_ondisk *dst);
 
 extern unsigned char *get_cache_slice(struct commit *commit);
+extern int open_cache_slice(unsigned char *sha1, int flags);
 extern int traverse_cache_slice(struct rev_info *revs,
 	unsigned char *cache_sha1, struct commit *commit,
 	unsigned long *date_so_far, int *slop_so_far,
@@ -102,6 +105,10 @@ extern int make_cache_slice(struct rev_cache_info *rci,
 extern int make_cache_index(struct rev_cache_info *rci, unsigned char *cache_sha1,
 	int fd, unsigned int size);
 
-extern void starts_from_slices(struct rev_info *revs, unsigned int flags);
+extern void starts_from_slices(struct rev_info *revs, unsigned int flags, unsigned char *which, int n);
+extern int fuse_cache_slices(struct rev_cache_info *rci, struct rev_info *revs);
+extern int regenerate_cache_index(struct rev_cache_info *rci);
+extern int make_cache_slice_pointer(struct rev_cache_info *rci, const char *slice_path);
 
 #endif
+
diff --git a/revision.h b/revision.h
index 1b7b93f..d160e14 100644
--- a/revision.h
+++ b/revision.h
@@ -22,17 +22,31 @@
 struct rev_info;
 struct log_info;
 
+struct rev_cache_slice_map {
+	unsigned char *map;
+	int size;
+	int last_index;
+};
+
 struct rev_cache_info {
 	/* generation flags */
 	unsigned objects : 1,
 		legs : 1,
-		make_index : 1;
+		make_index : 1,
+		fuse_me : 1;
+
+	/* index inclusion */
+	unsigned overwrite_all : 1;
 
 	/* traversal flags */
 	unsigned add_to_pending : 1;
 
 	/* fuse options */
 	unsigned int ignore_size;
+
+	/* reserved */
+	struct rev_cache_slice_map *maps,
+		*last_map;
 };
 
 struct rev_info {
-- 
tg: (a7b28d4..) t/revcache/misc (depends on: t/revcache/objects)

^ permalink raw reply related

* Re: [PATCH 5/6 (v4)] full integration of rev-cache into git, completed test suite
From: Nick Edelen @ 2009-10-19 20:30 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3quhbtdk399@sirnot.private>

This patch provides a working integration of rev-cache into the revision 
walker, along with some touch-ups:
 - integration into revision walker and list-objects
 - tweak of object generation
 - more fluid handling of damaged cache slices
 - numerous tests for both features from the previous patch, and the 
integration's integrity

'Integration' is rather broad -- a more detailed description follows for each 
aspect:
 - rev-cache
the traversal mechanism is updated to handle many of the non-prune options 
rev-list does (date limiting, slop-handling, etc.), and is adjusted to allow 
for non-fatal cache-traversal failures.

 - revision walker
both limited and unlimited traversal attempt to use the cache when possible, 
smoothly falling back if it's not.

 - list-objects
object listing does not recurse into cached trees, and has been adjusted to 
guarantee commit-tag-tree-blob ordering.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
tweak test for compatability.

 builtin-rev-cache.c       |   40 ++++++++
 list-objects.c            |   46 ++++++++-
 rev-cache.c               |  229 +++++++++++++++++++++++++++++++++++++++------
 revision.c                |   88 ++++++++++++++---
 t/t6017-rev-cache-list.sh |  151 ++++++++++++++++++++++++++++-
 5 files changed, 500 insertions(+), 54 deletions(-)

diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
index b894c54..8f41123 100644
--- a/builtin-rev-cache.c
+++ b/builtin-rev-cache.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "revision.h"
 #include "rev-cache.h"
+#include "list-objects.h"
 
 unsigned long default_ignore_size = 50 * 1024 * 1024; /* 50mb */
 
@@ -78,6 +79,43 @@ static int handle_add(int argc, const char *argv[]) /* args beyond this command
 	return 0;
 }
 
+static void show_commit(struct commit *commit, void *data)
+{
+	printf("%s\n", sha1_to_hex(commit->object.sha1));
+}
+
+static void show_object(struct object *obj, const struct name_path *path, const char *last)
+{
+	printf("%s\n", sha1_to_hex(obj->sha1));
+}
+
+static int test_rev_list(int argc, const char *argv[])
+{
+	struct rev_info revs;
+	unsigned int flags = 0;
+	int i;
+
+	init_revisions(&revs, 0);
+
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i], "--not"))
+			flags ^= UNINTERESTING;
+		else if (!strcmp(argv[i], "--objects"))
+			revs.tree_objects = revs.blob_objects = 1;
+		else
+			handle_revision_arg(argv[i], &revs, flags, 1);
+	}
+
+	setup_revisions(0, 0, &revs, 0);
+	revs.topo_order = 1;
+	revs.lifo = 1;
+	prepare_revision_walk(&revs);
+
+	traverse_commit_list(&revs, show_commit, show_object, 0);
+
+	return 0;
+}
+
 static int handle_walk(int argc, const char *argv[])
 {
 	struct commit *commit;
@@ -271,6 +309,8 @@ int cmd_rev_cache(int argc, const char *argv[], const char *prefix)
 		r = handle_walk(argc, argv);
 	else if (!strcmp(arg, "index"))
 		r = handle_index(argc, argv);
+	else if (!strcmp(arg, "test"))
+		r = test_rev_list(argc, argv);
 	else if (!strcmp(arg, "alt"))
 		r = handle_alt(argc, argv);
 	else
diff --git a/list-objects.c b/list-objects.c
index 8953548..b8c3370 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -74,22 +74,34 @@ static void process_tree(struct rev_info *revs,
 		die("bad tree object");
 	if (obj->flags & (UNINTERESTING | SEEN))
 		return;
+	if (obj->flags & FACE_VALUE) {
+		obj->flags |= SEEN;
+		show(obj, path, name);
+		/* not parsing the tree saves a lot of time! */
+		return;
+	}
+
 	if (parse_tree(tree) < 0)
 		die("bad tree object %s", sha1_to_hex(obj->sha1));
 	obj->flags |= SEEN;
 	show(obj, path, name);
+
 	me.up = path;
 	me.elem = name;
 	me.elem_len = strlen(name);
-
 	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
-		if (S_ISDIR(entry.mode))
+		if (S_ISDIR(entry.mode)) {
+			struct tree *subtree = lookup_tree(entry.sha1);
+			if (!subtree)
+				continue;
+
+			subtree->object.flags &= ~FACE_VALUE;
 			process_tree(revs,
-				     lookup_tree(entry.sha1),
+				     subtree,
 				     show, &me, entry.path);
-		else if (S_ISGITLINK(entry.mode))
+		} else if (S_ISGITLINK(entry.mode))
 			process_gitlink(revs, entry.sha1,
 					show, &me, entry.path);
 		else
@@ -136,6 +148,7 @@ void mark_edges_uninteresting(struct commit_list *list,
 
 static void add_pending_tree(struct rev_info *revs, struct tree *tree)
 {
+	tree->object.flags &= ~FACE_VALUE;
 	add_pending_object(revs, &tree->object, "");
 }
 
@@ -146,17 +159,27 @@ void traverse_commit_list(struct rev_info *revs,
 {
 	int i;
 	struct commit *commit;
+	enum object_type what = OBJ_TAG;
+	char face_value = 0;
 
 	while ((commit = get_revision(revs)) != NULL) {
-		add_pending_tree(revs, commit->tree);
+		if (!(commit->object.flags & FACE_VALUE))
+			add_pending_tree(revs, commit->tree);
+		else
+			face_value = 1;
 		show_commit(commit, data);
 	}
+
+loop_objects:
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object_array_entry *pending = revs->pending.objects + i;
 		struct object *obj = pending->item;
 		const char *name = pending->name;
 		if (obj->flags & (UNINTERESTING | SEEN))
 			continue;
+		if (obj->type != what && face_value)
+			continue;
+
 		if (obj->type == OBJ_TAG) {
 			obj->flags |= SEEN;
 			show_object(obj, NULL, name);
@@ -175,6 +198,19 @@ void traverse_commit_list(struct rev_info *revs,
 		die("unknown pending object %s (%s)",
 		    sha1_to_hex(obj->sha1), name);
 	}
+	if (face_value) {
+		switch (what) {
+		case OBJ_TAG:
+			what = OBJ_TREE;
+			goto loop_objects;
+		case OBJ_TREE:
+			what = OBJ_BLOB;
+			goto loop_objects;
+		default:
+			break;
+		}
+	}
+
 	if (revs->pending.nr) {
 		free(revs->pending.objects);
 		revs->pending.nr = 0;
diff --git a/rev-cache.c b/rev-cache.c
index 6e19fbb..4ef5287 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -11,6 +11,12 @@
 #include "run-command.h"
 #include "string-list.h"
 
+
+struct bad_slice {
+	unsigned char sha1[20];
+	struct bad_slice *next;
+};
+
 struct cache_slice_pointer {
 	char signature[8]; /* REVCOPTR */
 	char version;
@@ -23,8 +29,9 @@ static uint32_t fanout[0xff + 2];
 static unsigned char *idx_map;
 static int idx_size;
 static struct rc_index_header idx_head;
+static char no_idx, add_to_pending;
+static struct bad_slice *bad_slices;
 static unsigned char *idx_caches;
-static char no_idx;
 
 static struct strbuf *acc_buffer;
 
@@ -121,6 +128,30 @@ struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry
 	return dst;
 }
 
+static void mark_bad_slice(unsigned char *sha1)
+{
+	struct bad_slice *bad;
+
+	bad = xcalloc(sizeof(struct bad_slice), 1);
+	hashcpy(bad->sha1, sha1);
+
+	bad->next = bad_slices;
+	bad_slices = bad;
+}
+
+static int is_bad_slice(unsigned char *sha1)
+{
+	struct bad_slice *bad = bad_slices;
+
+	while (bad) {
+		if (!hashcmp(bad->sha1, sha1))
+			return 1;
+		bad = bad->next;
+	}
+
+	return 0;
+}
+
 static int get_index_head(unsigned char *map, int len, struct rc_index_header *head, uint32_t *fanout, unsigned char **caches)
 {
 	struct rc_index_header whead;
@@ -246,6 +277,7 @@ static struct rc_index_entry *search_index(unsigned char *sha1)
 unsigned char *get_cache_slice(struct commit *commit)
 {
 	struct rc_index_entry *ie;
+	unsigned char *sha1;
 
 	if (!idx_map) {
 		if (no_idx)
@@ -257,8 +289,13 @@ unsigned char *get_cache_slice(struct commit *commit)
 		return 0;
 
 	ie = search_index(commit->object.sha1);
-	if (ie && ie->cache_index < idx_head.cache_nr)
-		return idx_caches + ie->cache_index * 20;
+	if (ie && ie->cache_index < idx_head.cache_nr) {
+		sha1 = idx_caches + ie->cache_index * 20;
+
+		if (is_bad_slice(sha1))
+			return 0;
+		return sha1;
+	}
 
 	return 0;
 }
@@ -268,6 +305,20 @@ unsigned char *get_cache_slice(struct commit *commit)
 
 static unsigned long decode_size(unsigned char *str, int len);
 
+/* on failure */
+static void restore_commit(struct commit *commit)
+{
+	commit->object.flags &= ~(ADDED | SEEN | FACE_VALUE);
+
+	if (!commit->object.parsed) {
+		while (pop_commit(&commit->parents))
+			;
+
+		parse_commit(commit);
+	}
+
+}
+
 static void handle_noncommit(struct rev_info *revs, struct commit *commit, unsigned char *ptr, struct rc_object_entry *entry)
 {
 	struct blob *blob;
@@ -307,23 +358,27 @@ static void handle_noncommit(struct rev_info *revs, struct commit *commit, unsig
 	}
 
 	obj->flags |= FACE_VALUE;
-	add_pending_object(revs, obj, "");
+	if (add_to_pending)
+		add_pending_object(revs, obj, "");
 }
 
-static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work)
+static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work,
+	struct commit_list **unwork, int *ipath_nr, int *upath_nr, char *ioutside)
 {
 	struct rc_index_entry *iep;
 	struct rc_object_entry *oep;
 	struct commit_list *prev, *wp, **wpp;
 	int retval;
 
-	iep = search_index(commit->object.sha1), 0;
+	iep = search_index(commit->object.sha1);
 	oep = RC_OBTAIN_OBJECT_ENTRY(map + iep->pos);
+	if (commit->object.flags & UNINTERESTING) {
+		++*upath_nr;
+		oep->uninteresting = 1;
+	} else
+		++*ipath_nr;
 
-	/* the .uniniteresting bit isn't strictly necessary, as we check the object during traversal as well,
-	 * but we might as well initialize it while we're at it */
 	oep->include = 1;
-	oep->uninteresting = !!(commit->object.flags & UNINTERESTING);
 	to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
 	retval = iep->pos;
 
@@ -338,6 +393,10 @@ static int setup_traversal(struct rc_slice_header *head, unsigned char *map, str
 		/* is this in our cache slice? */
 		iep = search_index(obj->sha1);
 		if (!iep || hashcmp(idx_caches + iep->cache_index * 20, head->sha1)) {
+			/* there are interesing objects outside the slice */
+			if (!(obj->flags & UNINTERESTING))
+				*ioutside = 1;
+
 			prev = wp;
 			wp = wp->next;
 			wpp = &wp;
@@ -354,11 +413,20 @@ static int setup_traversal(struct rc_slice_header *head, unsigned char *map, str
 		oep->uninteresting = !!(obj->flags & UNINTERESTING);
 		to_disked_rc_object_entry(oep, (struct rc_object_entry_ondisk *)(map + iep->pos));
 
+		/* count even if not in slice so we can stop enumerating if possible */
+		if (obj->flags & UNINTERESTING)
+			++*upath_nr;
+		else
+			++*ipath_nr;
+
 		/* remove from work list */
 		co = pop_commit(wpp);
 		wp = *wpp;
 		if (prev)
 			prev->next = wp;
+
+		/* ...and store in temp list so we can restore work on failure */
+		commit_list_insert(co, unwork);
 	}
 
 	return retval;
@@ -375,13 +443,18 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 	unsigned long *date_so_far, int *slop_so_far,
 	struct commit_list ***queue, struct commit_list **work)
 {
-	struct commit_list *insert_cache = 0;
+	struct commit_list *insert_cache = 0, *myq = 0, **myqp = &myq, *mywork = 0, **myworkp = &mywork, *unwork = 0;
 	struct commit **last_objects, *co;
-	int i, total_path_nr = head->path_nr, retval = -1;
-	char consume_children = 0;
+	unsigned long date = date_so_far ? *date_so_far : ~0ul;
+	int i, ipath_nr = 0, upath_nr = 0, orig_obj_nr = 0,
+		total_path_nr = head->path_nr, retval = -1, slop = slop_so_far ? *slop_so_far : SLOP;
+	char consume_children = 0, ioutside = 0;
 	unsigned char *paths;
 
-	i = setup_traversal(head, map, commit, work);
+	/* take note in case we need to regress */
+	orig_obj_nr = revs->pending.nr;
+
+	i = setup_traversal(head, map, commit, work, &unwork, &ipath_nr, &upath_nr, &ioutside);
 	if (i < 0)
 		return -1;
 
@@ -429,6 +502,7 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 
 		if ((paths[path] & IPATH) && (paths[path] & UPATH)) {
 			paths[path] = UPATH;
+			ipath_nr--;
 
 			/* mark edge */
 			if (last_objects[path]) {
@@ -439,6 +513,7 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 				last_objects[path]->object.flags &= ~FACE_VALUE;
 				last_objects[path] = 0;
 			}
+			obj->flags |= BOUNDARY;
 		}
 
 		/* now we gotta re-assess the whole interesting thing... */
@@ -462,8 +537,10 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 						last_objects[p]->object.flags &= ~FACE_VALUE;
 						last_objects[p] = 0;
 					}
-				} else if (last_objects[p] && !last_objects[p]->object.parsed)
+					obj->flags |= BOUNDARY;
+				} else if (last_objects[p] && !last_objects[p]->object.parsed) {
 					commit_list_insert(co, &last_objects[p]->parents);
+				}
 
 				/* can't close a merge path until all are parents have been encountered */
 				if (GET_COUNT(paths[p])) {
@@ -473,14 +550,33 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 						continue;
 				}
 
+				if (paths[p] & IPATH)
+					ipath_nr--;
+				else
+					upath_nr--;
+
 				paths[p] = 0;
 				last_objects[p] = 0;
 			}
 		}
 
 		/* make topo relations */
-		if (last_objects[path] && !last_objects[path]->object.parsed)
+		if (last_objects[path] && !last_objects[path]->object.parsed) {
 			commit_list_insert(co, &last_objects[path]->parents);
+		}
+
+		/* we've been here already */
+		if (obj->flags & ADDED) {
+			if (entry->uninteresting && !(obj->flags & UNINTERESTING)) {
+				obj->flags |= UNINTERESTING;
+				mark_parents_uninteresting(co);
+				upath_nr--;
+			} else if (!entry->uninteresting)
+				ipath_nr--;
+
+			paths[path] = 0;
+			continue;
+		}
 
 		/* initialize commit */
 		if (!entry->is_end) {
@@ -493,24 +589,51 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 
 		if (entry->uninteresting)
 			obj->flags |= UNINTERESTING;
+		else if (co->date < date)
+			date = co->date;
 
 		/* we need to know what the edges are */
 		last_objects[path] = co;
 
 		/* add to list */
-		if (!(obj->flags & UNINTERESTING) || revs->show_all) {
-			if (entry->is_end)
-				insert_by_date_cached(co, work, insert_cache, &insert_cache);
-			else
-				*queue = &commit_list_insert(co, *queue)->next;
+		if (slop && !(revs->min_age != -1 && co->date > revs->min_age)) {
+
+			if (!(obj->flags & UNINTERESTING) || revs->show_all) {
+				if (entry->is_end)
+					myworkp = &commit_list_insert(co, myworkp)->next;
+				else
+					myqp = &commit_list_insert(co, myqp)->next;
+
+				/* add children to list as well */
+				if (obj->flags & UNINTERESTING)
+					consume_children = 0;
+				else
+					consume_children = 1;
+			}
 
-			/* add children to list as well */
-			if (obj->flags & UNINTERESTING)
-				consume_children = 0;
-			else
-				consume_children = 1;
 		}
 
+		/* should we continue? */
+		if (!slop) {
+			if (!upath_nr) {
+				break;
+			} else if (ioutside || revs->show_all) {
+				/* pass it back to rev-list
+				 * we purposely ignore everything outside this cache, so we don't needlessly traverse the whole
+				 * thing on uninteresting, but that does mean that we may need to bounce back
+				 * and forth a few times with rev-list */
+				myworkp = &commit_list_insert(co, myworkp)->next;
+
+				paths[path] = 0;
+				upath_nr--;
+			} else {
+				break;
+			}
+		} else if (!ipath_nr && co->date <= date)
+			slop--;
+		else
+			slop = SLOP;
+
 		/* open parents */
 		if (entry->merge_nr) {
 			int j, off = index + sizeof(struct rc_object_entry_ondisk);
@@ -525,6 +648,11 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 				if (paths[p] & flag)
 					continue;
 
+				if (flag == IPATH)
+					ipath_nr++;
+				else
+					upath_nr++;
+
 				paths[p] |= flag;
 			}
 
@@ -534,12 +662,54 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 
 	}
 
+	if (date_so_far)
+		*date_so_far = date;
+	if (slop_so_far)
+		*slop_so_far = slop;
 	retval = 0;
 
+	/* success: attach to given lists */
+	if (myqp != &myq) {
+		**queue = myq;
+		*queue = myqp;
+	}
+
+	while ((co = pop_commit(&mywork)) != 0)
+		insert_by_date_cached(co, work, insert_cache, &insert_cache);
+
+	/* free backup */
+	while (pop_commit(&unwork))
+		;
+
 end:
 	free(paths);
 	free(last_objects);
 
+	/* failure: restore work to previous condition
+	 * (cache corruption should *not* be fatal) */
+	if (retval) {
+		while ((co = pop_commit(&unwork)) != 0) {
+			restore_commit(co);
+			co->object.flags |= SEEN;
+			insert_by_date(co, work);
+		}
+
+		/* free lists */
+		while ((co = pop_commit(&myq)) != 0)
+			restore_commit(co);
+
+		while ((co = pop_commit(&mywork)) != 0)
+			restore_commit(co);
+
+		/* truncate object array */
+		for (i = orig_obj_nr; i < revs->pending.nr; i++) {
+			struct object *obj = revs->pending.objects[i].item;
+
+			obj->flags &= ~FACE_VALUE;
+		}
+		revs->pending.nr = orig_obj_nr;
+	}
+
 	return retval;
 }
 
@@ -630,6 +800,7 @@ int traverse_cache_slice(struct rev_info *revs,
 
 	/* load options */
 	rci = &revs->rev_cache_info;
+	add_to_pending = rci->add_to_pending;
 
 	memset(&head, 0, sizeof(struct rc_slice_header));
 
@@ -653,6 +824,10 @@ end:
 	if (fd != -1)
 		close(fd);
 
+	/* remember this! */
+	if (retval)
+		mark_bad_slice(cache_sha1);
+
 	return retval;
 }
 
@@ -1128,7 +1303,7 @@ static int add_objects_verbatim_1(struct rev_cache_slice_map *mapping, int *inde
 	while (i < mapping->size) {
 		int pos = i;
 
-		entry = RC_OBTAIN_OBJECT_ENTRY(map + i;
+		entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
 		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
 
 		if (entry->type == OBJ_COMMIT) {
diff --git a/revision.c b/revision.c
index de9e2e3..155db70 100644
--- a/revision.c
+++ b/revision.c
@@ -12,6 +12,7 @@
 #include "patch-ids.h"
 #include "decorate.h"
 #include "log-tree.h"
+#include "rev-cache.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -638,6 +639,8 @@ static int limit_list(struct rev_info *revs)
 	struct commit_list *list = revs->commits;
 	struct commit_list *newlist = NULL;
 	struct commit_list **p = &newlist;
+	unsigned char *cache_sha1;
+	char used_cache;
 
 	while (list) {
 		struct commit_list *entry = list;
@@ -650,24 +653,39 @@ static int limit_list(struct rev_info *revs)
 
 		if (revs->max_age != -1 && (commit->date < revs->max_age))
 			obj->flags |= UNINTERESTING;
-		if (add_parents_to_list(revs, commit, &list, NULL) < 0)
-			return -1;
-		if (obj->flags & UNINTERESTING) {
-			mark_parents_uninteresting(commit);
-			if (revs->show_all)
-				p = &commit_list_insert(commit, p)->next;
-			slop = still_interesting(list, date, slop);
-			if (slop)
+
+		/* rev-cache to the rescue!!! */
+		used_cache = 0;
+		if (!revs->dont_cache_me && !(obj->flags & ADDED)) {
+			cache_sha1 = get_cache_slice(commit);
+			if (cache_sha1) {
+				if (traverse_cache_slice(revs, cache_sha1, commit, &date, &slop, &p, &list) < 0)
+					used_cache = 0;
+				else
+					used_cache = 1;
+			}
+		}
+
+		if (!used_cache) {
+			if (add_parents_to_list(revs, commit, &list, NULL) < 0)
+				return -1;
+			if (obj->flags & UNINTERESTING) {
+				mark_parents_uninteresting(commit); /* ME: why? */
+				if (revs->show_all)
+					p = &commit_list_insert(commit, p)->next;
+				slop = still_interesting(list, date, slop);
+				if (slop > 0)
+					continue;
+				/* If showing all, add the whole pending list to the end */
+				if (revs->show_all)
+					*p = list;
+				break;
+			}
+			if (revs->min_age != -1 && (commit->date > revs->min_age))
 				continue;
-			/* If showing all, add the whole pending list to the end */
-			if (revs->show_all)
-				*p = list;
-			break;
+			date = commit->date;
+			p = &commit_list_insert(commit, p)->next;
 		}
-		if (revs->min_age != -1 && (commit->date > revs->min_age))
-			continue;
-		date = commit->date;
-		p = &commit_list_insert(commit, p)->next;
 
 		show = show_early_output;
 		if (!show)
@@ -813,6 +831,8 @@ void init_revisions(struct rev_info *revs, const char *prefix)
 		revs->diffopt.prefix = prefix;
 		revs->diffopt.prefix_length = strlen(prefix);
 	}
+
+	init_rev_cache_info(&revs->rev_cache_info);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -1374,6 +1394,11 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, const ch
 	if (revs->reflog_info && revs->graph)
 		die("cannot combine --walk-reflogs with --graph");
 
+	/* limits on caching
+	 * todo: implement this functionality */
+	if (revs->prune || revs->diff)
+		revs->dont_cache_me = 1;
+
 	return left;
 }
 
@@ -1656,6 +1681,8 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 {
 	if (!opt->grep_filter.pattern_list)
 		return 1;
+	if (!commit->object.parsed)
+		parse_commit(commit);
 	return grep_buffer(&opt->grep_filter,
 			   NULL, /* we say nothing, not even filename */
 			   commit->buffer, strlen(commit->buffer));
@@ -1719,6 +1746,7 @@ static struct commit *get_revision_1(struct rev_info *revs)
 	do {
 		struct commit_list *entry = revs->commits;
 		struct commit *commit = entry->item;
+		struct object *obj = &commit->object;
 
 		revs->commits = entry->next;
 		free(entry);
@@ -1735,11 +1763,39 @@ static struct commit *get_revision_1(struct rev_info *revs)
 			if (revs->max_age != -1 &&
 			    (commit->date < revs->max_age))
 				continue;
+
+			if (!revs->dont_cache_me) {
+				struct commit_list *queue = 0, **queuep = &queue;
+				unsigned char *cache_sha1;
+
+				if (obj->flags & ADDED)
+					goto skip_parenting;
+
+				cache_sha1 = get_cache_slice(commit);
+				if (cache_sha1) {
+					if (!traverse_cache_slice(revs, cache_sha1, commit, 0, 0, &queuep, &revs->commits)) {
+						struct commit_list *work = revs->commits;
+
+						/* attach queue to end of ->commits */
+						while (work && work->next)
+							work = work->next;
+
+						if (work)
+							work->next = queue;
+						else
+							revs->commits = queue;
+
+						goto skip_parenting;
+					}
+				}
+			}
+
 			if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0)
 				die("Failed to traverse parents of commit %s",
 				    sha1_to_hex(commit->object.sha1));
 		}
 
+skip_parenting:
 		switch (simplify_commit(revs, commit)) {
 		case commit_ignore:
 			continue;
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
index dc0fc07..982fb15 100755
--- a/t/t6017-rev-cache-list.sh
+++ b/t/t6017-rev-cache-list.sh
@@ -38,6 +38,7 @@ test_expect_success 'init repo' '
 	git add . &&
 	git commit -m "omg" &&
 
+	sleep 2 &&
 	git branch b4 &&
 	git checkout b4 &&
 	echo shazam >file8 &&
@@ -46,7 +47,7 @@ test_expect_success 'init repo' '
 	git merge -m "merge b2" b2 &&
 
 	echo bam >smoke/pipe &&
-	git add .
+	git add . &&
 	git commit -m "bam" &&
 
 	git checkout master &&
@@ -71,18 +72,26 @@ test_expect_success 'init repo' '
 	git add . &&
 	git commit -m "lol" &&
 
+	sleep 2 &&
 	git checkout master &&
 	git merge -m "triple merge" b1 b11 &&
 	git rm -r d1 &&
+	sleep 2 &&
 	git commit -a -m "oh noes"
 '
 
-git-rev-list HEAD --not HEAD~3 >proper_commit_list_limited
-git-rev-list HEAD >proper_commit_list
-git-rev-list HEAD --objects >proper_object_list
+max_date=`git-rev-list --timestamp HEAD~1 --max-count=1 | grep -e "^[0-9]*" -o`
+min_date=`git-rev-list --timestamp b4 --max-count=1 | grep -e "^[0-9]*" -o`
+
+git-rev-list --topo-order HEAD --not HEAD~3 >proper_commit_list_limited
+git-rev-list --topo-order HEAD --not HEAD~2 >proper_commit_list_limited2
+git-rev-list --topo-order HEAD >proper_commit_list
+git-rev-list --objects HEAD >proper_object_list
+git-rev-list HEAD --max-age=$min_date --min-age=$max_date >proper_list_date_limited
+
+cache_sha1=`git-rev-cache add HEAD 2>output.err`
 
 test_expect_success 'make cache slice' '
-	git-rev-cache add HEAD 2>output.err &&
 	grep "final return value: 0" output.err
 '
 
@@ -102,11 +111,141 @@ test_expect_success 'test rev-caches walker directly (unlimited)' '
 	test_cmp_sorted list proper_commit_list
 '
 
+test_expect_success 'test rev-list traversal (limited)' '
+	git-rev-list HEAD --not HEAD~3 >list &&
+	test_cmp list proper_commit_list_limited
+'
+
+test_expect_success 'test rev-list traversal (unlimited)' '
+	git-rev-list HEAD >list &&
+	test_cmp list proper_commit_list
+'
+
 #do the same for objects
 test_expect_success 'test rev-caches walker with objects' '
 	git-rev-cache walk --objects HEAD >list &&
 	test_cmp_sorted list proper_object_list
 '
 
-test_done
+test_expect_success 'test rev-list with objects (topo order)' '
+	git-rev-list --topo-order --objects HEAD >list &&
+	test_cmp_sorted list proper_object_list
+'
+
+test_expect_success 'test rev-list with objects (no order)' '
+	git-rev-list --objects HEAD >list &&
+	test_cmp_sorted list proper_object_list
+'
+
+#verify age limiting
+test_expect_success 'test rev-list date limiting (topo order)' '
+	git-rev-list --topo-order --max-age=$min_date --min-age=$max_date HEAD >list &&
+	test_cmp_sorted list proper_list_date_limited
+'
+
+test_expect_success 'test rev-list date limiting (no order)' '
+	git-rev-list --max-age=$min_date --min-age=$max_date HEAD >list &&
+	test_cmp_sorted list proper_list_date_limited
+'
+
+#check partial cache slice
+test_expect_success 'saving old cache and generating partial slice' '
+	cp ".git/rev-cache/$cache_sha1" .git/rev-cache/.old &&
+	rm ".git/rev-cache/$cache_sha1" .git/rev-cache/index &&
+
+	git-rev-cache add HEAD~2 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+test_expect_success 'rev-list with wholly interesting partial slice' '
+	git-rev-list --topo-order HEAD >list &&
+	test_cmp list proper_commit_list
+'
+
+test_expect_success 'rev-list with partly uninteresting partial slice' '
+	git-rev-list --topo-order HEAD --not HEAD~3 >list &&
+	test_cmp list proper_commit_list_limited
+'
+
+test_expect_success 'rev-list with wholly uninteresting partial slice' '
+	git-rev-list --topo-order HEAD --not HEAD~2 >list &&
+	test_cmp list proper_commit_list_limited2
+'
+
+#try out index generation and fuse (note that --all == HEAD in this case)
+#probably should make a test for that too...
+test_expect_success 'test (non-)fusion of one slice' '
+	git-rev-cache fuse >output.err &&
+	grep "nothing to fuse" output.err
+'
 
+test_expect_success 'make fresh slice' '
+	git-rev-cache add --all --fresh 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+test_expect_success 'check dual slices' '
+	git-rev-list --topo-order HEAD~2 HEAD >list &&
+	test_cmp list proper_commit_list
+'
+
+test_expect_success 'regenerate index' '
+	rm .git/rev-cache/index &&
+	git-rev-cache index 2>output.err &&
+	grep "final return value: 0" output.err
+'
+
+test_expect_success 'fuse slices' '
+	test -e .git/rev-cache/.old &&
+	git-rev-cache fuse 2>output.err &&
+	grep "final return value: 0" output.err &&
+	test_cmp .git/rev-cache/$cache_sha1 .git/rev-cache/.old
+'
+
+#make sure we can smoothly handle corrupted caches
+test_expect_success 'corrupt slice' '
+	echo bla >.git/rev-cache/$cache_sha1
+'
+
+test_expect_success 'test rev-list traversal (limited) (corrupt slice)' '
+	git-rev-list --topo-order HEAD --not HEAD~3 >list &&
+	test_cmp list proper_commit_list_limited
+'
+
+test_expect_success 'test rev-list traversal (unlimited) (corrupt slice)' '
+	git-rev-list HEAD >list &&
+	test_cmp_sorted list proper_commit_list
+'
+
+test_expect_success 'corrupt index' '
+	echo blu >.git/rev-cache/index
+'
+
+test_expect_success 'test rev-list traversal (limited) (corrupt index)' '
+	git-rev-list --topo-order HEAD --not HEAD~3 >list &&
+	test_cmp list proper_commit_list_limited
+'
+
+test_expect_success 'test rev-list traversal (unlimited) (corrupt index)' '
+	git-rev-list HEAD >list &&
+	test_cmp_sorted list proper_commit_list
+'
+
+#test --ignore-size in fuse
+rm .git/rev-cache/*
+cache_sha1=`git-rev-cache add HEAD~2 2>output.err`
+
+test_expect_success 'make fragmented slices' '
+	git-rev-cache add HEAD~1 --not HEAD~2 2>>output.err &&
+	git-rev-cache add HEAD --fresh 2>>output.err &&
+	test `grep "final return value: 0" output.err | wc -l` -eq 3
+'
+
+cache_size=$(wc -c < .git/rev-cache/$cache_sha1)
+test_expect_success 'test --ignore-size function in fuse' '
+	git-rev-cache fuse --ignore-size=$cache_size 2>output.err &&
+	grep "final return value: 0" output.err &&
+	test -e .git/rev-cache/$cache_sha1
+'
+
+test_done
-- 
tg: (1d78545..) t/revcache/integration (depends on: t/revcache/misc)

^ permalink raw reply related

* Re: [PATCH 6/6 (v4)] support for path name caching in rev-cache
From: Nick Edelen @ 2009-10-19 20:31 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qwlmtdk399@sirnot.private>

An update to caching mechanism, allowing path names to be cached for blob and 
tree objects.  A list of names appearing in each cache slice is appended to the 
end of the slice, which is referenced by variable-sized indexes per entry.  
This allows pack-objects to more intelligently schedule unpacked/poorly packed 
object, and enables proper duplication of rev-list's behaivor.

The mechanism for this involves adding a 'name' field to blob and tree objects, 
mainly to facilitate reuse of caches during maintenence (like the 'unique' 
field).

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
 builtin-rev-cache.c       |    3 +-
 rev-cache.c               |  332 +++++++++++++++++++++++++++++++++++++--------
 rev-cache.h               |   16 ++-
 revision.h                |    6 +-
 t/t6017-rev-cache-list.sh |    8 +-
 5 files changed, 300 insertions(+), 65 deletions(-)

diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
index 8f41123..4c1766d 100644
--- a/builtin-rev-cache.c
+++ b/builtin-rev-cache.c
@@ -177,13 +177,14 @@ static int handle_walk(int argc, const char *argv[])
 	fprintf(stderr, "pending:\n");
 	for (i = 0; i < revs.pending.nr; i++) {
 		struct object *obj = revs.pending.objects[i].item;
+		const char *name = revs.pending.objects[i].name;
 
 		/* unfortunately, despite our careful generation, object duplication *is* a possibility...
 		 * (eg. same object introduced into two different branches) */
 		if (obj->flags & SEEN)
 			continue;
 
-		printf("%s\n", sha1_to_hex(revs.pending.objects[i].item->sha1));
+		printf("%s %s\n", sha1_to_hex(revs.pending.objects[i].item->sha1), name);
 		obj->flags |= SEEN;
 	}
 
diff --git a/rev-cache.c b/rev-cache.c
index 4ef5287..6c96297 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -17,6 +17,14 @@ struct bad_slice {
 	struct bad_slice *next;
 };
 
+struct name_list {
+	unsigned char sha1[20];
+	unsigned int len;
+	struct name_list *next;
+
+	char buf[FLEX_ARRAY];
+};
+
 struct cache_slice_pointer {
 	char signature[8]; /* REVCOPTR */
 	char version;
@@ -29,10 +37,13 @@ static uint32_t fanout[0xff + 2];
 static unsigned char *idx_map;
 static int idx_size;
 static struct rc_index_header idx_head;
-static char no_idx, add_to_pending;
-static struct bad_slice *bad_slices;
+static char no_idx, add_to_pending, add_names;
 static unsigned char *idx_caches;
 
+static struct bad_slice *bad_slices;
+static struct name_list *name_lists, *cur_name_list;
+
+static struct strbuf *acc_name_buffer;
 static struct strbuf *acc_buffer;
 
 #define SLOP			5
@@ -79,7 +90,7 @@ struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondis
 	if (!dst)
 		dst = &entry[cur++ & 0x3];
 
-	dst->type = src->flags >> 5;
+	dst->type = src->flags >> 5 & 0x03;
 	dst->is_end = !!(src->flags & 0x10);
 	dst->is_start = !!(src->flags & 0x08);
 	dst->uninteresting = !!(src->flags & 0x04);
@@ -90,8 +101,9 @@ struct rc_object_entry *from_disked_rc_object_entry(struct rc_object_entry_ondis
 	dst->merge_nr = src->merge_nr;
 	dst->split_nr = src->split_nr;
 
-	dst->size_size = src->sizes >> 5;
-	dst->padding = src->sizes & 0x1f;
+	dst->size_size = src->sizes >> 5 & 0x03;
+	dst->name_size = src->sizes >> 2 & 0x03;
+	dst->padding = src->sizes & 0x02;
 
 	dst->date = ntohl(src->date);
 	dst->path = ntohs(src->path);
@@ -120,6 +132,7 @@ struct rc_object_entry_ondisk *to_disked_rc_object_entry(struct rc_object_entry
 	dst->split_nr = src->split_nr;
 
 	dst->sizes  = (unsigned char)src->size_size << 5;
+	dst->sizes |= (unsigned char)src->name_size << 2;
 	dst->sizes |= (unsigned char)src->padding;
 
 	dst->date = htonl(src->date);
@@ -192,6 +205,12 @@ static void cleanup_cache_slices(void)
 		idx_map = 0;
 	}
 
+	while (name_lists) {
+		struct name_list *nl = name_lists->next;
+		free(name_lists);
+		name_lists = nl;
+	}
+
 }
 
 static int init_index(void)
@@ -324,7 +343,7 @@ static void handle_noncommit(struct rev_info *revs, struct commit *commit, unsig
 	struct blob *blob;
 	struct tree *tree;
 	struct object *obj;
-	unsigned long size;
+	unsigned long size, name_index;
 
 	size = decode_size(ptr + RC_ENTRY_SIZE_OFFSET(entry), entry->size_size);
 	switch (entry->type) {
@@ -357,9 +376,23 @@ static void handle_noncommit(struct rev_info *revs, struct commit *commit, unsig
 		return;
 	}
 
+	if (add_names && cur_name_list) {
+		name_index = decode_size(ptr + RC_ENTRY_NAME_OFFSET(entry), entry->name_size);
+
+		if (name_index >= cur_name_list->len)
+			name_index = 0;
+	} else
+		name_index = 0;
+
 	obj->flags |= FACE_VALUE;
-	if (add_to_pending)
-		add_pending_object(revs, obj, "");
+	if (add_to_pending) {
+		char *name = "";
+
+		if (name_index)
+			name = cur_name_list->buf + name_index;
+
+		add_pending_object(revs, obj, name);
+	}
 }
 
 static int setup_traversal(struct rc_slice_header *head, unsigned char *map, struct commit *commit, struct commit_list **work,
@@ -713,15 +746,44 @@ end:
 	return retval;
 }
 
-static int get_cache_slice_header(unsigned char *cache_sha1, unsigned char *map, int len, struct rc_slice_header *head)
+static struct name_list *get_cache_slice_name_list(struct rc_slice_header *head, int fd)
+{
+	struct name_list *nl = name_lists;
+
+	while (nl) {
+		if (!hashcmp(nl->sha1, head->sha1))
+			break;
+		nl = nl->next;
+	}
+
+	if (nl)
+		return nl;
+
+	nl = xcalloc(1, sizeof(struct name_list) + head->name_size);
+	nl->len = head->name_size;
+	hashcpy(nl->sha1, head->sha1);
+
+	lseek(fd, head->size, SEEK_SET);
+	read_in_full(fd, nl->buf, head->name_size);
+
+	nl->next = name_lists;
+	name_lists = nl;
+
+	return nl;
+}
+
+static int get_cache_slice_header(int fd, unsigned char *cache_sha1, int len, struct rc_slice_header *head)
 {
 	int t;
 
-	memcpy(head, map, sizeof(struct rc_slice_header));
+	if (xread(fd, head, sizeof(struct rc_slice_header)) != sizeof(struct rc_slice_header))
+		return -1;
+
 	head->ofs_objects = ntohl(head->ofs_objects);
 	head->object_nr = ntohl(head->object_nr);
 	head->size = ntohl(head->size);
 	head->path_nr = ntohs(head->path_nr);
+	head->name_size = ntohl(head->name_size);
 
 	if (memcmp(head->signature, "REVCACHE", 8))
 		return -1;
@@ -730,10 +792,10 @@ static int get_cache_slice_header(unsigned char *cache_sha1, unsigned char *map,
 	if (hashcmp(head->sha1, cache_sha1))
 		return -3;
 	t = sizeof(struct rc_slice_header);
-	if (t != head->ofs_objects || t >= len)
+	if (t != head->ofs_objects)
 		return -4;
-
-	head->size = len;
+	if (head->size + head->name_size != len)
+		return -5;
 
 	return 0;
 }
@@ -785,7 +847,7 @@ int traverse_cache_slice(struct rev_info *revs,
 	unsigned long *date_so_far, int *slop_so_far,
 	struct commit_list ***queue, struct commit_list **work)
 {
-	int fd = -1, retval = -3;
+	int fd = -1, t, retval;
 	struct stat fi;
 	struct rc_slice_header head;
 	struct rev_cache_info *rci;
@@ -801,26 +863,31 @@ int traverse_cache_slice(struct rev_info *revs,
 	/* load options */
 	rci = &revs->rev_cache_info;
 	add_to_pending = rci->add_to_pending;
+	add_names = rci->add_names;
 
 	memset(&head, 0, sizeof(struct rc_slice_header));
+#	define ERROR(x)		do { retval = (x); goto end; } while (0);
 
 	fd = open_cache_slice(cache_sha1, O_RDONLY);
 	if (fd == -1)
-		goto end;
+		ERROR(-1);
 	if (fstat(fd, &fi) || fi.st_size < sizeof(struct rc_slice_header))
-		goto end;
+		ERROR(-2);
+
+	if ((t = get_cache_slice_header(fd, cache_sha1, fi.st_size, &head)) < 0)
+		ERROR(-t);
+	if (add_names)
+		cur_name_list = get_cache_slice_name_list(&head, fd);
 
-	map = xmmap(0, fi.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+	map = xmmap(0, head.size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
 	if (map == MAP_FAILED)
-		goto end;
-	if (get_cache_slice_header(cache_sha1, map, fi.st_size, &head))
-		goto end;
+		ERROR(-3);
 
 	retval = traverse_cache_slice_1(&head, map, revs, commit, date_so_far, slop_so_far, queue, work);
 
 end:
 	if (map != MAP_FAILED)
-		munmap(map, fi.st_size);
+		munmap(map, head.size);
 	if (fd != -1)
 		close(fd);
 
@@ -828,6 +895,7 @@ end:
 	if (retval)
 		mark_bad_slice(cache_sha1);
 
+#	undef ERROR
 	return retval;
 }
 
@@ -1112,23 +1180,110 @@ static unsigned long decode_size(unsigned char *str, int len)
 	return size;
 }
 
+
+#define NL_HASH_TABLE_SIZE		(0xffff + 1)
+#define NL_HASH_NUMBER			(NL_HASH_TABLE_SIZE >> 3)
+
+struct name_list_hash {
+	int ind;
+	struct name_list_hash *next;
+};
+
+static struct name_list_hash **nl_hash_table;
+static unsigned char *nl_hashes;
+
+/* FNV-1a hash */
+static unsigned int hash_name(const char *name)
+{
+	unsigned int hash = 2166136261ul;
+	const char *p = name;
+
+	while (*p) {
+		hash ^= *p++;
+		hash *= 16777619ul;
+	}
+
+	return hash & 0xffff;
+}
+
+static int name_in_list(const char *name)
+{
+	unsigned int h = hash_name(name);
+	struct name_list_hash *entry = nl_hash_table[h];
+
+	while (entry && strcmp(acc_name_buffer->buf + entry->ind, name))
+		entry = entry->next;
+
+	if (entry)
+		return entry->ind;
+
+	/* add name to buffer and create hash reference */
+	entry = xcalloc(1, sizeof(struct name_list_hash));
+	entry->ind = acc_name_buffer->len;
+	strbuf_add(acc_name_buffer, name, strlen(name) + 1);
+
+	entry->next = nl_hash_table[h];
+	nl_hash_table[h] = entry;
+
+	nl_hashes[h / 8] |= h % 8;
+
+	return entry->ind;
+}
+
+static void init_name_list_hash(void)
+{
+	nl_hash_table = xcalloc(NL_HASH_TABLE_SIZE, sizeof(struct name_list_hash));
+	nl_hashes = xcalloc(NL_HASH_NUMBER, 1);
+}
+
+static void cleanup_name_list_hash(void)
+{
+	int i;
+
+	for (i = 0; i < NL_HASH_NUMBER; i++) {
+		int j, ind = nl_hashes[i];
+
+		if (!ind)
+			continue;
+
+		for (j = 0; j < 8; j++) {
+			struct name_list_hash **entryp;
+
+			if (!(ind & 1 << j))
+				continue;
+
+			entryp = &nl_hash_table[i * 8 + j];
+			while (*entryp) {
+				struct name_list_hash *t = (*entryp)->next;
+
+				free(*entryp);
+				*entryp = t;
+			}
+		}
+	} /* code overhang! */
+
+	free(nl_hashes);
+	free(nl_hash_table);
+}
+
 static void add_object_entry(const unsigned char *sha1, struct rc_object_entry *entryp,
-	struct strbuf *merge_str, struct strbuf *split_str)
+	struct strbuf *merge_str, struct strbuf *split_str, char *name, unsigned long size)
 {
 	struct rc_object_entry entry;
-	unsigned char size_str[7];
-	unsigned long size;
+	unsigned char size_str[7], name_str[7];
 	enum object_type type;
 	void *data;
 
 	if (entryp)
 		sha1 = entryp->sha1;
 
-	/* retrieve size data */
-	data = read_sha1_file(sha1, &type, &size);
+	if (!size) {
+		/* retrieve size data */
+		data = read_sha1_file(sha1, &type, &size);
 
-	if (data)
-		free(data);
+		if (data)
+			free(data);
+	}
 
 	/* initialize! */
 	if (!entryp) {
@@ -1146,6 +1301,9 @@ static void add_object_entry(const unsigned char *sha1, struct rc_object_entry *
 
 	entryp->size_size = encode_size(size, size_str);
 
+	if (name)
+		entryp->name_size = encode_size(name_in_list(name), name_str);
+
 	/* write the muvabitch */
 	strbuf_add(acc_buffer, to_disked_rc_object_entry(entryp, 0), sizeof(struct rc_object_entry_ondisk));
 
@@ -1155,25 +1313,36 @@ static void add_object_entry(const unsigned char *sha1, struct rc_object_entry *
 		strbuf_add(acc_buffer, split_str->buf, split_str->len);
 
 	strbuf_add(acc_buffer, size_str, entryp->size_size);
+
+	if (name)
+		strbuf_add(acc_buffer, name_str, entryp->name_size);
 }
 
 /* returns non-zero to continue parsing, 0 to skip */
 typedef int (*dump_tree_fn)(const unsigned char *, const char *, unsigned int); /* sha1, path, mode */
 
 /* we need to walk the trees by hash, so unfortunately we can't use traverse_trees in tree-walk.c */
-static int dump_tree(struct tree *tree, dump_tree_fn fn)
+static int dump_tree(struct tree *tree, dump_tree_fn fn, char *base)
 {
 	struct tree_desc desc;
 	struct name_entry entry;
 	struct tree *subtree;
-	int r;
+	char concatpath[PATH_MAX];
+	int r, baselen;
 
 	if (parse_tree(tree))
 		return -1;
 
+	baselen = strlen(base);
+	strcpy(concatpath, base);
+
 	init_tree_desc(&desc, tree->buffer, tree->size);
 	while (tree_entry(&desc, &entry)) {
-		switch (fn(entry.sha1, entry.path, entry.mode)) {
+		if (baselen + strlen(entry.path) + 1 >= PATH_MAX)
+			die("we have a problem: %s%s is too big for me to handle", base, entry.path);
+		strcpy(concatpath + baselen, entry.path);
+
+		switch (fn(entry.sha1, concatpath, entry.mode)) {
 		case 0:
 			goto continue_loop;
 		default:
@@ -1185,7 +1354,8 @@ static int dump_tree(struct tree *tree, dump_tree_fn fn)
 			if (!subtree)
 				return -2;
 
-			if ((r = dump_tree(subtree, fn)) < 0)
+			strcat(concatpath, "/");
+			if ((r = dump_tree(subtree, fn, concatpath)) < 0)
 				return r;
 		}
 
@@ -1199,6 +1369,9 @@ continue_loop:
 static int dump_tree_callback(const unsigned char *sha1, const char *path, unsigned int mode)
 {
 	strbuf_add(acc_buffer, sha1, 20);
+	strbuf_add(acc_buffer, (char *)&acc_name_buffer->len, sizeof(size_t));
+
+	strbuf_add(acc_name_buffer, path, strlen(path) + 1);
 
 	return 1;
 }
@@ -1212,6 +1385,9 @@ static void tree_addremove(struct diff_options *options,
 		return;
 
 	strbuf_add(acc_buffer, sha1, 20);
+	strbuf_add(acc_buffer, (char *)&acc_name_buffer->len, sizeof(size_t));
+
+	strbuf_add(acc_name_buffer, concatpath, strlen(concatpath) + 1);
 }
 
 static void tree_change(struct diff_options *options,
@@ -1224,12 +1400,15 @@ static void tree_change(struct diff_options *options,
 		return;
 
 	strbuf_add(acc_buffer, new_sha1, 20);
+	strbuf_add(acc_buffer, (char *)&acc_name_buffer->len, sizeof(size_t));
+
+	strbuf_add(acc_name_buffer, concatpath, strlen(concatpath) + 1);
 }
 
 static int add_unique_objects(struct commit *commit)
 {
 	struct commit_list *list;
-	struct strbuf os, ost, *orig_buf;
+	struct strbuf os, ost, names, *orig_name_buf, *orig_buf;
 	struct diff_options opts;
 	int i, j, next;
 	char is_first = 1;
@@ -1237,13 +1416,17 @@ static int add_unique_objects(struct commit *commit)
 	/* ...no, calculate unique objects */
 	strbuf_init(&os, 0);
 	strbuf_init(&ost, 0);
+	strbuf_init(&names, 0);
 	orig_buf = acc_buffer;
+	orig_name_buf = acc_name_buffer;
+	acc_name_buffer = &names;
 
 	diff_setup(&opts);
 	DIFF_OPT_SET(&opts, RECURSIVE);
 	DIFF_OPT_SET(&opts, TREE_IN_RECURSIVE);
 	opts.change = tree_change;
 	opts.add_remove = tree_addremove;
+#	define ENTRY_SIZE (20 + sizeof(size_t))
 
 	/* this is only called for non-ends (ie. all parents interesting) */
 	for (list = commit->parents; list; list = list->next) {
@@ -1254,20 +1437,20 @@ static int add_unique_objects(struct commit *commit)
 
 		strbuf_setlen(acc_buffer, 0);
 		diff_tree_sha1(list->item->tree->object.sha1, commit->tree->object.sha1, "", &opts);
-		qsort(acc_buffer->buf, acc_buffer->len / 20, 20, (int (*)(const void *, const void *))hashcmp);
+		qsort(acc_buffer->buf, acc_buffer->len / ENTRY_SIZE, ENTRY_SIZE, (int (*)(const void *, const void *))hashcmp);
 
 		/* take intersection */
 		if (!is_first) {
-			for (next = i = j = 0; i < os.len; i += 20) {
+			for (next = i = j = 0; i < os.len; i += ENTRY_SIZE) {
 				while (j < ost.len && hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)) < 0)
-					j += 20;
+					j += ENTRY_SIZE;
 
 				if (j >= ost.len || hashcmp((unsigned char *)(ost.buf + j), (unsigned char *)(os.buf + i)))
 					continue;
 
 				if (next != i)
-					memcpy(os.buf + next, os.buf + i, 20);
-				next += 20;
+					memcpy(os.buf + next, os.buf + i, ENTRY_SIZE);
+				next += ENTRY_SIZE;
 			}
 
 			if (next != i)
@@ -1279,29 +1462,37 @@ static int add_unique_objects(struct commit *commit)
 	/* no parents (!) */
 	if (is_first) {
 		acc_buffer = &os;
-		dump_tree(commit->tree, dump_tree_callback);
+		dump_tree(commit->tree, dump_tree_callback, "");
 	}
 
 	/* the ordering of non-commit objects dosn't really matter, so we're not gonna bother */
 	acc_buffer = orig_buf;
-	for (i = 0; i < os.len; i += 20)
-		add_object_entry((unsigned char *)(os.buf + i), 0, 0, 0);
+	acc_name_buffer = orig_name_buf;
+	for (i = 0; i < os.len; i += ENTRY_SIZE)
+		add_object_entry((unsigned char *)(os.buf + i), 0, 0, 0, names.buf + *(size_t *)(os.buf + i + 20), 0);
 
 	/* last but not least, the main tree */
-	add_object_entry(commit->tree->object.sha1, 0, 0, 0);
+	add_object_entry(commit->tree->object.sha1, 0, 0, 0, 0, 0);
 
-	return i / 20 + 1;
+	strbuf_release(&ost);
+	strbuf_release(&os);
+	strbuf_release(&names);
+
+	return i / ENTRY_SIZE + 1;
+#	undef ENTRY_SIZE
 }
 
 static int add_objects_verbatim_1(struct rev_cache_slice_map *mapping, int *index)
 {
-	unsigned char *map = mapping->map;
 	int i = *index, object_nr = 0;
+	unsigned char *map = mapping->map;
 	struct rc_object_entry *entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
+	unsigned long size;
 
 	i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
 	while (i < mapping->size) {
-		int pos = i;
+		char *name;
+		int name_index, pos = i;
 
 		entry = RC_OBTAIN_OBJECT_ENTRY(map + i);
 		i += RC_ACTUAL_OBJECT_ENTRY_SIZE(entry);
@@ -1311,7 +1502,15 @@ static int add_objects_verbatim_1(struct rev_cache_slice_map *mapping, int *inde
 			return object_nr;
 		}
 
-		strbuf_add(acc_buffer, map + pos, i - pos);
+		name_index = decode_size(map + pos + RC_ENTRY_NAME_OFFSET(entry), entry->name_size);
+		if (name_index && name_index < mapping->name_size)
+			name = mapping->names + name_index;
+		else
+			name = 0;
+
+		size = decode_size(map + pos + RC_ENTRY_SIZE_OFFSET(entry), entry->size_size);
+
+		add_object_entry(0, entry, 0, 0, name, size);
 		object_nr++;
 	}
 
@@ -1396,6 +1595,7 @@ void init_rev_cache_info(struct rev_cache_info *rci)
 	rci->overwrite_all = 0;
 
 	rci->add_to_pending = 1;
+	rci->add_names = 1;
 
 	rci->ignore_size = 0;
 }
@@ -1420,9 +1620,9 @@ int make_cache_slice(struct rev_cache_info *rci,
 	struct rc_slice_header head;
 	struct commit *commit;
 	unsigned char sha1[20];
-	struct strbuf merge_paths, split_paths;
+	struct strbuf merge_paths, split_paths, namelist;
 	int object_nr, total_sz, fd;
-	char file[PATH_MAX], *newfile;
+	char file[PATH_MAX], null, *newfile;
 	struct rev_cache_info *trci;
 	git_SHA_CTX ctx;
 
@@ -1437,7 +1637,13 @@ int make_cache_slice(struct rev_cache_info *rci,
 	strbuf_init(&endlist, 0);
 	strbuf_init(&merge_paths, 0);
 	strbuf_init(&split_paths, 0);
+	strbuf_init(&namelist, 0);
 	acc_buffer = &buffer;
+	acc_name_buffer = &namelist;
+
+	null = 0;
+	strbuf_add(&namelist, &null, 1);
+	init_name_list_hash();
 
 	if (!revs) {
 		revs = &therevs;
@@ -1468,6 +1674,7 @@ int make_cache_slice(struct rev_cache_info *rci,
 	trci = &revs->rev_cache_info;
 	init_rev_cache_info(trci);
 	trci->add_to_pending = 0;
+	trci->add_names = 0;
 
 	setup_revisions(0, 0, revs, 0);
 	if (prepare_revision_walk(revs))
@@ -1505,7 +1712,7 @@ int make_cache_slice(struct rev_cache_info *rci,
 
 		commit->indegree = 0;
 
-		add_object_entry(0, &object, &merge_paths, &split_paths);
+		add_object_entry(0, &object, &merge_paths, &split_paths, 0, 0);
 		object_nr++;
 
 		if (rci->objects && !object.is_end) {
@@ -1531,10 +1738,16 @@ int make_cache_slice(struct rev_cache_info *rci,
 		total_sz += buffer.len;
 	}
 
+	/* write path name lookup list */
+	head.name_size = htonl(namelist.len);
+	write_in_full(fd, namelist.buf, namelist.len);
+
 	/* go ahead a free some stuff... */
 	strbuf_release(&buffer);
 	strbuf_release(&merge_paths);
 	strbuf_release(&split_paths);
+	strbuf_release(&namelist);
+	cleanup_name_list_hash();
 	if (path_sz)
 		free(paths);
 	while (path_track_alloc)
@@ -1992,6 +2205,7 @@ int fuse_cache_slices(struct rev_cache_info *rci, struct rev_info *revs)
 	for (i = idx_head.cache_nr - 1; i >= 0; i--) {
 		struct rev_cache_slice_map *map = rci->maps + i;
 		struct stat fi;
+		struct rc_slice_header head;
 		int fd;
 
 		if (!map->size)
@@ -2004,13 +2218,20 @@ int fuse_cache_slices(struct rev_cache_info *rci, struct rev_info *revs)
 			continue;
 		if (fi.st_size < sizeof(struct rc_slice_header))
 			continue;
+		if (get_cache_slice_header(fd, idx_caches + i * 20, fi.st_size, &head))
+			continue;
 
-		map->map = xmmap(0, fi.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+		map->map = xmmap(0, head.size, PROT_READ, MAP_PRIVATE, fd, 0);
 		if (map->map == MAP_FAILED)
 			continue;
 
+		lseek(fd, head.size, SEEK_SET);
+		map->names = xcalloc(head.name_size, 1);
+		read_in_full(fd, map->names, head.name_size);
+
 		close(fd);
-		map->size = fi.st_size;
+		map->size = head.size;
+		map->name_size = head.name_size;
 	}
 
 	rci->make_index = 0;
@@ -2027,6 +2248,7 @@ int fuse_cache_slices(struct rev_cache_info *rci, struct rev_info *revs)
 		if (!map->size)
 			continue;
 
+		free(map->names);
 		munmap(map->map, map->size);
 	}
 	free(rci->maps);
@@ -2048,7 +2270,6 @@ static int verify_cache_slice(const char *slice_path, unsigned char *sha1)
 {
 	struct rc_slice_header head;
 	int fd, len, retval = -1;
-	unsigned char *map = MAP_FAILED;
 	struct stat fi;
 
 	len = strlen(slice_path);
@@ -2063,17 +2284,12 @@ static int verify_cache_slice(const char *slice_path, unsigned char *sha1)
 	if (fstat(fd, &fi) || fi.st_size < sizeof(head))
 		goto end;
 
-	map = xmmap(0, sizeof(head), PROT_READ, MAP_PRIVATE, fd, 0);
-	if (map == MAP_FAILED)
-		goto end;
-	if (get_cache_slice_header(sha1, map, fi.st_size, &head))
+	if (get_cache_slice_header(fd, sha1, fi.st_size, &head))
 		goto end;
 
 	retval = 0;
 
 end:
-	if (map != MAP_FAILED)
-		munmap(map, sizeof(head));
 	if (fd > 0)
 		close(fd);
 
diff --git a/rev-cache.h b/rev-cache.h
index a1af337..0fa9b44 100644
--- a/rev-cache.h
+++ b/rev-cache.h
@@ -10,8 +10,14 @@
 #define RC_OBTAIN_OBJECT_ENTRY(p)			from_disked_rc_object_entry((struct rc_object_entry_ondisk *)(p), 0)
 #define RC_OBTAIN_INDEX_ENTRY(p)			from_disked_rc_index_entry((struct rc_index_entry_ondisk *)(p), 0)
 
-#define RC_ACTUAL_OBJECT_ENTRY_SIZE(e)		(sizeof(struct rc_object_entry_ondisk) + RC_PATH_SIZE((e)->merge_nr + (e)->split_nr) + (e)->size_size)
-#define RC_ENTRY_SIZE_OFFSET(e)				(RC_ACTUAL_OBJECT_ENTRY_SIZE(e) - (e)->size_size)
+#define RC_ACTUAL_OBJECT_ENTRY_SIZE(e)	(\
+	sizeof(struct rc_object_entry_ondisk) + \
+	RC_PATH_SIZE((e)->merge_nr + (e)->split_nr) + \
+	(e)->size_size + \
+	(e)->name_size\
+)
+#define RC_ENTRY_SIZE_OFFSET(e)			(RC_ACTUAL_OBJECT_ENTRY_SIZE(e) - (e)->name_size - (e)->size_size)
+#define RC_ENTRY_NAME_OFFSET(e)			(RC_ACTUAL_OBJECT_ENTRY_SIZE(e) - (e)->name_size)
 
 /* single index maps objects to cache files */
 struct rc_index_header {
@@ -50,6 +56,8 @@ struct rc_slice_header {
 	uint32_t size;
 
 	unsigned char sha1[20];
+
+	uint32_t name_size;
 };
 
 struct rc_object_entry_ondisk {
@@ -76,7 +84,8 @@ struct rc_object_entry {
 	unsigned char merge_nr; /* : 7 */
 	unsigned char split_nr; /* : 7 */
 	unsigned size_size:3;
-	unsigned padding:5;
+	unsigned name_size:3;
+	unsigned padding:2;
 
 	uint32_t date;
 	uint16_t path;
@@ -84,6 +93,7 @@ struct rc_object_entry {
 	/* merge paths */
 	/* split paths */
 	/* size */
+	/* name id */
 };
 
 struct rc_index_entry *from_disked_rc_index_entry(struct rc_index_entry_ondisk *src, struct rc_index_entry *dst);
diff --git a/revision.h b/revision.h
index d160e14..dd51d27 100644
--- a/revision.h
+++ b/revision.h
@@ -26,6 +26,9 @@ struct rev_cache_slice_map {
 	unsigned char *map;
 	int size;
 	int last_index;
+
+	char *names;
+	int name_size;
 };
 
 struct rev_cache_info {
@@ -39,7 +42,8 @@ struct rev_cache_info {
 	unsigned overwrite_all : 1;
 
 	/* traversal flags */
-	unsigned add_to_pending : 1;
+	unsigned add_to_pending : 1,
+		add_names : 1;
 
 	/* fuse options */
 	unsigned int ignore_size;
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
index 982fb15..f0f3bcf 100755
--- a/t/t6017-rev-cache-list.sh
+++ b/t/t6017-rev-cache-list.sh
@@ -4,8 +4,10 @@ test_description='git rev-cache tests'
 . ./test-lib.sh
 
 test_cmp_sorted() {
-	grep -io "[a-f0-9]*" $1 | sort >.tmpfile1 &&
-	grep -io "[a-f0-9]*" $2 | sort >.tmpfile2 &&
+# note that we're tip-toeing around the corner case of two objects/names
+# for the same SHA-1 => discrepencies between cached and non-cached walks
+	sort $1 >.tmpfile1 &&
+	sort $2 >.tmpfile2 &&
 	test_cmp .tmpfile1 .tmpfile2
 }
 
@@ -15,6 +17,8 @@ test_cmp_sorted() {
 # reuse
 test_expect_success 'init repo' '
 	echo bla >file &&
+	mkdir amaindir &&
+	echo watskeburt >amaindir/file &&
 	git add . &&
 	git commit -m "bla" &&
 
-- 
tg: (9e5164a..) t/revcache/names (depends on: t/revcache/docs)

^ permalink raw reply related

* Re: [PATCH 7/6 (v4)] support for commit grafts, slight change to general mechanism
From: Nick Edelen @ 2009-10-19 20:31 UTC (permalink / raw)
  To: Nick Edelen, Junio C Hamano, Nicolas Pitre, Johannes Schindelin,
	Sam Vilain, Micha
In-Reply-To: <op.uzv4dyuotdk399@sirnot.private>

Adds support for graft commits in rev-cache (w/ test), and slightly alters
graft mechanism.  Before, parse_commit() checked the graft list on every
commit.  Now register_commit_graft() preemptively loads graft commits into
memory, and sets a new 'graft' flag in the object.  This allows awareness of
the commits' medical history without searching a (normally private) array upon
each commit.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
fixed bug in mechanism alteration, which was causing test t6001 to fail.

 builtin-rev-cache.c       |   14 ++++++++++++--
 commit.c                  |   27 +++++++++++++++++++++++++--
 object.h                  |    3 ++-
 rev-cache.c               |   32 ++++++++++++++++++++++++++++++++
 t/t6017-rev-cache-list.sh |    6 ++++++
 5 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/builtin-rev-cache.c b/builtin-rev-cache.c
index 4c1766d..b36bc39 100644
--- a/builtin-rev-cache.c
+++ b/builtin-rev-cache.c
@@ -102,8 +102,18 @@ static int test_rev_list(int argc, const char *argv[])
 			flags ^= UNINTERESTING;
 		else if (!strcmp(argv[i], "--objects"))
 			revs.tree_objects = revs.blob_objects = 1;
-		else
-			handle_revision_arg(argv[i], &revs, flags, 1);
+		else {
+			struct commit_graft graft;
+
+			if (argv[i][0] == ':') {
+				handle_revision_arg(argv[i] + 1, &revs, flags, 1);
+
+				hashcpy(graft.sha1, revs.pending.objects[revs.pending.nr - 1].item->sha1);
+				graft.nr_parent = -1;
+				register_commit_graft(&graft, 0);
+			} else
+				handle_revision_arg(argv[i], &revs, flags, 1);
+		}
 	}
 
 	setup_revisions(0, 0, &revs, 0);
diff --git a/commit.c b/commit.c
index 61d83c6..c227748 100644
--- a/commit.c
+++ b/commit.c
@@ -99,6 +99,7 @@ static int commit_graft_pos(const unsigned char *sha1)
 
 int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 {
+	struct commit *commit;
 	int pos = commit_graft_pos(graft->sha1);
 
 	if (0 <= pos) {
@@ -123,6 +124,12 @@ int register_commit_graft(struct commit_graft *graft, int ignore_dups)
 			(commit_graft_nr - pos - 1) *
 			sizeof(*commit_graft));
 	commit_graft[pos] = graft;
+
+	commit = lookup_commit(graft->sha1);
+	commit->object.graft = 1;
+	commit->object.parsed = 0;
+	parse_commit(commit); /* in case commit was already parsed */
+
 	return 0;
 }
 
@@ -221,6 +228,7 @@ int write_shallow_commits(int fd, int use_pack_protocol)
 
 int unregister_shallow(const unsigned char *sha1)
 {
+	struct commit *commit;
 	int pos = commit_graft_pos(sha1);
 	if (pos < 0)
 		return -1;
@@ -229,6 +237,12 @@ int unregister_shallow(const unsigned char *sha1)
 				sizeof(struct commit_graft *)
 				* (commit_graft_nr - pos - 1));
 	commit_graft_nr--;
+
+	commit = lookup_commit(sha1);
+	commit->object.graft = 0;
+	commit->object.parsed = 0;
+	parse_commit(commit);
+
 	return 0;
 }
 
@@ -255,7 +269,13 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
 	while (pop_commit(pptr))
 		; /* clear anything from cache */
 
-	graft = lookup_commit_graft(item->object.sha1);
+	/* make sure .graft flag is initialized */
+	prepare_commit_graft();
+	if (item->object.graft)
+		graft = lookup_commit_graft(item->object.sha1);
+	else
+		graft = 0;
+
 	while (bufptr + 48 < tail && !memcmp(bufptr, "parent ", 7)) {
 		struct commit *new_parent;
 
@@ -283,7 +303,10 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
 				continue;
 			pptr = &commit_list_insert(new_parent, pptr)->next;
 		}
-	}
+		item->object.graft = 1;
+	} else
+		item->object.graft = 0;
+
 	item->date = parse_commit_date(bufptr, tail);
 
 	return 0;
diff --git a/object.h b/object.h
index 89dd0c4..f848e0f 100644
--- a/object.h
+++ b/object.h
@@ -22,7 +22,7 @@ struct object_array {
 };
 
 #define TYPE_BITS   3
-#define FLAG_BITS  27
+#define FLAG_BITS  26
 
 /*
  * The object type is stored in 3 bits.
@@ -30,6 +30,7 @@ struct object_array {
 struct object {
 	unsigned parsed : 1;
 	unsigned used : 1;
+	unsigned graft : 1;
 	unsigned type : TYPE_BITS;
 	unsigned flags : FLAG_BITS;
 	unsigned char sha1[20];
diff --git a/rev-cache.c b/rev-cache.c
index 6c96297..f7b1cd2 100644
--- a/rev-cache.c
+++ b/rev-cache.c
@@ -664,9 +664,41 @@ static int traverse_cache_slice_1(struct rc_slice_header *head, unsigned char *m
 			}
 		} else if (!ipath_nr && co->date <= date)
 			slop--;
+		else if (!ipath_nr && !upath_nr)
+			break;
 		else
 			slop = SLOP;
 
+		/* before opening further topo-relations, check if the parenting has had medical attention */
+		if (obj->graft) {
+			struct commit_list *list;
+
+			parse_commit(co);
+			obj->flags &= ~FACE_VALUE;
+			last_objects[path] = 0;
+
+			/* we're only interested in its indirect influence */
+			for (list = co->parents; list; list = list->next) {
+				struct rc_index_entry *iep;
+				struct object *po = &list->item->object;
+
+				iep = search_index(po->sha1);
+				if (!iep || hashcmp(idx_caches + 20 * iep->cache_index, head->sha1)) {
+					if (!(obj->flags & UNINTERESTING) && !(po->flags & UNINTERESTING))
+						ioutside = 1;
+				}
+			}
+
+			/* an abrupt end */
+			myworkp = &commit_list_insert(co, myworkp)->next;
+			if (entry->uninteresting)
+				upath_nr--;
+			else
+				ipath_nr--;
+			paths[path] = 0;
+			continue;
+		}
+
 		/* open parents */
 		if (entry->merge_nr) {
 			int j, off = index + sizeof(struct rc_object_entry_ondisk);
diff --git a/t/t6017-rev-cache-list.sh b/t/t6017-rev-cache-list.sh
index f0f3bcf..3e16949 100755
--- a/t/t6017-rev-cache-list.sh
+++ b/t/t6017-rev-cache-list.sh
@@ -92,6 +92,7 @@ git-rev-list --topo-order HEAD --not HEAD~2 >proper_commit_list_limited2
 git-rev-list --topo-order HEAD >proper_commit_list
 git-rev-list --objects HEAD >proper_object_list
 git-rev-list HEAD --max-age=$min_date --min-age=$max_date >proper_list_date_limited
+git-rev-cache test HEAD :HEAD~2 >proper_shallow_list 2>/dev/null
 
 cache_sha1=`git-rev-cache add HEAD 2>output.err`
 
@@ -252,4 +253,9 @@ test_expect_success 'test --ignore-size function in fuse' '
 	test -e .git/rev-cache/$cache_sha1
 '
 
+test_expect_success 'check graft handling' '
+	git-rev-cache test HEAD :HEAD~2 >list
+	test_cmp list proper_shallow_list
+'
+
 test_done
-- 
tg: (52c9694..) t/revcache/graft (depends on: t/revcache/names)

^ permalink raw reply related

* Re: [PATCH 0/3] Generalized "string function" syntax
From: René Scharfe @ 2009-10-19 23:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr5t0nwu8.fsf@alter.siamese.dyndns.org>

Junio C Hamano schrieb:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> 
>> Which other text functions are we going to add which would break this
>> model?  The only thing I can think of right now is nesting such
>> functions themselves, e.g. when indenting a list in an indented
>> sub-paragraph in an indented paragraph.  Useful?
> 
> I was more worried about painting ourselves now in a corner we cannot get
> out of easily later.  Even if my answer to question "what are we going to
> add" may be "nothing I can think of right now", it does not make me happy.

If wrapping wasn't implemented as a nested function, nesting could still
be introduced independently and used for other things -- once these
other things arrive.

> Something off the top of my head are combinations like these.
> 
>     %[toupper()%cD%] => 'SUN, 18 OCT 2009 12:34:56 -0700'
>     %[substr(7,3)%[toupper()%cD%]] => 'OCT'
> 
>     %[sanitize()%s%] === %f (i.e. format-patch filename)
>     %[sanitize()%[substr(0,7)%[toupper()%aN%]%]%s] (with upcased author name)

Interesting examples, I particular like sanitize().

> By the way, I think that date formatting can be helped by introducing a
> strftime() function that takes %ct/%at as input, e.g. %aD would become
> 
>     %[strftime(%a, %d %b %Y %H:%M:%S %z)%at]
> 
> and we do not have to worry about keep adding random %[ac]X formats and
> running out of X.  Right now we use d/D/r/i and there were talks of adding
> a shortened 8601 format without time or something we did not implement.

The number of date formats is scary, but this could be solved e.g. by
introducing "%aT(<date format specifiers etc.>)", without nesting.

> Also, if we had this %[func() any string%] mechanism, we probably wouldn't
> have had to add distinction between n/N and e/E after %a and %c.

Yeah, the place holders multiplied, and some of that growth could have
been avoided by providing ways to change the change the output instead
of providing the processed results.

However, I think that nesting is such a big addition that it warrants
further planning.  It turns the simple "see place holder, fill in value"
interpolator into more of a programming language.  Is that really
needed?  And if yes, do we want to keep all these percent signs around
or is it better to invent a nicer syntax?  Or borrow it from somewhere
else?  Or perhaps I'm just afraid of change and complexity.

Anyway, all of the functions that accept strings need to be able to skip
over escape codes, which includes all of those mentioned above except
perhaps strftime.  This is ugly.  Or one could forbid colour codes in
function arguments.

I'm more in favour of adding ways to customize the shape of the elements
rather than adding string functions.  %S(width=76,indent=4) over
%[wrap(76,4)%s%].

I feel I need to think a bit more about this; currently I'm a bit scared
by %[...%].  But first to catch some sleep..

René

^ permalink raw reply

* Re: [PATCH 0/3] Generalized "string function" syntax
From: Junio C Hamano @ 2009-10-19 23:18 UTC (permalink / raw)
  To: René Scharfe; +Cc: git
In-Reply-To: <4ADCF117.2030905@lsrfire.ath.cx>

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> Junio C Hamano schrieb:
> ...
>> I was more worried about painting ourselves now in a corner we cannot get
>> out of easily later.  Even if my answer to question "what are we going to
>> add" may be "nothing I can think of right now", it does not make me happy.
>
> If wrapping wasn't implemented as a nested function, nesting could still
> be introduced independently and used for other things -- once these
> other things arrive.

True.  I do not think we _need_ nested expansion; obviously we have lived
without it for a long time.

> I'm more in favour of adding ways to customize the shape of the elements
> rather than adding string functions.  %S(width=76,indent=4) over
> %[wrap(76,4)%s%].

Yeah, %X(some modifier) that can apply to any 'X' looks much simpler and
easier to look at.  The way the code is structured currently it might be
more work and risk to break things, though.

^ permalink raw reply

* Rebase fails because of apply.whitespace setting
From: John Feuerstein @ 2009-10-19 23:31 UTC (permalink / raw)
  To: git

I had to rewrite history recently (ugh -- private repo, not shared with
anybody) and wanted to change the commit message of the root commit.
During this I've encountered a failing git-rebase if apply.whitespace is
set to "error".

Should whitespace-errors really be detected when rebasing?
(or worse: be "fixed" without explicitely asking for it when using a
global apply.whitespace=fix setting)
What about making --ignore-whitespace the default for git-rebase?

Simplified example:

$ mkdir test
$ cd test/
$ git config apply.whitespace
error
$ git init
Initialized empty Git repository in /home/john/test/.git/
$ git commit --allow-empty -m 'root commit'
[master (root-commit) a208f9f] root commit
$ echo ' ' > whitespace
$ git add whitespace
$ git commit -m 'add whitespace file'
[master 12b685b] add whitespace file
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 whitespace
$ git tag root a208f9f
$ git checkout -b new-root root
Switched to a new branch 'new-root'
$ git commit --amend --allow-empty -m 'changed root commit'
[new-root b9adb89] changed root commit
$ git checkout -
Switched to branch 'master'
$ git rebase --onto new-root root
First, rewinding head to replay your work on top of it...
Applying: add whitespace file
Patch failed at 0001 add whitespace file

When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To restore the original branch and stop rebasing run "git rebase --abort".

$ git diff
$ git status
$ git apply .git/rebase-apply/0001
.git/rebase-apply/0001:17: trailing whitespace.

fatal: 1 line adds whitespace errors.

^ permalink raw reply

* Re: Rebase fails because of apply.whitespace setting
From: Junio C Hamano @ 2009-10-19 23:53 UTC (permalink / raw)
  To: John Feuerstein; +Cc: git
In-Reply-To: <4ADCF6E9.8090704@feurix.com>

John Feuerstein <john@feurix.com> writes:

> I had to rewrite history recently (ugh -- private repo, not shared with
> anybody) and wanted to change the commit message of the root commit.
> During this I've encountered a failing git-rebase if apply.whitespace is
> set to "error".
>
> Should whitespace-errors really be detected when rebasing?
> (or worse: be "fixed" without explicitely asking for it when using a
> global apply.whitespace=fix setting)

I've seen this argued both ways.  Some people seem to be happy that they
can use rebase with whitespace=fix to clean up their mess.  Some people
quite rightly oppose it (I am slightly closer to the latter camp myself,
but that does not count that much).

I'd agree that whitespace=error won't help anybody here.

Perhaps rebase can be taught to take --whitespace=warn from the command
line to override whatever is in the configuration?

^ permalink raw reply

* Re: What's cooking in git.git (Oct 2009, #03; Mon, 19)
From: Johan Herland @ 2009-10-20  0:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes.Schindelin
In-Reply-To: <7veioz1a8q.fsf@alter.siamese.dyndns.org>

On Monday 19 October 2009, Junio C Hamano wrote:
> Johan Herland <johan@herland.net> writes:
> > On Monday 19 October 2009, Junio C Hamano wrote:
> >> * jh/notes (2009-10-09) 22 commits.
> >>  - fast-import: Proper notes tree manipulation using the notes API
> >>  ...
> >>  - Introduce commit notes
> >>  (this branch uses sr/gfi-options.)
> >
> > Nope. This branch does not use sr/gfi-options. The jh/cvs-helper branch
> > does, though.
> 
> We should rebase jh/notes on top of v1.6.5 then.

We, as in me? or you?

AFAICS, when rebasing from 'next' to v1.6.5, there is only a single (easily 
resolved) merge conflict. If you prefer, I can resend the entire series 
rebased onto v1.6.5.


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply

* Re: [PATCH] git add -e documentation: rephrase note
From: Junio C Hamano @ 2009-10-20  0:39 UTC (permalink / raw)
  To: Wesley J. Landaker; +Cc: Jeff King, Miklos Vajna, git
In-Reply-To: <200910191142.03446.wjl@icecavern.net>

"Wesley J. Landaker" <wjl@icecavern.net> writes:

> On Monday 19 October 2009 01:07:23 Junio C Hamano wrote:
>> That is Ok; the comment was not about stage vs add.
>>
>> > But beyond that, yes, you are right that removing a "+" line may have a
>> > different conceptual meaning to the user depending on the surrounding
>> > text. I wonder if such a "check-list" document really makes much sense,
>> > given that using "-e" at all means you need to understand the patch
>> > format and what makes sense (i.e., anybody who understands 'patch'
>> > knows that you can't just delete context lines and expect it to apply).
>>
>> Yeah, that is really what I wanted people who are in this discussion to
>> eventually realize ;-)
>
> Comment from the peanut gallery:
>
> I still think a quick summary checklist is useful even for a seasoned 
> developer that is intimate with the 'patch' format, as it lets users know 
> what git will do with your patch modifications.

Oh, that is not something I am against at all.  I (mis)took Peff's "what
makes sense" to cover more than what a patch format is, namely, "what the
user by modifying the patch and 'add -e' by accepting a modified patch are
doing."

> For example, when I first tried "add -e", my first thought was: "Awesome, 
> but, I wonder if git will do the right thing if I modify the patch in THIS 
> way ...". Fortunately, git did the right thing, but I wasn't really sure 
> until I tried it.

Actually, my "change +b to +e" example was meant to illustrate that git
is not necessarily doing the right thing.

Admittedly, there is no way to always do the right thing unless git can
read user's mind in this case, and realization of why it is impossible to
do is necessary before using "add -e"; otherwise you would end up getting
utterly confused.  I do not use "add -e" myself.

In one edit (i.e. "remove +c"), the intention is clear that the insertion
is merely kept out of the index and it might even eventually be added in a
later commit.  "c" could be a debugging "printf()" that the user may not
ever want to commit.  In either case, it is clear that the user wants to
keep that change in the work tree version.

In another edit (i.e. "change +b to +e"), the only thing that is clear is
that the user wants a version with "e" in the next commit.

Changing from "+b" to "+e" in "add -e" might have been done, in order to
change "+if (1 || debug)" in the work tree version to "+if (debug)" in the
version to be committed (so that the user can keep debugging without
giving the --debug option to the program, for example).  In this case, it
is quite similar to the earlier "remove +c" ("I know I inserted 'c' in my
work tree version, do not commit that change, but I do want to keep it in
my work tree") case, and keeping the work tree intact is the right thing
to do.

But it might have been done because the user spotted a typo in "b" and
wanted to correct it to "e", in which case the user may wish the change
be reflected to the work tree.

As there is no way to distinguish the last two cases (and I do not think
the code even attempts to tell the difference); half of the time you would
(perhaps mistakenly) think that "add -e" did a wrong thing to your typofix
edit, after adding the updated contents correctly to the index: "I told
git to fix the typo 'b' to 'e', and I committed the corrected version, but
then it discarded my fix---the typo is still there in my work tree".

^ permalink raw reply

* git-svn: add support for merges during 'git svn fetch'
From: Sam Vilain @ 2009-10-20  2:41 UTC (permalink / raw)
  To: git

This series adds support for converting SVN merges - in the two
popular formats, SVK and SVN 1.5+, into git parents.

^ permalink raw reply

* [PATCH 1/5] git-svn: add test data for SVK merge, with script.
From: Sam Vilain @ 2009-10-20  2:41 UTC (permalink / raw)
  To: git; +Cc: Sam Vilain
In-Reply-To: <1256006523-5493-1-git-send-email-sam.vilain@catalyst.net.nz>

Dump generated with SVK 2.0.2 and SVN 1.5.1 (on lenny amd64).

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 t/t9150/make-svk-dump  |   57 +++++
 t/t9150/svk-merge.dump |  616 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 673 insertions(+), 0 deletions(-)
 create mode 100644 t/t9150/make-svk-dump
 create mode 100644 t/t9150/svk-merge.dump

diff --git a/t/t9150/make-svk-dump b/t/t9150/make-svk-dump
new file mode 100644
index 0000000..2242f14
--- /dev/null
+++ b/t/t9150/make-svk-dump
@@ -0,0 +1,57 @@
+#!/bin/sh
+#
+# this script sets up a Subversion repository for Makefile in the
+# first ever git merge, as if it were done with svk.
+#
+
+set -e
+
+svk depotmap foo ~/.svk/foo
+svk co /foo/ foo
+cd foo
+mkdir trunk
+mkdir branches
+svk add trunk branches
+svk commit -m "Setup trunk and branches"
+cd trunk
+
+git cat-file blob 6683463e:Makefile > Makefile
+svk add Makefile 
+
+svk commit -m "ancestor"
+cd ..
+svk cp trunk branches/left
+
+svk commit -m "make left branch"
+cd branches/left/
+
+git cat-file blob 5873b67e:Makefile > Makefile
+svk commit -m "left update 1"
+
+cd ../../trunk
+git cat-file blob 75118b13:Makefile > Makefile
+svk commit -m "trunk update"
+
+cd ../branches/left
+git cat-file blob b5039db6:Makefile > Makefile
+svk commit -m "left update 2"
+
+cd ../../trunk
+svk sm /foo/branches/left
+# in theory we could delete the "left" branch here, but it's not
+# required so don't do it, in case people start getting ideas ;)
+svk commit -m "merge branch 'left' into 'trunk'"
+
+git cat-file blob b51ad431:Makefile > Makefile
+
+svk diff Makefile && echo "Hey!  No differences, magic"
+
+cd ../..
+
+svnadmin dump ~/.svk/foo > svk-merge.dump
+
+svk co -d foo
+rm -rf foo
+svk depotmap -d /foo/
+rm -rf ~/.svk/foo
+
diff --git a/t/t9150/svk-merge.dump b/t/t9150/svk-merge.dump
new file mode 100644
index 0000000..42f70db
--- /dev/null
+++ b/t/t9150/svk-merge.dump
@@ -0,0 +1,616 @@
+SVN-fs-dump-format-version: 2
+
+UUID: b48289b2-9c08-4d72-af37-0358a40b9c15
+
+Revision-number: 0
+Prop-content-length: 56
+Content-length: 56
+
+K 8
+svn:date
+V 27
+2009-10-19T23:44:03.722969Z
+PROPS-END
+
+Revision-number: 1
+Prop-content-length: 123
+Content-length: 123
+
+K 7
+svn:log
+V 24
+Setup trunk and branches
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:04.927533Z
+PROPS-END
+
+Node-path: branches
+Node-kind: dir
+Node-action: add
+Prop-content-length: 10
+Content-length: 10
+
+PROPS-END
+
+
+Node-path: trunk
+Node-kind: dir
+Node-action: add
+Prop-content-length: 10
+Content-length: 10
+
+PROPS-END
+
+
+Revision-number: 2
+Prop-content-length: 106
+Content-length: 106
+
+K 7
+svn:log
+V 8
+ancestor
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:05.835585Z
+PROPS-END
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: add
+Prop-content-length: 10
+Text-content-length: 2401
+Text-content-md5: bfd8ff778d1492dc6758567373176a89
+Content-length: 2411
+
+PROPS-END
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 3
+Prop-content-length: 115
+Content-length: 115
+
+K 7
+svn:log
+V 16
+make left branch
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:06.719737Z
+PROPS-END
+
+Node-path: branches/left
+Node-kind: dir
+Node-action: add
+Node-copyfrom-rev: 2
+Node-copyfrom-path: trunk
+
+
+Revision-number: 4
+Prop-content-length: 112
+Content-length: 112
+
+K 7
+svn:log
+V 13
+left update 1
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:07.167666Z
+PROPS-END
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2465
+Text-content-md5: 16e38d9753b061731650561ce01b1195
+Content-length: 2465
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 5
+Prop-content-length: 111
+Content-length: 111
+
+K 7
+svn:log
+V 12
+trunk update
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:07.619633Z
+PROPS-END
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2521
+Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
+Content-length: 2521
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base merge-cache
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+merge-cache: merge-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-cache merge-cache.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 6
+Prop-content-length: 112
+Content-length: 112
+
+K 7
+svn:log
+V 13
+left update 2
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:08.067554Z
+PROPS-END
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2593
+Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
+Content-length: 2593
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 7
+Prop-content-length: 131
+Content-length: 131
+
+K 7
+svn:log
+V 32
+merge branch 'left' into 'trunk'
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-19T23:44:08.971801Z
+PROPS-END
+
+Node-path: trunk
+Node-kind: dir
+Node-action: change
+Prop-content-length: 83
+Content-length: 83
+
+K 9
+svk:merge
+V 53
+b48289b2-9c08-4d72-af37-0358a40b9c15:/branches/left:6
+PROPS-END
+
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2713
+Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
+Content-length: 2713
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base merge-cache
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+merge-cache: merge-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-cache merge-cache.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
-- 
1.6.3.3

^ permalink raw reply related

* [PATCH 2/5] git-svn: allow test setup script to support PERL env. var
From: Sam Vilain @ 2009-10-20  2:42 UTC (permalink / raw)
  To: git; +Cc: Sam Vilain
In-Reply-To: <1256006523-5493-2-git-send-email-sam.vilain@catalyst.net.nz>

Possibly the 'perl' in the PATH is not the one to be used for the tests;
let PERL set in the environment select it.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
  A little unrelated nit

 t/lib-git-svn.sh |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/lib-git-svn.sh b/t/lib-git-svn.sh
index fd8631f..0f7f35c 100644
--- a/t/lib-git-svn.sh
+++ b/t/lib-git-svn.sh
@@ -16,6 +16,7 @@ fi
 GIT_DIR=$PWD/.git
 GIT_SVN_DIR=$GIT_DIR/svn/refs/remotes/git-svn
 SVN_TREE=$GIT_SVN_DIR/svn-tree
+PERL=${PERL:-perl}
 
 svn >/dev/null 2>&1
 if test $? -ne 1
@@ -29,7 +30,7 @@ export svnrepo
 svnconf=$PWD/svnconf
 export svnconf
 
-perl -w -e "
+$PERL -w -e "
 use SVN::Core;
 use SVN::Repos;
 \$SVN::Core::VERSION gt '1.1.0' or exit(42);
@@ -130,7 +131,7 @@ stop_httpd () {
 }
 
 convert_to_rev_db () {
-	perl -w -- - "$@" <<\EOF
+	$PERL -w -- - "$@" <<\EOF
 use strict;
 @ARGV == 2 or die "Usage: convert_to_rev_db <input> <output>";
 open my $wr, '+>', $ARGV[1] or die "$!: couldn't open: $ARGV[1]";
-- 
1.6.3.3

^ permalink raw reply related

* [PATCH 3/5] git-svn: convert SVK merge tickets to extra parents
From: Sam Vilain @ 2009-10-20  2:42 UTC (permalink / raw)
  To: git; +Cc: Sam Vilain
In-Reply-To: <1256006523-5493-3-git-send-email-sam.vilain@catalyst.net.nz>

SVK is a simple case to start with, as its idea of merge parents
matches git's one.  When a svk:merge ticket is encountered, check each
of the listed merged revisions to see if they are in the history of
this commit; if not, then we have encountered a merge - record it.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
   I'm not sure if 'other_gs' is the right method to call here;
   ideally we should lookup a revmap by UUID and path, that would
   in principle allow people to import from a bunch of SVK depots all
   being tracked.  Perhaps however this is sufficient.

 git-svn.perl                |   50 ++++++++++++++++++++++++++++++++++++++++++-
 t/t9150-svk-mergetickets.sh |   23 +++++++++++++++++++
 2 files changed, 72 insertions(+), 1 deletions(-)
 create mode 100644 t/t9150-svk-mergetickets.sh

diff --git a/git-svn.perl b/git-svn.perl
index eb4b75a..3c2534c 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2878,14 +2878,62 @@ sub check_author {
 	$author;
 }
 
+sub find_extra_svk_parents {
+	my ($self, $ed, $tickets, $parents) = @_;
+	# aha!  svk:merge property changed...
+	my @tickets = split "\n", $tickets;
+	my @known_parents;
+	for my $ticket ( @tickets ) {
+		my ($uuid, $path, $rev) = split /:/, $ticket;
+		if ( $uuid eq $self->ra_uuid ) {
+			my $url = $self->rewrite_root || $self->{url};
+			my $repos_root = $url;
+			my $branch_from = $path;
+			$branch_from =~ s{^/}{};
+			my $gs = $self->other_gs($repos_root."/".$branch_from, $url,
+						 $branch_from, $rev, $self->{ref_id});
+			#my $gs = other_gs ( $url, $repos_root, $branch_from,
+				#$self->{ref_id} );
+			if ( my $commit = $gs->rev_map_get($rev, $uuid) ) {
+				# wahey!  we found it, but it might be
+				# an old one (!)
+				push @known_parents, $commit;
+			}
+		}
+	}
+	for my $parent ( @known_parents ) {
+		my @cmd = ('rev-list', $parent, map { "^$_" } @$parents );
+		my ($msg_fh, $ctx) = command_output_pipe(@cmd);
+		my $new;
+		while ( <$msg_fh> ) {
+			$new=1;last;
+		}
+		command_close_pipe($msg_fh, $ctx);
+		if ( $new ) {
+			print STDERR "Found merge parent (svk:merge ticket): $parent\n";
+			push @$parents, $parent;
+		}
+	}
+}
+
 sub make_log_entry {
 	my ($self, $rev, $parents, $ed) = @_;
 	my $untracked = $self->get_untracked($ed);
 
+	my @parents = @$parents;
+	my $ps = $ed->{path_strip} || "";
+	for my $path ( grep { m/$ps/ } %{$ed->{dir_prop}} ) {
+		my $props = $ed->{dir_prop}{$path};
+		if ( $props->{"svk:merge"} ) {
+			$self->find_extra_svk_parents
+				($ed, $props->{"svk:merge"}, \@parents);
+		}
+	}
+
 	open my $un, '>>', "$self->{dir}/unhandled.log" or croak $!;
 	print $un "r$rev\n" or croak $!;
 	print $un $_, "\n" foreach @$untracked;
-	my %log_entry = ( parents => $parents || [], revision => $rev,
+	my %log_entry = ( parents => \@parents, revision => $rev,
 	                  log => '');
 
 	my $headrev;
diff --git a/t/t9150-svk-mergetickets.sh b/t/t9150-svk-mergetickets.sh
new file mode 100644
index 0000000..8000c34
--- /dev/null
+++ b/t/t9150-svk-mergetickets.sh
@@ -0,0 +1,23 @@
+#!/bin/sh
+#
+# Copyright (c) 2007 Sam Vilain
+#
+
+test_description='git-svn svk merge tickets'
+
+. ./lib-git-svn.sh
+
+test_expect_success 'load svk depot' "
+	svnadmin load -q '$rawsvnrepo' < '../t9150/svk-merge.dump' &&
+	git svn init --minimize-url -R svkmerge \
+	  -T trunk -b branches '$svnrepo' &&
+	git svn fetch --all
+	"
+
+uuid=b48289b2-9c08-4d72-af37-0358a40b9c15
+
+test_expect_success 'svk merges were represented coming in' "
+	[ `git-cat-file commit HEAD | grep parent | wc -l` -eq 2 ]
+	"
+
+test_done
-- 
1.6.3.3

^ permalink raw reply related

* [PATCH 5/5] git-svn: convert SVN 1.5+ / svnmerge.py svn:mergeinfo props to parents
From: Sam Vilain @ 2009-10-20  2:42 UTC (permalink / raw)
  To: git; +Cc: Sam Vilain
In-Reply-To: <1256006523-5493-5-git-send-email-sam.vilain@catalyst.net.nz>

This feature is long overdue; convert SVN's merge representation to git's
as revisions are imported.  This works by converting the list of revisions
in each line of the svn:mergeinfo into git revision ranges, and then
checking the latest of each of these revision ranges for A) being new and
B) now being completely merged.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
   There's much more implemented here than is tested...
 git-svn.perl             |   93 ++++++++++++++++++++++++++++++++++++++++++++++
 t/t9151-svn-mergeinfo.sh |   21 ++++++++++
 2 files changed, 114 insertions(+), 0 deletions(-)
 create mode 100644 t/t9151-svn-mergeinfo.sh

diff --git a/git-svn.perl b/git-svn.perl
index 3c2534c..244e40d 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2916,6 +2916,93 @@ sub find_extra_svk_parents {
 	}
 }
 
+# note: this function should only be called if the various dirprops
+# have actually changed
+sub find_extra_svn_parents {
+	my ($self, $ed, $mergeinfo, $parents) = @_;
+	# aha!  svk:merge property changed...
+
+	# We first search for merged tips which are not in our
+	# history.  Then, we figure out which git revisions are in
+	# that tip, but not this revision.  If all of those revisions
+	# are now marked as merge, we can add the tip as a parent.
+	my @merges = split "\n", $mergeinfo;
+	my @merge_tips;
+	my @merged_commit_ranges;
+	my $url = $self->rewrite_root || $self->{url};
+	for my $merge ( @merges ) {
+		my ($source, $revs) = split ":", $merge;
+		my $path = $source;
+		$path =~ s{^/}{};
+		my $gs = Git::SVN->find_by_url($url.$source, $url, $path);
+		if ( !$gs ) {
+			warn "Couldn't find revmap for $url$source\n";
+			next;
+		}
+		my @ranges = split ",", $revs;
+		my ($tip, $tip_commit);
+		# find the tip
+		for my $range ( @ranges ) {
+			my ($bottom, $top) = split "-", $range;
+			$top ||= $bottom;
+			my $bottom_commit =
+				$gs->rev_map_get($bottom, $self->ra_uuid) ||
+				$gs->rev_map_get($bottom+1, $self->ra_uuid);
+			my $top_commit =
+				$gs->rev_map_get($top, $self->ra_uuid);
+
+			unless ($top_commit and $bottom_commit) {
+				warn "W:unknown path/rev in svn:mergeinfo "
+					."dirprop: $source:$range\n";
+				next;
+			}
+
+			push @merged_commit_ranges,
+				"$bottom_commit..$top_commit";
+
+			if ( !defined $tip or $top > $tip ) {
+				$tip = $top;
+				$tip_commit = $top_commit;
+			}
+		}
+		unless (!$tip_commit or
+				grep { $_ eq $tip_commit } @$parents ) {
+			push @merge_tips, $tip_commit;
+		}
+		else {
+			push @merge_tips, undef;
+		}
+	}
+	for my $merge_tip ( @merge_tips ) {
+		my $spec = shift @merges;
+		next unless $merge_tip;
+		my @cmd = ('rev-list', "-1", $merge_tip,
+			   "--not", @$parents );
+		my ($msg_fh, $ctx) = command_output_pipe(@cmd);
+		my $new;
+		while ( <$msg_fh> ) {
+			$new=1;last;
+		}
+		command_close_pipe($msg_fh, $ctx);
+		if ( $new ) {
+			push @cmd, @merged_commit_ranges;
+			my ($msg_fh, $ctx) = command_output_pipe(@cmd);
+			my $unmerged;
+			while ( <$msg_fh> ) {
+				$unmerged=1;last;
+			}
+			command_close_pipe($msg_fh, $ctx);
+			if ( $unmerged ) {
+				warn "W:svn cherry-pick ignored ($spec)\n";
+			}
+			else {
+				warn "Found merge parent (svn:mergeinfo prop): $merge_tip\n";
+				push @$parents, $merge_tip;
+			}
+		}
+	}
+}
+
 sub make_log_entry {
 	my ($self, $rev, $parents, $ed) = @_;
 	my $untracked = $self->get_untracked($ed);
@@ -2928,6 +3015,12 @@ sub make_log_entry {
 			$self->find_extra_svk_parents
 				($ed, $props->{"svk:merge"}, \@parents);
 		}
+		if ( $props->{"svn:mergeinfo"} ) {
+			$self->find_extra_svn_parents
+				($ed,
+				 $props->{"svn:mergeinfo"},
+				 \@parents);
+		}
 	}
 
 	open my $un, '>>', "$self->{dir}/unhandled.log" or croak $!;
diff --git a/t/t9151-svn-mergeinfo.sh b/t/t9151-svn-mergeinfo.sh
new file mode 100644
index 0000000..7eb36e5
--- /dev/null
+++ b/t/t9151-svn-mergeinfo.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+#
+# Copyright (c) 2007, 2009 Sam Vilain
+#
+
+test_description='git-svn svn mergeinfo properties'
+
+. ./lib-git-svn.sh
+
+test_expect_success 'load svn dump' "
+	svnadmin load -q '$rawsvnrepo' < '../t9151/svn-mergeinfo.dump' &&
+	git svn init --minimize-url -R svnmerge \
+	  -T trunk -b branches '$svnrepo' &&
+	git svn fetch --all
+	"
+
+test_expect_success 'svn merges were represented coming in' "
+	[ `git cat-file commit HEAD | grep parent | wc -l` -eq 2 ]
+	"
+
+test_done
-- 
1.6.3.3

^ permalink raw reply related

* [PATCH 4/5] git-svn: add test data for SVN 1.5+ merge, with script.
From: Sam Vilain @ 2009-10-20  2:42 UTC (permalink / raw)
  To: git; +Cc: Sam Vilain
In-Reply-To: <1256006523-5493-4-git-send-email-sam.vilain@catalyst.net.nz>

Dump generated with SVN 1.5.1 (on lenny amd64).  This test should hopefully
cover all but a few intermediate versions of the svnmerge.py script.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 t/t9151/.gitignore         |    2 +
 t/t9151/make-svnmerge-dump |   73 +++++
 t/t9151/svn-mergeinfo.dump |  736 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 811 insertions(+), 0 deletions(-)
 create mode 100644 t/t9151/.gitignore
 create mode 100644 t/t9151/make-svnmerge-dump
 create mode 100644 t/t9151/svn-mergeinfo.dump

diff --git a/t/t9151/.gitignore b/t/t9151/.gitignore
new file mode 100644
index 0000000..587c37d
--- /dev/null
+++ b/t/t9151/.gitignore
@@ -0,0 +1,2 @@
+foo
+foo.svn
diff --git a/t/t9151/make-svnmerge-dump b/t/t9151/make-svnmerge-dump
new file mode 100644
index 0000000..e35d64d
--- /dev/null
+++ b/t/t9151/make-svnmerge-dump
@@ -0,0 +1,73 @@
+#!/bin/sh
+#
+# this script sets up a Subversion repository for Makefile in the
+# first ever git merge, as if it were done with svnmerge (SVN 1.5+)
+#
+
+rm -rf foo.svn foo
+set -e
+
+mkdir foo.svn
+svnadmin create foo.svn
+svn co file://`pwd`/foo.svn foo
+
+cd foo
+mkdir trunk
+mkdir branches
+svn add trunk branches
+svn commit -m "Setup trunk and branches"
+cd trunk
+
+git cat-file blob 6683463e:Makefile > Makefile
+svn add Makefile 
+
+echo "Committing ANCESTOR"
+svn commit -m "ancestor"
+cd ..
+svn cp trunk branches/left
+
+echo "Committing BRANCH POINT"
+svn commit -m "make left branch"
+cd branches/left/
+
+#$sm init
+#svn commit -m "init svnmerge"
+
+git cat-file blob 5873b67e:Makefile > Makefile
+echo "Committing BRANCH UPDATE 1"
+svn commit -m "left update 1"
+cd ../..
+
+cd trunk
+git cat-file blob 75118b13:Makefile > Makefile
+echo "Committing TRUNK UPDATE"
+svn commit -m "trunk update"
+
+cd ../branches/left
+git cat-file blob ff5ebe39:Makefile > Makefile
+echo "Committing BRANCH UPDATE 2"
+svn commit -m "left update 2"
+
+git cat-file blob b5039db6:Makefile > Makefile
+echo "Committing BRANCH UPDATE 3"
+svn commit -m "left update 3"
+
+# merge to trunk
+
+cd ../..
+svn update
+cd trunk
+
+svn merge ../branches/left --accept postpone
+
+git cat-file blob b51ad431:Makefile > Makefile
+
+svn resolved Makefile
+
+svn commit -m "Merge trunk"
+
+cd ../..
+
+svnadmin dump foo.svn > svn-mergeinfo.dump
+
+rm -rf foo foo.svn
diff --git a/t/t9151/svn-mergeinfo.dump b/t/t9151/svn-mergeinfo.dump
new file mode 100644
index 0000000..2153187
--- /dev/null
+++ b/t/t9151/svn-mergeinfo.dump
@@ -0,0 +1,736 @@
+SVN-fs-dump-format-version: 2
+
+UUID: 1ce241d1-ba54-4eb9-bded-03057fe48a33
+
+Revision-number: 0
+Prop-content-length: 56
+Content-length: 56
+
+K 8
+svn:date
+V 27
+2009-10-20T01:33:37.692723Z
+PROPS-END
+
+Revision-number: 1
+Prop-content-length: 123
+Content-length: 123
+
+K 7
+svn:log
+V 24
+Setup trunk and branches
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:38.159933Z
+PROPS-END
+
+Node-path: branches
+Node-kind: dir
+Node-action: add
+Prop-content-length: 10
+Content-length: 10
+
+PROPS-END
+
+
+Node-path: trunk
+Node-kind: dir
+Node-action: add
+Prop-content-length: 10
+Content-length: 10
+
+PROPS-END
+
+
+Revision-number: 2
+Prop-content-length: 106
+Content-length: 106
+
+K 7
+svn:log
+V 8
+ancestor
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:39.160059Z
+PROPS-END
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: add
+Prop-content-length: 10
+Text-content-length: 2401
+Text-content-md5: bfd8ff778d1492dc6758567373176a89
+Content-length: 2411
+
+PROPS-END
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 3
+Prop-content-length: 115
+Content-length: 115
+
+K 7
+svn:log
+V 16
+make left branch
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:41.148192Z
+PROPS-END
+
+Node-path: branches/left
+Node-kind: dir
+Node-action: add
+Node-copyfrom-rev: 1
+Node-copyfrom-path: trunk
+Prop-content-length: 34
+Content-length: 34
+
+K 13
+svn:mergeinfo
+V 0
+
+PROPS-END
+
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: add
+Node-copyfrom-rev: 2
+Node-copyfrom-path: trunk/Makefile
+Text-copy-source-md5: bfd8ff778d1492dc6758567373176a89
+
+
+Revision-number: 4
+Prop-content-length: 112
+Content-length: 112
+
+K 7
+svn:log
+V 13
+left update 1
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:42.148773Z
+PROPS-END
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2465
+Text-content-md5: 16e38d9753b061731650561ce01b1195
+Content-length: 2465
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 5
+Prop-content-length: 111
+Content-length: 111
+
+K 7
+svn:log
+V 12
+trunk update
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:43.159959Z
+PROPS-END
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2521
+Text-content-md5: 0668418a621333f4aa8b6632cd63e2a0
+Content-length: 2521
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base merge-cache
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+merge-cache: merge-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-cache merge-cache.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 6
+Prop-content-length: 112
+Content-length: 112
+
+K 7
+svn:log
+V 13
+left update 2
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:44.164175Z
+PROPS-END
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2529
+Text-content-md5: f6b197cc3f2e89a83e545d4bb003de73
+Content-length: 2529
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 7
+Prop-content-length: 112
+Content-length: 112
+
+K 7
+svn:log
+V 13
+left update 3
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:45.144214Z
+PROPS-END
+
+Node-path: branches/left/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2593
+Text-content-md5: 5ccff689fb290e00b85fe18ee50c54ba
+Content-length: 2593
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
+Revision-number: 8
+Prop-content-length: 110
+Content-length: 110
+
+K 7
+svn:log
+V 11
+Merge trunk
+K 10
+svn:author
+V 4
+samv
+K 8
+svn:date
+V 27
+2009-10-20T01:33:48.176135Z
+PROPS-END
+
+Node-path: trunk
+Node-kind: dir
+Node-action: change
+Prop-content-length: 53
+Content-length: 53
+
+K 13
+svn:mergeinfo
+V 18
+/branches/left:2-7
+PROPS-END
+
+
+Node-path: trunk/Makefile
+Node-kind: file
+Node-action: change
+Text-content-length: 2713
+Text-content-md5: 0afbe34f244cd662b1f97d708c687f90
+Content-length: 2713
+
+# -DCOLLISION_CHECK if you believe that SHA1's
+# 1461501637330902918203684832716283019655932542976 hashes do not give you
+# enough guarantees about no collisions between objects ever hapenning.
+#
+# -DNSEC if you want git to care about sub-second file mtimes and ctimes.
+# Note that you need some new glibc (at least >2.2.4) for this, and it will
+# BREAK YOUR LOCAL DIFFS! show-diff and anything using it will likely randomly
+# break unless your underlying filesystem supports those sub-second times
+# (my ext3 doesn't).
+CFLAGS=-g -O3 -Wall
+
+CC=gcc
+
+
+PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
+	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
+	check-files ls-tree merge-base merge-cache
+
+all: $(PROG)
+
+install: $(PROG)
+	install $(PROG) $(HOME)/bin/
+
+LIBS= -lssl -lz
+
+init-db: init-db.o
+
+update-cache: update-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o update-cache update-cache.o read-cache.o $(LIBS)
+
+show-diff: show-diff.o read-cache.o
+	$(CC) $(CFLAGS) -o show-diff show-diff.o read-cache.o $(LIBS)
+
+write-tree: write-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o write-tree write-tree.o read-cache.o $(LIBS)
+
+read-tree: read-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o read-tree read-tree.o read-cache.o $(LIBS)
+
+commit-tree: commit-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o commit-tree commit-tree.o read-cache.o $(LIBS)
+
+cat-file: cat-file.o read-cache.o
+	$(CC) $(CFLAGS) -o cat-file cat-file.o read-cache.o $(LIBS)
+
+fsck-cache: fsck-cache.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o fsck-cache fsck-cache.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+checkout-cache: checkout-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o checkout-cache checkout-cache.o read-cache.o $(LIBS)
+
+diff-tree: diff-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o diff-tree diff-tree.o read-cache.o $(LIBS)
+
+rev-tree: rev-tree.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o rev-tree rev-tree.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+show-files: show-files.o read-cache.o
+	$(CC) $(CFLAGS) -o show-files show-files.o read-cache.o $(LIBS)
+
+check-files: check-files.o read-cache.o
+	$(CC) $(CFLAGS) -o check-files check-files.o read-cache.o $(LIBS)
+
+ls-tree: ls-tree.o read-cache.o
+	$(CC) $(CFLAGS) -o ls-tree ls-tree.o read-cache.o $(LIBS)
+
+merge-base: merge-base.o read-cache.o object.o commit.o tree.o blob.o
+	$(CC) $(CFLAGS) -o merge-base merge-base.o read-cache.o object.o commit.o tree.o blob.o $(LIBS)
+
+merge-cache: merge-cache.o read-cache.o
+	$(CC) $(CFLAGS) -o merge-cache merge-cache.o read-cache.o $(LIBS)
+
+read-cache.o: cache.h
+show-diff.o: cache.h
+
+clean:
+	rm -f *.o $(PROG)
+
+backup: clean
+	cd .. ; tar czvf dircache.tar.gz dir-cache
+
+
-- 
1.6.3.3

^ permalink raw reply related

* Re: [PATCH] am: allow some defaults to be specified via git-config
From: Sam Vilain @ 2009-10-20  2:44 UTC (permalink / raw)
  To: Wesley J. Landaker; +Cc: git, Nigel McNie
In-Reply-To: <200910191149.13698.wjl@icecavern.net>

Wesley J. Landaker wrote:
> On Thursday 15 October 2009 17:50:27 Sam Vilain wrote:
>> +am.*::
>> +	Specify defaults for linkgit:git-am[1].  Currently, the three
>> +	boolean options, 'sign', 'utf8' and 'keep' may be specified.
>> +
> 
> The 'git am' option is 'signoff', not 'sign'. Shouldn't the command option 
> and config option names match?

Thanks for pointing that out.  Yes, it should be.
-- 
Sam Vilain, Perl Hacker, Catalyst IT (NZ) Ltd.
phone: +64 4 499 2267        PGP ID: 0x66B25843

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox