Git development

* Re: Errors GITtifying GCC and Binutils
From: Eric Wong @ 2006-03-25  8:25 UTC
  To: Jan-Benedict Glaw; +Cc: git
In-Reply-To: <20060324182504.GI31387@lug-owl.de>

Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> On Wed, 2006-03-22 14:33:37 +0100, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> 
> Since it seems nobody looked at the GCC import run (which means to use
> the svnimport), I ran it again, under strace control:

If you don't need automated branch handling, how about trying git-svn?
It lives under the contrib/ directory in git.git:

	git-svn init svn://gcc.gnu.org/svn/gcc
	git-svn fetch

> > GCC
> > ~~~
> > $ /home/jbglaw/bin/git svnimport -C gcc -v svn://gcc.gnu.org/svn/gcc
> 
> > Committed change 3936:/ 1993-03-31 05:44:03)
> > Commit ID ceff85145f8671fb2a9d826a761cedc2a507bd1e
> > Writing to refs/heads/origin
> > DONE: 3936 origin ceff85145f8671fb2a9d826a761cedc2a507bd1e
> > ... 3937 trunk/gcc/final.c ...
> > Can't fork at /home/jbglaw/bin/git-svnimport line 379.
> 
> ... 4279 trunk/gcc/config/i386/xm-sco.h ...
> 
> This time it broke at a different revision, so I guess it's not a SVN
> or git / git-svnimport problem, but rather a problem of my Perl
> installation or the kernel itself?

I've known SVN library bindings to leak memory in the past, but I
thought that had been solved.  Afaik, any memory allocated by the Perl
interpreter is never released back to the kernel, either.  (At least
that seems to be the case with my setup: Debian unstable, Perl 5.8.8,
2.6 kernel, x86 machine.)

> What are possible reasons for clone() to fail with -ENOMEM? I have to
> admit that the box _is_ loaded a bit all the time:
> 
> jbglaw@bixie:~/vax/git-conversion$ uptime
>  19:23:58 up 136 days,  7:46, 20 users,  load average: 4.45, 4.25, 3.05
> jbglaw@bixie:~/vax/git-conversion$ free
>              total       used       free     shared    buffers     cached
> Mem:        507308     501760       5548          0       2184      16900
> -/+ buffers/cache:     482676      24632
> Swap:      2441872    1295512    1146360

Some importers (my own git-svn included) aren't amazingly efficient when
handling lots of history, which gcc has.  From what I understand of the
SVN API used in git-svnimport, it looks like the entire log for the
100k+ revisions in the tree is slurped into memory before any
processing is done.

git-svn does this too, but by parsing the output of the svn binary
instead of using the library, so at least it doesn't have the svn
bindings and libraries to worry about.

My git-svn process running on the SVN tree just finished parsing the svn
log output, and it's maxed out at 74M RSS (on a 32-bit x86).  It'll
probably take a while to import it all (which I won't do), but I could
have just as easily done the following to reduce memory usage by ~half:

	git-svn fetch -r0:50000		# import the first 50000 revisions
	git-svn fetch			# now import the rest

Afaik, there's no way to do something like the above with git-svnimport
for memory-starved setups.

-- 
Eric Wong

* Re: Fix branch ancestry calculation
From: Keith Packard @ 2006-03-25  7:54 UTC
  To: Chris Shoemaker
  Cc: keithp, Linus Torvalds, David Mansfield, David Mansfield,
	Git Mailing List
In-Reply-To: <20060325014532.GB32522@pe.Belkin>

On Fri, 2006-03-24 at 20:45 -0500, Chris Shoemaker wrote:

> If that last sentence was a typo then you already know this, but
> otherwise you may be disappointed to learn that it's not _always_
> possible to discern the correct ancestry tree.

Sure, it's possible to generate trees which can't be figured out. So
far, I haven't found any which can't be pieced back together, except in
cases where the tree was accidentally damaged (child branches created on
two separate parent branches)

> If you end up comparing the ancestry tree discovered by your tool and
> the tree output by a patched cvsps, I would be very interested in the
> results.

So far, I've found several concrete trees where cvsps (in any form)
assigns branch points many versions too early compared to the 'true'
history. My tool is getting better answers, but still can't compute the
tree for the X.org X server tree yet. That one has a wide variety of
damage, including the direct copying of ,v files between repositories
which had diverged, and the accidental branching of files from different
parent branches. I keep poking at it...

> -chris
> 
> (*) You can distinguish between A->B->head and B->A->head simply by
> date.

I'm doing a lot more date-based identification than I'm really
comfortable with; the bad thing here is that branch points can occur
long before any commits to that branch.  When doing date-based
operations, you have a range of possible matching branch points and it's
hard to disambiguate.

-- 
keith.packard@intel.com

* Re: Use a *real* built-in diff generator
From: Junio C Hamano @ 2006-03-25  7:39 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <7vk6ajxbe5.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> This uses a simplified libxdiff setup to generate unified diffs _without_ 
>> doing  fork/execve of GNU "diff".
>
> Good stuff.

The reason I like this is because I was thinking about doing
in-core diffs for a different purpose when I was driving to work
this morning [*1*] --- to make pickaxe a more useful building
block.

Currently, pickaxe tries to do an exact match to find the case
where a given substring S appears in the version C of the file
but not in its parent C^n (1 <= n), and then it tells the
diffcore to emit the differences.  The user (probably only me on
this list?) is expected to look at the change, make an
intelligent decision to feed a matching substring S' found in
C^n, and restart from that commit.
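
The exact-match test described above can be sketched in a few lines of C.
This is only an illustration of the "present in the child, absent from the
parent" condition, not the actual diffcore code; the names contains() and
pickaxe_hit() are made up for this example, and the blobs are assumed to be
already loaded into memory.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if needle occurs anywhere in buf[0..len), else 0. */
static int contains(const char *buf, size_t len, const char *needle)
{
	size_t nlen = strlen(needle);
	size_t i;

	if (nlen == 0)
		return 1;
	if (nlen > len)
		return 0;
	for (i = 0; i + nlen <= len; i++)
		if (!memcmp(buf + i, needle, nlen))
			return 1;
	return 0;
}

/*
 * Hypothetical helper: a commit is "interesting" to a pickaxe-style
 * search when the string is present in the child blob but absent
 * from the parent blob.
 */
int pickaxe_hit(const char *parent, size_t parent_len,
		const char *child, size_t child_len,
		const char *needle)
{
	return contains(child, child_len, needle) &&
	       !contains(parent, parent_len, needle);
}
```

A history walk would call pickaxe_hit() for each (parent, child) pair and
show the first commit for which it returns 1.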

To be a useful "content movement tracker", the process of
finding the matching 'old shape' in the previous version and
re-feeding it to pickaxe should be automated if possible, and
in-core diff machinery would be one component to help that.

For example, if I wanted to find when I stole 'ls-files -t'
feature from Cogito, I would first run less ls-files.c; I see
these and am reasonably sure these relate to what I am looking
for:

	...
        static const char *tag_cached = "";
        static const char *tag_unmerged = "";
        static const char *tag_removed = "";
        static const char *tag_other = "";
        static const char *tag_killed = "";
        static const char *tag_modified = "";
	...

So I run:

	$ git whatchanged -S'static const char *tag_other = "";
        static const char *tag_killed = "";
	static const char *tag_modified' -p master -- ls-files.c

which finds:

        Author: Junio C Hamano <junkio@cox.net>
        Date:   Mon Sep 19 15:11:15 2005 -0700

            Show modified files in git-ls-files
	...
        @@ -28,6 +29,7 @@ static const char *tag_unmerged = "";
         static const char *tag_removed = "";
         static const char *tag_other = "";
         static const char *tag_killed = "";
        +static const char *tag_modified = "";

but that is not what I am interested in; the matching "old
shape" is the version before the tag_modified was added (and it
already had other tag_xxx in there).  So with the current
pickaxe, I manually re-run whatchanged starting from the found
commit with modified string like this:

	$ git whatchanged -S'static const char *tag_removed = "";
        static const char *tag_other = "";
        static const char *tag_killed = "";' -p $that_commit -- ls-files.c

in order to further drill down.

A truly useful pickaxe should take two line numbers and a
filename (to name the range of lines I am interested in) from
the starting version, notice when that range changes shape, and
after showing the found commit, replace the range with the one
matching from the older commit and continue.

[Footnote]

*1* When you are bogged down in a boring day-job, your brain
tends to try to compensate by spending as much of your waking time
as possible on thinking about more interesting and more useful
stuff -- like git ;-).

* Re: Effective difference between git-rebase and git-resolve
From: Junio C Hamano @ 2006-03-25  7:15 UTC
  To: Marc Singer; +Cc: git
In-Reply-To: <20060325063225.GA13791@buici.com>

Marc Singer <elf@buici.com> writes:

[I'm shuffling this part to the top]

> Moreover, it isn't clear to me if git-rebase is better than git-merge.

Merge preserves commit ancestry, so if you are hoping it will
clean up your history, that is not the tool to do it.  Both
rebase and cherry-pick are there to help you create a cleaner,
alternate history.  So neither is better than the other.  They
serve different purposes.

> On Fri, Mar 24, 2006 at 10:08:09PM -0800, Junio C Hamano wrote:
>> > Junio, is there some magic to restart a rebase after you've fixed up the 
>> > conflicts?
>> 
>> The modern rebase is essentially git-format-patch piped to
>> git-am (with -3 flag to allow falling back to three-way merge),
>> and all the familiar "the patch did not apply -- what now?"
>> techniques can be employed.
>...
> By modern do you mean newer than 1.2.4?  I comprehend what you're
> layin' down here, but I don't know if I need to do something
> different.

By modern, I meant v0.99.9-g7f59dbb.

diff-tree 7f59dbb... (from f9039f3...)
Author: Junio C Hamano <junkio@cox.net>
Date:   Mon Nov 14 00:41:53 2005 -0800

    Rewrite rebase to use git-format-patch piped to git-am.
    
    The current rebase implementation finds commits in our tree but
    not in the upstream tree using git-cherry, and tries to apply
    them using git-cherry-pick (i.e. always use 3-way) one by one.
    
    Which is fine, but when some of the changes do not apply
    cleanly, it punts, and punts badly.
    
    Suppose you have commits A-B-C-D-E since you forked from the
    upstream and submitted the changes for inclusion.  You fetch
    from upstream head U and find that B has been picked up.  You
    run git-rebase to update your branch, which tries to apply
    changes contained in A-C-D-E, in this order, but replaying of C
    fails, because the upstream got changes that touch the same area
    from elsewhere.
    
    Now what?
    
    It notes that fact, and goes ahead to apply D and E, and at the
    very end tells you to deal with C by hand.  Even if you somehow
    managed to replay C on top of the result, you would now end up
    with ...-B-...-U-A-D-E-C.
    
    Breaking the order between B and others was the conscious
    decision made by the upstream, so we would not worry about it,
    and even if it were worrisome, it is too late for us to fix now.
    What D and E do may well depend on having C applied before them,
    which is a problem for us.
    
    This rewrites rebase to use git-format-patch piped to git-am,
    and when the patch does not apply, have git-am fall back on
    3-way merge.  The updated diff/patch pair knows how to apply
    trivial binary patches as long as the pre- and post-images are
    locally available, so this should work on a repository with
    binary files as well.
    
    The primary benefit of this change is that it makes rebase
    easier to use when some of the changes do not replay cleanly.
    In the "unapplicable patch in the middle" case, this "rebase"
    works like this:
    
     - A series of patches in e-mail form is created that records
       what A-C-D-E do, and is fed to git-am.  This is stored in
       .dotest/ directory, just like the case you tried to apply
       them from your mailbox.  Your branch is rewound to the tip of
       upstream U, and the original head is kept in .git/ORIG_HEAD,
       so you could "git reset --hard ORIG_HEAD" in case the end
       result is really messy.
    
     - Patch A applies cleanly.  This could either be a clean patch
       application on top of rewound head (i.e. same as upstream
       head), or git-am might have internally fallen back on 3-way
       (i.e.  it would have done the same thing as git-cherry-pick).
       In either case, a rebased commit A is made on top of U.
    
     - Patch C does not apply.  git-am stops here, with conflicts to
       be resolved in the working tree.  Yet-to-be-applied D and E
       are still kept in .dotest/ directory at this point.  What the
       user does is exactly the same as fixing up unapplicable patch
       when running git-am:
    
       - Resolve conflict just like any merge conflicts.
       - "git am --resolved --3way" to continue applying the patches.
    
     - This applies the fixed-up patch so by definition it had
       better apply.  "git am" knows the patch after the fixed-up
       one is D and then E; it applies them, and you will get the
       changes from A-C-D-E commits on top of U, in this order.
    
    I've been using this without noticing any problem, and as people
    may know I do a lot of rebases.
    
    Signed-off-by: Junio C Hamano <junkio@cox.net>
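
The stop-and-resume behaviour described in this commit message can be
modelled as a simple queue.  The sketch below is a toy model, not how
git-am is implemented: the struct names, the one-letter patch names and the
"conflicts" flag are all assumptions made for illustration.

```c
#include <stddef.h>

struct queued_patch {
	char name;	/* one-letter commit name, e.g. 'A' */
	int conflicts;	/* non-zero if it cannot be applied cleanly */
};

struct patch_queue {
	const struct queued_patch *patches;	/* in commit order */
	size_t count;
	size_t next;		/* first patch not yet applied */
	char applied[16];	/* names applied so far, NUL-terminated */
	size_t napplied;
};

/* Remember that a patch landed, keeping the log NUL-terminated. */
static void record(struct patch_queue *q, char name)
{
	if (q->napplied + 1 < sizeof(q->applied)) {
		q->applied[q->napplied++] = name;
		q->applied[q->napplied] = '\0';
	}
}

/*
 * Apply patches in order and stop at the first conflict.  Returns 0
 * when the queue is drained and -1 when stopped; in the latter case
 * q->next still points at the conflicting patch.
 */
int queue_run(struct patch_queue *q)
{
	while (q->next < q->count) {
		if (q->patches[q->next].conflicts)
			return -1;
		record(q, q->patches[q->next].name);
		q->next++;
	}
	return 0;
}

/*
 * The user fixed the conflict by hand: count the stopped patch as
 * applied, then continue with the rest in the original order.
 */
int queue_resolved(struct patch_queue *q)
{
	if (q->next < q->count) {
		record(q, q->patches[q->next].name);
		q->next++;
	}
	return queue_run(q);
}
```

With the queue A-C-D-E and a conflict in C, queue_run() stops after A, and
queue_resolved() then yields A-C-D-E in order, which is the property the
old cherry-pick based rebase did not have.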

* Re: Use a *real* built-in diff generator
From: Junio C Hamano @ 2006-03-25  6:54 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0603241938510.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> This uses a simplified libxdiff setup to generate unified diffs _without_ 
> doing  fork/execve of GNU "diff".

Good stuff.

> Now, in the interest of full disclosure, I should also point out a few 
> downsides:
>
>  - the libxdiff algorithm is different,...
>
>  - GNU diff does some nice eye-candy, like trying to figure out what the 
>    last function was, and adding that information to the "@@ .." line. 
>    libxdiff doesn't do that. 

That's kind of sad --- Documentation/SubmittingPatches asks
people to use "diff -u -p".

>  - The libxdiff thing has some known deficiencies. In particular, it gets 
>    the "\No newline at end of file" case wrong. So this is currently for 
>    the experimental branch only. I hope Davide will help fix it.

Another thing I noticed is that while libxdiff always shows full
line counts "-n,m +l,k", GNU diff seems to omit them when it can (m,k
<=1).  I am not sure if apply.c is set up to grok what libxdiff
emits correctly.  Running t/t1200 shows some obvious examples.
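
A parser that accepts both spellings is not hard to write.  The sketch
below is not the code in apply.c; it only illustrates the unified-diff rule
that an omitted count means 1, and the names hunk_range, parse_range() and
parse_hunk_header() are made up here.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct hunk_range {
	long old_start, old_count;
	long new_start, new_count;
};

/* Parse "N" or "N,M"; a missing ",M" means a count of 1. */
static const char *parse_range(const char *p, long *start, long *count)
{
	char *end;

	if (!isdigit((unsigned char)*p))
		return NULL;
	*start = strtol(p, &end, 10);
	*count = 1;
	if (*end == ',') {
		p = end + 1;
		if (!isdigit((unsigned char)*p))
			return NULL;
		*count = strtol(p, &end, 10);
	}
	return end;
}

/* Parse "@@ -n[,m] +l[,k] @@..."; returns 0 on success, -1 otherwise. */
int parse_hunk_header(const char *line, struct hunk_range *r)
{
	const char *p;

	if (strncmp(line, "@@ -", 4))
		return -1;
	p = parse_range(line + 4, &r->old_start, &r->old_count);
	if (!p || strncmp(p, " +", 2))
		return -1;
	p = parse_range(p + 2, &r->new_start, &r->new_count);
	if (!p || strncmp(p, " @@", 3))
		return -1;
	return 0;
}
```

Both "@@ -28,6 +29,7 @@" and the short form "@@ -1 +1 @@" go through the
same path, so output from either generator parses the same way.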

* Re: Effective difference between git-rebase and git-resolve
From: Marc Singer @ 2006-03-25  6:32 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <7v64m3ys3a.fsf@assigned-by-dhcp.cox.net>

On Fri, Mar 24, 2006 at 10:08:09PM -0800, Junio C Hamano wrote:
> > Junio, is there some magic to restart a rebase after you've fixed up the 
> > conflicts?
> 
> The modern rebase is essentially git-format-patch piped to
> git-am (with -3 flag to allow falling back to three-way merge),
> and all the familiar "the patch did not apply -- what now?"
> techniques can be employed.
> 
> Since the pre-image blobs recorded in the intermediate
> format-patch output by definition exist in your repository, it
> always falls back to three-way merge when the patch does not
> apply cleanly.  Then you can resolve and say "git am --resolved"
> to continue.

By modern do you mean newer than 1.2.4?  I comprehend what you're
layin' down here, but I don't know if I need to do something
different.

Moreover, it isn't clear to me if git-rebase is better than git-merge.

* Re: git push refspec URL weirdness
From: Junio C Hamano @ 2006-03-25  6:22 UTC
  To: git
In-Reply-To: <E1FMzfr-0006xT-Uq@jdl.com>

Jon Loeliger <jdl@jdl.com> writes:

> So Junio suggested taking advantage of the fact that the
> default refspec uses git+ssh and use this instead:
>
>     URL: www.example.com:/pub/software/linux-2.6-86xx.git
>     Push: my-branch:public-branch
>
> Which just worked.
>
> So this is either a bug report or google food. :-)

Actually, I did not suggest it as a workaround (I've never used
a git+ssh:// URL myself -- I'm old fashioned -- and always used
host:path syntax).  If git+ssh:// insists on the fixed port, it
surely is broken, but I do not see how it would make a
difference.  In either case, connect.c::git_connect() goes down
the PROTO_SSH codepath, which never opens a TCP connection itself --
it just calls the same "ssh" command.

* Re: Bug encountered while comitting
From: Junio C Hamano @ 2006-03-25  6:17 UTC
  To: Matthias Kestenholz; +Cc: git
In-Reply-To: <20060325011527.GA23600@spinlock.ch>

Matthias Kestenholz <lists@irregular.ch> writes:

> $ sudo chown root .git/objects/*
>
> repeat the modification and commit commands until you get a message
> similar to the following:
>
> unable to write sha1 filename .git/objects/90/b33..: Permission denied
> fatal: 90b33... is not a valid 'tree' object
> unable to write sha1 filename .git/objects/ba/fe4..: Permission denied
> error: file: failed to insert into database
> fatal: Unable to process file file
> etc...
>
> The result of this all is: refs/heads/master might now point to a
> nonexistent commit object. Every git command now errors out with:
>
> fatal: bad tree object HEAD
>
> and git-log shows no output (probably since it does not find a
> commit to begin with)

You are right.  commit-tree does not seem to check if it
successfully wrote the commit object.  How about this?

-- >8 --
diff --git a/commit-tree.c b/commit-tree.c
index 88871b0..16c1787 100644
--- a/commit-tree.c
+++ b/commit-tree.c
@@ -125,7 +125,10 @@ int main(int argc, char **argv)
 	while (fgets(comment, sizeof(comment), stdin) != NULL)
 		add_buffer(&buffer, &size, "%s", comment);
 
-	write_sha1_file(buffer, size, "commit", commit_sha1);
-	printf("%s\n", sha1_to_hex(commit_sha1));
-	return 0;
+	if (!write_sha1_file(buffer, size, "commit", commit_sha1)) {
+		printf("%s\n", sha1_to_hex(commit_sha1));
+		return 0;
+	}
+	else
+		return 1;
 }

* Re: Effective difference between git-rebase and git-resolve
From: Junio C Hamano @ 2006-03-25  6:10 UTC
  To: Marc Singer; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <20060325043507.GA14644@buici.com>

Marc Singer <elf@buici.com> writes:

>> 	git merge "Merge with Linus" work linus
>> 
>> instead, which will use the proper "recursive" merge functionality.
>
> OK.  I'll see if that is more successful.  It would be nice if the
> resolve command printed a message about the command being deprecated.

The only reason I didn't do that was because I just did not want
to disrupt Linus's workflow.  If nobody in the upper
echelon of kernel people (meaning, longest-time git users) uses
git-resolve anymore, I think we should mark it deprecated and
remove it eventually.

* Re: Effective difference between git-rebase and git-resolve
From: Junio C Hamano @ 2006-03-25  6:08 UTC
  To: Linus Torvalds; +Cc: Marc Singer, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0603242014160.15714@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> As to rebase, it often is very nice, but on the other hand, it leaves 
> things in a total mess when it fails, which is a pity. Maybe there's a 
> nice way to just continue, but I end up just doing a
>
> 	git reset --hard ORIG_HEAD
>
> to undo the failed rebase.
>
> Junio, is there some magic to restart a rebase after you've fixed up the 
> conflicts?

The modern rebase is essentially git-format-patch piped to
git-am (with -3 flag to allow falling back to three-way merge),
and all the familiar "the patch did not apply -- what now?"
techniques can be employed.

Since the pre-image blobs recorded in the intermediate
format-patch output by definition exist in your repository, it
always falls back to three-way merge when the patch does not
apply cleanly.  Then you can resolve and say "git am --resolved"
to continue.

* Re: Effective difference between git-rebase and git-resolve
From: Linus Torvalds @ 2006-03-25  4:23 UTC
  To: Marc Singer; +Cc: Git Mailing List, Junio C Hamano
In-Reply-To: <20060325035423.GB31504@buici.com>

On Fri, 24 Mar 2006, Marc Singer wrote:
>
> The process I've been using to keep my patches current with the latest
> development is this:
> 
>   git checkout linus && git pull linus
>   git checkout work

You'd be much more efficient if you just did

	git fetch linus

which avoids switching back-and-forth (and speeds up the pull too, since 
it doesn't need to update any working directories).

> When I'm ready to merge,
> 
>   git resolve work linus "Update with head"

No, don't do this.

"git resolve" is the _old_ stupid merger, which isn't very helpful at all. 
So please use

	git merge "Merge with Linus" work linus

instead, which will use the proper "recursive" merge functionality.

> Then, I found git-rebase which seems to be more what I'd like to use
> since it moves my patches along on top of the main development line.
> 
>   git rebase linus
> 
> This time, almost everything merged without a hitch except for the
> thorny file from before.  I edited the file, removing the conflict
> markers, and started a build.  But what I found was that some of the
> changes I'd made were no longer present.

Yeah, "git rebase" is not _nearly_ as intuitive as doing a real merge.

What happened was that you resolved the thorny merge, but the rebase had 
stopped when it hit it, so it never actually did the rest of the rebase. 
Which explains why some of your changes are no longer present: they are 
still in the "rebase queue".

>   1) Am I using rebase correctly?

Yes, but you missed the fact that unlike "git merge", rebasing really is a 
"move one commit at a time" thing, and it stopped in the middle.

>   4) Should I prefer rebase over resolve?

You should never do "resolve", it's very old-fashioned. If you want to 
merge, just use "git merge", which will do the right thing.

As to rebase, it often is very nice, but on the other hand, it leaves 
things in a total mess when it fails, which is a pity. Maybe there's a 
nice way to just continue, but I end up just doing a

	git reset --hard ORIG_HEAD

to undo the failed rebase.

Junio, is there some magic to restart a rebase after you've fixed up the 
conflicts?

		Linus

* Use a *real* built-in diff generator
From: Linus Torvalds @ 2006-03-25  4:13 UTC
  To: Junio C Hamano, Git Mailing List; +Cc: Davide Libenzi


This uses a simplified libxdiff setup to generate unified diffs _without_ 
doing  fork/execve of GNU "diff".

This has several huge advantages, for example:

Before:

	[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null 

	real    0m24.818s
	user    0m13.332s
	sys     0m8.664s

After:

	[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null 
	
	real    0m4.563s
	user    0m2.944s
	sys     0m1.580s

and the fact that this should be a lot more portable (ie we can ignore all 
the issues with doing fork/execve under Windows).

Perhaps even more importantly, this allows us to do diffs without actually 
ever writing out the git file contents to a temporary file (and without 
any of the shell quoting issues on filenames etc etc).

NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the 
current "diff-core" code actually will always write the temp-files, 
because it used to be something that you simply had to do. So this current 
one actually writes a temp-file like before, and then reads it into memory 
again just to do the diff. Stupid.

But if this basic infrastructure is accepted, we can start switching over 
diff-core to not write temp-files, which should speed things up even 
further, especially when doing big tree-to-tree diffs.

Now, in the interest of full disclosure, I should also point out a few 
downsides:

 - the libxdiff algorithm is different, and I bet GNU diff has gotten a 
   lot more testing. And the thing is, generating a diff is not an exact 
   science - you can get two different diffs (and you will), and they can 
   both be perfectly valid. So it's not possible to "validate" the 
   libxdiff output by just comparing it against GNU diff.

 - GNU diff does some nice eye-candy, like trying to figure out what the 
   last function was, and adding that information to the "@@ .." line. 
   libxdiff doesn't do that. 

 - The libxdiff thing has some known deficiencies. In particular, it gets 
   the "\No newline at end of file" case wrong. So this is currently for 
   the experimental branch only. I hope Davide will help fix it.
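
For the last point, the expected behaviour is easy to state: when the final
line of a file has no trailing newline, GNU diff terminates it in the
output and follows it with a "\ No newline at end of file" marker line.  A
minimal sketch of that rule, formatting one diff line into a caller's
buffer; format_diff_line() is a made-up name and this is not libxdiff code:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one diff line into dst: the prefix (' ', '-' or '+') followed
 * by the line's bytes.  If the line does not end in '\n', terminate it
 * and append the marker.  Returns the number of bytes written (not
 * counting the NUL), or -1 if dst is too small.  Lines are assumed to
 * be short text without embedded NUL bytes.
 */
int format_diff_line(char *dst, size_t cap, char prefix,
		     const char *line, size_t len)
{
	int complete = len > 0 && line[len - 1] == '\n';
	int n = snprintf(dst, cap, "%c%.*s%s", prefix, (int)len, line,
			 complete ? "" : "\n\\ No newline at end of file\n");

	return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

An emitter that gets this wrong produces a patch which adds or drops the
final newline when it is applied.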

That said, I think the huge performance advantage, and the fact that it 
integrates better is definitely worth it. But it should go into a 
development branch at least due to the missing newline issue.

Technical note: this is based on libxdiff-0.17, but I did some surgery to 
get rid of the extraneous fat - stuff that git doesn't need, and seriously 
cutting down on mmfile_t, which had many more capabilities than the diff 
algorithm either needed or used. In this version, "mmfile_t" is just a 
trivial <pointer,length> tuple.

That said, I tried to keep the differences to simple removals, so that you 
can do a diff between this and the libxdiff origin, and you'll basically 
see just things getting deleted. Even the mmfile_t simplifications are 
left in a state where the diffs should be readable.

Apologies to Davide, from whom I'd love to get feedback on all of this (I 
wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old 
complex format had a helper function for that, but I did my surgery with 
the goal in mind that eventually we _should_ just do

	mmfile_t mf;

	buf = read_sha1_file(sha1, type, &size);
	mf.ptr = buf;		/* mf is a struct, not a pointer */
	mf.size = size;
	.. use "mf" directly ..

which was really a nightmare with the old "helpful" mmfile_t, and really 
is that easy with the new cut-down interfaces).
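
To show how little a <pointer,length> tuple needs, here is a sketch that
fills one from an already-open stdio stream and reports read and allocation
failures to the caller.  It is not the fill_mmfile() from the patch below:
the function name fill_mmfile_from_stream() is made up, and the mmfile_t
typedef is repeated only so the example stands on its own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Mirrors the cut-down struct from xdiff/xdiff.h in this patch. */
typedef struct {
	char *ptr;
	long size;
} mmfile_t;

/*
 * Read everything that is left in fp into a malloc'd buffer.
 * Returns 0 on success; on failure returns -1 and leaves mf empty.
 * The caller frees mf->ptr.
 */
int fill_mmfile_from_stream(mmfile_t *mf, FILE *fp)
{
	size_t cap = 4096, len = 0, n;
	char *buf = malloc(cap);

	mf->ptr = NULL;
	mf->size = 0;
	if (!buf)
		return -1;
	while ((n = fread(buf + len, 1, cap - len, fp)) > 0) {
		len += n;
		if (len == cap) {
			/* Buffer is full: double it before reading more. */
			char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);
				return -1;
			}
			buf = tmp;
			cap *= 2;
		}
	}
	if (ferror(fp)) {
		free(buf);
		return -1;
	}
	mf->ptr = buf;
	mf->size = (long)len;
	return 0;
}
```

Returning -1 on failure lets a caller's "< 0" check actually fire.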

[ Btw, as any hawk-eye can see from the diff, this was actually generated 
  with itself, so it is "self-hosting". That's about all the testing it 
  has gotten, along with the above kernel diff, which eye-balls correctly, 
  but shows the newline issue when you double-check it with "git-apply" ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
----
 Makefile         |   11 +
 diff.c           |   79 ++++++++-
 xdiff/xdiff.h    |   91 ++++++++++
 xdiff/xdiffi.c   |  469 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 xdiff/xdiffi.h   |   60 +++++++
 xdiff/xemit.c    |  141 ++++++++++++++++
 xdiff/xemit.h    |   34 ++++
 xdiff/xinclude.h |   42 +++++
 xdiff/xmacros.h  |   53 ++++++
 xdiff/xprepare.c |  436 ++++++++++++++++++++++++++++++++++++++++++++++++++
 xdiff/xprepare.h |   35 ++++
 xdiff/xtypes.h   |   68 ++++++++
 xdiff/xutils.c   |  265 +++++++++++++++++++++++++++++++
 xdiff/xutils.h   |   44 +++++
 14 files changed, 1820 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index 8d45378..0f565eb 100644
--- a/Makefile
+++ b/Makefile
@@ -188,6 +188,7 @@
 	gitMergeCommon.py
 
 LIB_FILE=libgit.a
+XDIFF_LIB=xdiff/lib.a
 
 LIB_H = \
 	blob.h cache.h commit.h count-delta.h csum-file.h delta.h \
@@ -209,7 +210,7 @@
 	fetch-clone.o revision.o pager.o \
 	$(DIFF_OBJS)
 
-LIBS = $(LIB_FILE)
+LIBS = $(LIB_FILE) $(XDIFF_LIB)
 LIBS += -lz
 
 #
@@ -544,11 +545,17 @@
 		-DDEFAULT_GIT_TEMPLATE_DIR='"$(template_dir_SQ)"' $*.c
 
 $(LIB_OBJS): $(LIB_H)
-$(patsubst git-%$X,%.o,$(PROGRAMS)): $(LIB_H)
+$(patsubst git-%$X,%.o,$(PROGRAMS)): $(LIBS)
 $(DIFF_OBJS): diffcore.h
 
 $(LIB_FILE): $(LIB_OBJS)
 	$(AR) rcs $@ $(LIB_OBJS)
+
+XDIFF_OBJS=xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o
+
+$(XDIFF_LIB): $(XDIFF_OBJS)
+	$(AR) rcs $@ $(XDIFF_OBJS)
+
 
 doc:
 	$(MAKE) -C Documentation all
diff --git a/diff.c b/diff.c
index c0548ee..f6a1f5d 100644
--- a/diff.c
+++ b/diff.c
@@ -8,6 +8,7 @@
 #include "quote.h"
 #include "diff.h"
 #include "diffcore.h"
+#include "xdiff/xdiff.h"
 
 static const char *diff_opts = "-pu";
 
@@ -178,6 +179,49 @@
 		copy_file('+', temp[1].name);
 }
 
+static int fill_mmfile(mmfile_t *mf, const char *file)
+{
+	int fd = open(file, O_RDONLY);
+	struct stat st;
+	char *buf;
+	unsigned long size;
+
+	mf->ptr = NULL;
+	mf->size = 0;
+	if (fd < 0)
+		return 0;
+	fstat(fd, &st);
+	size = st.st_size;
+	buf = xmalloc(size);
+	mf->ptr = buf;
+	mf->size = size;
+	while (size) {
+		int retval = read(fd, buf, size);
+		if (retval < 0) {
+			if (errno == EINTR || errno == EAGAIN)
+				continue;
+			break;
+		}
+		if (!retval)
+			break;
+		buf += retval;
+		size -= retval;
+	}
+	mf->size -= size;
+	close(fd);
+	return 0;
+}
+
+static int fn_out(void *priv, mmbuffer_t *mb, int nbuf)
+{
+	int i;
+
+	for (i = 0; i < nbuf; i++)
+		if (!fwrite(mb[i].ptr, mb[i].size, 1, stdout))
+			return -1;
+	return 0;
+}
+
 static const char *builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_tempfile *temp,
@@ -186,6 +230,7 @@
 			 const char **args)
 {
 	int i, next_at, cmd_size;
+	mmfile_t mf1, mf2;
 	const char *const diff_cmd = "diff -L%s -L%s";
 	const char *const diff_arg  = "-- %s %s||:"; /* "||:" is to return 0 */
 	const char *input_name_sq[2];
@@ -253,14 +298,36 @@
 			emit_rewrite_diff(name_a, name_b, temp);
 			return NULL;
 		}
+	}
+
+	/* Un-quote the paths */
+	if (label_path[0][0] != '/')
+		label_path[0] = quote_two("a/", name_a);
+	if (label_path[1][0] != '/')
+		label_path[1] = quote_two("b/", name_b);
+
+	printf("--- %s\n", label_path[0]);
+	printf("+++ %s\n", label_path[1]);
+
+	if (fill_mmfile(&mf1, temp[0].name) < 0 ||
+	    fill_mmfile(&mf2, temp[1].name) < 0)
+		die("unable to read files to diff");
+
+	/* Crazy xdl interfaces.. */
+	{
+		xpparam_t xpp;
+		xdemitconf_t xecfg;
+		xdemitcb_t ecb;
+
+		xpp.flags = XDF_NEED_MINIMAL;
+		xecfg.ctxlen = 3;
+		ecb.outf = fn_out;
+		xdl_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);
 	}
 
-	/* This is disgusting */
-	*args++ = "sh";
-	*args++ = "-c";
-	*args++ = cmd;
-	*args = NULL;
-	return "/bin/sh";
+	free(mf1.ptr);
+	free(mf2.ptr);
+	return NULL;
 }
 
 struct diff_filespec *alloc_filespec(const char *path)
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
new file mode 100644
index 0000000..d900295
--- /dev/null
+++ b/xdiff/xdiff.h
@@ -0,0 +1,91 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XDIFF_H)
+#define XDIFF_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif /* #ifdef __cplusplus */
+
+
+#define XDF_NEED_MINIMAL (1 << 1)
+
+#define XDL_PATCH_NORMAL '-'
+#define XDL_PATCH_REVERSE '+'
+#define XDL_PATCH_MODEMASK ((1 << 8) - 1)
+#define XDL_PATCH_IGNOREBSPACE (1 << 8)
+	
+#define XDL_MMB_READONLY (1 << 0)
+
+#define XDL_MMF_ATOMIC (1 << 0)
+
+#define XDL_BDOP_INS 1
+#define XDL_BDOP_CPY 2
+#define XDL_BDOP_INSB 3
+
+
+typedef struct s_mmfile {
+	char *ptr;
+	long size;
+} mmfile_t;
+
+typedef struct s_mmbuffer {
+	char *ptr;
+	long size;
+} mmbuffer_t;
+
+typedef struct s_xpparam {
+	unsigned long flags;
+} xpparam_t;
+
+typedef struct s_xdemitcb {
+	void *priv;
+	int (*outf)(void *, mmbuffer_t *, int);
+} xdemitcb_t;
+
+typedef struct s_xdemitconf {
+	long ctxlen;
+} xdemitconf_t;
+
+typedef struct s_bdiffparam {
+	long bsize;
+} bdiffparam_t;
+
+
+#define xdl_malloc(x) malloc(x)
+#define xdl_free(ptr) free(ptr)
+#define xdl_realloc(ptr,x) realloc(ptr,x)
+
+void *xdl_mmfile_first(mmfile_t *mmf, long *size);
+void *xdl_mmfile_next(mmfile_t *mmf, long *size);
+long xdl_mmfile_size(mmfile_t *mmf);
+
+int xdl_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+	     xdemitconf_t const *xecfg, xdemitcb_t *ecb);
+
+#ifdef __cplusplus
+}
+#endif /* #ifdef __cplusplus */
+
+#endif /* #if !defined(XDIFF_H) */
+
diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
new file mode 100644
index 0000000..8ea0483
--- /dev/null
+++ b/xdiff/xdiffi.c
@@ -0,0 +1,469 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003	Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#include "xinclude.h"
+
+
+
+#define XDL_MAX_COST_MIN 256
+#define XDL_HEUR_MIN_COST 256
+#define XDL_LINE_MAX (long)((1UL << (8 * sizeof(long) - 1)) - 1)
+#define XDL_SNAKE_CNT 20
+#define XDL_K_HEUR 4
+
+
+
+typedef struct s_xdpsplit {
+	long i1, i2;
+	int min_lo, min_hi;
+} xdpsplit_t;
+
+
+
+
+static long xdl_split(unsigned long const *ha1, long off1, long lim1,
+		      unsigned long const *ha2, long off2, long lim2,
+		      long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
+		      xdalgoenv_t *xenv);
+static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1, long chg2);
+
+
+
+
+/*
+ * See "An O(ND) Difference Algorithm and its Variations", by Eugene Myers.
+ * Basically, consider a "box" (off1, off2, lim1, lim2) and scan both the
+ * forward diagonal starting from (off1, off2) and the backward diagonal
+ * starting from (lim1, lim2). If the K values on the same diagonal cross,
+ * return the furthest point of reach. Running this algorithm in full can
+ * be too expensive in some cases, so a little bit of heuristic is needed
+ * to cut the search and to return a suboptimal point.
+ */
+static long xdl_split(unsigned long const *ha1, long off1, long lim1,
+		      unsigned long const *ha2, long off2, long lim2,
+		      long *kvdf, long *kvdb, int need_min, xdpsplit_t *spl,
+		      xdalgoenv_t *xenv) {
+	long dmin = off1 - lim2, dmax = lim1 - off2;
+	long fmid = off1 - off2, bmid = lim1 - lim2;
+	long odd = (fmid - bmid) & 1;
+	long fmin = fmid, fmax = fmid;
+	long bmin = bmid, bmax = bmid;
+	long ec, d, i1, i2, prev1, best, dd, v, k;
+
+	/*
+	 * Set initial diagonal values for both forward and backward path.
+	 */
+	kvdf[fmid] = off1;
+	kvdb[bmid] = lim1;
+
+	for (ec = 1;; ec++) {
+		int got_snake = 0;
+
+		/*
+		 * We need to extend the diagonal "domain" by one. If the next
+		 * values exit the box boundaries we need to change it in the
+		 * opposite direction because (max - min) must be a power of two.
+		 * Also we initialize the external K value to -1 so that we can
+		 * avoid extra condition checks inside the core loop.
+		 */
+		if (fmin > dmin)
+			kvdf[--fmin - 1] = -1;
+		else
+			++fmin;
+		if (fmax < dmax)
+			kvdf[++fmax + 1] = -1;
+		else
+			--fmax;
+
+		for (d = fmax; d >= fmin; d -= 2) {
+			if (kvdf[d - 1] >= kvdf[d + 1])
+				i1 = kvdf[d - 1] + 1;
+			else
+				i1 = kvdf[d + 1];
+			prev1 = i1;
+			i2 = i1 - d;
+			for (; i1 < lim1 && i2 < lim2 && ha1[i1] == ha2[i2]; i1++, i2++);
+			if (i1 - prev1 > xenv->snake_cnt)
+				got_snake = 1;
+			kvdf[d] = i1;
+			if (odd && bmin <= d && d <= bmax && kvdb[d] <= i1) {
+				spl->i1 = i1;
+				spl->i2 = i2;
+				spl->min_lo = spl->min_hi = 1;
+				return ec;
+			}
+		}
+
+		/*
+		 * We need to extend the diagonal "domain" by one. If the next
+		 * values exit the box boundaries we need to change it in the
+		 * opposite direction because (max - min) must be a power of two.
+		 * Also we initialize the external K value to XDL_LINE_MAX so that
+		 * we can avoid extra condition checks inside the core loop.
+		 */
+		if (bmin > dmin)
+			kvdb[--bmin - 1] = XDL_LINE_MAX;
+		else
+			++bmin;
+		if (bmax < dmax)
+			kvdb[++bmax + 1] = XDL_LINE_MAX;
+		else
+			--bmax;
+
+		for (d = bmax; d >= bmin; d -= 2) {
+			if (kvdb[d - 1] < kvdb[d + 1])
+				i1 = kvdb[d - 1];
+			else
+				i1 = kvdb[d + 1] - 1;
+			prev1 = i1;
+			i2 = i1 - d;
+			for (; i1 > off1 && i2 > off2 && ha1[i1 - 1] == ha2[i2 - 1]; i1--, i2--);
+			if (prev1 - i1 > xenv->snake_cnt)
+				got_snake = 1;
+			kvdb[d] = i1;
+			if (!odd && fmin <= d && d <= fmax && i1 <= kvdf[d]) {
+				spl->i1 = i1;
+				spl->i2 = i2;
+				spl->min_lo = spl->min_hi = 1;
+				return ec;
+			}
+		}
+
+		if (need_min)
+			continue;
+
+		/*
+		 * If the edit cost is above the heuristic trigger and if
+		 * we got a good snake, we sample current diagonals to see
+		 * if some of them have reached an "interesting" path. Our
+		 * measure is a function of the distance from the diagonal
+		 * corner (i1 + i2) penalized with the distance from the
+		 * mid diagonal itself. If this value is above the current
+		 * edit cost times a magic factor (XDL_K_HEUR) we consider
+		 * it interesting.
+		 */
+		if (got_snake && ec > xenv->heur_min) {
+			for (best = 0, d = fmax; d >= fmin; d -= 2) {
+				dd = d > fmid ? d - fmid: fmid - d;
+				i1 = kvdf[d];
+				i2 = i1 - d;
+				v = (i1 - off1) + (i2 - off2) - dd;
+
+				if (v > XDL_K_HEUR * ec && v > best &&
+				    off1 + xenv->snake_cnt <= i1 && i1 < lim1 &&
+				    off2 + xenv->snake_cnt <= i2 && i2 < lim2) {
+					for (k = 1; ha1[i1 - k] == ha2[i2 - k]; k++)
+						if (k == xenv->snake_cnt) {
+							best = v;
+							spl->i1 = i1;
+							spl->i2 = i2;
+							break;
+						}
+				}
+			}
+			if (best > 0) {
+				spl->min_lo = 1;
+				spl->min_hi = 0;
+				return ec;
+			}
+
+			for (best = 0, d = bmax; d >= bmin; d -= 2) {
+				dd = d > bmid ? d - bmid: bmid - d;
+				i1 = kvdb[d];
+				i2 = i1 - d;
+				v = (lim1 - i1) + (lim2 - i2) - dd;
+
+				if (v > XDL_K_HEUR * ec && v > best &&
+				    off1 < i1 && i1 <= lim1 - xenv->snake_cnt &&
+				    off2 < i2 && i2 <= lim2 - xenv->snake_cnt) {
+					for (k = 0; ha1[i1 + k] == ha2[i2 + k]; k++)
+						if (k == xenv->snake_cnt - 1) {
+							best = v;
+							spl->i1 = i1;
+							spl->i2 = i2;
+							break;
+						}
+				}
+			}
+			if (best > 0) {
+				spl->min_lo = 0;
+				spl->min_hi = 1;
+				return ec;
+			}
+		}
+
+		/*
+		 * Enough is enough. We spent too much time here and now we collect
+		 * the furthest reaching path using the (i1 + i2) measure.
+		 */
+		if (ec >= xenv->mxcost) {
+			long fbest, fbest1, bbest, bbest1;
+
+			fbest = -1;
+			for (d = fmax; d >= fmin; d -= 2) {
+				i1 = XDL_MIN(kvdf[d], lim1);
+				i2 = i1 - d;
+				if (lim2 < i2)
+					i1 = lim2 + d, i2 = lim2;
+				if (fbest < i1 + i2) {
+					fbest = i1 + i2;
+					fbest1 = i1;
+				}
+			}
+
+			bbest = XDL_LINE_MAX;
+			for (d = bmax; d >= bmin; d -= 2) {
+				i1 = XDL_MAX(off1, kvdb[d]);
+				i2 = i1 - d;
+				if (i2 < off2)
+					i1 = off2 + d, i2 = off2;
+				if (i1 + i2 < bbest) {
+					bbest = i1 + i2;
+					bbest1 = i1;
+				}
+			}
+
+			if ((lim1 + lim2) - bbest < fbest - (off1 + off2)) {
+				spl->i1 = fbest1;
+				spl->i2 = fbest - fbest1;
+				spl->min_lo = 1;
+				spl->min_hi = 0;
+			} else {
+				spl->i1 = bbest1;
+				spl->i2 = bbest - bbest1;
+				spl->min_lo = 0;
+				spl->min_hi = 1;
+			}
+			return ec;
+		}
+	}
+
+	return -1;
+}
+
+
+/*
+ * Rule: "Divide et Impera". Recursively split the box in sub-boxes by calling
+ * the box splitting function. Note that the real job (marking changed lines)
+ * is done in the two boundary reaching checks.
+ */
+int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
+		 diffdata_t *dd2, long off2, long lim2,
+		 long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv) {
+	unsigned long const *ha1 = dd1->ha, *ha2 = dd2->ha;
+
+	/*
+	 * Shrink the box by walking through each diagonal snake (SW and NE).
+	 */
+	for (; off1 < lim1 && off2 < lim2 && ha1[off1] == ha2[off2]; off1++, off2++);
+	for (; off1 < lim1 && off2 < lim2 && ha1[lim1 - 1] == ha2[lim2 - 1]; lim1--, lim2--);
+
+	/*
+	 * If one dimension is empty, then all records on the other one must
+	 * be obviously changed.
+	 */
+	if (off1 == lim1) {
+		char *rchg2 = dd2->rchg;
+		long *rindex2 = dd2->rindex;
+
+		for (; off2 < lim2; off2++)
+			rchg2[rindex2[off2]] = 1;
+	} else if (off2 == lim2) {
+		char *rchg1 = dd1->rchg;
+		long *rindex1 = dd1->rindex;
+
+		for (; off1 < lim1; off1++)
+			rchg1[rindex1[off1]] = 1;
+	} else {
+		long ec;
+		xdpsplit_t spl;
+
+		/*
+		 * Divide ...
+		 */
+		if ((ec = xdl_split(ha1, off1, lim1, ha2, off2, lim2, kvdf, kvdb,
+				    need_min, &spl, xenv)) < 0) {
+
+			return -1;
+		}
+
+		/*
+		 * ... et Impera.
+		 */
+		if (xdl_recs_cmp(dd1, off1, spl.i1, dd2, off2, spl.i2,
+				 kvdf, kvdb, spl.min_lo, xenv) < 0 ||
+		    xdl_recs_cmp(dd1, spl.i1, lim1, dd2, spl.i2, lim2,
+				 kvdf, kvdb, spl.min_hi, xenv) < 0) {
+
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+
+int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		xdfenv_t *xe) {
+	long ndiags;
+	long *kvd, *kvdf, *kvdb;
+	xdalgoenv_t xenv;
+	diffdata_t dd1, dd2;
+
+	if (xdl_prepare_env(mf1, mf2, xpp, xe) < 0) {
+
+		return -1;
+	}
+
+	/*
+	 * Allocate and setup K vectors to be used by the differential algorithm.
+	 * One is to store the forward path and one to store the backward path.
+	 */
+	ndiags = xe->xdf1.nreff + xe->xdf2.nreff + 3;
+	if (!(kvd = (long *) xdl_malloc((2 * ndiags + 2) * sizeof(long)))) {
+
+		xdl_free_env(xe);
+		return -1;
+	}
+	kvdf = kvd;
+	kvdb = kvdf + ndiags;
+	kvdf += xe->xdf2.nreff + 1;
+	kvdb += xe->xdf2.nreff + 1;
+
+	/*
+	 * Classical integer square root approximation using shifts.
+	 */
+	xenv.mxcost = 1;
+	for (; ndiags; ndiags >>= 2)
+		xenv.mxcost <<= 1;
+	if (xenv.mxcost < XDL_MAX_COST_MIN)
+		xenv.mxcost = XDL_MAX_COST_MIN;
+	xenv.snake_cnt = XDL_SNAKE_CNT;
+	xenv.heur_min = XDL_HEUR_MIN_COST;
+
+	dd1.nrec = xe->xdf1.nreff;
+	dd1.ha = xe->xdf1.ha;
+	dd1.rchg = xe->xdf1.rchg;
+	dd1.rindex = xe->xdf1.rindex;
+	dd2.nrec = xe->xdf2.nreff;
+	dd2.ha = xe->xdf2.ha;
+	dd2.rchg = xe->xdf2.rchg;
+	dd2.rindex = xe->xdf2.rindex;
+
+	if (xdl_recs_cmp(&dd1, 0, dd1.nrec, &dd2, 0, dd2.nrec,
+			 kvdf, kvdb, (xpp->flags & XDF_NEED_MINIMAL) != 0, &xenv) < 0) {
+
+		xdl_free(kvd);
+		xdl_free_env(xe);
+		return -1;
+	}
+
+	xdl_free(kvd);
+
+	return 0;
+}
+
+
+static xdchange_t *xdl_add_change(xdchange_t *xscr, long i1, long i2, long chg1, long chg2) {
+	xdchange_t *xch;
+
+	if (!(xch = (xdchange_t *) xdl_malloc(sizeof(xdchange_t))))
+		return NULL;
+
+	xch->next = xscr;
+	xch->i1 = i1;
+	xch->i2 = i2;
+	xch->chg1 = chg1;
+	xch->chg2 = chg2;
+
+	return xch;
+}
+
+
+int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) {
+	xdchange_t *cscr = NULL, *xch;
+	char *rchg1 = xe->xdf1.rchg, *rchg2 = xe->xdf2.rchg;
+	long i1, i2, l1, l2;
+
+	/*
+	 * Trivial. Collects "groups" of changes and creates an edit script.
+	 */
+	for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--)
+		if (rchg1[i1 - 1] || rchg2[i2 - 1]) {
+			for (l1 = i1; rchg1[i1 - 1]; i1--);
+			for (l2 = i2; rchg2[i2 - 1]; i2--);
+
+			if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) {
+				xdl_free_script(cscr);
+				return -1;
+			}
+			cscr = xch;
+		}
+
+	*xscr = cscr;
+
+	return 0;
+}
+
+
+void xdl_free_script(xdchange_t *xscr) {
+	xdchange_t *xch;
+
+	while ((xch = xscr) != NULL) {
+		xscr = xscr->next;
+		xdl_free(xch);
+	}
+}
+
+
+int xdl_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+	     xdemitconf_t const *xecfg, xdemitcb_t *ecb) {
+	xdchange_t *xscr;
+	xdfenv_t xe;
+
+	if (xdl_do_diff(mf1, mf2, xpp, &xe) < 0) {
+
+		return -1;
+	}
+
+	if (xdl_build_script(&xe, &xscr) < 0) {
+
+		xdl_free_env(&xe);
+		return -1;
+	}
+
+	if (xscr) {
+		if (xdl_emit_diff(&xe, xscr, ecb, xecfg) < 0) {
+
+			xdl_free_script(xscr);
+			xdl_free_env(&xe);
+			return -1;
+		}
+
+		xdl_free_script(xscr);
+	}
+
+	xdl_free_env(&xe);
+
+	return 0;
+}
+
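Note on the "integer square root approximation using shifts" in
xdl_do_diff() above: the loop does not compute an exact root. The sketch
below mirrors it in isolation, as far as I can tell from the patch. The
names approx_sqrt_pow2, max_cost and MAX_COST_MIN are illustrative only
and are not part of the patch. For n > 0 the result is a power of two in
the range (sqrt(n), 2*sqrt(n)], which is then clamped from below in the
same way XDL_MAX_COST_MIN is used.

```c
#define MAX_COST_MIN 256	/* same value as XDL_MAX_COST_MIN in the patch */

/*
 * Shift-based bound: the result doubles once per base-4 digit of n.
 * Assumes n >= 0; a negative n would never reach zero when shifted.
 */
static long approx_sqrt_pow2(long n)
{
	long r = 1;

	for (; n; n >>= 2)
		r <<= 1;
	return r;
}

/* Edit-cost ceiling derived from the number of diagonals. */
static long max_cost(long ndiags)
{
	long c = approx_sqrt_pow2(ndiags);

	return c < MAX_COST_MIN ? MAX_COST_MIN : c;
}
```

With these definitions, small inputs are dominated by the lower clamp, so
the shift loop only matters once the number of diagonals grows past
roughly 256 * 256.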
diff --git a/xdiff/xdiffi.h b/xdiff/xdiffi.h
new file mode 100644
index 0000000..dd8f3c9
--- /dev/null
+++ b/xdiff/xdiffi.h
@@ -0,0 +1,60 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XDIFFI_H)
+#define XDIFFI_H
+
+
+typedef struct s_diffdata {
+	long nrec;
+	unsigned long const *ha;
+	long *rindex;
+	char *rchg;
+} diffdata_t;
+
+typedef struct s_xdalgoenv {
+	long mxcost;
+	long snake_cnt;
+	long heur_min;
+} xdalgoenv_t;
+
+typedef struct s_xdchange {
+	struct s_xdchange *next;
+	long i1, i2;
+	long chg1, chg2;
+} xdchange_t;
+
+
+
+int xdl_recs_cmp(diffdata_t *dd1, long off1, long lim1,
+		 diffdata_t *dd2, long off2, long lim2,
+		 long *kvdf, long *kvdb, int need_min, xdalgoenv_t *xenv);
+int xdl_do_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		xdfenv_t *xe);
+int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr);
+void xdl_free_script(xdchange_t *xscr);
+int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
+		  xdemitconf_t const *xecfg);
+
+
+#endif /* #if !defined(XDIFFI_H) */
+
diff --git a/xdiff/xemit.c b/xdiff/xemit.c
new file mode 100644
index 0000000..2e5d54c
--- /dev/null
+++ b/xdiff/xemit.c
@@ -0,0 +1,141 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003	Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#include "xinclude.h"
+
+
+
+
+static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec);
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb);
+static xdchange_t *xdl_get_hunk(xdchange_t *xscr, xdemitconf_t const *xecfg);
+
+
+
+
+static long xdl_get_rec(xdfile_t *xdf, long ri, char const **rec) {
+
+	*rec = xdf->recs[ri]->ptr;
+
+	return xdf->recs[ri]->size;
+}
+
+
+static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *ecb) {
+	long size, psize = strlen(pre);
+	char const *rec;
+
+	size = xdl_get_rec(xdf, ri, &rec);
+	if (xdl_emit_diffrec(rec, size, pre, psize, ecb) < 0) {
+
+		return -1;
+	}
+
+	return 0;
+}
+
+
+/*
+ * Starting at the passed change atom, find the latest change atom to be included
+ * inside the differential hunk according to the specified configuration.
+ */
+static xdchange_t *xdl_get_hunk(xdchange_t *xscr, xdemitconf_t const *xecfg) {
+	xdchange_t *xch, *xchp;
+
+	for (xchp = xscr, xch = xscr->next; xch; xchp = xch, xch = xch->next)
+		if (xch->i1 - (xchp->i1 + xchp->chg1) > 2 * xecfg->ctxlen)
+			break;
+
+	return xchp;
+}
+
+
+int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
+		  xdemitconf_t const *xecfg) {
+	long s1, s2, e1, e2, lctx;
+	xdchange_t *xch, *xche;
+
+	for (xch = xche = xscr; xch; xch = xche->next) {
+		xche = xdl_get_hunk(xch, xecfg);
+
+		s1 = XDL_MAX(xch->i1 - xecfg->ctxlen, 0);
+		s2 = XDL_MAX(xch->i2 - xecfg->ctxlen, 0);
+
+		lctx = xecfg->ctxlen;
+		lctx = XDL_MIN(lctx, xe->xdf1.nrec - (xche->i1 + xche->chg1));
+		lctx = XDL_MIN(lctx, xe->xdf2.nrec - (xche->i2 + xche->chg2));
+
+		e1 = xche->i1 + xche->chg1 + lctx;
+		e2 = xche->i2 + xche->chg2 + lctx;
+
+		/*
+		 * Emit current hunk header.
+		 */
+		if (xdl_emit_hunk_hdr(s1 + 1, e1 - s1, s2 + 1, e2 - s2, ecb) < 0)
+			return -1;
+
+		/*
+		 * Emit pre-context.
+		 */
+		for (; s1 < xch->i1; s1++)
+			if (xdl_emit_record(&xe->xdf1, s1, " ", ecb) < 0)
+				return -1;
+
+		for (s1 = xch->i1, s2 = xch->i2;; xch = xch->next) {
+			/*
+			 * Merge previous with current change atom.
+			 */
+			for (; s1 < xch->i1 && s2 < xch->i2; s1++, s2++)
+				if (xdl_emit_record(&xe->xdf1, s1, " ", ecb) < 0)
+					return -1;
+
+			/*
+			 * Removes lines from the first file.
+			 */
+			for (s1 = xch->i1; s1 < xch->i1 + xch->chg1; s1++)
+				if (xdl_emit_record(&xe->xdf1, s1, "-", ecb) < 0)
+					return -1;
+
+			/*
+			 * Adds lines from the second file.
+			 */
+			for (s2 = xch->i2; s2 < xch->i2 + xch->chg2; s2++)
+				if (xdl_emit_record(&xe->xdf2, s2, "+", ecb) < 0)
+					return -1;
+
+			if (xch == xche)
+				break;
+			s1 = xch->i1 + xch->chg1;
+			s2 = xch->i2 + xch->chg2;
+		}
+
+		/*
+		 * Emit post-context.
+		 */
+		for (s1 = xche->i1 + xche->chg1; s1 < e1; s1++)
+			if (xdl_emit_record(&xe->xdf1, s1, " ", ecb) < 0)
+				return -1;
+	}
+
+	return 0;
+}
+
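Note on xdl_get_hunk() above: two neighbouring changes end up in the same
hunk when the unchanged gap between them in the first file is at most
2 * ctxlen lines, since their context would otherwise touch or overlap. A
minimal sketch of that rule follows, assuming a cut-down change record;
struct change and last_in_hunk are hypothetical stand-ins for xdchange_t
and xdl_get_hunk, keeping only the fields the rule reads.

```c
#include <stddef.h>

/* Simplified stand-in for xdchange_t. */
struct change {
	struct change *next;
	long i1;	/* first changed line in the old file */
	long chg1;	/* number of changed lines in the old file */
};

/* Return the last change that belongs to the hunk starting at 'first'. */
static struct change *last_in_hunk(struct change *first, long ctxlen)
{
	struct change *prev = first, *cur;

	for (cur = first->next; cur; prev = cur, cur = cur->next)
		/* Gap between the end of 'prev' and the start of 'cur'. */
		if (cur->i1 - (prev->i1 + prev->chg1) > 2 * ctxlen)
			break;
	return prev;
}
```

For example, with ctxlen = 3, one-line changes at lines 10 and 15 share a
hunk (gap of 4), while a further change at line 40 starts a new one.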
diff --git a/xdiff/xemit.h b/xdiff/xemit.h
new file mode 100644
index 0000000..e629417
--- /dev/null
+++ b/xdiff/xemit.h
@@ -0,0 +1,34 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XEMIT_H)
+#define XEMIT_H
+
+
+
+int xdl_emit_diff(xdfenv_t *xe, xdchange_t *xscr, xdemitcb_t *ecb,
+		  xdemitconf_t const *xecfg);
+
+
+
+#endif /* #if !defined(XEMIT_H) */
+
diff --git a/xdiff/xinclude.h b/xdiff/xinclude.h
new file mode 100644
index 0000000..9490fc5
--- /dev/null
+++ b/xdiff/xinclude.h
@@ -0,0 +1,42 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XINCLUDE_H)
+#define XINCLUDE_H
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <limits.h>
+
+#include "xmacros.h"
+#include "xdiff.h"
+#include "xtypes.h"
+#include "xutils.h"
+#include "xprepare.h"
+#include "xdiffi.h"
+#include "xemit.h"
+
+
+#endif /* #if !defined(XINCLUDE_H) */
+
diff --git a/xdiff/xmacros.h b/xdiff/xmacros.h
new file mode 100644
index 0000000..4c2fde8
--- /dev/null
+++ b/xdiff/xmacros.h
@@ -0,0 +1,53 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XMACROS_H)
+#define XMACROS_H
+
+
+#define GR_PRIME 0x9e370001UL
+
+
+#define XDL_MIN(a, b) ((a) < (b) ? (a): (b))
+#define XDL_MAX(a, b) ((a) > (b) ? (a): (b))
+#define XDL_ABS(v) ((v) >= 0 ? (v): -(v))
+#define XDL_ISDIGIT(c) ((c) >= '0' && (c) <= '9')
+#define XDL_HASHLONG(v, b) (((unsigned long)(v) * GR_PRIME) >> ((CHAR_BIT * sizeof(unsigned long)) - (b)))
+#define XDL_PTRFREE(p) do { if (p) { xdl_free(p); (p) = NULL; } } while (0)
+#define XDL_LE32_PUT(p, v) \
+do { \
+	unsigned char *__p = (unsigned char *) (p); \
+	*__p++ = (unsigned char) (v); \
+	*__p++ = (unsigned char) ((v) >> 8); \
+	*__p++ = (unsigned char) ((v) >> 16); \
+	*__p = (unsigned char) ((v) >> 24); \
+} while (0)
+#define XDL_LE32_GET(p, v) \
+do { \
+	unsigned char const *__p = (unsigned char const *) (p); \
+	(v) = (unsigned long) __p[0] | ((unsigned long) __p[1]) << 8 | \
+		((unsigned long) __p[2]) << 16 | ((unsigned long) __p[3]) << 24; \
+} while (0)
+
+
+#endif /* #if !defined(XMACROS_H) */
+
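Note on the macros in xmacros.h above: XDL_HASHLONG is a multiplicative
hash that keeps the top b bits of the product, and XDL_LE32_PUT /
XDL_LE32_GET store and load a 32-bit value in little-endian byte order
regardless of host endianness. The sketch below restates them as plain
functions to show the behaviour; bucket_of, le32_put and le32_get are
illustrative names, not part of the patch.

```c
#include <limits.h>

#define GR_PRIME 0x9e370001UL	/* same constant as in the patch */

/*
 * Map a hash value to a bucket index in [0, 2^bits).
 * Assumes 1 <= bits < width of unsigned long; bits == 0 would shift by
 * the full width, which is undefined in C.
 */
static unsigned long bucket_of(unsigned long ha, unsigned int bits)
{
	return (ha * GR_PRIME) >> ((CHAR_BIT * sizeof(unsigned long)) - bits);
}

/* Store the low 32 bits of v, least significant byte first. */
static void le32_put(unsigned char *p, unsigned long v)
{
	p[0] = (unsigned char) v;
	p[1] = (unsigned char) (v >> 8);
	p[2] = (unsigned char) (v >> 16);
	p[3] = (unsigned char) (v >> 24);
}

/* Load a 32-bit value stored by le32_put(). */
static unsigned long le32_get(unsigned char const *p)
{
	return (unsigned long) p[0] | ((unsigned long) p[1]) << 8 |
		((unsigned long) p[2]) << 16 | ((unsigned long) p[3]) << 24;
}
```

Taking the high bits rather than the low bits of the product is what
makes the multiplicative hash spread nearby inputs across buckets.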
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
new file mode 100644
index 0000000..27a0879
--- /dev/null
+++ b/xdiff/xprepare.c
@@ -0,0 +1,436 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#include "xinclude.h"
+
+
+
+#define XDL_KPDIS_RUN 4
+
+
+
+typedef struct s_xdlclass {
+	struct s_xdlclass *next;
+	unsigned long ha;
+	char const *line;
+	long size;
+	long idx;
+} xdlclass_t;
+
+typedef struct s_xdlclassifier {
+	unsigned int hbits;
+	long hsize;
+	xdlclass_t **rchash;
+	chastore_t ncha;
+	long count;
+} xdlclassifier_t;
+
+
+
+
+static int xdl_init_classifier(xdlclassifier_t *cf, long size);
+static void xdl_free_classifier(xdlclassifier_t *cf);
+static int xdl_classify_record(xdlclassifier_t *cf, xrecord_t **rhash, unsigned int hbits,
+			       xrecord_t *rec);
+static int xdl_prepare_ctx(mmfile_t *mf, long narec, xpparam_t const *xpp,
+			   xdlclassifier_t *cf, xdfile_t *xdf);
+static void xdl_free_ctx(xdfile_t *xdf);
+static int xdl_clean_mmatch(char const *dis, long i, long s, long e);
+static int xdl_cleanup_records(xdfile_t *xdf1, xdfile_t *xdf2);
+static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2);
+static int xdl_optimize_ctxs(xdfile_t *xdf1, xdfile_t *xdf2);
+
+
+
+
+static int xdl_init_classifier(xdlclassifier_t *cf, long size) {
+	long i;
+
+	cf->hbits = xdl_hashbits((unsigned int) size);
+	cf->hsize = 1 << cf->hbits;
+
+	if (xdl_cha_init(&cf->ncha, sizeof(xdlclass_t), size / 4 + 1) < 0) {
+
+		return -1;
+	}
+	if (!(cf->rchash = (xdlclass_t **) xdl_malloc(cf->hsize * sizeof(xdlclass_t *)))) {
+
+		xdl_cha_free(&cf->ncha);
+		return -1;
+	}
+	for (i = 0; i < cf->hsize; i++)
+		cf->rchash[i] = NULL;
+
+	cf->count = 0;
+
+	return 0;
+}
+
+
+static void xdl_free_classifier(xdlclassifier_t *cf) {
+
+	xdl_free(cf->rchash);
+	xdl_cha_free(&cf->ncha);
+}
+
+
+static int xdl_classify_record(xdlclassifier_t *cf, xrecord_t **rhash, unsigned int hbits,
+			       xrecord_t *rec) {
+	long hi;
+	char const *line;
+	xdlclass_t *rcrec;
+
+	line = rec->ptr;
+	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
+	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
+		if (rcrec->ha == rec->ha && rcrec->size == rec->size &&
+		    !memcmp(line, rcrec->line, rec->size))
+			break;
+
+	if (!rcrec) {
+		if (!(rcrec = xdl_cha_alloc(&cf->ncha))) {
+
+			return -1;
+		}
+		rcrec->idx = cf->count++;
+		rcrec->line = line;
+		rcrec->size = rec->size;
+		rcrec->ha = rec->ha;
+		rcrec->next = cf->rchash[hi];
+		cf->rchash[hi] = rcrec;
+	}
+
+	rec->ha = (unsigned long) rcrec->idx;
+
+	hi = (long) XDL_HASHLONG(rec->ha, hbits);
+	rec->next = rhash[hi];
+	rhash[hi] = rec;
+
+	return 0;
+}
+
+
+static int xdl_prepare_ctx(mmfile_t *mf, long narec, xpparam_t const *xpp,
+			   xdlclassifier_t *cf, xdfile_t *xdf) {
+	unsigned int hbits;
+	long i, nrec, hsize, bsize;
+	unsigned long hav;
+	char const *blk, *cur, *top, *prev;
+	xrecord_t *crec;
+	xrecord_t **recs, **rrecs;
+	xrecord_t **rhash;
+	unsigned long *ha;
+	char *rchg;
+	long *rindex;
+
+	if (xdl_cha_init(&xdf->rcha, sizeof(xrecord_t), narec / 4 + 1) < 0) {
+
+		return -1;
+	}
+	if (!(recs = (xrecord_t **) xdl_malloc(narec * sizeof(xrecord_t *)))) {
+
+		xdl_cha_free(&xdf->rcha);
+		return -1;
+	}
+
+	hbits = xdl_hashbits((unsigned int) narec);
+	hsize = 1 << hbits;
+	if (!(rhash = (xrecord_t **) xdl_malloc(hsize * sizeof(xrecord_t *)))) {
+
+		xdl_free(recs);
+		xdl_cha_free(&xdf->rcha);
+		return -1;
+	}
+	for (i = 0; i < hsize; i++)
+		rhash[i] = NULL;
+
+	nrec = 0;
+	if ((cur = blk = xdl_mmfile_first(mf, &bsize)) != NULL) {
+		for (top = blk + bsize;;) {
+			if (cur >= top) {
+				if (!(cur = blk = xdl_mmfile_next(mf, &bsize)))
+					break;
+				top = blk + bsize;
+			}
+			prev = cur;
+			hav = xdl_hash_record(&cur, top);
+			if (nrec >= narec) {
+				narec *= 2;
+				if (!(rrecs = (xrecord_t **) xdl_realloc(recs, narec * sizeof(xrecord_t *)))) {
+
+					xdl_free(rhash);
+					xdl_free(recs);
+					xdl_cha_free(&xdf->rcha);
+					return -1;
+				}
+				recs = rrecs;
+			}
+			if (!(crec = xdl_cha_alloc(&xdf->rcha))) {
+
+				xdl_free(rhash);
+				xdl_free(recs);
+				xdl_cha_free(&xdf->rcha);
+				return -1;
+			}
+			crec->ptr = prev;
+			crec->size = (long) (cur - prev);
+			crec->ha = hav;
+			recs[nrec++] = crec;
+
+			if (xdl_classify_record(cf, rhash, hbits, crec) < 0) {
+
+				xdl_free(rhash);
+				xdl_free(recs);
+				xdl_cha_free(&xdf->rcha);
+				return -1;
+			}
+		}
+	}
+
+	if (!(rchg = (char *) xdl_malloc((nrec + 2) * sizeof(char)))) {
+
+		xdl_free(rhash);
+		xdl_free(recs);
+		xdl_cha_free(&xdf->rcha);
+		return -1;
+	}
+	memset(rchg, 0, (nrec + 2) * sizeof(char));
+
+	if (!(rindex = (long *) xdl_malloc((nrec + 1) * sizeof(long)))) {
+
+		xdl_free(rchg);
+		xdl_free(rhash);
+		xdl_free(recs);
+		xdl_cha_free(&xdf->rcha);
+		return -1;
+	}
+	if (!(ha = (unsigned long *) xdl_malloc((nrec + 1) * sizeof(unsigned long)))) {
+
+		xdl_free(rindex);
+		xdl_free(rchg);
+		xdl_free(rhash);
+		xdl_free(recs);
+		xdl_cha_free(&xdf->rcha);
+		return -1;
+	}
+
+	xdf->nrec = nrec;
+	xdf->recs = recs;
+	xdf->hbits = hbits;
+	xdf->rhash = rhash;
+	xdf->rchg = rchg + 1;
+	xdf->rindex = rindex;
+	xdf->nreff = 0;
+	xdf->ha = ha;
+	xdf->dstart = 0;
+	xdf->dend = nrec - 1;
+
+	return 0;
+}
+
+
+static void xdl_free_ctx(xdfile_t *xdf) {
+
+	xdl_free(xdf->rhash);
+	xdl_free(xdf->rindex);
+	xdl_free(xdf->rchg - 1);
+	xdl_free(xdf->ha);
+	xdl_free(xdf->recs);
+	xdl_cha_free(&xdf->rcha);
+}
+
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		    xdfenv_t *xe) {
+	long enl1, enl2;
+	xdlclassifier_t cf;
+
+	enl1 = xdl_guess_lines(mf1) + 1;
+	enl2 = xdl_guess_lines(mf2) + 1;
+
+	if (xdl_init_classifier(&cf, enl1 + enl2 + 1) < 0) {
+
+		return -1;
+	}
+
+	if (xdl_prepare_ctx(mf1, enl1, xpp, &cf, &xe->xdf1) < 0) {
+
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+	if (xdl_prepare_ctx(mf2, enl2, xpp, &cf, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf1);
+		xdl_free_classifier(&cf);
+		return -1;
+	}
+
+	xdl_free_classifier(&cf);
+
+	if (xdl_optimize_ctxs(&xe->xdf1, &xe->xdf2) < 0) {
+
+		xdl_free_ctx(&xe->xdf2);
+		xdl_free_ctx(&xe->xdf1);
+		return -1;
+	}
+
+	return 0;
+}
+
+
+void xdl_free_env(xdfenv_t *xe) {
+
+	xdl_free_ctx(&xe->xdf2);
+	xdl_free_ctx(&xe->xdf1);
+}
+
+
+static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
+	long r, rdis, rpdis;
+
+	for (r = 1, rdis = 0, rpdis = 1; (i - r) >= s; r++) {
+		if (!dis[i - r])
+			rdis++;
+		else if (dis[i - r] == 2)
+			rpdis++;
+		else
+			break;
+	}
+	for (r = 1; (i + r) <= e; r++) {
+		if (!dis[i + r])
+			rdis++;
+		else if (dis[i + r] == 2)
+			rpdis++;
+		else
+			break;
+	}
+
+	return rpdis * XDL_KPDIS_RUN < (rpdis + rdis);
+}
+
+
+/*
+ * Try to reduce the problem complexity, discard records that have no
+ * matches on the other file. Also, lines that have multiple matches
+ * might be potentially discarded if they appear in a run of discardable.
+ */
+static int xdl_cleanup_records(xdfile_t *xdf1, xdfile_t *xdf2) {
+	long i, rhi, nreff;
+	unsigned long hav;
+	xrecord_t **recs;
+	xrecord_t *rec;
+	char *dis, *dis1, *dis2;
+
+	if (!(dis = (char *) xdl_malloc((xdf1->nrec + xdf2->nrec + 2) * sizeof(char)))) {
+
+		return -1;
+	}
+	memset(dis, 0, (xdf1->nrec + xdf2->nrec + 2) * sizeof(char));
+	dis1 = dis;
+	dis2 = dis1 + xdf1->nrec + 1;
+
+	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
+		hav = (*recs)->ha;
+		rhi = (long) XDL_HASHLONG(hav, xdf2->hbits);
+		for (rec = xdf2->rhash[rhi]; rec; rec = rec->next)
+			if (rec->ha == hav && ++dis1[i] == 2)
+				break;
+	}
+
+	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
+		hav = (*recs)->ha;
+		rhi = (long) XDL_HASHLONG(hav, xdf1->hbits);
+		for (rec = xdf1->rhash[rhi]; rec; rec = rec->next)
+			if (rec->ha == hav && ++dis2[i] == 2)
+				break;
+	}
+
+	for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];
+	     i <= xdf1->dend; i++, recs++) {
+		if (dis1[i] == 1 ||
+		    (dis1[i] == 2 && !xdl_clean_mmatch(dis1, i, xdf1->dstart, xdf1->dend))) {
+			xdf1->rindex[nreff] = i;
+			xdf1->ha[nreff] = (*recs)->ha;
+			nreff++;
+		} else
+			xdf1->rchg[i] = 1;
+	}
+	xdf1->nreff = nreff;
+
+	for (nreff = 0, i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart];
+	     i <= xdf2->dend; i++, recs++) {
+		if (dis2[i] == 1 ||
+		    (dis2[i] == 2 && !xdl_clean_mmatch(dis2, i, xdf2->dstart, xdf2->dend))) {
+			xdf2->rindex[nreff] = i;
+			xdf2->ha[nreff] = (*recs)->ha;
+			nreff++;
+		} else
+			xdf2->rchg[i] = 1;
+	}
+	xdf2->nreff = nreff;
+
+	xdl_free(dis);
+
+	return 0;
+}
+
+
+/*
+ * Early trim initial and terminal matching records.
+ */
+static int xdl_trim_ends(xdfile_t *xdf1, xdfile_t *xdf2) {
+	long i, lim;
+	xrecord_t **recs1, **recs2;
+
+	recs1 = xdf1->recs;
+	recs2 = xdf2->recs;
+	for (i = 0, lim = XDL_MIN(xdf1->nrec, xdf2->nrec); i < lim;
+	     i++, recs1++, recs2++)
+		if ((*recs1)->ha != (*recs2)->ha)
+			break;
+
+	xdf1->dstart = xdf2->dstart = i;
+
+	recs1 = xdf1->recs + xdf1->nrec - 1;
+	recs2 = xdf2->recs + xdf2->nrec - 1;
+	for (lim -= i, i = 0; i < lim; i++, recs1--, recs2--)
+		if ((*recs1)->ha != (*recs2)->ha)
+			break;
+
+	xdf1->dend = xdf1->nrec - i - 1;
+	xdf2->dend = xdf2->nrec - i - 1;
+
+	return 0;
+}
+
+
+static int xdl_optimize_ctxs(xdfile_t *xdf1, xdfile_t *xdf2) {
+
+	if (xdl_trim_ends(xdf1, xdf2) < 0 ||
+	    xdl_cleanup_records(xdf1, xdf2) < 0) {
+
+		return -1;
+	}
+
+	return 0;
+}
+
diff --git a/xdiff/xprepare.h b/xdiff/xprepare.h
new file mode 100644
index 0000000..344c569
--- /dev/null
+++ b/xdiff/xprepare.h
@@ -0,0 +1,35 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XPREPARE_H)
+#define XPREPARE_H
+
+
+
+int xdl_prepare_env(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp,
+		    xdfenv_t *xe);
+void xdl_free_env(xdfenv_t *xe);
+
+
+
+#endif /* #if !defined(XPREPARE_H) */
+
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
new file mode 100644
index 0000000..3593a66
--- /dev/null
+++ b/xdiff/xtypes.h
@@ -0,0 +1,68 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XTYPES_H)
+#define XTYPES_H
+
+
+
+typedef struct s_chanode {
+	struct s_chanode *next;
+	long icurr;
+} chanode_t;
+
+typedef struct s_chastore {
+	chanode_t *head, *tail;
+	long isize, nsize;
+	chanode_t *ancur;
+	chanode_t *sncur;
+	long scurr;
+} chastore_t;
+
+typedef struct s_xrecord {
+	struct s_xrecord *next;
+	char const *ptr;
+	long size;
+	unsigned long ha;
+} xrecord_t;
+
+typedef struct s_xdfile {
+	chastore_t rcha;
+	long nrec;
+	unsigned int hbits;
+	xrecord_t **rhash;
+	long dstart, dend;
+	xrecord_t **recs;
+	char *rchg;
+	long *rindex;
+	long nreff;
+	unsigned long *ha;
+} xdfile_t;
+
+typedef struct s_xdfenv {
+	xdfile_t xdf1, xdf2;
+} xdfenv_t;
+
+
+
+#endif /* #if !defined(XTYPES_H) */
+
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
new file mode 100644
index 0000000..01e6765
--- /dev/null
+++ b/xdiff/xutils.c
@@ -0,0 +1,265 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003	Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#include "xinclude.h"
+
+
+
+#define XDL_GUESS_NLINES 256
+
+
+
+
+int xdl_emit_diffrec(char const *rec, long size, char const *pre, long psize,
+		     xdemitcb_t *ecb) {
+	mmbuffer_t mb[2];
+
+	mb[0].ptr = (char *) pre;
+	mb[0].size = psize;
+	mb[1].ptr = (char *) rec;
+	mb[1].size = size;
+
+	if (ecb->outf(ecb->priv, mb, 2) < 0) {
+
+		return -1;
+	}
+
+	return 0;
+}
+
+void *xdl_mmfile_first(mmfile_t *mmf, long *size)
+{
+	*size = mmf->size;
+	return mmf->ptr;
+}
+
+
+void *xdl_mmfile_next(mmfile_t *mmf, long *size)
+{
+	return NULL;
+}
+
+
+long xdl_mmfile_size(mmfile_t *mmf)
+{
+	return mmf->size;
+}
+
+
+int xdl_cha_init(chastore_t *cha, long isize, long icount) {
+
+	cha->head = cha->tail = NULL;
+	cha->isize = isize;
+	cha->nsize = icount * isize;
+	cha->ancur = cha->sncur = NULL;
+	cha->scurr = 0;
+
+	return 0;
+}
+
+
+void xdl_cha_free(chastore_t *cha) {
+	chanode_t *cur, *tmp;
+
+	for (cur = cha->head; (tmp = cur) != NULL;) {
+		cur = cur->next;
+		xdl_free(tmp);
+	}
+}
+
+
+void *xdl_cha_alloc(chastore_t *cha) {
+	chanode_t *ancur;
+	void *data;
+
+	if (!(ancur = cha->ancur) || ancur->icurr == cha->nsize) {
+		if (!(ancur = (chanode_t *) xdl_malloc(sizeof(chanode_t) + cha->nsize))) {
+
+			return NULL;
+		}
+		ancur->icurr = 0;
+		ancur->next = NULL;
+		if (cha->tail)
+			cha->tail->next = ancur;
+		if (!cha->head)
+			cha->head = ancur;
+		cha->tail = ancur;
+		cha->ancur = ancur;
+	}
+
+	data = (char *) ancur + sizeof(chanode_t) + ancur->icurr;
+	ancur->icurr += cha->isize;
+
+	return data;
+}
+
+
+void *xdl_cha_first(chastore_t *cha) {
+	chanode_t *sncur;
+
+	if (!(cha->sncur = sncur = cha->head))
+		return NULL;
+
+	cha->scurr = 0;
+
+	return (char *) sncur + sizeof(chanode_t) + cha->scurr;
+}
+
+
+void *xdl_cha_next(chastore_t *cha) {
+	chanode_t *sncur;
+
+	if (!(sncur = cha->sncur))
+		return NULL;
+	cha->scurr += cha->isize;
+	if (cha->scurr == sncur->icurr) {
+		if (!(sncur = cha->sncur = sncur->next))
+			return NULL;
+		cha->scurr = 0;
+	}
+
+	return (char *) sncur + sizeof(chanode_t) + cha->scurr;
+}
+
+
+long xdl_guess_lines(mmfile_t *mf) {
+	long nl = 0, size, tsize = 0;
+	char const *data, *cur, *top;
+
+	if ((cur = data = xdl_mmfile_first(mf, &size)) != NULL) {
+		for (top = data + size; nl < XDL_GUESS_NLINES;) {
+			if (cur >= top) {
+				tsize += (long) (cur - data);
+				if (!(cur = data = xdl_mmfile_next(mf, &size)))
+					break;
+				top = data + size;
+			}
+			nl++;
+			if (!(cur = memchr(cur, '\n', top - cur)))
+				cur = top;
+			else
+				cur++;
+		}
+		tsize += (long) (cur - data);
+	}
+
+	if (nl && tsize)
+		nl = xdl_mmfile_size(mf) / (tsize / nl);
+
+	return nl + 1;
+}
+
+
+unsigned long xdl_hash_record(char const **data, char const *top) {
+	unsigned long ha = 5381;
+	char const *ptr = *data;
+
+	for (; ptr < top && *ptr != '\n'; ptr++) {
+		ha += (ha << 5);
+		ha ^= (unsigned long) *ptr;
+	}
+	*data = ptr < top ? ptr + 1: ptr;
+
+	return ha;
+}
+
+
+unsigned int xdl_hashbits(unsigned int size) {
+	unsigned int val = 1, bits = 0;
+
+	for (; val < size && bits < CHAR_BIT * sizeof(unsigned int); val <<= 1, bits++);
+	return bits ? bits: 1;
+}
+
+
+int xdl_num_out(char *out, long val) {
+	char *ptr, *str = out;
+	char buf[32];
+
+	ptr = buf + sizeof(buf) - 1;
+	*ptr = '\0';
+	if (val < 0) {
+		*--ptr = '-';
+		val = -val;
+	}
+	for (; val && ptr > buf; val /= 10)
+		*--ptr = "0123456789"[val % 10];
+	if (*ptr)
+		for (; *ptr; ptr++, str++)
+			*str = *ptr;
+	else
+		*str++ = '0';
+	*str = '\0';
+
+	return str - out;
+}
+
+
+long xdl_atol(char const *str, char const **next) {
+	long val, base;
+	char const *top;
+
+	for (top = str; XDL_ISDIGIT(*top); top++);
+	if (next)
+		*next = top;
+	for (val = 0, base = 1, top--; top >= str; top--, base *= 10)
+		val += base * (long)(*top - '0');
+	return val;
+}
+
+
+int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2, xdemitcb_t *ecb) {
+	int nb = 0;
+	mmbuffer_t mb;
+	char buf[128];
+
+	memcpy(buf, "@@ -", 4);
+	nb += 4;
+
+	nb += xdl_num_out(buf + nb, c1 ? s1: 0);
+
+	memcpy(buf + nb, ",", 1);
+	nb += 1;
+
+	nb += xdl_num_out(buf + nb, c1);
+
+	memcpy(buf + nb, " +", 2);
+	nb += 2;
+
+	nb += xdl_num_out(buf + nb, c2 ? s2: 0);
+
+	memcpy(buf + nb, ",", 1);
+	nb += 1;
+
+	nb += xdl_num_out(buf + nb, c2);
+
+	memcpy(buf + nb, " @@\n", 4);
+	nb += 4;
+
+	mb.ptr = buf;
+	mb.size = nb;
+	if (ecb->outf(ecb->priv, &mb, 1) < 0)
+		return -1;
+
+	return 0;
+}
+
diff --git a/xdiff/xutils.h b/xdiff/xutils.h
new file mode 100644
index 0000000..428a4bb
--- /dev/null
+++ b/xdiff/xutils.h
@@ -0,0 +1,44 @@
+/*
+ *  LibXDiff by Davide Libenzi ( File Differential Library )
+ *  Copyright (C) 2003  Davide Libenzi
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2.1 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ *  Davide Libenzi <davidel@xmailserver.org>
+ *
+ */
+
+#if !defined(XUTILS_H)
+#define XUTILS_H
+
+
+int xdl_emit_diffrec(char const *rec, long size, char const *pre, long psize,
+		     xdemitcb_t *ecb);
+int xdl_cha_init(chastore_t *cha, long isize, long icount);
+void xdl_cha_free(chastore_t *cha);
+void *xdl_cha_alloc(chastore_t *cha);
+void *xdl_cha_first(chastore_t *cha);
+void *xdl_cha_next(chastore_t *cha);
+long xdl_guess_lines(mmfile_t *mf);
+unsigned long xdl_hash_record(char const **data, char const *top);
+unsigned int xdl_hashbits(unsigned int size);
+int xdl_num_out(char *out, long val);
+long xdl_atol(char const *str, char const **next);
+int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2, xdemitcb_t *ecb);
+
+
+
+#endif /* #if !defined(XUTILS_H) */
+

* Effective difference between git-rebase and git-resolve
From: Marc Singer @ 2006-03-25  3:54 UTC (permalink / raw)
  To: git

The process I've been using to keep my patches current with the latest
development is this:

  git checkout linus && git pull linus
  git checkout work

When I'm ready to merge,

  git resolve work linus "Update with head"
  git tag basis

This lets me diff against basis even when the linus branch continues
to follow the latest developments.

Today, I wanted to move everything forward.  But the resolve failed to
merge some files.  In fact, one file was apparently so thorny that
resolve just gave up and left no working file.  Bothersome, but I
recovered by moving back to the previous work point.

Then, I found git-rebase which seems to be more what I'd like to use
since it moves my patches along on top of the main development line.

  git rebase linus

This time, almost everything merged without a hitch except for the
thorny file from before.  I edited the file, removing the conflict
markers, and started a build.  But what I found was that some of the
changes I'd made were no longer present.  Several files showed no sign
of the patches even though the kernel versions hadn't changed.

So, I have a couple of questions:

  1) Am I using rebase correctly?
  2) If not, did it leave some of my changes uncommitted and hidden
     somewhere? git-ls-files --unmerged shows no sign of them.
  3) Do I have to pull all of my patches off, apply them to the head
     of the tree, and only use git-rebase to make this work?
  4) Should I prefer rebase over resolve?

* git push refspec URL weirdness
From: Jon Loeliger @ 2006-03-25  3:42 UTC (permalink / raw)
  To: git

I wanted to git push some bits to a remote repo
and set up this .git/remotes/publish file:

    URL: git+ssh://www.example.com/some/path/repo.git
    Push: my-branch:public-branch

So that I could "git push publish".

The ssh on the far side is listening on port 1234
and not the default 22.  So I slapped this into my
~/.ssh/config file on the local machine:

    Host www.example.com
    Port 1234

This worked great for a straight "ssh www.example.com"
connection.  However, git still complained that port 22
was refusing connections.  It was.  Git shouldn't have
been trying to use it.

So Junio suggested taking advantage of the fact that the
default refspec uses git+ssh and use this instead:

    URL: www.example.com:/pub/software/linux-2.6-86xx.git
    Push: my-branch:public-branch

Which just worked.

So this is either a bug report or google food. :-)

jdl

* [PATCH] Removed bogus "<snap>" identifier.
From: Jon Loeliger @ 2006-03-25  3:27 UTC (permalink / raw)
  To: git


Signed-off-by: Jon Loeliger <jdl@jdl.com>

---

 Documentation/git.txt |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

c610f57ccfb52441719c5602894139acdd1271ee
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 0c424ff..fe34f50 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -521,10 +521,6 @@ HEAD::
 	a valid head 'name'
 	(i.e. the contents of `$GIT_DIR/refs/heads/<head>`).
 
-<snap>::
-	a valid snapshot 'name'
-	(i.e. the contents of `$GIT_DIR/refs/snap/<snap>`).
-
 
 File/Directory Structure
 ------------------------
-- 
1.2.4.gdd7be

* [PATCH] Clarify and expand some hook documentation.
From: Jon Loeliger @ 2006-03-25  3:21 UTC (permalink / raw)
  To: git


Clarify update and post-update hooks.
Made a few references to the hooks documentation.

Signed-off-by: Jon Loeliger <jdl@jdl.com>

---

 Documentation/git.txt               |    2 +
 Documentation/hooks.txt             |   49 ++++++++++++++++++++++++++---------
 Documentation/repository-layout.txt |    2 +
 3 files changed, 41 insertions(+), 12 deletions(-)

83472863b77cde6209ce01211500e2bd9b81ecc7
diff --git a/Documentation/git.txt b/Documentation/git.txt
index de3934d..0c424ff 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -531,6 +531,8 @@ File/Directory Structure
 
 Please see link:repository-layout.html[repository layout] document.
 
+Read link:hooks.html[hooks] for more details about each hook.
+
 Higher level SCMs may provide and manage additional information in the
 `$GIT_DIR`.
 
diff --git a/Documentation/hooks.txt b/Documentation/hooks.txt
index 4ad1920..3824a95 100644
--- a/Documentation/hooks.txt
+++ b/Documentation/hooks.txt
@@ -97,16 +97,31 @@ send out a commit notification e-mail.
 update
 ------
 
-This hook is invoked by `git-receive-pack`, which is invoked
-when a `git push` is done against the repository.  It takes
-three parameters, name of the ref being updated, old object name
-stored in the ref, and the new objectname to be stored in the
-ref.  Exiting with non-zero status from this hook prevents
-`git-receive-pack` from updating the ref.
+This hook is invoked by `git-receive-pack` on the remote repository,
+which happens when a `git push` is done on a local repository.
+Just before updating the ref on the remote repository, the update hook
+is invoked.  Its exit status determines the success or failure of
+the ref update.
+
+The hook executes once for each ref to be updated, and takes
+three parameters:
+    - the name of the ref being updated,
+    - the old object name stored in the ref,
+    - and the new objectname to be stored in the ref.
+
+A zero exit from the update hook allows the ref to be updated.
+Exiting with a non-zero status prevents `git-receive-pack`
+from updating the ref.
 
-This can be used to prevent 'forced' update on certain refs by
+This hook can be used to prevent 'forced' update on certain refs by
 making sure that the object name is a commit object that is a
 descendant of the commit object named by the old object name.
+That is, to enforce a "fast forward only" policy.
+
+It could also be used to log the old..new status.  However, it
+does not know the entire set of branches, so it would end up
+firing one e-mail per ref when used naively.
+
 Another use suggested on the mailing list is to use this hook to
 implement access control which is finer grained than the one
 based on filesystem group.
@@ -115,20 +130,30 @@ The standard output of this hook is sent
 want to report something to the git-send-pack on the other end,
 you can redirect your output to your stderr.
 
+
 post-update
 -----------
 
-This hook is invoked by `git-receive-pack`, which is invoked
-when a `git push` is done against the repository.  It takes
-variable number of parameters; each of which is the name of ref
-that was actually updated.
+This hook is invoked by `git-receive-pack` on the remote repository,
+which happens when a `git push` is done on a local repository.
+It executes on the remote repository once after all the refs have
+been updated.
+
+It takes a variable number of parameters, each of which is the
+name of a ref that was actually updated.
 
 This hook is meant primarily for notification, and cannot affect
 the outcome of `git-receive-pack`.
 
+The post-update hook can tell which heads were pushed,
+but it does not know what their original and updated values are,
+so it is a poor place to log old..new.
+
 The default post-update hook, when enabled, runs
 `git-update-server-info` to keep the information used by dumb
-transport up-to-date.
+transports (e.g., http) up-to-date.  If you are publishing
+a git repository that is accessible via http, you should
+probably enable this hook.
 
 The standard output of this hook is sent to /dev/null; if you
 want to report something to the git-send-pack on the other end,
diff --git a/Documentation/repository-layout.txt b/Documentation/repository-layout.txt
index 1f19bf8..98fbe7d 100644
--- a/Documentation/repository-layout.txt
+++ b/Documentation/repository-layout.txt
@@ -89,6 +89,8 @@ hooks::
 	commands.  A handful of sample hooks are installed when
 	`git init-db` is run, but all of them are disabled by
 	default.  To enable, they need to be made executable.
+	Read link:hooks.html[hooks] for more details about
+	each hook.
 
 index::
 	The current index file for the repository.  It is
-- 
1.2.4.gdd7be
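
The "fast forward only" policy mentioned in the hooks.txt hunk above can be
sketched as an update hook. This is an illustration only and not part of the
patch. It assumes the three parameters described above, relies on
`git merge-base`, and assumes that a ref being created is reported with an
all-zero old object name. The helper name `is_fast_forward` is hypothetical.

```shell
#!/bin/sh
# update hook sketch: refuse updates that are not fast forwards.
# Arguments: <refname> <old-object-name> <new-object-name>

is_fast_forward() {
	old=$1 new=$2
	# Assumption: an all-zero old name means the ref is being created,
	# so there is nothing that could be rewound.
	case "$old" in
	*[!0]*) ;;
	*) return 0 ;;
	esac
	# A fast forward means old is an ancestor of new, in which case
	# the merge base of the two is old itself.
	base=$(git merge-base "$old" "$new" 2>/dev/null) || return 1
	[ "$base" = "$old" ]
}

if [ $# -eq 3 ]; then
	if ! is_fast_forward "$2" "$3"; then
		# stdout of the hook is discarded, so report on stderr.
		echo "refusing non-fast-forward update of $1" >&2
		exit 1
	fi
fi
```

Installed as an executable `hooks/update` in the remote repository, a
non-zero exit here would keep `git-receive-pack` from updating that ref.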

* Re: Fix branch ancestry calculation
From: Chris Shoemaker @ 2006-03-25  1:45 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linus Torvalds, David Mansfield, David Mansfield,
	Git Mailing List
In-Reply-To: <1143218338.6850.68.camel@neko.keithp.com>

On Fri, Mar 24, 2006 at 08:38:58AM -0800, Keith Packard wrote:
> On Fri, 2006-03-24 at 07:46 -0800, Linus Torvalds wrote:
> > 
> > On Fri, 24 Mar 2006, David Mansfield wrote:
> > > 
> > > Anyway, I'd like to nail down some of the other nagging ancestry/branch point
> > > problems if possible.
> > 
> > What I considered doing was to just ignore the branch ancestry that cvsps 
> > gives us, and instead use whatever branch that is closest (ie generates 
> > the minimal diff). That's really wrong too (the data just _has_ to be in 
> > CVS somehow), but I just don't know how CVS handles branches, and it's how 
> > we'd have to do merges if we were to ever support them (since afaik, the 
> > merge-back information simply doesn't exist in CVS).
> 
> cvsps is more of a problem than cvs itself. Per-file branch information
> is readily available in the ,v files; each version has a list of
> branches from that version, and there are even tags marking the names of
> them. One issue that I've discovered is when files have differing branch
> structure in the same repository. That happens when a branch is created
> while files are checked out on different branches.  I'm not quite sure
> what to do in this case; I've been trying several approaches and none
> seem optimal. One remaining plan is to just attach such branches by
> date, but that assumes that the first commit along a branch occurs
> shortly after the branch is created (which isn't required).
> 
> Of course, this branch information is only created when a change is made
> to the file along said branch, so most of the repository will lack
> precise branch information for each branch. When you create a child
> branch, the files with no commits in the parent branch will never get
> branch information, so the child branch will be numbered as if it were a
> branch off of the grandparent. Globally, it is possible to reconstruct
> the entire branch structure.

If that last sentence was a typo then you already know this, but
otherwise you may be disappointed to learn that it's not _always_
possible to discern the correct ancestry tree.

The simplest counter-example is two branches where each adds one file
and no files in common are modified.  If A and B both branched off of
HEAD and each adds one file, then they should each only have one file.
But if B branched from A which branched from HEAD, then B should also
have the file that was added to A. (*)  However, the information to
distinguish these two cases isn't recorded in CVS.  

I seem to have described this example more fully in the notes I took
while writing the patch to cvsps that does the global inference
you're describing.  You _usually_ can make a very good guess, and the
more files that are modified, the better you can do.

BTW, those notes are still available here:
http://www.codesifter.com/cvsps-notes.txt 

If you end up comparing the ancestry tree discovered by your tool and
the tree output by a patched cvsps, I would be very interested in the
results.

-chris

(*) You can distinguish between A->B->head and B->A->head simply by
date.

* Re: Bug encountered while comitting
From: Matthias Kestenholz @ 2006-03-25  1:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vacbfzc3u.fsf@assigned-by-dhcp.cox.net>

* Junio C Hamano (junkio@cox.net) wrote:
> Matthias Kestenholz <lists@irregular.ch> writes:
> 
> > The PHP script created directories under .git/objects which were
> > only writable by www-data. There were other directories which were
> > owned by user mk and group www-data, and they were group writable.
> >
> > So, I had write access to only a part of the .git directory.
> 
> core.sharedrepository perhaps?
> 
> While it probably is not a good idea to have you in www-data, it
> appears that is essentially what you will end up doing, because PHP
> scripts that may _create_ new directories had better not have the
> privilege to give newly created directories away to you (busting
> your quota), so they will be owned by www-data.www-data and for
> you to be able to write into them you either need to be the www-data
> user or in the www-data group, with core.sharedrepository set.
> 

Thanks for your answer; I did not know about this option (I should
probably re-read all the docs).

Anyway, I think git should never corrupt a repository, even if it
does not have the write permissions it needs.

The following commands were sufficient to create a corrupt
repository with git (v1.2.4-1, debian package):

$ git-init-db
$ echo test > file
$ git-update-index --add file ; git commit -m 'message'

repeat (e.g. 10 times):
$ echo test >> file
$ git-update-index file ; git commit -m 'message'

$ sudo chown root .git/objects/*

repeat the modification and commit commands until you get a message
similar to the following:

unable to write sha1 filename .git/objects/90/b33..: Permission denied
fatal: 90b33... is not a valid 'tree' object
unable to write sha1 filename .git/objects/ba/fe4..: Permission denied
error: file: failed to insert into database
fatal: Unable to process file file
etc...

The result of all this is: refs/heads/master might now point to a
non-existent commit object. Every git command now errors out with:

fatal: bad tree object HEAD

and git-log shows no output (probably since it does not find a
commit to begin with)

git-commit should abort as soon as it encounters an error and not
update HEAD.
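
To make the expected behaviour concrete, here is a minimal sketch of a commit
built from plumbing commands where HEAD is only moved once every earlier step
has succeeded. It is not how git-commit is implemented. It assumes HEAD
already points at a commit, and `safe_commit` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: create a commit from the current index and update HEAD
# only if the tree and commit objects were both written successfully.
# Assumes HEAD already points at an existing commit.
safe_commit() {
	msg=$1
	# Stop right away if the tree object cannot be written.
	tree=$(git write-tree) || return 1
	# Stop if the commit object cannot be written.
	commit=$(printf '%s\n' "$msg" | git commit-tree "$tree" -p HEAD) || return 1
	# Reached only when both objects exist in the object database.
	git update-ref HEAD "$commit"
}
```

Called as `safe_commit 'message' || echo "commit failed, HEAD untouched" >&2`,
a permission error while writing objects would leave refs/heads/master
pointing at the previous, still valid commit.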

Thanks,
Matthias

(Note: To find the last valid commit object, I could just scan the
objects directory for the recently modified files and write the sha1
value to refs/heads/master, so I had no data loss.)

-- 
:wq

* Re: Bug encountered while comitting
From: Matthias Kestenholz @ 2006-03-25  1:15 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git
In-Reply-To: <442449E1.5060007@op5.se>

* Andreas Ericsson (ae@op5.se) wrote:
> Matthias Kestenholz wrote:
> >
> >The PHP script created directories under .git/objects
> 
> 
> Ouch... You're not really supposed to do that. The proper thing to do is 
> to do things in the working tree and commit them to git later.
> 

I think I did not express myself clearly. The PHP Script executes
git commands which in turn create or modify the mentioned
directories. I do not create them myself by hand.

> >When I tried to commit, I got a message saying "Unable to write sha1
> >filename".
> >
> 
> What file were you trying to write?
> 

Some file I was updating (in this case, a file holding some wiki
content)

> >The result was, that only part of the commit was recorded and that I
> >experienced repository corruption. refs/heads/master pointed to a
> >non-existent object.
> >
> 
> Did you use git tools to update .git/refs/heads/master ?
> 

Yes.

> 
> >The expected behavior would have been an error message telling me I
> >had insufficient write privileges and surely no repository
> >corruption.
> >
> 
> Didn't you get the strerror(3) message from that? If so, I'd consider it 
> a bug.

As I (only now) wrote in the other email, I got the
"Permission denied" message, and that was a sufficient hint as to what
was wrong.

-- 
:wq

* Re: Errors GITtifying GCC and Binutils
From: Chris Shoemaker @ 2006-03-25  0:37 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: Linus Torvalds, git
In-Reply-To: <20060324075229.GH31387@lug-owl.de>

On Fri, Mar 24, 2006 at 08:52:29AM +0100, Jan-Benedict Glaw wrote:
> On Thu, 2006-03-23 19:39:44 -0500, Chris Shoemaker <c.shoemaker@cox.net> wrote:
> > On Thu, Mar 23, 2006 at 09:03:06PM +0100, Jan-Benedict Glaw wrote:
> > > On Wed, 2006-03-22 17:28:23 -0800, Linus Torvalds <torvalds@osdl.org> wrote:
> > > It seems there's a patch like
> > > http://www.gelato.unsw.edu.au/archives/git/0602/16278.html is missing?
> > > ...or we need a better cvsps.  Shall I add it and try again / try to
> > > continue, or give up on it for now?  Though it would be nice to have
> > > these two large and important source trees under GIT control :-)
> > 
> > You may want to try the cvsps patch I attached to the email here:
> > 
> > http://www.gelato.unsw.edu.au/archives/git/0511/11812.html
> 
> [...]

> invalid initial_branch for file bfd/po/BLD-POTFILES.in, probably
> from old cache, run with -x.

I guess that error message wasn't quite as obvious as I intended.

That means you have old cvsps cache state hanging around.  You can
either run cvsps with -x or delete the cache file manually.  Those are
the files in ~/.cvsps.

Incidentally, I'd recommend doing this in two stages during
trouble-shooting.  Run cvsps first and verify that you can produce a
valid ancestry tree.  If it's not-quite-right you can even edit the
cvsps output to reparent the incorrect branches.  Then run
git-cvsimport after you're satisfied with the ancestry.

-chris

* Re: Bug encountered while comitting
From: Junio C Hamano @ 2006-03-24 22:55 UTC (permalink / raw)
  To: Matthias Kestenholz; +Cc: git
In-Reply-To: <20060324183951.GA23193@spinlock.ch>

Matthias Kestenholz <lists@irregular.ch> writes:

> The PHP script created directories under .git/objects which were
> only writable by www-data. There were other directories which were
> owned by user mk and group www-data, and they were group writable.
>
> So, I had write access to only a part of the .git directory.

core.sharedrepository perhaps?

While it probably is not a good idea to have you in www-data, it
appears that is essentially what you will end up doing, because PHP
scripts that may _create_ new directories had better not have the
privilege to give newly created directories away to you (busting
your quota), so they will be owned by www-data.www-data and for
you to be able to write into them you either need to be the www-data
user or in the www-data group, with core.sharedrepository set.

* Question about git-ls-files
From: Radoslaw Szkodzinski @ 2006-03-24 22:49 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

git-ls-files is a very useful command to list various types of files.

However, it has some weird behaviour.

Let's say someone removed a file and has not updated the index yet.
I want to get status for all files in the directory, so I launch git-ls-files 
-t -m -d -o, and I get:

R removed-file
C removed-file
? something-else

(it looks even better if I call it w/o -t)

If the file is removed, then marking it as changed is redundant.
A removed file cannot be unchanged.

This behaviour makes for slow parsing, because to get the changed files which 
still exist one has to at least skip items (or uniquify the list). This 
shouldn't be needed.

Removed files should of course still be listed as changed if there's no -d 
parameter.
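
For now the duplicates can be filtered on the consumer side. A minimal sketch,
assuming the -t output format shown above (a one-character tag, a space, then
the path) and paths without embedded newlines; `filter_removed` is a
hypothetical helper name:

```shell
#!/bin/sh
# Drop "C <path>" lines for paths that are also listed as "R <path>".
# Reads `git-ls-files -t -m -d -o` style output on stdin.
filter_removed() {
	awk '
		{ tag = substr($0, 1, 1); path = substr($0, 3) }
		tag == "R" { removed[path] = 1 }
		{ tags[NR] = tag; paths[NR] = path }
		END {
			# Print everything except changed entries of removed paths.
			for (i = 1; i <= NR; i++)
				if (!(tags[i] == "C" && (paths[i] in removed)))
					print tags[i] " " paths[i]
		}
	'
}

# Example with the listing from above:
printf 'R removed-file\nC removed-file\n? something-else\n' | filter_removed
```

This buffers the whole listing so that it works regardless of whether the R
or the C line comes first, which is exactly the extra pass I would like to
avoid.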

-- 
GPG Key id:  0xD1F10BA2
Fingerprint: 96E2 304A B9C4 949A 10A0  9105 9543 0453 D1F1 0BA2

AstralStorm

[-- Attachment #2: Type: application/pgp-signature, Size: 191 bytes --]

* Re: History rewriting swiss army knife
From: Junio C Hamano @ 2006-03-24 22:47 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060324140831.GY18185@pasky.or.cz>

Petr Baudis <pasky@suse.cz> writes:

>   It's never been so easy before - I've written cg-admin-rewritehist,
> which will execute your filters for each commit (which can rewrite the
> tree contents, just the tree itself through the index, committer/author
> information and commit message) while the script will obviously preserve
> all the other information like merges, author/committer information etc.

Hmph.  The above description sounds like you are not allowing
the user's custom script to drop an existing parent (or graft a new
one) while rewriting.  I have not looked at how you are
interfacing with the user's custom script, but I sort-of expected
you to throw a commit at it from older to newer (i.e. topo-order
in reverse), along with the names of the already re-written commit
objects that are parents of that commit, and have it build a
rewritten commit and report its object name back to you.
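
A rough sketch of that interface, as a toy model only: commits are plain
tuples, the callback may drop or graft parents, and the "object name" is a
SHA-1 over the parents and message, which is a stand-in and not how git
computes commit names. None of this reflects how cg-admin-rewritehist
actually works.

```python
import hashlib


def rewrite_history(commits, rewrite):
    """Rewrite commits oldest-first and return {old_name: new_name}.

    commits: list of (name, parents, message), parents before children.
    rewrite: callback taking (name, new_parents, message), where
             new_parents are the already rewritten parent names, and
             returning (parents, message) for the rewritten commit.
    """
    mapping = {}
    for name, parents, message in commits:
        new_parents = [mapping[p] for p in parents]
        new_parents, new_message = rewrite(name, new_parents, message)
        # Toy object name: hash of parents and message (assumption).
        payload = "\n".join(list(new_parents) + [new_message])
        mapping[name] = hashlib.sha1(payload.encode()).hexdigest()
    return mapping
```

Because every commit sees only rewritten parent names, a callback that drops
a parent from one commit changes that commit and its descendants, while
leaving its ancestors untouched.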

But it sounds like a useful tool in certain situations -- I
sounded mildly negative last night, but after you gave an
example of cleaning up a half-botched import, I changed my mind.

^ permalink raw reply

* Re: Bug encountered while comitting
From: Andreas Ericsson @ 2006-03-24 19:34 UTC (permalink / raw)
  To: Matthias Kestenholz; +Cc: git
In-Reply-To: <20060324183951.GA23193@spinlock.ch>

Matthias Kestenholz wrote:
> Hello list,
> 
> I don't know if this is the right place to report a bug, but I'll
> just try and see what comes back.
> 
> I am trying to build a Wiki [1] using PHP, a hacked version of Markdown,
> and git for content tracking. I use the git core plumbing to do the
> history work.
> 
> The PHP script created directories under .git/objects


Ouch... You're not really supposed to do that. The proper thing to do is 
to do things in the working tree and commit them to git later.


> which were
> only writable by www-data. There were other directories which were
> owned by user mk and group www-data, and they were group writable.
> 
> So, I had write access to only a part of the .git directory.
> 

Unless you're using the git tools (or things hooking into the git core 
C functions somehow), don't touch the .git directory.

(this merits an exclamation mark, so brace yourselves) !


> When I tried to commit, I got a message saying "Unable to write sha1
> filename".
> 

What file were you trying to write?

> The result was, that only part of the commit was recorded and that I
> experienced repository corruption. refs/heads/master pointed to a
> non-existant object.
> 

Did you use git tools to update .git/refs/heads/master ?


> The expected behavior would have been an error message telling me I
> had insufficient write privileges and surely no repository
> corruption.
> 

Didn't you get the strerror(3) message from that? If you didn't, I'd 
consider it a bug.
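
For illustration, this is the kind of report I mean, sketched in Python
rather than C; the function name and message wording are made up and are not
what git prints.

```python
import os


def write_object(path, data):
    """Write data to path.

    Returns None on success, or an error string that includes the
    operating system's strerror text on failure.
    """
    try:
        with open(path, "wb") as fh:
            fh.write(data)
    except OSError as exc:
        return "unable to write %s: %s" % (path, os.strerror(exc.errno))
    return None
```

With a directory that is not writable, the returned string would carry the
system's text for EACCES, which is what tells the user what actually went
wrong.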

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply

* Re: Errors GITtifying GCC and Binutils
From: Santi Béjar @ 2006-03-24 19:35 UTC (permalink / raw)
  To: Jan-Benedict Glaw; +Cc: git
In-Reply-To: <20060324182504.GI31387@lug-owl.de>

Jan-Benedict Glaw <jbglaw@lug-owl.de> writes:

> On Wed, 2006-03-22 14:33:37 +0100, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
>
> Since it seems nobody looked at the GCC import run (which means to use
> the svnimport), I ran it again, under strace control:
>
>> GCC
>> ~~~
>> $ /home/jbglaw/bin/git svnimport -C gcc -v svn://gcc.gnu.org/svn/gcc
>
>> Committed change 3936:/ 1993-03-31 05:44:03)
>> Commit ID ceff85145f8671fb2a9d826a761cedc2a507bd1e
>> Writing to refs/heads/origin
>> DONE: 3936 origin ceff85145f8671fb2a9d826a761cedc2a507bd1e
>> ... 3937 trunk/gcc/final.c ...
>> Can't fork at /home/jbglaw/bin/git-svnimport line 379.
>

I have the same (?) problem with one of my svn repositories. It worked
before (I've redone the import with the -r flag), so I bisected it.
The problematic commit seems to be:

diff-tree 4802426... (from 525c0d7...)
Author: Karl  Hasselström <kha@treskal.com>
Date:   Sun Feb 26 06:11:27 2006 +0100

    svnimport: Convert executable flag

    Convert the svn:executable property to file mode 755 when converting
    an SVN repository to GIT.

    Signed-off-by: Karl Hasselström <kha@treskal.com>
    Signed-off-by: Junio C Hamano <junkio@cox.net>

:100755 100755 ee2940f... 6603b96... M  git-svnimport.perl

I think it has a memory leak; it used up to 140m of memory.

$ git reset --hard 4802426^
$ time ../git-svnimport.perl file:///path/
Use of uninitialized value in string eq at ../git-svnimport.perl line 463.
Use of uninitialized value in substitution (s///) at ../git-svnimport.perl line 466.
real    0m55.801s
user    0m30.578s
sys     0m23.084s

$ git reset --hard 4802426
$ time ../git-svnimport.perl file:///path/
Use of uninitialized value in string eq at ../git-svnimport.perl line 463.
Use of uninitialized value in substitution (s///) at ../git-svnimport.perl line 466.
Can't fork at /home/santi/usr/src/scm/git/git-svnimport.perl line 331.
real    6m2.163s
user    0m20.332s
sys     0m50.180s

and it didn't finish. Hope it helps.
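
To compare memory use between the two revisions, a small wrapper along these
lines might help. It is only a sketch: the function name is made up, and
ru_maxrss is reported in kilobytes on Linux but in other units elsewhere.

```python
import resource
import subprocess


def run_and_peak_rss(argv):
    """Run argv to completion and return (exit_code, peak_rss).

    peak_rss is the high-water mark over all child processes this
    process has waited for so far, in the platform's ru_maxrss unit.
    """
    proc = subprocess.run(argv)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss
```

Since the figure covers every child waited for so far, each revision should
be measured from a fresh Python process.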

Santi

^ permalink raw reply
