* Re: JGIT: discuss: diff/patch implementation
2008-11-10 14:22 JGIT: discuss: diff/patch implementation Francis Galiegue
@ 2008-11-10 15:56 ` Robin Rosenberg
2008-11-10 16:16 ` Francis Galiegue
2008-11-10 19:46 ` Johannes Schindelin
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Robin Rosenberg @ 2008-11-10 15:56 UTC (permalink / raw)
To: Francis Galiegue; +Cc: Git Mailing List, Shawn O. Pearce
On Monday 10 November 2008 at 15:22:13, Francis Galiegue wrote:
> Hello,
>
> A very nice git feature, without even going as far as merges, is the cherry
> pick feature.
>
> For this to be doable from within the Eclipse Git plugin, a diff/patch
> implementation needs to be found, in a license compatible with the current
> JGit license (3-clause BSD, as far as I can tell). Or a new implementation
> can be rewritten from scratch, of course.
>
> I found this:
>
> http://code.google.com/p/google-diff-match-patch
>
> Its license is the Apache 2.0 license. It implements the same algorithm as
> git's internal diff engine ("An O(ND) Difference Algorithm and its
> Variations", by Eugene Myers), and as far as I can tell so far (IANAL, far
> from it), it is compatible with JGit's current license.
>
> Could this be a viable candidate?
Our approach was to do just that, for the very reasons you mention.
I'll have a look. Thanks for doing some research for us. That project was
unknown to me..
-- robin
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 15:56 ` Robin Rosenberg
@ 2008-11-10 16:16 ` Francis Galiegue
2008-11-10 16:59 ` Robin Rosenberg
0 siblings, 1 reply; 16+ messages in thread
From: Francis Galiegue @ 2008-11-10 16:16 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: Git Mailing List, Shawn O. Pearce
On Monday 10 November 2008 at 16:56:35, Robin Rosenberg wrote:
[...]
> >
> > I found this:
> >
> > http://code.google.com/p/google-diff-match-patch
> >
> > Its license is the Apache 2.0 license. It implements the same algorithm
> > as git's internal diff engine ("An O(ND) Difference Algorithm and its
> > Variations", by Eugene Myers), and as far as I can tell so far (IANAL,
> > far from it), it is compatible with JGit's current license.
> >
> > Could this be a viable candidate?
>
> Our approach was to do just that, for the very reasons you mention.
> I'll have a look. Thanks for doing some research for us. That project was
> unknown to me..
>
> -- robin
Well, this API has a problem from the get go, since it does... Char by char
comparison. Ouch.
I'll try and hack it so that it does line by line, but given my Java skills,
uh...
--
fge
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 16:16 ` Francis Galiegue
@ 2008-11-10 16:59 ` Robin Rosenberg
2008-11-10 18:11 ` Francis Galiegue
0 siblings, 1 reply; 16+ messages in thread
From: Robin Rosenberg @ 2008-11-10 16:59 UTC (permalink / raw)
To: Francis Galiegue; +Cc: Git Mailing List, Shawn O. Pearce
On Monday 10 November 2008 at 17:16:28, Francis Galiegue wrote:
> On Monday 10 November 2008 at 16:56:35, Robin Rosenberg wrote:
> [...]
> > >
> > > I found this:
> > >
> > > http://code.google.com/p/google-diff-match-patch
> > >
> > > Its license is the Apache 2.0 license. It implements the same algorithm
> > > as git's internal diff engine ("An O(ND) Difference Algorithm and its
> > > Variations", by Eugene Myers), and as far as I can tell so far (IANAL,
> > > far from it), it is compatible with JGit's current license.
> > >
> > > Could this be a viable candidate?
> >
> > Our approach was to do just that, for the very reasons you mention.
> > I'll have a look. Thanks for doing some research for us. That project was
> > unknown to me..
> >
> > -- robin
>
> Well, this API has a problem from the get go, since it does... Char by char
> comparison. Ouch.
>
> I'll try and hack it so that it does line by line, but given my Java skills,
> uh...
>
We might want a byte-oriented version. Converting to char first is way
too slow.
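
A minimal sketch of what "byte-oriented" could mean here, assuming the file
is already in memory as a byte[]: index where each line starts and compare
lines directly on the bytes, never decoding to char. The class and method
names (ByteLines, lineEquals) are hypothetical and are not JGit API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not JGit API: views a buffer as a list of raw byte lines. */
final class ByteLines {
    private final byte[] buf;
    /** starts[i] is the offset of line i; the last entry equals buf.length. */
    private final int[] starts;

    ByteLines(byte[] buf) {
        this.buf = buf;
        List<Integer> offs = new ArrayList<>();
        int pos = 0;
        while (pos < buf.length) {
            offs.add(pos);
            while (pos < buf.length && buf[pos] != '\n') {
                pos++;
            }
            if (pos < buf.length) {
                pos++; // step past the '\n' so it stays part of the line
            }
        }
        offs.add(buf.length);
        starts = new int[offs.size()];
        for (int i = 0; i < starts.length; i++) {
            starts[i] = offs.get(i);
        }
    }

    /** Number of lines; a final line without '\n' still counts. */
    int size() {
        return starts.length - 1;
    }

    /** Compares line i of this buffer with line j of other, byte for byte. */
    boolean lineEquals(int i, ByteLines other, int j) {
        int lenA = starts[i + 1] - starts[i];
        int lenB = other.starts[j + 1] - other.starts[j];
        if (lenA != lenB) {
            return false;
        }
        for (int n = 0; n < lenA; n++) {
            if (buf[starts[i] + n] != other.buf[other.starts[j] + n]) {
                return false;
            }
        }
        return true;
    }
}
```

A line-based diff could then work on line indexes and call lineEquals for
each comparison, so no charset conversion is needed along the way.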
-- robin
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 16:59 ` Robin Rosenberg
@ 2008-11-10 18:11 ` Francis Galiegue
0 siblings, 0 replies; 16+ messages in thread
From: Francis Galiegue @ 2008-11-10 18:11 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: Git Mailing List, Shawn O. Pearce
On Monday 10 November 2008 at 17:59:03, Robin Rosenberg wrote:
[Sorry if this is offtopic for the git mailing list...]
> >
> > Well, this API has a problem from the get go, since it does... Char by
> > char comparison. Ouch.
> >
> > I'll try and hack it so that it does line by line, but given my Java
> > skills, uh...
>
> We might want a byte-oriented version. Converting to char first is way
> too slow.
>
Well, AFAICT, here is how the current git code detects whether a file is
binary or not:
----
#include <string.h>

#define FIRST_FEW_BYTES 8000

/* Treat a buffer as binary if a NUL byte occurs in its first few bytes. */
int buffer_is_binary(const char *ptr, unsigned long size)
{
	if (FIRST_FEW_BYTES < size)
		size = FIRST_FEW_BYTES;
	return !!memchr(ptr, 0, size);
}
----
Easy enough to be coded in Java, hey, even I could do it :p
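
For illustration, a minimal Java sketch of the same heuristic, assuming the
content is already available as a byte[]. BinaryDetector and isBinary are
hypothetical names chosen here, not existing JGit API; only the 8000-byte
limit and the NUL-byte test come from the C function quoted above.

```java
/** Hypothetical helper, not part of JGit: a Java rendering of the NUL-byte heuristic. */
final class BinaryDetector {
    /** Same limit as FIRST_FEW_BYTES in the C function. */
    static final int FIRST_FEW_BYTES = 8000;

    /** Returns true if a NUL byte occurs within the first FIRST_FEW_BYTES bytes of buf. */
    static boolean isBinary(byte[] buf) {
        int limit = Math.min(buf.length, FIRST_FEW_BYTES);
        for (int i = 0; i < limit; i++) {
            if (buf[i] == 0) {
                return true;
            }
        }
        return false;
    }
}
```

As in the C version, a NUL byte beyond the first 8000 bytes goes unnoticed,
so this is a cheap heuristic rather than a guarantee.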
So, provided binary files are dealt with already, what penalty is left for
Java to deal with?
--
fge
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 14:22 JGIT: discuss: diff/patch implementation Francis Galiegue
2008-11-10 15:56 ` Robin Rosenberg
@ 2008-11-10 19:46 ` Johannes Schindelin
2008-11-10 20:21 ` Francis Galiegue
2008-11-10 20:50 ` Junio C Hamano
2008-11-11 7:27 ` Rogan Dawes
3 siblings, 1 reply; 16+ messages in thread
From: Johannes Schindelin @ 2008-11-10 19:46 UTC (permalink / raw)
To: Francis Galiegue; +Cc: Git Mailing List, Shawn O. Pearce, Robin Rosenberg
Hi,
On Mon, 10 Nov 2008, Francis Galiegue wrote:
> A very nice git feature, without even going as far as merges, is the
> cherry pick feature.
>
> For this to be doable from within the Eclipse Git plugin, a diff/patch
> implementation needs to be found, in a license compatible with the
> current JGit license (3-clause BSD, as far as I can tell). Or a new
> implementation can be rewritten from scratch, of course.
Do not forget creating efficient packs. They also need an efficient diff
engine.
> I found this:
>
> http://code.google.com/p/google-diff-match-patch
Nice.
As was pointed out already, it is geared more toward text than I'd like,
and it also seems to have cute DWIMery for HTML.
I did not find any implementation, so I started implementing my own
version of Gene Myers' algorithm, with the plan to extend it with a
patience diff option.
My code so far can generate a diff between two files, but it uses O(D^2)
space (where D is the number of differences) rather than O(D), as I have
not had enough time (a conference and traveling around the world can do
that to you).
Having looked at the source code of diff-match-patch, I admit that I do
not understand enough of the algorithm with so little documentation, so I
will continue my fun project.
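
As a rough illustration of the greedy core of Myers' O(ND) algorithm, here
is a minimal Java sketch that computes only D, the length of the shortest
edit script between two lists of lines. It is not the code described in
this message; MyersSketch and editDistance are hypothetical names. It keeps
a single array of furthest-reaching x positions, one per diagonal, so it
cannot reconstruct the script itself. One simple way to recover the script
is to keep a copy of that array for every value of d, which is roughly
where quadratic space comes from.

```java
import java.util.List;

/** Hypothetical sketch, not JGit code: shortest edit script length between two line lists. */
final class MyersSketch {
    static int editDistance(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();
        int max = n + m;
        int off = max; // shifts diagonal k in [-max, max] to a non-negative index
        // v[off + k] holds the furthest x reached so far on diagonal k = x - y.
        int[] v = new int[2 * max + 2];
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[off + k - 1] < v[off + k + 1])) {
                    x = v[off + k + 1];     // move down: insert a line from b
                } else {
                    x = v[off + k - 1] + 1; // move right: delete a line from a
                }
                int y = x - k;
                // Follow the "snake": matching lines cost nothing.
                while (x < n && y < m && a.get(x).equals(b.get(y))) {
                    x++;
                    y++;
                }
                v[off + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        throw new IllegalStateException("unreachable: d = n + m always suffices");
    }
}
```

For example, the line lists A,B,C,A,B,B,A and C,B,A,B,A,C, the example used
in Myers' paper, need five insertions and deletions in total.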
Ciao,
Dscho
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 19:46 ` Johannes Schindelin
@ 2008-11-10 20:21 ` Francis Galiegue
0 siblings, 0 replies; 16+ messages in thread
From: Francis Galiegue @ 2008-11-10 20:21 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List, Shawn O. Pearce, Robin Rosenberg
On Monday 10 November 2008 at 20:46:02, Johannes Schindelin wrote:
> Hi,
>
> On Mon, 10 Nov 2008, Francis Galiegue wrote:
> > A very nice git feature, without even going as far as merges, is the
> > cherry pick feature.
> >
> > For this to be doable from within the Eclipse Git plugin, a diff/patch
> > implementation needs to be found, in a license compatible with the
> > current JGit license (3-clause BSD, as far as I can tell). Or a new
> > implementation can be rewritten from scratch, of course.
>
> Do not forget creating efficient packs. They also need an efficient diff
> engine.
>
I wasn't even thinking about this, honestly :p
Let's say that as far as IDE users are concerned, they do have disk space, and
having the ability to cherry-pick is more of a priority than packs ;) Even a
less efficient but "to the point" engine will be good enough for the time
being, or at least, this is what I think.
I understand way too little about the algorithm myself to tell whether it's
also efficient for such a purpose. Maybe it is...
--
fge
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 14:22 JGIT: discuss: diff/patch implementation Francis Galiegue
2008-11-10 15:56 ` Robin Rosenberg
2008-11-10 19:46 ` Johannes Schindelin
@ 2008-11-10 20:50 ` Junio C Hamano
2008-11-10 20:52 ` Shawn O. Pearce
` (2 more replies)
2008-11-11 7:27 ` Rogan Dawes
3 siblings, 3 replies; 16+ messages in thread
From: Junio C Hamano @ 2008-11-10 20:50 UTC (permalink / raw)
To: Francis Galiegue; +Cc: Git Mailing List, Shawn O. Pearce, Robin Rosenberg
Francis Galiegue <fg@one2team.net> writes:
> A very nice git feature, without even going as far as merges, is the cherry
> pick feature.
I thought cherry-picking needs to be done in terms of 3-way merge, not
diff piped to patch, for correctness's sake.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 20:50 ` Junio C Hamano
@ 2008-11-10 20:52 ` Shawn O. Pearce
2008-11-10 21:31 ` Francis Galiegue
2008-11-10 23:37 ` Johannes Schindelin
2008-11-11 10:06 ` Raimund Bauer
2 siblings, 1 reply; 16+ messages in thread
From: Shawn O. Pearce @ 2008-11-10 20:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Francis Galiegue, Git Mailing List, Robin Rosenberg
Junio C Hamano <gitster@pobox.com> wrote:
> Francis Galiegue <fg@one2team.net> writes:
>
> > A very nice git feature, without even going as far as merges, is the cherry
> > pick feature.
>
> I thought cherry-picking needs to be done in terms of 3-way merge, not
> diff piped to patch, for correctness's sake.
Yeah, the 3-way merge cherry-pick is better. But in a pinch you
can (usually) get correct results from a "diff | patch" pipeline.
Of course that doesn't always work, resulting in patches that don't
apply cleanly, or worse, that apply at the wrong place silently.
--
Shawn.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 20:52 ` Shawn O. Pearce
@ 2008-11-10 21:31 ` Francis Galiegue
0 siblings, 0 replies; 16+ messages in thread
From: Francis Galiegue @ 2008-11-10 21:31 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Junio C Hamano, Git Mailing List, Robin Rosenberg,
Johannes Schindelin
On Monday 10 November 2008 at 21:52:42, Shawn O. Pearce wrote:
> Junio C Hamano <gitster@pobox.com> wrote:
> > Francis Galiegue <fg@one2team.net> writes:
> > > A very nice git feature, without even going as far as merges, is the
> > > cherry pick feature.
> >
> > I thought cherry-picking needs to be done in terms of 3-way merge, not
> > diff piped to patch, for correctness's sake.
>
> Yea, the 3-way merge cherry-pick is better. But in a pinch you
> can (usually) get correct results from a "diff | patch" pipeline.
> Of course that doesn't always work, resulting in patches that don't
> apply cleanly, or worse, that apply at the wrong place silently.
Well, in this case, I'd say it's a case of a bottle being "half full" or "half
empty".
The availability of even a simple diff|patch in jgit, and its being available
in egit, would generally be seen as a "half full" bottle, and would, imho,
GREATLY increase the appeal of egit, all the more so since you have plenty
of undo/redo ability in Eclipse... And, dare I say it, of git in general as
an SCM to be used in many environments where Eclipse is the de facto IDE.
I know, I may sound irritating, but...
--
fge
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 20:50 ` Junio C Hamano
2008-11-10 20:52 ` Shawn O. Pearce
@ 2008-11-10 23:37 ` Johannes Schindelin
2008-11-11 10:06 ` Raimund Bauer
2 siblings, 0 replies; 16+ messages in thread
From: Johannes Schindelin @ 2008-11-10 23:37 UTC (permalink / raw)
To: Junio C Hamano
Cc: Francis Galiegue, Git Mailing List, Shawn O. Pearce,
Robin Rosenberg
Hi,
On Mon, 10 Nov 2008, Junio C Hamano wrote:
> Francis Galiegue <fg@one2team.net> writes:
>
> > A very nice git feature, without even going as far as merges, is the
> > cherry pick feature.
>
> I thought cherry-picking needs to be done in terms of 3-way merge, not
> diff piped to patch, for correctness's sake.
I haven't checked how RCS merge does it, but I know how xdiff/xmerge.c
does it ;-)
Basically, it takes the two diffs relative to the base file and works on
the overlapping hunks (i.e. on hunks where the ranges in the base file
overlap).
So we need a diff algorithm very much if we were to imitate that code in
JGit, which I very much plan to do.
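
To make the "overlapping hunks" step concrete, here is a minimal Java
sketch. It does not mirror the data structures in xdiff/xmerge.c; HunkOverlap,
Hunk and overlappingPairs are hypothetical names. It assumes each diff
against the base is given as a sorted list of hunks, each covering a
half-open line range [start, end) of the base file, with at least one
unchanged line between hunks of the same diff. As a simplifying choice,
ranges that merely touch count as overlapping, so that two insertions at
the same position (empty ranges) are reported together.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only: pairs up hunks whose base ranges overlap. */
final class HunkOverlap {
    /** A changed region of the base file, as the half-open line range [start, end). */
    static final class Hunk {
        final int start;
        final int end;

        Hunk(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns index pairs {i, j} where ours.get(i) and theirs.get(j) overlap or touch. */
    static List<int[]> overlappingPairs(List<Hunk> ours, List<Hunk> theirs) {
        List<int[]> pairs = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < ours.size() && j < theirs.size()) {
            Hunk a = ours.get(i);
            Hunk b = theirs.get(j);
            if (a.start <= b.end && b.start <= a.end) {
                pairs.add(new int[] {i, j});
            }
            // Advance the hunk that ends first; the other may reach later hunks.
            if (a.end <= b.end) {
                i++;
            } else {
                j++;
            }
        }
        return pairs;
    }
}
```

Hunks that appear in no pair changed a region only one side touched and
could be taken as they are; the reported pairs are the places where a
merge would have to compare both sides' changes.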
Ciao,
Dscho
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 20:50 ` Junio C Hamano
2008-11-10 20:52 ` Shawn O. Pearce
2008-11-10 23:37 ` Johannes Schindelin
@ 2008-11-11 10:06 ` Raimund Bauer
2008-11-11 17:18 ` Shawn O. Pearce
2 siblings, 1 reply; 16+ messages in thread
From: Raimund Bauer @ 2008-11-11 10:06 UTC (permalink / raw)
To: Junio C Hamano
Cc: Francis Galiegue, Git Mailing List, Shawn O. Pearce,
Robin Rosenberg
On Mon, 2008-11-10 at 12:50 -0800, Junio C Hamano wrote:
> Francis Galiegue <fg@one2team.net> writes:
>
> > A very nice git feature, without even going as far as merges, is the cherry
> > pick feature.
>
> I thought cherry-picking needs to be done in terms of 3-way merge, not
> diff piped to patch, for correctness's sake.
What about http://sourceforge.net/projects/jlibdiff ?
Maybe a bit old, but claims to have diff3 and is under LGPL.
best regards,
Ray
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-11 10:06 ` Raimund Bauer
@ 2008-11-11 17:18 ` Shawn O. Pearce
2008-11-11 17:31 ` Sverre Rabbelier
0 siblings, 1 reply; 16+ messages in thread
From: Shawn O. Pearce @ 2008-11-11 17:18 UTC (permalink / raw)
To: Raimund Bauer
Cc: Junio C Hamano, Francis Galiegue, Git Mailing List,
Robin Rosenberg
Raimund Bauer <ray007@gmx.net> wrote:
> On Mon, 2008-11-10 at 12:50 -0800, Junio C Hamano wrote:
> > Francis Galiegue <fg@one2team.net> writes:
> >
> > > A very nice git feature, without even going as far as merges, is the cherry
> > > pick feature.
> >
> > I thought cherry-picking needs to be done in terms of 3-way merge, not
> > diff piped to patch, for correctness's sake.
>
> What about http://sourceforge.net/projects/jlibdiff ?
> Maybe a bit old, but claims to have diff3 and is under LGPL.
I hadn't looked at that library before.
We've generally tried to avoid LGPL diff implementations, partly
because any I found were ports from a GPL C-based code tree to Java,
where the guy who did the port then went and changed the license
to LGPL. Slightly dubious if you ask me. ;-)
LGPL plays nicely with BSD, especially in Java where runtime
relinking is possible. But it does screw with jgit.pgm's little
idea of "shove *everything* into a single shell script", as then
it's not runtime re-linkable by the user.
I don't know how the Eclipse foundation feels about distributing
LGPL in the IDE. One of our major reasons for going with a BSD
license on JGit was so the Eclipse Git team provider plugin could be
distributed alongside the CVS team provider, as part of the basic IDE
team provider package. We're clearly not ready for that wide of a
distribution, but it was a goal Robin and I set out for the project.
--
Shawn.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-11 17:18 ` Shawn O. Pearce
@ 2008-11-11 17:31 ` Sverre Rabbelier
0 siblings, 0 replies; 16+ messages in thread
From: Sverre Rabbelier @ 2008-11-11 17:31 UTC (permalink / raw)
To: Shawn O. Pearce
Cc: Raimund Bauer, Junio C Hamano, Francis Galiegue, Git Mailing List,
Robin Rosenberg
On Tue, Nov 11, 2008 at 18:18, Shawn O. Pearce <spearce@spearce.org> wrote:
> I don't know how the Eclipse foundation feels about distributing
> LGPL in the IDE. One of our major reasons for going with a BSD
> license on JGit was so the Eclipse Git team provider plugin could be
> distributed alongside the CVS team provider, as part of the basic IDE
> team provider package. We're clearly not ready for that wide of a
> distribution, but it was a goal Robin and I set out for the project.
Why not keep that as a goal? For now, you can stick with one of the
existing LGPL implementations; later, when you want to have JGit
distributed with Eclipse, you (or Johannes Schindelin when he has the
time) write up your own Java version of it and license it under BSD.
--
Cheers,
Sverre Rabbelier
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-10 14:22 JGIT: discuss: diff/patch implementation Francis Galiegue
` (2 preceding siblings ...)
2008-11-10 20:50 ` Junio C Hamano
@ 2008-11-11 7:27 ` Rogan Dawes
2008-11-11 17:13 ` Shawn O. Pearce
3 siblings, 1 reply; 16+ messages in thread
From: Rogan Dawes @ 2008-11-11 7:27 UTC (permalink / raw)
To: Francis Galiegue; +Cc: Git Mailing List, Shawn O. Pearce, Robin Rosenberg
Francis Galiegue wrote:
> Hello,
>
> A very nice git feature, without even going as far as merges, is the cherry
> pick feature.
>
> For this to be doable from within the Eclipse Git plugin, a diff/patch
> implementation needs to be found, in a license compatible with the current
> JGit license (3-clause BSD, as far as I can tell). Or a new implementation
> can be rewritten from scratch, of course.
Shouldn't Eclipse already *have* a diff/patch implementation, for its
other "team work" plugins?
Rogan
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: JGIT: discuss: diff/patch implementation
2008-11-11 7:27 ` Rogan Dawes
@ 2008-11-11 17:13 ` Shawn O. Pearce
0 siblings, 0 replies; 16+ messages in thread
From: Shawn O. Pearce @ 2008-11-11 17:13 UTC (permalink / raw)
To: Rogan Dawes; +Cc: Francis Galiegue, Git Mailing List, Robin Rosenberg
Rogan Dawes <lists@dawes.za.net> wrote:
> Francis Galiegue wrote:
>>
>> For this to be doable from within the Eclipse Git plugin, a diff/patch
>> implementation needs to be found, in a license compatible with the
>> current JGit license (3-clause BSD, as far as I can tell). Or a new
>> implementation can be rewritten from scratch, of course.
>
> Shouldn't Eclipse already *have* a diff/patch implementation, for its
> other "team work" plugins?
Err, uhm, sort of.
Eclipse has patch available as an internal API, but it is exposed
in the UI for any team provider (or no team provider at all) to
use to apply patches to a project in the workspace.
The team provider API assumes the VCS implementation has its own
diff, and therefore the diff implementation inside Eclipse is only
used for the native Compare view.
I've dug around that part of the text compare plugin and it's mostly
internal APIs, and mostly still low-level LCS generation from
arbitrary object input. It doesn't seem well suited to producing
fast diffs of text.
It's under the EPL. We could take the code and simplify it down,
but I think by that point we'd mostly just want to rewrite it, or
use a different library anyway. At which point we wouldn't want
to bring in the EPL baggage if we can have a BSD implementation.
So yeah, there's some implementation in there, but it's not easy to
use or get to...
--
Shawn.
^ permalink raw reply [flat|nested] 16+ messages in thread