From: "Adam J. Richter" <adam@yggdrasil.com>
To: wsc9tt@gmail.com
Cc: barkalow@iabervon.org, bram@bitconjurer.org,
droundry@abridgegame.org, git@vger.kernel.org,
ry102@rz.uni-karlsruhe.de, tupshin@tupshin.com
Subject: Re: The criss-cross merge case
Date: Sat, 30 Apr 2005 05:32:11 -0700 [thread overview]
Message-ID: <200504301232.j3UCWBO05174@adam.yggdrasil.com> (raw)
On Fri, 29 Apr 2005 07:19:18 -0500, Wayne Scott wrote:
>On 4/28/05, Adam J. Richter <adam@yggdrasil.com> wrote:
>> On 2005-04-28, Benedikt Schmidt wrote:
>> >AFAIK the paper mentioned in the GNU diff sources [1] is an improvement
>> >to an earlier paper by the same author titled
>> >"A File Comparison Program" - Miller, Myers - 1985.
>> [...]
>> >[1] http://citeseer.ist.psu.edu/myers86ond.html
>>
>> Monotone apparently uses a futher acceleration of that algorithm
>> from the 1989 paper, also co-authored by the Myers, "An O(NP) Sequence
>> Comparison Algorithm" by Sun Wu, Udi Manber, and Gene Myers.
>> http://www.eecs.berkeley.edu/~gene/Papers/np_diff.pdf . The Monotone
>> implementation was apparently a port of an implementation originally
>> written in Scheme by Aubrey Jaffer.
>>
>> I don't fully understand the 1989 paper, but I get the
>> general impression that is a small change to the previous algorithm
>> (the one in GNU diff) that might be a 30 line patch if someone
>> got around to submitting it, and seems to make the code run more
>> than twice as fast in practice. One of these days, I will probably get
>> around to coding up a patch to GNU diff if nobody beats me to it.
>>
>> Making diff run faster may have at least one potentially useful
>> benefit for merging. A faster diff makes it more practical run diff
>> on smaller units of comparison. I posted a note here before about
>> converting the input files to diff3 to have just one character per
>> line, and then undoing that transformation of the result to produce
>> a character based merge that seemed to work pretty well in the
>> couple of tests that I tried.
>I just read that paper and unless I am mistaken, it already describes
>the basis for how GNU diff works. I don't think anything in that
>paper would make it faster.
>
>I also don't find anything to suggest the Monotone guys have rewritten
>diff. Just some notes from graydon that notes python's difflib uses a
>non-optimal diff that is faster in some cases.
In terminology that can only be understood by reading
the 1985 paper, the 1989 paper describes a possible reduction
in the number of diagonals in the edit graph that iterations of the
1989 algorithm have to consider. I say "possible reduction" because
the reduction can be zero in the worse case, although I get the
impression that it should be a reduction of 50% or better
typically, and it makes the case where the changes is just
a bunch of inserts run in linear time.
I believe that the longest common subsequence finder
at the core of GNU diff does not currently perform this optimization,
but the one in monotone-0.18/lcs.{cc,hh} does.
__ ______________
Adam J. Richter \ /
adam@yggdrasil.com | g g d r a s i l
next reply other threads:[~2005-04-30 13:36 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-04-30 12:32 Adam J. Richter [this message]
-- strict thread matches above, loose matches on Subject: below --
2005-04-28 14:25 The criss-cross merge case Adam J. Richter
2005-04-29 12:19 ` Wayne Scott
2005-04-27 20:25 Bram Cohen
2005-04-27 23:32 ` Daniel Barkalow
2005-04-28 0:43 ` Tupshin Harper
2005-04-28 1:16 ` Daniel Barkalow
2005-04-28 2:15 ` Benedikt Schmidt
2005-04-28 2:19 ` Daniel Barkalow
2005-04-28 11:16 ` David Roundy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200504301232.j3UCWBO05174@adam.yggdrasil.com \
--to=adam@yggdrasil.com \
--cc=barkalow@iabervon.org \
--cc=bram@bitconjurer.org \
--cc=droundry@abridgegame.org \
--cc=git@vger.kernel.org \
--cc=ry102@rz.uni-karlsruhe.de \
--cc=tupshin@tupshin.com \
--cc=wsc9tt@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).