git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* libxdiff and patience diff
@ 2008-11-04  0:40 Pierre Habouzit
  2008-11-04  3:17 ` Davide Libenzi
  2008-11-04  5:39 ` Johannes Schindelin
  0 siblings, 2 replies; 67+ messages in thread
From: Pierre Habouzit @ 2008-11-04  0:40 UTC (permalink / raw)
  To: davidel; +Cc: Git ML

[-- Attachment #1: Type: text/plain, Size: 2679 bytes --]

Hi Davide,

I've been working tonight, trying to make libxdiff support the patience
diff algorithm, but I've totally failed, because I _thought_ I
understood what xdl_split was doing, but it appears I don't.


[ For the readers playing at home, the patience diff algorithm is
  explained after my sig. ]


What I did is to:
(1) add a flag to the xenv in xdl_split that says that I want a
    patience "split".
(2) Then in xdl_split, if that bit is set, I compute the longest common
    subsequence from the patience diff.
(3) for each split it computes I call xdl_recs_cmp on that interval.


What I thought it would achieve is that I force some boundaries at which
libxdiff _must_ resync. Though, it seems that for some reason it doesn't
work, probably because the "ha" stuff and the boundaries absolutely
don't work the way I thought it did.

So where is the place I should do that ? I suspect it should be
partly in xprepare.c but I'm a bit stuck right now.


Any pointer on how the stuff in xprepare.c and xdiffi.c work would help
greatly, it's really not self-evident to me :)


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



Patience diff
=============

Basically, it's an algorithm that considers line from the left file A
and the right file B.

It searches for lines that are unique *IN* A and unique *IN* B as well.
Then using a clever O(n log log n) where n is that number of lines, you
extract the longer common sequence of those lines.

This defines two splits of file A and B. For each corresponding chunks,
you then reduce the lines matching at the beginning and the end, and
reiterate the algorithm on the interior of that space, until there are
_no_ unique lines at all, and then you apply a usual diff algorithm to
generate the diff in that final section.

http://alfedenzo.livejournal.com/170301.html has a nice visual
explanation of the fact, even if it forgets about the "zones" trimming
that helps for the efficiency. It's also often wrong to generate the
stacks in the order shown on the blog, because you recurse from the max
line to the min line, which is not the natural order to work on a diff.
But that's just a matter of "inverting" all the comparisons which is
pretty obvious to do.


The difference of output can be seen on http://glandium.org/blog/?p=120
where the patience diff picks the line
   "include $(topsrcdir)/config/rules.mk"
as being unique on the left and the right file, hence will use it as
sync point rather than using it in a diff.


[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 67+ messages in thread

end of thread, other threads:[~2009-01-10 11:37 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-04  0:40 libxdiff and patience diff Pierre Habouzit
2008-11-04  3:17 ` Davide Libenzi
2008-11-04  8:33   ` Pierre Habouzit
2008-11-04  5:39 ` Johannes Schindelin
2008-11-04  8:30   ` Pierre Habouzit
2008-11-04 14:34     ` Johannes Schindelin
2008-11-04 15:23       ` Pierre Habouzit
2008-11-04 15:57         ` Johannes Schindelin
2008-11-04 16:15           ` Pierre Habouzit
2009-01-01 16:38         ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-01 16:38           ` [PATCH 1/3] Implement " Johannes Schindelin
2009-01-01 16:39           ` [PATCH 2/3] Introduce the diff option '--patience' Johannes Schindelin
2009-01-01 16:39           ` [PATCH 3/3] bash completions: Add the --patience option Johannes Schindelin
2009-01-01 19:45           ` [PATCH 0/3] Teach Git about the patience diff algorithm Linus Torvalds
2009-01-01 20:00             ` Linus Torvalds
2009-01-02 18:17               ` Johannes Schindelin
2009-01-02 18:49                 ` Linus Torvalds
2009-01-02 19:07                   ` Johannes Schindelin
2009-01-02 18:51                 ` Jeff King
2009-01-02 21:59               ` [PATCH 1/3 v2] Implement " Johannes Schindelin
2009-01-02 21:59                 ` Johannes Schindelin
2009-01-01 20:46             ` [PATCH 0/3] Teach Git about " Adeodato Simó
2009-01-02  1:56               ` Linus Torvalds
2009-01-02 10:55                 ` Clemens Buchacher
2009-01-02 10:58                   ` Clemens Buchacher
2009-01-02 16:42                     ` Linus Torvalds
2009-01-02 18:46                       ` Johannes Schindelin
2009-01-02 19:03                         ` Linus Torvalds
2009-01-02 19:22                           ` Johannes Schindelin
2009-01-02 19:39                           ` Jeff King
2009-01-02 19:50                             ` Jeff King
2009-01-02 20:52                               ` Jeff King
2009-01-02 23:05                                 ` Linus Torvalds
2009-01-03 16:24                             ` Bazaar's patience diff as GIT_EXTERNAL_DIFF Adeodato Simó
2009-01-02 21:59                       ` [PATCH 0/3] Teach Git about the patience diff algorithm Johannes Schindelin
2009-01-08 19:55                       ` Adeodato Simó
2009-01-08 20:06                         ` Adeodato Simó
2009-01-09  6:54                         ` Junio C Hamano
2009-01-09 13:07                           ` Johannes Schindelin
2009-01-09 15:59                             ` Adeodato Simó
2009-01-09 18:09                             ` Linus Torvalds
2009-01-09 18:13                               ` Linus Torvalds
2009-01-09 20:53                             ` Junio C Hamano
2009-01-10 11:36                               ` Johannes Schindelin
2009-01-02 11:03                   ` Junio C Hamano
2009-01-02 18:50                 ` Adeodato Simó
2009-01-06 11:17     ` Pierre Habouzit
2009-01-06 11:39       ` Pierre Habouzit
2009-01-06 19:40       ` Johannes Schindelin
2009-01-07 14:39         ` Pierre Habouzit
2009-01-07 17:01           ` Johannes Schindelin
2009-01-07 17:04             ` [PATCH v3 1/3] Implement " Johannes Schindelin
2009-01-07 18:10               ` Davide Libenzi
2009-01-07 18:32                 ` Johannes Schindelin
2009-01-07 20:09                   ` Davide Libenzi
2009-01-07 20:19                     ` Johannes Schindelin
2009-01-07 18:59                 ` Linus Torvalds
2009-01-07 20:00                   ` Johannes Schindelin
2009-01-07 20:11                   ` Davide Libenzi
2009-01-07 20:15         ` [PATCH 0/3] Teach Git about " Sam Vilain
2009-01-07 20:25           ` Linus Torvalds
2009-01-08  2:31             ` Sam Vilain
2009-01-07 20:38           ` Johannes Schindelin
2009-01-07 20:48             ` Junio C Hamano
2009-01-07 22:00               ` Johannes Schindelin
2009-01-07 22:45                 ` Pierre Habouzit
2009-01-07 23:03                   ` Johannes Schindelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).