From: Stefan Beller <sbeller@google.com>
To: git@vger.kernel.org
Cc: Stefan Beller <sbeller@google.com>
Subject: [PATCH] Documentation/diff-options: explain different diff algorithms
Date: Mon, 23 Jul 2018 17:36:19 -0700 [thread overview]
Message-ID: <20180724003619.185290-1-sbeller@google.com> (raw)
As a user I wondered what the diff algorithms are about. Offer at least
a basic explanation on the differences of the diff algorithms.
Signed-off-by: Stefan Beller <sbeller@google.com>
---
Documentation/diff-options.txt | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index f466600972f..0d765482027 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -94,16 +94,34 @@ diff" algorithm internally.
Choose a diff algorithm. The variants are as follows:
+
--
-`default`, `myers`;;
- The basic greedy diff algorithm. Currently, this is the default.
`minimal`;;
- Spend extra time to make sure the smallest possible diff is
- produced.
+ A diff as produced by the basic greedy algorithm described in
+ link:http://www.xmailserver.org/diff2.pdf[An O(ND) Difference Algorithm and its Variations]
+`default`, `myers`;;
+ The same algorithm as `minimal`, extended with a heuristic to
+ reduce extensive searches. Currently, this is the default.
`patience`;;
- Use "patience diff" algorithm when generating patches.
+ Use "patience diff" algorithm when generating patches. This
+ matches the longest common subsequence of unique lines on
+ both sides, recursively. It obtained its name by the way the
+ longest subsequence is found, as that is a byproduct of the
+ patience sorting algorithm. If there are no unique lines left
+ it falls back to `myers`. Empirically this algorithm produces
+ a more readable output for code, but it does not garantuee
+ the shortest output.
`histogram`;;
- This algorithm extends the patience algorithm to "support
- low-occurrence common elements".
+ This algorithm re-implements the `patience` algorithm with
+ "support of low-occurrence common elements" and only picks
+ one element of the LCS for the recursion. It is often the
+ fastest, but in cornercases (when there are many longest
+ common subsequences of the same length) it produces bad
+ results as seen in:
+
+ seq 1 100 >one
+ echo 99 > two
+ seq 1 2 98 >>two
+ git diff --no-index --histogram one two
+
--
+
For instance, if you configured diff.algorithm variable to a
--
2.18.0.345.g5c9ce644c3-goog
next reply other threads:[~2018-07-24 0:36 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-24 0:36 Stefan Beller [this message]
2018-07-24 4:40 ` [PATCH] Documentation/diff-options: explain different diff algorithms Jonathan Nieder
2018-07-24 17:38 ` Stefan Beller
2018-07-24 20:06 ` Junio C Hamano
2018-08-06 22:25 ` Stefan Beller
2018-08-06 23:18 ` Jonathan Nieder
2018-08-07 15:56 ` Junio C Hamano
2018-08-09 19:26 ` Stefan Beller
2018-08-10 22:18 ` Stefan Beller
2018-08-09 19:51 ` Stefan Beller
2018-08-10 0:10 ` [PATCH 0/2] Getting data on different diff algorithms WAS: " Stefan Beller
2018-08-10 0:10 ` [PATCH 1/2] WIP: range-diff: take extra arguments for different diffs Stefan Beller
2018-08-10 0:10 ` [PATCH 2/2] WIP range-diff: print some statistics about the range Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180724003619.185290-1-sbeller@google.com \
--to=sbeller@google.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).