From: Jeff King <peff@peff.net>
To: "Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>
Cc: git@vger.kernel.org
Subject: [PATCH 4/5] diff-highlight: match multi-line hunks
Date: Mon, 13 Feb 2012 17:36:36 -0500 [thread overview]
Message-ID: <20120213223636.GD19521@sigill.intra.peff.net> (raw)
In-Reply-To: <20120213222702.GA19393@sigill.intra.peff.net>
Currently we only bother highlighting single-line hunks. The
rationale was that the purpose of highlighting is to point
out small changes between two similar lines that are
otherwise hard to see. However, that meant we missed similar
cases where two lines were changed together, like:
-foo(buf);
-bar(buf);
+foo(obj->buf);
+bar(obj->buf);
Each of those changes is simple, and would benefit from
highlighting (the "obj->" parts in this case).
This patch considers whole hunks at a time. For now, we
consider only the case where the hunk has the same number of
removed and added lines, and assume that the lines from each
segment correspond one-to-one. While this is just a
heuristic, in practice it seems to generate sensible
results (especially because we now omit highlighting on
completely-changed lines, so when our heuristic is wrong, we
tend to avoid highlighting at all).
Based on an original idea and implementation by Michał
Kiedrowicz.
Signed-off-by: Jeff King <peff@peff.net>
---
Same attribution statement applies as to patch 2 (in fact, patches 1 and
3 could be attributed to you, too).
This version has the missing documentation fixes. The implementation is
a little different than yours. I rearranged the parsing in a manner that
was a little more obvious to me, and I pulled out the "don't highlight
if the number of lines don't match" case into its own conditional, which
makes it more obvious where to work if somebody wants to try doing
something fancier.
contrib/diff-highlight/README | 16 ++++---
contrib/diff-highlight/diff-highlight | 70 ++++++++++++++++++++-------------
2 files changed, 52 insertions(+), 34 deletions(-)
diff --git a/contrib/diff-highlight/README b/contrib/diff-highlight/README
index 1b7b6df..4a58579 100644
--- a/contrib/diff-highlight/README
+++ b/contrib/diff-highlight/README
@@ -14,13 +14,15 @@ Instead, this script post-processes the line-oriented diff, finds pairs
of lines, and highlights the differing segments. It's currently very
simple and stupid about doing these tasks. In particular:
- 1. It will only highlight a pair of lines if they are the only two
- lines in a hunk. It could instead try to match up "before" and
- "after" lines for a given hunk into pairs of similar lines.
- However, this may end up visually distracting, as the paired
- lines would have other highlighted lines in between them. And in
- practice, the lines which most need attention called to their
- small, hard-to-see changes are touching only a single line.
+ 1. It will only highlight hunks in which the number of removed and
+ added lines is the same, and it will pair lines within the hunk by
+ position (so the first removed line is compared to the first added
+ line, and so forth). This is simple and tends to work well in
+ practice. More complex changes don't highlight well, so we tend to
+ exclude them due to the "same number of removed and added lines"
+ restriction. Or even if we do try to highlight them, they end up
+ not highlighting because of our "don't highlight if the whole line
+ would be highlighted" rule.
2. It will find the common prefix and suffix of two lines, and
consider everything in the middle to be "different". It could
diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
index 279d211..c4404d4 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -10,23 +10,28 @@ my $UNHIGHLIGHT = "\x1b[27m";
my $COLOR = qr/\x1b\[[0-9;]*m/;
my $BORING = qr/$COLOR|\s/;
-my @window;
+my @removed;
+my @added;
+my $in_hunk;
while (<>) {
- # We highlight only single-line changes, so we need
- # a 4-line window to make a decision on whether
- # to highlight.
- push @window, $_;
- next if @window < 4;
- if ($window[0] =~ /^$COLOR*(\@| )/ &&
- $window[1] =~ /^$COLOR*-/ &&
- $window[2] =~ /^$COLOR*\+/ &&
- $window[3] !~ /^$COLOR*\+/) {
- print shift @window;
- show_hunk(shift @window, shift @window);
+ if (!$in_hunk) {
+ print;
+ $in_hunk = /^$COLOR*\@/;
+ }
+ elsif (/^$COLOR*-/) {
+ push @removed, $_;
+ }
+ elsif (/^$COLOR*\+/) {
+ push @added, $_;
}
else {
- print shift @window;
+ show_hunk(\@removed, \@added);
+ @removed = ();
+ @added = ();
+
+ print;
+ $in_hunk = /^$COLOR*[\@ ]/;
}
# Most of the time there is enough output to keep things streaming,
@@ -42,26 +47,37 @@ while (<>) {
}
}
-# Special case a single-line hunk at the end of file.
-if (@window == 3 &&
- $window[0] =~ /^$COLOR*(\@| )/ &&
- $window[1] =~ /^$COLOR*-/ &&
- $window[2] =~ /^$COLOR*\+/) {
- print shift @window;
- show_hunk(shift @window, shift @window);
-}
-
-# And then flush any remaining lines.
-while (@window) {
- print shift @window;
-}
+# Flush any queued hunk (this can happen when there is no trailing context in
+# the final diff of the input).
+show_hunk(\@removed, \@added);
exit 0;
sub show_hunk {
my ($a, $b) = @_;
- print highlight_pair($a, $b);
+ # If one side is empty, then there is nothing to compare or highlight.
+ if (!@$a || !@$b) {
+ print @$a, @$b;
+ return;
+ }
+
+ # If we have mismatched numbers of lines on each side, we could try to
+ # be clever and match up similar lines. But for now we are simple and
+ # stupid, and only handle multi-line hunks that remove and add the same
+ # number of lines.
+ if (@$a != @$b) {
+ print @$a, @$b;
+ return;
+ }
+
+ my @queue;
+ for (my $i = 0; $i < @$a; $i++) {
+ my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
+ print $rm;
+ push @queue, $add;
+ }
+ print @queue;
}
sub highlight_pair {
--
1.7.8.4.17.g2df81
next prev parent reply other threads:[~2012-02-13 22:36 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-10 9:18 [PATCH 0/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-02-10 9:18 ` [PATCH 1/8] gitweb: Extract print_sidebyside_diff_lines() Michał Kiedrowicz
2012-02-11 15:20 ` Jakub Narebski
2012-02-11 23:03 ` Michał Kiedrowicz
2012-02-10 9:18 ` [PATCH 2/8] gitweb: Use print_diff_chunk() for both side-by-side and inline diffs Michał Kiedrowicz
2012-02-11 15:53 ` Jakub Narebski
2012-02-11 23:16 ` Michał Kiedrowicz
2012-02-25 9:00 ` Michał Kiedrowicz
2012-02-10 9:18 ` [PATCH 3/8] gitweb: Move HTML-formatting diff line back to process_diff_line() Michał Kiedrowicz
2012-02-11 16:02 ` Jakub Narebski
2012-02-10 9:18 ` [PATCH 4/8] gitweb: Push formatting diff lines to print_diff_chunk() Michał Kiedrowicz
2012-02-11 16:29 ` Jakub Narebski
2012-02-11 23:20 ` Michał Kiedrowicz
2012-02-11 23:30 ` Michał Kiedrowicz
2012-02-10 9:18 ` [PATCH 5/8] gitweb: Format diff lines just before printing Michał Kiedrowicz
2012-02-11 17:14 ` Jakub Narebski
2012-02-11 23:38 ` Michał Kiedrowicz
2012-02-10 9:18 ` [PATCH 6/8] gitweb: Highlight interesting parts of diff Michał Kiedrowicz
2012-02-10 13:23 ` Jakub Narebski
2012-02-10 14:15 ` Michał Kiedrowicz
2012-02-10 14:55 ` Jakub Narebski
2012-02-10 17:33 ` Michał Kiedrowicz
2012-02-10 22:52 ` Splitting gitweb (was: Re: [PATCH 6/8] gitweb: Highlight interesting parts of diff) Jakub Narebski
2012-02-10 20:24 ` [PATCH 6/8] gitweb: Highlight interesting parts of diff Jeff King
2012-02-14 6:54 ` Michal Kiedrowicz
2012-02-14 7:14 ` Junio C Hamano
2012-02-14 8:20 ` Jeff King
2012-02-10 20:20 ` Jeff King
2012-02-10 21:29 ` Michał Kiedrowicz
2012-02-10 21:32 ` Jeff King
2012-02-10 21:36 ` Michał Kiedrowicz
2012-02-10 21:47 ` [PATCH] diff-highlight: Work for multiline changes too Michał Kiedrowicz
2012-02-13 22:27 ` Jeff King
2012-02-13 22:28 ` [PATCH 1/5] diff-highlight: make perl strict and warnings fatal Jeff King
2012-02-13 22:32 ` [PATCH 2/5] diff-highlight: don't highlight whole lines Jeff King
2012-02-14 6:35 ` Michal Kiedrowicz
2012-02-13 22:33 ` [PATCH 3/5] diff-highlight: refactor to prepare for multi-line hunks Jeff King
2012-02-13 22:36 ` Jeff King [this message]
2012-02-13 22:37 ` [PATCH 5/5] diff-highlight: document some non-optimal cases Jeff King
2012-02-14 6:48 ` Michal Kiedrowicz
2012-02-14 0:05 ` [PATCH] diff-highlight: Work for multiline changes too Junio C Hamano
2012-02-14 0:22 ` Jeff King
2012-02-14 1:19 ` Junio C Hamano
2012-02-14 6:04 ` Jeff King
2012-02-14 6:28 ` Michal Kiedrowicz
2012-02-10 21:56 ` [PATCH 6/8] gitweb: Highlight interesting parts of diff Jakub Narebski
2012-02-11 23:45 ` Jakub Narebski
2012-02-12 10:42 ` Jakub Narebski
2012-02-13 6:54 ` Michal Kiedrowicz
2012-02-13 19:58 ` Jakub Narebski
2012-02-13 21:10 ` Michał Kiedrowicz
2012-02-13 6:41 ` Michal Kiedrowicz
2012-02-13 18:44 ` Jakub Narebski
2012-02-13 21:09 ` Michał Kiedrowicz
2012-02-14 17:31 ` Jakub Narebski
2012-02-14 18:23 ` Michał Kiedrowicz
2012-02-14 18:52 ` Jeff King
2012-02-14 20:04 ` Michał Kiedrowicz
2012-02-14 20:38 ` Jeff King
2012-02-10 9:18 ` [PATCH 7/8] gitweb: Use different colors to present marked changes Michał Kiedrowicz
2012-02-12 0:11 ` Jakub Narebski
2012-02-13 6:46 ` Michal Kiedrowicz
2012-02-10 9:18 ` [PATCH 8/8] gitweb: Highlight combined diffs Michał Kiedrowicz
2012-02-12 0:03 ` Jakub Narebski
2012-02-13 6:48 ` Michal Kiedrowicz
2012-02-11 18:32 ` [PATCH 0/8] gitweb: Highlight interesting parts of diff Jakub Narebski
2012-02-11 22:56 ` Michał Kiedrowicz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120213223636.GD19521@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=michal.kiedrowicz@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).