From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Palka <patrick@parcs.ath.cx>, git@vger.kernel.org
Subject: Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
Date: Fri, 19 Jun 2015 07:38:48 -0400
Message-ID: <20150619113847.GA31824@peff.net>
In-Reply-To: <20150619073455.GA29109@peff.net>

On Fri, Jun 19, 2015 at 03:34:55AM -0400, Jeff King wrote:

> And here's some more bad news. If you look at the diff for this
> patch itself, it's terribly unreadable (the regular diff already is
> pretty bad, but the highlights make it much worse). There are big chunks
> where we take away 5 or 10 lines from the old code, and replace them
> with totally unrelated lines. We end up highlighting almost the entire
> thing, except for spaces and punctuation.
>
> We might be able to solve this with a percentage heuristic similar to
> the one Patrick proposed. It's not really interesting to highlight
> unless we're doing it on probably 20% or less of the diff (where 20% is
> a number I just made up).

That turned out to be pretty easy; patch is below (on top of what I sent
earlier). I set the percentage at 50% based on eyeballing "git log -p"
in git.git, and it seems to give good results.

So I think the big remaining issue is improved tokenizing. Maybe Patrick
will want to take a stab at it.
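
To make the heuristic in the patch below easier to try in isolation, here
is a rough standalone sketch. It assumes each highlight is stored as a
pair of array refs whose first element is a character offset (start, then
end), which is how count_highlight() in the patch reads them. The names
highlight_ratio() and should_highlight(), the $pre/$post variables, and
the sample strings are made up for illustration and are not part of
diff-highlight. The zero-length guard is also an addition; the patch
divides unconditionally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the widths of highlighted regions. Arguments come in pairs of
# array refs: [start_offset, ...] followed by [end_offset, ...].
sub count_highlight {
	my $total = 0;
	while (@_ >= 2) {
		my $from = shift;
		my $to = shift;
		$total += $to->[0] - $from->[0];
	}
	return $total;
}

# Fraction of the combined pre- and post-image text that is highlighted.
# Returns 0 for empty input to avoid dividing by zero (an assumption of
# this sketch, not something the patch does).
sub highlight_ratio {
	my ($pre, $post, $highlight_pre, $highlight_post) = @_;
	my $total = length($pre) + length($post);
	return 0 unless $total;
	return (count_highlight(@$highlight_pre) +
		count_highlight(@$highlight_post)) / $total;
}

# Hypothetical helper: keep the highlights only when the ratio does not
# exceed the threshold (the patch drops them when $pct > 0.5).
sub should_highlight {
	my ($ratio, $threshold) = @_;
	return $ratio <= $threshold;
}

my $old = "foo(bar, baz);\n";
my $new = "foo(bar, qux);\n";

# "baz" and "qux" both occupy offsets 9 up to (not including) 12.
my @hl_old = ([9], [12]);
my @hl_new = ([9], [12]);

my $ratio = highlight_ratio($old, $new, \@hl_old, \@hl_new);
printf "%.2f %s\n", $ratio,
	should_highlight($ratio, 0.5) ? "highlight" : "skip";
```

Only the single changed word is highlighted in each line here, so the
ratio stays well under the threshold and the highlights are kept. A hunk
whose old and new sides share almost nothing would go over 50% and be
shown without highlights.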

---
diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
index 1525ccc..9454446 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -114,12 +114,32 @@ sub show_hunk {
 			if $bits & 2;
 	}
 
+	my $highlighted = count_highlight(@highlight_a) +
+			  count_highlight(@highlight_b);
+	my $total = length($a) + length($b);
+	my $pct = $highlighted / $total;
+
+	if ($pct > 0.5) {
+		@highlight_a = ();
+		@highlight_b = ();
+	}
+
 	# And now show the output both with the original stripped annotations,
 	# as well as our new highlights.
 	show_image($a, [merge_annotations(\@stripped_a, \@highlight_a)]);
 	show_image($b, [merge_annotations(\@stripped_b, \@highlight_b)]);
 }
 
+sub count_highlight {
+	my $total = 0;
+	while (@_) {
+		my $from = shift;
+		my $to = shift;
+		$total += $to->[0] - $from->[0];
+	}
+	return $total;
+}
+
 # Strip out any diff syntax (i.e., leading +/-), along with any ANSI color
 # codes from the pre- or post-image of a hunk. The result is a string of text
 # suitable for diffing against the other side of the hunk.