git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Palka <patrick@parcs.ath.cx>, git@vger.kernel.org
Subject: Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
Date: Fri, 19 Jun 2015 07:38:48 -0400	[thread overview]
Message-ID: <20150619113847.GA31824@peff.net> (raw)
In-Reply-To: <20150619073455.GA29109@peff.net>

On Fri, Jun 19, 2015 at 03:34:55AM -0400, Jeff King wrote:

> And here's some more bad news. If you look at the diff for this
> patch itself, it's terribly unreadable (the regular diff already is
> pretty bad, but the highlights make it much worse). There are big chunks
> where we take away 5 or 10 lines from the old code, and replace them
> with totally unrelated lines. We end up highlighting almost the entire
> thing, except for spaces and punctuation.
> 
> We might be able to solve this with a percentage heuristic similar to
> the one Patrick proposed. It's not really interesting to highlight
> unless we're doing it on probably 20% or less of the diff (where 20% is
> a number I just made up).

That turned out to be pretty easy; patch is below (on top of what I sent
earlier). I set the percentage at 50% based on eyeballing "git log -p"
in git.git, and it seems to give good results.

So I think the big remaining issue is improved tokenizing. Maybe Patrick
will want to take a stab at it.

---
diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
index 1525ccc..9454446 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -114,12 +114,32 @@ sub show_hunk {
 			if $bits & 2;
 	}
 
+	my $highlighted = count_highlight(@highlight_a) +
+			  count_highlight(@highlight_b);
+	my $total = length($a) + length($b);
+	my $pct = $highlighted / $total;
+
+	if ($pct > 0.5) {
+		@highlight_a = ();
+		@highlight_b = ();
+	}
+
 	# And now show the output both with the original stripped annotations,
 	# as well as our new highlights.
 	show_image($a, [merge_annotations(\@stripped_a, \@highlight_a)]);
 	show_image($b, [merge_annotations(\@stripped_b, \@highlight_b)]);
 }
 
+sub count_highlight {
+	my $total = 0;
+	while (@_) {
+		my $from = shift;
+		my $to = shift;
+		$total += $to->[0] - $from->[0];
+	}
+	return $total;
+}
+
 # Strip out any diff syntax (i.e., leading +/-), along with any ANSI color
 # codes from the pre- or post-image of a hunk. The result is a string of text
 # suitable for diffing against the other side of the hunk.

  reply	other threads:[~2015-06-19 11:38 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-15 17:20 [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks Patrick Palka
2015-06-18 15:50 ` Junio C Hamano
2015-06-18 16:28   ` Patrick Palka
2015-06-18 18:08     ` Junio C Hamano
2015-06-18 19:04       ` Jeff King
2015-06-18 20:14         ` Patrick Palka
2015-06-18 20:45           ` Jeff King
2015-06-18 21:23             ` Jeff King
2015-06-18 21:39               ` Junio C Hamano
2015-06-18 22:25               ` Patrick Palka
2015-06-19  3:54               ` Jeff King
2015-06-19  4:49                 ` Junio C Hamano
2015-06-19  5:32                   ` Jeff King
2015-06-19  7:34                     ` Jeff King
2015-06-19 11:38                       ` Jeff King [this message]
2015-06-19 17:20                     ` Junio C Hamano
2015-06-18 23:06             ` Patrick Palka
2015-06-18 20:23       ` Patrick Palka
2015-06-18 19:08     ` Jeff King
2015-06-18 20:27       ` Patrick Palka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150619113847.GA31824@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=patrick@parcs.ath.cx \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).