From: Yoshihiro Sugi <sugi1982@gmail.com>
To: git@vger.kernel.org
Cc: Yoshihiro Sugi <sugi1982@gmail.com>
Subject: Re: [PATCH] contrib/diff-highlight: multibyte characters diff
Date: Fri, 14 Feb 2014 04:14:16 +0900
Message-ID: <1392318856-55920-1-git-send-email-sugi1982@gmail.com>
In-Reply-To: <20140212205948.GA4453@sigill.intra.peff.net>

Thanks for reviewing.
As you wrote, the diff content may not be UTF-8 at all, and we don't know whether the user's terminal wants UTF-8.
I think your approach of trying a UTF-8 decode and falling back is better than my patch, and it works well.

Is using "$@" to catch the error, as in the patch below, what you had in mind?
According to perldoc Encode, encode/decode with "FB_CROAK" may destroy the original string. We should probably pass "LEAVE_SRC" in decode_utf8's second argument.
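
To illustrate the point about the second argument, here is a minimal standalone sketch. It is not part of the patch: try_decode and the sample byte strings are hypothetical names made up for this example. It assumes decode_utf8 croaks on malformed input when FB_CROAK is set, and that LEAVE_SRC keeps the source bytes intact.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper (illustration only): returns the decoded text, or
# undef if the bytes are not valid UTF-8, plus the source bytes as they
# are after the call.
sub try_decode {
    my $bytes = shift;
    my $text = eval {
        # FB_CROAK: die on malformed input instead of substituting.
        # LEAVE_SRC: do not modify $bytes while decoding.
        decode_utf8($bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return ($text, $bytes);
}

# One valid UTF-8 byte string and one Latin-1 string that is not UTF-8.
for my $sample ("caf\xc3\xa9", "caf\xe9 au lait") {
    my ($text, $after) = try_decode($sample);
    printf "%s: source still has %d bytes\n",
        defined $text ? "valid UTF-8" : "not UTF-8", length $after;
}
```

In both cases the source bytes can still be printed unchanged, which is what the fallback path in the patch relies on.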

---
 contrib/diff-highlight/diff-highlight | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
index c4404d4..0743851 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -2,6 +2,7 @@
 
 use warnings FATAL => 'all';
 use strict;
+use Encode qw(decode_utf8 encode_utf8);
 
 # Highlight by reversing foreground and background. You could do
 # other things like bold or underline if you prefer.
@@ -73,13 +74,23 @@ sub show_hunk {
 
 	my @queue;
 	for (my $i = 0; $i < @$a; $i++) {
-		my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
-		print $rm;
-		push @queue, $add;
+		my ($a_dec, $encode_rm) = decode($a->[$i]);
+		my ($b_dec, $encode_add) = decode($b->[$i]);
+		my ($rm, $add) = highlight_pair($a_dec, $b_dec);
+		print $encode_rm->($rm);
+		push @queue, $encode_add->($add);
 	}
 	print @queue;
 }
 
+sub decode {
+	my $orig = shift;
+	my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK | Encode::LEAVE_SRC) };
+	return $@ ?
+	       ($orig, sub { shift }) :
+	       ($decoded, sub { encode_utf8(shift) });
+}
+
 sub highlight_pair {
 	my @a = split_line(shift);
 	my @b = split_line(shift);
-- 
1.8.5.3
