From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Yoshihiro Sugi <sugi1982@gmail.com>
Subject: Re: [PATCH] contrib/diff-highlight: multibyte characters diff
Date: Tue, 11 Feb 2014 11:30:33 -0800
Message-ID: <xmqqioslphuu.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1392109750-47852-1-git-send-email-sugi1982@gmail.com> (Yoshihiro Sugi's message of "Tue, 11 Feb 2014 18:09:10 +0900")

Yoshihiro Sugi <sugi1982@gmail.com> writes:

> Signed-off-by: Yoshihiro Sugi <sugi1982@gmail.com>
>
> diff-highlight splits each hunk and compares them as byte sequences,
> which causes problems when a hunk includes multibyte characters.
> This change makes such cases work by decoding the input and encoding
> the output as UTF-8 strings.
> ---
>  contrib/diff-highlight/diff-highlight | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
> index c4404d4..49b4f53 100755
> --- a/contrib/diff-highlight/diff-highlight
> +++ b/contrib/diff-highlight/diff-highlight
> @@ -2,6 +2,7 @@
>  
>  use warnings FATAL => 'all';
>  use strict;
> +use Encode qw(decode_utf8 encode_utf8);
>  
>  # Highlight by reversing foreground and background. You could do
>  # other things like bold or underline if you prefer.
> @@ -15,8 +16,9 @@ my @added;
>  my $in_hunk;
>  
>  while (<>) {
> +	$_ = decode_utf8($_);
>  	if (!$in_hunk) {
> -		print;
> +		print encode_utf8($_);
>  		$in_hunk = /^$COLOR*\@/;
>  	}
>  	elsif (/^$COLOR*-/) {
> @@ -30,7 +32,7 @@ while (<>) {
>  		@removed = ();
>  		@added = ();
>  
> -		print;
> +		print encode_utf8($_);
>  		$in_hunk = /^$COLOR*[\@ ]/;
>  	}
>  
> @@ -58,7 +60,8 @@ sub show_hunk {
>  
>  	# If one side is empty, then there is nothing to compare or highlight.
>  	if (!@$a || !@$b) {
> -		print @$a, @$b;
> +		print encode_utf8($_) for @$a;
> +		print encode_utf8($_) for @$b;
>  		return;
>  	}
>  
> @@ -67,17 +70,18 @@ sub show_hunk {
>  	# stupid, and only handle multi-line hunks that remove and add the same
>  	# number of lines.
>  	if (@$a != @$b) {
> -		print @$a, @$b;
> +		print encode_utf8($_) for @$a;
> +		print encode_utf8($_) for @$b;
>  		return;
>  	}
>  
>  	my @queue;
>  	for (my $i = 0; $i < @$a; $i++) {
>  		my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
> -		print $rm;
> +		print encode_utf8($rm);
>  		push @queue, $add;
>  	}
> -	print @queue;
> +	print encode_utf8($_) for @queue;
>  }
>  
>  sub highlight_pair {
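
To illustrate the problem the commit message describes, here is a minimal
sketch that is not part of the patch. It assumes two hypothetical sample
strings and a hypothetical helper, common_prefix_len, and uses only
decode_utf8 from the Encode module that the patch already imports. It
compares the same pair of UTF-8 strings once as raw bytes and once as
decoded characters:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical sample data: two UTF-8 encoded byte strings that differ
# only in their last character.
my $old = "caf\xc3\xa9";    # "cafe" with e-acute, encoded as bytes C3 A9
my $new = "caf\xc3\xa8";    # "cafe" with e-grave, encoded as bytes C3 A8

# Hypothetical helper: length of the common prefix of two strings,
# compared element by element (bytes for byte strings, characters for
# decoded strings).
sub common_prefix_len {
    my ($x, $y) = @_;
    my @x = split //, $x;
    my @y = split //, $y;
    my $n = 0;
    $n++ while $n < @x && $n < @y && $x[$n] eq $y[$n];
    return $n;
}

my $byte_prefix = common_prefix_len($old, $new);
my $char_prefix = common_prefix_len(decode_utf8($old), decode_utf8($new));

print "byte-wise common prefix: $byte_prefix\n";    # prints 4
print "char-wise common prefix: $char_prefix\n";    # prints 3
```

Compared as bytes, the common prefix is 4 elements long and ends after the
shared lead byte C3, so a highlight escape sequence placed at that boundary
would land in the middle of a multibyte character. Compared as decoded
characters, the prefix is 3 elements long and the boundary falls just
before the character that actually changed.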
