From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Yoshihiro Sugi <sugi1982@gmail.com>
Subject: Re: [PATCH] contrib/diff-highlight: multibyte characters diff
Date: Tue, 11 Feb 2014 11:30:33 -0800
Message-ID: <xmqqioslphuu.fsf@gitster.dls.corp.google.com>
In-Reply-To: <1392109750-47852-1-git-send-email-sugi1982@gmail.com> (Yoshihiro Sugi's message of "Tue, 11 Feb 2014 18:09:10 +0900")

Yoshihiro Sugi <sugi1982@gmail.com> writes:

> Signed-off-by: Yoshihiro Sugi <sugi1982@gmail.com>
>
> diff-highlight splits each hunk and compares the lines as byte sequences,
> which causes problems when a diff hunk includes multibyte characters.
> This change makes such cases work by decoding the input and encoding the
> output as UTF-8 strings.
> ---
> contrib/diff-highlight/diff-highlight | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight
> index c4404d4..49b4f53 100755
> --- a/contrib/diff-highlight/diff-highlight
> +++ b/contrib/diff-highlight/diff-highlight
> @@ -2,6 +2,7 @@
>
>  use warnings FATAL => 'all';
>  use strict;
> +use Encode qw(decode_utf8 encode_utf8);
>
>  # Highlight by reversing foreground and background. You could do
>  # other things like bold or underline if you prefer.
> @@ -15,8 +16,9 @@ my @added;
>  my $in_hunk;
>
>  while (<>) {
> +	$_ = decode_utf8($_);
>  	if (!$in_hunk) {
> -		print;
> +		print encode_utf8($_);
>  		$in_hunk = /^$COLOR*\@/;
>  	}
>  	elsif (/^$COLOR*-/) {
> @@ -30,7 +32,7 @@ while (<>) {
>  		@removed = ();
>  		@added = ();
>
> -		print;
> +		print encode_utf8($_);
>  		$in_hunk = /^$COLOR*[\@ ]/;
>  	}
>
> @@ -58,7 +60,8 @@ sub show_hunk {
>
>  	# If one side is empty, then there is nothing to compare or highlight.
>  	if (!@$a || !@$b) {
> -		print @$a, @$b;
> +		print encode_utf8($_) for @$a;
> +		print encode_utf8($_) for @$b;
>  		return;
>  	}
>
> @@ -67,17 +70,18 @@ sub show_hunk {
>  	# stupid, and only handle multi-line hunks that remove and add the same
>  	# number of lines.
>  	if (@$a != @$b) {
> -		print @$a, @$b;
> +		print encode_utf8($_) for @$a;
> +		print encode_utf8($_) for @$b;
>  		return;
>  	}
>
>  	my @queue;
>  	for (my $i = 0; $i < @$a; $i++) {
>  		my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
> -		print $rm;
> +		print encode_utf8($rm);
>  		push @queue, $add;
>  	}
> -	print @queue;
> +	print encode_utf8($_) for @queue;
>  }
>
>  sub highlight_pair {
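
For illustration only, the byte-versus-character problem that the quoted
commit message describes can be reproduced with a minimal sketch. The helper
common_prefix_len and the sample strings are hypothetical and are not taken
from diff-highlight itself; the sketch assumes the input is valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Two lines that differ only in their last character.
# In UTF-8, "é" is the bytes 0xC3 0xA9 and "è" is 0xC3 0xA8,
# so the two characters share their first byte.
my $old_bytes = "caf\xC3\xA9\n";
my $new_bytes = "caf\xC3\xA8\n";

# Hypothetical helper: length of the common prefix of two strings,
# counted in whatever units the strings hold (bytes for raw input,
# characters once the input has been decoded).
sub common_prefix_len {
    my ($x, $y) = @_;
    my @x = split //, $x;
    my @y = split //, $y;
    my $i = 0;
    $i++ while $i < @x && $i < @y && $x[$i] eq $y[$i];
    return $i;
}

# Comparing raw bytes: the prefix ends in the middle of a character,
# so a highlight starting there would split the multibyte sequence.
my $byte_prefix = common_prefix_len($old_bytes, $new_bytes);

# Comparing decoded strings: the prefix ends on a character boundary.
my $char_prefix = common_prefix_len(decode_utf8($old_bytes),
                                    decode_utf8($new_bytes));

print "byte prefix: $byte_prefix\n";   # 4
print "char prefix: $char_prefix\n";   # 3

# Encoding again before output restores the original byte sequence.
my $round_trip = encode_utf8(decode_utf8($old_bytes));
print "round trip ok\n" if $round_trip eq $old_bytes;
```

With raw bytes the common prefix is 4 units long and ends after 0xC3, inside
the accented character; after decoding it is 3 characters long and ends
cleanly before it, which is the boundary a highlighter would want.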