git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: "Kazuhiro Kato via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Kazuhiro Kato <kazuhiro.kato@hotmail.co.jp>
Subject: Re: [PATCH 2/2] fix: when resolving merge conflicts, japanese file names become garbled.
Date: Wed, 19 Feb 2025 10:11:53 -0800
Message-ID: <xmqqh64pg5xy.fsf@gitster.g>
In-Reply-To: <c698805f088e0643e5faf027d4eaa6de14d6c1ff.1739918546.git.gitgitgadget@gmail.com> (Kazuhiro Kato via GitGitGadget's message of "Tue, 18 Feb 2025 22:42:26 +0000")

"Kazuhiro Kato via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Kazuhiro Kato <kazuhiro.kato@hotmail.co.jp>

Here is the place to give a bit more context: in what way the
current code is wrong, what end-user visible symptoms that causes,
what the correct way to implement it is, etc.

> Signed-off-by: Kazuhiro Kato <kazuhiro.kato@hotmail.co.jp>
> ---
>  gitk-git/gitk | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gitk-git/gitk b/gitk-git/gitk
> index 88951ed2384..f4f8dbd5fad 100755
> --- a/gitk-git/gitk
> +++ b/gitk-git/gitk
> @@ -8205,12 +8205,13 @@ proc parseblobdiffline {ids line} {
>  
>          if {$type eq "--cc"} {
>              # start of a new file in a merge diff
> -            set fname [string range $line 10 end]
> +            set fname_raw [string range $line 10 end]
> +            set fname [encoding convertfrom $fname_raw]

Is this a case of "Tcl reads what comes from git as a sequence of
bytes, not characters, so somebody needs to pass the bytes to the
'encoding' function to turn them into a sequence of characters"?
Unless everything is US-ASCII, that is.

If that is the case, presumably $line holds a sequence of bytes,
so it may be wrong to chop it at the 10th position (presumably that
is the 10th byte, not the 10th character) when we are trying to
teach the code to deal with non-ASCII data, no?

I am reasonably sure that [string length "diff --git"] is where 10
comes from, and that prefix will always be in ASCII, but it feels
safer and kosher if we converted the whole line first and then
chopped off the prefix.
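
A minimal sketch of that ordering, to make the suggestion concrete.
It is not the actual gitk change: the sample line, the variable names
line_raw and prefix, and the explicit utf-8 encoding name are all
assumptions for illustration.

```tcl
# Hypothetical stand-in for a line read from git as raw bytes:
# "diff --cc <name>" with a non-ASCII file name, encoded as UTF-8.
set line_raw [encoding convertto utf-8 "diff --cc \u65e5\u672c\u8a9e.txt"]

# Convert the whole line from bytes to characters first ...
set line [encoding convertfrom utf-8 $line_raw]

# ... and only then chop off the ASCII prefix, counting characters.
set prefix "diff --cc "
set fname [string range $line [string length $prefix] end]
```

Deriving the offset from the prefix string instead of hard-coding 10
keeps the intent visible.  The quoted patch calls "encoding
convertfrom" without an encoding name, which makes Tcl use its
system encoding; utf-8 is spelled out here only so that the sketch
behaves the same everywhere.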

The patch title says Japanese, but I would imagine this applies to
anything non-ASCII, so it would be better to retitle the patch to
say "non-ASCII" instead to signal that the issue the patch fixes
applies more widely.

Thanks.
