* [PATCH] gitk: UTF-8 support
@ 2005-11-23 4:15 Pavel Roskin
2005-11-24 0:47 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Pavel Roskin @ 2005-11-23 4:15 UTC (permalink / raw)
To: git, Paul Mackerras
Add gitencoding variable and set it to "utf-8". Use it for converting
git-rev-list output.
Signed-off-by: Pavel Roskin <proski@gnu.org>
diff --git a/gitk b/gitk
index 3dd97e2..e53b609 100755
--- a/gitk
+++ b/gitk
@@ -19,7 +19,7 @@ proc gitdir {} {
proc getcommits {rargs} {
global commits commfd phase canv mainfont env
global startmsecs nextupdate ncmupdate
- global ctext maincursor textcursor leftover
+ global ctext maincursor textcursor leftover gitencoding
# check that we can find a .git directory somewhere...
set gitdir [gitdir]
@@ -49,7 +49,7 @@ proc getcommits {rargs} {
exit 1
}
set leftover {}
- fconfigure $commfd -blocking 0 -translation lf
+ fconfigure $commfd -blocking 0 -translation lf -encoding $gitencoding
fileevent $commfd readable [list getcommitlines $commfd]
$canv delete all
$canv create text 3 3 -anchor nw -text "Reading commits..." \
@@ -3657,6 +3657,7 @@ set datemode 0
set boldnames 0
set diffopts "-U 5 -p"
set wrcomcmd "git-diff-tree --stdin -p --pretty"
+set gitencoding "utf-8"
set mainfont {Helvetica 9}
set textfont {Courier 9}
--
Regards,
Pavel Roskin
* Re: [PATCH] gitk: UTF-8 support
2005-11-23 4:15 [PATCH] gitk: UTF-8 support Pavel Roskin
@ 2005-11-24 0:47 ` Junio C Hamano
2005-11-24 1:11 ` Paul Mackerras
2005-11-24 4:53 ` Pavel Roskin
0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2005-11-24 0:47 UTC (permalink / raw)
To: Pavel Roskin, Paul Mackerras; +Cc: git
Pavel Roskin <proski@gnu.org> writes:
> Add gitencoding variable and set it to "utf-8". Use it for converting
> git-rev-list output.
Sounds good, but is it necessary? Unless I am grossly mistaken,
I am opposed to this patch.
When I run gitk with LANG and/or LC_CTYPE set to ja_JP.utf8 (I
suspect *whatever*.utf8 would work the same way) on git.git
repository, I see Lukas's name (originally in iso8859-1 but my
commit objects have it in utf8) and Yoshifuji-san's name
(iso2022 converted to utf8) just fine.
And when I run gitk with LANG and/or LC_CTYPE set to ja_JP.ujis
(that is another name for EUC-JP) on a toy repository I have
commit log messages in EUC-JP (I am not recommending that, just
pointing out a possibility), I can see them just fine. In that
test repository, setting locale to *.utf8 would not work.
So I suspect your change breaks projects that use local
encodings, without fixing or adding anything new.
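[Editor's illustration, not part of the original message.] The breakage described above can be reproduced outside gitk. This is a minimal Python sketch, assuming a hypothetical sample string and the helper name decode_log; it only shows that bytes written in EUC-JP decode correctly as EUC-JP and turn into replacement characters when the reader insists on UTF-8.

```python
# Hypothetical sample text standing in for an EUC-JP commit message.
message = "日本語"
raw = message.encode("euc_jp")  # the bytes as they would sit in the commit object


def decode_log(data: bytes, encoding: str) -> str:
    """Decode log bytes, substituting U+FFFD for bytes that do not fit."""
    return data.decode(encoding, errors="replace")


as_eucjp = decode_log(raw, "euc_jp")  # reader agrees with the repository
as_utf8 = decode_log(raw, "utf-8")    # reader hardcodes UTF-8

print(as_eucjp == message)
print(as_utf8 == message)
```

The first comparison holds and the second does not, which is the failure mode a hardcoded encoding would introduce for such a repository.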
* Re: [PATCH] gitk: UTF-8 support
2005-11-24 0:47 ` Junio C Hamano
@ 2005-11-24 1:11 ` Paul Mackerras
2005-11-24 4:53 ` Pavel Roskin
1 sibling, 0 replies; 8+ messages in thread
From: Paul Mackerras @ 2005-11-24 1:11 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Pavel Roskin, git
Junio C Hamano writes:
> Pavel Roskin <proski@gnu.org> writes:
>
> > Add gitencoding variable and set it to "utf-8". Use it for converting
> > git-rev-list output.
>
> Sounds good, but is it necessary? Unless I am grossly mistaken,
> I am opposed to this patch.
I already put this into my gitk.git repository, so you might want to
hold off pulling that into git.git until we work this out.
Being clueless about i18n and locales, I tend to just take whatever
people tell me is needed in that area. :)
Regards,
Paul.
* Re: [PATCH] gitk: UTF-8 support
2005-11-24 0:47 ` Junio C Hamano
2005-11-24 1:11 ` Paul Mackerras
@ 2005-11-24 4:53 ` Pavel Roskin
2005-11-24 6:23 ` Junio C Hamano
1 sibling, 1 reply; 8+ messages in thread
From: Pavel Roskin @ 2005-11-24 4:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Paul Mackerras, git
Hello!
Quoting Junio C Hamano <junkio@cox.net>:
> Pavel Roskin <proski@gnu.org> writes:
>
> > Add gitencoding variable and set it to "utf-8". Use it for converting
> > git-rev-list output.
>
> Sounds good, but is it necessary? Unless I am grossly mistaken,
> I am opposed to this patch.
>
> When I run gitk with LANG and/or LC_CTYPE set to ja_JP.utf8 (I
> suspect *whatever*.utf8 would work the same way) on git.git
> repository, I see Lukas's name (originally in iso8859-1 but my
> commit objects have it in utf8) and Yoshifuji-san's name
> (iso2022 converted to utf8) just fine.
I see. I always use C locale.
> And when I run gitk with LANG and/or LC_CTYPE set to ja_JP.ujis
> (that is another name for EUC-JP) on a toy repository I have
> commit log messages in EUC-JP (I am not recommending that, just
> pointing out a possibility), I can see them just fine. In that
> test repository, setting locale to *.utf8 would not work.
Then what would you do to work with a repository using utf-8 if the current
locale is not utf-8?
> So I suspect your change breaks projects that use local
> encodings, without fixing or adding anything new.
I'll be away from any sane OS until Monday, but I assume my patch should help
those whose locale is set to an encoding other than utf-8 if they want to use a
repository using utf-8.
Anyway, I see your point. Not every git repository uses utf-8. It is not
enforced by git.
So we have two solutions. One is to enforce utf-8 locale in git and force all
"offenders" to convert. That's probably too intrusive for now.
The other solution is to have a publicly available file under .git that would
keep the encoding name for the metadata (user names, logs etc). gitk could use
that file now. git could implement conversions a bit later. Maybe git could
warn about mismatching encoding as the first step.
--
Regards,
Pavel Roskin
* Re: [PATCH] gitk: UTF-8 support
2005-11-24 4:53 ` Pavel Roskin
@ 2005-11-24 6:23 ` Junio C Hamano
2005-11-24 7:12 ` Pavel Roskin
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2005-11-24 6:23 UTC (permalink / raw)
To: Pavel Roskin; +Cc: Paul Mackerras, git
Pavel Roskin <proski@gnu.org> writes:
> Then what would you do to work with a repository using utf-8 if the current
> locale is not utf-8?
Obviously the same way as I did to try things out:
$ LANG=en_US.utf8 DISPLAY=:0 gitk blah
> Anyway, I see your point. Not every git repository uses utf-8. It is not
> enforced by git.
That is not the point. The point is that I think the user can use
LANG and LC_ALL (I suspect LC_CTYPE is what matters) to get what
you want, and I suspect hardcoding utf8 robs users of the
possibility to deal with a repository that uses something else.
And as I suggested in another message (in the died-out thread
about gitweb), we could have i18n.commitEncoding in the
configuration to help gitk and gitweb. I think that is the same
as your "other option".
* Re: [PATCH] gitk: UTF-8 support
2005-11-24 6:23 ` Junio C Hamano
@ 2005-11-24 7:12 ` Pavel Roskin
2005-11-28 0:12 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Pavel Roskin @ 2005-11-24 7:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Paul Mackerras, git
Quoting Junio C Hamano <junkio@cox.net>:
> That is not the point. The point is that I think the user can use
> LANG and LC_ALL (I suspect LC_CTYPE is what matters) to get what
> you want, and I suspect hardcoding utf8 robs users of the
> possibility to deal with a repository that uses something else.
Not to argue with you, but it's worth pointing out that git is heavily multiuser
software, and interoperability should not be ranked below local configurability.
> And as I suggested in another message (in the died-out thread
> about gitweb), we could have i18n.commitEncoding in the
> configuration to help gitk and gitweb. I think that is the same
> as your "other option".
Yes. Then my patch needs to be changed to set encoding to that setting and only
if it's present.
--
Regards,
Pavel Roskin
* Re: [PATCH] gitk: UTF-8 support
2005-11-24 7:12 ` Pavel Roskin
@ 2005-11-28 0:12 ` Junio C Hamano
2005-11-28 21:55 ` Pavel Roskin
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2005-11-28 0:12 UTC (permalink / raw)
To: Pavel Roskin; +Cc: Paul Mackerras, git
Pavel Roskin <proski@gnu.org> writes:
>> And as I suggested in another message (in the died-out thread
>> about gitweb), we could have i18n.commitEncoding in the
>> configuration to help gitk and gitweb. I think that is the same
>> as your "other option".
>
> Yes. Then my patch needs to be changed to set encoding to that setting and only
> if it's present.
The following patch on top of your patch has seen only very
light testing, but it seems to do the right thing for my
repository with utf-8 commit messages and another with euc-jp
commit messages. For the latter, I needed to do:
$ git-repo-config i18n.commitencoding euc-jp
-- >8 --
[PATCH] gitk: Use i18n.commitencoding configuration item.
Hardcoding "utf-8" in the script breaks projects that use local
encoding, so allow setting i18n.commitEncoding.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/gitk b/gitk
index b53a5c5..2242216 100755
--- a/gitk
+++ b/gitk
@@ -3669,7 +3669,14 @@ set datemode 0
set boldnames 0
set diffopts "-U 5 -p"
set wrcomcmd "git-diff-tree --stdin -p --pretty"
-set gitencoding "utf-8"
+
+set gitencoding ""
+catch {
+ set gitencoding [exec git-repo-config --get i18n.commitencoding]
+}
+if {$gitencoding == ""} {
+ set gitencoding "utf-8"
+}
set mainfont {Helvetica 9}
set textfont {Courier 9}
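[Editor's illustration, not part of the original message.] The lookup-with-fallback logic in the patch above (try the configuration, ignore failures, default to utf-8 when nothing is set) can be sketched in Python as follows. The function names read_git_config and resolve_encoding are hypothetical, and the sketch calls `git config --get`, the present-day spelling of git-repo-config; it narrows the Tcl `catch` to the two failures that matter here.

```python
import subprocess
from typing import Callable


def read_git_config(key: str = "i18n.commitencoding") -> str:
    """Return the configured value; raises if git is missing or the key is unset."""
    result = subprocess.run(
        ["git", "config", "--get", key],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit (for example an unset key) raises CalledProcessError
    )
    return result.stdout.strip()


def resolve_encoding(
    reader: Callable[[], str] = read_git_config,
    default: str = "utf-8",
) -> str:
    """Mirror the patch: swallow lookup failures, fall back when the value is empty."""
    try:
        value = reader()
    except (OSError, subprocess.CalledProcessError):
        value = ""
    return value or default
```

Passing the reader as a parameter is only a convenience for trying the fallback without a repository, for example `resolve_encoding(lambda: "euc-jp")`.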
* Re: [PATCH] gitk: UTF-8 support
2005-11-28 0:12 ` Junio C Hamano
@ 2005-11-28 21:55 ` Pavel Roskin
0 siblings, 0 replies; 8+ messages in thread
From: Pavel Roskin @ 2005-11-28 21:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Paul Mackerras, git
On Sun, 2005-11-27 at 16:12 -0800, Junio C Hamano wrote:
> The following patch on top of your patch has seen only very
> light testing, but it seems to do the right thing for my
> repository with utf-8 commit messages and another with euc-jp
> commit messages. For the latter, I needed to do:
>
> $ git-repo-config i18n.commitencoding euc-jp
Thank you! I'm glad my code was actually useful.
Now we may want to make git recode commit data to the repository
encoding. Lots of iconv/glibc/portability fun on the horizon. At the very
least git should protest if both i18n.commitencoding and the locale
encoding are set to different values.
Another issue is support for non-ASCII characters in the filenames.
--
Regards,
Pavel Roskin
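[Editor's illustration, not part of the original message.] The mismatch check suggested in the message above might look roughly like this in Python. The names same_encoding and encoding_warning and the warning text are hypothetical; nothing here is an existing git feature. Encoding names are compared through the codec registry so that spellings such as "utf8" and "UTF-8" count as equal.

```python
import codecs
import locale
from typing import Optional


def same_encoding(a: str, b: str) -> bool:
    """Compare two encoding names after normalising them via the codec registry.

    codecs.lookup raises LookupError for names Python does not know.
    """
    return codecs.lookup(a).name == codecs.lookup(b).name


def encoding_warning(commit_encoding: str,
                     locale_encoding: Optional[str] = None) -> Optional[str]:
    """Return a warning string when the two encodings differ, else None."""
    if locale_encoding is None:
        # Use the encoding implied by the current locale settings.
        locale_encoding = locale.getpreferredencoding(False)
    if same_encoding(commit_encoding, locale_encoding):
        return None
    return (f"warning: i18n.commitencoding is {commit_encoding} "
            f"but the locale encoding is {locale_encoding}")
```

For example, `encoding_warning("euc-jp", "utf-8")` yields a message, while `encoding_warning("utf-8", "UTF-8")` yields None.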