* StGit and charsets
@ 2008-08-04 14:21 Jon Smirl
2008-08-04 15:31 ` Karl Hasselström
From: Jon Smirl @ 2008-08-04 14:21 UTC
To: Git Mailing List, Karl Hasselström
Do you have tests in place to handle the names and comments in patches
being in different charsets? When I was working with the mailmap file
a large source of errors was from mangling names in alternate
charsets. I recall errors in Finnish, Japanese and Chinese names for
sure. I don't know which tools did the charset mangling.
I don't work much with international charsets. If someone is using
something like Russian or Finnish locally, is the metadata in the patch
converted to UTF8 before exporting or sending it as mail? Comments
should be in English, but people's names may need UTF8. And what about
email addresses, does DNS allow Unicode names now?
--
Jon Smirl
jonsmirl@gmail.com
* Re: StGit and charsets
From: Jon Smirl @ 2008-08-04 15:21 UTC
To: Karl Hasselström; +Cc: Git Mailing List
On 8/4/08, Karl Hasselström <kha@treskal.com> wrote:
> On 2008-08-04 10:21:20 -0400, Jon Smirl wrote:
> > I don't work much with international charsets. If someone is using
> > something like Russian or Finnish locally, is the metadata in the
> > patch converted to UTF8 before exporting or sending it as mail?
>
>
> I think what happens is that it's assumed to be utf8. No one has
> complained that their non-utf8 locale doesn't work, but my guess is
> that's because those people just haven't tried StGit yet.
This might be worth testing for Asian locales. If Asian locales make
it through OK, everything else probably will too.
I don't know enough about Unicode to decode the mangled names, but I
suspect they are still in the local charset and were never converted
to UTF-8. Of course, the name mangling may be coming from other
tools. There is no way to tell.
This problem happens in other ways in the kernel. There is a file in
the DRM code where two people's names have been inserted in different
charsets. If I set my editor to display one right the other is always
wrong. checkpatch probably needs to make sure that patches are
completely valid UTF8.
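A minimal sketch of that kind of check, in Python. This is not checkpatch's code (checkpatch is a Perl script); the function name `first_invalid_utf8` is made up for illustration. It reports the byte offset where a buffer stops being valid UTF-8, which is what happens when two names are stored in different charsets in one file.

```python
def first_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return err.start
    return None


# The same name stored twice in one buffer: once as UTF-8, once as Latin-1.
mixed = "Hasselström\n".encode("utf-8") + "Hasselström\n".encode("latin-1")

offset = first_invalid_utf8(mixed)
if offset is not None:
    print("not valid UTF-8; first bad byte at offset", offset)
```

A check like this only says that the bytes are not UTF-8. It cannot say which legacy charset they were written in.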
--
Jon Smirl
jonsmirl@gmail.com
* Re: StGit and charsets
From: Karl Hasselström @ 2008-08-04 15:31 UTC
To: Jon Smirl; +Cc: Git Mailing List
On 2008-08-04 10:21:20 -0400, Jon Smirl wrote:
> Do you have tests in place to handle the names and comments in
> patches being in different charsets?
There are some tests in t1800 and t1900 that use non-ascii names.
Might be others, but a quick grep didn't find them.
> I don't work much with international charsets. If someone is using
> something like Russian or Finnish locally, is the metadata in the
> patch converted to UTF8 before exporting or sending it as mail?
I think what happens is that it's assumed to be utf8. No one has
complained that their non-utf8 locale doesn't work, but my guess is
that's because those people just haven't tried StGit yet.
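To illustrate what "assumed to be utf8" means for someone on a non-UTF-8 locale, here is a hedged Python sketch. It is not StGit's actual code; `to_utf8` and its `local_charset` parameter are hypothetical. It keeps bytes that already decode as UTF-8 and otherwise re-encodes them from a charset the caller has to supply.

```python
def to_utf8(raw: bytes, local_charset: str) -> bytes:
    """Return raw as UTF-8 bytes, converting from local_charset if needed."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Assumption: the caller knows the real local charset.
        return raw.decode(local_charset).encode("utf-8")


# A Russian name as a KOI8-R locale would store it.
koi8_name = "Иван".encode("koi8-r")
utf8_name = to_utf8(koi8_name, "koi8-r")
print(utf8_name.decode("utf-8"))
```

The try-UTF-8-first step is a heuristic: some legacy byte sequences happen to be valid UTF-8, so a real tool would want the charset declared explicitly rather than guessed.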
> Comments should be in English, but people's names may need UTF8.
Mine does, for example. I started my StGit career by making sure it
could send out patches that didn't mangle my name. (That's a while
ago, though, so my memory's getting a bit rusty.)
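For reference, a non-ASCII name in a mail header is normally sent as an RFC 2047 encoded word. The sketch below uses Python's standard `email.header` module to show the round trip; it is an illustration only, not a description of how StGit builds its mails.

```python
from email.header import Header, decode_header, make_header

name = "Karl Hasselström"

# Encode the display name as an RFC 2047 encoded word (pure ASCII on the wire).
encoded = Header(name, "utf-8").encode()

# Decode it again the way a receiving mail client would.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```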
> And what about email addresses, does DNS allow Unicode names now?
I think it's up to each TLD whether they want to allow it. See e.g.
http://en.wikipedia.org/wiki/Internationalized_domain_name
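As a small illustration of how internationalized domain names work: DNS itself still carries ASCII, and the Unicode labels are mapped to an "xn--" form. The sketch below uses Python's built-in `idna` codec (which implements the older IDNA 2003 rules); the helper name `ascii_domain` and the example address are made up. It converts only the domain part of an address and leaves the local part alone.

```python
def ascii_domain(address: str) -> str:
    """Return the address with its domain part converted to ASCII (IDNA)."""
    local, _, domain = address.rpartition("@")
    # The stdlib "idna" codec converts each label to its xn-- form if needed.
    return local + "@" + domain.encode("idna").decode("ascii")


print(ascii_domain("kha@bücher.example"))
```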
--
Karl Hasselström, kha@treskal.com
www.treskal.com/kalle