From: "Norman Diamond" <ndiamond@wta.att.ne.jp>
To: "Eric W. Biederman" <ebiederm@xmission.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: UTF-8 practically vs. theoretically in the VFS API
Date: Mon, 23 Feb 2004 20:35:06 +0900
Message-ID: <015e01c3fa01$346bb0d0$34ee4ca5@DIAMONDLX60>

Eric W. Biederman wrote:

> First it is worth noting that the existing practice is that ttys
> always use the character set encoding of the user.

Each tty uses the character set encoding of that tty's user.  There were
times when I needed to have some tty windows open using EUC (ordinary work
on that Linux machine) and some tty windows open using SJIS (editing files
which would be sent to cellular telephones), in the same X session.  They
worked.

> Even X cut and paste frequently abuses the iso8859-1 range,

I'll take your word for it.  I've copied and pasted EUC strings, and I've
copied and pasted SJIS strings.  I don't know whether X copy and paste abused
the EUC or SJIS ranges, but it worked.

One thing I never thought of trying to test is to copy and paste between one
tty using EUC and one tty using SJIS.
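
For illustration only, here is a minimal sketch of why such a paste could go wrong if the bytes were passed through unconverted.  It assumes Python's built-in euc_jp and shift_jis codecs stand in for the two ttys; the sample string and variable names are hypothetical, and what X actually does with the selection is not modelled here.

```python
# The same three characters, as each tty would represent them in bytes.
text = "日本語"
euc_bytes = text.encode("euc_jp")       # what the EUC tty holds
sjis_bytes = text.encode("shift_jis")   # what the SJIS tty expects

# The byte sequences differ even though the characters are identical.
print(euc_bytes.hex(" "))
print(sjis_bytes.hex(" "))

# If the EUC bytes were pasted unconverted and read as SJIS, the result
# would not be the original text (invalid bytes are replaced, not raised).
misread = euc_bytes.decode("shift_jis", errors="replace")
print(misread == text)

# Converting through the character level recovers the right SJIS bytes.
converted = euc_bytes.decode("euc_jp").encode("shift_jis")
print(converted == sjis_bytes)
```

Whether a real paste behaves like the unconverted case or the converted case depends on how the selection is transferred, which this sketch does not attempt to answer.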

> Now the work is how to get multiple locales to play nicely with each
> other.  utf-8 and unicode are convenient for that as they preserve the
> existing assumptions that terminals, filenames, and text files are
> all using the same character set encoding, even when multiple locales
> are involved.
>
> So within one machine utf-8 solves the multiple locale problem.

That preserves a nice fiction.  If you depend on assuming that fiction,
you'll get useless results.

> The rule ``All data passing through a pseudo-tty is in the
> character set encoding specified by the locale of the owner of the
> tty'' seems both reasonable and no significant change from the current
> status quo.

Yes, that is a return to usability.

> On the wire between two machines I recommend passing unicode
> characters.

Why should the wire get a different encoding than the user set in the
pseudo-tty?  Consider TeraTerm.  The user tells TeraTerm what character set
is in use on the wire, which is the same as the character set in use on the
remote side (where sshd or whatever server provides the pseudo-tty).
TeraTerm converts between that and the local character set (where the
TeraTerm program and window and user get the character set decided for them
by someone in Sasazuka or Redmond).
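
The kind of conversion described above can be sketched roughly as follows.  This is not TeraTerm's code; it is a hypothetical Python illustration, assuming the user has declared EUC-JP on the wire and the local side uses Shift_JIS.  The class name WireTranscoder and its feed method are invented for the example.

```python
import codecs


class WireTranscoder:
    """Hypothetical helper: convert bytes arriving in the wire encoding
    into the local encoding, one chunk at a time."""

    def __init__(self, wire_encoding: str, local_encoding: str) -> None:
        # An incremental decoder buffers a multi-byte character that is
        # split across two reads instead of corrupting it.
        self._decoder = codecs.getincrementaldecoder(wire_encoding)(
            errors="replace"
        )
        self._local_encoding = local_encoding

    def feed(self, chunk: bytes) -> bytes:
        # Decode what is complete so far, then re-encode for the local side.
        text = self._decoder.decode(chunk)
        return text.encode(self._local_encoding, errors="replace")


# Usage: the remote pseudo-tty sends EUC-JP; the local window wants SJIS.
wire_data = "日本語".encode("euc_jp")
transcoder = WireTranscoder("euc_jp", "shift_jis")

# Deliberately split in the middle of the second character.
local_data = transcoder.feed(wire_data[:3]) + transcoder.feed(wire_data[3:])
print(local_data == "日本語".encode("shift_jis"))
```

The point of the sketch is that the wire carries whatever encoding the remote pseudo-tty uses, and only the terminal program at the edge needs to know both encodings.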

> By convention glibc stores unicode values in wchar_t.

That is hard to believe.  glibc existed before Unicode did, and wchar_t
existed before Unicode did.  I sure thought that glibc existed in Japan at
the time, but I could be wrong; I am not saying this is impossible, merely
hard to believe.  In commercial Unix systems, wchar_t held either EUC or
SJIS depending on the vendor.

As usual I do not even have time to keep up with this thread, so if you have
questions then please CC me personally, though I don't know if I'll have
time to investigate anything that needs it.

