From: Eduard Bloch <edi@gmx.de>
To: Jamie Lokier <jamie@shareable.org>
Cc: Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: JFS default behavior (was: UTF-8 in file systems? xfs/extfs/etc.)
Date: Mon, 16 Feb 2004 15:03:38 +0100
Message-ID: <20040216140338.GA2927@zombie.inka.de>
In-Reply-To: <20040215010150.GA3611@mail.shareable.org>
#include <hallo.h>
* Jamie Lokier [Sun, Feb 15 2004, 01:01:50AM]:
> > Then you have something wrong in the shell configuration of the remote
> > machine. I do not see any problems in having a ssh shell opened from a
> > UTF-8 terminal to a machine where the shell environment is also
> > configured to use UTF-8 environment.
>
> Of course that's fine. What goes wrong is when you connect to that
> same machine from another terminal which is not UTF-8.
>
> There are in fact two different problems, and you have ignored them both :)
>
> Firstly, "ls", editors, filenames:
>
> The shell configuration is irrelevant. If I create a file name
> like "£100.txt" (that's POUND followed by "100.txt") when I'm
Sure, sure, I can read it since I use UTF-8 too.
> connected from a UTF-8 terminal, it creates a filename encoded in
> UTF-8 and displays it fine.
>
> If I then log in to the same machine from another terminal which
> displays latin1, then "ls" will _not_ display the name correctly
> _regardless_ of shell or locale configuration.
I know what you mean, and that is why I already proposed a radical
solution. Let me repeat it:
- convert all files from the previous charset to UTF-8 overnight.
  If the previous charset was unknown, first make sure that you can
  guess it for all users, and contact users that have files with
  suspicious filenames (e.g. not convertible from Latin1). Or look
  through their shell/X config files (*)
- in libc, implement a recoding function to convert file names from
  the LC_CTYPE charset to the underlying UTF-8 encoding
Done.
(*) There is no other way. Linux developers ignored the diversity of
charsets/encodings for many years, and now the needed information is
lost (it is not stored anywhere in the filesystem).
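A minimal sketch of the overnight conversion step, written in Python
rather than C for brevity. The names to_utf8_name and plan_renames are
hypothetical, and Latin1 as the previous charset is only an assumption
taken from the example in this thread. The sketch only plans renames;
it does not touch the filesystem.

```python
import os


def to_utf8_name(raw: bytes, assumed_charset: str = "latin-1"):
    """Classify one raw file name and return (new_name, status).

    Assumption: names that are not valid UTF-8 were written in
    `assumed_charset`. A legacy name whose bytes happen to form valid
    UTF-8 is indistinguishable here and is left alone.
    """
    try:
        raw.decode("utf-8")
        return raw, "already-utf8"      # pure ASCII also lands here
    except UnicodeDecodeError:
        pass
    try:
        text = raw.decode(assumed_charset)
    except UnicodeDecodeError:
        return raw, "suspicious"
    # Latin1 decodes every byte, so treat control and other
    # non-printable characters as a sign that the guess was wrong.
    if not text.isprintable():
        return raw, "suspicious"
    return text.encode("utf-8"), "converted"


def plan_renames(directory: bytes):
    """Return (renames, suspicious) for the entries of one directory."""
    renames, suspicious = [], []
    for raw in os.listdir(directory):   # bytes path in, bytes names out
        new, status = to_utf8_name(raw)
        if status == "converted":
            renames.append((raw, new))
        elif status == "suspicious":
            suspicious.append(raw)
    return renames, suspicious
```

The "suspicious" list is what would be handed to a human: those are
the users to contact, or the names to check against their shell/X
config files.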
> If I then create a file called "£100.txt" (same name) using the
> terminal which displays latin1, it creates a filename encoded in
> latin1.
Of course. That is why the conversion should be done in userspace
(libc). The kernel itself does not know about the locale in use.
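To illustrate the recoding idea, here is a rough sketch in Python. The
helper names name_to_disk and name_from_disk are hypothetical. A real
libc version would presumably take the charset from
nl_langinfo(CODESET) and do the conversion with iconv, whereas here
the charset is simply passed in.

```python
def name_to_disk(name_in_locale: bytes, locale_charset: str) -> bytes:
    """Recode a name typed in the user's charset to UTF-8 for storage."""
    return name_in_locale.decode(locale_charset).encode("utf-8")


def name_from_disk(name_on_disk: bytes, locale_charset: str) -> bytes:
    """Recode a stored UTF-8 name for a terminal using locale_charset.

    Characters the terminal charset cannot represent become "?", so
    this direction is lossy and only suitable for display.
    """
    text = name_on_disk.decode("utf-8")
    return text.encode(locale_charset, errors="replace")


# The same visible name typed on two different terminals...
from_latin1_terminal = "£100.txt".encode("latin-1")   # b"\xa3100.txt"
from_utf8_terminal = "£100.txt".encode("utf-8")       # b"\xc2\xa3100.txt"

# ...ends up as one and the same name on disk.
same_on_disk = (
    name_to_disk(from_latin1_terminal, "latin-1")
    == name_to_disk(from_utf8_terminal, "utf-8")
)
```

The lossy "?" replacement shows the weak spot of the scheme: a name
that cannot be shown in the current charset cannot be typed back in
either, so such files would need some escape notation to stay
reachable from that terminal.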
> Unfortunately, to be compatible with shell utilities, programs like
> Mutt and Emacs which _are_ aware of the display and input encodings
> will use the current terminal's encoding when accessing the
That is the correct way, though.
> filesystem. So even those programs create file names with the
> wrong encoding, if you log in from the wrong kind of terminal.
It is the _right_ encoding at the moment they create it.
> Secondly, message locale and the shell:
>
> There is no mechanism for SSH to convey which character encoding
> the remote machine must use for displaying and inputting text, yet
> client terminals come in different flavours. That is the problem.
>
> (On my laptop, for example, which is a standard RH9, Gnome terminal
> windows are UTF-8 but console is latin1). These are both fine
> locally. There is no configuration on a remote machine which is right
> for both of them, though.)
Yup, I know that problem. To at least display them correctly, you can
run unicode_start (to enable the console's own conversion), which
sucks when the chars come from completely different language groups,
e.g. Latin and Cyrillic. Alternatively, I used dynafont for a while,
which worked well for displaying characters.
> I think this is because the character encoding used by the terminal
> should be in the TERM environment variable, but it is in LANG instead.
No. TERM does not have anything to do with locales (LANG).
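A small illustration of where the charset information comes from. This
is a sketch only: charset_from_env is a made-up name, and it assumes
the usual language[_territory][.codeset][@modifier] form of locale
names. Real libc also resolves locale names that carry no explicit
codeset, which this sketch does not attempt.

```python
def charset_from_env(env):
    """Return the codeset named by the locale variables, or None.

    Precedence follows POSIX: LC_ALL, then LC_CTYPE, then LANG.
    An empty value counts as unset. TERM is never consulted.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            break
    else:
        return None                      # no locale variable set
    value = value.split("@", 1)[0]       # drop a modifier such as @euro
    if "." in value:
        return value.split(".", 1)[1]    # the part after the dot
    return None                          # no explicit codeset


# Two logins that differ only in TERM get the same answer.
gnome = {"TERM": "xterm", "LANG": "en_GB.UTF-8"}
console = {"TERM": "linux", "LANG": "en_GB.UTF-8"}
same_charset = charset_from_env(gnome) == charset_from_env(console)
```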
Regards,
Eduard.
--
Selflessness is fully matured egoism.
-- Herbert Spencer