public inbox for linux-kernel@vger.kernel.org
From: Jamie Lokier <jamie@shareable.org>
To: Eduard Bloch <edi@gmx.de>
Cc: Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: JFS default behavior (was: UTF-8 in file systems? xfs/extfs/etc.)
Date: Mon, 16 Feb 2004 21:44:21 +0000
Message-ID: <20040216214421.GA18853@mail.shareable.org>
In-Reply-To: <20040216192228.GC15087@zombie.inka.de>

Eduard Bloch wrote:
> TERM specifies the general capabilities of the terminal. It does
> _not_ tell the application inside which FONT encoding is used, nor
> whether it is compatible with multibyte input.

It should - especially the multibyte encoding.

The font is irrelevant; our trouble here is *character encoding* which
has nothing to do with fonts.  Please don't use the incorrect term as
there is widespread confusion over it already.

That isn't just about which glyph is displayed in response to each
byte.  UTF-8 affects terminal escape sequence parsing, and also the
relationship between number of non-control bytes transmitted and the
distance moved by the cursor.

If I write a UTF-8 string to a VT220-like terminal (such as xterm
approximates), some text characters are interpreted as terminal
commands.  (Hint: 0x9b (which can occur in UTF-8 text) is equivalent
to 0x1b 0x5b, i.e. ESC [, the control sequence introducer; there are
others too).
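
To make the hint concrete, here is a minimal sketch that looks for C1 control bytes inside UTF-8 encoded text. The helper name c1_bytes and the choice of "Û" (U+00DB) as the sample character are illustrative assumptions, not something from the thread; the 7-bit form of CSI is ESC [ (0x1b 0x5b).

```python
# Illustration only: the helper name and the sample character are assumptions.
CSI_7BIT = b"\x1b["   # ESC [ : the 7-bit control sequence introducer
CSI_8BIT = b"\x9b"    # the single-byte C1 form of the same control


def c1_bytes(text: str) -> list[int]:
    """Return the bytes of text's UTF-8 encoding that fall in the C1 range 0x80-0x9f."""
    return [b for b in text.encode("utf-8") if 0x80 <= b <= 0x9F]


encoded = "Û".encode("utf-8")    # U+00DB encodes as 0xc3 0x9b
print(encoded.hex(" "))          # c3 9b
print(CSI_8BIT[0] in encoded)    # True: the second byte is the 8-bit CSI
print(c1_bytes("abc"))           # []: plain ASCII never produces C1 bytes
```

A terminal that treats its input as 8-bit VT220 data would see the second byte of that character as the start of a control sequence rather than as part of a letter.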

When you edit a line with the unix terminal line editor and type DEL,
it writes BACKSPACE-SPACE-BACKSPACE and removes one byte from the
input.  That utterly fails to do the right thing on UTF-8 terminals.
For example, run the command "cat" by itself, then type "£££", then
hit DEL twice - it will show one pound sterling sign.  Press enter,
and cat will echo the line containing _two_ pound sterling signs.
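
The "£££" experiment can be reproduced without a terminal. The sketch below is a simplified model, not the kernel's code: naive_line_editor and render_utf8_terminal are hypothetical names, and the renderer assumes every character occupies exactly one cell.

```python
BS, SP, DEL = 0x08, 0x20, 0x7F


def naive_line_editor(keystrokes: bytes) -> tuple[bytes, bytes]:
    """Byte-at-a-time erase: DEL drops one byte and echoes BS SP BS."""
    buffer = bytearray()   # what the reading program will receive
    echoed = bytearray()   # what is written back to the terminal
    for b in keystrokes:
        if b == DEL:
            if buffer:
                buffer.pop()
                echoed += bytes([BS, SP, BS])
        else:
            buffer.append(b)
            echoed.append(b)
    return bytes(buffer), bytes(echoed)


def render_utf8_terminal(echoed: bytes) -> str:
    """Rough UTF-8 terminal model: one cell per character, BS moves left one cell."""
    cells: list[str] = []
    col = 0
    for ch in echoed.decode("utf-8"):
        if ch == "\b":
            col = max(0, col - 1)
        else:
            if col == len(cells):
                cells.append(ch)
            else:
                cells[col] = ch
            col += 1
    return "".join(cells).rstrip()


keys = "£££".encode("utf-8") + bytes([DEL, DEL])
line, echo = naive_line_editor(keys)
print(render_utf8_terminal(echo))   # £   (one sign left on screen)
print(line.decode("utf-8"))         # ££  (two signs delivered to cat)
```

Each "£" is two bytes (0xc2 0xa3), so two DELs remove one whole character from the input while erasing two cells on screen.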

No setting of LANG or TERM makes that behave correctly.

So, do you think the kernel's line editor should be locale-aware too? :)

> > It is wrong that LANG must have a different value depending on whether
> > I log in using a DEC VT100 or a Gnome Terminal, even though I wish to
> > see exactly the same language, dialect, messages, number formats,
> > currency formats, dates and times.

NB: It's wrong because LANG should be for terminal-independent locale
properties, such as which languages I want to use and how I want text
files stored.

If I log into a remote machine, I want characters displayed according
to the local terminal's requirements, but I want text files and
filenames to use the remote machine's locale, naturally.

> Nonsense, sorry. How should your application know how to encode its
> output?

Increasingly I'm thinking UTF-8-ness should be a terminal capability,
like ocrnl.  The kernel's own line editor needs to know this property
anyway, and it would really help with moving filenames and everything
else over to UTF-8 - with no change to the simple unix programs such
as the shell utilities.
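
As a rough sketch of what a line editor could do once it knows the terminal is UTF-8: erase one whole character instead of one byte. The function name erase_last_char_utf8 is hypothetical, and this is an illustration in Python rather than anything the kernel implements in this form.

```python
def erase_last_char_utf8(buffer: bytearray) -> int:
    """Remove one whole UTF-8 character from the end of buffer.

    Returns the number of bytes removed (0 if the buffer was empty).
    """
    removed = 0
    # Drop trailing continuation bytes, which all look like 10xxxxxx.
    while buffer and (buffer[-1] & 0xC0) == 0x80:
        buffer.pop()
        removed += 1
    # Then drop the lead byte (or a plain ASCII byte).
    if buffer:
        buffer.pop()
        removed += 1
    return removed


buf = bytearray("£££".encode("utf-8"))
erase_last_char_utf8(buf)
erase_last_char_utf8(buf)
print(buf.decode("utf-8"))   # £
```

With this rule the editor would echo one BACKSPACE-SPACE-BACKSPACE per character erased, so the screen and the input stay in step (assuming single-width characters).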

> > It is especially wrong that libraries which should be
> > locale-independent - such as curses, slang and readline - must
> > read the LANG variable in addition to TERM.
> 
> See above. Especially since different chars are used to draw graphical
> characters (lines, boxes, ...), they _must_ know which font encoding
> they have to expect.

See "acsc" in the terminfo(5) database.  Line & box drawing characters
have been treated as a terminal capability for a long time.  Case made :)
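
For readers unfamiliar with "acsc": its value is a string of character pairs, where the first of each pair is the VT100 name of a graphic and the second is what this terminal needs to be sent for it. A minimal sketch of decoding such a string follows; parse_acsc and the sample value are illustrative assumptions, not taken from a real terminfo entry.

```python
def parse_acsc(acsc: str) -> dict[str, str]:
    """Split an acsc-style string into {vt100_name: terminal_char} pairs."""
    if len(acsc) % 2:
        raise ValueError("acsc must contain an even number of characters")
    return {acsc[i]: acsc[i + 1] for i in range(0, len(acsc), 2)}


# Hypothetical entry for a terminal with no line-drawing set at all:
# corners (j, k, l, m) become '+', horizontal line (q) '-', vertical line (x) '|'.
sample = "j+k+l+m+q-x|"
table = parse_acsc(sample)
print(table["q"], table["x"])   # - |
```

A library such as curses can look the mapping up per terminal, which is why it does not need to guess the encoding from LANG to draw a box.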

-- Jamie
