public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Marc Lehmann <pcg@schmorp.de>
To: Linus Torvalds <torvalds@osdl.org>
Cc: John Bradford <john@grabjohn.com>,
	Jeff Garzik <jgarzik@pobox.com>,
	viro@parcelfarce.linux.theplanet.co.uk,
	Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: UTF-8 practically vs. theoretically in the VFS API
Date: Mon, 16 Feb 2004 21:20:43 +0100	[thread overview]
Message-ID: <20040216202043.GD17015@schmorp.de> (raw)
In-Reply-To: <Pine.LNX.4.58.0402161141140.30742@home.osdl.org>

On Mon, Feb 16, 2004 at 11:48:35AM -0800, Linus Torvalds <torvalds@osdl.org> wrote:
> works on the raw byte sequence and isn't confused). Basically accept the
> fact that UTF-8 strings can contain "garbage", and don't try to fix it up.

But you are wrong, UTF-8 strings never contain garbage. UTF-8 is
well-defined and is always proper UTF-8. It's a tautology.

The evry idea of "UTF-8 with garbage in it" doesn't make sense.

> And no, I'm not claiming that it's wonderfully clean and that we should
> all love it.

It's also a totally useless idiom...

> And it largely works today.
> 		Linus

On ascii-only-systems, it works fine. My system is largely ascii-only,
with only very few filenames (japanese and german ones mostly) in
UTF-8. Sometimes in EUC-JP, but that's a bug in rar.

It also works fine in single-user environments where the user just forces
everything to be in her locale. It does fail miserably on multi-user
systems. It does fail miserably in ISO-C's locale model. It does fail
miserably with gnu shellutils, fileutils and most other apps.

It fails, because it's not at all well supported by the kernel.

Claiming that it largely works today is simply not true for most
non-ascii-users (which increasingly includes the US).

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com      |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |

  reply	other threads:[~2004-02-16 20:20 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <04Feb13.163954est.41760@gpu.utcc.utoronto.ca>
2004-02-14 14:27 ` JFS default behavior Nicolas Mailhot
2004-02-14 15:40   ` viro
2004-02-14 17:47     ` Nicolas Mailhot
2004-02-14 17:59       ` Nicolas Mailhot
2004-02-14 23:06     ` Robin Rosenberg
2004-02-14 23:29       ` viro
2004-02-15  0:07         ` Robin Rosenberg
2004-02-15  2:41           ` Linus Torvalds
2004-02-15  3:33             ` Matthias Urlichs
2004-02-15  4:04               ` viro
2004-02-15  9:48                 ` Robin Rosenberg
2004-02-15 18:26                 ` yodaiken
2004-02-18  2:48               ` Unicode normalization (userspace issue, but what the heck) H. Peter Anvin
2004-02-20  9:48                 ` Matthias Urlichs
2004-02-16 15:05             ` stty utf8 Jamie Lokier
2004-02-16 16:10               ` Gerd Knorr
2004-02-16 22:03               ` Jamie Lokier
2004-02-16 22:17                 ` Linus Torvalds
2004-02-16 22:04               ` Jamie Lokier
2004-02-16 18:36             ` UTF-8 practically vs. theoretically in the VFS API (was: Re: JFS default behavior) Marc Lehmann
2004-02-16 18:49               ` Linus Torvalds
2004-02-16 19:26                 ` UTF-8 practically vs. theoretically in the VFS API Jeff Garzik
2004-02-16 19:48                   ` John Bradford
2004-02-16 19:48                     ` Linus Torvalds
2004-02-16 20:20                       ` Marc Lehmann [this message]
2004-02-16 20:26                         ` Linus Torvalds
2004-02-18  2:49                         ` Rob Landley
2004-02-16 20:21                       ` bert hubert
2004-02-16 20:33                         ` Marc Lehmann
2004-02-18  2:58                         ` H. Peter Anvin
2004-02-18  3:13                           ` Linus Torvalds
2004-02-18  3:22                             ` H. Peter Anvin
2004-02-18  3:30                               ` Linus Torvalds
2004-02-18  5:30                                 ` H. Peter Anvin
2004-02-18 10:29                                   ` Robin Rosenberg
2004-02-18 11:49                                     ` Tomas Szepe
2004-02-18 11:59                                       ` Robin Rosenberg
2004-02-18 12:05                                         ` Tomas Szepe
2004-02-18 12:34                                           ` Robin Rosenberg
2004-02-18 15:35                                   ` Linus Torvalds
2004-02-18 19:47                                     ` Tomas Szepe
2004-02-18 20:01                                       ` H. Peter Anvin
2004-02-18 21:22                                         ` Robin Rosenberg
2004-02-18 21:42                                           ` H. Peter Anvin
2004-02-18 11:24                               ` Jamie Lokier
2004-02-18 11:33                             ` Jamie Lokier
2004-02-18 16:47                               ` H. Peter Anvin
2004-02-18 19:59                               ` Linus Torvalds
2004-02-18 20:08                                 ` H. Peter Anvin
2004-02-18  7:25                           ` bert hubert
2004-02-16 20:16                     ` Marc Lehmann
2004-02-16 20:20                       ` Jeff Garzik
2004-02-16 21:10                       ` viro
2004-02-17  7:18                       ` jw schultz
2004-02-17  7:42                       ` Nick Piggin
2004-02-16 20:03                 ` UTF-8 practically vs. theoretically in the VFS API (was: Re: JFS default behavior) Marc Lehmann
2004-02-16 20:23                   ` Linus Torvalds
2004-02-16 20:58                     ` Marc Lehmann
2004-02-17 14:12                       ` Dave Kleikamp
2004-02-16 22:26                     ` Jamie Lokier
2004-02-16 22:40                       ` Linus Torvalds
2004-02-16 22:52                         ` Linus Torvalds
2004-02-17 13:15                           ` Jamie Lokier
2004-02-17  7:14                         ` Lehmann 
2004-02-17 11:20                           ` UTF-8 practically vs. theoretically in the VFS API Helge Hafting
2004-02-17 15:56                           ` UTF-8 practically vs. theoretically in the VFS API (was: Re: JFS default behavior) Linus Torvalds
     [not found]                             ` <20040217161111.GE8231@schmorp.de>
2004-02-17 16:32                               ` Linus Torvalds
2004-02-17 16:46                                 ` Jamie Lokier
2004-02-17 19:00                                   ` UTF-8 practically vs. theoretically in the VFS API Måns Rullgård
2004-02-17 20:57                                     ` Jamie Lokier
2004-02-17 21:06                                       ` Alex Belits
2004-02-17 21:47                                         ` Jamie Lokier
2004-02-22 15:32                                           ` Eric W. Biederman
2004-02-22 16:28                                             ` Jamie Lokier
2004-02-22 21:53                                               ` Eric W. Biederman
2004-02-18  7:23                                         ` Marc Lehmann
2004-02-17 21:23                                       ` Matthew Kirkwood
2004-02-18 13:11                                   ` UTF-8 practically vs. theoretically in the VFS API (was: Re: JFS default behavior) Matthew Garrett
2004-02-17 16:52                                 ` Marc Lehmann
2004-02-17 16:54                                 ` UTF-8 practically vs. theoretically in the VFS API Stefan Smietanowski
2004-02-18  1:27                                   ` Hans Reiser
2004-02-18  2:08                                     ` Robin Rosenberg
2004-02-18 11:06                                       ` Jamie Lokier
2004-02-17 20:37                                 ` UTF-8 practically vs. theoretically in the VFS API (was: Re: JFS default behavior) Robin Rosenberg
2004-02-17 16:36                             ` Jamie Lokier
2004-02-17 17:52                               ` viro
2004-02-17 19:29                                 ` Jamie Lokier
2004-02-17 19:45                                   ` Linus Torvalds
2004-02-17 20:30                                     ` Jamie Lokier
2004-02-17 20:49                                       ` Linus Torvalds
2004-02-17 21:17                                         ` Jamie Lokier
2004-02-17 19:51                                   ` Jamie Lokier
2004-02-17 19:53                                   ` viro
2004-02-17 20:35                                     ` John Bradford
2004-02-17 20:40                                       ` Jamie Lokier
2004-02-17 20:50                                         ` John Bradford
2004-02-17 21:04                                           ` Linus Torvalds
2004-02-17 21:16                                             ` John Bradford
2004-02-17 21:21                                               ` Linus Torvalds
2004-02-18  0:52                                                 ` John Bradford
2004-02-17 22:50                                               ` Robin Rosenberg
2004-02-18  6:48                                             ` Marc Lehmann
2004-02-17 20:47                                       ` viro
2004-02-17 20:53                                         ` John Bradford
2004-02-17 20:59                                       ` Linus Torvalds
2004-02-17 21:06                                         ` John Bradford
2004-02-17 21:42                                         ` Alex Belits
2004-02-18  6:56                                           ` Marc Lehmann
2004-02-18 20:37                                             ` Alex Belits
2004-02-18  3:11                                         ` H. Peter Anvin
2004-02-17 20:38                                     ` Jamie Lokier
2004-02-18  3:07                               ` H. Peter Anvin
2004-02-21 13:54                             ` Pavel Machek
2004-02-22 20:09                               ` H. Peter Anvin
2004-02-17  1:24                   ` Alex Belits
2004-02-17 21:09                     ` Jamie Lokier
2004-02-17 21:48                       ` Linus Torvalds
2004-02-17 22:19                       ` Alex Belits
2004-02-23 11:35 UTF-8 practically vs. theoretically in the VFS API Norman Diamond
     [not found] ` <fa.ip45pqg.i26oru@ifi.uio.no>
2004-02-23 19:13   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040216202043.GD17015@schmorp.de \
    --to=pcg@schmorp.de \
    --cc=jgarzik@pobox.com \
    --cc=john@grabjohn.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    --cc=viro@parcelfarce.linux.theplanet.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox