public inbox for linux-kernel@vger.kernel.org
From: "Martin v. Löwis" <martin@v.loewis.de>
To: Martin Mares <mj@ucw.cz>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [Patch] Support UTF-8 scripts
Date: Sat, 17 Sep 2005 15:33:23 +0200
Message-ID: <432C1B23.9090507@v.loewis.de>
In-Reply-To: <20050917130529.GA4398@ucw.cz>

Martin Mares wrote:
> I still think that this does solve only a completely insignificant part
> of the problem. Given the zillion existing encodings, you are able to identify
> UTF-8, leaving you with zillion-1 other encodings you are unable to deal with.

Correct. This is a special case only. The more general problem is
already solved: both Python and Perl let a source file declare any
of the zillion encodings. As I explained, this solution, while
general, is not very user-friendly.
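
To illustrate what such a format-specific declaration looks like, here
is a minimal sketch of how Python's scheme (PEP 263) can be read: a
comment on the first or second line names the encoding. The function
name and the default value are hypothetical, and the sketch skips the
finer rules of the real tokenizer.

```python
import re

# Pattern taken from PEP 263: a comment containing "coding:" or "coding=".
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source: bytes, default: str = "ascii") -> str:
    """Return the encoding named in the first two lines, or the default.

    `default` is an illustrative assumption, not what any particular
    interpreter version uses.
    """
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default
```

Every script has to carry such a line, and every tool that touches the
file has to know this particular syntax, which is why I call it not
very user-friendly.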

Now, why does UTF-8 deserve to be a special case? One reason is that it
has the potential to replace all of the zillion encodings over time.
However, this can only happen if tool support for this encoding is
really good. The patch contributes a (minor) fragment of that support;
it is only a small patch.

The other reason is that UTF-8 defines its own encoding declaration,
unlike most of the other zillion-1 encodings. So naturally, an
implementation that supports UTF-8 in this way cannot extend to other
encodings. hpa suggested that ISO-2022 would be a more general
mechanism, but pointed out that it hasn't been widely implemented in
the last 30 years, so it is unlikely to get much better support in
the next 30 years.
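
As a rough user-space illustration (not the kernel patch itself), the
check can be sketched as below. It assumes that the "encoding
declaration" meant here is the UTF-8 signature, i.e. the bytes
EF BB BF at the very start of the file; the function name is
hypothetical.

```python
UTF8_SIGNATURE = b"\xef\xbb\xbf"  # U+FEFF encoded in UTF-8 (assumed marker)


def split_script_header(head: bytes):
    """Return (is_utf8_marked, interpreter) for the first bytes of a file.

    `interpreter` is None when no "#!" line follows the optional signature.
    """
    marked = head.startswith(UTF8_SIGNATURE)
    rest = head[len(UTF8_SIGNATURE):] if marked else head
    if not rest.startswith(b"#!"):
        return marked, None
    first_line = rest.split(b"\n", 1)[0]
    # Drop the "#!" and surrounding whitespace to get the interpreter line.
    return marked, first_line[2:].strip().decode("utf-8", "replace")
```

The point of the sketch is that the signature is self-describing: no
per-language comment syntax is needed, but by the same token the
mechanism cannot name any encoding other than UTF-8.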

> I see a need for a feature which would help identify the charset of the script,
> but the patch in question obviously doesn't offer that -- it solves only a single
> special case of the problem in a completely non-systematic way. This does not
> sound right.

It's not a complete solution, but it *is* part of a general solution.
People have tried in the past to solve the general problem of "identify
the encoding of a text file", both in really general ways (ISO-2022)
and in format-specific ways (Perl, Python). All these solutions are
tedious to use.

There is another general solution: gradually replace the zillion
encodings with a single one, namely Unicode (or, specifically, UTF-8).
This solution will only work when done gradually. Clearly, this
patch doesn't implement this solution entirely, but it contributes
to it by making UTF-8 simpler to use in script files. Many
more changes to other software (i.e. non-kernel changes) will be
necessary to implement this solution, as well as (obviously) changes
to existing files.

Regards,
Martin
