From: Simos Xenitellis <simos74@gmx.net>
To: linux-kernel@vger.kernel.org
Subject: Re: Improved console UTF-8 support for the Linux kernel?
Date: Sun, 12 Dec 2004 14:02:00 +0000
Message-ID: <1102860119.3389.7.camel@kl>


Jan Engelhardt wrote: 
> >> The current UTF-8 keyboard input (for the console) of the Linux kernel
> >> does not support  "composing" or writing characters with accents.
> 
> That's weird, because "ö" (LATIN O WITH DIAERESIS) -- which clearly lies
> outside the 7-bit range, is working on my system without myself poking the
> kernel. Both hitting the key or using compose mode. This also applies to
> A-with-DIAERESIS, U-with-DIAERESIS, sharp german S, but does not for anything
> else, e.g. compose-'-e to generate E with accent aigu.

I am a bit confused. Could you please comment on the following, so that
we have a common set of test steps?

I am not sure how you wrote the above characters. In UTF-8, characters
with code points above 0x7F require at least two bytes to be valid.
How do you compose "ö" in the console (do you press something like ";",
then "o")?
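
As a quick cross-check of the byte counts, here is a minimal Python
sketch (Python is only my choice for illustration, and the helper name
utf8_len is made up). It prints how many bytes UTF-8 needs on either
side of the 0x7F boundary and for "ö" (U+00F6):

```python
def utf8_len(codepoint):
    """Return the number of bytes UTF-8 uses for one code point."""
    return len(chr(codepoint).encode("utf-8"))

# 0x7F is the last single-byte code point; 0x80 and above need two or more.
for cp in (0x7F, 0x80, 0xF6, 0x7FF, 0x800):
    print(f"U+{cp:04X}: {utf8_len(cp)} byte(s)")
```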

For simplicity, let's assume you do something like
% loadkeys --unicode
keycode 53 = 0x0d2f
compose '/' 'q' to U+00F6
compose '/' 'w' to U+00F7
compose '/' 'e' to U+00F8
compose '/' 'r' to U+00F9
compose '/' 't' to U+0100
compose '/' 'y' to U+0101
keycode 2 = U+00F6
keycode 3 = U+00F7
keycode 4 = U+00F8
keycode 5 = U+00F9
keycode 6 = U+0100
keycode 7 = U+0101
^D
% 

The dead key (because of the "0d" prefix) is the character "/" (0x2f).
Keycodes 2-7 are the keys for the numbers 1-6.
To test, I type
% cat > test.txt
<we try out all key compositions to generate U+00F6-U+0101>
^D

When we try keys 1-6, we get
% od -x test.txt
0000000 b6c3 b7c3 b8c3 b9c3 80c4 81c4 000a
0000015
%
which is correct.
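
To see why this dump is the expected one, here is a small Python sketch
that encodes U+00F6..U+0101 as UTF-8 and prints the bytes the way
"od -x" does. The helper od_x_words is hypothetical, and it assumes a
little-endian machine, where od swaps the two bytes of every 16-bit
word:

```python
def od_x_words(data):
    """Format bytes as `od -x` shows them on a little-endian machine."""
    if len(data) % 2:
        data += b"\x00"  # od pads an odd trailing byte with zero
    pairs = zip(data[0::2], data[1::2])
    return " ".join(f"{lo | (hi << 8):04x}" for lo, hi in pairs)

codepoints = [0xF6, 0xF7, 0xF8, 0xF9, 0x100, 0x101]
# What a correct UTF-8 console should write, plus the trailing newline.
expected = "".join(chr(cp) for cp in codepoints).encode("utf-8") + b"\n"

print(od_x_words(expected))
print(len(expected), "bytes")
```

The 13 bytes match the final offset 0000015 (octal) in the dump above.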

When we try using the dead key "/" and q-y, we get
% od -x test.txt
0000000 f7f6 f9f8 0100 000a
0000007
% 

To get the keyboard back into a sane mode, run "loadkeys --unicode -d".

From this we see that there is no conversion to UTF-8 whatsoever.

In the second case (composing with the dead key), the kernel does not
return the full character when it is in Unicode mode.
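
The dump from the dead-key test is consistent with only the low 8 bits
of each code point being emitted, with no UTF-8 encoding applied. That
is my reading of the od output, not something I have verified in the
kernel source. A minimal Python sketch of that assumption (the function
name low_byte_output is made up):

```python
def low_byte_output(codepoints):
    """Assumed behaviour: keep only the low 8 bits of each code point."""
    return bytes(cp & 0xFF for cp in codepoints) + b"\n"

codepoints = [0xF6, 0xF7, 0xF8, 0xF9, 0x100, 0x101]
truncated = low_byte_output(codepoints)
print(truncated.hex())

# The result is not valid UTF-8, so decoding it fails.
try:
    truncated.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print("valid UTF-8:", is_valid_utf8)
```

These are the same 7 bytes that "od -x" shows above as
f7f6 f9f8 0100 000a: U+0100 and U+0101 come out as 0x00 and 0x01.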

> >Yes, i recently find it out when trying to switch all my system to
> >UTF-8. But the patch from Chris you mention below works very well
> >for me (and for anybody that needs to type compose characters for
> >languages based in the latin1 encoding i guess).
> >
> >> Is there an interest for re-submission of mentioned patches for
> >> inclusion in the kernel (yeah, provided coding style is "normalised")?
> >
> >At least, I am _really_ interested :)
> 
> So am I. I have to use xterm for anything fancy now...
> (especially for the even-more fancy stuff that begins at three-byte UTF8
> sequences, such as Japanese :-)

Good. I hope more people raise their hands for this.

Simos

[I am sending this again. It did not make it to the kernel mailing list in the first^Wsecond post for some reason..]

