All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simos Xenitellis <simos74@gmx.net>
To: linux-kernel@vger.kernel.org
Subject: Re: Improved console UTF-8 support for the Linux kernel?
Date: Sun, 12 Dec 2004 14:02:00 +0000	[thread overview]
Message-ID: <1102860119.3389.7.camel@kl> (raw)


Jan Engelhardt wrote: 
> >> The current UTF-8 keyboard input (for the console) of the Linux kernel
> >> does not support  "composing" or writing characters with accents.
> 
> That's weird, because "ö" (LATIN O WITH DIAERESIS) -- which clearly lies
> outside the 7-bit range, is working on my system without myself poking the
> kernel. Both hitting the key or using compose mode. This also applies to
> A-with-DIAERESIS, U-with-DIAERESIS, sharp german S, but does not for anything
> else, e.g. compose-'-e to generate E with accent aigu.

I am a bit confused. Could you please comment on the following, as a
common test steps?

I am not sure how you wrote the above characters. According to UTF-8,
characters with codepoints above 0x79 require two bytes so that to be
valid. When you compose "ö" (you press something like ";", then "o") in
the console?

For simplicity, let's assume you do something like
% loadkeys --unicode
keycode 53 = 0x0d2f
compose '/' 'q' to U+00F6
compose '/' 'w' to U+00F7
compose '/' 'e' to U+00F8
compose '/' 'r' to U+00F9
compose '/' 't' to U+0100
compose '/' 'y' to U+0101
keycode 2 = U+00F6
keycode 3 = U+00F7
keycode 4 = U+00F8
keycode 5 = U+00F9
keycode 6 = U+0100
keycode 7 = U+0101
^D
% 

Dead key (due to "0d") is the character "/" (0x2f).
Keycodes 2-7 are keys for numbers 1-6.
To test, I type
% cat > test.txt
<we try out all key compositions to generate U+00F6-U+0101>
^D

When we try keys 1-6, we get
% od -x text.txt
0000000 b6c3 b7c3 b8c3 b9c3 80c4 81c4 000a
0000015
%
which is correct.

When we try using the dead key "/" and q-y, we get
% od -x test.txt
0000000 f7f6 f9f8 0100 000a
0000007
% 

To get the keyboard in a sane mode, "loadkeys --unicode -d".

>From here we see there is no conversion to UTF-8 whatsoever.

In the second case, the kernel cannot return the full character when it
is in Unicode mode.

> >Yes, i recently find it out when trying to switch all my system to
> >UTF-8. But the patch from Chris you mention below works very well
> >for me (and for anybody that needs to type compose characters for
> >languages based in the latin1 encoding i guess).
> >
> >> Is there an interest for re-submission of mentioned patches for
> >> inclusion in the kernel (yeah, provided coding style is "normalised")?
> >
> >At least, I am _really_ interested :)
> 
> So am I. I have to use xterm for anything fancy now...
> (especially for the even-more fancy stuff that begins at three-byte UTF8
> sequences, such as Japanese :-)

Good. I hope more people raise their hands for this.

Simos

[I am sending this again. It did not make it to the kernel mailing list in the first^Wsecond post for some reason..]


             reply	other threads:[~2004-12-12 14:03 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-12-12 14:02 Simos Xenitellis [this message]
  -- strict thread matches above, loose matches on Subject: below --
2004-12-11 17:06 Improved console UTF-8 support for the Linux kernel? Simos Xenitellis
2004-12-11 17:30 ` David Gómez
2004-12-11 19:07   ` Jan Engelhardt
2004-12-11 21:25     ` David Gómez
2004-12-11 21:39       ` Jan Engelhardt
2004-12-11 22:01         ` David Gómez
2004-12-11 22:26         ` Gene Heskett
     [not found]     ` <1102803807.3183.59.camel@kl>
2004-12-12  0:05       ` Jan Engelhardt
2004-12-12  0:38         ` David Gómez
2004-12-12 22:08           ` Simos Xenitellis
2004-12-12 22:14             ` Jan Engelhardt
2004-12-12 23:06             ` David Gómez
2004-12-12 23:52             ` Andries Brouwer
2004-12-12 15:44         ` Lehmann 

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1102860119.3389.7.camel@kl \
    --to=simos74@gmx.net \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.