* Re: [2.6 patch] UTF-8 fixes in comments
@ 2008-04-30 0:08 Samuel Thibault
2008-04-30 3:38 ` Chris Adams
` (2 more replies)
0 siblings, 3 replies; 48+ messages in thread
From: Samuel Thibault @ 2008-04-30 0:08 UTC (permalink / raw)
To: linux-kernel
Willy Tarreau wrote:
> 3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
> got two bytes: c3 84.
Confirmed.
Try init=/bin/stty -a, that will show
-iutf8
So there is little wonder that canonical mode does not work as expected.
Try init=/bin/sh, from that shell run stty iutf8. Then things will work
fine. The fix is thus just to make the VT's tty initial iutf8 setup
follow vt.default_utf8.
Samuel
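For reference, a minimal userspace sketch of what `stty iutf8` and `stty -iutf8` toggle. It assumes Linux with glibc, where <termios.h> defines IUTF8. The helper names (termios_set_utf8, termios_is_utf8, tty_set_utf8) are made up for illustration; they are not part of any library and are not the kernel change discussed in this thread.

```c
#include <stdbool.h>
#include <termios.h>

/* Set or clear IUTF8 in a termios structure. Nothing is applied to a tty. */
void termios_set_utf8(struct termios *t, bool utf8)
{
    if (utf8)
        t->c_iflag |= IUTF8;
    else
        t->c_iflag &= ~(tcflag_t)IUTF8;
}

/* Report whether IUTF8 is set, i.e. what `stty -a` prints as iutf8 / -iutf8. */
bool termios_is_utf8(const struct termios *t)
{
    return (t->c_iflag & IUTF8) != 0;
}

/* Apply the setting to an open terminal.
 * Returns 0 on success, -1 on failure with errno set by tcgetattr/tcsetattr. */
int tty_set_utf8(int fd, bool utf8)
{
    struct termios t;

    if (tcgetattr(fd, &t) < 0)
        return -1;
    termios_set_utf8(&t, utf8);
    return tcsetattr(fd, TCSANOW, &t);
}
```

Calling tty_set_utf8(0, true) from a program attached to a terminal should have the same effect on the line discipline as running `stty iutf8` in that terminal.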
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 0:08 [2.6 patch] UTF-8 fixes in comments Samuel Thibault
@ 2008-04-30 3:38 ` Chris Adams
2008-04-30 9:38 ` Samuel Thibault
2008-04-30 19:49 ` Willy Tarreau
2 siblings, 0 replies; 48+ messages in thread
From: Chris Adams @ 2008-04-30 3:38 UTC (permalink / raw)
To: linux-kernel

Once upon a time, Samuel Thibault <samuel.thibault@ens-lyon.org> said:
>Try init=/bin/sh, from that shell run stty iutf8. Then things will work
>fine. The fix is thus just to make the VT's tty initial iutf8 setup
>follow vt.default_utf8.

You may also need to select a UTF-8 locale (e.g. LANG="en_US.UTF-8") for
programs like bash to handle this correctly.
--
Chris Adams <cmadams@hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 0:08 [2.6 patch] UTF-8 fixes in comments Samuel Thibault
2008-04-30 3:38 ` Chris Adams
@ 2008-04-30 9:38 ` Samuel Thibault
2008-04-30 19:45 ` Willy Tarreau
2008-04-30 19:49 ` Willy Tarreau
2 siblings, 1 reply; 48+ messages in thread
From: Samuel Thibault @ 2008-04-30 9:38 UTC (permalink / raw)
To: linux-kernel
Cc: cmadams, Willy Tarreau, Alan Cox, Helge Hafting, Adrian Bunk, H. Peter Anvin

Chris Adams wrote:
> Once upon a time, Samuel Thibault <samuel.thibault@ens-lyon.org> said:
> >Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> >fine. The fix is thus just to make the VT's tty initial iutf8 setup
> >follow vt.default_utf8.
>
> You may also need to select a UTF-8 locale (e.g. LANG="en_US.UTF-8") for
> programs like bash to handle this correctly.

Yes of course, but here the purpose was _not_ programs like bash, but
the canonical mode (i.e. programs like cat etc.), for which the LANG
variable has no effect, only iutf8 has.

Samuel

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 9:38 ` Samuel Thibault
@ 2008-04-30 19:45 ` Willy Tarreau
0 siblings, 0 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-04-30 19:45 UTC (permalink / raw)
To: Samuel Thibault, linux-kernel, cmadams, Alan Cox, Helge Hafting, Adrian Bunk, H. Peter Anvin

On Wed, Apr 30, 2008 at 10:38:32AM +0100, Samuel Thibault wrote:
> Chris Adams wrote:
> > Once upon a time, Samuel Thibault <samuel.thibault@ens-lyon.org> said:
> > >Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> > >fine. The fix is thus just to make the VT's tty initial iutf8 setup
> > >follow vt.default_utf8.
> >
> > You may also need to select a UTF-8 locale (e.g. LANG="en_US.UTF-8") for
> > programs like bash to handle this correctly.
>
> Yes of course, but here the purpose was _not_ programs like bash, but
> the canonical mode (i.e. programs like cat etc.), for which the LANG
> variable has no effect, only iutf8 has.

exactly, thanks for understanding my problem Samuel :-)

Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 0:08 [2.6 patch] UTF-8 fixes in comments Samuel Thibault
2008-04-30 3:38 ` Chris Adams
2008-04-30 9:38 ` Samuel Thibault
@ 2008-04-30 19:49 ` Willy Tarreau
2008-05-03 23:50 ` Samuel Thibault
2 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2008-04-30 19:49 UTC (permalink / raw)
To: Samuel Thibault, linux-kernel

On Wed, Apr 30, 2008 at 01:08:51AM +0100, Samuel Thibault wrote:
> Willy Tarreau wrote:
> > 3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
> > got two bytes: c3 84.
>
> Confirmed.
>
> Try init=/bin/stty -a, that will show
>
> -iutf8
>
> So there is little wonder that canonical mode does not work as expected.
>
> Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> fine. The fix is thus just to make the VT's tty initial iutf8 setup
> follow vt.default_utf8.

Will try that on a more recent install. My stty does not support
this option. Your analysis makes quite a lot of sense, and such a fix
would wipe part of my annoyances/anger with this recent change.

Thanks,
Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
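The byte pair reported in this exchange can be checked by hand: Alt-196 enters code point 196 (U+00C4, "Ä"), which falls in the two-byte UTF-8 range and so encodes as c3 84. A small sketch of the encoding rule follows; utf8_encode is an illustrative name, not a function from the kernel or from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out[] (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp cannot be encoded
 * (a UTF-16 surrogate, or a value above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx + three continuation bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For cp = 196 (0xC4) the second branch applies: 0xC0 | (0xC4 >> 6) gives 0xC3 and 0x80 | (0xC4 & 0x3F) gives 0x84, matching what od showed.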
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 19:49 ` Willy Tarreau
@ 2008-05-03 23:50 ` Samuel Thibault
2008-05-04 8:55 ` Willy Tarreau
2008-05-04 10:25 ` Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments] Samuel Thibault
0 siblings, 2 replies; 48+ messages in thread
From: Samuel Thibault @ 2008-05-03 23:50 UTC (permalink / raw)
To: Willy Tarreau; +Cc: linux-kernel, akpm

Hello,

Willy Tarreau wrote on Wed 30 Apr 2008 21:49:20 +0200:
> On Wed, Apr 30, 2008 at 01:08:51AM +0100, Samuel Thibault wrote:
> > Willy Tarreau wrote:
> > > 3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
> > > got two bytes: c3 84.
> >
> > Confirmed.
> >
> > Try init=/bin/stty -a, that will show
> >
> > -iutf8
> >
> > So there is little wonder that canonical mode does not work as expected.
> >
> > Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> > fine. The fix is thus just to make the VT's tty initial iutf8 setup
> > follow vt.default_utf8.
>
> Will try that on a more recent install. My stty does not support
> this option. Your analysis makes quite a lot of sense, and such a fix
> would wipe part of my annoyances/anger with this recent change.

Can you give the patch below a try?
Dynamic per-VT utf-8 switch should also work, provided that you reopen
the VT (i.e. log out).

Samuel

Set IUTF8 as appropriate on VT tty open.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

--- linux/drivers/char/vt.c.orig	2008-05-04 00:37:50.000000000 +0100
+++ linux/drivers/char/vt.c	2008-05-04 00:47:39.000000000 +0100
@@ -2723,6 +2723,10 @@ static int con_open(struct tty_struct *t
 				tty->winsize.ws_row = vc_cons[currcons].d->vc_rows;
 				tty->winsize.ws_col = vc_cons[currcons].d->vc_cols;
 			}
+			if (vc->vc_utf)
+				tty->termios->c_iflag |= IUTF8;
+			else
+				tty->termios->c_iflag &= ~IUTF8;
 			release_console_sem();
 			vcs_make_sysfs(tty);
 			return ret;
@@ -2899,6 +2903,8 @@ int __init vty_init(void)
 	console_driver->minor_start = 1;
 	console_driver->type = TTY_DRIVER_TYPE_CONSOLE;
 	console_driver->init_termios = tty_std_termios;
+	if (default_utf8)
+		console_driver->init_termios.c_iflag |= IUTF8;
 	console_driver->flags = TTY_DRIVER_REAL_RAW | TTY_DRIVER_RESET_TERMIOS;
 	tty_set_operations(console_driver, &con_ops);
 	if (tty_register_driver(console_driver))

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-05-03 23:50 ` Samuel Thibault
@ 2008-05-04 8:55 ` Willy Tarreau
2008-05-04 10:25 ` Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments] Samuel Thibault
1 sibling, 0 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-05-04 8:55 UTC (permalink / raw)
To: Samuel Thibault, linux-kernel, akpm

Hi Samuel,

On Sun, May 04, 2008 at 12:50:28AM +0100, Samuel Thibault wrote:
> Can you give the patch below a try?
> Dynamic per-VT utf-8 switch should also work, provided that you reopen
> the VT (i.e. log out).

I confirm that your patch works perfectly for me. Now backspace correctly
removes multi-byte characters. My bash is still fooled though but as Alan
explained it, it's readline which has to be upgraded now.

Thanks!
Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
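Why backspace depends on IUTF8 can be illustrated with a simplified model of canonical-mode erase. With the flag clear, one erase drops one byte from the pending line; with it set, the erase also skips back over UTF-8 continuation bytes (those of the form 10xxxxxx) so the whole character goes. This is only a sketch under those assumptions, not the kernel's line-discipline code, and erase_last_char and is_continuation are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
static bool is_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Model of a single erase (backspace) on a pending input line.
 * Returns the new length of the line after removing the last character.
 * In byte mode exactly one byte is removed; in UTF-8 mode the trailing
 * continuation bytes and their lead byte are removed together. */
size_t erase_last_char(const unsigned char *buf, size_t len, bool utf8)
{
    if (len == 0)
        return 0;
    len--;                                  /* drop the final byte */
    if (utf8)
        while (len > 0 && is_continuation(buf[len]))
            len--;                          /* walk back to the lead byte */
    return len;
}
```

With the line holding 'a' followed by c3 84, one erase in UTF-8 mode leaves just 'a', while one erase in byte mode leaves 'a' plus a stray c3 lead byte. That stray byte is consistent with the broken backspace behaviour described earlier in the thread.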
* Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments]
2008-05-03 23:50 ` Samuel Thibault
2008-05-04 8:55 ` Willy Tarreau
@ 2008-05-04 10:25 ` Samuel Thibault
2008-05-04 11:03 ` Willy Tarreau
2008-05-05 23:00 ` Andrew Morton
1 sibling, 2 replies; 48+ messages in thread
From: Samuel Thibault @ 2008-05-04 10:25 UTC (permalink / raw)
To: Willy Tarreau, linux-kernel, akpm, stable

Samuel Thibault wrote on Sun 04 May 2008 00:50:27 +0100:
> Willy Tarreau wrote on Wed 30 Apr 2008 21:49:20 +0200:
> > On Wed, Apr 30, 2008 at 01:08:51AM +0100, Samuel Thibault wrote:
> > > Willy Tarreau wrote:
> > > > 3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
> > > > got two bytes: c3 84.
> > >
> > > Confirmed.
> > >
> > > Try init=/bin/stty -a, that will show
> > >
> > > -iutf8
> > >
> > > So there is little wonder that canonical mode does not work as expected.
> > >
> > > Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> > > fine. The fix is thus just to make the VT's tty initial iutf8 setup
> > > follow vt.default_utf8.
> >
> > Will try that on a more recent install. My stty does not support
> > this option. Your analysis makes quite a lot of sense, and such a fix
> > would wipe part of my annoyances/anger with this recent change.
>
> Can you give the patch below a try?
> Dynamic per-VT utf-8 switch should also work, provided that you reopen
> the VT (i.e. log out).

Willy Tarreau wrote on Sun 04 May 2008 10:55:14 +0200:
> I confirm that your patch works perfectly for me. Now backspace correctly
> removes multi-byte characters. My bash is still fooled though but as Alan
> explained it, it's readline which has to be upgraded now.

I guess this is suitable for the stable trees of 2.6.24 and 2.6.25
(where UTF-8 is the default now).

Set IUTF8 as appropriate on VT tty open.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

--- linux/drivers/char/vt.c.orig	2008-05-04 00:37:50.000000000 +0100
+++ linux/drivers/char/vt.c	2008-05-04 00:47:39.000000000 +0100
@@ -2723,6 +2723,10 @@ static int con_open(struct tty_struct *t
 				tty->winsize.ws_row = vc_cons[currcons].d->vc_rows;
 				tty->winsize.ws_col = vc_cons[currcons].d->vc_cols;
 			}
+			if (vc->vc_utf)
+				tty->termios->c_iflag |= IUTF8;
+			else
+				tty->termios->c_iflag &= ~IUTF8;
 			release_console_sem();
 			vcs_make_sysfs(tty);
 			return ret;
@@ -2899,6 +2903,8 @@ int __init vty_init(void)
 	console_driver->minor_start = 1;
 	console_driver->type = TTY_DRIVER_TYPE_CONSOLE;
 	console_driver->init_termios = tty_std_termios;
+	if (default_utf8)
+		console_driver->init_termios.c_iflag |= IUTF8;
 	console_driver->flags = TTY_DRIVER_REAL_RAW | TTY_DRIVER_RESET_TERMIOS;
 	tty_set_operations(console_driver, &con_ops);
 	if (tty_register_driver(console_driver))

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments]
2008-05-04 10:25 ` Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments] Samuel Thibault
@ 2008-05-04 11:03 ` Willy Tarreau
2008-05-05 23:00 ` Andrew Morton
1 sibling, 0 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-05-04 11:03 UTC (permalink / raw)
To: Samuel Thibault, linux-kernel, akpm, stable

On Sun, May 04, 2008 at 11:25:54AM +0100, Samuel Thibault wrote:
> Willy Tarreau wrote on Sun 04 May 2008 10:55:14 +0200:
> > I confirm that your patch works perfectly for me. Now backspace correctly
> > removes multi-byte characters. My bash is still fooled though but as Alan
> > explained it, it's readline which has to be upgraded now.
>
> I guess this is suitable for the stable trees of 2.6.24 and 2.6.25
> (where UTF-8 is the default now).

agreed.

> Set IUTF8 as appropriate on VT tty open.
>
> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

You should have added:

CC: stable@kernel.org

here so that the stable team automatically gets notified when it's
merged into mainline.

Thanks!
Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments]
2008-05-04 10:25 ` Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments] Samuel Thibault
2008-05-04 11:03 ` Willy Tarreau
@ 2008-05-05 23:00 ` Andrew Morton
2008-05-05 23:54 ` Samuel Thibault
1 sibling, 1 reply; 48+ messages in thread
From: Andrew Morton @ 2008-05-05 23:00 UTC (permalink / raw)
To: Samuel Thibault; +Cc: w, linux-kernel, stable

On Sun, 4 May 2008 11:25:54 +0100 Samuel Thibault <samuel.thibault@ens-lyon.org> wrote:
> Samuel Thibault wrote on Sun 04 May 2008 00:50:27 +0100:
> > Willy Tarreau wrote on Wed 30 Apr 2008 21:49:20 +0200:
> > > On Wed, Apr 30, 2008 at 01:08:51AM +0100, Samuel Thibault wrote:
> > > > Willy Tarreau wrote:
> > > > > 3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
> > > > > got two bytes: c3 84.
> > > >
> > > > Confirmed.
> > > >
> > > > Try init=/bin/stty -a, that will show
> > > >
> > > > -iutf8
> > > >
> > > > So there is little wonder that canonical mode does not work as expected.
> > > >
> > > > Try init=/bin/sh, from that shell run stty iutf8. Then things will work
> > > > fine. The fix is thus just to make the VT's tty initial iutf8 setup
> > > > follow vt.default_utf8.
> > >
> > > Will try that on a more recent install. My stty does not support
> > > this option. Your analysis makes quite a lot of sense, and such a fix
> > > would wipe part of my annoyances/anger with this recent change.
> >
> > Can you give the patch below a try?
> > Dynamic per-VT utf-8 switch should also work, provided that you reopen
> > the VT (i.e. log out).
>
> Willy Tarreau wrote on Sun 04 May 2008 10:55:14 +0200:
> > I confirm that your patch works perfectly for me. Now backspace correctly
> > removes multi-byte characters. My bash is still fooled though but as Alan
> > explained it, it's readline which has to be upgraded now.
>
> I guess this is suitable for the stable trees of 2.6.24 and 2.6.25
> (where UTF-8 is the default now).
>
> Set IUTF8 as appropriate on VT tty open.
>
> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>

That changelog is pretty darn terse :( I'll often go through the email
ladder and try to extract the missing information but this time I don't
really see it there.

Things like: what is the kernel's current behaviour, why does it behave
that way, how does the patch fix it?

Thanks.

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Fix VT canonical input in UTF-8 mode [Was: UTF-8 fixes in comments]
2008-05-05 23:00 ` Andrew Morton
@ 2008-05-05 23:54 ` Samuel Thibault
0 siblings, 0 replies; 48+ messages in thread
From: Samuel Thibault @ 2008-05-05 23:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: w, linux-kernel, stable

Andrew Morton wrote on Mon 05 May 2008 16:00:44 -0700:
> > Set IUTF8 as appropriate on VT tty open.
> >
> > Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
>
> That changelog is pretty darn terse :(

Erf, sorry.

> I'll often go through
> the email ladder and try to extract the missing information
> but this time I don't really see it there.
>
> Things like: what is the kernel's current behaviour, why does
> it behave that way, how does the patch fix it?

Well, it's more an implementation than a fix. Let's try again:

For e.g. proper TTY canonical support, the IUTF8 termios flag has to be
set as appropriate. Linux used to not care about setting that flag for
VT TTYs. This patch fixes that by activating it according to the current
mode of the VT, and sets the default value according to the
vt.default_utf8 parameter.

Samuel

^ permalink raw reply [flat|nested] 48+ messages in thread
* [2.6 patch] UTF-8 fixes in comments
@ 2008-04-28 15:40 Adrian Bunk
2008-04-28 23:05 ` Willy Tarreau
2008-04-29 12:18 ` KOSAKI Motohiro
0 siblings, 2 replies; 48+ messages in thread
From: Adrian Bunk @ 2008-04-28 15:40 UTC (permalink / raw)
To: linux-kernel; +Cc: trivial

[-- Attachment #1: Type: text/plain, Size: 1497 bytes --]

This patch converts some non-UTF-8 encoded text in comments to UTF-8.

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

This patch is attached compressed to prevent my MUA from mangling it.

Documentation/PCI/pcieaer-howto.txt |  2 -
arch/arm/mach-omap2/io.c            |  2 -
arch/s390/kernel/ebcdic.c           | 36 ++++++++++++++--------------
drivers/hid/hid-input.c             |  2 -
drivers/isdn/hisax/enternow_pci.c   |  2 -
drivers/media/video/saa5249.c       |  2 -
drivers/misc/ibmasm/command.c       |  2 -
drivers/misc/ibmasm/dot_command.c   |  2 -
drivers/misc/ibmasm/dot_command.h   |  2 -
drivers/misc/ibmasm/event.c         |  2 -
drivers/misc/ibmasm/heartbeat.c     |  2 -
drivers/misc/ibmasm/i2o.h           |  2 -
drivers/misc/ibmasm/ibmasm.h        |  2 -
drivers/misc/ibmasm/ibmasmfs.c      |  2 -
drivers/misc/ibmasm/lowlevel.c      |  2 -
drivers/misc/ibmasm/lowlevel.h      |  2 -
drivers/misc/ibmasm/module.c        |  2 -
drivers/misc/ibmasm/r_heartbeat.c   |  2 -
drivers/misc/ibmasm/remote.h        |  2 -
drivers/misc/ibmasm/uart.c          |  2 -
drivers/s390/ebcdic.c               | 36 ++++++++++++++--------------
drivers/scsi/jazz_esp.c             |  2 -
drivers/spi/omap2_mcspi.c           |  2 -
drivers/usb/storage/cypress_atacb.c |  2 -
drivers/video/omap/rfbi.c           |  2 -
drivers/video/omap/sossi.c          |  2 -
26 files changed, 60 insertions(+), 60 deletions(-)

[-- Attachment #2: patch-fix-utf-8.gz --]
[-- Type: application/octet-stream, Size: 3987 bytes --]

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-28 15:40 [2.6 patch] UTF-8 fixes in comments Adrian Bunk
@ 2008-04-28 23:05 ` Willy Tarreau
2008-04-29 1:29 ` H. Peter Anvin
2008-04-29 9:01 ` Alan Cox
2008-04-29 12:18 ` KOSAKI Motohiro
1 sibling, 2 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-04-28 23:05 UTC (permalink / raw)
To: Adrian Bunk; +Cc: linux-kernel, trivial

On Mon, Apr 28, 2008 at 06:40:23PM +0300, Adrian Bunk wrote:
> This patch converts some non-UTF-8 encoded text in comments to UTF-8.

Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
everyone reads UTF-8. Now I get random crappy chars which cripple my
xterms when reading such comments, and I have to do a full-reset once
I've read them.

It's not as if it was *that* important, and to be honest, if you had
not sent this patch, I would not even have known that non-ASCII
characters were here. However, it will quickly get annoying if a
recursive grep returns those pesky codes on non-compatible consoles...

Quite frankly, it does not bring anything beyond trouble. I'm not adding
a NAK here because I find this rude, but I don't like the orientation
we're taking with the sources. We should not force people to install
version X or Y of a particular system just to read sources.

In fact, I would have better converted accented chars to their ASCII
equivalent to be more friendly with people who only read 7-bit.

Regards,
Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-28 23:05 ` Willy Tarreau
@ 2008-04-29 1:29 ` H. Peter Anvin
2008-04-29 5:06 ` Willy Tarreau
2008-04-29 9:01 ` Alan Cox
1 sibling, 1 reply; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-29 1:29 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, linux-kernel, trivial

Willy Tarreau wrote:
> Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
> everyone reads UTF-8.

"Everyone" who speaks a Western European language, perhaps; and even
then, mostly because a lot of tools still have a "oh, it's not valid
UTF-8, guess iso-8859-1" mode. The most common instance of non-ASCII
characters in Linux kernel code are people's names, and there are plenty
of names which aren't representable in either ASCII or iso-8859-1.

The debate on this was years ago, and the consensus was to migrate to
UTF-8; however, the salient information should be expressed in the ASCII
character set unless impossible.

-hpa

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 1:29 ` H. Peter Anvin
@ 2008-04-29 5:06 ` Willy Tarreau
2008-04-29 6:04 ` H. Peter Anvin
` (2 more replies)
0 siblings, 3 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 5:06 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Adrian Bunk, linux-kernel, trivial

On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote:
> Willy Tarreau wrote:
> >Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
> >everyone reads UTF-8.
>
> "Everyone" who speaks a Western European language, perhaps; and even
> then, mostly because a lot of tools still have a "oh, it's not valid
> UTF-8, guess iso-8859-1" mode.

Or simply because people have not migrated all their install, or have
explicitly disabled UTF-8 a few hours after starting to use it once
they discovered the mess it caused and the poor support from the
tools :-/

> The most common instance of non-ASCII
> characters in Linux kernel code are people's names, and there are plenty
> of names which aren't representable in either ASCII or iso-8859-1.
>
> The debate on this was years ago, and the consensus was to migrate to
> UTF-8; however, the salient information should be expressed in the ASCII
> character set unless impossible.

And do we really consider that people's names in *comments* cannot
be converted to pure ASCII ? I'm western european and have always
been against accents in comments (another reason to write comments
in english BTW). Unix and internet have lived without accents for
almost 30 years without anyone really bothering. And now we try to
put them everywhere (even in domain names, implying big security
issues) and it causes real annoyances. People's names have not
changed in 30 years, so I guess that the rules used during this
time to ASCII-fy the names are still usable.

> -hpa

Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 5:06 ` Willy Tarreau
@ 2008-04-29 6:04 ` H. Peter Anvin
2008-04-29 7:29 ` Adrian Bunk
2008-05-09 12:48 ` David Kågedal
2 siblings, 0 replies; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-29 6:04 UTC (permalink / raw)
To: Willy Tarreau; +Cc: H. Peter Anvin, Adrian Bunk, linux-kernel, trivial

Willy Tarreau wrote:
>
> And do we really consider that people's names in *comments* cannot
> be converted to pure ASCII ? I'm western european and have always
> been against accents in comments (another reason to write comments
> in english BTW). Unix and internet have lived without accents for
> almost 30 years without anyone really bothering. And now we try to
> put them everywhere (even in domain names, implying big security
> issues) and it causes real annoyances. People's names have not
> changed in 30 years, so I guess that the rules used during this
> time to ASCII-fy the names are still usable.
>

For some languages, it's considered acceptable, for others it's
considered major corruption.

-hpa

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 5:06 ` Willy Tarreau
2008-04-29 6:04 ` H. Peter Anvin
@ 2008-04-29 7:29 ` Adrian Bunk
2008-04-29 8:14 ` Willy Tarreau
2008-05-09 12:48 ` David Kågedal
2 siblings, 1 reply; 48+ messages in thread
From: Adrian Bunk @ 2008-04-29 7:29 UTC (permalink / raw)
To: Willy Tarreau; +Cc: H. Peter Anvin, linux-kernel, trivial

On Tue, Apr 29, 2008 at 07:06:05AM +0200, Willy Tarreau wrote:
> On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote:
> > Willy Tarreau wrote:
> > >Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
> > >everyone reads UTF-8.
> >
> > "Everyone" who speaks a Western European language, perhaps; and even
> > then, mostly because a lot of tools still have a "oh, it's not valid
> > UTF-8, guess iso-8859-1" mode.
>
> Or simply because people have not migrated all their install, or have
> explicitly disabled UTF-8 a few hours after starting to use it once
> they discovered the mess it caused and the poor support from the
> tools :-/

Non-ancient distributions default to UTF-8 and have tools that handle it
fine.

If you had bad experiences in the last millennium you should try again.

> > The most common instance of non-ASCII
> > characters in Linux kernel code are people's names, and there are plenty
> > of names which aren't representable in either ASCII or iso-8859-1.
> >
> > The debate on this was years ago, and the consensus was to migrate to
> > UTF-8; however, the salient information should be expressed in the ASCII
> > character set unless impossible.
>
> And do we really consider that people's names in *comments* cannot
> be converted to pure ASCII ? I'm western european and have always
> been against accents in comments (another reason to write comments
> in english BTW).

Accents are very rare in names in the kernel.

Most non-ASCII characters are umlauts and there's no sane way to
express them in ASCII (and the vowels without umlaut are pronounced
quite differently and might even make names look very strange).

And that's only within European languages, outside it becomes even
worse.

> Unix and internet have lived without accents for
> almost 30 years without anyone really bothering. And now we try to
> put them everywhere (even in domain names, implying big security
> issues) and it causes real annoyances. People's names have not
> changed in 30 years, so I guess that the rules used during this
> time to ASCII-fy the names are still usable.

The comments in the kernel have been converted to UTF-8 quite some time
ago, what I'm fixing with my patch is just some recent non-UTF-8 stuff
that crept in.

And names in comments in the kernel were not pure ASCII since very
early, they were in other charsets.

Mostly iso-8859-1, but not all of them.

I remember that for one name we first guessed which character it was and
then tried to figure out which charset it was in (no, it was not one
of iso-8859-*).

So it was not "ASCII -> UTF-8", it was
"several different charsets -> UTF-8".

> Willy

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 7:29 ` Adrian Bunk
@ 2008-04-29 8:14 ` Willy Tarreau
2008-04-29 9:06 ` Helge Hafting
` (2 more replies)
0 siblings, 3 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 8:14 UTC (permalink / raw)
To: Adrian Bunk; +Cc: H. Peter Anvin, linux-kernel, trivial

On Tue, Apr 29, 2008 at 10:29:11AM +0300, Adrian Bunk wrote:
> On Tue, Apr 29, 2008 at 07:06:05AM +0200, Willy Tarreau wrote:
> > On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote:
> > > Willy Tarreau wrote:
> > > >Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
> > > >everyone reads UTF-8.
> > >
> > > "Everyone" who speaks a Western European language, perhaps; and even
> > > then, mostly because a lot of tools still have a "oh, it's not valid
> > > UTF-8, guess iso-8859-1" mode.
> >
> > Or simply because people have not migrated all their install, or have
> > explicitly disabled UTF-8 a few hours after starting to use it once
> > they discovered the mess it caused and the poor support from the
> > tools :-/
>
> Non-ancient distributions default to UTF-8 and have tools that handle it
> fine.
>
> If you had bad experiences in the last millennium you should try again.

Well, I accidentally used a freshly installed laptop running mandriva 2008.
I was typing in a terminal inside KDE (I don't know the program name, sort
of an xterm, but with huge borders all around). I made a typo in a word and
typed in a "é" (e acute). Pressing backspace to fix it showed me that I
removed more chars than typed. I tried again. Pressing this letter 5 times,
then 10 times backspace. I removed 5 chars from the prompt. I suspect that
if I had used some chars with wider encoding (eg 4 bytes), I could have
removed as many... Clearly those tools are not ready.

Also, I recently upgraded one machine from 2.6.22 to 2.6.25. Same crappy
behaviour on the console (with bash). I quickly set the vt.defaults on
the kernel command line to fix the problem.

At this stage, I'm not even trying to "fix" the problem, as it's
a philosophical debate and I do not want to enter it. Some people
consider it normal that we break user-space applications and that
it's obvious that all userland code has to be replaced to remain
compatible with "evolutions", and I simply do not support this
principle. I just care about having the ability to disable the
broken behaviour. Most of the problem comes from the variable
length characters causing wrapping lines and misplaced tabs when
read in non UTF-8 aware editors and/or terminals. The rest of the
problem with the terminal going mad could have been caused by other
encodings, I admit.

> > > The most common instance of non-ASCII
> > > characters in Linux kernel code are people's names, and there are plenty
> > > of names which aren't representable in either ASCII or iso-8859-1.
> > >
> > > The debate on this was years ago, and the consensus was to migrate to
> > > UTF-8; however, the salient information should be expressed in the ASCII
> > > character set unless impossible.
> >
> > And do we really consider that people's names in *comments* cannot
> > be converted to pure ASCII ? I'm western european and have always
> > been against accents in comments (another reason to write comments
> > in english BTW).
>
> Accents are very rare in names in the kernel.
>
> Most non-ASCII characters are umlauts and there's no sane way to
> express them in ASCII (and the vowels without umlaut are pronounced
> quite differently and might even make names look very strange).

Agreed, but it's been done for *years*. I received mails from people
spelled "jorn" or "jurgen" and they had no trouble using that spelling
in their names or mail addresses.

> And that's only within European languages, outside it becomes even
> worse.
>
> > Unix and internet have lived without accents for
> > almost 30 years without anyone really bothering. And now we try to
> > put them everywhere (even in domain names, implying big security
> > issues) and it causes real annoyances. People's names have not
> > changed in 30 years, so I guess that the rules used during this
> > time to ASCII-fy the names are still usable.
>
> The comments in the kernel have been converted to UTF-8 quite some time
> ago, what I'm fixing with my patch is just some recent non-UTF-8 stuff
> that crept in.

Well, if that had already begun, at least you're standardizing.

> And names in comments in the kernel were not pure ASCII since very
> early, they were in other charsets.
>
> Mostly iso-8859-1, but not all of them.
>
> I remember that for one name we first guessed which character it was and
> then tried to figure out which charset it was in (no, it was not one
> of iso-8859-*).
>
> So it was not "ASCII -> UTF-8", it was
> "several different charsets -> UTF-8".

I would have loved to see "several different charsets -> ASCII".

Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 8:14 ` Willy Tarreau
@ 2008-04-29 9:06 ` Helge Hafting
2008-04-29 9:33 ` Alan Cox
2008-04-29 10:09 ` Willy Tarreau
2008-04-29 9:43 ` Adrian Bunk
2008-04-29 19:31 ` H. Peter Anvin
2 siblings, 2 replies; 48+ messages in thread
From: Helge Hafting @ 2008-04-29 9:06 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, H. Peter Anvin, linux-kernel, trivial

Willy Tarreau wrote:
> On Tue, Apr 29, 2008 at 10:29:11AM +0300, Adrian Bunk wrote:
>
>> On Tue, Apr 29, 2008 at 07:06:05AM +0200, Willy Tarreau wrote:
>>
>>> On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote:
>>>
>>>> Willy Tarreau wrote:
>>>>
>>>>> Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not
>>>>> everyone reads UTF-8.
>>>>>
>>>> "Everyone" who speaks a Western European language, perhaps; and even
>>>> then, mostly because a lot of tools still have a "oh, it's not valid
>>>> UTF-8, guess iso-8859-1" mode.
>>>>
>>> Or simply because people have not migrated all their install, or have
>>> explicitly disabled UTF-8 a few hours after starting to use it once
>>> they discovered the mess it caused and the poor support from the
>>> tools :-/
>>>
>> Non-ancient distributions default to UTF-8 and have tools that handle it
>> fine.
>>
>> If you had bad experiences in the last millennium you should try again.
>>
>
> Well, I accidentally used a freshly installed laptop running mandriva 2008.
> I was typing in a terminal inside KDE (I don't know the program name, sort
> of an xterm, but with huge borders all around). I made a typo in a word and
> typed in a "é" (e acute). Pressing backspace to fix it showed me that I
> remove more chars than typed. I tried again. Pressing this letter 5 times,
> then 10 times backspace. I removed 5 chars from the prompt. I suspect that
> if I had used some chars with wider encoding (eg 4 bytes), I could have
> removed as many... Clearly those tools are not ready.
>
So don't use that particular tool, and/or file a bug with the
maintainer. :-)

I have used utf-8 for years - the fact that some editors and some
terminal emulators fail is not a problem for me. There are so many
that work just fine. There is unicode xterm, and rxvt if you consider
xterm too heavy. Both vi and emacs have versions that handle utf-8
competently.

You may have to put in a one-off effort in finding a suitable font
for your xterm, if you actually want to see proper umlauts in all
cases. If you don't care about looks, then xterm will display
blanks/squares and backspace etc. will still work.

> Also, I recently upgraded one machine from 2.6.22 to 2.6.25. Same crappy
> behaviour on the console (with bash). I quickly set the vt.defaults on
> the kernel command line to fix the problem.
>
> At this stage, I'm not even trying to "fix" the problem, as it's
> a philosophical debate and I do not want to enter it. Some people
> consider it normal that we break user-space applications and that
> it's obvious that all userland code has to be replaced to remain
> compatible with "evolutions", and I simply do not support this
> principle.
Outside the english-speaking world, userland _was_ completely broken
in the day of ascii. And supporting the multiple iso8859-xx encodings
was completely broken too, if you ever needed more than one of them.
Unicode gives userland an opportunity to actually work decently for
the first time.

Now, ascii may be fine if C development is all you ever use the machine
for. You can mangle a few names in comments - some people won't like
that at all, some won't care. But try using the same machine for writing
a business letter without a proper character set. You won't be taken
seriously. Or even a non-english gui app with ascii-only menus.

If you want to know what it is like, knock three vowels or so out of the
english alphabet. Consider them not supported. Invent "transcriptions"
if you like. Try writing a letter that way! Or even kernel code with
informative comments. See just how much that sucks.

> I just care about having the ability to disable the
> broken behaviour. Most of the problem comes from the variable
> length characters causing wrapping lines and misplaced tabs when
> read in non UTF-8 aware editors and/or terminals.
Consider the alternative - disable the broken behavior by using a
tool that handles UTF-8. There are certainly enough aware apps/tools
for those of us that need unicode.

>>> And do we really consider that people's names in *comments* cannot
>>> be converted to pure ASCII ? I'm western european and have always
>>> been against accents in comments (another reason to write comments
>>> in english BTW).
>>>
>> Accents are very rare in names in the kernel.
>>
>> Most non-ASCII characters are umlauts and there's no sane way to
>> express them in ASCII (and the vowels without umlaut are pronounced
>> quite differently and might even make names look very strange).
>>
>
> Agreed, but it's been done for *years*. I received mails from people
> spelled "jorn" or "jurgen" and they had no trouble using that spelling
> in their names or mail addresses.
>
It has been done for years because there was no other choice.
If you wanted to work in unix, just forget your own name!

Now there is a choice. Some people still don't care and are fine
with "jorn" and such. Some are pissed off, take offense, stick to
windows, or simply put unicode into kernel comments.

If your mailer doesn't support utf-8, chances are you get some mail
from people with very strange looking names too.

>> And that's only within European languages, outside it becomes even
>> worse.
>>
>>
>>> Unix and internet have lived without accents for
>>> almost 30 years without anyone really bothering. And now we try to
>>>
Lots of people actually bothered - and created various encoding schemes
to struggle with until they came up with unicode.
English speakers and people _only_ interested in simple tools like tar and ls didn't bother perhaps. No problem there - the pressure to support more than ascii always was on those wanting to use more than ascii. Now the kernel contains more than ascii, and if you want to work on it you will have to cope - or succeed in patching it out again. >>> put them everywhere (even in domain names, implying big security >>> issues) and it causes real annoyances. People's names have not >>> changed in 30 years, so I guess that the rules used during this >>> time to ASCII-fy the names are still usable. >>> Such "rules" may work for kernel comments specifically. But linux is used for much more than that, so it now supports utf-8 just fine. People who have a poperly set up system see no reason why they can't use utf-8 in the kernel too. Consider tools that work. Or fix the few remaining that doesn't work - if you are attached to them. >> The comments in the kernel have been converted to UTF-8 quite some time >> ago, what I'm fixing with my patch is just some recent non-UTF-8 stuff >> that creeped in. >> > > Well, if that had already begun, at least you're standardizing. > > >> And names in comments in the kernel were not pure ASCII since very >> early, they were in other charsets. >> >> Mostly iso-8859-1, but not all of them. >> >> I remember that for one name we first guessed which character it was and >> then tried to figure out which charset it was in (no, it was not one >> of iso-8859-*). >> >> So it was not "ASCII -> UTF-8", it was >> "several different charsets -> UTF-8". >> > > I would have loved to see "several different charsets -> ASCII". > And all those that actually used those "different charsets" disagree, or they'd used ascii in the first place too. :-) Helge Hafting ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 9:06 ` Helge Hafting
@ 2008-04-29 9:33 ` Alan Cox
2008-04-29 10:09 ` Willy Tarreau
1 sibling, 0 replies; 48+ messages in thread
From: Alan Cox @ 2008-04-29 9:33 UTC (permalink / raw)
To: Helge Hafting
Cc: Willy Tarreau, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
> Outside the english-speaking world, userland _was_ completely
(American)
Formal UK English uses accented characters for some foreign imports (eg
café), ï for words like naïve, and if you are really pretentious you
need the æ symbol for words like mediæval although for modern writing
this is considered silly.
The bash problem btw should have been fixed (if it is bash causing it) as
of 2.05b and readline 4.3. If it's being caused by the KDE terminal that
would surprise me but might be worth filing a bug.
Alan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments 2008-04-29 9:06 ` Helge Hafting 2008-04-29 9:33 ` Alan Cox @ 2008-04-29 10:09 ` Willy Tarreau 2008-04-29 10:10 ` Alan Cox ` (2 more replies) 1 sibling, 3 replies; 48+ messages in thread From: Willy Tarreau @ 2008-04-29 10:09 UTC (permalink / raw) To: Helge Hafting; +Cc: Adrian Bunk, H. Peter Anvin, linux-kernel, trivial On Tue, Apr 29, 2008 at 11:06:05AM +0200, Helge Hafting wrote: > >Well, I accidentally used a freshly installed laptop running mandriva 2008. > >I was typing in a terminal inside KDE (I don't know the program name, sort > >of an xterm, but with huge borders all around). I made a typo in a word and > >typed in a "é" (e acute). Pressing backspace to fix it showed me that I > >remove more chars than typed. I tried again. Pressing this letter 5 times, > >then 10 times backspace. I removed 5 chars from the prompt. I suspect that > >if I had used some chars with wider encoding (eg 4 bytes), I could have > >removed as many... Clearly those tools are not ready. > > > So don't use that particular tool It was not my machine, and had you been there, you would have heard me call it names ! > and/or file a bug with the maintainer. :-) It's too easy to impose crappy designs to end-users and tell them that if that does not work they have to file a bug. There are a minimal set of things that must be tested before shipping. Seeing that the default terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does not properly render it simply makes me sick. This is broken by design and even distros trying to get it working for years still can't cope with it. There must be a reason. > I have used utf-8 for years - the fact that some editors and some terminal > emulators fail is not a problem for me. There are so many that works > just fine. There is unicode xterm, and rxvt if you consider xterm too heavy. > Both vi and emacs have versions that handle utf-8 competently. 
You may > have to > put in a one-off effort in finding a suitable font for your xterm, if you > actually wants to see proper umlauts in all cases. If you don't care about > looks, then xterm will display blanks/squares and backspace etc. will > still work. I don't care about the *look*. Mutt shows me a question mark when it does not know. I care about the *behaviour*. Having backspace go back farther than the prompt is not acceptable. Having 80-col lines span over two lines is absurd. > Outside the english-speaking world, userland _was_ completely > broken in the day of ascii. And supporting the multiple > iso8859-xx encodings was completely broken too, if you ever needed > more than one of them. yes but you just had unexpected characters. Just like MS-DOS when switching from code-page 437 to 850. Aside this, everything worked. > Unicode gives userland an opportunity to actually work decently > for the first time. Unicode yes, UTF-8 no. UTF-8 is a compressed encoding of unicode. That's as silly as if you had to replace your terminals to read native gzip, and expect them as well as all the tools to work properly! > Now, ascii may be fine if C development is all > you ever use the machine for. You can mangle a few names in > comments - some people won't like that at all, some won't care. > > But try using the same machine for writing a business letter without > a proper character set. You won't be taken seriously. Or even a non-english > gui app with ascii-only menus. > > If you want to know what it is like, knock three vowels or so out of the > english alphabet. Consider them not supported. Invent "transcriptions" > if you like. amusing comparison :-) > Try writing a letter that way! Or even kernel code with informative > comments. > See just how much that suck. > > I just care about having the ability to disable the > >broken behaviour. 
Most of the problem comes from the variable > >length characters causing wrapping lines and misplaced tabs when > >read in non UTF-8 aware editors and/or terminals. > Consider the alternative - disable the broken behavior by using a > tool that handles UTF-8. There are certainly enough aware apps/tools for > those of us that need unicode. Well, booting 2.6.25 with "init=/bin/bash" results in backspace eating the prompt after pressing accentuated letters. Even the control chars have been correctly handled on many UNIXes for decades! The real problem with this crap is that it is viral : "replace all userland applications or die alone on your island". Then "ah, your applications behave in a funny manner, well that may be because of UTF-8, but that is not important, just wait for the update". I'm not even speaking about the security implications it has on a lot of tools, starting with regex libraries. > >Agreed, but it's been done for *years*. I received mails from people > >spelled "jorn" or "jurgen" and they had no trouble using that spelling > >in their names or mail addresses. > > > It has been done for years because there were no other choice. If you > wanted to work in unix, just forget your own name! Now there is a choice. > Some people still don' care and is fine with "jorn" and such. Some are > pissed off, takes offense, or stick to windows or simply puts unicode > into kernel comments. Funny that you mention Windows. Windows has been using 16-bit unicode for a long time without problems. It's a clean encoding. Like it or not. Since they have started using UTF-8, bare windows users have started telling me that there are often bizarre characters in texts instead of accents. That most often happens in forwarded mails. so they get hit too. > If your mailer doesn't support utf-8, chances are you get some mail > from people with very strange looking names too. Once again, I don't care about the strange looking, just about the behaviour. 
> >>>Unix and internet have lived without accents for > >>>almost 30 years without anyone really bothering. And now we try to > >>> > Lots of people actually bothered - and created various encoding schemes > to struggle with until they came up with unicode. English speakers and > people _only_ interested in simple tools like tar and ls didn't bother > perhaps. You know why we got this encoding ? Simply because it was designed by english speakers who did not want to be impacted at all by the transition. That way they can still use their old "elm", "cat" and "vi" with no hassle and pretend to be UTF-8 ready. > No problem there - the pressure to support more than ascii always was on > those > wanting to use more than ascii. Now the kernel contains more than ascii, > and if you want to work on it you will have to cope - or succeed in > patching it out again. I'm not suggesting to patch it out, just that we stay conservative with the sources. Being limited to certain compilers is already a problem, but we must avoid putting restrictions on the tools required to read/write the sources. > >>>put them everywhere (even in domain names, implying big security > >>>issues) and it causes real annoyances. People's names have not > >>>changed in 30 years, so I guess that the rules used during this > >>>time to ASCII-fy the names are still usable. > >>> > Such "rules" may work for kernel comments specifically. > But linux is used for much more than that, so it now supports utf-8 just > fine. > People who have a poperly set up system see no reason why they > can't use utf-8 in the kernel too. Consider tools that work. Or fix > the few remaining that doesn't work - if you are attached to them. No, you're speaking as a desktop user. You upgrade every 6-months. When you have several machines, with various OSes, you know that the first one which will stuff this crap everywhere will cause even more trouble with the other ones. At one moment, you'll have to upgrade everything. 
BTW, do you have an UTF-8 patch for the vt320 and vt510 I use as an
always-on console on my servers ? Clearly, the system does not have to
be "properly setup" to behave correctly. A kernel running bash as init
is a "properly setup system". Displaying wrong things is OK, behaving
badly is not.
> >I would have loved to see "several different charsets -> ASCII".
> >
> And all those that actually used those "different charsets" disagree,
> or they'd used ascii in the first place too. :-)
As I said to Adrian, I did not even know there were non-ASCII chars in
our sources, and found it a bit shocking. Well, maybe I'm just an
old-timer and I need to stop working with computers :-/
Willy
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:09 ` Willy Tarreau
@ 2008-04-29 10:10 ` Alan Cox
2008-04-29 10:33 ` Willy Tarreau
2008-04-29 19:33 ` H. Peter Anvin
2008-04-29 10:42 ` Adrian Bunk
2008-04-30 9:15 ` Helge Hafting
2 siblings, 2 replies; 48+ messages in thread
From: Alan Cox @ 2008-04-29 10:10 UTC (permalink / raw)
To: Willy Tarreau
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
> Well, booting 2.6.25 with "init=/bin/bash" results in backspace
> eating the prompt after pressing accentuated letters. Even the
Did you put the bash shell and the console into unicode mode ?
> Funny that you mention Windows. Windows has been using 16-bit unicode
> for a long time without problems. It's a clean encoding. Like it or not.
I would describe the UCS-2 situation as a disaster area - embedded nuls
causing breakage, inability to represent the full unicode space and
awkward programming interfaces.
> You know why we got this encoding ? Simply because it was designed by
> english speakers who did not want to be impacted at all by the transition.
Actually it was primarily designed to make moving encoding painless so
that ascii still worked and C properties like \0 plus traditional
Unixisms like "/" just worked.
> BTW, do you have an UTF-8 patch for the vt320 and vt510 I use as an
> always-on console on my servers ? Clearly, the system does not have to
screen supports the needed transliteration for you.
Alan
--
"Having worked in a university for more than twenty years after
leaving industry, I had become unused to seeing management skill
routinely exercised, universities being administered rather than managed"
-- Peter Checkland
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:10 ` Alan Cox
@ 2008-04-29 10:33 ` Willy Tarreau
2008-04-29 10:34 ` Alan Cox
2008-05-01 9:46 ` Alexander E. Patrakov
2008-04-29 19:33 ` H. Peter Anvin
1 sibling, 2 replies; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 10:33 UTC (permalink / raw)
To: Alan Cox
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
On Tue, Apr 29, 2008 at 11:10:14AM +0100, Alan Cox wrote:
> > Well, booting 2.6.25 with "init=/bin/bash" results in backspace
> > eating the prompt after pressing accentuated letters. Even the
>
> Did you put the bash shell and the console into unicode mode ?
The console yes (by default until I disabled it to restore correct
behaviour). The shell no, it was the one present on my machine and
has never been compiled with UTF-8 support, and should not have to.
If we say that starting with 2.6.24, we're explicitly breaking
compatibility with old userland, fine. But that was not explicitly
stated.
In my opinion, the problem is that when I press "é", the system sends
two chars to the bash, which itself sends two chars to the terminal,
which only displays one and moves the cursor one step ahead. Then,
pressing backspace once sends one backspace all along, resulting in
the terminal blanking one displayed char, but the shell not being
aware that only half of it was removed. But if you look at how
control chars are handled, if you display ^H then press backspace,
you remove all of it. It's the terminal which adjusts the position
depending on the character length.
So in my opinion, when we send one backspace to the terminal to
remove one character, since there are two in the buffer, we
should not get back one full char. Ideally, the console driver
should send as many backspaces as needed to fix the multiple
characters that were emitted. It's not logical at all that if
we send 3 chars to a process with one key, sending a cancellation
of those chars only sends one backspace.
You see, that's really what I hate with this encoding. Every stage
relies on the next one to do the fixup. And of course, a lot of
combinations fail.
> > Funny that you mention Windows. Windows has been using 16-bit unicode
> > for a long time without problems. It's a clean encoding. Like it or not.
>
> I would describe the UCS-2 situation as a disaster area - embedded nuls
> causing breakage, inability to represent the full unicode space and
> awkward programming interfaces.
But at least, there is no feeling of having it working. You immediately
see if your tools are compliant or not.
> > You know why we got this encoding ? Simply because it was designed by
> > english speakers who did not want to be impacted at all by the transition.
>
> Actually it was primarily designed to make moving encoding painless so
> that ascii still worked and C properties like \0 plus traditional
> Unixisms like "/" just worked.
I cannot imagine how one can believe that something which transcodes
one char as a series of 1-to-4 chars will be a painless move. A lot
of code is totally broken and was not before the move.
> > BTW, do you have an UTF-8 patch for the vt320 and vt510 I use as an
> > always-on console on my servers ? Clearly, the system does not have to
>
> screen supports the needed transliteration for you.
That's useful information, thanks. I was not aware of this.
> Alan
Willy
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:33 ` Willy Tarreau
@ 2008-04-29 10:34 ` Alan Cox
2008-04-29 22:12 ` Willy Tarreau
2008-05-01 9:46 ` Alexander E. Patrakov
1 sibling, 1 reply; 48+ messages in thread
From: Alan Cox @ 2008-04-29 10:34 UTC (permalink / raw)
To: Willy Tarreau
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
> behaviour). The shell no, it was the one present on my machine and
> has never been compiled with UTF-8 support, and should not have to.
Bizarre, so you are using deliberately misconfigured ancient userspace to
complain about utf-8
> In my opinion, the problem is that when I press "é", the system sends
> two chars to the bash, which itself sends two chars to the terminal,
> which only displays one and moves the cursor one step ahead. Then,
> pressing backspace once sends one backspace all along, resulting in
> the terminal blanking one displayed char, but the shell not being
The shell puts the terminal in character by character mode and readline
does this. If you have your shell/readline deliberately set up not to be
doing unicode locales then it will do the wrong thing.
> So in my opinion, when we send one backspace to the terminal to
> remove one character, since there are two in the buffer, we
> should not get back one full char. Ideally, the console driver
> should send as many backspaces as needed to fix the multiple
The console driver isn't involved - readline took over for the shell, and
readline most definitely supports this in a utf8 locale.
Alan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:34 ` Alan Cox
@ 2008-04-29 22:12 ` Willy Tarreau
2008-04-29 22:15 ` Alan Cox
0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 22:12 UTC (permalink / raw)
To: Alan Cox
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
Hi Alan,
On Tue, Apr 29, 2008 at 11:34:10AM +0100, Alan Cox wrote:
> > behaviour). The shell no, it was the one present on my machine and
> > has never been compiled with UTF-8 support, and should not have to.
>
> Bizarre, so you are using deliberately misconfigured ancient userspace to
> complain about utf-8
No I'm not using anything deliberately misconfigured. I'm trying to explain
that on the opposite, any tool which has not been explicitly adapted to those
new usages is impacted.
> > In my opinion, the problem is that when I press "é", the system sends
> > two chars to the bash, which itself sends two chars to the terminal,
> > which only displays one and moves the cursor one step ahead. Then,
> > pressing backspace once sends one backspace all along, resulting in
> > the terminal blanking one displayed char, but the shell not being
>
> The shell puts the terminal in character by character mode and readline
> does this. If you have your shell/readline deliberately set up not to be
> doing unicode locales then it will do the wrong thing.
Please, I'm not "deliberately" setting my tools *not* to support unicode.
I have tools which have worked for years and which are now asked to behave
strangely.
> > So in my opinion, when we send one backspace to the terminal to
> > remove one character, since there are two in the buffer, we
> > should not get back one full char. Ideally, the console driver
> > should send as many backspaces as needed to fix the multiple
>
> The console driver isn't involved - readline took over for the shell, and
> readline most definitely supports this in a utf8 locale.
OK I could reproduce the case without ever involving either a shell or
readline or anything. Using "cat" as the init program exhibited the
anomaly, though it was not much easy to analyze. Then I switched to
"init=od -An -tx1 -".
1) if I enter "A" then press backspace, I get nothing. Pressing enter
16 times flushes the line buffer and "od" prints 16 times "0a",
indicating nothing was remaining in the buffer.
2) if I enter Ctrl-V Ctrl-A, my display prints "^A", and when I press
backspace, I correctly get the cursor back two chars. Once again,
flushing the buffer with enter shows it was empty.
3) if I enter Alt-196, I get a "Ä". Flushing the buffer shows that od
got two bytes: c3 84.
4) now if I enter Alt-196 and press backspace, my "Ä" is removed by the
backspace, but only the second byte is flushed from the line buffer.
Then, if I press enter 15 times, I get a line with c3 0a 0a 0a ...
And there is no user-land involved here. I'm really hoping you better
understand the problem now. Pressing backspace to fix input does not
correct the input with multi-byte chars, it leaves incomplete start
sequences. If I press Alt-1111111, then backspace, I get f4 8f 91 0a 0a 0a 0a
because it is f4 8f 91 87 minus one byte. Of course, pressing Backspace
multiple times removes them all, but it also removes previous characters
on the display.
Another experiment : I press 01234, then Alt-255, Backspace, then 56789.
On the display, I have 0123456789. od gets 30 31 32 33 34 c3 35 36 37 38 39.
Now if I want to correctly fix the input, I have to press backspace twice,
but then I have to make the '4' disappear from my display, while knowing
it still remains in the buffer. And indeed, my display shows "012356789"
but od sees 30 31 32 33 34 35 36 37 38 39.
And this is without anything on the user-land (except 'od'), just plain
stupid text console booted with "init=..."
So obviously there is something broken as the data fed into stdin does
not match what is displayed for multi-byte characters.
Hoping this clarifies the situation,
Willy
^ permalink raw reply [flat|nested] 48+ messages in thread
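The byte sequences reported in the message above can be reproduced with a small model of line-buffer erase. This is only a sketch, not the kernel's tty code: the function names are made up for illustration, and the model assumes that without iutf8 a backspace drops exactly one byte from the buffer, while with iutf8 it drops one whole UTF-8 sequence.

```python
def erase_bytewise(buf: bytearray) -> None:
    """Model of erase without iutf8: drop exactly one byte."""
    if buf:
        buf.pop()


def erase_utf8(buf: bytearray) -> None:
    """Model of erase with iutf8: drop one whole UTF-8 sequence."""
    while buf:
        byte = buf.pop()
        # Continuation bytes look like 10xxxxxx; keep popping until the
        # lead byte (or a plain ASCII byte) has been removed.
        if (byte & 0xC0) != 0x80:
            break


def hexdump(buf: bytes) -> str:
    """Format bytes the way 'od -An -tx1' shows them."""
    return " ".join(f"{b:02x}" for b in buf)


def type_line(erase) -> bytes:
    """Replay the test: 01234, Alt-255 (U+00FF), Backspace, 56789."""
    buf = bytearray()
    buf += "01234".encode("utf-8")
    buf += "\u00ff".encode("utf-8")  # two bytes: c3 bf
    erase(buf)
    buf += "56789".encode("utf-8")
    return bytes(buf)


if __name__ == "__main__":
    print("byte-wise erase: ", hexdump(type_line(erase_bytewise)))
    print("UTF-8 aware erase:", hexdump(type_line(erase_utf8)))
```

With the byte-wise model the stray c3 lead byte stays in the buffer between 34 and 35, as in the od output quoted above; with the UTF-8 aware model the buffer holds only the ten digits.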
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 22:12 ` Willy Tarreau
@ 2008-04-29 22:15 ` Alan Cox
2008-04-29 23:05 ` Willy Tarreau
0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2008-04-29 22:15 UTC (permalink / raw)
To: Willy Tarreau
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
> OK I could reproduce the case without ever involving either a shell or
> readline or anything. Using "cat" as the init program exhibited the
> anomaly, though it was not much easy to analyze. Then I switched to
> "init=od -An -tx1 -".
Did you put the console into utf-8 mode before the cat ?
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 22:15 ` Alan Cox
@ 2008-04-29 23:05 ` Willy Tarreau
2008-05-01 20:18 ` H. Peter Anvin
0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 23:05 UTC (permalink / raw)
To: Alan Cox
Cc: Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
On Tue, Apr 29, 2008 at 11:15:54PM +0100, Alan Cox wrote:
> > OK I could reproduce the case without ever involving either a shell or
> > readline or anything. Using "cat" as the init program exhibited the
> > anomaly, though it was not much easy to analyze. Then I switched to
> > "init=od -An -tx1 -".
>
> Did you put the console into utf-8 mode before the cat ?
I had not *explicitly* disabled it, since as the doc suggests :
vt.default_utf8=
[VT]
Format=<0|1>
Set system-wide default UTF-8 mode for all tty's.
Default is 1, i.e. UTF-8 mode is enabled for all
newly opened terminals.
And I know that I can fix the behaviour by explicitly setting it to zero.
Also, the fact that "od" shows me multi-byte characters on the input
indicates to me that everything is set to UTF-8. So unless I'm missing
something, my console is set by default to UTF-8 (I test this on 2.6.25).
Regards,
Willy
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 23:05 ` Willy Tarreau
@ 2008-05-01 20:18 ` H. Peter Anvin
0 siblings, 0 replies; 48+ messages in thread
From: H. Peter Anvin @ 2008-05-01 20:18 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Alan Cox, Helge Hafting, Adrian Bunk, linux-kernel, trivial
Willy Tarreau wrote:
> On Tue, Apr 29, 2008 at 11:15:54PM +0100, Alan Cox wrote:
>>> OK I could reproduce the case without ever involving either a shell or
>>> readline or anything. Using "cat" as the init program exhibited the
>>> anomaly, though it was not much easy to analyze. Then I switched to
>>> "init=od -An -tx1 -".
>> Did you put the console into utf-8 mode before the cat ?
>
> I had not *explicitly* disabled it, since as the doc suggests :
>
> vt.default_utf8=
> [VT]
> Format=<0|1>
> Set system-wide default UTF-8 mode for all tty's.
> Default is 1, i.e. UTF-8 mode is enabled for all
> newly opened terminals.
>
> And I know that I can fix the behaviour by explicitly setting it to zero.
> Also, the fact that "od" shows me multi-byte characters on the input
> indicates to me that everything is set to UTF-8. So unless I'm missing
> something, my console is set by default to UTF-8 (I test this on 2.6.25).
>
Yes, there is apparently a real bug here: this vt setting doesn't
propagate to the tty layer iutf8 flag.
-hpa
^ permalink raw reply [flat|nested] 48+ messages in thread
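Until the vt setting reaches the tty layer, the iutf8 flag has to be set from userspace, which is what "stty iutf8" does. A minimal sketch of the same operation through Python's termios module follows. The helper names are made up for illustration, and the fallback value 0o040000 is an assumption: it is the Linux IUTF8 bit on common architectures and is used only when the Python build does not export termios.IUTF8.

```python
import termios

# Assumption: Linux value of the IUTF8 input flag, used only as a
# fallback if this Python build does not define termios.IUTF8.
IUTF8 = getattr(termios, "IUTF8", 0o040000)


def with_iutf8(attrs: list, enable: bool = True) -> list:
    """Return a shallow copy of a tcgetattr() list with IUTF8 set or cleared."""
    new = list(attrs)
    iflag = new[0]  # index 0 of the tcgetattr() list is c_iflag
    new[0] = (iflag | IUTF8) if enable else (iflag & ~IUTF8)
    return new


def set_iutf8(fd: int, enable: bool = True) -> None:
    """Set or clear IUTF8 on the terminal open on fd."""
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_iutf8(attrs, enable))
```

Calling set_iutf8(0) from a program whose stdin is the console would be the equivalent of running "stty iutf8" in that terminal; "stty -a" should then list iutf8 instead of -iutf8.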
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:33 ` Willy Tarreau
2008-04-29 10:34 ` Alan Cox
@ 2008-05-01 9:46 ` Alexander E. Patrakov
1 sibling, 0 replies; 48+ messages in thread
From: Alexander E. Patrakov @ 2008-05-01 9:46 UTC (permalink / raw)
To: Willy Tarreau
Cc: Alan Cox, Helge Hafting, Adrian Bunk, H. Peter Anvin, linux-kernel, trivial
Willy Tarreau wrote:
> In my opinion, the problem is that when I press "é", the system sends
> two chars to the bash, which itself sends two chars to the terminal,
> which only displays one and moves the cursor one step ahead. Then,
> pressing backspace once sends one backspace all along, resulting in
> the terminal blanking one displayed char, but the shell not being
> aware that only half of it was removed. But if you look at how
> control chars are handled, if you display ^H then press backspace,
> you remove all of it. It's the terminal which adjusts the position
> depending on the character length.
export LANG=en_US.UTF-8 (i.e., inform the userspace that you are using
UTF-8), unset LC_CTYPE and unset LC_ALL (so that they don't override
$LANG), and problem solved.
--
Alexander E. Patrakov
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 10:10 ` Alan Cox
2008-04-29 10:33 ` Willy Tarreau
@ 2008-04-29 19:33 ` H. Peter Anvin
1 sibling, 0 replies; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-29 19:33 UTC (permalink / raw)
To: Alan Cox; +Cc: Willy Tarreau, Helge Hafting, Adrian Bunk, linux-kernel, trivial
Alan Cox wrote:
>
>> Funny that you mention Windows. Windows has been using 16-bit unicode
>> for a long time without problems. It's a clean encoding. Like it or not.
>
> I would describe the UCS-2 situation as a disaster area - embedded nuls
> causing breakage, inability to represent the full unicode space and
> awkward programming interfaces.
>
Not to mention the fact that UCS-2 ran out of code points almost as soon
as they said "no more codepoints." The result was UTF-16, a hideous
abortion which took all the problems with wide encodings, combined it
with all the problems of multibyte encodings, and added a few new ones
for good measure.
-hpa
^ permalink raw reply [flat|nested] 48+ messages in thread
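The two complaints above (embedded NUL bytes, and UTF-16 being both wide and variable-length) can be illustrated with a short sketch. The function name utf16_units is made up for illustration, and the sketch assumes its input is a valid code point outside the surrogate range; it does no validation.

```python
from typing import List


def utf16_units(code_point: int) -> List[int]:
    """Split a code point into one or two 16-bit UTF-16 code units."""
    if code_point < 0x10000:
        # Fits in a single 16-bit unit: this is all UCS-2 can express.
        return [code_point]
    # Beyond the 16-bit range: a surrogate pair carrying 10 bits each.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    # Plain ASCII picks up a zero byte per character in a 16-bit encoding,
    # which breaks C code that treats \0 as end of string.
    print("A".encode("utf-16-le"))
    # A code point above U+FFFF needs two units, so the encoding is
    # variable-length after all.
    print([hex(u) for u in utf16_units(0x10F447)])
```

U+10F447 is the code point behind the Alt-1111111 test elsewhere in this thread; it needs four bytes in UTF-8 and two 16-bit units in UTF-16.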
* Re: [2.6 patch] UTF-8 fixes in comments 2008-04-29 10:09 ` Willy Tarreau 2008-04-29 10:10 ` Alan Cox @ 2008-04-29 10:42 ` Adrian Bunk 2008-04-29 11:06 ` Willy Tarreau 2008-04-30 9:15 ` Helge Hafting 2 siblings, 1 reply; 48+ messages in thread From: Adrian Bunk @ 2008-04-29 10:42 UTC (permalink / raw) To: Willy Tarreau; +Cc: Helge Hafting, H. Peter Anvin, linux-kernel, trivial On Tue, Apr 29, 2008 at 12:09:34PM +0200, Willy Tarreau wrote: > On Tue, Apr 29, 2008 at 11:06:05AM +0200, Helge Hafting wrote: > > >Well, I accidentally used a freshly installed laptop running mandriva 2008. > > >I was typing in a terminal inside KDE (I don't know the program name, sort > > >of an xterm, but with huge borders all around). I made a typo in a word and > > >typed in a "é" (e acute). Pressing backspace to fix it showed me that I > > >remove more chars than typed. I tried again. Pressing this letter 5 times, > > >then 10 times backspace. I removed 5 chars from the prompt. I suspect that > > >if I had used some chars with wider encoding (eg 4 bytes), I could have > > >removed as many... Clearly those tools are not ready. > > > > > So don't use that particular tool > > It was not my machine, and had you been there, you would have heard me call > it names ! > > > and/or file a bug with the maintainer. :-) > > It's too easy to impose crappy designs to end-users and tell them that if > that does not work they have to file a bug. There are a minimal set of > things that must be tested before shipping. Seeing that the default > terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does > not properly render it simply makes me sick. This is broken by design and > even distros trying to get it working for years still can't cope with it. > There must be a reason. I can reproduce your problem in a plain xterm when setting LANG=en_US (most likely the same problem can occur with other non UTF-8 settings). 
In this case I'm actually more surprised that the character is displayed
correctly than that you have to type backspace twice.

Any kind of charset mixing is highly problematic (which is also why my
patch was attached compressed), so if you disable UTF-8 anywhere in a
modern distribution problems are to be expected (it could also be a
bug in Mandriva's default settings, but that would really surprise me).

>...
> > Unicode gives userland an opportunity to actually work decently
> > for the first time.
>
> Unicode yes, UTF-8 no. UTF-8 is a compressed encoding of unicode.
> That's as silly as if you had to replace your terminals to read
> native gzip, and expect them as well as all the tools to work
> properly!

It's not a compressed encoding, it's a variable-length encoding.

Besides the size advantages, one main advantage of UTF-8 is that ASCII is
valid UTF-8. This means that for the ASCII source code in the kernel it
doesn't matter whether it's treated as ASCII or UTF-8, and no conversion
was needed.

You can't get this property with a fixed-size Unicode encoding.

>...
> Willy

cu
Adrian

-- 

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 48+ messages in thread
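The "variable-length" point is also what sits behind the backspace symptom
discussed in this subthread: one typed character may be several bytes, so
code that erases per byte and code that erases per character disagree. The
sketch below only illustrates that mismatch; erase_last_byte and
erase_last_char are hypothetical helpers and not how konsole, readline, or
the tty layer are actually implemented.

```python
# Sketch: erasing "one character" from a UTF-8 line buffer.

def erase_last_byte(buf: bytes) -> bytes:
    """Naive erase: drop exactly one byte, as a byte-oriented editor would."""
    return buf[:-1]


def erase_last_char(buf: bytes) -> bytes:
    """UTF-8-aware erase: skip trailing continuation bytes (10xxxxxx),
    then drop the lead byte that started the sequence."""
    end = len(buf)
    while end > 0 and (buf[end - 1] & 0xC0) == 0x80:
        end -= 1
    return buf[: max(end - 1, 0)]


line = "caf\u00e9".encode("utf-8")    # 4 characters, 5 bytes: e acute is c3 a9

print(len("caf\u00e9"), len(line))    # character count vs byte count
print(erase_last_char(line))          # the whole e acute is removed
print(erase_last_byte(line))          # a dangling lead byte c3 is left behind
```

When the layer that counts keystrokes and the layer that counts bytes make
different assumptions, one backspace can remove too little or too much,
which matches the kind of behaviour Willy describes.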
* Re: [2.6 patch] UTF-8 fixes in comments 2008-04-29 10:42 ` Adrian Bunk @ 2008-04-29 11:06 ` Willy Tarreau 2008-04-29 11:27 ` Adrian Bunk 0 siblings, 1 reply; 48+ messages in thread From: Willy Tarreau @ 2008-04-29 11:06 UTC (permalink / raw) To: Adrian Bunk; +Cc: Helge Hafting, H. Peter Anvin, linux-kernel, trivial On Tue, Apr 29, 2008 at 01:42:16PM +0300, Adrian Bunk wrote: > On Tue, Apr 29, 2008 at 12:09:34PM +0200, Willy Tarreau wrote: > > On Tue, Apr 29, 2008 at 11:06:05AM +0200, Helge Hafting wrote: > > > >Well, I accidentally used a freshly installed laptop running mandriva 2008. > > > >I was typing in a terminal inside KDE (I don't know the program name, sort > > > >of an xterm, but with huge borders all around). I made a typo in a word and > > > >typed in a "é" (e acute). Pressing backspace to fix it showed me that I > > > >remove more chars than typed. I tried again. Pressing this letter 5 times, > > > >then 10 times backspace. I removed 5 chars from the prompt. I suspect that > > > >if I had used some chars with wider encoding (eg 4 bytes), I could have > > > >removed as many... Clearly those tools are not ready. > > > > > > > So don't use that particular tool > > > > It was not my machine, and had you been there, you would have heard me call > > it names ! > > > > > and/or file a bug with the maintainer. :-) > > > > It's too easy to impose crappy designs to end-users and tell them that if > > that does not work they have to file a bug. There are a minimal set of > > things that must be tested before shipping. Seeing that the default > > terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does > > not properly render it simply makes me sick. This is broken by design and > > even distros trying to get it working for years still can't cope with it. > > There must be a reason. > > I can reproduce your problem in a plain xterm when setting LANG=en_US > (most likely the same problem can occur with other non UTF-8 settings). 
possibly they broke it when forcing support for variable length ? > In this case I'm actually more surprised that the character is displayed > correctly than that you have to type backspace twice. It's not that I *had* to type it twice. But I *could* type it twice, and the first one removed the character, the second one the prompt. > Any kind of charset mixing is highly problematic (which is also why my > patch was attached compressed), so if you disable UTF-8 anywhere in a > modern distribution problems are somehow expected (it could also be a > bug in Mandrivas default settings, but that would really surprise me). No, it was not disabled at all. I had to type in a command for a co-worker who just did a default install the day before, and typed a typo which I wanted to fix. > > Unicode yes, UTF-8 no. UTF-8 is a compressed encoding of unicode. > > That's as silly as if you had to replace your terminals to read > > native gzip, and expect them as well as all the tools to work > > properly! > > It's not a compressed encoding, it's a variable-length encoding. > > Besides the size advantages one main advantage of UTF-8 is that ASCII is > valid UTF-8. This means that for the ASCII source code in the kernel it > doesn't matter whether it's treated as ASCII or UTF-8, and no conversion > was needed. > > You can't get this property with a fixed-size Unicode encoding. I don't agree. If you refuse character-set mixing, there's no problem. Bit 7 of first char == 1 ? => full text is 32 bit. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 11:06 ` Willy Tarreau
@ 2008-04-29 11:27 ` Adrian Bunk
2008-04-29 11:32 ` Adrian Bunk
0 siblings, 1 reply; 48+ messages in thread
From: Adrian Bunk @ 2008-04-29 11:27 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Helge Hafting, H. Peter Anvin, linux-kernel, trivial

On Tue, Apr 29, 2008 at 01:06:38PM +0200, Willy Tarreau wrote:
> On Tue, Apr 29, 2008 at 01:42:16PM +0300, Adrian Bunk wrote:
> > On Tue, Apr 29, 2008 at 12:09:34PM +0200, Willy Tarreau wrote:
>...
> > > Unicode yes, UTF-8 no. UTF-8 is a compressed encoding of unicode.
> > > That's as silly as if you had to replace your terminals to read
> > > native gzip, and expect them as well as all the tools to work
> > > properly!
> >
> > It's not a compressed encoding, it's a variable-length encoding.
> >
> > Besides the size advantages one main advantage of UTF-8 is that ASCII is
> > valid UTF-8. This means that for the ASCII source code in the kernel it
> > doesn't matter whether it's treated as ASCII or UTF-8, and no conversion
> > was needed.
> >
> > You can't get this property with a fixed-size Unicode encoding.
>
> I don't agree. If you refuse character-set mixing, there's no problem.
> Bit 7 of first char == 1 ? => full text is 32 bit.

You miss my point.

The point is:
A conversion "ASCII -> UTF-8" is a nop.

This means when changing the kernel from half a dozen charsets used in
comments to UTF-8 we only had to change the few characters actually
containing non UTF-8.

Going to something like UTF-32 as you suggest would have involved
converting every single file in the kernel.

> Willy

cu
Adrian

-- 

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 48+ messages in thread
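The "nop" claim is easy to verify. The following sketch assumes nothing
about the actual patch; already_valid_utf8 is a hypothetical helper and the
sample byte strings are invented for illustration.

```python
# Sketch: ASCII -> UTF-8 changes no bytes; ASCII -> UTF-32 changes all of them.

def already_valid_utf8(raw: bytes) -> bool:
    """True if raw can be left untouched when a tree is declared UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


c_source = b"int main(void) { return 0; }\n"               # pure ASCII
latin1_comment = "/* J\u00f6rn */".encode("iso-8859-1")    # o umlaut as byte f6
utf8_comment = "/* J\u00f6rn */".encode("utf-8")           # o umlaut as c3 b6

text = c_source.decode("ascii")
print(text.encode("utf-8") == c_source)                    # byte-for-byte equal
print(len(text.encode("utf-32-le")), len(c_source))        # four times as large
print(already_valid_utf8(c_source))                        # no conversion needed
print(already_valid_utf8(latin1_comment))                  # this one must change
print(already_valid_utf8(utf8_comment))                    # already fine
```

Only files containing bytes that are not valid UTF-8 need touching, which is
the small set the patch under discussion addresses.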
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 11:27 ` Adrian Bunk
@ 2008-04-29 11:32 ` Adrian Bunk
2008-04-29 20:18 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 48+ messages in thread
From: Adrian Bunk @ 2008-04-29 11:32 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Helge Hafting, H. Peter Anvin, linux-kernel, trivial

On Tue, Apr 29, 2008 at 02:27:18PM +0300, Adrian Bunk wrote:
>
> You miss my point.
>
> The point is:
> A conversion "ASCII -> UTF-8" is a nop.
>
> This means when changing the kernel from half a dozen charsets used in
> comments to UTF-8 we only had to change the few characters actually
> containing non UTF-8.

"containing non-ASCII"

> Going to something like UTF-32 as you suggest would have involved
> converting every single file in the kernel.

cu
Adrian

-- 

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 11:32 ` Adrian Bunk
@ 2008-04-29 20:18 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 48+ messages in thread
From: Jeremy Fitzhardinge @ 2008-04-29 20:18 UTC (permalink / raw)
To: Adrian Bunk
Cc: Willy Tarreau, Helge Hafting, H. Peter Anvin, linux-kernel, trivial

Adrian Bunk wrote:
> On Tue, Apr 29, 2008 at 02:27:18PM +0300, Adrian Bunk wrote:
>
>> You miss my point.
>>
>> The point is:
>> A conversion "ASCII -> UTF-8" is a nop.
>>
>> This means when changing the kernel from half a dozen charsets used in
>> comments to UTF-8 we only had to change the few characters actually
>> containing non UTF-8.
>>
>
> "containing non-ASCII"
>

Same thing ;)

J

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments 2008-04-29 10:09 ` Willy Tarreau 2008-04-29 10:10 ` Alan Cox 2008-04-29 10:42 ` Adrian Bunk @ 2008-04-30 9:15 ` Helge Hafting 2008-04-30 19:22 ` Adrian Bunk 2008-04-30 19:42 ` H. Peter Anvin 2 siblings, 2 replies; 48+ messages in thread From: Helge Hafting @ 2008-04-30 9:15 UTC (permalink / raw) To: Willy Tarreau; +Cc: Adrian Bunk, H. Peter Anvin, linux-kernel, trivial Willy Tarreau wrote: > On Tue, Apr 29, 2008 at 11:06:05AM +0200, Helge Hafting wrote: > >>> Well, I accidentally used a freshly installed laptop running mandriva 2008. >>> I was typing in a terminal inside KDE (I don't know the program name, sort >>> of an xterm, but with huge borders all around). I made a typo in a word and >>> typed in a "é" (e acute). Pressing backspace to fix it showed me that I >>> remove more chars than typed. I tried again. Pressing this letter 5 times, >>> then 10 times backspace. I removed 5 chars from the prompt. I suspect that >>> if I had used some chars with wider encoding (eg 4 bytes), I could have >>> removed as many... Clearly those tools are not ready. >>> >>> >> So don't use that particular tool >> > > It was not my machine, and had you been there, you would have heard me call > it names ! > We all do that, for various reasons... > >> and/or file a bug with the maintainer. :-) >> > > It's too easy to impose crappy designs to end-users and tell them that if > that does not work they have to file a bug. There are a minimal set of > things that must be tested before shipping. Seeing that the default > terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does > not properly render it simply makes me sick. This is broken by design and > even distros trying to get it working for years still can't cope with it. > There must be a reason. > Yeah, ascii-only is a crappy design. :-/ I don't know if mandriva is broken by design - I only use debian. It would not surprise me if some distros botch utf-8 through negligence. 
They are based in english-speaking countries and have their biggest user bases there - the majority of their customers aren't going to use more than ascii so why should they bother. Someone made a "cool" terminal emulator? Transparency and effects? Distribute it, despite the fact that it won't work in all cases. Distro contains xterm anyway for those that need a fallback. Machine owner thinks one terminal emulator is enough and install the default or cool one only. >> I have used utf-8 for years - the fact that some editors and some terminal >> emulators fail is not a problem for me. There are so many that works >> just fine. There is unicode xterm, and rxvt if you consider xterm too heavy. >> Both vi and emacs have versions that handle utf-8 competently. You may >> have to >> put in a one-off effort in finding a suitable font for your xterm, if you >> actually wants to see proper umlauts in all cases. If you don't care about >> looks, then xterm will display blanks/squares and backspace etc. will >> still work. >> > > I don't care about the *look*. Mutt shows me a question mark when it does > not know. I care about the *behaviour*. Having backspace go back farther > than the prompt is not acceptable. Having 80-col lines span over two lines > is absurd. > > >> Outside the english-speaking world, userland _was_ completely >> broken in the day of ascii. And supporting the multiple >> iso8859-xx encodings was completely broken too, if you ever needed >> more than one of them. >> > > yes but you just had unexpected characters. Just like MS-DOS when > switching from code-page 437 to 850. Aside this, everything worked. > I don't see how wrong characters are better than backspace eating the prompt or 80-col overflowing when it shouldn't. It is all breakage either way. Stuff break if TERM is set wrong for the terminal in use too, or if the app in use don't _use_ the TERM variable. 
This happens too, and you only notice if the app runs on a terminal incompatible with TERM=linux. [...] >> If you want to know what it is like, knock three vowels or so out of the >> english alphabet. Consider them not supported. Invent "transcriptions" >> if you like. >> > > amusing comparison :-) > Amusing and accurate. I use Norwegian which has 3 non-ascii vowels. As well as some accented characters, but they don't crop up in _every other sentence_. >> Lots of people actually bothered - and created various encoding schemes >> to struggle with until they came up with unicode. English speakers and >> people _only_ interested in simple tools like tar and ls didn't bother >> perhaps. >> > > You know why we got this encoding ? Simply because it was designed by > english speakers who did not want to be impacted at all by the transition. > That way they can still use their old "elm", "cat" and "vi" with no > hassle and pretend to be UTF-8 ready. > It had to be done in an ascii-compatible way. That way, a userland containing a mix of ascii-only apps, fully utf-8 supporting apps, and apps with partial utf-8 support will work flawlessly for ascii-only stuff. Like C source and english language tools. Of course utf-8 only works in the apps supporting it, but utf-8 users keeps fixing this in the apps they need. Breaking ascii compatibility was not an option, because that means replacing the entire userland in one operation. That cannot be done unless a single authority control everything, and the open source world isn't like that. Variable length encoding is necessary, given that: * Ascii should work as before, i.e. one "char" per ascii character * One single encoding so a plain text file can contain the symbols of any writing system in use. There are way more than 256 symbols. [...] >> Such "rules" may work for kernel comments specifically. >> But linux is used for much more than that, so it now supports utf-8 just >> fine. 
>> People who have a poperly set up system see no reason why they >> can't use utf-8 in the kernel too. Consider tools that work. Or fix >> the few remaining that doesn't work - if you are attached to them. >> > > No, you're speaking as a desktop user. You upgrade every 6-months. When > you have several machines, with various OSes, you know that the first > one which will stuff this crap everywhere will cause even more trouble > with the other ones. At one moment, you'll have to upgrade everything. > BTW, do you have an UTF-8 patch for the vt320 and vt510 I use as an > always-on console on my servers ? Clearly, the system does not have to > be "properly setup" to behave correctly. A kernel running bash as init > is a "properly setup system". Displaying wrong things is OK, behaving > badly is not. > No, I don't have a utf-8 patch for vt320 terminals. Using one is your choice. Either you don't work with utf-8 stuff on it, or you use intermediate software that translate the utf-8 to something the terminal can display in an acceptable matter. > >>> I would have loved to see "several different charsets -> ASCII". >>> >>> >> And all those that actually used those "different charsets" disagree, >> or they'd used ascii in the first place too. :-) >> > > As I said to Adrian, I did not even know there were non-ASCII chars > in our sources, and found it a bit shocking. Well, maybe I'm just an > old-timer and I need to stop working with computers :-/ > If you _cannot_ accept utf-8, then your computer world will shrink with time. Or you can live with a few things you don't like - most of us have to, given that the computer world has so many people with differing opinions. Helge Hafting ^ permalink raw reply [flat|nested] 48+ messages in thread
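Helge's argument that a mixed userland keeps working for ASCII-only tools
rests on one property of UTF-8: every byte of a multibyte sequence has its
high bit set, so an ASCII delimiter can never appear inside a non-ASCII
character. The sketch below illustrates that; split_fields is a hypothetical
stand-in for any byte-oriented tool, and the sample record is invented.

```python
# Sketch: a byte-oriented tool splitting UTF-8 text on an ASCII delimiter.

def split_fields(raw: bytes, sep: bytes = b":") -> list:
    """Split on an ASCII separator without knowing anything about UTF-8."""
    return raw.split(sep)


record = "K\u00e5gedal:J\u00f6rn:J\u00fcrgen".encode("utf-8")
fields = [f.decode("utf-8") for f in split_fields(record)]
print(fields)                         # three intact names

# Every byte of these multibyte characters is >= 0x80, so none of them can
# be mistaken for ':' or any other ASCII byte.
high_bit_only = all(
    b >= 0x80 for ch in "\u00e5\u00f6\u00fc" for b in ch.encode("utf-8")
)
print(high_bit_only)

# Contrast: in UTF-16-LE the single character U+3A00 contains the byte 0x3A,
# which is ASCII ':' to a byte-oriented tool.
print(b":" in "\u3a00".encode("utf-16-le"))
```

This is why tools that only care about ASCII delimiters can pass UTF-8
through untouched, even when they cannot display or edit it.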
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 9:15 ` Helge Hafting
@ 2008-04-30 19:22 ` Adrian Bunk
2008-04-30 19:42 ` H. Peter Anvin
1 sibling, 0 replies; 48+ messages in thread
From: Adrian Bunk @ 2008-04-30 19:22 UTC (permalink / raw)
To: Helge Hafting; +Cc: Willy Tarreau, H. Peter Anvin, linux-kernel, trivial

On Wed, Apr 30, 2008 at 11:15:12AM +0200, Helge Hafting wrote:
> Willy Tarreau wrote:
>...
>> It's too easy to impose crappy designs to end-users and tell them that if
>> that does not work they have to file a bug. There are a minimal set of
>> things that must be tested before shipping. Seeing that the default
>> terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does
>> not properly render it simply makes me sick. This is broken by design and
>> even distros trying to get it working for years still can't cope with it.
>> There must be a reason.
>>
> Yeah, ascii-only is a crappy design. :-/ I don't know if mandriva is
> broken by design - I only use debian.
> It would not surprise me if some distros botch utf-8 through negligence.
> They are based in english-speaking countries and have their biggest
> user bases there - the majority of their customers aren't going to use
> more than ascii so why should they bother.

Mandriva is a French company.

And what Willy describes really sounds like someone fiddling with some
settings (or something like accidentally selecting some non-UTF-8
locale).

Bad things can happen when you somehow get charsets mixed, but
distributions have defaulted to UTF-8 for quite some time, and problems
with a 100% UTF-8 system have therefore become very unlikely.

>...
> Helge Hafting

cu
Adrian

-- 

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-30 9:15 ` Helge Hafting
2008-04-30 19:22 ` Adrian Bunk
@ 2008-04-30 19:42 ` H. Peter Anvin
1 sibling, 0 replies; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-30 19:42 UTC (permalink / raw)
To: Helge Hafting; +Cc: Willy Tarreau, Adrian Bunk, linux-kernel, trivial

Helge Hafting wrote:
> It would not surprise me if some distros botch utf-8 through negligence.
> They are based in english-speaking countries and have their biggest
> user bases there - the majority of their customers aren't going to use
> more than ascii so why should they bother.

Well, we were talking about Mandriva, which is a Brazilian-French
company; their main languages are Portuguese and French. You'd think
they'd notice themselves.

Most likely there was something in Willy's configuration that buggered
it up.

-hpa

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments 2008-04-29 8:14 ` Willy Tarreau 2008-04-29 9:06 ` Helge Hafting @ 2008-04-29 9:43 ` Adrian Bunk 2008-04-29 19:31 ` H. Peter Anvin 2 siblings, 0 replies; 48+ messages in thread From: Adrian Bunk @ 2008-04-29 9:43 UTC (permalink / raw) To: Willy Tarreau; +Cc: H. Peter Anvin, linux-kernel, trivial On Tue, Apr 29, 2008 at 10:14:23AM +0200, Willy Tarreau wrote: > On Tue, Apr 29, 2008 at 10:29:11AM +0300, Adrian Bunk wrote: > > On Tue, Apr 29, 2008 at 07:06:05AM +0200, Willy Tarreau wrote: > > > On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote: > > > > Willy Tarreau wrote: > > > > >Is this really needed Adrian ? I mean, everyone reads iso-8859-1, not > > > > >everyone reads UTF-8. > > > > > > > > "Everyone" who speaks a Western European language, perhaps; and even > > > > then, mostly because a lot of tools still have a "oh, it's not valid > > > > UTF-8, guess iso-8859-1" mode. > > > > > > Or simply because people have not migrated all their install, or have > > > explicitly disabled UTF-8 a few hours after starting to use it once > > > they discovered the mess it caused and the poor support from the > > > tools :-/ > > > > Non-ancient distributions default to UTF-8 and have tools that handle it > > fine. > > > > If you had bad experiences in the last millenium you should try again. > > Well, I accidentally used a freshly installed laptop running mandriva 2008. > I was typing in a terminal inside KDE (I don't know the program name, sort > of an xterm, but with huge borders all around). I made a typo in a word and > typed in a "é" (e acute). Pressing backspace to fix it showed me that I > remove more chars than typed. I tried again. Pressing this letter 5 times, > then 10 times backspace. I removed 5 chars from the prompt. I suspect that > if I had used some chars with wider encoding (eg 4 bytes), I could have > removed as many... Clearly those tools are not ready. >... 
This sounds as if you had UTF-8 characters in a non UTF-8 environment. If you did your "explicitly disabled UTF-8" then this is what triggered it. > > > > The most common instance of non-ASCII > > > > characters in Linux kernel code are people's names, and there are plenty > > > > of names which aren't representable in either ASCII or iso-8859-1. > > > > > > > > The debate on this was years ago, and the consensus was to migrate to > > > > UTF-8; however, the salient information should be expressed in the ASCII > > > > character set unless impossible. > > > > > > And do we really consider that people's names in *comments* cannot > > > be converted to pure ASCII ? I'm western european and have always > > > been against accents in comments (another reason to write comments > > > in english BTW). > > > > Accents are very rare in names in the kernel. > > > > Most non-ASCII characters are umlauts and there's no sane way to > > express them in ASCII (and the vowels without umlaut are pronounced > > quite differently and might even make names look very strange). > > Agreed, but it's been done for *years*. I received mails from people > spelled "jorn" or "jurgen" and they had no trouble using that spelling > in their names or mail addresses. Email addresses are a different topic. But it's not right in names, and if someone then pronounces their name according to the wrong writing the result is also wrong. > > And that's only within European languages, outside it becomes even > > worse. > > > > > Unix and internet have lived without accents for > > > almost 30 years without anyone really bothering. And now we try to > > > put them everywhere (even in domain names, implying big security > > > issues) and it causes real annoyances. People's names have not > > > changed in 30 years, so I guess that the rules used during this > > > time to ASCII-fy the names are still usable. 
> > > > The comments in the kernel have been converted to UTF-8 quite some time > > ago, what I'm fixing with my patch is just some recent non-UTF-8 stuff > > that creeped in. > > Well, if that had already begun, at least you're standardizing. >... > Willy cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 8:14 ` Willy Tarreau
2008-04-29 9:06 ` Helge Hafting
2008-04-29 9:43 ` Adrian Bunk
@ 2008-04-29 19:31 ` H. Peter Anvin
2008-04-29 20:05 ` Willy Tarreau
2 siblings, 1 reply; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-29 19:31 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, linux-kernel, trivial

Willy Tarreau wrote:
>
> Well, I accidentally used a freshly installed laptop running mandriva 2008.
> I was typing in a terminal inside KDE (I don't know the program name, sort
> of an xterm, but with huge borders all around). I made a typo in a word and
> typed in a "é" (e acute). Pressing backspace to fix it showed me that I
> remove more chars than typed. I tried again. Pressing this letter 5 times,
> then 10 times backspace. I removed 5 chars from the prompt. I suspect that
> if I had used some chars with wider encoding (eg 4 bytes), I could have
> removed as many... Clearly those tools are not ready.
>

Presumably, this was konsole. konsole works fine with UTF-8 (I use it
that way every day); the most common cause of this kind of problems is
people explicitly clobbering the locale or charset class defaults in
their login scripts.

-hpa

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 19:31 ` H. Peter Anvin
@ 2008-04-29 20:05 ` Willy Tarreau
2008-04-29 20:09 ` H. Peter Anvin
0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 20:05 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Adrian Bunk, linux-kernel, trivial

On Tue, Apr 29, 2008 at 12:31:01PM -0700, H. Peter Anvin wrote:
> Willy Tarreau wrote:
> >
> >Well, I accidentally used a freshly installed laptop running mandriva 2008.
> >I was typing in a terminal inside KDE (I don't know the program name, sort
> >of an xterm, but with huge borders all around). I made a typo in a word and
> >typed in a "é" (e acute). Pressing backspace to fix it showed me that I
> >remove more chars than typed. I tried again. Pressing this letter 5 times,
> >then 10 times backspace. I removed 5 chars from the prompt. I suspect that
> >if I had used some chars with wider encoding (eg 4 bytes), I could have
> >removed as many... Clearly those tools are not ready.
> >
>
> Presumably, this was konsole.

Possible. It was the one you get by clicking on a terminal icon. Huuhhh
what a horror, I'm discussing icons and GUIs on LKML. I must take my
meds :-)

> konsole works fine with UTF-8 (I use it
> that way every day); the most common cause of this kind of problems is
> people explicitly clobbering the locale or charset class defaults in
> their login scripts.

I really doubt the miss would have done this. Or someone would have done
it for her, which I really doubt in such a small time frame after a fresh
install from the day before. I will investigate though.

Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 20:05 ` Willy Tarreau
@ 2008-04-29 20:09 ` H. Peter Anvin
0 siblings, 0 replies; 48+ messages in thread
From: H. Peter Anvin @ 2008-04-29 20:09 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, linux-kernel, trivial

Willy Tarreau wrote:
>
>> konsole works fine with UTF-8 (I use it
>> that way every day); the most common cause of this kind of problems is
>> people explicitly clobbering the locale or charset class defaults in
>> their login scripts.
>
> I really doubt the miss would have done this. Or someone would have done
> it for her which I really doubt in such a small time frame after a fresh
> install from the day before. I will investigate though.
>

From one of Alan's posts it sounds like there was a bug with multibyte
characters in readline at some point that got fixed relatively quickly,
but still made it out.

-hpa

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 5:06 ` Willy Tarreau
2008-04-29 6:04 ` H. Peter Anvin
2008-04-29 7:29 ` Adrian Bunk
@ 2008-05-09 12:48 ` David Kågedal
2 siblings, 0 replies; 48+ messages in thread
From: David Kågedal @ 2008-05-09 12:48 UTC (permalink / raw)
To: Willy Tarreau; +Cc: H. Peter Anvin, Adrian Bunk, linux-kernel, trivial

Willy Tarreau <w@1wt.eu> writes:

> And do we really consider that people's names in *comments* cannot
> be converted to pure ASCII ? I'm western european and have always
> been against accents in comments (another reason to write comments
> in english BTW). Unix and internet have lived without accents for
> almost 30 years without anyone really bothering.

That's a ridiculous statement. Just because you didn't bother, you
can't assume that the people who were actually affected didn't bother.

I went through large parts of the 1990's under the name "David
K}gedal". And I bothered.

And no, the second character in my last name is not an accented a;
they have been separate letters for hundreds of years in Sweden. So I
can live without using accented letters, as long as I can write
Kågedal including the å. :-)

Not that my name appears anywhere in the Linux source, but I still
felt the urge to reply...

-- 
David Kågedal

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-28 23:05 ` Willy Tarreau
2008-04-29 1:29 ` H. Peter Anvin
@ 2008-04-29 9:01 ` Alan Cox
2008-04-29 9:19 ` Jan Engelhardt
2008-04-29 9:34 ` Willy Tarreau
1 sibling, 2 replies; 48+ messages in thread
From: Alan Cox @ 2008-04-29 9:01 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, linux-kernel, trivial

> In fact, I would have better converted accentuated chars to their ASCII
> equivalent to be more friendly with people who only read 7-bit.

Perhaps we should put them in latin as well just in case any Roman is
struggling with this new language 8) Distributions have been shipping UTF
enabled by default for years and years.

Alan

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 9:01 ` Alan Cox
@ 2008-04-29 9:19 ` Jan Engelhardt
2008-04-29 9:34 ` Willy Tarreau
1 sibling, 0 replies; 48+ messages in thread
From: Jan Engelhardt @ 2008-04-29 9:19 UTC (permalink / raw)
To: Alan Cox; +Cc: Willy Tarreau, Adrian Bunk, linux-kernel, trivial

On Tuesday 2008-04-29 11:01, Alan Cox wrote:

>> In fact, I would have better converted accentuated chars to their ASCII
>> equivalent to be more friendly with people who only read 7-bit.
>
>Perhaps we should put them in latin as well just in case any Roman is
>struggling with this new language 8) Distributions have been shipping UTF
>enabled by default for years and years.

With some being overly late.

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 9:01 ` Alan Cox
2008-04-29 9:19 ` Jan Engelhardt
@ 2008-04-29 9:34 ` Willy Tarreau
2008-04-29 9:41 ` Alan Cox
1 sibling, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2008-04-29 9:34 UTC (permalink / raw)
To: Alan Cox; +Cc: Adrian Bunk, linux-kernel, trivial

On Tue, Apr 29, 2008 at 10:01:07AM +0100, Alan Cox wrote:
> > In fact, I would have better converted accentuated chars to their ASCII
> > equivalent to be more friendly with people who only read 7-bit.
>
> Perhaps we should put them in latin as well just in case any Roman is
> struggling with this new language 8) Distributions have been shipping UTF
> enabled by default for years and years.

"enabled" does not mean "working" Alan. I know one distro which I will
not name in order not to offend you which shipped with it enabled by
default, but which would not properly display the characters on the
console, resulting in mangled messages during boot. I particularly
remember the "[ECHEC]" ("[FAILED]") with random garbage instead of the
first 'E'. :-)

Willy

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-29 9:34 ` Willy Tarreau
@ 2008-04-29 9:41 ` Alan Cox
0 siblings, 0 replies; 48+ messages in thread
From: Alan Cox @ 2008-04-29 9:41 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Adrian Bunk, linux-kernel, trivial

> "enabled" does not mean "working" Alan. I know one distro which I will
> not name in order not to offend you which shipped with it enabled by

No offence taken. In fact I seem to remember filing similar bugs at the
time about rpm/popt getting its help formatting wrong in some locales (eg
Welsh) for similar reasons - but that was some time ago.

All the mainstream tools handle utf-8 just fine, joe is quite happy
editing utf-8 these days (as are the legacy vim and emacs editing tools
;)). There really are no good reasons left not to use UTF-8.

Alan
-- 
> you are confusing me even more.
Of course. "I'm from IBM. I'm here to help." ;-)
-- Alan Altmark

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [2.6 patch] UTF-8 fixes in comments
2008-04-28 15:40 [2.6 patch] UTF-8 fixes in comments Adrian Bunk
2008-04-28 23:05 ` Willy Tarreau
@ 2008-04-29 12:18 ` KOSAKI Motohiro
1 sibling, 0 replies; 48+ messages in thread
From: KOSAKI Motohiro @ 2008-04-29 12:18 UTC (permalink / raw)
To: Adrian Bunk; +Cc: kosaki.motohiro, linux-kernel, trivial
> This patch converts some non-UTF-8 encoded text in comments to UTF-8.
>
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
Good job!
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
AFAIK some files are already written in UTF-8.
Frankly, speaking from the standpoint of a non-European:
all files are written in ASCII: no problem
all files are written in ISO-8859-1: need to customize the editor
all files are written in UTF-8: no problem
some files are written in ISO-8859-1, but other files are written in UTF-8:
Ouch! Noooooo!!
^ permalink raw reply [flat|nested] 48+ messages in thread
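The mixed ISO-8859-1 and UTF-8 case described above is one a tool can detect mechanically. What follows is a minimal sketch and not part of the patch under discussion: the function names classify_encoding and latin1_to_utf8 are hypothetical, and treating any non-UTF-8 input as ISO-8859-1 is an assumption that would still need a human check per file.

```python
def classify_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a byte string."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; possibly ISO-8859-1, but that cannot be proven
        # from the bytes alone.
        return "other"


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes ASSUMED to be ISO-8859-1 as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    for sample in (b"plain", "\u00c4".encode("utf-8"), b"\xc4"):
        print(sample, classify_encoding(sample))
```

Only inputs reported as 'other' are candidates for conversion; pure ASCII and already-valid UTF-8 should be left untouched, which avoids double-encoding files that were converted earlier.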