* The Greek letter "rho" is considered as two letters
From: Alkis Georgopoulos @ 2010-08-07 19:37 UTC
To: dash
$ touch ρ
$ ls ?
ls: cannot access ?: No such file or directory
$ ls ??
ρ
This happens with some UTF-8 characters, but not with all of them.
This might be related:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=532302
Please CC me if possible, I'm not on the list.
Kind regards,
Alkis Georgopoulos
* Re: The Greek letter "rho" is considered as two letters
From: Alkis Georgopoulos @ 2010-08-07 19:57 UTC
To: dash
Erm, actually this problem happens with all UTF-8 characters, i.e. dash
does not properly take UTF-8 characters into account when expanding "?".
$ touch appétit
$ ls app?tit
ls: cannot access app?tit: No such file or directory
$ ls app??tit
appétit
I'll send another mail about the Greek rho problem, which occurs only
with redirections.
* Re: ? doesn't match non-ascii characters
From: Alkis Georgopoulos @ 2010-08-08 12:55 UTC
To: dash
I've changed the title because it was misleading
(was: "The Greek letter "rho" is considered as two letters").
To restate the problem:
$ touch appétit
$ ls app?tit
ls: cannot access app?tit: No such file or directory
$ ls app??tit
appétit
I.e. two-byte UTF-8 characters need two "?" to be matched,
three-byte UTF-8 characters (e.g. ἀ) need three "?", and so on.
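The byte counts behind this can be checked with a short Python 3 sketch.
It is only an illustration, not dash's code: it assumes that a shell
without locale support matches patterns byte by byte, and it imitates that
by running Python's fnmatch.fnmatchcase on bytes. The helper name utf8_len
is made up for the example.

```python
import fnmatch


def utf8_len(text):
    """Return how many bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Characters from the thread, written as escapes so the example does not
# depend on how the source file is normalized:
# rho, e with acute accent, alpha with psili.
RHO, E_ACUTE, ALPHA_PSILI = "\u03c1", "\u00e9", "\u1f00"

name = "app" + E_ACUTE + "tit"
raw = name.encode("utf-8")

# Byte-wise matching (assumed to mimic a shell without locale support):
# "?" consumes exactly one byte.
bytewise_one = fnmatch.fnmatchcase(raw, b"app?tit")
bytewise_two = fnmatch.fnmatchcase(raw, b"app??tit")

# Character-wise matching (what a locale-aware shell would do):
# "?" consumes exactly one character.
charwise_one = fnmatch.fnmatchcase(name, "app?tit")

if __name__ == "__main__":
    print(utf8_len(RHO), utf8_len(E_ACUTE), utf8_len(ALPHA_PSILI))  # 2 2 3
    print(bytewise_one, bytewise_two, charwise_one)  # False True True
```

Under that assumption, the number of "?" needed equals the number of
UTF-8 bytes, which is consistent with the behaviour shown above.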
* Re: The Greek letter "rho" is considered as two letters
From: Jilles Tjoelker @ 2010-08-08 12:56 UTC
To: Alkis Georgopoulos; +Cc: dash
On Sat, Aug 07, 2010 at 10:57:12PM +0300, Alkis Georgopoulos wrote:
> Erm, actually this problem happens with all UTF-8 characters, i.e. dash
> does not properly take UTF-8 characters into account when expanding "?".
> $ touch appétit
> $ ls app?tit
> ls: cannot access app?tit: No such file or directory
> $ ls app??tit
> appétit
Yes, it seems that dash has zero support for locales. In some ways this
is an advantage, as locale support can make things considerably slower
and configure/startup scripts don't need it. However, it leads to
inconsistent behaviour with other utilities that do support locales.
For FreeBSD's /bin/sh, which is another ash variant, I think some degree
of locale support (at least for utf-8) is desirable at some point. This
would include changing pattern matching and ${#var}.
I don't know what Herbert Xu thinks about this.
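For the ${#var} point, the difference is between counting bytes and
counting characters. A rough Python 3 sketch of the two possible results,
assuming the variable holds UTF-8 text; byte_length and char_length are
hypothetical names for illustration, not functions from any shell:

```python
def byte_length(value: bytes) -> int:
    """Length in raw bytes, as a shell without locale support would presumably report."""
    return len(value)


def char_length(value: bytes) -> int:
    """Length in characters, after decoding the bytes as UTF-8."""
    return len(value.decode("utf-8"))


# The file name from the thread; the accented letter takes two bytes.
var = "app\u00e9tit".encode("utf-8")

if __name__ == "__main__":
    print(byte_length(var))  # 8
    print(char_length(var))  # 7
```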
--
Jilles Tjoelker