* The Greek letter "rho" is considered as two letters
@ 2010-08-07 19:37 Alkis Georgopoulos
  2010-08-07 19:57 ` Alkis Georgopoulos
  0 siblings, 1 reply; 4+ messages in thread
From: Alkis Georgopoulos @ 2010-08-07 19:37 UTC (permalink / raw)
  To: dash

$ touch ρ
$ ls ?
ls: cannot access ?: No such file or directory
$ ls ??
ρ

It happens with some UTF-8 characters, but not with all of them.
This might be related:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=532302

Please CC me if possible, I'm not on the list.

Kind regards,
Alkis Georgopoulos



* Re: The Greek letter "rho" is considered as two letters
  2010-08-07 19:37 The Greek letter "rho" is considered as two letters Alkis Georgopoulos
@ 2010-08-07 19:57 ` Alkis Georgopoulos
  2010-08-08 12:55   ` ? doesn't match non-ascii characters Alkis Georgopoulos
  2010-08-08 12:56   ` The Greek letter "rho" is considered as two letters Jilles Tjoelker
  0 siblings, 2 replies; 4+ messages in thread
From: Alkis Georgopoulos @ 2010-08-07 19:57 UTC (permalink / raw)
  To: dash

Erm actually this problem happens with all utf8 characters, i.e. dash
does not properly take utf8 characters into account when expanding "?".

$ touch appétit
$ ls app?tit
ls: cannot access app?tit: No such file or directory
$ ls app??tit
appétit


I'll send another mail about the Greek rho problem, which occurs only
with redirections.



* Re: ? doesn't match non-ascii characters
  2010-08-07 19:57 ` Alkis Georgopoulos
@ 2010-08-08 12:55   ` Alkis Georgopoulos
  2010-08-08 12:56   ` The Greek letter "rho" is considered as two letters Jilles Tjoelker
  1 sibling, 0 replies; 4+ messages in thread
From: Alkis Georgopoulos @ 2010-08-08 12:55 UTC (permalink / raw)
  To: dash

I've changed the title because it was misleading
(was: "The Greek letter "rho" is considered as two letters").

To restate the problem:

$ touch appétit
$ ls app?tit
ls: cannot access app?tit: No such file or directory
$ ls app??tit
appétit

I.e. two-byte UTF-8 characters need two "?" to be matched,
three-byte UTF-8 characters (e.g. ἀ) need three "?", and so on.
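
To illustrate the byte counting described above, here is a minimal
Python sketch (not dash code). It assumes the file name is stored as
UTF-8 with é as the precomposed code point U+00E9, and it uses Python's
fnmatch module only as a stand-in for a shell pattern matcher: matching
on bytes mimics the behaviour reported here, matching on str mimics a
shell that works on characters.

```python
import fnmatch

# Assumption: "appétit" with é as the single code point U+00E9.
name = "app\u00e9tit"
raw = name.encode("utf-8")  # the byte string stored in the directory entry

# UTF-8 bytes per character for the three examples in this thread:
# rho (U+03C1), e-acute (U+00E9), alpha with psili (U+1F00).
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in ("\u03c1", "\u00e9", "\u1f00")}

# Byte-wise matching: each "?" consumes exactly one byte.
bytewise_one = fnmatch.fnmatchcase(raw, b"app?tit")
bytewise_two = fnmatch.fnmatchcase(raw, b"app??tit")

# Character-wise matching: each "?" consumes one decoded character.
charwise_one = fnmatch.fnmatchcase(name, "app?tit")

print(byte_lengths)
print(bytewise_one, bytewise_two, charwise_one)
```

The byte-wise results line up with the transcript: one "?" fails and
two succeed, because é occupies two bytes in UTF-8.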



* Re: The Greek letter "rho" is considered as two letters
  2010-08-07 19:57 ` Alkis Georgopoulos
  2010-08-08 12:55   ` ? doesn't match non-ascii characters Alkis Georgopoulos
@ 2010-08-08 12:56   ` Jilles Tjoelker
  1 sibling, 0 replies; 4+ messages in thread
From: Jilles Tjoelker @ 2010-08-08 12:56 UTC (permalink / raw)
  To: Alkis Georgopoulos; +Cc: dash

On Sat, Aug 07, 2010 at 10:57:12PM +0300, Alkis Georgopoulos wrote:
> Erm actually this problem happens with all utf8 characters, i.e. dash
> does not properly take utf8 characters into account when expanding "?".

> $ touch appétit
> $ ls app?tit
> ls: cannot access app?tit: No such file or directory
> $ ls app??tit
> appétit

Yes, it seems that dash has zero support for locales. In some ways this
is an advantage, as locale support can make things considerably slower
and configure/startup scripts don't need it. However, it leads to
inconsistent behaviour with other utilities that do support locales.

For FreeBSD's /bin/sh, which is another ash variant, I think some degree
of locale support (at least for utf-8) is desirable at some point. This
would include changing pattern matching and ${#var}.
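
As a rough sketch of what UTF-8 aware "?" matching and a
character-based ${#var} could look like, here is a small Python
example. It is only an illustration under stated assumptions: the input
is valid UTF-8, the pattern contains only literal bytes and "?", and
the names utf8_seq_len, match_simple and char_length are hypothetical
and not taken from dash or FreeBSD sh.

```python
def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with byte `lead`."""
    if lead < 0x80:
        return 1              # ASCII
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1                  # invalid lead byte: fall back to one byte


def match_simple(pattern: bytes, name: bytes) -> bool:
    """Match `name` against a pattern made of literal bytes and '?' only.

    '?' consumes one whole UTF-8 character instead of a single byte.
    """
    p = n = 0
    while p < len(pattern):
        if n >= len(name):
            return False
        if pattern[p:p + 1] == b"?":
            n += utf8_seq_len(name[n])   # skip the whole character
            p += 1
        elif pattern[p] == name[n]:
            p += 1
            n += 1
        else:
            return False
    return n == len(name)


def char_length(value: bytes) -> int:
    """Count characters by ignoring continuation bytes (10xxxxxx)."""
    return sum(1 for b in value if b & 0xC0 != 0x80)


raw = "app\u00e9tit".encode("utf-8")     # assumption: precomposed e-acute
print(match_simple(b"app?tit", raw), match_simple(b"app??tit", raw))
print(len(raw), char_length(raw))        # byte length vs. character length
```

With this approach a single "?" matches é, and the length of the string
is reported in characters rather than bytes. The cost is the extra
per-character decoding step, which is the kind of overhead mentioned
above.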

I don't know what Herbert Xu thinks about this.

-- 
Jilles Tjoelker

