From: "Joseph D. Wagner" <theman@josephdwagner.info>
To: "'Jan Hudec'" <bulb@ucw.cz>
Cc: <linux-fsdevel@vger.kernel.org>
Subject: RE: RFC: Illegal Characters in File Names
Date: Mon, 19 Jul 2004 14:21:33 -0500
Message-ID: <S265489AbUGSTVf/20040719192135Z+666@vger.kernel.org>
In-Reply-To: <20040719084757.GC3227@vagabond>
> There are just two illegal characters. '\0' and '/'. All other
> characters are *permited [sic]*.
By whose standard? POSIX doesn't require non-printing control characters to be legal. Linux PERMITS them, but there's no standard requiring them to be permitted.
> They are not illegal. The shell has problems with them, and the problems
> are not absolute. You CAN type them in if you try. The file managers
> will operate on them without problems, too.
I would argue that the shell having problems with them is the exact reason they should be made illegal.
Allowing them raises the question "how should they be handled?" For example, if a file name contained a backspace, displaying the raw backspace would move the cursor back one position, so two characters of the name would never be visible: the backspace itself and the character immediately before it, which gets overwritten. Printing a substitute instead of the raw character simply leads to more questions. Does '\b' mean a backspace, or a backslash followed by a 'b'? How do you tell the difference?
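To make the '\b' ambiguity concrete, here is a minimal sketch in Python. Both function names are hypothetical and do not correspond to any real tool; the point is only that an escaping scheme which leaves the literal backslash alone maps two different file names onto the same displayed text.

```python
# Hypothetical display helpers, for illustration only.

def naive_escape(name: str) -> str:
    """Show a few control characters as C-style escapes, but leave a
    literal backslash untouched (this is the flaw being illustrated)."""
    table = {"\b": "\\b", "\n": "\\n", "\t": "\\t"}
    return "".join(table.get(ch, ch) for ch in name)


def safe_escape(name: str) -> str:
    """Same as above, but the backslash itself is escaped too, so the
    displayed form can always be mapped back to exactly one name."""
    table = {"\\": "\\\\", "\b": "\\b", "\n": "\\n", "\t": "\\t"}
    return "".join(table.get(ch, ch) for ch in name)


with_backspace = "a\bc"      # three characters: 'a', BACKSPACE, 'c'
with_backslash_b = "a\\bc"   # four characters: 'a', '\', 'b', 'c'

# Two distinct names, one displayed form: the reader cannot tell them apart.
print(naive_escape(with_backspace) == naive_escape(with_backslash_b))

# Escaping the escape character removes the ambiguity.
print(safe_escape(with_backspace) == safe_escape(with_backslash_b))
```

The ambiguity goes away only if every application agrees to escape the escape character as well, which is itself a convention that would have to be standardized.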
The handling of these characters has not been standardized on Linux, and different applications handle the same character differently. For example, ls displays a backslash followed by the character's octal value, while KDE's Konqueror automatically replaces such characters with a '%' followed by their hexadecimal value. That escaped form is what gets stored on disk, and Konqueror converts it back into the actual character when displaying the name. This can lead to problems when different applications need to access the same file. How do you know which method the other application used to handle these characters?
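The two conventions described above can be sketched in Python as follows. The function names are hypothetical, and the code is only loosely modeled on the behaviour described in this message; it is not the actual implementation of ls or Konqueror.

```python
# Hypothetical helpers that imitate the two escaping styles described above.

def is_control(ch: str) -> bool:
    """True for ASCII control characters (0x00-0x1F and DEL)."""
    return ord(ch) < 0x20 or ord(ch) == 0x7F


def octal_escape(name: str) -> str:
    """Backslash followed by a three-digit octal value."""
    return "".join("\\%03o" % ord(ch) if is_control(ch) else ch for ch in name)


def percent_escape(name: str) -> str:
    """'%' followed by a two-digit hexadecimal value."""
    return "".join("%%%02X" % ord(ch) if is_control(ch) else ch for ch in name)


name = "report\b.txt"          # contains a raw backspace (0x08)

print(octal_escape(name))      # report\010.txt
print(percent_escape(name))    # report%08.txt
```

A second application that finds `report%08.txt` on disk has no way to know whether that is a percent-escaped backspace or a file whose name really contains the three characters `%08`.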
Additionally, security vulnerabilities (now patched) have resulted from the allowed use of control characters.
I want to close this potential can of worms.
> The system is 8-bit clean an [sic] is relied upon by many users.
> I have LOTS of files with iso-8859-2 encoded names on my filesystem.
According to ISO 8859, the lower 128 characters are the same in every part of the standard. Only the upper 128 characters differ between iso-8859-1, iso-8859-2, etc. Since the control characters in question all fall within the lower 128, the proposed change should be OK regardless of the encoding mechanism.
For iso-8859-2 specifically, see:
http://nl.ijs.si/gnusl/cee/charset.html
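The claim about the lower half can be checked with Python's built-in codecs. This is a minimal sketch: `has_control_bytes` is a hypothetical helper, and it assumes the proposal would reject bytes 0x00-0x1F and 0x7F, which is an assumption on my part since the exact set is not spelled out in this message.

```python
# Bytes 0x00-0x7F decode identically under both encodings.
low = bytes(range(0x80))
print(low.decode("iso-8859-1") == low.decode("iso-8859-2"))    # True

# The printable upper range (0xA0-0xFF) is where the two parts differ.
high = bytes(range(0xA0, 0x100))
print(high.decode("iso-8859-1") == high.decode("iso-8859-2"))  # False


def has_control_bytes(name: bytes) -> bool:
    """Hypothetical check: assumes the restricted set is 0x00-0x1F plus 0x7F."""
    return any(b < 0x20 or b == 0x7F for b in name)


# A Czech file name encoded as iso-8859-2 uses high bytes only for letters.
latin2_name = "žluťoučký.txt".encode("iso-8859-2")
print(has_control_bytes(latin2_name))       # False: the name stays legal
print(has_control_bytes(b"report\x08.txt")) # True: the name would be rejected
```

Under that assumption, existing iso-8859-2 file names would be unaffected, because their non-ASCII letters all live at 0xA0 and above.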
Joseph D. Wagner