* newlines in filenames; POSIX.1-2024
From: Alejandro Colomar @ 2025-04-16 16:50 UTC
To: linux-kernel, linux-api, linux-man
Hi,
I'm updating the manual pages for POSIX.1-2024. One of the changes in
this revision is that POSIX now encourages implementations to disallow
using new-line characters in file names.
Historically, Linux (and maybe all existing POSIX systems?) has allowed
new-line characters in file names.
I guess there's no intention to change that behavior. But I should ask.
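The current behavior is easy to observe from user space. The following is only a minimal sketch, assuming a Linux system and a temporary directory on an ordinary filesystem; the helper name create_newline_file() is made up for illustration.

```python
import os
import tempfile


def create_newline_file(directory):
    """Create an empty file whose name contains a new-line; return the name."""
    name = "first line\nsecond line"
    path = os.path.join(directory, name)
    # O_EXCL makes the call fail instead of silently reusing an existing file.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    return name


with tempfile.TemporaryDirectory() as tmp:
    created = create_newline_file(tmp)
    # On Linux the directory entry keeps the embedded new-line byte as-is.
    listing = os.listdir(tmp)
    print(repr(listing))
```

An implementation that followed the POSIX.1-2024 encouragement would presumably fail the os.open() call instead; which error it would report is not something this sketch assumes.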
I thought of adding this paragraph to all pages that create file names:
+.SH CAVEATS
+POSIX.1-2024 encourages implementations to
+disallow creation of filenames containing new-line characters.
+Linux doesn't follow this,
+and allows using new-line characters.
Are there any comments?
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es/>
* Re: newlines in filenames; POSIX.1-2024
From: Theodore Ts'o @ 2025-04-22 22:21 UTC
To: Alejandro Colomar; +Cc: linux-kernel, linux-api, linux-man
On Wed, Apr 16, 2025 at 06:50:00PM +0200, Alejandro Colomar wrote:
>
> I'm updating the manual pages for POSIX.1-2024. One of the changes
> in this revision is that POSIX now encourages implementations to
> disallow using new-line characters in file names.
>
> Historically, Linux (and maybe all existing POSIX systems?) has
> allowed new-line characters in file names.
Do we have any information of which implementations (if any) might
decide to disallow new-line characters?
If the Austin Group is going to add these sorts of "encouragements"
without engaging with us directly, it seems to be much like King Canute
commanding that the tide not come in....
Personally, I'm not convinced a newline is any different from any
number of weird-sh*t characters, such as zero-width space Unicode
characters, ASCII ETX or EOF characters, etc.
I suppose we could add a new mount option which disallows the
weird-sh*t characters, but I bet it will break some userspace
programs, and it also begs the question of *which* weird-sh*t
characters should be disallowed by the kernel.
> I guess there's no intention to change that behavior. But I should
> ask. I thought of adding this paragraph to all pages that create
> file names:
>
> +.SH CAVEATS
> +POSIX.1-2024 encourages implementations to
> +disallow creation of filenames containing new-line characters.
> +Linux doesn't follow this,
> +and allows using new-line characters.
>
> Are there any comments?
I think this is giving the Austin Group way more attention/respect
than they deserve, especially when it's an optional "encourage", but
whatever...
- Ted
* Re: newlines in filenames; POSIX.1-2024
From: Alejandro Colomar @ 2025-04-23 7:31 UTC
To: Theodore Ts'o; +Cc: linux-kernel, linux-api, linux-man
Hi Ted,
On Tue, Apr 22, 2025 at 05:21:31PM -0500, Theodore Ts'o wrote:
> On Wed, Apr 16, 2025 at 06:50:00PM +0200, Alejandro Colomar wrote:
> >
> > I'm updating the manual pages for POSIX.1-2024. One of the changes
> > in this revision is that POSIX now encourages implementations to
> > disallow using new-line characters in file names.
> >
> > Historically, Linux (and maybe all existing POSIX systems?) has
> > allowed new-line characters in file names.
>
> Do we have any information of which implementations (if any) might
> decide to disallow new-line characters?
Such a list doesn't exist.
<http://austingroupbugs.net/view.php?id=251>
> If the Austin Group is going to add these sorts of "encouragements"
> without engaging with us directly, it seems to be much like King Canute
> commanding that the tide not come in....
>
> Personally, I'm not convinced a newline is any different from any
> number of weird-sh*t characters, such as zero-width space Unicode
> characters, ASCII ETX or EOF characters, etc.
Newline is slightly more problematic than those, especially in scripts.
But yes, other characters (mainly control characters) were also
discussed in that bug. From what I can read, it seems they were scared
that if they attempted to suggest banning all control characters at
once, there might be more opposition, and the standard would be toilet
paper.
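To illustrate why newline in particular bites scripts, here is a small sketch comparing the two common ways of passing file names between programs: one name per line (what `find -print` emits) versus NUL-terminated names (what `find -print0` emits). The function names and the sample file names are hypothetical; nothing here touches the filesystem.

```python
def split_lines(stream):
    """Parse a newline-terminated name list, as a line-oriented script would."""
    return [name for name in stream.split(b"\n") if name]


def split_nul(stream):
    """Parse a NUL-terminated name list; NUL cannot occur inside a file name."""
    return [name for name in stream.split(b"\0") if name]


# Two file names, one of which contains an embedded new-line.
names = [b"report.txt", b"evil\nname"]

as_lines = b"".join(name + b"\n" for name in names)
as_nul = b"".join(name + b"\0" for name in names)

# Line-based parsing yields three entries: the second name is split in two.
print(split_lines(as_lines))
# NUL-based parsing recovers exactly the two original names.
print(split_nul(as_nul))
```

Other control characters can still confuse terminals or logs, but they do not break the record separator that most line-oriented tools assume.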
> I suppose we could add a new mount option which disallows the
> weird-sh*t characters, but I bet it will break some userspace
> programs,
That's an interesting approach. Since it would be an opt-in mount
option, users would only break things by choice, and they could always
go back to the old mode when they need to operate on files with
weird-sh*t characters.
TBH, while I see the chances of breaking stuff (so I don't see this
being the default for a long time; maybe ever), I think an opt-in mode
would be interesting for those who know they don't need to handle such
broken file names and want a tighter system. I would enable such a mode
on my systems.
> and it also begs the question of *which* weird-sh*t
> characters should be disallowed by the kernel.
I think a mode for disallowing _any control characters_ (aka [:cntrl:],
aka 0-31) would be a good choice.
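As a rough illustration of what such a mode would reject, here is a user-space sketch of the check. It is not kernel code and no such mount option exists; the function name has_control_bytes() is invented. It follows the 0-31 range given above; note that in the C locale [:cntrl:] also includes DEL (127), which this sketch deliberately leaves alone.

```python
def has_control_bytes(name):
    """Return True if any byte of the file name is in the range 0-31."""
    # Iterating over a bytes object yields integers, so this is a plain
    # byte-value comparison with no locale or encoding knowledge involved.
    return any(byte < 0x20 for byte in name)


for candidate in (b"plain.txt", b"a\nb", b"tab\there", b"del\x7f"):
    print(candidate, has_control_bytes(candidate))
```

Because the check works on raw bytes, it also flags names in encodings where those byte values are meant to be something other than control characters.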
> > I guess there's no intention to change that behavior. But I should
> > ask. I thought of adding this paragraph to all pages that create
> > file names:
> >
> > +.SH CAVEATS
> > +POSIX.1-2024 encourages implementations to
> > +disallow creation of filenames containing new-line characters.
> > +Linux doesn't follow this,
> > +and allows using new-line characters.
> >
> > Are there any comments?
>
> I think this is giving the Austin Group way more attention/respect
> than they deserve, especially when it's an optional "encourage", but
> whatever...
I'm not worried about that, I was more worried about the churn in the
pages. I later remembered we have a pathname(7) page, so I'll put it
there, just once.
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es/>
* Re: newlines in filenames; POSIX.1-2024
From: Theodore Ts'o @ 2025-04-24 0:05 UTC
To: Alejandro Colomar; +Cc: linux-kernel, linux-api, linux-man
On Wed, Apr 23, 2025 at 09:31:42AM +0200, Alejandro Colomar wrote:
>
> <http://austingroupbugs.net/view.php?id=251>
Ugh. Reading through that bug, despite the fact that the original
proposal was *significantly* pared down, has greatly reduced my
respect for the Austin Group.
One of the people in that bug argued unironically that using pipes
should be deprecated. i.e., that somehow "find . ... -print0 | xargs
-0 ..." was a security problem.
<<Sigh>>
Other people pointed out that creating proscriptions that were not
implemented by many/most historical implementations would fragment the
standard and decrease the respect people would have towards the POSIX
specification. That was the "toilet paper" comment which you
referenced.
Well, they got that right.
> I think a mode for disallowing _any control characters_ (aka
> [:cntrl:], aka 0-31) would be a good choice.
As the Austin Group Bug pointed out, the problem is that the control
characters can be printable characters, depending on the code page
that you might be using. The example that was given was cp437.
The problem is that, historically speaking, the kernel does *not* know
which locale is in use. We made an exception to handle case
folding, where we added Unicode tables into the kernel. Some would
say that was a major mistake, and it's certainly been a headache.
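A small sketch of the code page point: the same byte can be a control character under one interpretation and a printable glyph under another, and the kernel only ever sees the byte. The three-entry table below is a hand-written excerpt of the cp437 graphic set (an assumption of this sketch; Python's built-in cp437 codec maps these bytes to control characters instead), and render() is a hypothetical helper that only handles ASCII plus the table.

```python
# Hand-written excerpt of the cp437 graphic glyphs for low byte values.
CP437_GRAPHICS = {
    0x01: "\u263a",  # white smiling face
    0x03: "\u2665",  # black heart suit
    0x0A: "\u25d9",  # inverse white circle
}


def render(name, glyphs):
    """Render an ASCII-range byte string, substituting glyphs where given."""
    return "".join(glyphs.get(byte, chr(byte)) for byte in name)


name = b"a\x0ab"
as_ascii = render(name, {})              # byte 0x0A read as a new-line
as_cp437 = render(name, CP437_GRAPHICS)  # byte 0x0A read as a glyph

print(repr(as_ascii))
print(repr(as_cp437))
```

A byte-range filter in the kernel would have to pick one of these readings without knowing which one the user intended.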
- Ted
* Re: newlines in filenames; POSIX.1-2024
From: Alejandro Colomar @ 2025-04-24 7:00 UTC
To: Theodore Ts'o; +Cc: linux-kernel, linux-api, linux-man
Hi Ted,
On Wed, Apr 23, 2025 at 07:05:34PM -0500, Theodore Ts'o wrote:
> On Wed, Apr 23, 2025 at 09:31:42AM +0200, Alejandro Colomar wrote:
> >
> > <http://austingroupbugs.net/view.php?id=251>
>
> Ugh. Reading through that bug, despite the fact that the original
> proposal was *significantly* pared down, has greatly reduced my
> respect for the Austin Group.
>
> One of the people in that bug argued unironically that using pipes
> should be deprecated. i.e., that somehow "find . ... -print0 | xargs
> -0 ..." was a security problem.
Huh! I hadn't read that part.
> <<Sigh>>
>
> Other people pointed out that creating proscriptions that were not
> implemented by many/most historical implementations would fragment the
> standard and decrease the respect people would have towards the POSIX
> specification. That was the "toilet paper" comment which you
> referenced.
>
> Well, they got that right.
>
> > I think a mode for disallowing _any control characters_ (aka
> > [:cntrl:], aka 0-31) would be a good choice.
>
> As the Austin Group Bug pointed out, the problem is that the control
> characters can be printable characters, depending on the code page
> that you might be using. The example that was given was cp437.
>
> The problem is that, historically speaking, the kernel does *not* know
> which locale is in use. We made an exception to handle case
> folding, where we added Unicode tables into the kernel. Some would
> say that was a major mistake, and it's certainly been a headache.
Hmmmm, I'm not too worried about that code page for my own system, and
most people aren't either. I still believe it would be good to have the
option to forbid 0-31, and let those users who need to access file
systems with such weird conventions continue using the default (that is,
not enabling the new mode). I think ASCII has won the character wars,
especially in POSIX systems.
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es/>
* Re: newlines in filenames; POSIX.1-2024
From: Christoph Hellwig @ 2025-04-25 14:54 UTC
To: Theodore Ts'o; +Cc: Alejandro Colomar, linux-kernel, linux-api, linux-man
On Tue, Apr 22, 2025 at 05:21:31PM -0500, Theodore Ts'o wrote:
> Do we have any information of which implementations (if any) might
> decide to disallow new-line characters?
AFAIK: none. At least none that matters.
> Personally, I'm not convinced a newline is any different from any
> number of weird-sh*t characters, such as zero-width space Unicode
> characters, ASCII ETX or EOF characters, etc.
It isn't any different in a substantial way.
> I suppose we could add a new mount option which disallows the
> weird-sh*t characters, but I bet it will break some userspace
> programs, and it also begs the question of *which* weird-sh*t
> characters should be disallowed by the kernel.
Don't go there. The only limitation that makes some limited sense in
some limited environments is restricting names to valid UTF-8. We've
already done that for case-insensitive (CI) lookups, and that's causing
enough problems despite having a use case. Adding random mount options
to limit random characters has a lot of downside but absolutely no
actual upside.
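For comparison, a "valid UTF-8 only" restriction can be sketched in user space as below. This is only an approximation under stated assumptions: it uses Python's strict UTF-8 decoder, which need not match the in-kernel validation used for case-insensitive directories, and is_valid_utf8() is a made-up name.

```python
def is_valid_utf8(name):
    """Return True if the raw file name bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A UTF-8 encoded name, a Latin-1 encoded name, and a name with a new-line.
for candidate in (b"caf\xc3\xa9", b"caf\xe9", b"a\nb"):
    print(candidate, is_valid_utf8(candidate))
```

Note that a name containing a new-line passes this check, since new-line is a valid UTF-8 code unit; a UTF-8 restriction and a control-character restriction are separate policies.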
>
> > I guess there's no intention to change that behavior. But I should
> > ask. I thought of adding this paragraph to all pages that create
> > file names:
> >
> > +.SH CAVEATS
> > +POSIX.1-2024 encourages implementations to
> > +disallow creation of filenames containing new-line characters.
> > +Linux doesn't follow this,
> > +and allows using new-line characters.
> >
> > Are there any comments?
>
> I think this is giving the Austin Group way more attention/respect
> than they deserve, especially when it's an optional "encourage", but
> whatever...
Yeah. Don't even mention these idiotic recommendations, any attention
spent on this is too much.